In this quick tutorial, we will see how to scroll slowly to the bottom of a web page in Java, using Playwright. Playwright is a browser automation framework developed by Microsoft. Its API closely resembles that of Puppeteer, the JavaScript library. Look here for a quick tutorial on how to use Playwright in Java.
1) Compute the scroll height
To compute the page scroll height, we can evaluate the following JavaScript code:
document.documentElement.scrollHeight
Playwright allows us to do it easily in Java:
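com.microsoft.playwright.Page page; // an already-open Playwright page
// evaluate() returns an Object, so convert through Number instead of casting straight to int
int scrollHeight = ((Number) page.evaluate("document.documentElement.scrollHeight")).intValue();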
(where page is a Playwright Page, as obtained here).
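Because page.evaluate returns a plain Object in Java, it can help to make the numeric conversion explicit. The helper below is a minimal sketch and not part of Playwright: the class name ScrollHeights and the method toPixels are hypothetical, and it assumes the evaluated expression yields a number.

```java
/** Hypothetical helper (not part of Playwright) for converting evaluate() results. */
final class ScrollHeights {
    private ScrollHeights() {}

    /**
     * Converts the Object returned by page.evaluate(...) to a pixel count.
     * Assumes the JavaScript expression produced a number; fractional values are truncated.
     */
    static int toPixels(Object evaluated) {
        if (evaluated instanceof Number) {
            return ((Number) evaluated).intValue();
        }
        throw new IllegalArgumentException("Expected a numeric result but got: " + evaluated);
    }
}
```

It would be called as ScrollHeights.toPixels(page.evaluate("document.documentElement.scrollHeight")), so that an unexpected result type fails with a readable message rather than a ClassCastException.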
2) Scroll until page bottom is reached
To scroll to a given height, we can evaluate the following JavaScript code:
window.scrollTo(0, height)
Then we can scroll step by step in a Java while loop as follows:
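int scrollTo = 0;
while (scrollTo < scrollHeight) {
    // jump to the next offset, then give lazy-loaded resources time to arrive
    page.evaluate("window.scrollTo(0, " + scrollTo + ")");
    scrollTo += SCROLL_STEP;
    Thread.sleep(SLEEP_MILLISEC); // throws InterruptedException: declare it or catch it
}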
(replace SCROLL_STEP and SLEEP_MILLISEC with your own values). SCROLL_STEP is the number of pixels we advance at each scroll, and SLEEP_MILLISEC is the number of milliseconds we wait between two scrolls. It is important to wait between scrolls, so that lazily loaded resources have time to load in the browser before we move on.
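In the code above, we increment the variable scrollTo after each scroll, and we stop once it reaches scrollHeight. Note that scrollHeight was computed once, in step 1: if lazy loading makes the page grow while we scroll, the height has to be read again to reach the new bottom.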
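The two steps can be combined into one reusable method. The sketch below is one possible arrangement, not the only one: the class SlowScroller, the method scrollToBottom and the maxSteps safety cap are hypothetical names introduced for illustration. It reads the height again on every pass, so content appended by lazy loading is scrolled through as well, and it takes the browser calls as lambdas so the loop itself does not depend on Playwright.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;
import java.util.function.IntSupplier;

/** Hypothetical helper: scrolls step by step, re-reading the page height on each pass. */
final class SlowScroller {
    private SlowScroller() {}

    /**
     * @param heightReader returns the current scroll height in pixels
     * @param scroller     scrolls the page to the given vertical offset
     * @param step         pixels to advance per scroll (must be positive)
     * @param sleepMillis  pause between scrolls, in milliseconds
     * @param maxSteps     safety cap for pages that keep growing (infinite scroll)
     * @return the offsets that were scrolled to, in order
     */
    static List<Integer> scrollToBottom(IntSupplier heightReader, IntConsumer scroller,
                                        int step, long sleepMillis, int maxSteps) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        List<Integer> visited = new ArrayList<>();
        int scrollTo = 0;
        // The height is re-evaluated in the loop condition, so a growing page is followed.
        while (scrollTo < heightReader.getAsInt() && visited.size() < maxSteps) {
            scroller.accept(scrollTo);
            visited.add(scrollTo);
            scrollTo += step;
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag and stop scrolling
                break;
            }
        }
        return visited;
    }
}

// Possible use with a Playwright page (assumes an open com.microsoft.playwright.Page named page):
// SlowScroller.scrollToBottom(
//     () -> ((Number) page.evaluate("document.documentElement.scrollHeight")).intValue(),
//     y -> page.evaluate("window.scrollTo(0, " + y + ")"),
//     200, 100, 500);
```

The values 200, 100 and 500 in the usage comment are placeholders, in the same way as SCROLL_STEP and SLEEP_MILLISEC above.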
That’s it for this tutorial! If you have questions, please leave a reply below; we answer within 24 hours.