Fetching pages with WebDriver

Radu Ursache edited this page Dec 28, 2021 · 37 revisions

JavaScript support / Fetching via WebDriver

The backend can be configured to fetch pages through Chrome using the built-in WebDriver network interface. This is mainly useful when the pages you are watching use JavaScript to render their content. The easiest way to enable it is to uncomment the following in docker-compose.yml and restart docker-compose.

Note: The selenium/standalone-chrome-debug:3.141.59 image does NOT support ARM (e.g. Raspberry Pi) or Windows-type devices. Please don't open a support request saying that it does not work on a Raspberry Pi; it will be deleted.

    browser-chrome:
        hostname: browser-chrome
        image: selenium/standalone-chrome-debug:3.141.59
        volumes:
            # Workaround to avoid the browser crashing inside a docker container
            # See https://github.com/SeleniumHQ/docker-selenium#quick-start
            - /dev/shm:/dev/shm
        restart: unless-stopped

If you are using plain docker (instead of docker-compose), the following will get ChangeDetection.io and the Chromium WebDriver up and running:

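    # Create a shared network so the two containers can reach each other by name
    # ("changedetection-net" is only an example name).
    docker network create changedetection-net

    docker run -d \
      --name selenium \
      --network changedetection-net \
      -p 4444:4444 \
      --shm-size="2g" \
      seleniarm/standalone-chromium

    # Inside the changedetection.io container, "localhost" is that container itself,
    # so WEBDRIVER_URL points at the selenium container by name instead.
    docker run -d \
      --name changedetection.io \
      --network changedetection-net \
      --restart always \
      -p 5000:5000 \
      -e WEBDRIVER_URL="http://selenium:4444/wd/hub" \
      -v datastore-volume:/datastore \
      dgtlmoon/changedetection.io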

Then visit /settings, open the [Fetching] tab, and enable the WebDriver/Chrome option.

The URL for the WebDriver interface is set with the WEBDRIVER_URL environment variable (http://browser-chrome:4444/wd/hub by default).
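To make the WEBDRIVER_URL fallback concrete, here is a minimal Python sketch of how the variable could be resolved and how the WebDriver endpoint could be probed before enabling the option. It is not the project's actual code: the function names are hypothetical, and it assumes the Selenium server exposes the standard `/status` endpoint under the configured base URL and reports `value.ready` in its JSON response.

```python
import json
import os
import urllib.request

# Default taken from this page; everything else here is illustrative.
DEFAULT_WEBDRIVER_URL = "http://browser-chrome:4444/wd/hub"


def resolve_webdriver_url(environ=os.environ):
    """Return WEBDRIVER_URL from the environment, or the documented default."""
    return environ.get("WEBDRIVER_URL") or DEFAULT_WEBDRIVER_URL


def status_url(base_url):
    """Build the status endpoint URL from the WebDriver base URL."""
    return base_url.rstrip("/") + "/status"


def webdriver_is_ready(base_url, timeout=5.0):
    """Probe the status endpoint; True only if the server reports ready."""
    try:
        with urllib.request.urlopen(status_url(base_url), timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # Connection refused, DNS failure, timeout, or a non-JSON response.
        return False
    value = payload.get("value") if isinstance(payload, dict) else None
    return isinstance(value, dict) and bool(value.get("ready"))
```

Calling `webdriver_is_ready(resolve_webdriver_url())` from the same network as the changedetection.io container gives a quick yes/no answer on whether the browser container is reachable at the configured address.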

There are a few things left to do:

  • Make tests use the full docker stack with chromedriver and the relevant settings
  • Lower the number of workers (make it configurable per backend?); 10 is probably too many for WebDriver (is it?), but it is totally fine for requests/plaintext
  • Some kind of wait until the DOM is fully loaded, or similar
  • Handle failures better (better feedback so we know if it's from chromedriver or from the website)
  • Save the last screenshot? (This extends to saving an entire page screenshot, abstracting out the handler for diff-management, then supplying an image-diff handler)
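
As an illustration of the "wait until the DOM is fully loaded" item above, one possible approach is to poll `document.readyState` through the driver until it reports `complete`. The sketch below is only an assumption about how this could look, not an implemented feature: `wait_for_dom_ready` is a hypothetical helper, and it relies only on the driver object having an `execute_script` method, as Selenium's Python WebDriver does.

```python
import time


def wait_for_dom_ready(driver, timeout=10.0, poll=0.25):
    """Poll document.readyState until it is 'complete' or the timeout expires.

    Returns True if the page reported 'complete' in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = driver.execute_script("return document.readyState")
        if state == "complete":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

Note that `readyState == "complete"` only means the initial document and its resources have loaded; content rendered later by JavaScript may still be missing, so a per-watch delay or a wait for a specific element might be needed as well. Selenium's Python bindings also ship `WebDriverWait`, which could replace the manual loop.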