Disable rescue & retry in Ferrum::Frame::Runtime#call on existing nodes #360

Open — wants to merge 1 commit into base: main
Conversation

stevschmid

Cuprite/Ferrum is an amazing project, thank you for all the hard work! Our migration from Selenium has been painless so far. One spec gave us trouble, though. The spec tests a report page which contains a table. The spec clicks on filters, which causes the page to re-fetch the table and replace it. After each click, we test for the new table content.

click_on 'Filter'
expect(page).to have_selector(:td, text: 'Filtered Foo')

This heavy spec generally takes around 15 seconds to execute. Yet, with Cuprite, it fails half the time ("could not find td with text: 'Filtered Foo'"). Although increasing Capybara.default_max_wait_time to 30 seconds makes the spec more robust, it also occasionally pushes the runtime past a minute. Given that the AJAX request to fetch the new table takes only a few milliseconds, the network request does not appear to be the cause of this sporadic slowdown.

I tracked the cause down to the retry in Ferrum::Frame::Runtime#call:

errors = [NodeNotFoundError, NoExecutionContextError]
Utils::Attempt.with_retry(errors: errors, max: INTERMITTENT_ATTEMPTS, wait: INTERMITTENT_SLEEP) do

After Capybara collects all the td nodes, it queries these nodes for a potential text match. If these nodes were replaced in the meantime (i.e. after the collection but before the match query runs), the query on each of these stale, no-longer-existing tds causes Ferrum::Frame::Runtime#call to retry and eventually time out (roughly 0.5 s for each node). Depending on the size of the collection, the query can take multiple seconds (an accumulation of timeouts), causing the spec to fail (or a heavy increase in runtime given a high enough Capybara.default_max_wait_time).
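The accumulation can be sketched in a small self-contained Ruby script. The retry helper below only mirrors the shape of Ferrum's Utils::Attempt.with_retry; the constant values and the loop are illustrative assumptions, not Ferrum's actual code.

```ruby
# Illustrative values -- Ferrum's real INTERMITTENT_* constants may differ.
INTERMITTENT_ATTEMPTS = 6
INTERMITTENT_SLEEP = 0.1

class NodeNotFoundError < StandardError; end

# Simplified stand-in for Utils::Attempt.with_retry: retry the block on the
# given errors, sleeping between attempts, up to `max` attempts total.
def with_retry(errors:, max:, wait:)
  attempts = 0
  begin
    yield
  rescue *errors
    attempts += 1
    raise if attempts >= max
    sleep wait
    retry
  end
end

# Every stale node burns roughly (max - 1) * wait seconds before the error
# finally propagates -- here ~0.5 s per node, so 5 stale tds cost ~2.5 s.
stale_nodes = Array.new(5) # e.g. five <td> elements replaced by the AJAX refresh
started = Time.now
stale_nodes.each do |_node|
  with_retry(errors: [NodeNotFoundError],
             max: INTERMITTENT_ATTEMPTS, wait: INTERMITTENT_SLEEP) do
    # The node id is stale, so Chrome reports it as missing on every attempt.
    raise NodeNotFoundError, "Could not find node with given id"
  end
rescue NodeNotFoundError
  # Capybara simply moves on to the next candidate node.
end
elapsed = Time.now - started
puts format("Queried %d stale nodes in %.2fs", stale_nodes.size, elapsed)
```

With a larger table (dozens of rows), the same arithmetic easily exceeds Capybara's default wait time, which matches the sporadic failures described above.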

In my limited understanding, retrying upon a NodeNotFoundError on a known node seems unnecessary: Chrome tells us that the node with the given id no longer exists (code 32000, "Could not find node with given id"), so a retry can never fix this.

The PR disables the rescuing and retrying when querying an existing node. With this change, our troublesome spec passes consistently and with a stable runtime.
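The idea behind the change can be sketched as follows. This is a hedged, self-contained simplification, not the actual Ferrum diff: the method body, stub error classes, and constant values are placeholders for illustration only.

```ruby
class NodeNotFoundError < StandardError; end
class NoExecutionContextError < StandardError; end

# Illustrative values -- Ferrum's real INTERMITTENT_* constants may differ.
INTERMITTENT_ATTEMPTS = 6
INTERMITTENT_SLEEP = 0.05

# Simplified stand-in for Utils::Attempt.with_retry (see sketch above).
def with_retry(errors:, max:, wait:)
  attempts = 0
  begin
    yield
  rescue *errors
    attempts += 1
    raise if attempts >= max
    sleep wait
    retry
  end
end

# on_node is the already-resolved node, if any. A NodeNotFoundError on a
# known node is final -- the node was removed from the DOM -- so retrying
# only burns time; fail fast instead. Fresh lookups keep the retry, since
# a transient NoExecutionContextError there can genuinely recover.
def call(on_node: nil, &block)
  return block.call if on_node # no rescue & retry for existing nodes

  with_retry(errors: [NodeNotFoundError, NoExecutionContextError],
             max: INTERMITTENT_ATTEMPTS, wait: INTERMITTENT_SLEEP, &block)
end
```

With this shape, a query against a replaced node raises immediately instead of sleeping through INTERMITTENT_ATTEMPTS rounds, which is what stabilizes the spec's runtime.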

Please let me know if I've overlooked any potential side effects or if there are additional changes needed, such as more tests.

…odes

Fixes intermittent failures and improves performance in tests with dynamic content by disabling rescue and retry in Ferrum::Frame::Runtime#call
@stevschmid (Author)

A side note: With the change above, I encountered some flakiness in a few specs using an older version of Turbo. I found some reports about intermittently failing specs with Capybara using Turbo. It turns out the with_retry had the unexpected benefit of masking this issue (since it introduces a small "sleep" after a node has been removed). Luckily the issue has been resolved upstream: upgrading Turbo from 7.1.0 to the latest version, 7.3.0, fixed the flakiness.
