@Frank-Jie Frank-Jie commented Oct 16, 2025

…ental navigation

Summary

  1. Simulate realistic mouse movement by dynamically generating smooth, variable-speed cursor trajectories.
  2. Remove mouse click actions when simulate_user=True, because they may trigger page navigation.

A mouse click at (100, 100) may trigger page navigation, which can raise an exception when process_iframes=True
(at page.query_selector_all("iframe")).

Summary by CodeRabbit

  • New Features
    • Improved crawler user-simulation: mouse movements now follow smooth, curved, randomized trajectories with slight timing variations and viewport-aware start/end points, producing more natural, human-like interaction patterns instead of straight-line motions.


coderabbitai bot commented Oct 16, 2025

Walkthrough

Replaces straight-line mouse moves in the async crawler strategy with randomized quadratic Bezier trajectories: viewport-aware start/end points, a randomized control point, stepped movement along the curve, and small random delays between steps. Overall user-simulation flow is preserved.

Changes

Cohort / File(s): Enhanced mouse simulation with Bezier curves (crawl4ai/async_crawler_strategy.py)
Summary: Added random import and refactored mouse interaction logic to compute randomized quadratic Bezier control points, move the cursor along the curved path in multiple steps, and introduce tiny randomized delays between steps to simulate human-like motion.

Sequence Diagram(s)

sequenceDiagram
  participant Crawler
  participant Strategy as AsyncCrawlerStrategy
  participant Browser
  participant Mouse

  rect #F0F9FF
    Crawler->>Strategy: trigger user-simulation
    Strategy->>Browser: get viewport size (width,height)
    note right of Strategy #F6FFF0: choose random start/end\nwithin viewport and control offsets
    Strategy->>Strategy: compute quadratic Bezier control point
    Strategy->>Mouse: for each step -> move to next (x,y)
    Mouse-->>Browser: dispatch move events (with tiny random delays)
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hop and draw a curving trail,
From corner start to soft-detail,
A gentle bend, small pauses too—
The crawler moves as rabbits do. 🥕✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name: Description Check
Status: ⚠️ Warning
Explanation: The pull request description is largely incomplete and does not follow the required template structure. While a "Summary" section is present, it lacks proper issue linking (no "Fixes #" reference) and the explanation is somewhat vague. The description is missing the "List of files changed and why" section (only crawl4ai/async_crawler_strategy.py was modified but is not explicitly documented), the "How Has This Been Tested?" section, and the entire checklist with the six required items. The description does convey the general intent of the changes but fails to meet the repository's documentation standards.
Resolution: Complete the pull request description by adding the missing sections: explicitly list which files were changed and why (e.g., crawl4ai/async_crawler_strategy.py - to implement realistic mouse movement simulation and remove click events), include a "How Has This Been Tested?" section describing the testing performed, and add the full checklist with all six items. Additionally, consider linking any related GitHub issues using the "Fixes #123" format if applicable. This will ensure the description meets the repository's template requirements and provides reviewers with complete context.
✅ Passed checks (2 passed)
Check name: Title Check
Status: ✅ Passed
Explanation: The pull request title "Simulate realistic mouse movement; remove click events to avoid accid..." (truncated at 70 characters) clearly and specifically summarizes the two main changes in this pull request: the implementation of realistic mouse movement via curved Bezier trajectories and the removal of mouse click actions. The title is directly related to the changeset as evidenced by both the raw summary showing the addition of curved mouse path simulation and the PR objectives confirming the removal of click events to prevent accidental navigation. The phrasing is concise and specific enough for a teammate scanning the repository history to understand the primary improvements without being vague or generic.

Check name: Docstring Coverage
Status: ✅ Passed
Explanation: Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
crawl4ai/async_crawler_strategy.py (1)

928-930: Consider slower delays for more human-like movement.

The current delay range of 0.001-0.003 seconds (1-3ms) between steps results in very fast mouse movements (15-75ms total for 15-25 steps). Typical human cursor movements take 300-1000ms. While fast movements might be acceptable for anti-detection purposes, consider whether a slightly longer delay range (e.g., 0.005-0.015 seconds) would better simulate natural human behavior without compromising the goal of avoiding detection.

If you'd like to adjust the timing, consider:

                 await page.mouse.move(x, y)
                 # Random small delay between movements to simulate human behavior
-                await asyncio.sleep(random.uniform(0.001, 0.003))
+                await asyncio.sleep(random.uniform(0.005, 0.015))
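
The timing figures in this comment follow from multiplying the step count by the per-step delay. A small sketch of that arithmetic is shown below. The helper name `total_duration_ms` is hypothetical, and the calculation deliberately ignores the time spent inside each page.mouse.move call.

```python
from typing import Tuple


def total_duration_ms(steps_range: Tuple[int, int],
                      delay_range_s: Tuple[float, float]) -> Tuple[float, float]:
    """Return (min, max) total sleep time in milliseconds for one movement."""
    fastest = steps_range[0] * delay_range_s[0] * 1000
    slowest = steps_range[1] * delay_range_s[1] * 1000
    return fastest, slowest


# Current range: 15-25 steps at 1-3 ms each.
current = total_duration_ms((15, 25), (0.001, 0.003))
# Suggested range: 15-25 steps at 5-15 ms each.
suggested = total_duration_ms((15, 25), (0.005, 0.015))
print(f"current: {current[0]:.0f}-{current[1]:.0f} ms")
print(f"suggested: {suggested[0]:.0f}-{suggested[1]:.0f} ms")
```

Under the suggested range the total comes to roughly 75-375 ms, which only partly overlaps the 300-1000 ms window the comment cites for human cursor movements.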
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e651e04 and df1fb26.

📒 Files selected for processing (1)
  • crawl4ai/async_crawler_strategy.py (2 hunks)
🧰 Additional context used
🪛 Ruff (0.14.0)
crawl4ai/async_crawler_strategy.py

Lines 908, 909, 910, 911, 915, 916, 919, 930: Standard pseudo-random generators are not suitable for cryptographic purposes (S311)

🔇 Additional comments (1)
crawl4ai/async_crawler_strategy.py (1)

6-6: LGTM! Import is appropriate for the use case.

The random module is correctly used here for simulating human-like mouse movements. The static analysis warnings (S311) about cryptographic suitability are false positives in this context, as the randomness is for UI simulation, not security purposes.
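
If the repeated S311 warnings are noisy, one option is to suppress them inline with Ruff's `# noqa` comment where the randomness is clearly non-security. The sketch below is illustrative only: `step_delay` and `session_token` are hypothetical names that do not appear in the PR, and the second function exists just to contrast with a case where the `secrets` module would be the right choice.

```python
import random
import secrets


def step_delay(low: float = 0.001, high: float = 0.003) -> float:
    """Timing jitter for UI simulation; predictability is not a security risk."""
    return random.uniform(low, high)  # noqa: S311


def session_token() -> str:
    """Security-sensitive value: use the secrets module, not random."""
    return secrets.token_hex(16)
```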

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
crawl4ai/async_crawler_strategy.py (1)

906-914: Good fix: viewport_size None guard added.

Prevents TypeError and resolves the earlier review concern. Looks solid.
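
For context, a guard of this kind might look like the sketch below. The actual fix in the PR is not shown in this conversation, so the helper name `resolve_viewport` and the 1280x720 fallback are assumptions. The sketch relies only on Playwright's `page.viewport_size` being either None or a mapping with `width` and `height` keys.

```python
from typing import Mapping, Optional, Tuple

# Assumption: fallback dimensions used when no viewport is configured.
DEFAULT_VIEWPORT: Tuple[int, int] = (1280, 720)


def resolve_viewport(viewport_size: Optional[Mapping[str, int]],
                     default: Tuple[int, int] = DEFAULT_VIEWPORT) -> Tuple[int, int]:
    """Return (width, height), falling back to a default when the size is None."""
    if not viewport_size:
        # Subscripting None would raise TypeError, so bail out early.
        return default
    return viewport_size["width"], viewport_size["height"]


# Usage with a Playwright page would be: resolve_viewport(page.viewport_size)
print(resolve_viewport(None))
print(resolve_viewport({"width": 800, "height": 600}))
```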

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between df1fb26 and cb7421d.

📒 Files selected for processing (1)
  • crawl4ai/async_crawler_strategy.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
crawl4ai/async_crawler_strategy.py (1)
tests/async/test_dispatchers.py (1)
  • browser_config (17-18)
🪛 Ruff (0.14.0)
crawl4ai/async_crawler_strategy.py

Lines 918, 919, 920, 921, 925, 926, 929, 940: Standard pseudo-random generators are not suitable for cryptographic purposes (S311)
