Conversation


@AkosLukacs AkosLukacs commented Sep 17, 2025

Summary

There is an error in the docstring of AsyncWebCrawler.arun: the parameter is called config, not crawler_config.

List of files changed and why

crawl4ai/async_webcrawler.py - see summary

How Has This Been Tested?

This is a docstring change, I assume it will work :)
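For a docstring-only change, a lighter-weight check than assuming is a short script that asserts every name listed under `Args:` actually exists in the function's signature. This is an illustrative sketch, not project code: `arun_stub` is a hypothetical stand-in for `AsyncWebCrawler.arun`, and `documented_args` is a minimal Google-style-docstring parser written for this example.

```python
import inspect

async def arun_stub(url, config=None, **kwargs):
    """Crawl a single URL.

    Args:
        url: The URL to crawl
        config: Configuration object controlling crawl behavior
    """

def documented_args(func):
    """Collect parameter names listed under 'Args:' in a Google-style docstring."""
    doc = inspect.getdoc(func) or ""
    names, in_args = [], False
    for line in doc.splitlines():
        if line.strip() == "Args:":
            in_args = True
        elif in_args:
            if not line.strip():  # a blank line ends the Args section
                break
            # "config (Type): description" -> "config"
            names.append(line.strip().split(":", 1)[0].split(" ")[0])
    return names

params = set(inspect.signature(arun_stub).parameters)
assert all(name in params for name in documented_args(arun_stub))
```

With the pre-fix docstring (which listed `crawler_config`), the final assertion would fail, catching exactly this class of drift.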

Summary by CodeRabbit

  • Refactor
    • Aligned internal crawler invocation to use a standardized configuration parameter for consistency; no behavior or API changes.
  • Documentation
    • Updated docstrings to reflect current configuration parameter naming and usage.
  • Style
    • Minor formatting cleanup (newline at end of file).

No user-facing changes or breaking changes in this release.

The parameter is called `config`, not `crawler_config`.
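The stale docstring matters because a reader who copies it gets a `TypeError` at call time. A self-contained sketch of that failure mode, using a hypothetical `arun` stub that takes the same keyword as the real method:

```python
import asyncio

async def arun(url, config=None):
    """Stub mirroring the real signature: the keyword is `config`."""
    return f"crawled {url}"

async def main():
    # The documented (correct) keyword works:
    ok = await arun(url="https://example.com", config={"cache_mode": "bypass"})
    # The old docstring's keyword does not:
    try:
        await arun(url="https://example.com", crawler_config={})
    except TypeError as exc:
        return ok, str(exc)

ok, err = asyncio.run(main())
# err names the unexpected keyword 'crawler_config'
```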
Contributor

coderabbitai bot commented Sep 17, 2025

Walkthrough

Updated an internal call to pass a config parameter to arun, adjusted the docstring to reflect the parameter name, and made a newline-at-EOF formatting change. No public API signatures were changed.

Changes

Cohort / File(s) Summary
Async crawler invocation & docs
crawl4ai/async_webcrawler.py
Replaced arun call argument from crawler_config to config; updated docstring Args section to match; no behavioral change; minor EOF newline formatting adjustment.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I thump my foot at configs past,
Now “config” hops in, clean and fast.
No APIs drift, no fields astray,
Just tidy trails along the way.
Carrot commit, nibble-sized and neat—
Code fields trimmed, a perfect treat. 🥕🐇

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The pull request title succinctly and accurately describes the primary change: correcting the parameter name in AsyncWebCrawler.arun's docstring. It is specific, focused, and appropriate for a teammate scanning the commit history.
Description Check ✅ Passed The PR description includes the required Summary and List of files changed sections and correctly states that only a docstring parameter name in crawl4ai/async_webcrawler.py was updated, so it aligns with the template's main structure. The "How Has This Been Tested?" section is present but insufficiently detailed ("I assume it will work") and the repository Checklist from the template is not included. Because this is a non-functional documentation fix the description is otherwise acceptable, but it should document verification steps and include the checklist.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Up to 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
crawl4ai/async_webcrawler.py (2)

230-237: Docstring nit: annotate param type and correct return type.

arun returns a CrawlResultContainer, not CrawlResult. Also add the param type for clarity.

Apply:

-            config: Configuration object controlling crawl behavior
+            config (CrawlerRunConfig): Controls crawl behavior (cache, UA, filters, extraction, etc.)
@@
-        Returns:
-            CrawlResult: The result of crawling and processing
+        Returns:
+            CrawlResultContainer: Container with the crawl result and related metadata.
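Applied to the signature, the suggestion yields a Google-style docstring with a typed `Args:` entry and the corrected return type. A self-contained sketch (the `CrawlerRunConfig` and `CrawlResultContainer` classes are stubbed here only so the snippet runs on its own; the real ones live in crawl4ai):

```python
from typing import Optional

class CrawlerRunConfig:        # stub of the crawl4ai config class
    pass

class CrawlResultContainer:    # stub of the crawl4ai result container
    pass

async def arun(url: str, config: Optional[CrawlerRunConfig] = None) -> CrawlResultContainer:
    """Crawl a single URL.

    Args:
        url: The URL to crawl.
        config (CrawlerRunConfig): Controls crawl behavior (cache, UA,
            filters, extraction, etc.).

    Returns:
        CrawlResultContainer: Container with the crawl result and related
        metadata.
    """
    return CrawlResultContainer()
```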

852-852: Use TypeError for invalid type and tighten message.

Aligns with Ruff TRY004/TRY003 and clarifies accepted types (code supports list/tuple).

Apply:

-            raise ValueError("`domain_or_domains` must be a string or a list of strings.")
+            raise TypeError(
+                f"`domain_or_domains` must be str or list/tuple[str]; got {type(domain_or_domains).__name__}."
+            )
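The pattern the reviewer suggests, in runnable form: raise `TypeError` (not `ValueError`) when an argument has the wrong type, and name the offending type in the message. `normalize_domains` is a hypothetical helper written for this example, not crawl4ai code:

```python
def normalize_domains(domain_or_domains):
    """Accept a domain string or a list/tuple of them; reject anything else."""
    if isinstance(domain_or_domains, str):
        return [domain_or_domains]
    if isinstance(domain_or_domains, (list, tuple)):
        return list(domain_or_domains)
    raise TypeError(
        f"`domain_or_domains` must be str or list/tuple[str]; "
        f"got {type(domain_or_domains).__name__}."
    )
```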
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e651e04 and f25d208.

📒 Files selected for processing (1)
  • crawl4ai/async_webcrawler.py (2 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
crawl4ai/async_webcrawler.py

852-852: Prefer TypeError exception for invalid type

(TRY004)


852-852: Avoid specifying long messages outside the exception class

(TRY003)
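TRY003 flags long messages built at the raise site; the usual remedy is to move the message into a dedicated exception class so the raise stays short. A sketch under that convention (names here are illustrative, not from crawl4ai):

```python
class InvalidDomainsError(TypeError):
    """Carries the long message so the raise site stays terse (Ruff TRY003)."""

    def __init__(self, value):
        super().__init__(
            f"`domain_or_domains` must be str or list/tuple[str]; "
            f"got {type(value).__name__}."
        )

def check_domains(value):
    if not isinstance(value, (str, list, tuple)):
        raise InvalidDomainsError(value)  # short raise; message lives in the class
    return value
```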

...
)
-result = await crawler.arun(url="https://example.com", crawler_config=config)
+result = await crawler.arun(url="https://example.com", config=config)

🛠️ Refactor suggestion

🧩 Analysis chain

Docstring example fix matches the API. LGTM.

The example now correctly uses config=config and aligns with the arun signature.

Run this to catch any remaining references to the old name in docs/comments:


🏁 Script executed:

#!/bin/bash
rg -n --hidden -S -C2 '\bcrawler_config\b'

Length of output: 165130


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "1) All occurrences of the token 'crawler_config=' (kwarg or assignment):"
rg -n --hidden -S -C2 '\bcrawler_config\s*=' || true

echo
echo "2) Calls using 'crawler_config=' as a keyword argument (arun / arun_many / crawl / client.crawl):"
rg -n --hidden -S -C2 -P 'arun\([^)]*\bcrawler_config\s*=' || true
rg -n --hidden -S -C2 -P 'arun_many\([^)]*\bcrawler_config\s*=' || true
rg -n --hidden -S -C2 -P '\bcrawl\([^)]*\bcrawler_config\s*=' || true

echo
echo "3) JSON/dict keys named 'crawler_config':"
rg -n --hidden -S -C2 '"?crawler_config"?:' || true

Length of output: 114100


LGTM — docstring example updated; fix remaining docs using legacy 'crawler_config' kwarg

crawl4ai/async_webcrawler.py example matches arun(config=...). Update remaining examples that pass crawler_config= to use config= — notably deploy/docker/c4ai-code-context.md:1578 (change crawler.arun(..., crawler_config=config) → crawler.arun(..., config=config)).
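A sketch of that repo-wide cleanup, demonstrated on a throwaway file so it is self-contained. On a real checkout, point `grep` at the repo root and review the resulting diff before committing: `sed` knows nothing about Python syntax, and `sed -i` here is the GNU form (BSD/macOS sed needs `-i ''`).

```shell
mkdir -p /tmp/c4ai-docs-demo
printf 'result = await crawler.arun(url="https://example.com", crawler_config=config)\n' \
  > /tmp/c4ai-docs-demo/c4ai-code-context.md
# Rewrite only the keyword-argument form of the legacy name.
grep -rl 'crawler_config=' /tmp/c4ai-docs-demo \
  | xargs -r sed -i 's/crawler_config=/config=/g'
cat /tmp/c4ai-docs-demo/c4ai-code-context.md
```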

🤖 Prompt for AI Agents
In crawl4ai/async_webcrawler.py around line 228, the example call uses the new
keyword name config (result = await crawler.arun(url="https://example.com",
config=config)); update any remaining examples that still use the legacy
crawler_config keyword to instead pass config, e.g., change crawler.arun(...,
crawler_config=config) → crawler.arun(..., config=config) in
deploy/docker/c4ai-code-context.md at or near line 1578 and any other docs or
examples across the repo so all usages are consistent.
