Docs: Wrong parameter name in AsyncWebCrawler.arun docstring #1494
The parameter is called `config`, not `crawler_config`.
Walkthrough
Updated an internal call to pass a config parameter to arun, adjusted the docstring to reflect the parameter name, and made a newline-at-EOF formatting change. No public API signatures were changed.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 1
🧹 Nitpick comments (2)
crawl4ai/async_webcrawler.py (2)
230-237: Docstring nit: annotate param type and correct return type.
arun returns a CrawlResultContainer, not CrawlResult. Also add the param type for clarity.
Apply:
- config: Configuration object controlling crawl behavior
+ config (CrawlerRunConfig): Controls crawl behavior (cache, UA, filters, extraction, etc.)
@@
- Returns:
-     CrawlResult: The result of crawling and processing
+ Returns:
+     CrawlResultContainer: Container with the crawl result and related metadata.
852-852: Use TypeError for invalid type and tighten message.
Aligns with Ruff TRY004/TRY003 and clarifies accepted types (code supports list/tuple).
Apply:
- raise ValueError("`domain_or_domains` must be a string or a list of strings.")
+ raise TypeError(
+     f"`domain_or_domains` must be str or list/tuple[str]; got {type(domain_or_domains).__name__}."
+ )
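To illustrate what this suggestion implies, here is a minimal sketch of type validation that raises TypeError. The function name `normalize_domains` and its return shape are hypothetical and not taken from crawl4ai; only the parameter name `domain_or_domains` and the error message come from the suggestion above.

```python
def normalize_domains(domain_or_domains):
    """Hypothetical helper: return the input as a list of domain strings.

    Raises TypeError (not ValueError) when the argument has the wrong type,
    which is the convention Ruff's TRY004 rule checks for.
    """
    # A single string is wrapped in a list.
    if isinstance(domain_or_domains, str):
        return [domain_or_domains]

    # Lists and tuples are accepted only if every element is a string.
    if isinstance(domain_or_domains, (list, tuple)) and all(
        isinstance(d, str) for d in domain_or_domains
    ):
        return list(domain_or_domains)

    raise TypeError(
        f"`domain_or_domains` must be str or list/tuple[str]; "
        f"got {type(domain_or_domains).__name__}."
    )
```

Callers that previously caught ValueError for this case would need to catch TypeError instead, so this kind of change may be worth noting for reviewers.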
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
crawl4ai/async_webcrawler.py (2 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
crawl4ai/async_webcrawler.py
852-852: Prefer TypeError exception for invalid type
(TRY004)
852-852: Avoid specifying long messages outside the exception class
(TRY003)
...
)
- result = await crawler.arun(url="https://example.com", crawler_config=config)
+ result = await crawler.arun(url="https://example.com", config=config)
🛠️ Refactor suggestion
🧩 Analysis chain
Docstring example fix matches the API. LGTM.
The example now correctly uses config=config and aligns with the arun signature.
Run this to catch any remaining references to the old name in docs/comments:
🏁 Script executed:
#!/bin/bash
rg -n --hidden -S -C2 '\bcrawler_config\b'
Length of output: 165130
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "1) All occurrences of the token 'crawler_config=' (kwarg or assignment):"
rg -n --hidden -S -C2 '\bcrawler_config\s*=' || true
echo
echo "2) Calls using 'crawler_config=' as a keyword argument (arun / arun_many / crawl / client.crawl):"
rg -n --hidden -S -C2 -P 'arun\([^)]*\bcrawler_config\s*=' || true
rg -n --hidden -S -C2 -P 'arun_many\([^)]*\bcrawler_config\s*=' || true
rg -n --hidden -S -C2 -P '\bcrawl\([^)]*\bcrawler_config\s*=' || true
echo
echo "3) JSON/dict keys named 'crawler_config':"
rg -n --hidden -S -C2 '"?crawler_config"?:' || true
Length of output: 114100
LGTM — docstring example updated; fix remaining docs using legacy 'crawler_config' kwarg
crawl4ai/async_webcrawler.py example matches arun(config=...). Update remaining examples that pass crawler_config= to use config= — notably deploy/docker/c4ai-code-context.md:1578 (change crawler.arun(..., crawler_config=config) → crawler.arun(..., config=config)).
🤖 Prompt for AI Agents
In crawl4ai/async_webcrawler.py around line 228, the example call uses the new
keyword name config (result = await crawler.arun(url="https://example.com",
config=config)); update any remaining examples that still use the legacy
crawler_config keyword to instead pass config, e.g., change crawler.arun(...,
crawler_config=config) → crawler.arun(..., config=config) in
deploy/docker/c4ai-code-context.md at or near line 1578 and any other docs or
examples across the repo so all usages are consistent.
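For anyone following up on the remaining occurrences, the search can also be sketched in Python rather than rg. This is an illustration only: the helper names `find_legacy_calls` and `fix_legacy_calls` are hypothetical, and the pattern is a simplification that does not handle nested parentheses before the keyword.

```python
import re

# Matches arun(...) or arun_many(...) calls that pass crawler_config= as a keyword.
# Simplification: [^)]* stops at the first ")", so nested calls before the
# keyword are not detected.
LEGACY_KWARG = re.compile(r"(\barun(?:_many)?\([^)]*\b)crawler_config(\s*=)")


def find_legacy_calls(text):
    """Return (line_number, line) pairs that still use the legacy keyword."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if LEGACY_KWARG.search(line)
    ]


def fix_legacy_calls(text):
    """Rewrite crawler_config= to config= inside matching calls."""
    return LEGACY_KWARG.sub(r"\1config\2", text)


if __name__ == "__main__":
    sample = (
        'result = await crawler.arun(url="https://example.com", crawler_config=config)\n'
        'result = await crawler.arun(url="https://example.com", config=config)\n'
    )
    for number, line in find_legacy_calls(sample):
        print(f"{number}: {line}")
    print(fix_legacy_calls(sample), end="")
```

Because the earlier search output shows `crawler_config` also appears as a JSON/dict key, the pattern deliberately targets only keyword arguments inside arun and arun_many calls and leaves other uses of the name alone.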
Summary
There is an error in the docstring of AsyncWebCrawler.arun: the parameter is called `config`, not `crawler_config`.
List of files changed and why
crawl4ai/async_webcrawler.py - see summary
How Has This Been Tested?
This is a docstring change, I assume it will work :)
Summary by CodeRabbit
No user-facing changes or breaking changes in this release.