feat: Added a way to pass both BrowserConfig and CrawlerConfig through /crawler api endpoint #1266
Conversation
""" WalkthroughThe changes add new Pydantic models Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Server
    participant Subprocess
    Client->>Server: GET /scripts
    Server->>Server: List files in /app/scripts
    Server-->>Client: Return list of script filenames
    Client->>Server: POST /scripts/run (script_name)
    Server->>Server: Validate script_name and existence
    Server->>Subprocess: Run script asynchronously
    Subprocess-->>Server: Return stdout/stderr
    alt Output or error present
        Server-->>Client: HTTP 400 with output/error
    else No output or error
        Server-->>Client: HTTP 200 with output
    end
```
Actionable comments posted: 2
🧹 Nitpick comments (2)
deploy/docker/schemas.py (1)
7-9: Fix formatting issue: add a blank line before the class definition.

Static analysis correctly identified a PEP 8 formatting issue.

```diff
+
 class CrawlConfigs(BaseModel):
     browser_config: Optional[Dict] = Field(default_factory=dict)
     crawler_config: Optional[Dict] = Field(default_factory=dict)
```
deploy/docker/server.py (1)
441-446: Consider extracting config extraction logic to reduce duplication.

The same config extraction logic is duplicated between both endpoints. Consider creating a helper function to improve maintainability.

```diff
+def extract_crawl_configs(crawl_request: CrawlRequest) -> tuple[dict, dict]:
+    """Extract browser and crawler configs from request."""
+    if crawl_request.configs:
+        return crawl_request.configs.browser_config, crawl_request.configs.crawler_config
+    return {}, {}
```

Then use it in both endpoints:

```diff
-    browser_config = {}
-    crawler_config = {}
-    if crawl_request.configs:
-        browser_config = crawl_request.configs.browser_config if crawl_request.configs.browser_config else {}
-        crawler_config = crawl_request.configs.crawler_config if crawl_request.configs.crawler_config else {}
+    browser_config, crawler_config = extract_crawl_configs(crawl_request)
```

Also applies to: 466-471
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- deploy/docker/schemas.py (1 hunks)
- deploy/docker/server.py (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
deploy/docker/server.py (1)
deploy/docker/api.py (2)
- handle_crawl_request (393-476)
- handle_stream_crawl_request (478-527)
🪛 Flake8 (7.2.0)
deploy/docker/schemas.py
[error] 11-11: expected 2 blank lines, found 1
(E302)
🔇 Additional comments (1)
deploy/docker/schemas.py (1)
11-13: LGTM! Well-structured schema design.

The nested configuration approach allows flexible API usage while maintaining backward compatibility. The use of `default_factory=dict` for mutable defaults is appropriate.
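For illustration only, a minimal sketch (outside this PR, assuming Pydantic v2; the `"headless"` key is just a placeholder) of why `default_factory` is the right choice for mutable defaults:

```python
from typing import Dict, Optional
from pydantic import BaseModel, Field


class CrawlConfigs(BaseModel):
    # default_factory builds a fresh dict per instance, so instances
    # never share (and never accidentally mutate) a single default object
    browser_config: Optional[Dict] = Field(default_factory=dict)
    crawler_config: Optional[Dict] = Field(default_factory=dict)


a, b = CrawlConfigs(), CrawlConfigs()
a.browser_config["headless"] = True  # placeholder key for demonstration
print(b.browser_config)  # {} -- each instance got its own dict
```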
Actionable comments posted: 1
🧹 Nitpick comments (1)
deploy/docker/schemas.py (1)
7-7: Fix formatting: remove extra blank lines.

The static analysis tool flagged too many blank lines before the class definition.

```diff
-
-
+
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- deploy/docker/schemas.py (1 hunks)
- deploy/docker/server.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- deploy/docker/server.py
🧰 Additional context used
🪛 Flake8 (7.2.0)
deploy/docker/schemas.py
[error] 8-8: too many blank lines (3)
(E303)
[error] 12-12: expected 2 blank lines, found 1
(E302)
🔇 Additional comments (1)
deploy/docker/schemas.py (1)
8-10: Well-designed configuration grouping.

The `CrawlConfigs` class effectively groups related configuration options and uses proper Pydantic patterns:

- `Field(default_factory=dict)` correctly handles mutable defaults
- Optional typing allows flexibility in providing configurations
- Clear separation of browser vs crawler configurations
```python
class CrawlRequest(BaseModel):
    urls: List[str] = Field(min_length=1, max_length=100)
    configs: Optional[CrawlConfigs] = None
```
💡 Verification agent
🧩 Analysis chain
Fix formatting and verify backward compatibility.
The modification to `CrawlRequest` looks good functionally, but there's a formatting issue and a potential compatibility concern.

```diff
-class CrawlRequest(BaseModel):
+
+class CrawlRequest(BaseModel):
```

Please verify that existing clients can still use the API without providing the `configs` field, and that the server properly handles requests with the old structure if any exist:
🏁 Script executed:

```shell
#!/bin/bash
# Description: Check how CrawlRequest is used in server endpoints to ensure backward compatibility

# Search for CrawlRequest usage in server files
rg -A 10 -B 5 "CrawlRequest" --type py

# Look for direct access to browser_config or crawler_config fields that might break
ast-grep --pattern 'request.browser_config'
ast-grep --pattern 'request.crawler_config'
```
Length of output: 4032
Default `configs` to maintain backward compatibility

The server code assumes `crawl_request.configs` is always a `CrawlConfigs` instance; if a client omits `configs`, accessing `.browser_config` or `.crawler_config` will raise an error. To fix:

• In deploy/docker/schemas.py, replace the optional `configs` field with a default factory so that missing configs still yield an empty `CrawlConfigs`.
• You can drop the `Optional` import once this is updated.
```diff
--- deploy/docker/schemas.py
@@
-class CrawlRequest(BaseModel):
-    urls: List[str] = Field(min_length=1, max_length=100)
-    configs: Optional[CrawlConfigs] = None
+class CrawlRequest(BaseModel):
+    urls: List[str] = Field(min_length=1, max_length=100)
+    configs: CrawlConfigs = Field(default_factory=CrawlConfigs)
```

No changes are needed in server.py once `configs` always defaults to an empty model.
Ensure you keep two blank lines between top-level classes per PEP8 for consistent formatting.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
class CrawlRequest(BaseModel):
    urls: List[str] = Field(min_length=1, max_length=100)
    configs: CrawlConfigs = Field(default_factory=CrawlConfigs)
```
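For illustration only, a minimal sketch (assuming Pydantic v2, mirroring the suggestion above) showing that a request which omits `configs` still validates and yields an empty `CrawlConfigs`:

```python
from typing import Dict, List, Optional
from pydantic import BaseModel, Field


class CrawlConfigs(BaseModel):
    browser_config: Optional[Dict] = Field(default_factory=dict)
    crawler_config: Optional[Dict] = Field(default_factory=dict)


class CrawlRequest(BaseModel):
    urls: List[str] = Field(min_length=1, max_length=100)
    configs: CrawlConfigs = Field(default_factory=CrawlConfigs)


# An old-style body without "configs" still validates...
req = CrawlRequest(urls=["https://example.com"])
# ...and the server can read both configs without None checks.
print(req.configs.browser_config, req.configs.crawler_config)  # {} {}
```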
🧰 Tools
🪛 Flake8 (7.2.0)
[error] 12-12: expected 2 blank lines, found 1
(E302)
🤖 Prompt for AI Agents
In deploy/docker/schemas.py around lines 12 to 14, the configs field in
CrawlRequest is optional but the server expects it always to be a CrawlConfigs
instance, causing errors if omitted. Fix this by removing Optional from configs
and setting its default using a default_factory that returns an empty
CrawlConfigs instance. Also, remove the now unused Optional import and ensure
there are two blank lines between top-level classes to comply with PEP8
formatting.
I have to change tests as well.
Actionable comments posted: 5
♻️ Duplicate comments (2)
deploy/docker/server.py (2)
443-444: Simplify the conditional logic for config extraction.

The current logic has redundant conditional checks since the schema should already provide defaults.

```diff
-        browser_config = crawl_request.configs.browser_config if crawl_request.configs.browser_config else {}
-        crawler_config = crawl_request.configs.crawler_config if crawl_request.configs.crawler_config else {}
+        browser_config = crawl_request.configs.browser_config if crawl_request.configs else {}
+        crawler_config = crawl_request.configs.crawler_config if crawl_request.configs else {}
```
465-466: Simplify the conditional logic for config extraction.

Same redundant conditional checks as in the /crawl endpoint.

```diff
-        browser_config = crawl_request.configs.browser_config if crawl_request.configs.browser_config else {}
-        crawler_config = crawl_request.configs.crawler_config if crawl_request.configs.crawler_config else {}
+        browser_config = crawl_request.configs.browser_config if crawl_request.configs else {}
+        crawler_config = crawl_request.configs.crawler_config if crawl_request.configs else {}
```
🧹 Nitpick comments (4)
docker-compose.yml (2)
42-43: CPU limit seems restrictive for script execution.

The CPU limit of 0.1 cores may be insufficient for script execution, especially if scripts perform intensive operations. Consider adjusting based on expected script workload.

49-49: Fix the missing newline at end of file.

```diff
+
```
deploy/docker/server.py (2)
19-19: Remove duplicate import.

The `os` module is imported twice (lines 19 and 40), which is redundant.

```diff
-import os
```

40-40: Clean up import statement.

This line redefines the `os` import and also imports `subprocess`. Remove the `os` part since it's already imported.

```diff
-import os, subprocess
+import subprocess
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- deploy/docker/schemas.py (2 hunks)
- deploy/docker/server.py (5 hunks)
- docker-compose.yml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- deploy/docker/schemas.py
🧰 Additional context used
🪛 YAMLlint (1.37.1)
docker-compose.yml
[error] 49-49: no new line character at the end of file
(new-line-at-end-of-file)
🪛 Ruff (0.11.9)
deploy/docker/server.py
38-38: `utils.FilterType` imported but unused. Remove unused import: `utils.FilterType` (F401)

40-40: Redefinition of unused `os` from line 19. Remove definition: `os` (F811)

499-499: Do not perform function call `Depends` in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable (B008)

534-534: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
🔇 Additional comments (3)
docker-compose.yml (1)
26-26: Verify the necessity of privileged mode.

The `privileged: true` setting grants the container extensive host system access. Ensure this is required for the script execution functionality, as it significantly reduces container isolation. Consider if specific capabilities would be sufficient instead of full privileged access.
deploy/docker/server.py (2)
499-499: Fix function call in argument defaults.

The `Depends(token_dep)` call in the function signature should be avoided in argument defaults.

```diff
-async def scripts_run(
-    request: Request,
-    body: ScriptRequest,
-    _td: Dict = Depends(token_dep),
-):
+async def scripts_run(
+    request: Request,
+    body: ScriptRequest,
+    _td: Dict = Depends(token_dep),
+):
```

Note: This appears to be a false positive from the static analysis tool, as this is the correct way to use FastAPI dependencies.
Likely an incorrect or invalid review comment.
484-492: Security concern: Directory traversal vulnerability.

The /scripts endpoint lists files without proper validation, which could potentially expose sensitive files if the scripts directory is compromised or misconfigured. Consider adding proper validation and restricting file extensions:

```diff
-    return JSONResponse([file for file in os.listdir(script_folder) if os.path.isfile(os.path.join(script_folder, file))])
+    python_files = [file for file in os.listdir(script_folder)
+                    if file.endswith('.py') and os.path.isfile(os.path.join(script_folder, file))]
+    return JSONResponse(python_files)
```

Likely an incorrect or invalid review comment.
```yaml
    volumes:
      - ~/scripts:/app/scripts:rw
```
🛠️ Refactor suggestion
Review the security implications of the scripts volume mount.
The volume mount `~/scripts:/app/scripts:rw` with read-write permissions allows the container to modify host files. Consider if read-only access would be sufficient for script execution, or implement proper access controls to prevent potential security issues.

Consider using a read-only mount if scripts don't need to be modified:

```diff
-      - ~/scripts:/app/scripts:rw
+      - ~/scripts:/app/scripts:ro
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
    volumes:
      - ~/scripts:/app/scripts:ro
```
🤖 Prompt for AI Agents
In docker-compose.yml at lines 30-31, the volume mount
'~/scripts:/app/scripts:rw' grants read-write access, allowing the container to
modify host files. To improve security, change the mount to read-only by
replacing ':rw' with ':ro' if the container only needs to execute scripts
without modifying them. Alternatively, implement stricter access controls to
limit potential risks.
```python
    except subprocess.CalledProcessError as e:
        # Return the command execution error details
        response_data = {
            "Status": "Failed to execute the script",
            "Details": str(e)
        }
        raise HTTPException(400, response_data)
```
🛠️ Refactor suggestion
Fix exception handling with proper chaining.
The exception handling should preserve the original exception chain.
```diff
-    except subprocess.CalledProcessError as e:
-        # Return the command execution error details
-        response_data = {
-            "Status": "Failed to execute the script",
-            "Details": str(e)
-        }
-        raise HTTPException(400, response_data)
+    except subprocess.CalledProcessError as e:
+        response_data = {
+            "Status": "Failed to execute the script",
+            "Details": str(e)
+        }
+        raise HTTPException(400, response_data) from e
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
    except subprocess.CalledProcessError as e:
        response_data = {
            "Status": "Failed to execute the script",
            "Details": str(e)
        }
        raise HTTPException(400, response_data) from e
```
🧰 Tools
🪛 Ruff (0.11.9)
534-534: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
🤖 Prompt for AI Agents
In deploy/docker/server.py around lines 528 to 534, the exception handling for
subprocess.CalledProcessError currently raises an HTTPException without
preserving the original exception context. Modify the raise statement to include
"from e" to properly chain the original exception, ensuring the traceback and
error context are preserved for debugging.
@app.post("/scripts/run") | ||
@limiter.limit(config["rate_limiting"]["default_limit"]) | ||
async def scripts_run( | ||
request: Request, | ||
body: ScriptRequest, | ||
_td: Dict = Depends(token_dep), | ||
): | ||
scripts_directory = '/app/scripts' # Update this path as needed | ||
|
||
if not body or not body.script_name: | ||
raise HTTPException(400, "You have to provide an script name") | ||
|
||
if body.script_name.lower().endswith('.py'): | ||
raise HTTPException(400, "Provide only the name, without the extension") | ||
|
||
script_name = f'{body.script_name}.py' | ||
script_path = os.path.join(scripts_directory, script_name) | ||
|
||
if not os.path.isfile(script_path): | ||
raise HTTPException(400, "No script found with that name. Check available in /scripts") | ||
|
||
try: | ||
result = await subprocess.run(['python', script_path], check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE) | ||
output = result.stdout.decode('utf-8') | ||
error_output = result.stderr.decode('utf-8') | ||
|
||
if output or error_output: | ||
response_data = { | ||
"Output": f"{output}{error_output}", | ||
"Status": "Script executed with output" | ||
} | ||
raise HTTPException(400, response_data) | ||
|
||
return JSONResponse(output) | ||
except subprocess.CalledProcessError as e: | ||
# Return the command execution error details | ||
response_data = { | ||
"Status": "Failed to execute the script", | ||
"Details": str(e) | ||
} | ||
raise HTTPException(400, response_data) | ||
|
Critical security vulnerability: Arbitrary script execution.
The /scripts/run endpoint executes arbitrary Python files without proper validation or sandboxing. This poses a significant security risk, as it allows execution of any Python code in the scripts directory.
Consider implementing proper security measures:
- Validate script content before execution
- Use a sandboxed environment
- Implement proper input validation
- Add logging for security monitoring
- Consider using a task queue for async execution
```diff
+    # Add proper validation and sandboxing
+    if not re.match(r'^[a-zA-Z0-9_-]+$', body.script_name):
+        raise HTTPException(400, "Invalid script name format")
```
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.11.9)
499-499: Do not perform function call `Depends` in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable (B008)

534-534: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
🤖 Prompt for AI Agents
In deploy/docker/server.py around lines 494 to 535, the /scripts/run endpoint
executes arbitrary Python scripts from the scripts directory without sufficient
validation or sandboxing, creating a critical security risk. To fix this,
implement strict validation to allow only a predefined whitelist of script
names, avoid executing scripts directly from user input, and run scripts in a
sandboxed or isolated environment. Add input validation to reject any unexpected
or malicious input, include logging for all script execution attempts for
security monitoring, and consider offloading script execution to a controlled
async task queue to limit resource abuse and improve security.
"Output": f"{output}{error_output}", | ||
"Status": "Script executed with output" | ||
} | ||
raise HTTPException(400, response_data) |
🛠️ Refactor suggestion
Improve exception handling.
The current exception handling raises HTTPException with output, which may not be the intended behavior. Also, the exception chain should be preserved.
```diff
-            raise HTTPException(400, response_data)
+            return JSONResponse(response_data, status_code=400)
```
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In deploy/docker/server.py at line 525, the HTTPException is raised without
preserving the original exception context, which can obscure the root cause.
Modify the exception handling to include the original exception as the cause by
using the "from" keyword when raising HTTPException, ensuring the exception
chain is preserved for better debugging.
raise HTTPException(400, "No script found with that name. Check available in /scripts") | ||
|
||
try: | ||
result = await subprocess.run(['python', script_path], check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE) |
🛠️ Refactor suggestion
Fix subprocess usage for async context.
Using `subprocess.run` in an async function blocks the event loop. Use `asyncio.create_subprocess_exec` instead.
```diff
-        result = await subprocess.run(['python', script_path], check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+        proc = await asyncio.create_subprocess_exec(
+            'python', script_path,
+            stdout=asyncio.subprocess.PIPE,
+            stderr=asyncio.subprocess.PIPE
+        )
+        stdout, stderr = await proc.communicate()
+        if proc.returncode != 0:
+            raise subprocess.CalledProcessError(proc.returncode, ['python', script_path])
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
        proc = await asyncio.create_subprocess_exec(
            'python', script_path,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE
        )
        stdout, stderr = await proc.communicate()
        if proc.returncode != 0:
            raise subprocess.CalledProcessError(proc.returncode, ['python', script_path])
```
🤖 Prompt for AI Agents
In deploy/docker/server.py at line 516, the use of subprocess.run in an async
function blocks the event loop. Replace subprocess.run with
asyncio.create_subprocess_exec to run the subprocess asynchronously. Adjust the
code to await the subprocess completion and capture stdout and stderr using the
appropriate asyncio subprocess methods.
I'm closing this PR due to new ideas I wish to implement.
Summary
Considering the following scenario:
A page to be crawled may require a prior login. The login information is (usually) stored in cookies, which can be provided in BrowserConfig, and that works like a charm. But if one needs to filter some tags by using excluded_tags, target_elements or css_selector (which are provided in CrawlerRunConfig), that is not possible.
That can be managed through scripting, of course, in a custom file, but not if one is using the crawler through Docker on the default endpoint.
This PR focuses on changing that behaviour: now you can pass both configs in the request body.
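For illustration, a rough sketch of a request body that exercises the new `configs` field (a sketch only: the host/port, authentication, cookie values, and selector values are placeholders, not part of this PR):

```python
import requests

payload = {
    "urls": ["https://example.com/account"],
    "configs": {
        # forwarded as BrowserConfig, e.g. cookies from a prior login (placeholder values)
        "browser_config": {
            "cookies": [{"name": "session", "value": "<token>", "url": "https://example.com"}]
        },
        # forwarded as CrawlerRunConfig, e.g. tag filtering (placeholder selectors)
        "crawler_config": {"excluded_tags": ["nav", "footer"], "css_selector": "main"},
    },
}

# assumed local Docker deployment; adjust host, port, and auth as needed
resp = requests.post("http://localhost:11235/crawl", json=payload, timeout=120)
print(resp.status_code, resp.json())
```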
List of files changed and why
docker/server.py - Changed the /crawl and /crawl/stream API endpoints to handle cases where no .configs is passed in the body, as well as cases where browser_config and/or crawler_config are missing inside configs.
docker/schemas.py - Changed type to fit new schema
How Has This Been Tested?
Tested calling the API using Postman. Cases:
Checklist: