feat: Added a way to pass both BrowserConfig and CrawlerConfig through /crawler api endpoint #1266


Closed
wants to merge 3 commits into from

Conversation


@arturbacilla arturbacilla commented Jul 3, 2025

Summary

Consider the following scenario:
A page to be crawled requires logging in beforehand. The login information is usually stored in cookies, which can be provided in BrowserConfig, and that works like a charm. But if one needs to filter tags using excluded_tags, target_elements, or css_selector (which are provided in CrawlerRunConfig), that is not possible.
This can, of course, be managed through scripting in a custom file, but not when using the crawler through Docker on the default endpoint.
This PR changes that behaviour: both configs can now be passed in the request body, as sketched below.
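For illustration, a request against the updated /crawl endpoint could look like this (a minimal sketch using Python's requests library; the base URL, port, cookie values, and filter options are placeholder assumptions, not taken from this PR):

import requests

# Both configs now travel together in a nested "configs" object.
payload = {
    "urls": ["https://example.com/dashboard"],
    "configs": {
        # BrowserConfig-style options, e.g. cookies from a prior login
        "browser_config": {
            "cookies": [{"name": "session", "value": "<token>", "url": "https://example.com"}]
        },
        # CrawlerRunConfig-style options, e.g. tag filtering
        "crawler_config": {
            "excluded_tags": ["nav", "footer"],
            "css_selector": "main"
        },
    },
}

# Assumes the dockerized server is listening on localhost:11235 (placeholder).
response = requests.post("http://localhost:11235/crawl", json=payload)
print(response.json())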

List of files changed and why

docker/server.py - Updated the /crawl and /crawl/stream API endpoints to handle requests where no configs object is passed in the body, as well as requests missing browser_config and/or crawler_config inside configs.

docker/schemas.py - Changed the request type to fit the new schema.

How Has This Been Tested?

Tested calling the API using Postman. Cases:

  • no configs;
  • configs with only browser_config or crawler_config;
  • configs with both.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added/updated unit tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Summary by CodeRabbit

  • New Features

    • Added endpoints to list and run custom scripts directly through the API.
    • Enabled specifying optional browser and crawler configuration settings when submitting crawl requests.
  • Refactor

    • Improved the structure of crawl requests by grouping configuration options into a dedicated section, simplifying advanced crawling settings management.
  • Chores

    • Updated Docker Compose configuration with enhanced container runtime, networking settings, and script volume mounting for improved local development.

Contributor

coderabbitai bot commented Jul 3, 2025

"""

Walkthrough

The changes add new Pydantic models CrawlConfigs and ScriptRequest to support enhanced crawl configurations and script execution requests. The FastAPI server gains two new endpoints: /scripts to list available scripts and /scripts/run to execute a specified script asynchronously. The Docker Compose setup is updated with refined service runtime settings, volume mounts for scripts, and external network configuration.

Changes

  • deploy/docker/schemas.py - Added CrawlConfigs model with optional browser_config and crawler_config dicts; updated CrawlRequest to include optional configs field; added ScriptRequest model with script_name field.
  • deploy/docker/server.py - Added /scripts GET endpoint to list script files; added /scripts/run POST endpoint to execute scripts asynchronously with validation and error handling; refactored /crawl and /crawl/stream to extract configs from nested configs field; added imports for os and subprocess (see the sketch after this list).
  • docker-compose.yml - Removed version and environment variables from base config; added container runtime options (container_name, privileged, user, hostname), external network attachment, and volume mount for scripts in crawl4ai service; declared external network demo.
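For a concrete feel of the two new endpoints, a hypothetical client interaction might look like this (a sketch only; the host, port, and script name are placeholders, and the response shapes follow the summary above):

import requests

BASE = "http://localhost:11235"  # assumed server address

# List the available scripts in /app/scripts
scripts = requests.get(f"{BASE}/scripts").json()
print(scripts)  # e.g. ["cleanup.py", "warmup.py"]

# Run a script by name (the endpoint expects the name without the .py extension)
resp = requests.post(f"{BASE}/scripts/run", json={"script_name": "warmup"})
print(resp.status_code, resp.json())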

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Server
    participant Subprocess

    Client->>Server: GET /scripts
    Server->>Server: List files in /app/scripts
    Server-->>Client: Return list of script filenames

    Client->>Server: POST /scripts/run (script_name)
    Server->>Server: Validate script_name and existence
    Server->>Subprocess: Run script asynchronously
    Subprocess-->>Server: Return stdout/stderr
    alt Output or error present
        Server-->>Client: HTTP 400 with output/error
    else No output or error
        Server-->>Client: HTTP 200 with output
    end

Poem

🐰✨ Scripts to run and configs to share,
New endpoints hop in with flair!
List all the scripts, then run with a cheer,
Errors or outputs, all will appear.
Docker tunes up the network and space,
Our crawl bunny’s ready to race!
🚀🐇
"""


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
deploy/docker/schemas.py (1)

7-9: Fix formatting issue: Add blank line before class definition.

Static analysis correctly identified a PEP 8 formatting issue.

 
+
 class CrawlConfigs(BaseModel):
     browser_config: Optional[Dict] = Field(default_factory=dict)
     crawler_config: Optional[Dict] = Field(default_factory=dict)
deploy/docker/server.py (1)

441-446: Consider extracting config extraction logic to reduce duplication.

The same config extraction logic is duplicated between both endpoints. Consider creating a helper function to improve maintainability.

+def extract_crawl_configs(crawl_request: CrawlRequest) -> tuple[dict, dict]:
+    """Extract browser and crawler configs from request."""
+    if crawl_request.configs:
+        return crawl_request.configs.browser_config, crawl_request.configs.crawler_config
+    return {}, {}

Then use it in both endpoints:

-    browser_config = {}
-    crawler_config = {}
-    if crawl_request.configs:
-        browser_config = crawl_request.configs.browser_config if crawl_request.configs.browser_config else {}
-        crawler_config = crawl_request.configs.crawler_config if crawl_request.configs.crawler_config else {}
+    browser_config, crawler_config = extract_crawl_configs(crawl_request)

Also applies to: 466-471

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 02f3127 and 79c7110.

📒 Files selected for processing (2)
  • deploy/docker/schemas.py (1 hunks)
  • deploy/docker/server.py (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
deploy/docker/server.py (1)
deploy/docker/api.py (2)
  • handle_crawl_request (393-476)
  • handle_stream_crawl_request (478-527)
🪛 Flake8 (7.2.0)
deploy/docker/schemas.py

[error] 11-11: expected 2 blank lines, found 1

(E302)

🔇 Additional comments (1)
deploy/docker/schemas.py (1)

11-13: LGTM! Well-structured schema design.

The nested configuration approach allows flexible API usage while maintaining backward compatibility. The use of default_factory=dict for mutable defaults is appropriate.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
deploy/docker/schemas.py (1)

7-7: Fix formatting: Remove extra blank lines.

The static analysis tool flagged too many blank lines before the class definition.

-

-
+
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79c7110 and b37afe2.

📒 Files selected for processing (2)
  • deploy/docker/schemas.py (1 hunks)
  • deploy/docker/server.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • deploy/docker/server.py
🧰 Additional context used
🪛 Flake8 (7.2.0)
deploy/docker/schemas.py

[error] 8-8: too many blank lines (3)

(E303)


[error] 12-12: expected 2 blank lines, found 1

(E302)

🔇 Additional comments (1)
deploy/docker/schemas.py (1)

8-10: Well-designed configuration grouping.

The CrawlConfigs class effectively groups related configuration options and uses proper Pydantic patterns:

  • Field(default_factory=dict) correctly handles mutable defaults
  • Optional typing allows flexibility in providing configurations
  • Clear separation of browser vs crawler configurations

Comment on lines +12 to +14
class CrawlRequest(BaseModel):
    urls: List[str] = Field(min_length=1, max_length=100)
    configs: Optional[CrawlConfigs] = None

💡 Verification agent

🧩 Analysis chain

Fix formatting and verify backward compatibility.

The modification to CrawlRequest looks good functionally, but there's a formatting issue and a potential compatibility concern.

-class CrawlRequest(BaseModel):
+
+class CrawlRequest(BaseModel):

Please verify that existing clients can still use the API without providing the configs field, and that the server properly handles requests with the old structure if any exist:


🏁 Script executed:

#!/bin/bash
# Description: Check how CrawlRequest is used in server endpoints to ensure backward compatibility

# Search for CrawlRequest usage in server files
rg -A 10 -B 5 "CrawlRequest" --type py

# Look for direct access to browser_config or crawler_config fields that might break
ast-grep --pattern 'request.browser_config'
ast-grep --pattern 'request.crawler_config'

Length of output: 4032


Default configs to maintain backward compatibility

The server code assumes crawl_request.configs is always a CrawlConfigs instance—if a client omits configs, accessing .browser_config or .crawler_config will raise an error. To fix:

• In deploy/docker/schemas.py, replace the optional configs field with a default factory so that missing configs still yield an empty CrawlConfigs.
• You can drop the Optional import once this is updated.

--- deploy/docker/schemas.py
@@
-class CrawlRequest(BaseModel):
-    urls: List[str] = Field(min_length=1, max_length=100)
-    configs: Optional[CrawlConfigs] = None
+class CrawlRequest(BaseModel):
+    urls: List[str] = Field(min_length=1, max_length=100)
+    configs: CrawlConfigs = Field(default_factory=CrawlConfigs)

No changes are needed in server.py once configs always defaults to an empty model.
Ensure you keep two blank lines between top-level classes per PEP8 for consistent formatting.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-class CrawlRequest(BaseModel):
-    urls: List[str] = Field(min_length=1, max_length=100)
-    configs: Optional[CrawlConfigs] = None
+class CrawlRequest(BaseModel):
+    urls: List[str] = Field(min_length=1, max_length=100)
+    configs: CrawlConfigs = Field(default_factory=CrawlConfigs)
🧰 Tools
🪛 Flake8 (7.2.0)

[error] 12-12: expected 2 blank lines, found 1

(E302)

🤖 Prompt for AI Agents
In deploy/docker/schemas.py around lines 12 to 14, the configs field in
CrawlRequest is optional but the server expects it always to be a CrawlConfigs
instance, causing errors if omitted. Fix this by removing Optional from configs
and setting its default using a default_factory that returns an empty
CrawlConfigs instance. Also, remove the now unused Optional import and ensure
there are two blank lines between top-level classes to comply with PEP8
formatting.
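As a standalone illustration of why the suggested default_factory approach keeps omitted configs safe to dereference, here is a hypothetical sketch mirroring the suggested schema (runnable with Pydantic v2):

from typing import Dict, List, Optional
from pydantic import BaseModel, Field

class CrawlConfigs(BaseModel):
    browser_config: Optional[Dict] = Field(default_factory=dict)
    crawler_config: Optional[Dict] = Field(default_factory=dict)

class CrawlRequest(BaseModel):
    urls: List[str] = Field(min_length=1, max_length=100)
    configs: CrawlConfigs = Field(default_factory=CrawlConfigs)

# A legacy-style request that omits configs still parses, and both
# nested configs default to empty dicts, so attribute access never fails.
req = CrawlRequest(urls=["https://example.com"])
assert req.configs.browser_config == {}
assert req.configs.crawler_config == {}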

@arturbacilla
Author

arturbacilla commented Jul 3, 2025

I have to change the tests as well.
I also realized that applying those changes breaks the playground, because its request would need to be adapted too.
I'll make the changes ASAP.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

♻️ Duplicate comments (2)
deploy/docker/server.py (2)

443-444: Simplify the conditional logic for config extraction.

The current logic has redundant conditional checks since the schema should already provide defaults.

-    browser_config = crawl_request.configs.browser_config if crawl_request.configs.browser_config else {}
-    crawler_config = crawl_request.configs.crawler_config if crawl_request.configs.crawler_config else {}
+    browser_config = crawl_request.configs.browser_config if crawl_request.configs else {}
+    crawler_config = crawl_request.configs.crawler_config if crawl_request.configs else {}

465-466: Simplify the conditional logic for config extraction.

Same redundant conditional checks as in the /crawl endpoint.

-    browser_config = crawl_request.configs.browser_config if crawl_request.configs.browser_config else {}
-    crawler_config = crawl_request.configs.crawler_config if crawl_request.configs.crawler_config else {}
+    browser_config = crawl_request.configs.browser_config if crawl_request.configs else {}
+    crawler_config = crawl_request.configs.crawler_config if crawl_request.configs else {}
🧹 Nitpick comments (4)
docker-compose.yml (2)

42-43: CPU limit seems restrictive for script execution.

The CPU limit of 0.1 cores may be insufficient for script execution, especially if scripts perform intensive operations. Consider adjusting based on expected script workload.


49-49: Fix the missing newline at end of file.

+
deploy/docker/server.py (2)

19-19: Remove duplicate import.

The os module is imported twice (lines 19 and 40), which is redundant.

-import os

40-40: Clean up import statement.

This line redefines the os import and also imports subprocess. Remove the os part since it's already imported.

-import os, subprocess
+import subprocess
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b37afe2 and 9c0d82b.

📒 Files selected for processing (3)
  • deploy/docker/schemas.py (2 hunks)
  • deploy/docker/server.py (5 hunks)
  • docker-compose.yml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • deploy/docker/schemas.py
🧰 Additional context used
🪛 YAMLlint (1.37.1)
docker-compose.yml

[error] 49-49: no new line character at the end of file

(new-line-at-end-of-file)

🪛 Ruff (0.11.9)
deploy/docker/server.py

38-38: utils.FilterType imported but unused

Remove unused import: utils.FilterType

(F401)


40-40: Redefinition of unused os from line 19

Remove definition: os

(F811)


499-499: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


534-534: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (3)
docker-compose.yml (1)

26-26: Verify the necessity of privileged mode.

The privileged: true setting grants the container extensive host system access. Ensure this is required for the script execution functionality, as it significantly reduces container isolation.

Consider if specific capabilities would be sufficient instead of full privileged access.

deploy/docker/server.py (2)

499-499: Fix function call in argument defaults.

The Depends(token_dep) call in the function signature should be avoided in argument defaults.

-async def scripts_run(
-    request: Request,
-    body: ScriptRequest,
-    _td: Dict = Depends(token_dep),
-):
+async def scripts_run(
+    request: Request,
+    body: ScriptRequest,
+    _td: Dict = Depends(token_dep),
+):

Note: This appears to be a false positive from the static analysis tool, as this is the correct way to use FastAPI dependencies.

Likely an incorrect or invalid review comment.


484-492: Security concern: Directory traversal vulnerability.

The /scripts endpoint lists files without proper validation, which could potentially expose sensitive files if the scripts directory is compromised or misconfigured.

Consider adding proper validation and restricting file extensions:

-    return JSONResponse([file for file in os.listdir(script_folder) if os.path.isfile(os.path.join(script_folder, file))])
+    python_files = [file for file in os.listdir(script_folder) 
+                   if file.endswith('.py') and os.path.isfile(os.path.join(script_folder, file))]
+    return JSONResponse(python_files)

Likely an incorrect or invalid review comment.

Comment on lines +30 to +31
    volumes:
      - ~/scripts:/app/scripts:rw

🛠️ Refactor suggestion

Review the security implications of the scripts volume mount.

The volume mount ~/scripts:/app/scripts:rw with read-write permissions allows the container to modify host files. Consider if read-only access would be sufficient for script execution, or implement proper access controls to prevent potential security issues.

Consider using read-only mount if scripts don't need to be modified:

-      - ~/scripts:/app/scripts:rw
+      - ~/scripts:/app/scripts:ro
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
     volumes:
-      - ~/scripts:/app/scripts:rw
+      - ~/scripts:/app/scripts:ro
🤖 Prompt for AI Agents
In docker-compose.yml at lines 30-31, the volume mount
'~/scripts:/app/scripts:rw' grants read-write access, allowing the container to
modify host files. To improve security, change the mount to read-only by
replacing ':rw' with ':ro' if the container only needs to execute scripts
without modifying them. Alternatively, implement stricter access controls to
limit potential risks.

Comment on lines +528 to +534
    except subprocess.CalledProcessError as e:
        # Return the command execution error details
        response_data = {
            "Status": "Failed to execute the script",
            "Details": str(e)
        }
        raise HTTPException(400, response_data)

🛠️ Refactor suggestion

Fix exception handling with proper chaining.

The exception handling should preserve the original exception chain.

-    except subprocess.CalledProcessError as e:
-        # Return the command execution error details
-        response_data = {
-            "Status": "Failed to execute the script",
-            "Details": str(e)
-        }
-        raise HTTPException(400, response_data)     
+    except subprocess.CalledProcessError as e:
+        response_data = {
+            "Status": "Failed to execute the script",
+            "Details": str(e)
+        }
+        raise HTTPException(400, response_data) from e
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    except subprocess.CalledProcessError as e:
-        # Return the command execution error details
-        response_data = {
-            "Status": "Failed to execute the script",
-            "Details": str(e)
-        }
-        raise HTTPException(400, response_data)
+    except subprocess.CalledProcessError as e:
+        response_data = {
+            "Status": "Failed to execute the script",
+            "Details": str(e)
+        }
+        raise HTTPException(400, response_data) from e
🧰 Tools
🪛 Ruff (0.11.9)

534-534: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
In deploy/docker/server.py around lines 528 to 534, the exception handling for
subprocess.CalledProcessError currently raises an HTTPException without
preserving the original exception context. Modify the raise statement to include
"from e" to properly chain the original exception, ensuring the traceback and
error context are preserved for debugging.

Comment on lines +494 to +535
@app.post("/scripts/run")
@limiter.limit(config["rate_limiting"]["default_limit"])
async def scripts_run(
request: Request,
body: ScriptRequest,
_td: Dict = Depends(token_dep),
):
scripts_directory = '/app/scripts' # Update this path as needed

if not body or not body.script_name:
raise HTTPException(400, "You have to provide an script name")

if body.script_name.lower().endswith('.py'):
raise HTTPException(400, "Provide only the name, without the extension")

script_name = f'{body.script_name}.py'
script_path = os.path.join(scripts_directory, script_name)

if not os.path.isfile(script_path):
raise HTTPException(400, "No script found with that name. Check available in /scripts")

try:
result = await subprocess.run(['python', script_path], check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output = result.stdout.decode('utf-8')
error_output = result.stderr.decode('utf-8')

if output or error_output:
response_data = {
"Output": f"{output}{error_output}",
"Status": "Script executed with output"
}
raise HTTPException(400, response_data)

return JSONResponse(output)
except subprocess.CalledProcessError as e:
# Return the command execution error details
response_data = {
"Status": "Failed to execute the script",
"Details": str(e)
}
raise HTTPException(400, response_data)


⚠️ Potential issue

Critical security vulnerability: Arbitrary script execution.

The /scripts/run endpoint executes arbitrary Python files without proper validation or sandboxing. This poses a significant security risk as it allows execution of any Python code in the scripts directory.

Consider implementing proper security measures:

  1. Validate script content before execution
  2. Use a sandboxed environment
  3. Implement proper input validation
  4. Add logging for security monitoring
  5. Consider using a task queue for async execution
+    # Add proper validation and sandboxing
+    if not re.match(r'^[a-zA-Z0-9_-]+$', body.script_name):
+        raise HTTPException(400, "Invalid script name format")

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Ruff (0.11.9)

499-499: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


534-534: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
In deploy/docker/server.py around lines 494 to 535, the /scripts/run endpoint
executes arbitrary Python scripts from the scripts directory without sufficient
validation or sandboxing, creating a critical security risk. To fix this,
implement strict validation to allow only a predefined whitelist of script
names, avoid executing scripts directly from user input, and run scripts in a
sandboxed or isolated environment. Add input validation to reject any unexpected
or malicious input, include logging for all script execution attempts for
security monitoring, and consider offloading script execution to a controlled
async task queue to limit resource abuse and improve security.
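A minimal sketch of the kind of name validation the reviewer suggests (a hypothetical helper, assuming scripts live under a fixed directory; not code from this PR):

import os
import re

SCRIPTS_DIR = "/app/scripts"  # assumed fixed scripts directory

def resolve_script(script_name: str) -> str:
    # Allow only simple identifiers to block path traversal like "../etc/passwd"
    if not re.fullmatch(r"[a-zA-Z0-9_-]+", script_name):
        raise ValueError("Invalid script name format")
    path = os.path.realpath(os.path.join(SCRIPTS_DIR, f"{script_name}.py"))
    # Double-check the resolved path is still inside the scripts directory
    if not path.startswith(SCRIPTS_DIR + os.sep) or not os.path.isfile(path):
        raise FileNotFoundError("No script found with that name")
    return path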

"Output": f"{output}{error_output}",
"Status": "Script executed with output"
}
raise HTTPException(400, response_data)

🛠️ Refactor suggestion

Improve exception handling.

The current exception handling raises HTTPException with output, which may not be the intended behavior. Also, the exception chain should be preserved.

-            raise HTTPException(400, response_data) 
+            return JSONResponse(response_data, status_code=400)

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In deploy/docker/server.py at line 525, the HTTPException is raised without
preserving the original exception context, which can obscure the root cause.
Modify the exception handling to include the original exception as the cause by
using the "from" keyword when raising HTTPException, ensuring the exception
chain is preserved for better debugging.

raise HTTPException(400, "No script found with that name. Check available in /scripts")

try:
result = await subprocess.run(['python', script_path], check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

🛠️ Refactor suggestion

Fix subprocess usage for async context.

Using subprocess.run in an async function blocks the event loop. Use asyncio.create_subprocess_exec instead.

-        result = await subprocess.run(['python', script_path], check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+        proc = await asyncio.create_subprocess_exec(
+            'python', script_path,
+            stdout=asyncio.subprocess.PIPE,
+            stderr=asyncio.subprocess.PIPE
+        )
+        stdout, stderr = await proc.communicate()
+        if proc.returncode != 0:
+            raise subprocess.CalledProcessError(proc.returncode, ['python', script_path])
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        result = await subprocess.run(['python', script_path], check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+        proc = await asyncio.create_subprocess_exec(
+            'python', script_path,
+            stdout=asyncio.subprocess.PIPE,
+            stderr=asyncio.subprocess.PIPE
+        )
+        stdout, stderr = await proc.communicate()
+        if proc.returncode != 0:
+            raise subprocess.CalledProcessError(proc.returncode, ['python', script_path])
🤖 Prompt for AI Agents
In deploy/docker/server.py at line 516, the use of subprocess.run in an async
function blocks the event loop. Replace subprocess.run with
asyncio.create_subprocess_exec to run the subprocess asynchronously. Adjust the
code to await the subprocess completion and capture stdout and stderr using the
appropriate asyncio subprocess methods.
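As a self-contained illustration of the non-blocking pattern this suggestion describes (a generic asyncio sketch, not the project's code; the script path in the usage note is a placeholder):

import asyncio

async def run_script(script_path: str) -> tuple[int, str, str]:
    # create_subprocess_exec does not block the event loop, unlike subprocess.run
    proc = await asyncio.create_subprocess_exec(
        "python", script_path,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate()  # wait for the script to finish
    return proc.returncode, stdout.decode(), stderr.decode()

# Example usage:
# rc, out, err = asyncio.run(run_script("/app/scripts/example.py"))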

@arturbacilla
Author

I'm closing this PR due to new ideas I wish to implement.
