#10006 Adding images to the prompt using OCR #11379
|
Important: Review skipped. Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the [...]. You can disable this status message by setting the [...].
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. |
|
Ferko seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
|
This PR targets the Automatically setting the base branch to |
PR Reviewer Guide
Here are some key observations to aid the review process:
|
Deploy Preview for auto-gpt-docs ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
Thank you for your PR adding OCR capabilities to the AIStructuredResponseGeneratorBlock! Here's some feedback to help get this PR ready for merging:
Please address these items and we'll be happy to review again. |
|
Here's the code health analysis summary for commits Analysis Summary
|
|
Thanks for your PR adding OCR capabilities to process images in prompts! Here are some items that need to be addressed before this can be merged:
Your implementation of OCR functionality looks promising, but we need to ensure it meets all our PR requirements before merging. Let me know if you need any clarification on these items! |
from typing import Any, Iterable, List, Literal, NamedTuple, Optional

import pytesseract
Bug: `pytesseract` is unconditionally imported in `llm.py` but is missing from `pyproject.toml`, leading to `ModuleNotFoundError` at startup.
Severity: CRITICAL | Confidence: 1.00
Detailed Analysis
The application will crash at startup with a `ModuleNotFoundError: No module named 'pytesseract'` because `pytesseract` is imported unconditionally in `llm.py` at line 13, but it is not declared as a permanent dependency in `pyproject.toml`. The `poetry add pytesseract --no-ansi || true` command in the Dockerfile is an unreliable installation method that does not guarantee the dependency is always present, especially in non-Docker environments.
Suggested Fix
Add `pytesseract` as a formal dependency to `pyproject.toml`. Remove the unreliable `poetry add pytesseract --no-ansi || true` from the Dockerfile, allowing Poetry to manage dependencies correctly.
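Separately from declaring the dependency, the import itself could be made defensive, so that a missing package produces a clear error only when OCR is actually requested instead of a crash at startup. The following is a minimal sketch under that assumption, not code from this PR: `load_optional_module` and `require_ocr_backend` are hypothetical helper names, and the error message wording is illustrative.

```python
import importlib
from types import ModuleType
from typing import Optional


def load_optional_module(name: str) -> Optional[ModuleType]:
    """Import a module by name, returning None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package lands here instead of aborting the importing module.
        return None


def require_ocr_backend(module: Optional[ModuleType]) -> ModuleType:
    """Return the OCR module, or fail with an actionable message."""
    if module is None:
        raise RuntimeError(
            "OCR was requested but pytesseract is not installed; "
            "declare it as a dependency in pyproject.toml."
        )
    return module


# Hypothetical module-level usage: importing this file no longer fails
# when pytesseract is absent; only OCR calls would raise.
pytesseract = load_optional_module("pytesseract")
```

This guard does not replace the suggested fix. It only changes how a missing dependency fails: code paths that never use OCR keep working, and those that do call `require_ocr_backend(pytesseract)` first.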
|
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request. |
Changes
- `AIStructuredResponseGeneratorBlock` in `LLM.py`.
- `Tesseract OCR` inside the relevant method to enable text extraction from images.
- `Dockerfile` to install `Tesseract OCR` for proper functionality of the new feature.

Reason for changes:
These changes allow the `AIStructuredResponseGeneratorBlock` to optionally process images using OCR, enabling structured responses from image content. The Dockerfile update ensures that the necessary OCR engine is available in all deployment environments.
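The PR's diff is not shown in this conversation, so the following is only a rough sketch of how optional OCR might feed extracted text into a prompt. The function names `build_prompt_with_ocr` and `tesseract_ocr`, the section label format, and the idea of passing image paths are all assumptions for illustration. The only real library calls used are `PIL.Image.open` and `pytesseract.image_to_string`.

```python
from typing import Callable, Iterable, Optional


def build_prompt_with_ocr(
    prompt: str,
    image_paths: Iterable[str],
    ocr: Optional[Callable[[str], str]] = None,
) -> str:
    """Append OCR text from each image to the prompt (hypothetical helper).

    When no OCR callable is supplied the prompt is returned unchanged,
    which keeps image processing optional.
    """
    if ocr is None:
        return prompt
    sections = []
    for path in image_paths:
        text = ocr(path).strip()
        if text:  # skip images where no text was recognized
            sections.append(f"[Text extracted from {path}]\n{text}")
    if not sections:
        return prompt
    return prompt + "\n\n" + "\n\n".join(sections)


def tesseract_ocr(path: str) -> str:
    """OCR one image file with Tesseract (assumed backend).

    Imports are done lazily so that this module can be imported even
    when pytesseract or Pillow is not installed.
    """
    import pytesseract
    from PIL import Image

    with Image.open(path) as image:
        return pytesseract.image_to_string(image)
```

Passing the OCR function as a parameter keeps the prompt-building logic testable without the Tesseract binary: a caller would use `build_prompt_with_ocr(prompt, paths, ocr=tesseract_ocr)` in production and a stub function in tests.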