forked from microsoft/OmniParser
Open
Description
-
Objective:
- Implement fine-tuning for OmniParser’s YOLO model to enhance detection accuracy on small icons and UI elements.
-
Context:
- The current model has trouble detecting small or densely packed icons, which is attributed to its sensitivity thresholds.
-
Proposed Solution:
- Data Collection: Assemble a labeled dataset of small icons/UI elements, including bounding boxes in YOLO format.
- Training Configuration: Use YOLO-specific parameters, adjusting image size (e.g., 640x640) and hyperparameters to improve small object sensitivity.
- Integration Steps:
  - Modify `get_yolo_model` to support loading the fine-tuned model.
  - Update `config` to reference the fine-tuned model.
  - Provide a `train_yolo` function to manage the fine-tuning process.
- Testing: Evaluate detection accuracy on new test images containing small icons/UI elements, adjusting `BOX_THRESHOLD` as needed.
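A rough sketch of the data-collection and training steps above, not a definitive implementation. It assumes the model is loaded through the Ultralytics `YOLO` class and that annotations start out as pixel boxes. `to_yolo_label` is a hypothetical helper, and the body and signature of `train_yolo` (`base_weights`, `data_yaml`, `epochs`, `imgsz`) are assumptions made only for illustration.

```python
def to_yolo_label(class_id, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) into one YOLO label line.

    YOLO format is: class x_center y_center width height, all normalized to [0, 1].
    """
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def train_yolo(base_weights, data_yaml, epochs=50, imgsz=640):
    """Hypothetical fine-tuning wrapper (assumed signature, not from the repo).

    base_weights: path to the existing detection weights to start from.
    data_yaml:    Ultralytics dataset YAML listing train/val image folders and class names.
    """
    # Imported lazily so the label helper above works without Ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(base_weights)
    # A larger imgsz is a common lever for small objects, at the cost of memory and speed.
    return model.train(data=data_yaml, epochs=epochs, imgsz=imgsz)


# Example: a 20x20 px icon in a 100x200 px screenshot.
print(to_yolo_label(0, (10, 20, 30, 40), 100, 200))
```

With default settings, Ultralytics typically writes the best checkpoint under its `runs/` directory; `config` could then point at that file so `get_yolo_model` loads the fine-tuned weights.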
-
Expected Outcome:
- More accurate small icon detection, fewer missed icons in dense layouts, and reduced reliance on preprocessing.
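One way to make "fewer missed icons" measurable is recall at a given `BOX_THRESHOLD`: the fraction of labeled icons matched by a kept detection. The following is a minimal sketch under assumptions; `iou`, `recall_at_threshold`, the greedy matching, and the 0.5 IoU cutoff are illustrative choices, not part of the repository.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_threshold(gt_boxes, detections, box_threshold, iou_min=0.5):
    """Fraction of ground-truth boxes matched by a detection kept at box_threshold.

    detections: list of (box, confidence) pairs.
    Matching is greedy: each detection can match at most one ground-truth box.
    """
    if not gt_boxes:
        return 0.0  # recall is undefined without ground truth; 0.0 by convention here
    kept = [box for box, conf in detections if conf >= box_threshold]
    used = set()
    hits = 0
    for gt in gt_boxes:
        for i, det in enumerate(kept):
            if i not in used and iou(gt, det) >= iou_min:
                used.add(i)
                hits += 1
                break
    return hits / len(gt_boxes)


# Two labeled icons; the second is only found with low confidence.
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [((0, 0, 10, 10), 0.9), ((21, 21, 30, 30), 0.04)]
for threshold in (0.05, 0.01):
    print(threshold, recall_at_threshold(gt, dets, threshold))
```

Comparing this number before and after fine-tuning, at the same `BOX_THRESHOLD`, gives a simple check on whether small icons are being missed less often. Lowering the threshold raises recall but usually admits more false positives, so precision should be tracked alongside it.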