
Finetuning for Improved Small Icon Detection in OmniParser #3

@abrichr

Description

  1. Objective:

    • Implement fine-tuning for OmniParser’s YOLO model to enhance detection accuracy on small icons and UI elements.
  2. Context:

    • The current model struggles to detect small or densely packed icons, likely because of its sensitivity (confidence) thresholds.
  3. Proposed Solution:

    • Data Collection: Assemble a labeled dataset of small icons/UI elements, including bounding boxes in YOLO format.
    • Training Configuration: Set YOLO training parameters, adjusting the input image size (e.g., 640x640) and other hyperparameters to improve sensitivity to small objects.
    • Integration Steps:
      • Modify get_yolo_model to support loading the fine-tuned model.
      • Update config to reference the fine-tuned model.
      • Provide a train_yolo function to manage the fine-tuning process.
    • Testing: Evaluate detection accuracy on new test images containing small icons/UI elements, adjusting BOX_THRESHOLD as needed.
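
For the Data Collection step, a minimal sketch of turning pixel-space annotations into YOLO label lines (`class x_center y_center width height`, all normalized by the image size). The `PixelBox` type and the `to_yolo_line` helper are hypothetical names introduced for illustration; they are not part of OmniParser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelBox:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def to_yolo_line(box: PixelBox, img_w: int, img_h: int) -> str:
    """Return one YOLO label line with coordinates normalized to [0, 1]."""
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    inside_x = 0 <= box.x_min < box.x_max <= img_w
    inside_y = 0 <= box.y_min < box.y_max <= img_h
    if not (inside_x and inside_y):
        raise ValueError(f"box is empty or outside the image: {box}")
    # YOLO stores the box center and size, not the corners.
    x_center = (box.x_min + box.x_max) / 2 / img_w
    y_center = (box.y_min + box.y_max) / 2 / img_h
    width = (box.x_max - box.x_min) / img_w
    height = (box.y_max - box.y_min) / img_h
    return f"{box.class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 16x16 px icon on a 1920x1080 screenshot.
    print(to_yolo_line(PixelBox(0, 100, 200, 116, 216), 1920, 1080))
```

Each image would get one `.txt` file with one such line per icon. Rejecting out-of-bounds boxes early is a design choice here, so that bad annotations surface before training rather than during it.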
  4. Expected Outcome:

    • More accurate small icon detection, fewer missed icons in dense layouts, and reduced reliance on preprocessing.
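
The integration steps above could be sketched roughly as follows, assuming the detector is loaded through the Ultralytics `YOLO` class. The weight paths, the `finetuned_path` parameter, and the `resolve_weights` helper are assumptions made for illustration; the actual `get_yolo_model` signature and config layout in OmniParser may differ.

```python
from pathlib import Path
from typing import Optional

# Hypothetical locations; the real paths depend on the repository layout.
BASE_WEIGHTS = "weights/icon_detect/best.pt"
FINETUNED_WEIGHTS = "weights/icon_detect_finetuned/best.pt"


def resolve_weights(finetuned: Optional[str], base: str) -> str:
    """Prefer the fine-tuned checkpoint if it exists, else fall back to the base weights."""
    if finetuned and Path(finetuned).is_file():
        return finetuned
    return base


def get_yolo_model(model_path: str = BASE_WEIGHTS,
                   finetuned_path: Optional[str] = FINETUNED_WEIGHTS):
    """Load the detector, using the fine-tuned checkpoint when one is available."""
    # Imported lazily so this module can be imported without ultralytics installed.
    from ultralytics import YOLO
    return YOLO(resolve_weights(finetuned_path, model_path))


def train_yolo(data_yaml: str,
               base_weights: str = BASE_WEIGHTS,
               epochs: int = 50,
               imgsz: int = 640,
               batch: int = 16):
    """Fine-tune from base_weights on the dataset described by data_yaml."""
    from ultralytics import YOLO
    model = YOLO(base_weights)
    # data_yaml is a dataset description file listing image folders and class names.
    model.train(data=data_yaml, epochs=epochs, imgsz=imgsz, batch=batch)
    return model
```

The epoch, batch, and image-size defaults are placeholders, not tuned values. Falling back to the base weights keeps existing setups working when no fine-tuned checkpoint has been produced yet.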

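For the Testing step, one way to compare `BOX_THRESHOLD` values is to measure recall on small ground-truth boxes only. The sketch below is self-contained and makes several assumptions: a "small" icon is one whose longer side is at most 32 px, a match requires IoU of at least 0.5, and matching is greedy. None of these values come from OmniParser.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def small_icon_recall(truth: Sequence[Box],
                      preds: Sequence[Tuple[Box, float]],
                      box_threshold: float,
                      max_side: float = 32.0,
                      iou_min: float = 0.5) -> float:
    """Fraction of small ground-truth boxes matched by a kept prediction."""
    small = [t for t in truth if max(t[2] - t[0], t[3] - t[1]) <= max_side]
    if not small:
        raise ValueError("no small ground-truth boxes to evaluate")
    # Keep only predictions whose confidence passes the threshold.
    unused: List[Box] = [box for box, score in preds if score >= box_threshold]
    hits = 0
    for t in small:
        best = max(unused, key=lambda p: iou(t, p), default=None)
        if best is not None and iou(t, best) >= iou_min:
            hits += 1
            unused.remove(best)  # each prediction may match one truth box
    return hits / len(small)


if __name__ == "__main__":
    truth = [(0, 0, 16, 16), (100, 100, 116, 116), (0, 0, 200, 200)]
    preds = [((0, 0, 16, 16), 0.9), ((101, 101, 117, 117), 0.04)]
    for threshold in (0.01, 0.05):
        print(threshold, small_icon_recall(truth, preds, threshold))
```

Lowering the threshold raises recall in this toy data, but it would also admit more false positives, so a precision measure should be tracked alongside it on a real test set.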