[Bug] observed inaccurate bbox with offset in grounding task #1084

Open
@lilyzhng

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

Image

When using opengvlab/internvl3-14b for a 2D grounding task, I observed inaccurate bounding boxes that are offset from the objects. Is this expected? I am aware that InternVL uses a 1000x1000 coordinate system, so I rescaled the predicted boxes to the image size as follows:

            abs_x1 = int(min(max(x1 / 1000 * width, 0), width))
            abs_y1 = int(min(max(y1 / 1000 * height, 0), height))
            abs_x2 = int(min(max(x2 / 1000 * width, 0), width))
            abs_y2 = int(min(max(y2 / 1000 * height, 0), height))
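
For anyone trying to reproduce this, the scaling above can be wrapped in a small standalone function. This is only a sketch: `scale_bbox` and its `norm` parameter are hypothetical names, and it assumes the predicted coordinates lie on a 0-1000 grid as described above. Unlike the snippet above, it multiplies before dividing, so exact results are not truncated by float rounding (for example, `290 / 1000 * 100` evaluates to slightly less than 29 and `int` turns it into 28).

```python
def scale_bbox(bbox, width, height, norm=1000):
    """Map a [x1, y1, x2, y2] box from a 0..norm grid to pixel coordinates.

    Assumption: the model emits integer coordinates on a norm x norm grid.
    Results are clamped to the image bounds and truncated to int, as in
    the scaling shown above.
    """
    x1, y1, x2, y2 = bbox

    def to_pixels(value, size):
        # Multiply first, then divide, to avoid off-by-one truncation.
        scaled = value * size / norm
        return int(min(max(scaled, 0), size))

    return (
        to_pixels(x1, width),
        to_pixels(y1, height),
        to_pixels(x2, width),
        to_pixels(y2, height),
    )


if __name__ == "__main__":
    # Hypothetical image size, for illustration only.
    print(scale_bbox([10, 270, 200, 550], width=1920, height=1080))
```

If the boxes are still shifted after this conversion, it may be worth checking that `width` and `height` are those of the original image rather than a resized or padded copy.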

Reproduction

Model: opengvlab/internvl3-14b
Prompt: "Outline the position of every vehicle in this image and output all the coordinates in JSON format with the structure: [{"bbox_2d": [x1, y1, x2, y2], "label": "vehicle_type"}]. Include cars, trucks, motorcycles, buses, and any other vehicles you can see."
Response:
To outline the position of every vehicle in the image, I'll identify each vehicle and provide their bounding boxes in the format [x1, y1, x2, y2], where (x1, y1) is the top-left corner and (x2, y2) is the bottom-right corner of the bounding box. Here's the JSON output:

[
    {
        "bbox_2d": [10, 270, 200, 550],
        "label": "car"
    },
    {
        "bbox_2d": [230, 360, 360, 460],
        "label": "car"
    },
    {
        "bbox_2d": [290, 390, 350, 440],
        "label": "motorcycle"
    },
    {
        "bbox_2d": [330, 260, 920, 830],
        "label": "car"
    },
]
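
Note that the response mixes prose with JSON and ends the list with a trailing comma, so passing it straight to `json.loads` would fail. A minimal sketch of one way to pull the boxes out is shown below; `parse_boxes` is a hypothetical helper, and it assumes each box is a flat JSON object (no nested braces) containing a `bbox_2d` key, as in the response above.

```python
import json
import re


def parse_boxes(response):
    """Extract {"bbox_2d": [...], "label": ...} objects from a model reply.

    Assumption: every box is a flat JSON object without nested braces.
    Each object is parsed on its own, so surrounding prose and a trailing
    comma after the last object do not matter.
    """
    boxes = []
    for match in re.finditer(r"\{[^{}]*\}", response):
        try:
            obj = json.loads(match.group(0))
        except json.JSONDecodeError:
            continue  # skip brace-delimited text that is not valid JSON
        if isinstance(obj, dict) and "bbox_2d" in obj:
            boxes.append(obj)
    return boxes


if __name__ == "__main__":
    reply = """Here's the JSON output:
    [
        {"bbox_2d": [10, 270, 200, 550], "label": "car"},
        {"bbox_2d": [290, 390, 350, 440], "label": "motorcycle"},
    ]"""
    for box in parse_boxes(reply):
        print(box["label"], box["bbox_2d"])
```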


Environment

NA

Error traceback

NA
