Is XComposer2-4KHD capable of REC and detection? #489

@zihui-debug

Hi, I tried to evaluate XComposer2-4KHD on RefCOCO for the REC (referring expression comprehension) task, following #261, and the results are quite poor. Do the coordinates in the response need to be post-processed as with other MLLMs (e.g., for qwen2.5vl, the coordinates must be rescaled from the input resolution to the actual resolution of the image)?
Moreover, I'm wondering whether XComposer2-4KHD supports detection tasks. If so, could you please provide guidance on how such an evaluation should be performed?
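
To make the question concrete, here is a minimal sketch of the kind of post-processing I mean. It assumes the model returns boxes as `[x1, y1, x2, y2]` in the pixel space of a resized input image. Whether XComposer2-4KHD actually does this is exactly what I am unsure about. The helper names `rescale_box`, `iou`, and `rec_accuracy` are hypothetical and not part of any library. The 0.5 IoU threshold is the usual criterion for REC accuracy.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def rescale_box(box: Box, input_size: Tuple[int, int], orig_size: Tuple[int, int]) -> Box:
    """Map a box from the model's input resolution to the original image.

    Assumption: sizes are (width, height) and the resize was a plain
    stretch without padding or cropping.
    """
    in_w, in_h = input_size
    orig_w, orig_h = orig_size
    sx, sy = orig_w / in_w, orig_h / in_h
    x1, y1, x2, y2 = box
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def rec_accuracy(preds: Sequence[Box], gts: Sequence[Box], thr: float = 0.5) -> float:
    """Fraction of predictions whose IoU with the ground truth is >= thr."""
    if not gts:
        return 0.0
    hits = sum(iou(p, g) >= thr for p, g in zip(preds, gts))
    return hits / len(gts)
```

If the model instead emits coordinates normalized to some fixed range, the same `rescale_box` call would apply with `input_size` set to that range, but I do not know which convention holds here.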

[Three images attached]
