Update README.md : Add RoboRefer, a 3D spatial reasoning model with advanced dataset & benchmark #254

Anjingkun · 2025-07-19T03:38:55Z

Hello! We would like to request the inclusion of our paper, "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics", into your awesome list.

Our contributions include:

🤖 A 3D Spatial Reasoning Model: We propose RoboRefer, a 3D-aware reasoning VLM trained using a sequential SFT-RFT strategy with metric-sensitive process reward functions to achieve spatial referring.
📚 A Large-Scale Open-Source Dataset (RefSpatial): To support this research, we have released the RefSpatial dataset, a large-scale collection of 2.5 million samples with 20 million QA pairs. It features fine-grained annotations for 31 distinct spatial relations. A key feature of our simulated data is the inclusion of detailed, step-by-step reasoning processes that show how to utilize spatial constraints.
🏆 A Challenging Benchmark (RefSpatial-Bench): We also introduce RefSpatial-Bench, a new benchmark that fills the gap in evaluating spatial referring with multi-step reasoning. Over 70% of the tasks require multi-step reasoning (up to 5 steps).

The open-source assets we have released, including our models, dataset, and benchmark, have been downloaded over 4000 times. We believe that our model, together with the dataset and benchmark, will be a valuable resource for the community.

Homepage: https://zhoues.github.io/RoboRefer
GitHub: https://github.com/Zhoues/RoboRefer
Hugging Face Dataset (RefSpatial): https://huggingface.co/datasets/JingkunAn/RefSpatial
Hugging Face Benchmark (RefSpatial-Bench): https://huggingface.co/datasets/BAAI/RefSpatial-Bench

Citation:

@article{zhou2025roborefer,
  title={RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics},
  author={Zhou, Enshen and An, Jingkun and Chi, Cheng and Han, Yi and Rong, Shanyu and Zhang, Chi and Wang, Pengwei and Wang, Zhongyuan and Huang, Tiejun and Sheng, Lu and others},
  journal={arXiv preprint arXiv:2506.04308},
  year={2025}
}

Thank you for your consideration!

Update README.md

904f423
