bytedance/Q-Insight

Q-Insight: Understanding Image Quality via Visual Reinforcement Learning

Q-Insight Paper on arXiv | Q-Insight Model

Weiqi Li, Xuanyu Zhang, Shijie Zhao, Yabin Zhang, Junlin Li, Li Zhang and Jian Zhang

🚩 Updates

  • 09.19 Q-Insight has been accepted at NeurIPS 2025 as a spotlight (Top 3%)!
  • 05.30 Released training and testing code, along with the pretrained model.
  • 05.26 Released our v2 paper.
  • 03.28 Released the Q-Insight technical report.

🔥 Introduction

PLCC comparisons between Q-Insight and existing IQA metrics (left), and three example applications of Q-Insight (right). Q-Insight performs significantly better than existing methods, especially on out-of-domain datasets. It also supports quality score regression, image degradation perception, and zero-shot image comparison reasoning.
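For readers unfamiliar with the metric, PLCC is the Pearson linear correlation coefficient between predicted quality scores and human ratings. The sketch below computes it with the standard library only. It is an illustration, not the repository's evaluation code: the function name `plcc` and the sample numbers are made up, and IQA evaluations sometimes apply a nonlinear fitting step before the correlation, which is omitted here.

```python
import math


def plcc(predicted, mos):
    """Pearson linear correlation between two equal-length score sequences."""
    if len(predicted) != len(mos) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(mos) / n
    # Covariance and variances, all left unnormalized (the 1/n factors cancel).
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, mos))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in mos)
    if var_p == 0 or var_m == 0:
        raise ValueError("PLCC is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)


if __name__ == "__main__":
    # Made-up scores for illustration only; these are not results from the paper.
    model_scores = [3.1, 4.0, 2.2, 4.8]
    human_scores = [3.0, 4.2, 2.0, 4.5]
    print(f"PLCC = {plcc(model_scores, human_scores):.4f}")
```

A value close to 1 means the predicted scores track the human ratings almost linearly.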

🔧 Dependencies and Installation

git clone https://github.com/bytedance/Q-Insight.git
bash setup.sh

⚡ Quick Inference

Supported Tasks

Score Regression

cd src/eval/
python demo_score.py

Degradation Perception

cd src/eval/
python demo_dist.py

Image Comparison Reasoning

cd src/eval/
python demo_comparison.py
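To run the three demos in one go, a small wrapper along the following lines may be convenient. It is a sketch, not part of the repository: it assumes it is started from the repository root, that the demo scripts take no required arguments (as the commands above suggest), and the names `DEMOS`, `demo_command` and `run_all` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: the current working directory is the repository root.
EVAL_DIR = Path("src/eval")

# Script names are taken from the Quick Inference commands above.
DEMOS = {
    "score": "demo_score.py",
    "degradation": "demo_dist.py",
    "comparison": "demo_comparison.py",
}


def demo_command(task):
    """Return the argv list for one demo (hypothetical helper)."""
    return [sys.executable, DEMOS[task]]


def run_all(eval_dir=EVAL_DIR):
    """Run every demo in turn and stop at the first failure."""
    for task in DEMOS:
        print(f"== {task} ==")
        # cwd mirrors the `cd src/eval/` step of the manual commands.
        subprocess.run(demo_command(task), cwd=eval_dir, check=True)
```

Calling `run_all()` from the repository root runs the score, degradation and comparison demos in that order.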

📖 Dataset Preparation for Training

Score Regression

Download meta files from Data-DeQA-Score and the source images from the KONIQ dataset. Arrange the folders in ./src/open-r1-multimodal/data as follows:

|-- Data-DeQA-Score
  |-- KONIQ
    |-- images/*.jpg
    |-- metas

Degradation Perception

Download the refA_sd_brief subset from KADIS-700K. Arrange the folders in ./src/open-r1-multimodal/data as follows:

|-- KADIS-700K
  |-- refA_sd_brief
    |-- dist_imgs/*.jpg
    |-- metas/train_dist.json

Image Comparison Reasoning

Download the validation dataset of DiffIQA. Arrange the folders in ./src/open-r1-multimodal/data as follows:

|-- DiffIQA
  |-- ValidationImage
    |-- images
    |-- train_comparison.json
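Before launching training, it can help to confirm that the three directory trees above are in place. The following sketch only checks that the listed folders and files exist. It does not validate their contents. The helper name `missing_paths` is hypothetical, and the paths are transcribed from the trees above.

```python
from pathlib import Path

# Expected entries per task, relative to the data folder.
EXPECTED = {
    "score regression": [
        "Data-DeQA-Score/KONIQ/images",
        "Data-DeQA-Score/KONIQ/metas",
    ],
    "degradation perception": [
        "KADIS-700K/refA_sd_brief/dist_imgs",
        "KADIS-700K/refA_sd_brief/metas/train_dist.json",
    ],
    "image comparison reasoning": [
        "DiffIQA/ValidationImage/images",
        "DiffIQA/ValidationImage/train_comparison.json",
    ],
}


def missing_paths(data_root):
    """Map each task to the expected paths that do not exist under data_root."""
    root = Path(data_root)
    return {
        task: [p for p in paths if not (root / p).exists()]
        for task, paths in EXPECTED.items()
    }


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for task, missing in missing_paths("src/open-r1-multimodal/data").items():
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{task}: {status}")
```

Each task prints either `ok` or the list of paths that still need to be downloaded or moved.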

Training

Score Regression and Degradation Perception

cd src/open-r1-multimodal/
bash run_qinsight_score_and_dist.sh

Image Comparison Reasoning

cd src/open-r1-multimodal/
bash run_qinsight_comparison.sh

✏️ To Do List

  • Release the code and model of VQ-Insight
  • Add support for LoRA fine-tuning
  • Provide a Gradio demo
  • Release inference code and weights
  • Release training code
  • Release the paper

Acknowledgement

We thank the authors of VLM-R1, DepictQA and DeQA-Score for releasing their code and data.

Citation

If Q-Insight is helpful, please ⭐ the repo.

If you find the code helpful in your research or work, please cite the following papers:

@article{li2025qinsight,
  title={Q-Insight: Understanding Image Quality via Visual Reinforcement Learning},
  author={Li, Weiqi and Zhang, Xuanyu and Zhao, Shijie and Zhang, Yabin and Li, Junlin and Zhang, Li and Zhang, Jian},
  journal={Proceedings of the Advances in Neural Information Processing Systems (NeurIPS)},
  year={2025}
}
@article{zhang2025vqinsight,
  title={VQ-Insight: Teaching VLMs for AI-Generated Video Quality Understanding via Progressive Visual Reinforcement Learning},
  author={Zhang, Xuanyu and Li, Weiqi and Zhao, Shijie and Li, Junlin and Zhang, Li and Zhang, Jian},
  journal={arXiv preprint arXiv:2506.18564},
  year={2025}
}

About

[NeurIPS 2025 Spotlight] Q-Insight: Understanding Image Quality via Visual Reinforcement Learning
