arXiv: https://arxiv.org/abs/2405.02793
Please visit the webpage for full information about the IIW project, including the data and visualizations. The data can be downloaded directly from the datasets/ folder or from Hugging Face (see below).
Please reach out to [email protected] for thoughts/feedback/questions/collaborations.
License: CC-BY-4.0
from datasets import load_dataset
# `name` can be one of: IIW-400, DCI_Test, DOCCI_Test, CM_3600, LocNar_Eval
# refer: https://github.com/google/imageinwords/blob/main/datasets/README.md
dataset = load_dataset('google/imageinwords', token=None, name="IIW-400", trust_remote_code=True)
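Since `name` must be one of the configurations listed in the comment above, it can help to check it before starting a download. The following is a minimal sketch under that assumption: `IIW_CONFIGS`, `check_config`, and `load_iiw` are hypothetical helper names for illustration and are not part of the IIW repository or the `datasets` library. The `load_dataset` call itself mirrors the one shown above.

```python
# Configuration names copied from the comment in the loading snippet above.
IIW_CONFIGS = ("IIW-400", "DCI_Test", "DOCCI_Test", "CM_3600", "LocNar_Eval")


def check_config(name):
    """Return `name` unchanged if it is a listed configuration, else raise ValueError."""
    if name not in IIW_CONFIGS:
        raise ValueError(
            f"unknown config {name!r}; expected one of: {', '.join(IIW_CONFIGS)}"
        )
    return name


def load_iiw(name="IIW-400", token=None):
    """Hypothetical wrapper: validate `name`, then load it as in the snippet above."""
    # Imported lazily so check_config can be used without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(
        "google/imageinwords",
        token=token,
        name=check_config(name),
        trust_remote_code=True,
    )
```

Calling `load_iiw("DOCCI_Test")` would then behave like the direct call with `name="DOCCI_Test"`, while a misspelled name such as `"IIW_400"` fails early with a message listing the accepted values. The split and field names of each configuration are described in the datasets README linked above.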
If you use our data or refer to our work, please include the following citation:
@misc{garg2024imageinwords,
      title={ImageInWords: Unlocking Hyper-Detailed Image Descriptions},
      author={Roopal Garg and Andrea Burns and Burcu Karagol Ayan and Yonatan Bitton and Ceslee Montgomery and Yasumasa Onoe and Andrew Bunner and Ranjay Krishna and Jason Baldridge and Radu Soricut},
      year={2024},
      eprint={2405.02793},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}