Download the datasets you need, and organize them as follows:
```
code_root/
└── data/
    ├── conceptual-captions/
    │   ├── train_image/
    │   ├── val_image/
    │   ├── train_frcnn/
    │   ├── val_frcnn/
    │   ├── train.json
    │   ├── val.json
    │   ├── train_frcnn.json
    │   └── val_frcnn.json
    ├── en_corpus/
    │   ├── wiki.doc
    │   └── bc1g.doc
    ├── vcr/
    │   ├── vcr1images/
    │   ├── train.jsonl
    │   ├── val.jsonl
    │   └── test.jsonl
    └── coco/
        ├── train2014/
        ├── val2014/
        ├── test2015/
        ├── annotations/
        ├── vqa/
        ├── refcoco+/
        │   └── proposal/
        └── vgbua_res101_precomputed/
            ├── trainval2014_resnet101_faster_rcnn_genome
            └── test2015_resnet101_faster_rcnn_genome
```
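If you want to create the empty directory skeleton before downloading, a minimal Python sketch like the one below can do it. The folder names mirror the tree above; the `make_data_dirs.py` file name and the choice of which leaf directories to pre-create are assumptions for illustration, not part of this repo.

```python
# make_data_dirs.py -- hypothetical helper, not part of this repo.
# Creates the empty directory skeleton shown in the tree above so that
# downloaded archives can simply be unzipped into place.
import os

DATA_DIRS = [
    "data/conceptual-captions/train_image",
    "data/conceptual-captions/val_image",
    "data/conceptual-captions/train_frcnn",
    "data/conceptual-captions/val_frcnn",
    "data/en_corpus",
    "data/vcr/vcr1images",
    "data/coco/train2014",
    "data/coco/val2014",
    "data/coco/test2015",
    "data/coco/annotations",
    "data/coco/vqa",
    "data/coco/refcoco+/proposal",
    "data/coco/vgbua_res101_precomputed",
]

if __name__ == "__main__":
    for d in DATA_DIRS:
        os.makedirs(d, exist_ok=True)  # no-op if the folder already exists
        print("created", d)
```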
- Conceptual Captions: see ReadMe.txt.
- Wikipedia: GoogleDrive / BaiduPan
- BooksCorpus: GoogleDrive / BaiduPan
- VCR: download and unzip images & annotations from here.
- COCO: download and unzip COCO 2014 images & annotations from here.
- VQA:
  - Download and unzip annotations from here (including "VQA Annotations" and "VQA Input Questions"), and place all these files directly under `./data/coco/vqa`.
  - Download and unzip the following precomputed boxes & features into `./data/coco/vgbua_res101_precomputed`:
    - train2014 + val2014: GoogleDrive / BaiduPan
    - test2015: GoogleDrive / BaiduPan
  - Download the answer vocabulary from GoogleDrive / BaiduPan, and place it under `./data/coco/vqa/`.
- RefCOCO+:
  - Download and unzip annotations, and place all files in `refcoco+/` directly under `./data/coco/refcoco+`.
  - Download region proposals, and place all files in `detections/refcoco+_unc` directly under `./data/coco/refcoco+/proposal`.
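After everything is downloaded and unpacked, a quick sanity check can confirm that the expected files and folders are in place. The sketch below only spot-checks paths taken from the tree above; `check_data_layout.py` is a hypothetical name and the script is not part of this repo.

```python
# check_data_layout.py -- hypothetical sanity check, not part of this repo.
# Verifies that a few key files/folders from the tree above exist under ./data.
import os
import sys

EXPECTED = [
    "data/en_corpus/wiki.doc",
    "data/en_corpus/bc1g.doc",
    "data/vcr/train.jsonl",
    "data/vcr/val.jsonl",
    "data/vcr/test.jsonl",
    "data/coco/train2014",
    "data/coco/val2014",
    "data/coco/test2015",
    "data/coco/vqa",
    "data/coco/refcoco+/proposal",
    "data/coco/vgbua_res101_precomputed/trainval2014_resnet101_faster_rcnn_genome",
    "data/coco/vgbua_res101_precomputed/test2015_resnet101_faster_rcnn_genome",
]

missing = [p for p in EXPECTED if not os.path.exists(p)]
if missing:
    print("Missing paths:")
    for p in missing:
        print("  " + p)
    sys.exit(1)
print("All expected paths found.")
```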