uncbiag/SimpleClick

University of North Carolina at Chapel Hill

Qin Liu, Zhenlin Xu, Gedas Bertasius, Marc Niethammer

ICCV 2023

Environment

Training and evaluation environment: Python 3.8.8, PyTorch 1.11.0, Ubuntu 20.04, CUDA 11.0. Run the following command to install the required packages:

pip3 install -r requirements.txt

You can also build a container with the configured environment using our Dockerfiles, which only support CUDA 11.0/11.4/11.6. If you use a different CUDA driver, you need to change the base image in the Dockerfile to one that matches it; otherwise the GPU will not work inside the container. (Needing a matching image for each CUDA driver is inconvenient; suggestions for a better solution are welcome.) You also need to configure the paths to the datasets in config.yml before training or testing.
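A wrong dataset path in config.yml usually only surfaces once training or evaluation has started, so a quick pre-flight check can help. The sketch below is not part of the repository. It assumes that config.yml holds flat `KEY: value` entries and that path entries end in `_PATH`; both are assumptions about the file layout, so adjust them to the actual file. It uses only the standard library instead of a YAML parser, which means it handles simple one-line entries only.

```python
from pathlib import Path


def find_missing_paths(config_file, suffix="_PATH"):
    """Return {key: path} for flat `KEY: value` entries whose path does not exist.

    Assumes one entry per line and that path keys end with `suffix`
    (an assumption about config.yml, not a documented convention).
    """
    missing = {}
    for raw in Path(config_file).read_text().splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments and whitespace
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        key, value = key.strip(), value.strip().strip("\"'")
        if key.endswith(suffix) and value and not Path(value).exists():
            missing[key] = value
    return missing
```

Calling `find_missing_paths("config.yml")` from the repository root returns an empty dictionary when every listed path exists; any entries it does return are the keys to fix before launching a run.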

Demo

Run the demo with the following command:

python3 demo.py --checkpoint=./weights/simpleclick_models/cocolvis_vit_huge.pth --gpu 0

Some test images can be found here.

Evaluation

Before evaluation, please download the datasets and models, and then configure their paths in config.yml.

Use the following command to evaluate the huge model:

python scripts/evaluate_model.py NoBRS \
--gpu=0 \
--checkpoint=./weights/simpleclick_models/cocolvis_vit_huge.pth \
--eval-mode=cvpr \
--datasets=GrabCut,Berkeley,DAVIS,PascalVOC,SBD,COCO_MVal,ssTEM,BraTS,OAIZIB
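
To evaluate several checkpoints with the same settings, the command above can be assembled programmatically. The following is a sketch, not part of the repository: `build_eval_command` and `evaluate` are hypothetical helpers that reuse only the flags shown above, and they assume the script is launched from the repository root.

```python
import subprocess
import sys


def build_eval_command(checkpoint, datasets, gpu=0, eval_mode="cvpr"):
    """Assemble the evaluate_model.py invocation shown above as an argument list."""
    return [
        sys.executable, "scripts/evaluate_model.py", "NoBRS",
        f"--gpu={gpu}",
        f"--checkpoint={checkpoint}",
        f"--eval-mode={eval_mode}",
        "--datasets=" + ",".join(datasets),
    ]


def evaluate(checkpoints, datasets, **kwargs):
    """Run the evaluation once per checkpoint, stopping at the first failure."""
    for checkpoint in checkpoints:
        subprocess.run(build_eval_command(checkpoint, datasets, **kwargs), check=True)
```

For example, `evaluate(["./weights/simpleclick_models/cocolvis_vit_huge.pth"], ["GrabCut", "Berkeley"])` would run the huge model on two of the datasets listed above.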

Training

Before training, please download the MAE pretrained weights (click to download: ViT-Base, ViT-Large, ViT-Huge) and set the path to the downloaded weights in config.yml.

Use the following command to train a huge model on COCO+LVIS (C+L):

python train.py models/iter_mask/plainvit_huge448_cocolvis_itermask.py \
--batch-size=32 \
--ngpus=4

Model weights

SimpleClick models: Google Drive

Datasets

We train all our models on SBD and COCO+LVIS and evaluate them on GrabCut, Berkeley, DAVIS, SBD, and PascalVOC. We also provide links to two additional datasets, ADE20k and OpenImages, which are used in the ablation study.

| Dataset | Description | Download Link |
|---|---|---|
| ADE20k | 22k images with 434k instances (total) | official site |
| OpenImages | 944k images with 2.6M instances (total) | official site |
| MS COCO | 118k images with 1.2M instances (train) | official site |
| LVIS v1.0 | 100k images with 1.2M instances (total) | official site |
| COCO+LVIS* | 99k images with 1.5M instances (train) | original LVIS images + our combined annotations |
| SBD | 8498 images with 20172 instances (train); 2857 images with 6671 instances (test) | official site |
| GrabCut | 50 images with one object each (test) | GrabCut.zip (11 MB) |
| Berkeley | 96 images with 100 instances (test) | Berkeley.zip (7 MB) |
| DAVIS | 345 images with one object each (test) | DAVIS.zip (43 MB) |
| Pascal VOC | 1449 images with 3417 instances (validation) | official site |
| COCO_MVal | 800 images with 800 instances (test) | COCO_MVal.zip (127 MB) |
| BraTS | 369 cases (test) | BraTS20.zip (4.2 MB) |
| OAI-ZIB | 150 cases (test) | OAI-ZIB.zip (27 MB) |

Don't forget to change the paths to the datasets in config.yml after downloading and unpacking.

(*) To prepare COCO+LVIS, download the original LVIS v1.0 dataset, then download our pre-processed annotations, which are obtained by combining the COCO and LVIS datasets, and unpack them into the LVIS v1.0 folder.
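
The unpacking step can be scripted. This is a minimal sketch using the standard library's `shutil.unpack_archive`, which picks the format (zip, tar.gz, and so on) from the file extension. `unpack_annotations` is a hypothetical helper, and the archive and folder names you pass to it depend on what you downloaded and where LVIS v1.0 lives on your machine.

```python
import shutil
from pathlib import Path


def unpack_annotations(archive, lvis_root):
    """Unpack the combined-annotation archive into the LVIS v1.0 folder.

    Returns the sorted names of the top-level entries in `lvis_root` afterwards.
    """
    lvis_root = Path(lvis_root)
    if not lvis_root.is_dir():
        raise FileNotFoundError(f"LVIS folder not found: {lvis_root}")
    # The archive format is inferred from the file extension.
    shutil.unpack_archive(str(archive), str(lvis_root))
    return sorted(p.name for p in lvis_root.iterdir())
```

Checking the returned names against the folder layout the training code expects is a cheap way to confirm the annotations landed next to the LVIS images rather than in a nested subfolder.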

Notes

[03/11/2023] Add an xTiny model.

[10/25/2022] Add docker files.

[10/02/2022] Release the main models. This repository is still under active development.

License

The code is released under the MIT License, a short, permissive software license. You can do whatever you want with the code as long as you include the original copyright and license notice in any copy of the software or source.

Citation

@InProceedings{Liu_2023_ICCV,
    author    = {Liu, Qin and Xu, Zhenlin and Bertasius, Gedas and Niethammer, Marc},
    title     = {SimpleClick: Interactive Image Segmentation with Simple Vision Transformers},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {22290-22300}
}

Acknowledgement

Our project is developed based on RITM. Thanks for the nice demo GUI :)
