
Commit cb3cfb4

add demo for downstream tasks
1 parent c1ee07f commit cb3cfb4

18 files changed: +1181 −5 lines

.gitignore

+1 −1
```diff
@@ -111,7 +111,7 @@ venv.bak/
 *.pkl
 *.pkl.json
 *.log.json
-*.jpg
+# *.jpg
 bash
 data
 data_set
```

README.md

+9 −4
```diff
@@ -42,12 +42,17 @@ We plan to release implementations of MogaNet in a few months. Please watch us f
 
 - [x] **ImageNet-1K** Training and Validation Code with [timm](https://github.com/rwightman/pytorch-image-models) [[code](#image-classification)] [[models](https://github.com/Westlake-AI/MogaNet/releases/tag/moganet-in1k-weights)] [[Hugging Face 🤗](https://huggingface.co/MogaNet)]
 - [x] **ImageNet-1K** Training and Validation Code in [OpenMixup](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/moganet) / [MMPretrain (TODO)](https://github.com/open-mmlab/mmpretrain)
-- [x] Downstream Transfer to **Object Detection and Instance Segmentation on COCO** [[code](detection/)] [[models](https://github.com/Westlake-AI/MogaNet/releases/tag/moganet-det-weights)]
-- [x] Downstream Transfer to **Semantic Segmentation on ADE20K** [[code](segmentation/)] [[models](https://github.com/Westlake-AI/MogaNet/releases/tag/moganet-seg-weights)]
-- [x] Downstream Transfer to **2D Human Pose Estimation on COCO** [[code](pose_estimation/)] (baseline models are supported)
+- [x] Downstream Transfer to **Object Detection and Instance Segmentation on COCO** [[code](detection/)] [[models](https://github.com/Westlake-AI/MogaNet/releases/tag/moganet-det-weights)] [[demo](detection/demo/)]
+- [x] Downstream Transfer to **Semantic Segmentation on ADE20K** [[code](segmentation/)] [[models](https://github.com/Westlake-AI/MogaNet/releases/tag/moganet-seg-weights)] [[demo](segmentation/demo/)]
+- [x] Downstream Transfer to **2D Human Pose Estimation on COCO** [[code](pose_estimation/)] (baseline models are supported) [[models](https://github.com/Westlake-AI/MogaNet/releases/tag/moganet-pose-weights)] [[demo](pose_estimation/demo/)]
 - [ ] Downstream Transfer to **3D Human Pose Estimation** (baseline models will be supported) <!--[[code](human_pose_3d/)] (baseline models will be supported) -->
 - [x] Downstream Transfer to **Video Prediction on MMNIST** [[code](video_prediction/)] (baseline models are supported)
-- [x] Image Classification on Google Colab and Notebook Demo [[here](demo.ipynb)]
+- [x] Image Classification on Google Colab and Notebook Demo [[demo](demo.ipynb)]
+
+<p align="center">
+<img src="https://github-production-user-asset-6210df.s3.amazonaws.com/44519745/239330216-a93e71ee-7909-485d-8257-1b34abcd61c6.jpg" width=100% height=100%
+class="center">
+</p>
 
 
 ## Image Classification
```

detection/README.md

+8
````diff
@@ -71,6 +71,14 @@ python get_flops.py /path/to/config --shape 1280 800
 | Mask R-CNN | MogaNet-B | ImageNet-1K | 63.4M | 373.1G | 1x | 49.0 | 43.8 | [config](configs/mask_rcnn_moganet_base_fpn_1x_coco.py) | [log](https://github.com/Westlake-AI/MogaNet/releases/download/moganet-det-weights/mask_rcnn_moganet_base_fpn_1x_coco.log.json) / [model](https://github.com/Westlake-AI/MogaNet/releases/download/moganet-det-weights/mask_rcnn_moganet_base_fpn_1x_coco.pth) |
 | Mask R-CNN | MogaNet-L | ImageNet-1K | 102.1M | 495.3G | 1x | 49.4 | 44.2 | [config](configs/mask_rcnn_moganet_large_fpn_1x_coco.py) | [log](https://github.com/Westlake-AI/MogaNet/releases/download/moganet-det-weights/mask_rcnn_moganet_large_fpn_1x_coco.log.json) / [model](https://github.com/Westlake-AI/MogaNet/releases/download/moganet-det-weights/mask_rcnn_moganet_large_fpn_1x_coco.pth) |
 
+## Demo
+
+We provide demos following [MMDetection](https://github.com/open-mmlab/mmdetection/demo). Please use [inference_demo](./demo/inference_demo.ipynb) or run the following script:
+```bash
+cd demo
+python image_demo.py demo.png ../configs/moganet/mask_rcnn_moganet_small_fpn_1x_coco.py ../../work_dirs/checkpoints/mask_rcnn_moganet_small_fpn_1x_coco.pth --out-file pred.png
+```
+
 ## Training
 
 We train the model on a single node with 8 GPUs (a batch size of 16) by default. Start training with the config as:
````
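For notebook-style use, the same inference flow can be written directly against the MMDetection 2.x Python API that `image_demo.py` (further below) wraps. The following is a minimal sketch, assuming the MogaNet config and checkpoint paths from the command above.

```python
# Minimal sketch of the demo's inference flow via the MMDetection 2.x API
# (mirrors detection/demo/image_demo.py). The config/checkpoint paths are
# assumptions taken from the README command above.
from mmdet.apis import inference_detector, init_detector, show_result_pyplot

import sys
sys.path.append('../../')     # make the repo root importable
import models                 # noqa: F401  register MogaNet backbones with mmdet

config = '../configs/moganet/mask_rcnn_moganet_small_fpn_1x_coco.py'
checkpoint = '../../work_dirs/checkpoints/mask_rcnn_moganet_small_fpn_1x_coco.pth'

model = init_detector(config, checkpoint, device='cuda:0')   # build model and load weights
result = inference_detector(model, 'demo.png')               # per-class boxes (and masks)
show_result_pyplot(model, 'demo.png', result, score_thr=0.3, out_file='pred.png')
```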
@@ -0,0 +1,42 @@ (new file)

```python
_base_ = [
    '../_base_/models/mask_rcnn_r50_fpn.py',
    '../_base_/datasets/coco_instance.py',
    '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]
pretrained = 'https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth'  # noqa
model = dict(
    type='MaskRCNN',
    backbone=dict(
        _delete_=True,
        type='SwinTransformer',
        embed_dims=96,
        depths=[2, 2, 6, 2],
        num_heads=[3, 6, 12, 24],
        window_size=7,
        mlp_ratio=4,
        qkv_bias=True,
        qk_scale=None,
        drop_rate=0.,
        attn_drop_rate=0.,
        drop_path_rate=0.2,
        patch_norm=True,
        out_indices=(0, 1, 2, 3),
        with_cp=False,
        convert_weights=True,
        init_cfg=dict(type='Pretrained', checkpoint=pretrained)),
    neck=dict(in_channels=[96, 192, 384, 768]))

optimizer = dict(
    _delete_=True,
    type='AdamW',
    lr=0.0001,
    betas=(0.9, 0.999),
    weight_decay=0.05,
    paramwise_cfg=dict(
        custom_keys={
            'absolute_pos_embed': dict(decay_mult=0.),
            'relative_position_bias_table': dict(decay_mult=0.),
            'norm': dict(decay_mult=0.)
        }))
lr_config = dict(warmup_iters=1000, step=[8, 11])
runner = dict(max_epochs=12)
```
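The config above relies on MMDetection's `_base_` inheritance: it starts from the stock Mask R-CNN R-50/FPN, COCO, and 1x-schedule bases, then uses `_delete_=True` to swap in a Swin-Tiny backbone (loaded from the `init_cfg` checkpoint) and an AdamW optimizer with zero weight decay on norm and position-embedding parameters. A rough sketch of how such a file is consumed, assuming mmcv 1.x and MMDetection 2.x, is shown below; the config path is a hypothetical placeholder.

```python
# Rough sketch (assuming mmcv 1.x / mmdet 2.x): load the merged config and
# build the detector it describes, as mmdet's tools/train.py would.
from mmcv import Config
from mmdet.models import build_detector

cfg = Config.fromfile('path/to/this_config.py')  # hypothetical path; _base_ files are merged here
print(cfg.model.backbone.type)                   # 'SwinTransformer' after the _delete_ override

model = build_detector(
    cfg.model,
    train_cfg=cfg.get('train_cfg'),
    test_cfg=cfg.get('test_cfg'))
model.init_weights()  # downloads the Swin-Tiny checkpoint named in init_cfg
```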

detection/demo/demo.png (254 KB)

detection/demo/image_demo.py

+71
```python
# Copyright (c) OpenMMLab. All rights reserved.
import asyncio
from argparse import ArgumentParser

from mmdet.apis import (async_inference_detector, inference_detector,
                        init_detector, show_result_pyplot)
import sys
sys.path.append('../../')
import models  # register_model for MogaNet


def parse_args():
    parser = ArgumentParser()
    parser.add_argument('img', help='Image file')
    parser.add_argument('config', help='Config file')
    parser.add_argument('checkpoint', help='Checkpoint file')
    parser.add_argument('--out-file', default=None, help='Path to output file')
    parser.add_argument(
        '--device', default='cuda:0', help='Device used for inference')
    parser.add_argument(
        '--palette',
        default='coco',
        choices=['coco', 'voc', 'citys', 'random'],
        help='Color palette used for visualization')
    parser.add_argument(
        '--score-thr', type=float, default=0.3, help='bbox score threshold')
    parser.add_argument(
        '--async-test',
        action='store_true',
        help='whether to set async options for async inference.')
    args = parser.parse_args()
    return args


def main(args):
    # build the model from a config file and a checkpoint file
    model = init_detector(args.config, args.checkpoint, device=args.device)
    # test a single image
    result = inference_detector(model, args.img)
    # show the results
    show_result_pyplot(
        model,
        args.img,
        result,
        palette=args.palette,
        score_thr=args.score_thr,
        out_file=args.out_file)


async def async_main(args):
    # build the model from a config file and a checkpoint file
    model = init_detector(args.config, args.checkpoint, device=args.device)
    # test a single image
    tasks = asyncio.create_task(async_inference_detector(model, args.img))
    result = await asyncio.gather(tasks)
    # show the results
    show_result_pyplot(
        model,
        args.img,
        result[0],
        palette=args.palette,
        score_thr=args.score_thr,
        out_file=args.out_file)


if __name__ == '__main__':
    args = parse_args()
    if args.async_test:
        asyncio.run(async_main(args))
    else:
        main(args)
```

detection/demo/inference_demo.ipynb

+202
Large diffs are not rendered by default.

detection/demo/pred.png (574 KB)

pose_estimation/README.md

+8
````diff
@@ -66,6 +66,14 @@ We provide results of MogaNet and popular architectures (Swin, ConvNeXt, and Uni
 | UniFormer-B | 256x192 | 53.5M | 9.2G | 75.0 | 90.6 | 83.0 | 80.4 | 67.8 | 77.7 | [config](https://github.com/Westlake-AI/MogaNet/tree/main/pose_estimation/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/uniformer_b_coco_256x192.py) | [log](https://github.com/Westlake-AI/MogaNet/releases/download/moganet-pose-weights/uniformer_b_coco_256x192.log.json) \| [model](https://github.com/Westlake-AI/MogaNet/releases/download/moganet-pose-weights/uniformer_b_coco_256x192.pth) |
 | UniFormer-B | 384x288 | 53.5M | 14.8G | 76.7 | 90.8 | 84.0 | 81.4 | 69.3 | 79.7 | [config](https://github.com/Westlake-AI/MogaNet/tree/main/pose_estimation/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/uniformer_b_coco_384x288.py) | [log](https://github.com/Westlake-AI/MogaNet/releases/download/moganet-pose-weights/uniformer_b_coco_384x288.log.json) \| [model](https://github.com/Westlake-AI/MogaNet/releases/download/moganet-pose-weights/uniformer_b_coco_384x288.pth) |
 
+## Demo
+
+We provide demos following [MMPose](https://github.com/open-mmlab/mmpose/demo). Please use [inference_demo](./demo/inference_demo.ipynb) or run the following script:
+```bash
+cd demo
+python top_down_img_demo.py path/to/config path/to/checkpoint --img-root coco2017_val --json-file ../data/coco/annotations/person_keypoints_val2017.json --show
+```
+
 ## Training
 
 We train the model on a single node with 8 GPUs by default (a batch size of 32 $\times$ 8 for Top-Down). Start training with the config as:
````
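For a single image with a known person box, the same top-down flow can be scripted against the MMPose 0.x API that the demo script further below uses. This is a minimal sketch; the config, checkpoint, image path, and bbox values are placeholders.

```python
# Minimal sketch of top-down pose inference via the MMPose 0.x API (mirrors the
# demo script added in this commit). All paths and the bbox are placeholders.
from mmpose.apis import (inference_top_down_pose_model, init_pose_model,
                         vis_pose_result)

import sys
sys.path.append('../../')
import models  # noqa: F401  register MogaNet backbones with mmpose

pose_model = init_pose_model('path/to/config.py', 'path/to/checkpoint.pth',
                             device='cuda:0')
dataset = pose_model.cfg.data['test']['type']

# one person box in 'xywh' format (x, y, width, height)
person_results = [{'bbox': [50, 50, 200, 400]}]

pose_results, _ = inference_top_down_pose_model(
    pose_model,
    'path/to/image.jpg',
    person_results,
    bbox_thr=None,
    format='xywh',
    dataset=dataset)

vis_pose_result(
    pose_model,
    'path/to/image.jpg',
    pose_results,
    dataset=dataset,
    kpt_score_thr=0.3,
    out_file='vis_result.jpg')
```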

pose_estimation/demo/inference_demo.ipynb

+217
Large diffs are not rendered by default.
+134

@@ -0,0 +1,134 @@ (new file)

```python
# Copyright (c) OpenMMLab. All rights reserved.
import os
import warnings
from argparse import ArgumentParser

import mmcv
from xtcocotools.coco import COCO

from mmpose.apis import (inference_top_down_pose_model, init_pose_model,
                         vis_pose_result)
from mmpose.datasets import DatasetInfo

import sys
sys.path.append('../../')
import models  # register_model for MogaNet


def main():
    """Visualize the demo images.

    Require the json_file containing boxes.
    """
    parser = ArgumentParser()
    parser.add_argument('pose_config', help='Config file for detection')
    parser.add_argument('pose_checkpoint', help='Checkpoint file')
    parser.add_argument('--img-root', type=str, default='', help='Image root')
    parser.add_argument(
        '--json-file',
        type=str,
        default='',
        help='Json file containing image info.')
    parser.add_argument(
        '--show',
        action='store_true',
        default=False,
        help='whether to show img')
    parser.add_argument(
        '--out-img-root',
        type=str,
        default='',
        help='Root of the output img file. '
        'Default not saving the visualization images.')
    parser.add_argument(
        '--device', default='cuda:0', help='Device used for inference')
    parser.add_argument(
        '--kpt-thr', type=float, default=0.3, help='Keypoint score threshold')
    parser.add_argument(
        '--radius',
        type=int,
        default=4,
        help='Keypoint radius for visualization')
    parser.add_argument(
        '--thickness',
        type=int,
        default=1,
        help='Link thickness for visualization')

    args = parser.parse_args()

    assert args.show or (args.out_img_root != '')

    coco = COCO(args.json_file)
    # build the pose model from a config file and a checkpoint file
    pose_model = init_pose_model(
        args.pose_config, args.pose_checkpoint, device=args.device.lower())

    dataset = pose_model.cfg.data['test']['type']
    dataset_info = pose_model.cfg.data['test'].get('dataset_info', None)
    if dataset_info is None:
        warnings.warn(
            'Please set `dataset_info` in the config.'
            'Check https://github.com/open-mmlab/mmpose/pull/663 for details.',
            DeprecationWarning)
    else:
        dataset_info = DatasetInfo(dataset_info)

    img_keys = list(coco.imgs.keys())

    # optional
    return_heatmap = False

    # e.g. use ('backbone', ) to return backbone feature
    output_layer_names = None

    # process each image
    for i in mmcv.track_iter_progress(range(len(img_keys))):
        # get bounding box annotations
        image_id = img_keys[i]
        image = coco.loadImgs(image_id)[0]
        image_name = os.path.join(args.img_root, image['file_name'])
        ann_ids = coco.getAnnIds(image_id)

        # make person bounding boxes
        person_results = []
        for ann_id in ann_ids:
            person = {}
            ann = coco.anns[ann_id]
            # bbox format is 'xywh'
            person['bbox'] = ann['bbox']
            person_results.append(person)

        # test a single image, with a list of bboxes
        pose_results, returned_outputs = inference_top_down_pose_model(
            pose_model,
            image_name,
            person_results,
            bbox_thr=None,
            format='xywh',
            dataset=dataset,
            dataset_info=dataset_info,
            return_heatmap=return_heatmap,
            outputs=output_layer_names)

        if args.out_img_root == '':
            out_file = None
        else:
            os.makedirs(args.out_img_root, exist_ok=True)
            out_file = os.path.join(args.out_img_root, f'vis_{i}.jpg')

        vis_pose_result(
            pose_model,
            image_name,
            pose_results,
            dataset=dataset,
            dataset_info=dataset_info,
            kpt_score_thr=args.kpt_thr,
            radius=args.radius,
            thickness=args.thickness,
            show=args.show,
            out_file=out_file)


if __name__ == '__main__':
    main()
```
