Dimensions of extracted features #53

Hi, thanks for sharing the code.
I have two questions:
1. I extracted video features with the pre-trained model (resnet-34-kinetics-cpu.pth) and checked the outputs: the feature extracted for each segment (16 frames) has 512 dimensions. However, according to your paper, this model should produce 512/2 = 256 dimensions after global average pooling. Please correct me if I am wrong.

2. The pre-trained models you provide include "resnet-34-kinetics-cpu.pth" and "resnext-101-kinetics.pth". Why is the latter file smaller than the former? As I understand it, the latter model should have more trainable parameters (more filters/feature channels).
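
To make both questions concrete, here is a minimal sketch in plain Python. It assumes a final feature map with 512 channels and a 1x4x4 temporal/spatial extent, and a cardinality of 32 for the grouped convolution. These numbers and the helper names `global_avg_pool` and `conv3d_params` are illustrative only and are not read from the repository or the checkpoints. It shows that global average pooling keeps the channel count, and that a grouped convolution has fewer weights than a dense one with the same channel widths, so wider channels do not necessarily mean more parameters.

```python
# Sketch only: all shapes and layer sizes are assumptions for illustration,
# not values taken from the released checkpoints.

def global_avg_pool(feature_map):
    """Average each channel of a nested [C][T][H][W] list down to one number."""
    pooled = []
    for channel in feature_map:
        values = [v for frame in channel for row in frame for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def conv3d_params(in_channels, out_channels, kernel, groups=1, bias=False):
    """Weight count of a 3D convolution with a cubic kernel of size `kernel`."""
    weights = (in_channels // groups) * out_channels * kernel ** 3
    return weights + (out_channels if bias else 0)


# Question 1: pooling collapses T, H, W but leaves the channel axis untouched.
channels, t, h, w = 512, 1, 4, 4  # assumed final feature-map shape
fmap = [[[[1.0] * w for _ in range(h)] for _ in range(t)] for _ in range(channels)]
features = global_avg_pool(fmap)
print(len(features))  # one value per channel

# Question 2: same channel widths, dense vs. grouped 3x3x3 convolution.
dense = conv3d_params(512, 512, 3)
grouped = conv3d_params(512, 512, 3, groups=32)  # assumed cardinality of 32
print(dense, grouped, dense // grouped)
```

This is not a claim about the actual layers in either released model; it only illustrates why I would like the two points above confirmed.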

Looking forward to your reply. Thanks in advance.
