Dimensions of extracted features #53

@kaiqiangh

Hi, thanks for sharing the code.
I have two questions:

  1. I extracted video features with the pre-trained model (resnet-34-kinetics-cpu.pth) and found that the feature extracted for each segment (16 frames) has 512 dimensions. However, according to your paper, this model should give 512/2=256 dimensions after global average pooling. Please correct me if I am wrong.

  2. The pre-trained models you provide include "resnet-34-kinetics-cpu.pth" and "resnext-101-kinetics.pth". Why is the latter file smaller than the former? As I understand it, the latter model should have more trainable parameters (more filters/feature channels).
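To make both questions concrete, here is a minimal sketch in plain Python. It is not taken from the repository. It assumes a final feature map with 512 channels (the usual width of the last ResNet-34 stage), toy spatial and temporal sizes, and a cardinality of 32 for the grouped convolution; the helper names `global_avg_pool` and `conv3d_params` are hypothetical. The first part shows that global average pooling leaves one value per channel, so the feature length matches the channel count. The second part shows that a grouped 3D convolution has fewer weights than a dense one of the same width, which may be one reason a deeper model can still produce a smaller file.

```python
def conv3d_params(c_in, c_out, kernel=3, groups=1):
    """Weight count of a bias-free 3D convolution with a cubic kernel."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * kernel ** 3


def global_avg_pool(feature_map):
    """Average a [C][T][H][W] nested list over T, H and W; returns C values."""
    pooled = []
    for channel in feature_map:
        values = [v for frame in channel for row in frame for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Question 1: pooling keeps one value per channel.
# Toy map with 512 channels, T=1, H=W=4 (sizes are assumptions).
toy_map = [[[[float(c)] * 4 for _ in range(4)] for _ in range(1)]
           for c in range(512)]
feature = global_avg_pool(toy_map)
print(len(feature))  # feature length equals the number of channels

# Question 2: dense vs. grouped 3x3x3 convolution of the same width.
# groups=32 is an assumed cardinality, not read from the checkpoint.
dense = conv3d_params(512, 512)
grouped = conv3d_params(512, 512, groups=32)
print(dense, grouped)
```

With these assumptions the grouped layer holds 1/32 of the dense layer's weights, so layer count alone does not determine model size.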

Looking forward to your reply. Thanks in advance.
