GitHub - kiradiso/cpnet: Learning Video Representations from Correspondence Proposals (CVPR 2019 Oral)

Learning Video Representations from Correspondence Proposals

Created by Xingyu Liu, Joon-Young Lee and Hailin Jin from Stanford University and Adobe Research.

Citation

If you find our work useful in your research, please cite:

    @article{liu:2019:cpnet,
      title={Learning Video Representations from Correspondence Proposals},
      author={Xingyu Liu and Joon-Young Lee and Hailin Jin},
      journal={CVPR},
      year={2019}
    }

Abstract

Correspondences between frames encode rich information about dynamic content in videos. However, it is challenging to effectively capture and learn those due to their irregular structure and complex dynamics. In this paper, we propose a novel neural network that learns video representations by aggregating information from potential correspondences. This network, named CPNet, can learn evolving 2D fields with temporal consistency. In particular, it can effectively learn representations for videos by mixing appearance and long-range motion with an RGB-only input. We provide extensive ablation experiments to validate our model. CPNet shows stronger performance than existing methods on Kinetics and achieves the state-of-the-art performance on Something-Something and Jester. We provide analysis towards the behavior of our model and show its robustness to errors in proposals.

Installation

Install TensorFlow. The code is tested under TensorFlow 1.9.0 (GPU version), g++ 5.4.0, CUDA 9.0 and Python 3.5 on Ubuntu 16.04. A few Python libraries, such as cv2, are also required for data processing and visualization. Access to GPUs is highly recommended.
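
To confirm the environment before going further, a small check along these lines may help. It is only a sketch: the helper name check_environment is hypothetical and not part of this repository, and it reports whatever versions are installed without enforcing the tested ones.

```python
import importlib


def check_environment(modules=("tensorflow", "cv2")):
    """Return {module_name: version string, or None if the import fails}."""
    found = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The module is missing from this Python environment.
            found[name] = None
        else:
            # Not every module exposes __version__, so fall back to a marker.
            found[name] = getattr(module, "__version__", "unknown")
    return found


# Example use (str.format keeps this compatible with Python 3.5):
# for name, version in check_environment().items():
#     print("{}: {}".format(name, version or "NOT INSTALLED"))
```

A value of None for tensorflow or cv2 means that library still needs to be installed.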

Compile Customized TF Operators

The TF operators are included under tf_ops. Compile them first by running make in each ops subfolder (check the Makefile). If necessary, update arch in the Makefiles to match the CUDA Compute Capability of your GPU.
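
Running make in every subfolder by hand can be automated. The following is a minimal sketch, assuming each op subfolder holds a file named exactly Makefile and that make is on the PATH. The function names find_op_dirs and compile_ops are hypothetical helpers, not part of this repository.

```python
import subprocess
from pathlib import Path


def find_op_dirs(tf_ops_root="tf_ops"):
    """Return the sorted subfolders of tf_ops_root that contain a Makefile."""
    root = Path(tf_ops_root)
    return sorted(
        p for p in root.iterdir() if p.is_dir() and (p / "Makefile").is_file()
    )


def compile_ops(tf_ops_root="tf_ops"):
    """Run `make` in every op subfolder; raises CalledProcessError on failure."""
    for op_dir in find_op_dirs(tf_ops_root):
        print("Compiling", op_dir)
        # cwd is passed as str so this also works on Python 3.5.
        subprocess.run(["make"], cwd=str(op_dir), check=True)


# Example use, from the repository root:
# compile_ops("tf_ops")
```

Because check=True is set, the loop stops at the first operator that fails to build, which makes a wrong arch setting easier to spot.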

Usage

Jester Experiments

The data preprocessing scripts are included in utils/data_preparation. To process the raw data, first download the Jester dataset. Then extract the files into a directory, for example /raid/datasets/jester/20bn-jester-v1, so that the layout looks like this:

/raid/datasets/jester/
  20bn-jester-v1
    1/
    2/
    ...
    148092/
  jester-v1-test.csv
  jester-v1-train.csv
  jester-v1-validation.csv
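
Before preprocessing, it can be worth confirming that the extracted data matches the layout above. This is a sketch only: missing_jester_entries is a hypothetical helper, and the default root is simply the example path used in this README.

```python
from pathlib import Path

# Top-level entries expected under the dataset root, per the layout above.
EXPECTED_ENTRIES = (
    "20bn-jester-v1",
    "jester-v1-test.csv",
    "jester-v1-train.csv",
    "jester-v1-validation.csv",
)


def missing_jester_entries(dataset_root="/raid/datasets/jester"):
    """Return the expected files or folders that are absent under dataset_root."""
    root = Path(dataset_root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


# Example use:
# missing = missing_jester_entries()
# if missing:
#     print("Missing:", ", ".join(missing))
```

An empty list means all four top-level entries are in place. The check does not inspect the numbered video folders themselves.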

Then cd into utils/data_preparation/jester and follow the README.md there to generate the gulped video files. The default output directory for the processed files is /datasets/jester/gulp_128. To use a different output directory, also change the directories in utils/data_preparation/jester/gen_gulp.sh. The processed output directory should look like this:

/datasets/jester/gulp_128/
  train/
    Doing other things/
      100018.gmeta
      100018.gulp
      ...
    Drumming Fingers/
      100022.gmeta
      100022.gulp
      ...
    ...
    label2idx.json
    gulp_log.csv
    opts.json
  val/
    Doing other things/
      100090.gmeta
      100090.gulp
      ...
    Drumming Fingers/
      100001.gmeta
      100001.gulp
      ...
    ...
    label2idx.json
    gulp_log.csv
    opts.json
  test/
    0/
      100005.gmeta
      100005.gulp
      ...
    label2idx.json
    gulp_log.csv
    opts.json
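
The gulped output can be sanity-checked against the layout above by counting the .gulp files in each class folder and flagging any that lack a matching .gmeta file. This is a sketch under the assumption that each .gulp file sits next to a .gmeta file with the same stem, as the listing suggests. The helper name summarize_gulp_split is hypothetical.

```python
from pathlib import Path


def summarize_gulp_split(split_dir):
    """Count .gulp files per class folder and list those without a .gmeta file."""
    counts = {}
    unpaired = []
    class_dirs = sorted(p for p in Path(split_dir).iterdir() if p.is_dir())
    for class_dir in class_dirs:
        gulps = sorted(class_dir.glob("*.gulp"))
        counts[class_dir.name] = len(gulps)
        for gulp_file in gulps:
            # 100018.gulp is expected to sit next to 100018.gmeta.
            if not gulp_file.with_suffix(".gmeta").is_file():
                unpaired.append(str(gulp_file))
    return counts, unpaired


# Example use:
# counts, unpaired = summarize_gulp_split("/datasets/jester/gulp_128/train")
# print(sum(counts.values()), "gulp files;", len(unpaired), "without .gmeta")
```

Top-level files such as label2idx.json, gulp_log.csv and opts.json are skipped because only subdirectories are treated as class folders.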

Training and Evaluation

First download the ImageNet-pretrained ResNet model from here and save it as pretrained_models/ImageNet-ResNet34.npz.

To train the model, rename command_train.sh.jester.experiment to command_train.sh and run it. Batch size, learning rate, etc. are adjustable.

sh command_train.sh

To evaluate the model, rename command_evaluate.sh.jester.experiment to command_evaluate.sh and run it.

sh command_evaluate.sh

To test the model, rename command_test.sh.jester.experiment to command_test.sh and run it.

sh command_test.sh
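
The rename step is the same for all three stages, so it can be scripted. The sketch below copies the template instead of renaming it, so the original .experiment file stays available. That is a deliberate deviation from the instructions above, and activate_script is a hypothetical helper, not part of this repository.

```python
import shutil
from pathlib import Path

STAGES = ("train", "evaluate", "test")


def activate_script(stage, repo_root=".", experiment="jester"):
    """Copy command_<stage>.sh.<experiment>.experiment to command_<stage>.sh."""
    if stage not in STAGES:
        raise ValueError("stage must be one of {}".format(STAGES))
    root = Path(repo_root)
    template = root / "command_{}.sh.{}.experiment".format(stage, experiment)
    target = root / "command_{}.sh".format(stage)
    # Copy rather than rename so the template is preserved (an assumption
    # made for this sketch; the README itself says to rename).
    shutil.copyfile(str(template), str(target))
    return target


# Example use, from the repository root:
# activate_script("train")   # then run: sh command_train.sh
```

Any change to batch size or learning rate is then made in the copied command_train.sh, leaving the template untouched.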

Something-Something Experiment

Similar to the Jester experiment. To be released; stay tuned.

License

Our code is released under the CC BY-NC-SA 4.0 license (see the LICENSE file for details).

Related Projects

FlowNet3D: Learning Scene Flow in 3D Point Clouds by Liu et al. (CVPR 2019). Code and data released on GitHub.
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation by Qi et al. (CVPR 2017 Oral Presentation). Code and data released on GitHub.

