['25 VLDB] Déjà Vu: Efficient Video-Language Query Engine with Learning-based Inter-Frame Computation Reuse

This repository contains the artifact for the paper "Déjà Vu: Efficient Video-Language Query Engine with Learning-based Inter-Frame Computation Reuse".

Table of Contents

  1. Prerequisites
  2. Downloading videos
  3. Transcoding videos
  4. Extracting input features for training
  5. Extracting compressed features
  6. Training ReuseViT
  7. Extracting features with ReuseViT and baselines for testing
  8. Running end tasks
  9. Plotting the graph

1. Prerequisites

1-1. Clone and build docker image

git clone https://github.com/anonymous-dejavu/dejavu
cd dejavu/docker
./build.sh

1-2. Launch docker container

The extracted features require a large amount of storage, so we recommend launching the Docker container with a volume mount.

# refer to the example script to launch docker container
./launch.sh
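
For reference, a minimal sketch of such a launch command with a volume mount (the image name and host path below are hypothetical; see launch.sh for the actual flags):

# Hypothetical example; adapt the image name and mount path to your setup
docker run -it --gpus all -v /mnt/data:/mnt/data dejavu /bin/bash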

1-3. Set the data root path

From this point on, we assume the code is running inside the docker container.

cd /workspace/configs/paths
vim data.yaml

Set data_dir to your desired path; the underlying storage should be larger than 2TB.
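
For example (the path below is hypothetical; any mount with sufficient free space works):

# In data.yaml: data_dir: /mnt/data/dejavu
df -h /mnt/data/dejavu  # confirm more than 2TB is free on this volume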

Preparing Datasets

2. Downloading videos

Download the NExT-QA dataset from the official NExT-QA repository and place the videos under ${data_dir}/datasets/nextqa/videos.

mkdir -p ${data_dir}/datasets/nextqa/videos
cd ${data_dir}/datasets/nextqa/videos
unzip NExTVideo.zip
ls # Should show 0000/ 0001/ 0002/
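
As an extra sanity check (assuming the videos ship as .mp4 files, as in the official release):

find . -type f -name '*.mp4' | wc -l  # should report a non-zero video count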

3. Transcoding videos

cd /workspace
python -m src.scripts.transcode dry_run=false
python -m src.scripts.transcode split=val dry_run=false
python -m src.scripts.transcode split=test dry_run=false
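
The scripts appear to default to a dry run (every example passes dry_run=false explicitly); to preview the planned work without writing any output, omit that override:

# Preview only; assumes dry_run defaults to true
python -m src.scripts.transcode split=val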

4. Extracting input features for training

cd /workspace
python -m src.scripts.extract dry_run=false
python -m src.scripts.extract split=val dry_run=false
python -m src.scripts.extract split=test target_features=\'i,o\' num_gpus=4 num_workers=32 dry_run=false
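
The escaped quotes around i,o keep Hydra from interpreting the comma in the override value, so it is passed through as a single string; an equivalent shell quoting style is:

# Same override, quoted at the shell level instead of with backslashes
python -m src.scripts.extract split=test "target_features='i,o'" num_gpus=4 num_workers=32 dry_run=false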

5. Extracting compressed features

cd /workspace
python -m src.scripts.extract_compressed dry_run=false
python -m src.scripts.extract_compressed split=val dry_run=false
python -m src.scripts.extract_compressed split=test dry_run=false 

Check that the dataset info is populated correctly:

python -m src.data.components.nextqa +split=train +fps=2 +base_model_name=openai/clip-vit-large-patch14 +return_compressed=true +regenerate_dataset_info=false
python -m src.data.components.nextqa +split=val +fps=2 +base_model_name=openai/clip-vit-large-patch14 +return_compressed=true +regenerate_dataset_info=false
python -m src.data.components.nextqa +split=test +fps=2 +base_model_name=openai/clip-vit-large-patch14 +return_compressed=true +regenerate_dataset_info=false
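
If the printed info looks stale, our guess from the flag name (verify against the code) is that setting +regenerate_dataset_info=true forces a rebuild:

python -m src.data.components.nextqa +split=train +fps=2 +base_model_name=openai/clip-vit-large-patch14 +return_compressed=true +regenerate_dataset_info=true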

6. Training ReuseViT

You can either train ReuseViT from scratch or use the checkpoints we provide.

6-1. Training from scratch

items=(0.95 0.9 0.85 0.8)
for item in "${items[@]}"; do
    python src/train.py model.compile=true experiment=nextqa model.loss.target_reuse_rate=${item} model.dry_run=false
done

6-2. Checkpoints

Place the checkpoints under ${data_dir}/checkpoints/nextqa.

mkdir -p ${data_dir}/checkpoints/nextqa
# Download the checkpoints and place them under ${data_dir}/checkpoints/nextqa
ls ${data_dir}/checkpoints/nextqa # should look like nextqa-68.ckpt nextqa-85.ckpt nextqa-93.ckpt nextqa-95.ckpt

7. Extracting features with ReuseViT and baselines for testing

7-1. ReuseViT

cd /workspace
items=(68 85 93 95)
for item in "${items[@]}"; do
    python src/eval.py experiment=nextqa-hard-${item} model.dry_run=false data.test_split=test &
done
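
These runs are launched in the background (note the trailing &); to block until all of them finish before moving on:

wait  # waits for every backgrounded job in the current shell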

7-2. DiffRate

To run DiffRate, one must first train the model to determine its pruning and merging decisions. For now, we provide the trained log here, and plan to share the DiffRate training code later.

mkdir -p ${data_dir}/checkpoints/diffrate
tar xvzf nextqa.tar.gz -C ${data_dir}/checkpoints/diffrate
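# (optional) sanity-check the extraction before launching the sweep
ls ${data_dir}/checkpoints/diffrate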
items=('31.0' '34.7' '38.5' '42.3' '46.1')
for item in "${items[@]}"; do
    python -m src.scripts.extract mode=diffrate split=test target_features=o num_gpus=4 num_workers=24 batch_size=16 +diffrate_flops=${item} dry_run=false
done

7-3. Eventful

for k in $(seq 1 15); do
    python src/eval.py experiment=eventful reuse_module.decision.k=$((k*10)) model.dry_run=false &
done

7-4. CMC

for k in $(seq 3 3 45); do
    python src/eval.py experiment=cmc reuse_module.decision.threshold=-$k model.dry_run=false data.test_split=test &
done

8. Running end tasks

First, build and launch the Docker container for running the end tasks.

cd docker/next-gqa
./build.sh
./launch.sh

We assume the following commands are run inside the docker container.

8-1. Original

cd /workspace/third_parties/NExT-GQA/code/TempGQA

# Original
CUDA_VISIBLE_DEVICES=0 ./shells/next_test_dual.sh original 2>&1 | tee log_original.txt &

8-2. ReuseViT

items=('68' '85' '93' '95')
for item in "${items[@]}"; do
    ./shells/next_test_dual.sh reuse-$item 2>&1 | tee log_reuse-$item.txt &
done

8-3. Eventful

for item in $(seq 1 15); do
    ./shells/next_test_dual.sh eventful-$((item*10)) 2>&1 | tee log_eventful_$item.txt &
done

8-4. CMC

for k in $(seq 3 3 45); do
    ./shells/next_test_dual.sh cmc-$k 2>&1 | tee log_cmc_$k.txt &
done

9. Plotting the graph

cd /workspace/scripts
python plot_nextqa.py

After running the code, you should be able to see the resulting plot.
