This is the official repository, which provides a baseline model for our proposed task: GazeVQA: A Video Question Answering Dataset for Multiview Eye-Gaze Task-Oriented Collaborations.
Model Architecture (see [Paper] for details):
(1) PyTorch. See https://pytorch.org/ for installation instructions. For example,
conda install pytorch torchvision torchtext cudatoolkit=11.3 -c pytorch
(2) PyTorch Lightning. See https://www.pytorchlightning.ai/ for installation instructions. For example,
python -m pip install lightning
The released dataset is available in this repository: [Dataset]
The processed data can be downloaded from: [processed_data]
Before starting, you should encode the instructional videos, scripts, and QAs.
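The exact feature extractors depend on your setup; as a minimal, framework-free sketch, the text side (scripts and QAs) can be mapped to integer token-ID sequences before being fed to an encoder. All names here are illustrative, not part of the released code:

```python
from collections import Counter

def build_vocab(texts, min_freq=1):
    """Map each whitespace token to an integer ID; 0 is reserved for <unk>."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    vocab = {"<unk>": 0}
    for tok, freq in counts.most_common():
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab):
    """Encode one script line or QA string as a list of token IDs."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in text.lower().split()]

# Toy corpus standing in for the dataset's scripts/QAs
qa_texts = ["What is the person looking at?", "He is looking at the screen."]
vocab = build_vocab(qa_texts)
ids = encode("is the person", vocab)   # known tokens map to their IDs
```

Video frames are handled analogously, typically by running each sampled frame through a frozen visual backbone and caching the resulting feature tensors to disk.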
Simply run the code on a single GPU; it will automatically handle both the training and evaluation processes.
python train.py
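Conceptually, train.py runs a training loop and then evaluates the resulting model. A schematic, framework-free sketch of that train-then-evaluate flow (the toy model, data, and names are illustrative only, not the actual implementation):

```python
import random

def train_and_evaluate(data, epochs=50, lr=0.05):
    """Fit y = w*x by plain gradient descent, then report held-out loss."""
    random.seed(0)
    w = random.random()                               # random init
    train, val = data[: len(data) // 2], data[len(data) // 2 :]
    for _ in range(epochs):                           # training process
        for x, y in train:
            grad = 2 * (w * x - y) * x                # d/dw of squared error
            w -= lr * grad
    # evaluation process: mean squared error on held-out pairs
    val_loss = sum((w * x - y) ** 2 for x, y in val) / len(val)
    return w, val_loss

data = [(x, 3.0 * x) for x in range(1, 9)]            # toy targets: y = 3x
w, loss = train_and_evaluate(data)
```

In the real pipeline the model, loss, and data loaders come from the repository and PyTorch Lightning orchestrates the loop, but the overall structure is the same.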
Feel free to contact us if you have any problems ([email protected]), or open an issue in this repo.