TorchBeastPopArt

PopArt extension to TorchBeast, the PyTorch implementation of IMPALA.

Experiments

The PopArt extension was used to train a multi-task agent for six Atari games (AirRaid, Carnival, DemonAttack, NameThisGame, Pong, and SpaceInvaders, all in the NoFrameskip-v4 variant). This agent was compared to the corresponding single-task agents and to a simpler multi-task agent without PopArt normalisation. More details on these experiments can be found in the report.

Movies

Single-task:

Multi-task (clipped):

Multi-task PopArt:

The different game plans learned by these three models can be illustrated with saliency maps (red is the policy saliency and green is the baseline saliency). More details on these experiments can be found in the report.

Saliency:

Trained models

The following trained models can be downloaded from the models directory:

Name	Environments (NoFrameskip-v4)	Steps (millions)
AirRaid	AirRaid	50
Carnival	Carnival	50
DemonAttack	DemonAttack	50
NameThisGame	NameThisGame	50
Pong	Pong	50
SpaceInvaders	SpaceInvaders	50
MultiTask	AirRaid, Carnival, DemonAttack, NameThisGame, Pong, SpaceInvaders	300
MultiTaskPopArt	AirRaid, Carnival, DemonAttack, NameThisGame, Pong, SpaceInvaders	300

Running the code

Preparation

For our experiments we used the faster PolyBeast implementation of TorchBeast; for installation, we refer the reader to the instructions in the original repository. Because we encountered problems getting PolyBeast to work, we also added multi-task training and PopArt to the MonoBeast implementation of TorchBeast. Some of the testing functionality is not implemented for MonoBeast; the PolyBeast script can be used for testing instead, provided the imports for nest and libtorchbeast are commented out.

Since PolyBeast is less convenient to get running, these are the platforms on which we managed to install and use it:

Ubuntu 18.04
MacOS (CPU only)
Google Cloud Platform (Standard machine with NVIDIA Tesla P100 GPUs)

Training a model

python -m torchbeast.polybeast --mode train --xpid MultiTaskPopArt --env AirRaidNoFrameskip-v4,CarnivalNoFrameskip-v4,DemonAttackNoFrameskip-v4,NameThisGameNoFrameskip-v4,PongNoFrameskip-v4,SpaceInvadersNoFrameskip-v4 --total_steps 300000000 --use_popart

There are the following additional flags, as compared to the original TorchBeast implementation:

use_popart, to enable the PopArt extension
save_model_every_nsteps, to save intermediate models during training
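
As a rough illustration of what use_popart turns on, the sketch below shows the PopArt update described by Hessel et al. (2019) for a single scalar value output. The running mean and second moment of the targets are updated first, and the output weight and bias are then rescaled so that the unnormalised prediction is unchanged. This is not the code used in this repository: the class name PopArtScalar, the step size beta=0.1, and the variance clamp are assumptions made for the example. In the multi-task setting of the paper, one such set of statistics is kept per task.

```python
import math


class PopArtScalar:
    """Toy scalar PopArt head: value(h) = sigma * (w * h + b) + mu."""

    def __init__(self, beta=0.1):
        self.beta = beta  # step size for the running statistics (assumed value)
        self.mu = 0.0     # running mean of the value targets
        self.nu = 1.0     # running second moment of the value targets
        self.w = 1.0      # weight of the normalised output
        self.b = 0.0      # bias of the normalised output

    @property
    def sigma(self):
        # Standard deviation from the two moments, clamped for numerical safety.
        return math.sqrt(max(self.nu - self.mu ** 2, 1e-8))

    def normalised(self, h):
        return self.w * h + self.b

    def unnormalised(self, h):
        return self.sigma * self.normalised(h) + self.mu

    def update(self, target):
        old_mu, old_sigma = self.mu, self.sigma
        # Adaptively rescale targets: move the statistics towards the new target.
        self.mu = (1 - self.beta) * old_mu + self.beta * target
        self.nu = (1 - self.beta) * self.nu + self.beta * target ** 2
        new_sigma = self.sigma
        # Preserve outputs precisely: undo the effect of the new statistics
        # on the unnormalised prediction by adjusting w and b.
        self.w = self.w * old_sigma / new_sigma
        self.b = (old_sigma * self.b + old_mu - self.mu) / new_sigma


if __name__ == "__main__":
    layer = PopArtScalar()
    before = layer.unnormalised(0.5)
    layer.update(100.0)  # a large return shifts mu and sigma
    after = layer.unnormalised(0.5)
    print(before, after)  # the two predictions agree up to rounding
```

The point of the second step is that a change in the statistics alone never changes what the agent predicts; only the gradient step on the normalised output does.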

With MonoBeast

python -m torchbeast.monobeast --mode train --xpid MultiTaskPopArt --env AirRaidNoFrameskip-v4,CarnivalNoFrameskip-v4,DemonAttackNoFrameskip-v4,NameThisGameNoFrameskip-v4,PongNoFrameskip-v4,SpaceInvadersNoFrameskip-v4 --total_steps 300000000 --use_popart

In addition, MonoBeast can be used to run two other models, selected with the flag agent_type: a small CNN (optionally with an LSTM) and an Attention-Augmented Agent. Unfortunately, we did not get the Attention-Augmented Agent to train properly, but for the sake of completeness and possible future reference, here are the additional flags that can be used with this model:

frame_height and frame_width, which set the dimensions to which frames are rescaled (in the original paper the original size is used as opposed to the rescaling done in TorchBeast)
aaa_input_format (with choices gray_stack, rgb_last, rgb_stack), which decides how frames are formatted as input for the network (where rgb_last only feeds one of every four frames in RGB, as is done in the original paper)

Testing a model

python -m torchbeast.polybeast --mode test --xpid MultiTaskPopArt --env PongNoFrameskip-v4 --savedir=./models
python -m torchbeast.polybeast --mode test_render --xpid MultiTaskPopArt --env PongNoFrameskip-v4 --savedir=./models
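
Since a multi-task model is usually evaluated on each game separately, the test command above can be generated once per environment. The following is a minimal sketch that only assembles and prints the command lines using the flags shown above; the helper names env_id and build_test_command are hypothetical and not part of this repository.

```python
import shlex

# The six games from the table of trained models.
GAMES = ["AirRaid", "Carnival", "DemonAttack", "NameThisGame", "Pong", "SpaceInvaders"]


def env_id(game):
    """Return the NoFrameskip-v4 environment name for a game."""
    return f"{game}NoFrameskip-v4"


def build_test_command(xpid, game, savedir="./models", mode="test"):
    """Assemble the PolyBeast test command for one environment (hypothetical helper)."""
    return [
        "python", "-m", "torchbeast.polybeast",
        "--mode", mode,
        "--xpid", xpid,
        "--env", env_id(game),
        f"--savedir={savedir}",
    ]


if __name__ == "__main__":
    # Print one command per game; each list could instead be passed to subprocess.run.
    for game in GAMES:
        print(shlex.join(build_test_command("MultiTaskPopArt", game)))
```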

Saliency

python -m torchbeast.saliency --xpid MultiTaskPopArt --env PongNoFrameskip-v4 --first_frame 0 --num_frames 100 --savedir=./models

Note that, unlike the original saliency code, this extension does not produce a movie directly but saves the frames as individual images. Animated GIFs can subsequently be produced with a Jupyter notebook.
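
For readers who prefer a script over a notebook, the sketch below shows one possible way to combine saved frames into an animated GIF. It is not the notebook from this repository. It assumes the third-party Pillow package is installed and that the frames are PNG files whose names contain a frame number; the function names, the "*.png" pattern, and the frame duration are assumptions for illustration.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that orders 'frame_10.png' after 'frame_2.png'."""
    parts = re.split(r"(\d+)", Path(path).name)
    # Odd positions hold the digit runs captured by the split pattern.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


def frames_to_gif(frame_dir, out_path, pattern="*.png", ms_per_frame=50):
    """Combine all frames matching `pattern` in `frame_dir` into one GIF."""
    from PIL import Image  # third-party dependency: pip install Pillow

    paths = sorted(Path(frame_dir).glob(pattern), key=natural_key)
    if not paths:
        raise FileNotFoundError(f"no frames matching {pattern} in {frame_dir}")
    frames = [Image.open(p).convert("RGB") for p in paths]
    # The first frame is saved with the remaining frames appended; loop=0 repeats forever.
    frames[0].save(
        out_path,
        save_all=True,
        append_images=frames[1:],
        duration=ms_per_frame,
        loop=0,
    )
    return len(frames)
```

The natural sort matters because a plain alphabetical sort would place frame 10 before frame 2 and scramble the animation.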

CNN filter comparisons

NOTE: it is assumed that a) intermediate model checkpoints have been saved (flag save_model_every_nsteps) and b) the results for all models are saved in the same parent directory and have the exact names used in our experiments (see the table of trained models).

python -m torchbeast.analysis.analyze_resnet --model_load_path /path/to/directory --mode filter_comp --comp_num_models 10

The different comparisons presented in the report can be set with the flag comp_between. By default the only comparisons done are between the vanilla multi-task model and the multi-task PopArt model, as well as between each of these models and all single-task models.

For plotting, the following command can be used; it saves the figures in the same directory from which the data generated by the previous command is loaded:

python -m torchbeast.analysis.analyze_resnet --load_path /path/to/directory --mode filter_comp_plot --save_figures

For more data generation and plotting options, consult the help texts.

References

TorchBeast

@article{torchbeast2019,
  title={{TorchBeast: A PyTorch Platform for Distributed RL}},
  author={Heinrich K\"{u}ttler and Nantas Nardelli and Thibaut Lavril and Marco Selvatici and Viswanath Sivakumar and Tim Rockt\"{a}schel and Edward Grefenstette},
  year={2019},
  journal={arXiv preprint arXiv:1910.03552},
  url={https://github.com/facebookresearch/torchbeast},
}

PopArt

@inproceedings{hessel2019,
  title={Multi-task deep reinforcement learning with popart},
  author={Hessel, Matteo and Soyer, Hubert and Espeholt, Lasse and Czarnecki, Wojciech and Schmitt, Simon and van Hasselt, Hado},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={33},
  pages={3796--3803},
  year={2019}
}

Saliency

@article{greydanus2017visualizing,
  title={Visualizing and Understanding Atari Agents},
  author={Greydanus, Sam and Koul, Anurag and Dodge, Jonathan and Fern, Alan},
  journal={arXiv preprint arXiv:1711.00138},
  year={2017},
  url={https://github.com/greydanus/visualize_atari},
}
