This code was used to generate the results in "ReSeeding Latent States for Sequential Language Understanding".
Below are instructions for installing and running the code on Linux using conda.

```shell
conda create -n reseed python=3.10
conda activate reseed
pip install -e git+https://github.com/StephAO/gym-minigrid.git#egg=minigrid
```
Note: if you need a non-default PyTorch installation (e.g., a specific CUDA build), this is a good point to install it, before installing the package below.
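For example, a CUDA-specific build can be installed from the official PyTorch wheel index. The exact command below is an illustration, not part of the repo's instructions; pick the index URL that matches your CUDA version:

```shell
# Example only: install a CUDA 12.1 build of PyTorch before `pip install -e .`
pip install torch --index-url https://download.pytorch.org/whl/cu121
```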
```shell
git clone <TODO>
cd ReSEED
pip install -e .
```
Fill in the necessary keys in `reseed/personal_keys` for the desired functionality. Three personal keys are defined:

- `WANDB_ENTITY`: if specified and the account is logged in via the CLI, experiments are logged to wandb
- `OPENAI_API_KEY`: used for experiments with OpenAI models (i.e., when calling `icl_promting` with an OpenAI model)
- `ANTHROPIC_API_KEY`: used for experiments with Anthropic models (i.e., when calling `icl_promting` with an Anthropic model)
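As a rough sketch of what this file might look like (the constant names come from the list above; the values and helper functions are purely illustrative, not the repo's actual code):

```python
# Hypothetical sketch of reseed/personal_keys.py -- fill in your own values.
WANDB_ENTITY = "my-wandb-team"   # leave empty/None to skip wandb logging
OPENAI_API_KEY = "sk-..."        # only needed for OpenAI ICL experiments
ANTHROPIC_API_KEY = ""           # only needed for Anthropic ICL experiments


def wandb_enabled() -> bool:
    """Illustrative guard: wandb logging requires a non-empty entity."""
    return bool(WANDB_ENTITY)


def provider_available(provider: str) -> bool:
    """Illustrative guard: an API provider is usable only if its key is set."""
    key = {"openai": OPENAI_API_KEY, "anthropic": ANTHROPIC_API_KEY}.get(provider, "")
    return bool(key)
```

Keys you leave empty simply disable the corresponding functionality; only fill in what you need.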
```shell
python -m text_traj_datasets.create_datasets
python -m contrastive_concepts.main
```
For the scripts used to run the experiments in the paper, see `scripts/run_n_sample_sweep.sh`, `scripts/run_ablations.sh`, and `run_icl.sh`.
NOTE: The [STATE] token described in the paper is defined using the [CLS] token due to huggingface tokenizer nomenclature.
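As a toy illustration of this mapping (the insertion logic below is a made-up example, not the repo's implementation):

```python
# The paper's [STATE] token is spelled "[CLS]" in huggingface tokenizer
# nomenclature, so that is the literal string that appears in the inputs.
STATE_TOKEN = "[CLS]"


def insert_state_tokens(steps: list[str]) -> str:
    """Toy example: prefix each step of a trajectory with the [STATE] token."""
    return " ".join(f"{STATE_TOKEN} {step}" for step in steps)
```

So wherever the paper writes `[STATE]`, expect to see `[CLS]` in the tokenized inputs and in the code.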
If you find this code useful, please cite the following paper:
```bibtex
@inproceedings{aroca-ouellette2025reseeding,
  title={ReSeeding Latent States for Sequential Language Understanding},
  author={St{\'e}phane Aroca-Ouellette and Katharina von der Wense and Alessandro Roncone},
  booktitle={The 2025 Conference on Empirical Methods in Natural Language Processing},
  year={2025},
  url={https://openreview.net/forum?id=9fd2YtkQk5}
}
```