StatBiomed/TemporalVAE-reproducibility

TemporalVAE: atlas-assisted temporal mapping of time-series single-cell transcriptomes during embryogenesis

Contact: Yuanhua Huang, Yijun Liu

Email: [email protected]

A user-oriented repo is at https://github.com/StatBiomed/TemporalVAE-release with more features to be added.

Introduction

TemporalVAE is a deep generative model in a dual-objective setting to infer the biological time of cells from a compressed latent space. We demonstrated its scalability to millions of cells in the mouse development atlas and its high accuracy in atlas-based cell staging on mouse organogenesis across platforms and during human peri-implantation between in vivo and in vitro conditions. Furthermore, we showed that our atlas-based time predictor can effectively support RNA velocity modeling over short-time cell differentiation, including hematopoiesis and neuronal development.


Latest Updates

  • v0.1 (May, 2024): Initial release.
  • v0.2 (May, 2024)

Installation

TemporalVAE requires Python 3.10.9. To install it, follow these steps:

  1. Install Miniconda3 if not already available.
  2. Clone this repository:
  git clone https://github.com/StatBiomed/TemporalVAE
  3. Navigate to the TemporalVAE directory:
  cd TemporalVAE
  4. Set up the environment (5-10 minutes):

    1. Create a conda environment (TemporalVAE-V1.0) with the required dependencies. Two environment configuration files are provided: env_necessary.yml includes the minimal essential dependencies, and env_all.yml includes the complete development environment. If you encounter any package version issues, check env_all.yml for more version information.
      conda env create -f env_necessary.yml
    2. Install the PyTorch build that matches your system: check your computer's configuration (OS, CUDA version, etc.) and download PyTorch from https://pytorch.org.
    3. Install tensorboard:
    pip install tensorboard
  5. Activate the TemporalVAE environment you just created:

  conda activate TemporalVAE-V1.0
  6. Create a data folder for your data, a results folder for training and prediction results, and a logs folder for detailed log files:
  mkdir data
  mkdir results
  mkdir logs
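
As an optional sanity check after installation, the following minimal sketch reports which of the packages named above are missing from the active environment and creates the three working folders. The function names `missing_modules` and `make_project_dirs` are illustrative and are not part of the repository.

```python
import importlib.util
from pathlib import Path

# Packages the installation steps ask for, and the folders they create.
REQUIRED_MODULES = ("torch", "tensorboard")
PROJECT_DIRS = ("data", "results", "logs")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def make_project_dirs(root="."):
    """Create the data, results and logs folders under root (no error if they exist)."""
    created = []
    for name in PROJECT_DIRS:
        path = Path(root) / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `missing_modules()` inside the activated TemporalVAE-V1.0 environment should return an empty list once PyTorch and tensorboard are installed; `make_project_dirs()` is equivalent to the three `mkdir` commands.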

Reproduce the results in the manuscript

Figure 2:

Compare TemporalVAE with baseline methods on the three small datasets cited in the Psupertime manuscript.

  1. Preprocess three datasets by the code described in preprocess_data_fromPsupertimeManuscript.md.

  2. Run the code of each benchmarking method.

    1. For example, to run LR:
    python demo/Fig2_TemproalVAE_against_benchmark_methods/exp2_LR_toyDataset.py
  3. Run plotFig2_check_corr.py to generate Fig2.

Figure 3:

  1. Preprocess the mouse atlas data and the mouse stereo-seq data:
    python -u Fig3_mouse_data/preprocess_data_mouse_embryonic_development_combineData.py
    python -u Fig3_mouse_data/preprocess_data_mouse_embryo_stereo.py
  2. Reproduce the results of Figure 3.A&B and save them in the folder results/230827_trainOn_mouse_embryonic_development_kFold_testOnYZdata0809:
    python -u Fig3_mouse_data/TemporalVAE_kFoldOn_mouseAtlas.py \
      --result_save_path=230827_trainOn_mouse_embryonic_development_kFold_testOnYZdata0809 \
      --vae_param_file=supervise_vae_regressionclfdecoder_mouse_stereo \
      --file_path=/mouse_embryonic_development/preprocess_adata_JAX_dataset_combine_minGene100_minCell50_hvg1000 \
      --time_standard_type=embryoneg5to5 \
      --train_epoch_num=100 --kfold_test --train_whole_model \
      > logs/log.log
  3. Plot Figure 3.A&B from the results in results/230827_trainOn_mouse_embryonic_development_kFold_testOnYZdata0809; see Fig3_mouse_data/plot_figure3AB.ipynb.

  4. Figure 3.C: compare TemporalVAE with LR, PCA, and RF on the mouse atlas data; see Fig3_mouse_data/LR_PCA_RF_kFoldOn_mouseAtlas.ipynb.

  5. Figure 3.D&E: train the models on the mouse atlas data and predict on the mouse stereo-seq data; see Fig3_mouse_data/TemporalVAE_LR_PCA_RF_directlyPredictOn_mouseStereo.ipynb or run Fig3_mouse_data/TemporalVAE_LR_PCA_RF_directlyPredictOn_mouseStereo.py from the console.
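
The Figure 3.A&B training command can also be launched from Python, which keeps the long flag list in one place. The sketch below reuses only the script path and flags listed above; the helper names `build_command` and `run_with_log` are illustrative and do not exist in the repository. As with the shell redirection, only standard output goes to the log file.

```python
import subprocess
import sys
from pathlib import Path

SCRIPT = "Fig3_mouse_data/TemporalVAE_kFoldOn_mouseAtlas.py"
RUN_NAME = "230827_trainOn_mouse_embryonic_development_kFold_testOnYZdata0809"


def build_command(train_epoch_num=100):
    """Assemble the k-fold training command with the flags listed in this README."""
    return [
        sys.executable, "-u", SCRIPT,
        f"--result_save_path={RUN_NAME}",
        "--vae_param_file=supervise_vae_regressionclfdecoder_mouse_stereo",
        "--file_path=/mouse_embryonic_development/"
        "preprocess_adata_JAX_dataset_combine_minGene100_minCell50_hvg1000",
        "--time_standard_type=embryoneg5to5",
        f"--train_epoch_num={train_epoch_num}",
        "--kfold_test",
        "--train_whole_model",
    ]


def run_with_log(command, log_path="logs/log.log"):
    """Run the command, write its stdout to log_path, and return the exit code."""
    log_file = Path(log_path)
    log_file.parent.mkdir(parents=True, exist_ok=True)
    with log_file.open("w") as handle:
        return subprocess.run(command, stdout=handle).returncode
```

Running `run_with_log(build_command())` from the repository root mirrors the shell command; a non-zero return value indicates that the training script exited with an error.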

Figure 4:

  1. Download the original data of the eight published human datasets (see details in the Supplementary file). Integrate the raw datasets:
    python -u Fig4_human_data/integration_humanEmbryo_Z_C_Xiao_M_P_Liu_Tyser_Xiang.py
  2. Figure 4.A-C: evaluate TemporalVAE by training on six training datasets and testing on two hold-out test datasets:
    python -u Fig4_human_data/TemporalVAE_humanEmbryo_ref6Dataset_queryOnXiang_Tyser.py
  3. Supplementary figure: k-fold test on the xiang19 dataset:
    python -u Fig4_human_data/TemporalVAE_humanEmbryo_kFoldOn_xiang19.py

Figure 5:

  1. Preprocess the Marmoset and Cynomolgus data:
    python -u Fig5_crossSpecies/preprocess_data_marmoset_inVivo.py
    python -u Fig5_crossSpecies/preprocess_data_Cyno.py
  2. Figure 5.A-D: evaluate TemporalVAE on cross-species prediction:
    python -u Fig5_crossSpecies/TemporalVAE_crossSpecies_referenceMelania_queryOnCynoAndMarmoset.py

Figure 6:

Identification of temporally sensitive genes by in silico perturbation. Here, we focus on the mouse embryo atlas as a showcase, thanks to its data consistency and broader time range.

python -u Fig6_identify_keyGenes/TemporalVAE_identify_keyGenes_mouseAtlas.py
python -u Fig6_identify_keyGenes/plot_perturbution_results.py
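
To illustrate the general idea of in silico perturbation, the toy sketch below alters one gene at a time, re-predicts the time, and ranks genes by how far the prediction moves. It is not the repository's implementation: the linear `predict_time` function stands in for the trained TemporalVAE predictor, and the gene names, weights, and the choice of scaling a gene to zero are all assumptions made for illustration.

```python
def predict_time(expression, weights):
    """Toy stand-in for a trained time predictor: a weighted sum of gene expression."""
    return sum(weights[gene] * value for gene, value in expression.items())


def perturbation_effects(expression, weights, scale=0.0):
    """Scale one gene at a time and record the shift in predicted time.

    scale=0.0 mimics a knock-out; values above 1.0 mimic over-expression.
    """
    baseline = predict_time(expression, weights)
    effects = {}
    for gene in expression:
        perturbed = dict(expression)          # copy so the original cell is untouched
        perturbed[gene] = expression[gene] * scale
        effects[gene] = predict_time(perturbed, weights) - baseline
    return effects


def rank_genes(effects):
    """Order genes by the absolute size of their effect, largest first."""
    return sorted(effects, key=lambda gene: abs(effects[gene]), reverse=True)


# Hypothetical gene names and values, chosen only for illustration.
cell = {"geneA": 2.0, "geneB": 1.0, "geneC": 4.0}
toy_weights = {"geneA": 0.5, "geneB": -1.5, "geneC": 0.1}
effects = perturbation_effects(cell, toy_weights)
ranking = rank_genes(effects)
```

In this toy example the ranking is geneB, geneA, geneC, because knocking out geneB shifts the predicted time the most. With a real model, the same loop would be applied to many cells and the shifts aggregated per gene.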

Todo

Figure 5 - RNA velocity:

  1. The data is from paper .
  2. Figure 5.C&E uses the hematopoiesis cell data; see Fig5_RNA_velocity/VAE_mouse_fineTune_Train_on_U_pairs_S_hematopoiesis.ipynb or run the code from the console:
    python -u Fig5_RNA_velocity/TemporalVAE_mouse_fineTune_Train_on_U_pairs_S.py --sc_file_name=240108mouse_embryogenesis/hematopoiesis --clf_weight=0.2
  3. Figure 5.D&F uses the neuron cell data; see Fig5_RNA_velocity/VAE_mouse_fineTune_Train_on_U_pairs_S_neuron.ipynb or run the code from the console:
    python -u Fig5_RNA_velocity/TemporalVAE_mouse_fineTune_Train_on_U_pairs_S.py --sc_file_name=240108mouse_embryogenesis/neuron --clf_weight=0.1
  4. The scVelo results in Figure 5.E&F are based on the .ipynb code provided by the dataset's paper; see Fig5_RNA_velocity/scVelo_hematopoiesis.ipynb and Fig5_RNA_velocity/scVelo_neuron.ipynb.
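
The two fine-tuning runs above differ only in the dataset path and the classifier weight, so they can be generated from a small table. The sketch below is a dry run that prints the commands rather than executing them; the helper name `fine_tune_commands` is illustrative and not part of the repository.

```python
import shlex

SCRIPT = "Fig5_RNA_velocity/TemporalVAE_mouse_fineTune_Train_on_U_pairs_S.py"

# (dataset path, classifier weight) pairs taken from the two commands above.
RUNS = [
    ("240108mouse_embryogenesis/hematopoiesis", 0.2),
    ("240108mouse_embryogenesis/neuron", 0.1),
]


def fine_tune_commands(python="python"):
    """Return one argument list per run, suitable for subprocess.run."""
    return [
        [python, "-u", SCRIPT, f"--sc_file_name={name}", f"--clf_weight={weight}"]
        for name, weight in RUNS
    ]


for command in fine_tune_commands():
    print(shlex.join(command))  # dry run: print each command instead of executing it
```

To execute the runs instead of printing them, each argument list can be passed to `subprocess.run(command, check=True)` from the repository root.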
