This file contains information about the code for "The Developmental Basis of SHH Medulloblastoma Heterogeneity".
There are two primary folders named "r-code" and "python-code". The "r-code" folder contains the scripts for processing single-cell RNA-seq data for this work and running Gene Set Variation Analysis (GSVA) on the bulk RNA-seq data from Cavalli et al. The "python-code" folder contains the rest of the code used to analyze the GSVA scores, proteomics data, spatial metabolomics data, and imaging data. Below is a description of what is contained in each file (which is also included at the beginning of each file) and other information for running these scripts.
#############################################
Individual File Descriptions
#############################################
## R-code
r-code/pub_scrna_analysis.R: code to process and analyze features from previously published scRNA-seq data from Hovestadt et al. 2019, Vladoiu et al. 2019, and Riemondy et al. 2022.
r-code/process_sn-rnaseq_vf.R: code to process and integrate the scRNA-seq data from the MBEN tumors newly sequenced for this study
r-code/generate_figures_vf.R: code to generate figures from the data processed in r-code/process_sn-rnaseq_vf.R.
r-code/correlation_analysis_vf.R: code to load previously published gene sets and generate new gene sets from this data to perform GSVA. This file also contains code to perform consensus clustering on the GSVA scores to determine which gene sets commonly cluster with each other.
r-code/validation_cohort.R: code to integrate snRNA-seq data from the validation cohort with the original MBEN tumors and generate the corresponding figures.
## Python-code
python-code/process_archer_proteomics_vf.ipynb: code to process published bulk proteomics data from Archer et al. 2018 to be used for downstream analysis of synaptic proteins.
python-code/fig2_cc.ipynb: code to generate the consensus clustering figure for Figure 2, using data generated by r-code/correlation_analysis_vf.R.
python-code/create_archer_and_cavalli_signatures.ipynb: code to generate gene set signatures for proteomic subtypes from Archer et al. 2018 and RNA/methylation-based subtypes from Cavalli et al. 2017.
python-code/cavalli_2017_analysis_vf.ipynb: code to analyze the Cavalli et al 2017 bulk transcriptomics dataset for Figure 3.
python-code/FMRP_calculations_vf.ipynb: code to analyze synaptic and FMRP genes in the proteomics data from Archer et al. 2018; used to create the plots in Figure 4.
python-code/image_analysis/image_functions.py: functions used for taurine image processing.
python-code/image_analysis/taurine_image_processing.ipynb: notebook showing image processing and taurine quantification in the four CHLA samples with variable VSNL1 and MAP2 staining.
python-code/image_analysis/ki67_analysis.ipynb: notebook to process and calculate ki67+ percentages in MB287 and CHLA-5.
python-code/maldi_code/MALDI_quality_control_vf.ipynb: code to perform initial quality control on MALDI data used in Figure 6
python-code/maldi_code/maldi_analysis_vf.ipynb: code to analyze the MALDI data and generate plots for Figure 6
python-code/maldi_code/joint_graphical_lasso_vf.ipynb: code to perform joint graphical lasso analysis on MALDI data and analyze network differences between samples with late-stage granule neurons
python-code/maldi_code/taurine_guanine_cardinal_data.ipynb: code that shows taurine/guanine spatial correlation analysis using the data processed with Cardinal (instead of Isoscope).
#############################################
Relevant Data for Scripts
#############################################
Many scripts in this directory rely on files in the "data" folder. This folder is currently empty, but the relevant omics and imaging data can be downloaded from the links below.
omics data link: https://www.dropbox.com/sh/ubwiecqi3w5oqo9/AACQtut-kKu3LH1Iomi83xk1a?dl=0
mIHC imaging data link: https://zenodo.org/records/10257144
taurine-focused IHC data link: https://zenodo.org/records/10256482
#############################################
snRNA-Seq Data
#############################################
The single-cell/single-nucleus RNA-sequencing data from this work is stored on GEO under accession GSE214469.
Additional published data can be accessed through their source publications:
Hovestadt 2019: GSE119926
Vladoiu 2019: GSE118068
Riemondy 2022: GSE156053
Cavalli 2017: GSE85218
Archer 2018: Supplementary Table 2 from PMID-30205044
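The GEO accessions above can be turned into landing-page links with a short helper. This is a minimal sketch and not part of the repository: the function name geo_url and the dictionary labels are hypothetical, and it assumes the standard GEO accession query URL (acc.cgi?acc=...). It only builds the links and does not download any data.

```python
from urllib.parse import urlencode

# Accessions copied from the list above; the dictionary labels are illustrative.
GEO_ACCESSIONS = {
    "This study (MBEN tumors)": "GSE214469",
    "Hovestadt 2019": "GSE119926",
    "Vladoiu 2019": "GSE118068",
    "Riemondy 2022": "GSE156053",
    "Cavalli 2017": "GSE85218",
}

# Assumption: the standard GEO accession viewer endpoint.
GEO_BASE = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"


def geo_url(accession):
    """Return the GEO landing-page URL for a series accession such as GSE85218."""
    if not (accession.startswith("GSE") and accession[3:].isdigit()):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    return f"{GEO_BASE}?{urlencode({'acc': accession})}"


if __name__ == "__main__":
    for label, accession in GEO_ACCESSIONS.items():
        print(f"{label}: {geo_url(accession)}")
```

The Archer 2018 proteomics table is not on GEO, so it is left out of the dictionary and still has to be fetched from the publication's supplementary material.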
#############################################
Package Versions
#############################################
## Package versions (python-code base folder)
conda: 4.8.3
python: 3.7.8
matplotlib: 3.4.1
numpy: 1.18.5
pandas: 1.3.5
scikit-learn: 0.21.1
scipy: 1.7.3
seaborn: 0.11.0
## Package versions (python-code/maldi_code)
conda: 4.7.12
python: 3.8.10
gglasso: 0.1.9
matplotlib: 3.4.2
networkx: 2.6.3
numpy: 1.18.5
pysal: 2.4.0
scanpy: 1.7.2
scikit-learn: 0.24.2
scipy: 1.7.1
seaborn: 0.11.1
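Before running the maldi_code notebooks, it may help to confirm that the active environment matches the versions listed above. The following is a minimal sketch, not a script shipped with the repository: the names EXPECTED_VERSIONS, installed_version, and find_mismatches are hypothetical, and it relies on importlib.metadata, which is in the standard library from Python 3.8 onward (so it fits the maldi_code environment but not the Python 3.7.8 base environment).

```python
from importlib import metadata  # standard library in Python 3.8+

# Versions copied from the "python-code/maldi_code" list above.
EXPECTED_VERSIONS = {
    "gglasso": "0.1.9",
    "matplotlib": "3.4.2",
    "networkx": "2.6.3",
    "numpy": "1.18.5",
    "pysal": "2.4.0",
    "scanpy": "1.7.2",
    "scikit-learn": "0.24.2",
    "scipy": "1.7.1",
    "seaborn": "0.11.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED_VERSIONS).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

A mismatch does not necessarily mean a notebook will fail; the check only reports where the environment differs from the one the analysis was run in.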
## Package versions (R-code)
R: 4.1.2
biomaRt: 2.50.2
ConsensusClusterPlus: 1.58.0
data.table: 1.14.2
ggplot2: 3.3.5
gridExtra: 2.3
GSVA: 1.42.0
harmony: 0.1.0
monocle3: 1.0.0
RColorBrewer: 1.1.2
readxl: 1.3.1
Seurat: 4.1.0
SeuratWrappers: 0.3.0
stringr: 1.4.0
#############################################
Run Time and Installation Information
#############################################
Most of these scripts can be run in a matter of minutes by going through each step of the code. A few files dealing with data generation (such as scRNA-Seq processing and graphical lasso parameter sweeps) take longer and may run for closer to an hour. Additionally, installing the necessary packages may take tens of minutes for the R and Python scripts.