Transcriptome_Wide_Splicing_Analysis

FRASER_snakemake

This folder contains my Snakemake pipeline for running FRASER and FRASER2.

Installation

Please install the conda environments with:

micromamba create -f /path/to/fraser1.yml
micromamba create -f /path/to/fraser2.yml

The yml files can be found at:

FRASER_snakemake/conda_envs

Making file list

-Make a csv with a list of all of the full filepaths of the files you want to be included in your FRASER run.

-For an example, see symlink_blood.csv in the FRASER_snakemake/config directory

Setting up configurations

-Make a config.yaml file in the config folder with the following information

output_directory: "/file/path/to/output/directory"
input_directory: "/file/path/where/you/want/to/symlink/data/to"
file_list: "/csv/file/where/you/have/list/of/files/to/run.csv"
FRASER_type: "Both" OR "FRASER" OR "FRASER2" (see note below)

FRASER_type input options:
Please input ONE string from the three options below.
-"FRASER": to indicate you want to run FRASER with the outputs of theta, psi3, and psi5
-"FRASER2": to indicate you want to run FRASER with the Jaccard index output
-"Both": to indicate you want to receive all four outputs

The file_list is a .csv that contains one column (without a header) where every row is a complete filepath to the .bam or .bai files for your analysis cohort. Every individual in your cohort must have two lines in the file_list- one for the .bam file and one for the .bai file.

The first 12 lines of a file_list (csv) may look like this:

/desktop/STAROutput/genome/A1_star_hg38_Aligned.sortedByCoord.out.bam
/desktop/STAROutput/genome/A1_star_hg38_Aligned.sortedByCoord.out.bai
/desktop/STAROutput/genome/B1_star_hg38_Aligned.sortedByCoord.out.bam
/desktop/STAROutput/genome/B1_star_hg38_Aligned.sortedByCoord.out.bai
/desktop/STAROutput/genome/C1_star_hg38_Aligned.sortedByCoord.out.bam
/desktop/STAROutput/genome/C1_star_hg38_Aligned.sortedByCoord.out.bai
/desktop/STAROutput/genome/D1_star_hg38_Aligned.sortedByCoord.out.bam
/desktop/STAROutput/genome/D1_star_hg38_Aligned.sortedByCoord.out.bai
/desktop/STAROutput/genome/E1_star_hg38_Aligned.sortedByCoord.out.bam
/desktop/STAROutput/genome/E1_star_hg38_Aligned.sortedByCoord.out.bai
/desktop/STAROutput/genome/F1_star_hg38_Aligned.sortedByCoord.out.bam
/desktop/STAROutput/genome/F1_star_hg38_Aligned.sortedByCoord.out.bai

Running FRASER

-Go to the FRASER_snakemake/workflow directory

-run the following command:

./run_snakemake --config_file "/path/to/config/yaml" --profile "path/to/config/slurm_scg"

The files in the slurm_scg directory will allow the snakemake to be run with resources on Stanford's scg/ oak

run_results_paper

This folder contains the code for the main and supplemental figures of my manuscript (except Figure 1, which was made using Excel).

Installation

Please install the conda environments with:

micromamba create -f /path/to/fraser1.yml
micromamba create -f /path/to/fraser2.yml

The yml files can be found at:

run_results_paper/conda_envs

Setting up configurations

-Make a config.yaml file in the config folder with the following information

FRASER1_results_uncompiled: "/file/path/to/raw/FRASER/output.csv"
input_file_FRASER: "/file/path/to/csv/with/filepaths/and/ids/of/all/samples/run/in/FRASER.csv"
FRASER2_results_uncompiled: "/file/path/to/raw/FRASER2/output.csv"
input_file_FRASER2: "/file/path/to/csv/with/filepaths/and/ids/of/all/samples/run/in/FRASER2.csv"
metadata_file: "/file/path/to/metadata/file.csv"
mig_file: "/file/path/to/Homo_sapiens_gene.csv"
low_RIN: "/file/path/to/csv/with/samples/excluded/due/to/low/RIN.csv"
output_directory: "/file/path/to/desired/output/directory"
genesets: "/file/path/to/directory/with/genesets"
genes: "/file/path/to/FRASER/output/rds/file.rds"
genes_FRASER2: "/file/path/to/FRASER2/output/rds/file.rds"
missing: "/file/path/to/csv/with/samples/excluded/due/to/missing/info.csv"
size_run_dir: "/file/path/to/directory/with/different/iterations/of/FRASER/across/different/run/sizes"

Running run_results_paper

-Go to the run_results_paper/workflow directory

-run the following command:

./run_snakemake --config_file "/path/to/config/yaml" --profile "path/to/config/slurm_scg"

The files in the slurm_scg directory will allow the snakemake to be run with resources on Stanford's scg/ oak

run_results_review

This folder contains the code for the main and supplemental figures of the review of my manuscript (except Figure S1, which was made using Excel).

Installation

Please install the conda environments with:

micromamba create -f /path/to/fraser1.yml
micromamba create -f /path/to/fraser2.yml

The yml files can be found at:

run_results_review/conda_envs

Setting up configurations

-Make a config.yaml file in the config folder with the following information

FRASER1_results_uncompiled: "/file/path/to/raw/FRASER/output.csv"
input_file_FRASER: "/file/path/to/csv/with/filepaths/and/ids/of/all/samples/run/in/FRASER.csv"
FRASER2_results_uncompiled: "/file/path/to/raw/FRASER2/output.csv"
input_file_FRASER2: "/file/path/to/csv/with/filepaths/and/ids/of/all/samples/run/in/FRASER2.csv"
metadata_file: "/file/path/to/metadata/file.csv"
mig_file: "/file/path/to/Homo_sapiens_gene.csv"
output_directory: "/file/path/to/desired/output/directory"
genesets: "/file/path/to/directory/with/genesets"
genes: "/file/path/to/FRASER/output/rds/file.rds"
genes_FRASER2: "/file/path/to/FRASER2/output/rds/file.rds"
size_run_dir: "/file/path/to/directory/with/different/iterations/of/FRASER/across/different/run/sizes"

Running run_results_paper

-Go to the run_results_review/workflow directory

-run the following command:

./run_snakemake --config_file "/path/to/config/yaml" --profile "path/to/config/slurm_scg"

The files in the slurm_scg directory will allow the snakemake to be run with resources on Stanford's scg/ oak

misc_scripts

This folder contains miscellaneous scripts, such as the code for Figure 1, the creation of the metadata file, and a comparison of novel samples vs samples in Ungar et al., 2024 (PMID: PMC10802764).

Gene Information

Contains the gene sets used in the manuscript. They were originally from Cormier, et al., 2022 (PMID: 36376793) and can also be found at https://github.com/macarthur-lab/gene_lists

Sources of the gene sets:

Gene set	Source	Filename
Haploinsufficient	ClinGen dataset > Cormier et al., 2021	haploinsufficient.tsv
Autosomal recessive	Blekhman et al., 2008; Berg et al., 2013	autosomal_recessive.tsv
Autosomal dominant	Blekhman et al., 2008; Berg et al., 2013	autosomal_dominant.tsv
Olfactory receptor	Mainland, et al., 2015	olfactory_receptors.tsv
CRISPR non-essential	Hart et al., 2017	CRISPR_nonessential_genes.tsv
Developmental delay	https://www.ebi.ac.uk/gene2phenotype/downloads/DDG2P.csv.gz > Firth et al., 2011; Fitzgerald, et al., 2015; Wright et al., 2015; McRae, et al., 2017; Wright, et al., 2018	developmental_delay_genes.csv
OMIM	https://omim.org/downloads > Amberger et al., 2019	OMIM_genes.tsv

Name		Name	Last commit message	Last commit date
Latest commit History 183 Commits
FRASER_snakemake		FRASER_snakemake
Gene_Information		Gene_Information
misc_scripts		misc_scripts
run_results_paper		run_results_paper
run_results_phenotype_paper		run_results_phenotype_paper
run_results_review		run_results_review
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Transcriptome_Wide_Splicing_Analysis

FRASER_snakemake

Installation

Making file list

Setting up configurations

Running FRASER

run_results_paper

Installation

Setting up configurations

Running run_results_paper

run_results_review

Installation

Setting up configurations

Running run_results_paper

misc_scripts

Gene Information

About

Uh oh!

Releases

Packages

Languages

maurermaggie/Transcriptome_Wide_Splicing_Analysis

Folders and files

Latest commit

History

Repository files navigation

Transcriptome_Wide_Splicing_Analysis

FRASER_snakemake

Installation

Making file list

Setting up configurations

Running FRASER

run_results_paper

Installation

Setting up configurations

Running run_results_paper

run_results_review

Installation

Setting up configurations

Running run_results_paper

misc_scripts

Gene Information

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages