PIGEAN method
Uses Python 3.8. Install the dependencies with:
pip install -r requirements.txt
Base command (append the input and gene set options described below):
python3 ./priors.py gibbs \
  --first-for-sigma-cond \
  --sigma-power -2 \
  --gwas-detect-high-power 100 \
  --gwas-detect-low-power 10 \
  --num-chains 10 \
  --num-chains-betas 4 \
  --max-num-iter 500 \
  --background-prior 0.05 \
  --filter-gene-set-p 0.005 \
  --max-num-gene-sets 4000 \
  --min-gene-set-size 5 \
  --gene-loc-file ./NCBI37.3.plink.gene.loc \
  --gene-map-in ./gencode.gene.map \
  --gene-loc-file-huge ./refGene_hg19_TSS.subset.loc \
  --exons-loc-file-huge ./NCBI37.3.plink.gene.exons.loc \
  --gene-stats-out gs.out \
  --gene-set-stats-out gss.out \
  --gene-gene-set-stats-out ggss.out \
  --gene-effectors-out ge.out
The base command writes four output files:
- gs.out (gene data)
- gss.out (gene set data)
- ggss.out (combined data)
- ge.out (gene effectors data)
The base command also needs four reference files, passed with these flags:
- gene-loc-file: NCBI37.3.plink.gene.loc
- gene-map-in: gencode.gene.map
- gene-loc-file-huge: refGene_hg19_TSS.subset.loc
- exons-loc-file-huge: NCBI37.3.plink.gene.exons.loc
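Before a long run it can help to confirm that the four reference files are in place. The following is a minimal sketch, not part of PIGEAN: the helper name missing_reference_files is hypothetical, and it assumes the files sit in the directory you run from, as the ./ paths in the base command suggest.

```python
from pathlib import Path

# Flag -> file name, copied from the list of required files above.
REQUIRED_FILES = {
    "--gene-loc-file": "NCBI37.3.plink.gene.loc",
    "--gene-map-in": "gencode.gene.map",
    "--gene-loc-file-huge": "refGene_hg19_TSS.subset.loc",
    "--exons-loc-file-huge": "NCBI37.3.plink.gene.exons.loc",
}


def missing_reference_files(directory="."):
    """Return (flag, file name) pairs whose file is absent from `directory`.

    Hypothetical helper for illustration; it only checks that each file
    exists, not that its contents are valid.
    """
    base = Path(directory)
    return [
        (flag, name)
        for flag, name in REQUIRED_FILES.items()
        if not (base / name).is_file()
    ]


if __name__ == "__main__":
    for flag, name in missing_reference_files():
        print(f"missing {name} (needed by {flag})")
```

An empty result means all four files were found.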
Three types of input can be run: GWAS summary statistics, gene lists, or exome data. The flags for each are:
GWAS sumstats:
--gwas-in pigean.sumstats.gz
--gwas-chrom-col CHROM
--gwas-pos-col POS
--gwas-p-col P
--gwas-n-col N
Gene lists:
--positive-controls-in gene_list.tsv
--positive-controls-id-col 1
--positive-controls-prob-col 2
--positive-controls-no-header True
--positive-controls-all-in ./refGene_hg19_TSS.subset.loc
--positive-controls-all-no-header True
--positive-controls-all-id-col 1
Exome data:
--exomes-in exomes.sumstats.gz
--exomes-gene-col Gene
--exomes-p-col P-value
--exomes-beta-col Effect
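The base command and one group of input flags are meant to be combined into a single invocation. The sketch below shows one way to assemble it in Python, here with the GWAS flags. The function build_command is a hypothetical helper, not part of PIGEAN, and the file names are the example names used in this README. It builds the argument list only and does not start the run.

```python
import shlex

# Flags copied verbatim from the base command in this README.
BASE_FLAGS = """
--first-for-sigma-cond --sigma-power -2
--gwas-detect-high-power 100 --gwas-detect-low-power 10
--num-chains 10 --num-chains-betas 4 --max-num-iter 500
--background-prior 0.05 --filter-gene-set-p 0.005
--max-num-gene-sets 4000 --min-gene-set-size 5
--gene-loc-file ./NCBI37.3.plink.gene.loc
--gene-map-in ./gencode.gene.map
--gene-loc-file-huge ./refGene_hg19_TSS.subset.loc
--exons-loc-file-huge ./NCBI37.3.plink.gene.exons.loc
--gene-stats-out gs.out --gene-set-stats-out gss.out
--gene-gene-set-stats-out ggss.out --gene-effectors-out ge.out
"""

# GWAS input flags copied from this README; swap in the gene list or
# exome flags for the other input types.
GWAS_FLAGS = """
--gwas-in pigean.sumstats.gz --gwas-chrom-col CHROM
--gwas-pos-col POS --gwas-p-col P --gwas-n-col N
"""


def build_command(input_flags, gene_set_files):
    """Return the argument list for one run (hypothetical helper)."""
    cmd = ["python3", "./priors.py", "gibbs"]
    cmd += shlex.split(BASE_FLAGS)      # splits on spaces and newlines
    cmd += shlex.split(input_flags)
    for path in gene_set_files:         # one --X-in per gene set file
        cmd += ["--X-in", path]
    return cmd


if __name__ == "__main__":
    cmd = build_command(GWAS_FLAGS, ["gene_set_list_mouse_2024.txt"])
    print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

Passing the returned list to subprocess.run(cmd, check=True) would launch the run without going through a shell, so no line-continuation characters are needed.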
Additionally, you must pass in one or more gene set files using the --X-in option,
e.g.
--X-in gene_set_list_mouse_2024.txt
--X-in gene_set_list_string_notext_medium_processed.txt
Each line of a gene set file gives the gene set name followed by its genes:
PPIA_string_fusion RGPD3:0.93 RGPD8:0.92 RGPD2:0.92 RGPD1:0.92 RGPD5:0.92 RANBP1:1 RGPD4:0.93 FKBP4:0.69
PPIH_string_fusion RGPD3:0.87 RGPD8:0.86 RGPD2:0.86 RGPD1:0.85 RGPD5:0.86 RANBP1:1 FKBP4:0.8
The probabilities (the 0.93 in RGPD3:0.93) are optional if all genes are equally weighted.
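To make the line format concrete, here is a minimal parsing sketch. It is not PIGEAN's own reader: the function name parse_gene_set_line is hypothetical, fields are assumed to be separated by whitespace, and a gene written without a probability is given a weight of 1.0, which is an assumption made for illustration.

```python
def parse_gene_set_line(line, default_weight=1.0):
    """Split one gene set line into (set name, {gene: weight}).

    Assumptions (not confirmed by the README): fields are separated by
    whitespace, and a gene without ":weight" gets `default_weight`.
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty gene set line")
    name, members = fields[0], fields[1:]
    weights = {}
    for member in members:
        # "RGPD3:0.93" -> ("RGPD3", ":", "0.93"); "FKBP4" -> ("FKBP4", "", "")
        gene, sep, weight = member.partition(":")
        weights[gene] = float(weight) if sep else default_weight
    return name, weights


if __name__ == "__main__":
    name, weights = parse_gene_set_line(
        "PPIH_string_fusion RGPD3:0.87 RANBP1:1 FKBP4:0.8"
    )
    print(name, weights)
```

A line such as "PPIH_string_fusion RGPD3 RANBP1 FKBP4" would parse the same way, with every gene receiving the default weight.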