Commit cab3c14

HiCPlotter version 0.5.1
A manual is now available for HiCPlotter parameters. Epilogos plotting: HiCPlotter can now visualize Hi-C data together with Epilogos (http://compbio.mit.edu/epilogos/#) from the Kellis lab; please check the manual for the parameters. Whole genome plotting with the triple sparse file format is fixed: please use the -wg parameter together with -chr (-chr chrY for whole genome interactions; otherwise enter the chromosome name up to which interaction profiles will be plotted). Please check the ReadMe page for examples. A new parameter (-hc) is introduced to color the area under histograms; as with -tc/-ac, please provide a hexadecimal number.
1 parent dae286a commit cab3c14

7 files changed, with 244 additions and 102 deletions.

HiCPlotter.py

Lines changed: 184 additions & 76 deletions
Large diffs are not rendered by default.

HiCPlotterManual.pdf

128 KB
Binary file not shown.

README.md

Lines changed: 60 additions & 26 deletions
@@ -25,6 +25,8 @@ _HiCPlotter is purposefully designed with the least amount of dependencies to ma
2525

2626
# Arguments
2727

28+
_For reading more about each parameter, please check the [manual](HiCPlotterManual.pdf)._
29+
2830
Required parameters:
2931

3032
files (-f) : a list of filenames to be plotted.
@@ -35,11 +37,12 @@ _HiCPlotter is purposefully designed with the least amount of dependencies to ma
3537
Optional parameters:
3638

3739
verbose (-v) : print version and arguments into a file.
38-
tripleColumn (-tri) : an integer if input file is from HiC-Pro pipeline.
40+
tripleColumn (-tri) : a boolean if input file is from HiC-Pro pipeline.
3941
bedFile (-bed) : a file name for bin annotations, if -tri parameter is set.
4042
histograms (-hist) : a list of filenames to be plotted as histogram.
4143
histLabels (-hl) : a list of labels for the histograms.
4244
fillHist (-fhist) : a list whether each histogram will be filled (1) or not (0:default).
45+
histColors (-hc) : a list of hexadecimal numbers for histogram filling colors.
4346
histMax (-hm) : a list of integers for the maximum values of histograms.
4447
start (-s) : retain after x-th bin (0:default).
4548
end (-e) : continues until x-th bin (default: length of the matrix).
@@ -48,30 +51,32 @@ _HiCPlotter is purposefully designed with the least amount of dependencies to ma
4851
tilePlots (-t) : a list of filenames to be plotted as tile plots.
4952
tileLabels (-tl) : a list of labels for the tile plots.
5053
tileColors (-tc) : a list of hexadecimal numbers for coloring the tile plots.
51-
tileText (-tt) : an integer whether text will be displayed above tiles (0:default) or not (1).
54+
tileText (-tt) : a boolean for displaying text above tiles (1) or not (0:default).
5255
arcPlots (-a) : a list of filenames to be plotted as arc plots.
5356
arcLabels (-al) : a list of labels for the arc plots.
5457
arcColors (-ac) : a list of hexadecimal numbers for coloring the arc plots.
55-
highlights (-high) : an integer for enabling highlights on the plot (0:default), enable(1).
58+
highlights (-high) : a boolean for highlights on the plot: disabled (0:default) or enabled (1).
5659
highFile (-hf) : a file name for a bed file to highlight selected intervals.
5760
peakFiles (-peak) : a list of filenames to be plotted on the matrix.
61+
epiLogos (-ep) : a filename to be plotted as Epilogos format.
62+
imputed (-im) : a boolean for plotting imputed Epilogos (1) instead of observed (0:default).
5863
window (-w) : an integer of distance to calculate insulation score.
5964
tadRange (-tr) : an integer of window to calculate local minima for TAD calls.
6065
fileHeader (-fh) : an integer for how many lines should be ignored in the matrix file (1:default).
6166
fileFooter (-ff) : an integer for how many lines should be skipped at the end of the matrix file (0:default).
6267
smoothNoise (-sn) : a floating-point number to clean noise in the data.
6368
heatmapColor (-hmc) : an integer for choosing heatmap color codes: Greys(0), Reds(1), YellowToBlue(2), YellowToRed(3-default), Hot(4), BlueToRed(5).
64-
cleanNANs (-cn) : an integer for replacing NaNs in the matrix with zeros (1:default) or not (0).
65-
plotTriangular (-ptr) : an integer for plotting rotated half matrix (1:default) or not (0).
66-
plotTadDomains (-ptd) : an integer for plotting TADs identified by HiCPlotter (1) or not (0:default).
67-
plotPublishedTadDomins (-pptd) : an integer for plotting TADs from Dixon et, al. 2012 (1:default) or not (0).
68-
plotDomainsAsBars (-ptdb) : an integer for plotting TADs as bars (1) instead of triangles (0:default)
69-
highResolution (-hR) : an integer whether plotting high resolution (1:default) or not (0).
70-
plotInsulation (-pi) : an integer for plotting insulation scores (0:default) or plot (1).
71-
randomBins (-rb) : an integer for plotting random resolution data (1:default) or not (0).
72-
wholeGenome (-wg) : an integer for plotting whole genome interactions (1:default) or not (0).
69+
cleanNANs (-cn) : a boolean for replacing NaNs in the matrix with zeros (1:default) or not (0).
70+
plotTriangular (-ptr) : a boolean for plotting rotated half matrix (1:default) or not (0).
71+
plotTadDomains (-ptd) : a boolean for plotting TADs identified by HiCPlotter (1) or not (0:default).
72+
plotPublishedTadDomins (-pptd) : a boolean for plotting TADs from Dixon et al. 2012 (1:default) or not (0).
73+
plotDomainsAsBars (-ptdb) : a boolean for plotting TADs as bars (1) instead of triangles (0:default)
74+
highResolution (-hR) : a boolean whether plotting high resolution (1:default) or not (0).
75+
plotInsulation (-pi) : a boolean for plotting insulation scores (1) or not (0:default).
76+
randomBins (-rb) : a boolean for plotting random resolution data (1:default) or not (0).
77+
wholeGenome (-wg) : a boolean for plotting whole genome interactions (1:default) or not (0).
7378
plotCustomDomains (-pcd) : a list of file names to be plotted beneath the matrix.
74-
publishedTadDomainOrganism (-ptdo) : an integer for plotting human (1:default) or mouse (0) TADs from Dixon et, al. 2012.
79+
publishedTadDomainOrganism (-ptdo) : a boolean for plotting human (1:default) or mouse (0) TADs from Dixon et al. 2012.
7580
customDomainsFile (-pcdf) : a list of filenames to be plotted as TADs for each experiments.
7681

7782
# Input Files
@@ -215,7 +220,7 @@ _Color code of the heatmaps can be changed with -hmc parameter_
215220

216221
# Example cases with publicly available datasets
217222

218-
## Visualization of ChIP-Seq and 4C data as histograms
223+
## Histogram Plotting
219224

220225
_Multiple histograms for the same matrix should be separated by commas (the same applies to the histogram label and fill histogram parameters)._
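As a rough illustration of the comma-separated convention, the sketch below pairs each histogram file with its label and fill flag and checks that the lists line up. The function name `pair_histogram_args` and the file names in the usage lines are hypothetical; this is not code from HiCPlotter.py, and whether HiCPlotter validates list lengths in this way is an assumption.

```python
def pair_histogram_args(files, labels, fills=None):
    """Pair comma-separated -hist, -hl and -fhist style values (illustrative only)."""
    file_list = files.split(",")
    label_list = labels.split(",")
    # The README lists 0 (not filled) as the default for -fhist.
    if fills:
        fill_list = [int(x) for x in fills.split(",")]
    else:
        fill_list = [0] * len(file_list)
    if not (len(file_list) == len(label_list) == len(fill_list)):
        raise ValueError("histogram files, labels and fill flags must have the same length")
    return list(zip(file_list, label_list, fill_list))


# Hypothetical file names, used only to show the shape of the result.
pairs = pair_histogram_args("a.bedGraph,b.bedGraph", "Hoxd4-ES,Hoxd13-ES", "1,1")
for path, label, fill in pairs:
    print(path, label, fill)
```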
221226

@@ -228,33 +233,31 @@ _Data taken from:_ 4C : [Noordermer et, al. Elife 2014](http://elifesciences.org
228233
<img src="examplePlots/HoxD-chr2.ofBins(1830-1880).40K.jpeg" alt="Example plot from HiCPlotter">
229234
</figure>
230235

231-
## Visualization of ChIP-Seq and RAP-Seq data as histograms
232-
233-
_Data taken from:_ RAP-seq : [Engreitz et al. Science 2014](http://www.sciencemag.org/content/341/6147/1237973.long), Hi-C : [Dixon et, al. Nature 2012](http://www.nature.com/nature/journal/v485/n7398/full/nature11082.html?WT.ec_id=NATURE-20120517) and H3K27me3 : [Mouse ENCODE Project](http://www.mouseencode.org/)
234236

235-
_Rotated matrix can be removed with -ptr 0 parameter_
237+
_The fill color for the area under the curve can be specified as a hexadecimal number with the -hc parameter._
236238

237-
python HiCPlotter.py -f data/HiC/Mouse/mES.chrX -n mES -r 40000 -chr chrX -o RAP -fh 0 -hist data/HiC/Mouse/GSE46918_pSM33-0hr-Xist_vs_Input.W10000_O7500.bedGraph,data/HiC/Mouse/GSE46918_pSM33-1hr-Xist_vs_Input.W10000_O7500.bedGraph,data/HiC/Mouse/GSE46918_pSM33-2hr-Xist_vs_Input.W10000_O7500.bedGraph,data/HiC/Mouse/GSE46918_pSM33-3hr-Xist_vs_Input.W10000_O7500.bedGraph,data/HiC/Mouse/GSE46918_pSM33-6hr-Xist_vs_Input.W10000_O7500.bedGraph,data/HiC/Mouse/wgEncodeLicrHistoneEsb4H3k27me3ME0C57bl6StdSig.chrX.bedGraph -hl Xist_0h,Xist_1h,Xist_2h,Xist_3h,Xist_6h,H3K27me3_0h -pi 0 -ptr 0 -fhist 0,1,1,1,1,0 -hmc 4 -sn 0
239+
python HiCPlotter.py -f data/HiC/Mouse/mES.chr2 -n mES -chr chr2 -r 40000 -o HoxDc -hist data/HiC/Mouse/GSM1334415_4C_Mouse_EScells_Hoxd4_smoothed_11windows.bedGraph,data/HiC/Mouse/GSM1334412_4C_Mouse_EScells_Hoxd13_smoothed_11windows.bedGraph -hl Hoxd4-ES,Hoxd13-ES -s 1830 -e 1880 -fh 0 -pi 0 -pcd 1 -pcdf data/mES_domains_mm9.bed -fhist 1,1 -hm 2000,2000 -hc 143D52,9ACD32
238240

239241
<figure>
240-
<figcaption align="middle">**Xist spreading during initiation of X-chromosome inactivation**</figcaption>
241-
<img src="examplePlots/RAP-chrX.ofBins(0-4167).40K.jpeg" alt="Example plot from HiCPlotter">
242+
<figcaption align="middle">**Colored Histograms**</figcaption>
243+
<img src="examplePlots/HoxD-chr2.ofBins(1830-1880).Colored.40K.jpeg" alt="Example plot from HiCPlotter">
242244
</figure>
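To make the hexadecimal color convention concrete, here is a minimal sketch that turns a value such as the `-hc 143D52,9ACD32` argument above into RGB triples. The helper name `parse_hex_colors` is hypothetical and the code is not taken from HiCPlotter.py; how HiCPlotter parses these values internally is not shown in this README.

```python
def parse_hex_colors(arg):
    """Convert a comma-separated list of 6-digit hex colors to (R, G, B) tuples (0-255)."""
    colors = []
    for token in arg.split(","):
        token = token.strip().lstrip("#")
        if len(token) != 6:
            raise ValueError("expected a 6-digit hexadecimal color, got %r" % token)
        # Read the red, green and blue channels, two hex digits each.
        colors.append(tuple(int(token[i:i + 2], 16) for i in (0, 2, 4)))
    return colors


print(parse_hex_colors("143D52,9ACD32"))  # [(20, 61, 82), (154, 205, 50)]
```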
243245

244-
## Visualization of ChIP-Seq as histograms, ChIA-Pet as arcs and Polycomb domains as tiles
246+
247+
## Arcs plotting
245248

246249
_Arc plots require a bedGraph file (-a file1). The color can be specified as a hexadecimal number (-ac B4B4B4) or per arc with RGB colors given in the bedGraph file._
247250

248251
_Data taken from:_ SMC ChIA-Pet and Polycomb Domains: [Dowen et al. Cell 2014](http://www.sciencedirect.com/science/article/pii/S0092867414011799), Hi-C and TADs : [Dixon et al. Nature 2012](http://www.nature.com/nature/journal/v485/n7398/full/nature11082.html?WT.ec_id=NATURE-20120517) and H3K27me3 : [Mouse ENCODE Project](http://www.mouseencode.org/)
249252

250-
python s.py -f data/HiC/Mouse/mES.chr3 -n mES -chr chr3 -o Bhlhe22 -r 40000 -s 400 -e 500 -a data/HiC/Mouse/mESC_SMC_ChIPPet.bed -al SMC -hist data/HiC/Mouse/GSM747534_chr3.bedGraph,data/HiC/Mouse/wgEncodeLicrHistoneEsb4H3k27me3ME0C57bl6StdSig.chr3.bedGraph -hl CTCF,H3K27me3 -pi 0 -ptr 0 -t data/HiC/Mouse/mm9_Polycomb_domains.bed -tl Polycomb -tc 00CCFF -ac B4B4B4 -fh 0
253+
python HiCPlotter.py -f data/HiC/Mouse/mES.chr3 -n mES -chr chr3 -o Bhlhe22 -r 40000 -s 400 -e 500 -a data/HiC/Mouse/mESC_SMC_ChIPPet.bed -al SMC -hist data/HiC/Mouse/GSM747534_chr3.bedGraph,data/HiC/Mouse/wgEncodeLicrHistoneEsb4H3k27me3ME0C57bl6StdSig.chr3.bedGraph -hl CTCF,H3K27me3 -pi 0 -ptr 0 -t data/HiC/Mouse/mm9_Polycomb_domains.bed -tl Polycomb -tc 00CCFF -ac B4B4B4 -fh 0
251254

252255
<figure>
253256
<figcaption align="middle">**Bhlhe22 locus in mouse ES cells**</figcaption>
254257
<img src="examplePlots/Bhlhe22-chr3.ofBins(400-475).40K.jpeg" alt="Example plot from HiCPlotter">
255258
</figure>
256259

257-
## Visualization of 4C data as histograms and Enhancers as tiles with text
260+
## Tiles plotting
258261

259262
_If bedGraph file for tile plotting contains text in 6th column, features can be plotted above tiles with -tt parameter._
260263

@@ -268,6 +271,23 @@ _Data taken from:_ 4C : [Lonfat et, al. Science 2014](http://www.sciencemag.org/
268271
<img src="examplePlots/Digit.vs.GT-chr6.ofBins(1295-1338).40K.jpeg" alt="Example plot from HiCPlotter">
269272
</figure>
270273

274+
## Epilogos plotting
275+
276+
Epilogos was developed by Wouter Meuleman and Manolis Kellis for the visualization and analysis of chromatin state model data across various cell types. For more about Epilogos, see the [project page](http://compbio.mit.edu/epilogos/#).
277+
278+
You can download the Epilogos data [here](http://egg2.wustl.edu/roadmap/data/byFileType/chromhmmSegmentations/ChmmModels/epilogos/).
279+
280+
python HiCPlotter.py -f data/HiC/Human/GM12878-chr10_25kb.RAWobserved_KRnormalizedMatrix.txt -chr chr10 -fh 0 -n GM12878 -o Epilogos -r 25000 -ep qcat -hist RepliSeq.bedGraph -hl RepliSeq -fhist 1 -s 2500 -e 5000 -mm 8
281+
282+
<figure>
283+
<figcaption align="middle">**Epilogos with Replication Timing and Hi-C data**</figcaption>
284+
<img src="examplePlots/Epilogos-chr10.ofBins(2500-5000).25K.jpeg" alt="Example plot from HiCPlotter">
285+
</figure>
286+
287+
_Use the -im parameter if you downloaded the qcat file from the imputed/ folder._
288+
289+
_Currently, the color of each state for Epilogos plotting is hard-coded in HiCPlotter; therefore, please use qcat files from the imputed or observed folders._
290+
271291
## Highlighting selected loci on the plot
272292

273293
_Highlights on the plots can be drawn with -high 1 and passing a bed file name to -hf parameter._
@@ -309,6 +329,22 @@ _Data taken from:_ Hi-C : [Seitan et, al. Genome Research 2014](http://genome.cs
309329
<img src="examplePlots/Tcell-WholeGenome-1400K.jpeg" alt="Example plot from HiCPlotter">
310330
</figure>
311331

332+
### Whole genome plotting with triple sparse files
333+
334+
_The -chr parameter designates the end chromosome; for example, -chr chr11 will plot interactions from chr1 to chr11._
335+
336+
<figure>
337+
<figcaption align="middle">**hES whole genome interactions**</figcaption>
338+
<img src="examplePlots/hES-WholeGenome.chrY-1000K.jpeg" alt="Example plot from HiCPlotter">
339+
</figure>
340+
341+
_Please use (-chr chrY) for whole genome interaction plots._
342+
343+
<figure>
344+
<figcaption align="middle">**hES interactions from chr1 to chr11**</figcaption>
345+
<img src="examplePlots/hES-WholeGenome.chr11-1000K.jpeg" alt="Example plot from HiCPlotter">
346+
</figure>
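The end-chromosome convention can be pictured with the short sketch below, which lists the chromosomes covered for a given -chr value. The chromosome ordering (chr1 to chr22, then chrX, chrY, as for human) and the helper name `chromosomes_up_to` are assumptions made for illustration; they are not taken from HiCPlotter.py.

```python
# Assumed human chromosome order; other organisms would need a different list.
HUMAN_CHROMOSOMES = ["chr%d" % i for i in range(1, 23)] + ["chrX", "chrY"]


def chromosomes_up_to(end_chrom, order=HUMAN_CHROMOSOMES):
    """Return all chromosomes from the first one through end_chrom, inclusive."""
    if end_chrom not in order:
        raise ValueError("unknown chromosome name: %r" % end_chrom)
    return order[: order.index(end_chrom) + 1]


print(chromosomes_up_to("chr11"))      # chr1 through chr11
print(len(chromosomes_up_to("chrY")))  # 24, i.e. the whole genome
```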
347+
312348
## 5C data visualization
313349

314350
_Random binned 5C data plotting can be activated with the -rb parameter (please note: currently only matrices and triangular plots can be plotted with this option)._
@@ -371,8 +407,6 @@ _Data taken from:_ 5C data [Nora et, al. Nature 2012](http://www.nature.com/natu
371407
Original : from scipy.signal import argrelextrema (line 20)
372408
Try this : #from scipy.signal import argrelextrema (line 20). Use HiCPlotter with the -pi 0 and -ptd 0
373409

374-
*If you received the following error: "IOError: encoder jpeg not available", please change extensions of '.jpeg' to '.png' after line 880.
375-
376410
*If you would like to run HiCPlotter in verbose mode, please use the -v parameter, which creates a log file listing the parameters the program was run with.
377411

378412
*If you need to convert bigWig files to bedGraph files, you can use kentUtils/bigWigToBedGraph executable.