Persistent low-coverage “blue stripes” in Hi-C contact maps (dm6) #948

@Cathmyt

Hi,

I am trying to reproduce Hi-C contact maps in Drosophila, focusing on comparing TADs in heterochromatin vs. euchromatin regions.
Following the methods in Ramírez et al. 2018, I used bwa to align several datasets to dm6 and built matrices at DpnII restriction-fragment resolution. However, I consistently see dark blue vertical and horizontal stripes in the contact maps, corresponding to bins with very low read counts across the whole matrix.

Below are the commands I used:
bwa mem -A 1 -B 4 -E 50 -L 0 -t 8 dm6.fa SRR1658525_1.fastq | samtools view -Shb - > SRR1658525_1.bam

(hicexplorerenv) ➜  hicMatrix hicSumMatrices --matrices DpnII_HiC_SRR3452738_wdup_matrix.h5 Hi-C_NT5_Rep1_SRR1658525_wdup_matrix.h5 Hi-C_NT_25_Rep2_SRR1658526_wdup_matrix.h5 Hi-C_NT_53_Rep3_SRR1658527_wdup_matrix.h5 Hi-C_NT_89_Rep4_SRR1658528_wdup_matrix.h5 --outFileName Hi-C_dm6_DpnIII_replicatesMerged_wdup.h5
(hicexplorerenv) ➜  hicMatrix hicCorrectMatrix diagnostic_plot --chromosomes  chr2L chr2R chr3L chr3R chr4 chrM chrX --matrix Hi-C_dm6_DpnIII_replicatesMerged_wdup.h5 --plotName Hi-C_dm6_DpnIII_replicatesMerged_wdup_diagnosticplot.png
INFO:hicexplorer.hicCorrectMatrix:Removing 9827 zero value bins
INFO:hicexplorer.hicCorrectMatrix:mad threshold -1.4936131497083602
INFO:hicexplorer.hicCorrectMatrix:Saving diagnostic plot Hi-C_dm6_DpnIII_replicatesMerged_wdup_diagnosticplot.png
hicCorrectMatrix correct --chromosomes chr2L chr2R chr3L chr3R chr4 chrM chrX  --matrix Hi-C_dm6_DpnII_replicatesMerged_wdup.h5  --filterThreshold -1.5 4.5 --perchr -o  Hi-C_dm6_DpnII_replicatesMerged_wdup_v2.corrected.h5
INFO:hicexplorer.hicCorrectMatrix:matrix contains 192841041 data points. Sparsity 0.010.
normalisation factor is 0.0158589
normalisation factor is 0.0162551
normalisation factor is 0.0161872
normalisation factor is 0.015526
normalisation factor is 0.0181953
normalisation factor is 0.00188723
normalisation factor is 0.0168567

The diagnostic plot looks like this:
Image
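
For reference, as far as I understand it, the `mad threshold` in the log and the two `--filterThreshold` values are on a modified z-score scale computed from the per-bin total counts. The sketch below shows that idea in plain Python. The function names and the toy counts are hypothetical, and the exact computation inside hicCorrectMatrix may differ (for example in how the diagonal or the per-chromosome option is handled), so this is only an illustration of the thresholding logic, not the tool's implementation.

```python
from statistics import median


def modified_z_scores(row_sums):
    """Modified z-score per bin: 0.6745 * (x - median) / MAD."""
    med = median(row_sums)
    mad = median(abs(x - med) for x in row_sums)
    if mad == 0:
        raise ValueError("MAD is zero; modified z-scores are undefined")
    return [0.6745 * (x - med) / mad for x in row_sums]


def low_coverage_bins(row_sums, lower=-1.5):
    """Return indices of non-empty bins whose score falls below `lower`.

    Empty bins are excluded before scoring (assumption: this mirrors the
    "Removing ... zero value bins" step in the log).
    """
    nonzero = [(i, s) for i, s in enumerate(row_sums) if s > 0]
    scores = modified_z_scores([s for _, s in nonzero])
    return [i for (i, _), z in zip(nonzero, scores) if z < lower]


# Hypothetical per-bin totals: bin 0 is empty, bin 6 is barely covered.
toy_row_sums = [0, 100, 110, 90, 105, 95, 5, 100]
flagged = low_coverage_bins(toy_row_sums, lower=-1.5)
print(flagged)  # -> [6]
```

With a lower threshold of -1.5, only the barely covered bin is flagged in this toy example; the empty bin is handled separately and would need to be masked as well.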

I plotted the matrix at a few cut sites, and some locations have no contacts at all. For example, at chr3L:8000-8400 kb, the expected output from your navigator is:

Image


whereas mine looks like this:
Image

The empty locations become more prevalent in the silenced heterochromatin (HC) regions, where they show up as gaps in the Hi-C map.
I have tried merging more replicates and using datasets from a different enzyme (HindIII), but the stripes persist. Is there a recommended best practice in HiCExplorer for masking or removing these low-coverage bins? And is there a way to merge matrices from different restriction enzymes? Since I am working at restriction-fragment resolution, the matrix shapes do not match.
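
To make the second question concrete: one workaround I can think of is to project each fragment-resolution matrix onto a shared grid of fixed-size bins, so that the shapes agree, and only then sum the counts. The sketch below does this in plain Python on toy data. All names, fragment coordinates, and counts are hypothetical, and assigning a fragment to a bin by its midpoint is my own simplification, not something HiCExplorer is known to do. In practice, rebuilding each matrix with a fixed bin size (for example with `hicBuildMatrix --binSize`) before `hicSumMatrices` may be the simpler route.

```python
from collections import defaultdict


def rebin_to_fixed(contacts, fragments, bin_size):
    """Project fragment-level contacts onto fixed-size bins.

    contacts:  {(i, j): count} keyed by fragment indices
    fragments: list of (chrom, start, end), one entry per fragment index
    """
    def fixed_bin(frag):
        chrom, start, end = frag
        # Assumption: a fragment belongs to the bin containing its midpoint.
        return (chrom, ((start + end) // 2) // bin_size)

    out = defaultdict(float)
    for (i, j), count in contacts.items():
        a, b = fixed_bin(fragments[i]), fixed_bin(fragments[j])
        key = (a, b) if a <= b else (b, a)  # keep the upper triangle only
        out[key] += count
    return dict(out)


def sum_matrices(*matrices):
    """Sum sparse matrices that share the same fixed-bin keys."""
    total = defaultdict(float)
    for matrix in matrices:
        for key, count in matrix.items():
            total[key] += count
    return dict(total)


# Hypothetical fragments and contacts for two enzymes on the same region.
dpnii_frags = [("chr3L", 0, 400), ("chr3L", 400, 1200), ("chr3L", 1200, 2100)]
hindiii_frags = [("chr3L", 0, 1500), ("chr3L", 1500, 4000)]
dpnii_contacts = {(0, 1): 3, (1, 2): 2, (2, 2): 1}
hindiii_contacts = {(0, 0): 4, (0, 1): 5}

merged = sum_matrices(
    rebin_to_fixed(dpnii_contacts, dpnii_frags, bin_size=1000),
    rebin_to_fixed(hindiii_contacts, hindiii_frags, bin_size=1000),
)
for key in sorted(merged):
    print(key, merged[key])
```

The obvious cost is resolution: the merged matrix is no longer at restriction-fragment resolution, and fragments longer than the bin size are counted in a single bin only.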

Thanks!
