epi.ct.bld_mtx_bed drops information from certain threads #138

smithcathy opened this issue Dec 15, 2023 · 0 comments

I am using episcanpy to build h5ad files from snATAC-Seq data. I use the following code in episcanpy v0.4.0:

epi_peak = epi.ct.load_peaks( peaks_file )

ann = epi.ct.bld_mtx_bed( fragments_file,
                          feature_region = epi_peak,
                          chromosomes = list( epi_peak.keys() ),
                          thread = 6 )

ann.write_h5ad( outdir + samp + '_pruned.h5ad' )

This runs within a snakemake pipeline, where peaks_file is a sample-specific macs2 peak file and fragments_file is the BED-format output from cellranger-arc (chrom, start, end, cell_id/barcode).

During QC, I noticed a contiguous block of peaks/features with 0 counts across all cells/barcodes. This affected only a small fraction of reads (<1% of the total read count), but it left those peaks/features without any reads/fragments in the affected samples. However, I can see reads/fragments in those regions in the input fragments file, and the macs2 peaks were called from the same fragments file, so this should not be happening. The result is not reproducible: subsequent runs do not skip the same region. I suspect that a thread is occasionally lost into the ether, resulting in 0 read/fragment counts for a small section of the genome.
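
For anyone who wants to check their own output for the same symptom, here is a minimal sketch of how a block of consecutive all-zero features could be located. It works on a plain sequence of per-feature totals and is only an illustration: `zero_runs` and `min_len` are hypothetical names introduced here, not part of episcanpy, and the example totals are made up.

```python
def zero_runs(feature_totals, min_len=2):
    """Return (start, end) index pairs, end exclusive, for runs of at least
    `min_len` consecutive features whose total count is zero.

    `zero_runs` and `min_len` are hypothetical names for this sketch.
    """
    runs = []
    start = None  # index where the current run of zeros began, if any
    for i, total in enumerate(feature_totals):
        if total == 0:
            if start is None:
                start = i
        elif start is not None:
            # The run of zeros ended at i; keep it only if it is long enough.
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Handle a run of zeros that extends to the last feature.
    if start is not None and len(feature_totals) - start >= min_len:
        runs.append((start, len(feature_totals)))
    return runs


# Made-up per-feature totals (sum over all cells), in genomic order.
totals = [5, 0, 3, 0, 0, 0, 7, 2, 0, 0]
print(zero_runs(totals))  # [(3, 6), (8, 10)]
```

With an AnnData object, the per-feature totals could presumably be obtained with something like `np.asarray(ann.X.sum(axis=0)).ravel()`, assuming `ann.X` is a cells-by-features matrix whose features are stored in genomic order. Isolated zero-count peaks are expected in sparse snATAC-Seq data, so a larger `min_len` helps separate those from a dropped chunk.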
