-
Notifications
You must be signed in to change notification settings - Fork 48
Open
Labels
Description
Original report by Tobias Marschall (Bitbucket: tobiasmarschall, GitHub: tobiasmarschall).
I just tried to use a bgzipped reference (see Issue #149), but failed:
$ whatshap phase --chromosome chr22 --reference hg38-capitalized.fasta --ignore-read-groups --output NA12878.chr22.phased-with-allcaps-ref.vcf.gz NA12878.chr22.unphased.vcf.gz /MMCI/TM/nanopore/work/jebler/paper_experiments/data/np2.fixed/NA12878.hg38.np2.chr22.bam
[WORKS AS EXPECTED]
$ bgzip -c hg38-capitalized.fasta > hg38-capitalized.fasta.gz
$ samtools faidx hg38-capitalized.fasta.gz
$ whatshap phase --chromosome chr22 --reference hg38-capitalized.fasta.gz --ignore-read-groups --output test.vcf.gz NA12878.chr22.unphased.vcf.gz /MMCI/TM/nanopore/work/jebler/paper_experiments/data/np2.fixed/NA12878.hg38.np2.chr22.bamThis is WhatsHap 0.14.1 running under Python 3.6.2
Working on 1 samples from 1 family
======== Working on chromosome 'chr22'
---- Processing individual NA12878
Using maximum coverage per sample of 15X
Number of variants skipped due to missing genotypes: 0
Number of remaining heterozygous variants: 31534
Reading alignments and detecting alleles ...
Traceback (most recent call last):
File "/MMCI/TM/structvar/work/marschal/miniconda3/bin/whatshap", line 11, in <module>
load_entry_point('whatshap==0.14.1', 'console_scripts', 'whatshap')()
File "/MMCI/TM/structvar/work/marschal/miniconda3/lib/python3.6/site-packages/whatshap/__main__.py", line 82, in main
module.main(args)
File "/MMCI/TM/structvar/work/marschal/miniconda3/lib/python3.6/site-packages/whatshap/phase.py", line 768, in main
run_whatshap(**vars(args))
File "/MMCI/TM/structvar/work/marschal/miniconda3/lib/python3.6/site-packages/whatshap/phase.py", line 480, in run_whatshap
readset, vcf_source_ids = read_reads(readset_reader, chromosome, phasable_variant_table.variants, bam_sample, fasta, phase_input_vcfs, numeric_sample_ids, phase_input_bam_filenames)
File "/MMCI/TM/structvar/work/marschal/miniconda3/lib/python3.6/site-packages/whatshap/phase.py", line 120, in read_reads
readset = readset_reader.read(chromosome, variants, sample, reference)
File "/MMCI/TM/structvar/work/marschal/miniconda3/lib/python3.6/site-packages/whatshap/variants.py", line 65, in read
reads = self._alignments_to_readdict(alignments, variants, sample, reference)
File "/MMCI/TM/structvar/work/marschal/miniconda3/lib/python3.6/site-packages/whatshap/variants.py", line 119, in _alignments_to_readdict
reference = reference[:]
File "/MMCI/TM/structvar/work/marschal/miniconda3/lib/python3.6/site-packages/pyfaidx/__init__.py", line 698, in __getitem__
return self._fa.get_seq(self.name, start + 1, stop)[::step]
File "/MMCI/TM/structvar/work/marschal/miniconda3/lib/python3.6/site-packages/pyfaidx/__init__.py", line 887, in get_seq
seq = self.faidx.fetch(name, start, end)
File "/MMCI/TM/structvar/work/marschal/miniconda3/lib/python3.6/site-packages/pyfaidx/__init__.py", line 532, in fetch
seq = self.from_file(name, start, end)
File "/MMCI/TM/structvar/work/marschal/miniconda3/lib/python3.6/site-packages/pyfaidx/__init__.py", line 572, in from_file
self.file.seek(i.offset)
File "/MMCI/TM/structvar/work/marschal/miniconda3/lib/python3.6/site-packages/Bio/bgzf.py", line 615, in seek
self._load_block(start_offset)
File "/MMCI/TM/structvar/work/marschal/miniconda3/lib/python3.6/site-packages/Bio/bgzf.py", line 577, in _load_block
block_size, self._buffer = _load_bgzf_block(handle, self._text)
File "/MMCI/TM/structvar/work/marschal/miniconda3/lib/python3.6/site-packages/Bio/bgzf.py", line 415, in _load_bgzf_block
% (_bgzf_magic, magic, handle.tell()))
ValueError: A BGZF (e.g. a BAM file) block should start with b'\x1f\x8b\x08\x04', not b'd@\x7f\xde'; handle.tell() now says 25611