ggydush (Contributor) commented Aug 21, 2025

Currently, remote files (fsspec objects) can use a pre-computed .fai file, but there is no support for a pre-computed .gzi file. This means the entire file must be read to recompute the .fai file as well as the .gzi file (it looks like this index recomputation was added recently in #164).
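To illustrate why a pre-computed .gzi saves a full read, here is a minimal sketch of parsing one and locating a block. It assumes the little-endian layout written by htslib's `bgzip -i` (an entry count followed by pairs of compressed and uncompressed offsets). The helper names `read_gzi` and `locate` and the demo values are hypothetical, and this is not pyfaidx's implementation.

```python
import bisect
import io
import struct


def read_gzi(stream):
    """Parse a .gzi index into (compressed_offset, uncompressed_offset) pairs."""
    (count,) = struct.unpack("<Q", stream.read(8))
    payload = stream.read(16 * count)
    if len(payload) != 16 * count:
        raise ValueError("truncated .gzi index")
    values = struct.unpack(f"<{2 * count}Q", payload)
    # The first block at (0, 0) is implicit and not stored in the file.
    return [(0, 0)] + list(zip(values[0::2], values[1::2]))


def locate(entries, uncompressed_pos):
    """Return (block start in the compressed file, offset inside that block)."""
    starts = [uncompressed for _, uncompressed in entries]
    i = bisect.bisect_right(starts, uncompressed_pos) - 1
    compressed, uncompressed = entries[i]
    return compressed, uncompressed_pos - uncompressed


# Hypothetical two-entry index built in memory for illustration.
demo = struct.pack("<Q", 2) + struct.pack("<4Q", 1000, 65280, 2100, 130560)
entries = read_gzi(io.BytesIO(demo))
print(locate(entries, 70000))  # (1000, 4720)
```

With the index in hand, a reader only needs to seek to the returned block start and decompress that one block, rather than scanning the whole compressed file to find block boundaries.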

I implemented .gzi support in the same way as the existing .fai support. All of the tests pass, and I exercised every local/remote combination of the FASTA, .fai, and .gzi files with the following code:

import s3fs
from itertools import product
from fsspec.core import OpenFile

import pyfaidx


fs = s3fs.S3FileSystem()
files_local_and_remote = [
    (
        "/local/ercc.fa.gz",
        "s3://.../ercc.fa.gz",
    ),
    (
        "/local/ercc.fa.gz.fai",
        "s3://.../ercc.fa.gz.fai",
    ),
    (
        "/local/ercc.fa.gz.gzi",
        "s3://.../ercc.fa.gz.gzi",
    ),
]
# Every local (0) / remote (1) choice for the FASTA, .fai, and .gzi files.
combos = list(product([0, 1], repeat=3))
for combo in combos:
    files = []
    for file_index, file_options in zip(combo, files_local_and_remote):
        fname = file_options[file_index]
        if fname.startswith("s3://"):
            fname = OpenFile(fs, fname, mode="rb")  # type: ignore
        files.append(fname)
    fa = pyfaidx.Fasta(files[0], indexname=files[1], gzi_indexname=files[2])
    print("is_remote: ", combo, files, len(fa.records))
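For context on what the .fai file contributes, the sketch below maps a base position to an uncompressed byte offset using the standard five-column faidx layout (name, length, offset, bases per line, bytes per line). The names `FaiRecord`, `parse_fai_line`, and `base_offset` and the sample record are hypothetical, not pyfaidx's API.

```python
from typing import NamedTuple


class FaiRecord(NamedTuple):
    name: str
    length: int      # number of bases in the sequence
    offset: int      # uncompressed byte offset of the first base
    line_bases: int  # bases per line
    line_width: int  # bytes per line, including the newline


def parse_fai_line(line: str) -> FaiRecord:
    """Parse one tab-separated .fai line (first five columns)."""
    name, length, offset, line_bases, line_width = line.rstrip("\n").split("\t")[:5]
    return FaiRecord(name, int(length), int(offset), int(line_bases), int(line_width))


def base_offset(rec: FaiRecord, pos: int) -> int:
    """Uncompressed byte offset of the 0-based base position `pos`."""
    if not 0 <= pos < rec.length:
        raise IndexError(pos)
    full_lines, within_line = divmod(pos, rec.line_bases)
    return rec.offset + full_lines * rec.line_width + within_line


rec = parse_fai_line("seq1\t150\t6\t60\t61\n")  # hypothetical record
print(base_offset(rec, 125))  # 133
```

For a bgzip-compressed FASTA, an uncompressed offset like this is what then has to be translated into a compressed block position, which is the part the .gzi file makes cheap.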

codecov bot commented Aug 21, 2025

Codecov Report

❌ Patch coverage is 68.75000% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.63%. Comparing base (de5cdf8) to head (f8f960c).
⚠️ Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pyfaidx/__init__.py | 68.75% | 10 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #232      +/-   ##
==========================================
- Coverage   78.96%   78.63%   -0.34%     
==========================================
  Files           2        2              
  Lines        1160     1184      +24     
==========================================
+ Hits          916      931      +15     
- Misses        244      253       +9     

mdshw5 (Owner) commented Aug 21, 2025

Hey @ggydush, this looks great, and is actually something I meant to do in v0.9.0! Thanks for providing the test code for this. I'm going to release a quick bug-fix version with your changes.

mdshw5 merged commit 438255b into mdshw5:master on Aug 21, 2025
7 of 9 checks passed