The current quickstart guide for using GISAID data assumes that users will download a single metadata XLS file and a single sequences FASTA file. However, GISAID limits the number of records that users can download at once, so users need a way to run this type of workflow starting with one or more input files.
Instead of requiring users to merge their own XLS and FASTA files manually, the workflow could support multiple input files and handle that concatenation logic for users. The implementation could add a wildcard to the `prepare_data.smk` rules or glob the files present in the hardcoded input directory. We could modify the XLS to CSV script to accept multiple input files and similarly update the `prepare_sequences` rule to first concatenate all available sequences before renaming, sorting, etc.
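As a rough illustration of the glob option for the sequences, here is a minimal Python sketch. It is not the workflow's actual implementation: the helper names `find_inputs` and `concatenate_fasta` are hypothetical, as are the directory and file pattern in the usage note. It only covers the FASTA side. The XLS side would need a spreadsheet reader and is left out.

```python
from pathlib import Path


def find_inputs(input_dir, pattern):
    """Return all files in input_dir matching pattern, in a stable sorted order."""
    return sorted(Path(input_dir).glob(pattern))


def concatenate_fasta(paths, output_path):
    """Concatenate FASTA files into one file and return the number of records written."""
    records = 0
    with open(output_path, "w") as out:
        for path in paths:
            with open(path) as handle:
                last_line = ""
                for line in handle:
                    # Each FASTA record starts with a ">" header line.
                    if line.startswith(">"):
                        records += 1
                    out.write(line)
                    last_line = line
                # Guard against a file that lacks a trailing newline, which would
                # otherwise fuse its last sequence line with the next header.
                if last_line and not last_line.endswith("\n"):
                    out.write("\n")
    return records
```

For example, `concatenate_fasta(find_inputs("data", "*.fasta"), "combined_sequences.fasta")` would merge every FASTA file in a hypothetical `data` directory. Sorting the globbed paths keeps the output order reproducible across runs, which matters if a later step depends on record order.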