6. Modification
Modifying your FlexTaxD database lets you incorporate improved taxonomic resolution or an alternative taxonomy such as GTDB. This guide describes how to integrate such changes.
Merging databases is useful when you want to combine taxonomic frameworks from two different sources, such as NCBI and GTDB.
- Copy Original Database: Create a copy of the original NCBI database, which will serve as the base for modifications.
  cp ncbi.fdb ncbi_gtdbBacAr.fdb
- Clean Original Database: Remove non-essential nodes to prevent conflicts with GTDB taxonomy.
  flextaxd --db ncbi_gtdbBacAr.fdb --clean
- Merge GTDB Bacteria: Integrate GTDB Bacteria taxonomy, replacing corresponding nodes in the NCBI taxonomy.
  flextaxd --db ncbi_gtdbBacAr.fdb --mod_database gtdb_bac120.fdb --parent Bacteria --replace
- Merge GTDB Archaea: Integrate GTDB Archaea taxonomy in a similar manner.
  flextaxd --db ncbi_gtdbBacAr.fdb --mod_database gtdb_ar53.fdb --parent Archaea --replace
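The four merge steps can be chained in a small script. The following is a minimal sketch and not part of FlexTaxD: the `run` helper, the `merge_gtdb` function, and the `DRY_RUN` variable are hypothetical names introduced here so the commands can be reviewed before they touch any database. The file names are the ones used in the steps above.

```shell
#!/bin/sh
set -e  # stop at the first failing step when the commands are really executed

# Hypothetical helper: print the command when DRY_RUN=1 (the default in this
# sketch), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# Arguments: source db, working copy, GTDB Bacteria db, GTDB Archaea db.
merge_gtdb() {
    run cp "$1" "$2"
    run flextaxd --db "$2" --clean
    run flextaxd --db "$2" --mod_database "$3" --parent Bacteria --replace
    run flextaxd --db "$2" --mod_database "$4" --parent Archaea --replace
}

merge_gtdb ncbi.fdb ncbi_gtdbBacAr.fdb gtdb_bac120.fdb gtdb_ar53.fdb
```

Run as written, the script only prints the four commands. Setting `DRY_RUN=0` in the environment executes them in order and stops at the first failure.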
Replacing a single branch is handy when expanding or updating a specific part of the taxonomy based on new data or refined classifications.
- Copy Database for Modification: As with the merge, start by copying the original database.
  cp francisellaceae.fdb francisellaceae_tularensis.fdb
- Import New Taxonomy: Replace the current taxonomy for "Francisella tularensis" with new data.
  - Using GTDB Example:
    flextaxd --db francisellaceae_tularensis.fdb --mod_file ftd.tree2tax.tul.tsv --genomeid2taxid genomes_map.tul.tsv --parent "Francisella tularensis" --replace
  - Using FTD/CanSNP Example:
    flextaxd --db francisellaceae_tularensis.fdb --mod_file ftd.tree2tax.tul.tsv --genomeid2taxid genomes_map.tul.tsv --parent "Francisellaceae_Francisella_tularensis_GCF_000008985.1" --replace
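Both examples follow the same copy-then-replace pattern and differ only in the `--parent` value. As a rough sketch, the two steps could be wrapped in one shell function. The name `replace_branch` is hypothetical, and the guard that refuses to overwrite an existing working copy is an addition made here, not something FlexTaxD does itself.

```shell
#!/bin/sh
# Hypothetical wrapper around the two steps above: copy, then replace one branch.
# Arguments: source db, working copy, mod file, genome map, parent node name.
replace_branch() {
    # Guard added in this sketch: never clobber an existing working copy.
    if [ -e "$2" ]; then
        echo "refusing to overwrite existing $2" >&2
        return 1
    fi
    cp "$1" "$2" &&
    flextaxd --db "$2" --mod_file "$3" --genomeid2taxid "$4" --parent "$5" --replace
}

# Example call with the file names from the steps above (uncomment to run):
# replace_branch francisellaceae.fdb francisellaceae_tularensis.fdb \
#     ftd.tree2tax.tul.tsv genomes_map.tul.tsv "Francisella tularensis"
```

Quoting the parent name matters because node names such as "Francisella tularensis" contain a space.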
- To confirm the changes, visualize the database (the .fdb file) before and after modification.
  - Before Modification:
    flextaxd --db francisellaceae.fdb --vis_type tree --vis_node Francisellaceae --vis_depth 0 --vis_label_size 8
  - After Modification:
    flextaxd --db francisellaceae_tularensis.fdb --vis_type tree --vis_node Francisellaceae --vis_depth 0 --vis_label_size 8
Comparing the two trees lets you confirm graphically that the modified database now includes the expanded taxonomy of "Francisella tularensis".
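The before/after comparison uses the same visualization options for both databases, so it can be written as a loop. This is a sketch only: `vis_tree` is a hypothetical helper name, and the check that skips a missing database file is an assumption added here for convenience.

```shell
#!/bin/sh
# Hypothetical helper: draw the Francisellaceae tree for one database file,
# using the same options as the commands above.
vis_tree() {
    if [ ! -f "$1" ]; then
        echo "skip: $1 not found" >&2
        return 1
    fi
    flextaxd --db "$1" --vis_type tree --vis_node Francisellaceae --vis_depth 0 --vis_label_size 8
}

# Original database first, then the modified copy.
for db in francisellaceae.fdb francisellaceae_tularensis.fdb; do
    vis_tree "$db" || true
done
```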
Before running these steps, verify that the paths and filenames match your actual files and directory structure, and adjust the commands if your setup differs. This ensures that the modifications are applied to the intended database.
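One way to check paths up front is a short pre-flight script. The sketch below assumes the input file names used on this page and introduces a hypothetical `check_inputs` function; substitute your own paths.

```shell
#!/bin/sh
# Hypothetical pre-flight check: list every expected input file that is missing.
# Returns 0 when all files exist, 1 otherwise.
check_inputs() {
    rc=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            rc=1
        fi
    done
    return "$rc"
}

# File names taken from the commands on this page; replace with your own paths.
check_inputs ncbi.fdb gtdb_bac120.fdb gtdb_ar53.fdb \
    francisellaceae.fdb ftd.tree2tax.tul.tsv genomes_map.tul.tsv ||
    echo "fix the paths listed above before running flextaxd"
```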