Pangenome generation

Leonard Dubois edited this page May 18, 2020 · 1 revision

Generates the Bowtie2 indexes needed for mapping.

Example:
./panphlan/panphlan_new_pangenome_generation.py -c erectale --i_fna reference_genomes/ -o database/ --verbose

Input

  • Since PanPhlAn version 1.3, the pangenome generation uses a ChocoPhlAn export as input. The ChocoPhlAn export is a pangenome file (panphlan_[NCBI_TAX_ID]_pangenome.csv).
  • Reference genomes must be provided as .fna files in the folder given by the --i_fna argument.
  • -c CLADE_NAME specifies the clade or species database name; PanPhlAn will search for a file named panphlan_CLADE_NAME_pangenome.csv
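
The input requirements above can be checked before launching the script. The following is only a minimal sketch, not part of PanPhlAn: the function name check_inputs and the pangenome_dir parameter are hypothetical, and the location where PanPhlAn looks for the pangenome file is an assumption (this page does not state it).

```python
from __future__ import annotations

from pathlib import Path


def check_inputs(clade: str, fna_folder: str, pangenome_dir: str = ".") -> list[str]:
    """Return a list of problems found with the inputs (empty if none).

    Hypothetical helper, not part of PanPhlAn.
    """
    problems = []

    # Reference genomes are expected as .fna files in the --i_fna folder.
    fna_files = sorted(Path(fna_folder).glob("*.fna"))
    if not fna_files:
        problems.append(f"no .fna files found in {fna_folder}")

    # The file name pattern comes from the Input section; the directory
    # searched (pangenome_dir) is an assumption of this sketch.
    pangenome = Path(pangenome_dir) / f"panphlan_{clade}_pangenome.csv"
    if not pangenome.is_file():
        problems.append(f"missing pangenome file {pangenome}")

    return problems
```

For instance, check_inputs("erectale", "reference_genomes/") returns an empty list when both inputs are in place, and otherwise one message per missing input.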

Output

If no --output argument is provided, the default value database is used and a database/ folder is created. This folder contains:

  • a CLADE_NAME_ref_genomes.fna file containing the concatenation of the .fna files from the input folder
  • 6 Bowtie2 index files named panphlan_CLADE_NAME.[1-4].bt2 and panphlan_CLADE_NAME.rev.[1-2].bt2
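
To confirm that a run produced everything listed above, the expected file names can be generated from the clade name and compared with the contents of the output folder. This is a rough sketch under the naming scheme described on this page; the helper names expected_outputs and missing_outputs are hypothetical and not part of PanPhlAn.

```python
from pathlib import Path


def expected_outputs(clade, output_folder="database"):
    """List the 7 files described in the Output section for a given clade."""
    out = Path(output_folder)
    # Concatenated reference genomes.
    files = [out / f"{clade}_ref_genomes.fna"]
    # Forward Bowtie2 index files: .1.bt2 to .4.bt2
    files += [out / f"panphlan_{clade}.{i}.bt2" for i in range(1, 5)]
    # Reverse Bowtie2 index files: .rev.1.bt2 and .rev.2.bt2
    files += [out / f"panphlan_{clade}.rev.{i}.bt2" for i in range(1, 3)]
    return files


def missing_outputs(clade, output_folder="database"):
    """Return the expected files that are not present on disk."""
    return [p for p in expected_outputs(clade, output_folder) if not p.is_file()]
```

Calling missing_outputs("erectale") after the example command should return an empty list if generation completed.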

Help -h

./panphlan/panphlan_pangenome_generation.py -h
  -h, --help            show this help message and exit
  -i INPUT_FNA_FOLDER, --i_fna INPUT_FNA_FOLDER
                        Folder containing the .fna genome sequence files
  -c CLADE_NAME, --clade CLADE_NAME
                        Name of the species pangenome database, for example:
                        -c ecoli17
  -o OUTPUT_FOLDER, --output OUTPUT_FOLDER
                        Result folder for all database files
  --verbose             Show progress information

For details on the old pangenome generation (PanPhlAn <= 1.2), see the older version.