Merge pangenome graphs #68
Hi, what are you trying to achieve through this comparison, exactly? Adelme
Hey, thanks for the (super) quick reply!
We do not have anything that directly compares two pangenomes (for now); however, you can get there with some file comparisons. First, get all family sequences for both pangenomes:
Those commands will write a file 'all_protein_families.faa' in the output directory provided with -o.
You can provide --identity (default is 0.5) and --coverage (default is 0.8) thresholds for the comparison. The first column indicates a family id from the faa file, and the second column indicates the partition of the most similar family in the pangenome it was compared to.
Alternatively, the 'input_to_pangenome_associations.blast-tab' file is an alignment file with blast-like results for the proteins vs. pangenome alignment, which will give you family ids from both pangenomes directly (there can be multiple hits).
By comparing those files, and the partitions of the original families, you should be able to get what you want, I believe? Adelme
Hey, thanks for the detailed explanation. So, if I understood correctly, this approach will give you information about the family ids from pangenome 1 that match families in pangenome 2, right? But the classification in the 2nd column only lets you know that a given id is considered 'persistent' in pangenome 2, and it may not be so in pangenome 1? Also, family ids not listed in column 1 of 'proteins_partition_projection.tsv' will represent family-specific ids from pangenome 1, i.e. those which have no match in pangenome 2?
Yes, absolutely, you are correct on all of your points. If you want, you can play with the filters available with
to write only the persistent gene families (in a file called 'persistent_protein_families.faa'). You can do this with all partitions; the filename will change accordingly. Adelme
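The file comparison discussed above can be sketched in a few lines of Python. This is only an illustration and not part of PPanGGOLiN: the function names are made up, and it assumes that 'proteins_partition_projection.tsv' is tab-separated with the family id in column 1 and the partition in column 2, and that the FASTA header ids in 'all_protein_families.faa' are those same family ids. A header row, if the file has one, would simply never match a FASTA id.

```python
# Hypothetical helpers, not part of PPanGGOLiN.
# Assumed layout: tab-separated projection file, column 1 = family id from
# the .faa file, column 2 = partition of the most similar family in the
# pangenome it was aligned against.

def fasta_ids(lines):
    """Collect sequence ids (first token after '>') from FASTA lines."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and len(line.strip()) > 1
    }

def projection(lines):
    """Map family id (column 1) to projected partition (column 2)."""
    result = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[0]:
            result[fields[0]] = fields[1]
    return result

def compare(ids, projected):
    """Split family ids into those with a match and those without one."""
    matched = {fid: projected[fid] for fid in ids if fid in projected}
    unmatched = sorted(ids - projected.keys())
    return matched, unmatched
```

Since open file handles iterate over lines, the helpers can be fed files directly, for example `compare(fasta_ids(open("all_protein_families.faa")), projection(open("proteins_partition_projection.tsv")))`. The `unmatched` list then holds the pangenome 1 families with no match in pangenome 2, and `matched` gives, for each remaining family, its partition in pangenome 2, which can be checked against its partition in pangenome 1.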
Hi there,
Is it possible to merge pangenome graphs from independent runs? I know Panaroo has that option, and I would like to know whether it is possible to do so with PPanGGOLiN.
If not, could you please suggest alternatives for comparing the pangenomes of independent runs?
Thanks!