Comparing core genomes #136
Hi, if you mean comparing core vs. persistent, then yes: you can write out the gene families belonging to every partition with:
There you will find lists for exact core, soft core, persistent, shell, and anything else you may want. If you then sort the files and diff them, you can easily get the difference, e.g.:
Or do any other kind of comparison quite easily. Adelme
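For readers who would rather script the comparison than use sort and diff, here is a minimal Python sketch of the same idea. It assumes each partition list is a plain-text file with one gene family identifier per line. The function names are made up for illustration, and the identifiers are only comparable when both lists come from the same pangenome.

```python
def read_families(path):
    """Return the set of non-empty lines (family IDs) found in a list file."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def compare_family_lists(path_a, path_b):
    """Split two family lists into only-in-A, only-in-B and shared IDs.

    Each result is returned as a sorted list so the output is reproducible.
    """
    families_a = read_families(path_a)
    families_b = read_families(path_b)
    only_a = sorted(families_a - families_b)
    only_b = sorted(families_b - families_a)
    shared = sorted(families_a & families_b)
    return only_a, only_b, shared
```

Calling `compare_family_lists("persistent.txt", "exact_core.txt")` (placeholder file names, to be replaced with whatever lists were actually written) would return the families that are persistent but not exact core, the reverse, and the overlap.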
If you meant between pangenomes, you can follow what I've written in this issue: #68 (comment)
Hi, thank you for the quick response! I would like to compare the core genome of one group of organisms (of one species) with that of another group from the same species (but a different lineage), in order to check the difference between the two lineages. In essence, it is not really the "species core genome" that I would like to compare, but rather the genes that are (almost) always present in one lineage, and how they differ from the genes that are "core" to another lineage or to the species. It would be like taking the core genome of the lineage and subtracting the core genome of the species in order to get the lineage-specific core. Many thanks! Best
I see! You can do it as follows: get all persistent sequences for both pangenomes:
Those commands will write a file 'persistent_protein_families.faa' in the output directory provided with -o.
You can provide --identity and --coverage thresholds to 'ppanggolin align' for the comparison; how to set them depends on how distant your species are, I guess. Then you should get results on which persistent families from one species are also persistent in the other species by reading the 'proteins_partition_projection.tsv' file. It should get you close to what you wish to achieve, if I understood correctly!
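As a rough illustration of that last step, here is a Python sketch that summarises a projection file. The layout is an assumption: a tab-separated file whose first column is the query protein ID and whose second column is the partition it was projected onto. The label "persistent" and the function names are also assumptions, so check them against your actual 'proteins_partition_projection.tsv' before relying on this.

```python
import csv
from collections import Counter


def read_projection(path):
    """Map each query protein ID to its projected partition.

    Assumes tab-separated rows of (protein ID, partition); rows with fewer
    than two columns are skipped. No header handling is attempted.
    """
    projection = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) >= 2:
                projection[row[0]] = row[1]
    return projection


def partition_counts(projection):
    """Count how many query proteins landed in each partition."""
    return Counter(projection.values())


def not_in_partition(projection, label="persistent"):
    """Return the sorted query IDs whose projected partition is not `label`."""
    return sorted(pid for pid, part in projection.items() if part != label)
```

Under those assumptions, the IDs returned by `not_in_partition(read_projection("proteins_partition_projection.tsv"))` would be the persistent families of one lineage that are not persistent in the other, i.e. candidates for the lineage-specific core.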
I think so, many thanks!
Is it correct that the code above should start with the .faa file after -S? Thanks again!
I did not understand what you meant by "-S". Your assumption seems correct, though: I wrote it too quickly and should have inverted the two files, sorry!
Is there a possibility to compare core genomes (or persistent genes) and extract the "difference"?