Retrieve the number of orthologs per taxon included in the final matrix #98
thesamuanels
started this conversation in
Ideas
-
Hi @thesamuanels, thanks for reaching out. In the latest version (v1.2.12) we added a file, occupancy.tsv, that is produced by matrix_constructor.py. It is a presence/absence matrix with Unique IDs on the y-axis (rows) and genes on the x-axis (columns). Will this suffice? Best,
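For anyone who wants per-taxon gene counts from that file, here is a minimal sketch. It rests on assumptions about the layout that are not confirmed in this thread: that occupancy.tsv is tab-separated with one header row, that the first column holds the Unique ID, and that presence is written as 1 and absence as 0. The function name genes_per_taxon is made up for this example and is not part of PhyloFisher.

```python
import csv


def genes_per_taxon(path):
    """Count, for each Unique ID, how many genes are marked present.

    Assumed layout (not confirmed by the PhyloFisher docs): a tab-separated
    file with a header row, Unique IDs in the first column, one column per
    gene, and presence encoded as "1".
    """
    counts = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header row of gene names
        for row in reader:
            if not row:
                continue  # ignore blank lines
            unique_id, cells = row[0], row[1:]
            # Assumption: any cell equal to "1" means the gene is present.
            counts[unique_id] = sum(1 for cell in cells if cell.strip() == "1")
    return counts
```

Calling genes_per_taxon("occupancy.tsv") on the file in the matrix_constructor.py output directory should then give a dictionary mapping each Unique ID to the number of orthologs it contributes to the final matrix. If the file encodes presence differently, the comparison against "1" needs adjusting.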
-
Hi,
I was discussing this issue with colleagues in our lab who also use PhyloFisher. We have noticed that there is no easy way of telling how many orthologous genes per taxon are included in the final matrix. The completeness values reported in the "select_taxa.tsv" table seem to refer to the total number of genes in the database (240). However, our analyses may include only a subset of those genes (especially when some are excluded using the select_orthologs.py command), and some taxa may be more affected than others, so completeness does not scale down proportionally for each taxon. We were wondering whether this has been addressed before, and whether such information could be included somewhere in the output of the final matrix construction.
Thanks