-
Notifications
You must be signed in to change notification settings - Fork 499
Wrap Liana toolkit for Single-cell and Spatial Transcriptomics #7625
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: Ona <[email protected]>
Co-authored-by: Ona <[email protected]>
Co-authored-by: Ona <[email protected]>
Co-authored-by: Ona <[email protected]>
Co-authored-by: Ona <[email protected]>
Co-authored-by: Ona <[email protected]>
SaimMomin12
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some preliminary comments inline. @nilchia @pavanvidem would you like to review this PR more in detail?
Co-authored-by: Saim Momin <[email protected]>
Co-authored-by: Saim Momin <[email protected]>
Co-authored-by: Saim Momin <[email protected]>
Co-authored-by: Saim Momin <[email protected]>
Co-authored-by: Saim Momin <[email protected]>
|
I am changing the parameters label and help section. The updates will be shared ASAP |
tools/liana/misty.xml
Outdated
| <section name="advanced_common"> | ||
| <param name="show_log" value="true"/> | ||
| </section> | ||
| <output name="misty_out" location="https://zenodo.org/records/18388645/files/genericMistyData_rf.h5mu" ftype="h5mu" compare="sim_size" delta="100000"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pleas add a has_size assertion instead of sim_size. This has the huge advantage that we do not need to save the test data anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also note that there are has_h5_... assertions that could be useful here, e.g. https://docs.galaxyproject.org/en/master/dev/schema.html#has-h5-attribute
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tools hate the has_size when they were tested on GitHub,
<has_size size="1500000" delta="100000"/>
<has_h5_keys keys="uns/liana_res"/>
For some of the tests, I am using both of the assertions, butthey still fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are equivalent to using compare="sim_size" where the size is determined for the given file.
Co-authored-by: Ona <[email protected]>
Co-authored-by: Ona <[email protected]>
Co-authored-by: Ona <[email protected]>
tools/liana/multi.xml
Outdated
| <section name="advanced_common"> | ||
| <param name="show_log" value="true"/> | ||
| </section> | ||
| <output name="mdata_out" location="https://zenodo.org/records/18388645/files/multi_lrs_to_views.h5mu" ftype="h5mu"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| <output name="mdata_out" location="https://zenodo.org/records/18388645/files/multi_lrs_to_views.h5mu" ftype="h5mu"> | |
| <output name="mdata_out" ftype="h5mu"> |
Otherwise the file at zenodo and the file genereated in the test will be compared bitwise using diff which will fail:
Output misty_out: different than expected, difference (using diff):
( /tmp/tmp3dobkgt8genericMistyData_rf.h5mu v. /tmp/tmp4ebi8nslgenericMistyData_rf.h5mu )
Binary data detected, not displaying diff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I have updated the wrapper with this. The tools fail due to a Zenodo issue, not for the test itself. I might move the input files to the GitHub repo and keep the output in Zenodo.
But please let me know if I need to update the wrapper to comply with Galaxy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have seen Parameter 'adata': the previously selected dataset has entered an unusable state in other tools. So this is not a problem of your PR.
|
Hi @pavanvidem Please check if the wrapper is ready, including the label and the help sections for the parameters. |
@khaled196 where did you find the labels and help text? It seems they are far off from the documentation here: https://liana-py.readthedocs.io/en/latest/api.html |
| </xml> | ||
| <xml name="bio_tools"> | ||
| <xrefs> | ||
| <xref type="bio.tools">liana</xref> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please improve the bio.tools entry with links, description, edam topics and so on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would adding xrf do that work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please go to https://bio.tools/liana, request edit access and update it.
Once the bio.tools entry is complete, you don't have to change anything in the Galaxy tool.
Please let me know if you require any assistance.
tools/liana/macros.xml
Outdated
| </xml> | ||
| <xml name="param_kernel"> | ||
| <param argument="kernel" type="select" label="Spatial weight kernel" help="Kernel for computing spatial weights from distance. misty_rbf: Gaussian-like (default); gaussian: standard Gaussian; exponential: exponential decay; linear: inverse distance weighting."> | ||
| <option value="misty_rbf" selected="true">misty_rbf - MISTy RBF (default)</option> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the default is not always misty_rbf. For example, in util.spatial_neighbors, gaussian is default. Please check again and use the correct default based on the function it is used in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can pass any attribute (for eg, value or help or selected of any param) by using a placeholder and pass the value while expanding macro.
Check out an example here:
- A macro with placeholder: https://github.com/galaxyproject/tools-iuc/blob/main/tools/scanpy/macros.xml#L723-L725
- expand the macro by passing a value: https://github.com/galaxyproject/tools-iuc/blob/main/tools/scanpy/normalize.xml#L133
tools/liana/utils.xml
Outdated
| </param> | ||
| <param argument="max_neighbours" type="integer" value="100" min="1" label="Maximum neighbors" help="Maximum nearest neighbors to include per spot/cell. Default: 100. Limits computation."> | ||
| </param> | ||
| <expand macro="param_kernel"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the default is gaussian
tools/liana/macros.xml
Outdated
| </xml> | ||
| <!-- Simple resource name parameter --> | ||
| <xml name="param_resource_name"> | ||
| <param argument="resource_name" type="select" label="Ligand-receptor database" help="Built-in database of ligand-receptor pair annotations. consensus: combines multiple resources (recommended); mouseconsensus: for mouse data. Affects which interactions are considered."> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the DB fetched every time one of the ligand-receptor methods is called?
If the DBs are updated, it might break the reproducibility. In that case, we need a datamanager that caches all the resources on Galaxy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the DB was obtained from Liana. I can see your concern if anyupdate happend. Any resources to create DM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check this out: https://galaxyproject.org/admin/tools/data-managers/
There are links at the bottom of the page.
tools/liana/macros.xml
Outdated
| <option value="interactions">Enter specific interaction pairs</option> | ||
| </param> | ||
| <when value="resource_name"> | ||
| <param argument="resource_name" type="select" label="Ligand-receptor database" help="Built-in L-R pair database. consensus recommended for general use; choose species-specific versions for cross-species studies."> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as before. If it breaks reproducibilty, we might need a datamanager.
| @@ -0,0 +1,254 @@ | |||
| <tool id="liana_resource" name="Liana Resource" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all the LR-methods in single-cell.xml, there is a select parameter that allows the user to select the resource.
So, why and where should the user run this tool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://liana-py.readthedocs.io/en/latest/notebooks/mofatalk.html
the tools here added because of this tutorial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just saw that it is also used in some other tutorials. Some functions need db_path param, which is an SQL DB, if not provided, it downloads from somewhere. It again breaks reproducibility.
As we discussed earlier, let's focus on the functions that are required for the first and last tutorials.
@nilchia any other tutorials of interest?
If some function downloads the data on the fly and is not used in the tutorials of interest, then we can safely leave them out now.
tools/liana/macros.xml
Outdated
| <param argument="seed" type="integer" value="1337" label="Random seed" help="Seed for reproducibility in stochastic methods. Default: 1337."/> | ||
| </xml> | ||
| <xml name="param_verbose"> | ||
| <param argument="verbose" type="boolean" truevalue="True" falsevalue="False" checked="false" label="Verbose progress output" help="If enabled: print progress messages during computation."/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If percentage-wise progress printed to stdout, it might generate a huge stdout. Please remove this param entirely and use the default of False in the function call.
| </param> | ||
| </xml> | ||
| <xml name="param_de_method"> | ||
| <param argument="de_method" type="select" label="Differential expression test" help="Statistical method for rank_genes_groups (1-vs-Rest). t-test: parametric; wilcoxon: non-parametric, robust; logreg: logistic regression. Default: t-test."> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good find. Is a list of possible DE methods documented somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in the scanpy documentation, I am not sure about the rest; it was very annoying to get those parameters
|
Halfway through. I will review the rest soon. |
|
@khaled196, there are some unused macros or macros used only once. It will be hard to maintain if there are nested macros that are used only once in the entire suite. For example, the macros used in spatial.xml can be removed because most of the parameters are only used once in the tool suite. Please create (nested) macros only when required. |
The help section and labels are my best attempts to clarify the parameters, as they can sometimes be unclear. Using the parameters provided by the API can be misleading for users. I went through them again to read and adjust them |
FOR CONTRIBUTOR: