-
Notifications
You must be signed in to change notification settings - Fork 10
added support for tabular as well as new rdkit functions #162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added support for tabular as well as new rdkit functions #162
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces functionality for converting from InChI and SMILES to molecular formulas using RDKit and adds support for handling tabular files (TSV) in addition to CSV and XLSX.
- Added new conversion methods (smiles_to_formula and inchi_to_formula) in the RDKit converter.
- Updated tests to include the new conversion methods.
- Modified file format handling in DataFrame, Spectra, and app modules, and updated dependencies/version information.
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
tests/test_rdkit.py | Added tests for the new smiles_to_formula and inchi_to_formula methods. |
pyproject.toml | Updated version and dependencies, and switched to Ruff linting. |
MSMetaEnhancer/libs/utils/Errors.py | Renamed UnknownSpectraFormat to UnknownFileFormat. |
MSMetaEnhancer/libs/data/Spectra.py | Updated exception usage to UnknownFileFormat. |
MSMetaEnhancer/libs/data/DataFrame.py | Added support for 'tabular' file format alongside tsv. |
MSMetaEnhancer/libs/converters/compute/RDKit.py | Added new RDKit functions for formula conversion from SMILES/InChI. |
MSMetaEnhancer/app.py | Updated data loading file format list and exception usage. |
CHANGELOG.md | Documented new features and dependency updates. |
.pre-commit-config.yaml | Added new pre-commit hooks and configured Python 3.12 environment. |
Comments suppressed due to low confidence (3)
MSMetaEnhancer/libs/converters/compute/RDKit.py:80
- Consider adding unit tests for invalid SMILES input to verify that the conversion correctly returns an empty formula when mol is None.
mol = MolFromSmiles(smiles)
MSMetaEnhancer/libs/converters/compute/RDKit.py:95
- Consider adding unit tests for invalid InChI input to ensure consistent behavior when the molecule conversion fails, returning an empty formula.
mol = MolFromInchi(inchi)
MSMetaEnhancer/libs/data/DataFrame.py:15
- Update the docstring to explicitly mention that 'tabular' is accepted as an alias for 'tsv' to ensure clarity for future maintainers.
Supported formats: csv, tsv/tabular, xlsx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a few comments :)
matchms = ">=0.28.2" | ||
matchms = ">=0.30.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it not work with older versions of matchms? I'd be cautious of the matchms 0.30 pin as this means forcing numpy 2 - or does it not work with 0.28.2 anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, as far as tests are concerned, they work with 0.30.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do they also work on 0.28.2?
Describe your changes
Added functionality for converting from inchi to formula and from smiles to formula. This is based on this issue #161. I also added support to treat tabular data as TSV files based on this issue #160
Checklist before requesting a review
Note:
I bumped the version to 0.5.0 instead of 0.4.2 because recent changes, like adding
inchi_to_formula
andsmiles_to_formula
, introduced new features rather than bug fixes.Per [Semantic Versioning](https://semver.org/):
Since this PR includes a new feature, increasing the minor version is appropriate.