Mosses is a library that provides a set of functionalities to assess molecular property prediction models, e.g., QSAR/QSPR models. The library currently includes:
- A model validation module (
predictive_validity.py) built on top of the concept of predictive validity described by Scannell et al. Nat Rev Drug Discov. 2022;21(12):915-931. doi:10.1038/s41573-022-00552-x. The functionpredictive_validity.evaluate_pv()allows the analysis of the quality of predictions on a given data set (e.g., a prospective test set of compounds), according to a desired threshold. The analysis can be used to determine whether the model used to generate the predictions is suitable for the data of interest (e.g., the validation can be done on a new series of compounds), and if so, to configure optimal thresholds for maximising enrichment of compounds with the desired property. - A heatmap module (
heatmap.py) which summarises the information from the validation using predictive validity. The heatmap shows in one table, for each series in the data and according to the selected experimental threshold (SET), what the PPV and FOR percentages are, the recommended thresholds and resulting optimised PPV and FOR percentages, as well as, a qualitative label indicating whether the model is Good, Medium, or Bad at predicting against the data in the series.
The library is written in Python and requires a version >= 3.10 for runtime. The dependencies required by the library are defined in pyproject.toml and are automatically installed when installing the library.
You can install the library using pip install mosses, or you can clone this repository then run make build && make install.
Jupyter notebooks with examples can be found in the folder examples. We recommend following those to adapt your data, configs, and code to work with the modules in mosses.