Commit 35dcd6e

update paper
1 parent ded74cc commit 35dcd6e

1 file changed

+7
-7
lines changed

paper/paper.md

Lines changed: 7 additions & 7 deletions
@@ -27,9 +27,9 @@ bibliography: ./paper.bib
2727

2828
# Statement of need
2929

30-
Spectroscopy comprises a group of several analytical techniques used to understand the composition of materials using light. Traditionally, spectroscopic data is analyzed by a discipline called *chemometrics*, a branch of machine learning specialized on extracting chemical information from multivariate spectra. Over the last decades, *chemometricians*, have excelled by developing advanced preprocessing methods designed to attenuate instrument and measuring artifacts from the spectra, enhancing the pure chemical information of the samples [@RINNAN20091201], [@MISHRA2020116045].
30+
Spectroscopy comprises a group of analytical techniques used to understand the composition of materials using light. Traditionally, spectroscopic data is analyzed by a discipline called *chemometrics*, a branch of machine learning specialized in extracting chemical information from multivariate spectra. Over the last decades, *chemometricians* have excelled by developing advanced preprocessing methods designed to attenuate instrument and measurement artifacts from the spectra, and to enhance the pure chemical information of the samples [@RINNAN20091201], [@MISHRA2020116045].
3131

32-
Spectroscopic methods are very suited for a wide range of applications because they allow analyzing the chemical properties of various samples in a fast and simple manner. For this reason, their adoption as integral components of Process Analytical Technology (PAT) has witnessed significant growth across industries, including chemical, biotech, food, and pharmaceuticals. Despite this surge, a notable obstacle has been the absence of open-source standardized, accessible toolkit for *chemometric* model development and deployment. ```chemotools```, positioned as a comprehensive solution, addresses this void by integrating into the Python machine learning ecosystem. By implementing a variety of preprocessing and feature selection tools with the ```scikit-learn``` API [@pedregosa2018scikitlearn], ```chemotools``` opens up the entire ```scikit-learn``` toolbox to users, encompassing features such as:
32+
Spectroscopic methods are well suited for a wide range of applications because they allow analyzing the chemical properties of various samples in a fast and simple manner. For this reason, their adoption as integral components of Process Analytical Technology (PAT) has witnessed significant growth across industries, including chemical, biotech, food, and pharmaceuticals. Despite this surge, a notable obstacle has been the absence of an open-source, standardized, accessible toolkit for *chemometric* model development and deployment. ```chemotools```, positioned as a comprehensive solution, addresses this void by integrating *chemometric* methods into the Python machine learning ecosystem. By implementing a variety of preprocessing and feature selection tools with the ```scikit-learn``` API [@pedregosa2018scikitlearn], ```chemotools``` opens up the entire ```scikit-learn``` toolbox to users, encompassing features such as:
3333

3434
- a rich collection of estimators for regression, classification, and clustering
3535
- cross-validation and hyper-parameter optimization algorithms
@@ -40,7 +40,7 @@ This integration empowers users with a versatile array of tools for robust model
4040

4141
In addition to its foundational capabilities, ```chemotools``` not only enables users to preprocess data and train models using ```scikit-learn``` but also streamlines the transition of these models into a production setting. By providing users with a well-defined interface, ```chemotools``` facilitates the reception of input data and delivery of predictions from the trained model. This can then be containerized using Docker, providing an efficient means for the distribution and implementation of the model in any Docker-compatible environment, facilitating the deployment of models to cloud environments. This adaptive capability not only enables organizations to scale model usage but also allows them to monitor performance and promptly update or roll back the model as necessary.
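The step from a trained model to a deployable artifact can be illustrated with a minimal sketch. It uses only ```scikit-learn``` and the Python standard library; ```StandardScaler``` stands in for a spectral preprocessing step, the training data are synthetic, and the ```predict``` function is a hypothetical service entry point, not part of the ```chemotools``` API. In a real deployment the serialized pipeline would typically be written to a file and copied into a Docker image.

```python
import pickle

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a spectral dataset (assumption: 20 samples, 5 features).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 5))
y_train = X_train @ np.array([1.0, 0.5, 0.0, -0.5, 2.0])

# Preprocessing and model are fitted together as a single pipeline object.
model = make_pipeline(StandardScaler(), LinearRegression()).fit(X_train, y_train)

# Serialize the fitted pipeline, as one would before shipping it in a container.
payload = pickle.dumps(model)


def predict(raw_spectra, serialized_model=payload):
    """Hypothetical entry point: raw spectra in, list of predictions out."""
    restored = pickle.loads(serialized_model)
    return restored.predict(np.atleast_2d(raw_spectra)).tolist()


predictions = predict(X_train[:3])
```

Because preprocessing and model travel as one object, the production side only needs to pass raw spectra to a single function. Pickled pipelines should only be loaded from trusted sources and with matching library versions.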
4242

43-
```chemotools``` also introduces a practical innovation by providing a streamlined framework for data augmentation of spectroscopic datasets through the ```scikit-learn``` API. This feature offers users a straightforward and consistent method to enhance spectral datasets, by introducing stochastic artifacts that represent real-world variations. By integrating data augmentation into the *chemometric* workflow, ```chemotools``` provides users with an efficient tool for refining and their datasets generalizing the models and optimizing their performance.
43+
```chemotools``` also introduces a practical innovation by providing a streamlined framework for data augmentation of spectroscopic datasets through the ```scikit-learn``` API. This feature offers users a straightforward and consistent method to enhance spectral datasets by introducing stochastic artifacts that represent real-world variations. By integrating data augmentation into the *chemometric* workflow, ```chemotools``` provides users with an efficient tool for improving their datasets to generalize the models and optimize their performance.
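As a rough illustration of what a stochastic artifact behind the ```scikit-learn``` API can look like, the sketch below defines a hypothetical ```RandomBaselineShift``` transformer that adds one random constant offset to each spectrum. The class name, its parameters, and the toy data are assumptions made for this example; it is not the ```chemotools``` implementation.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class RandomBaselineShift(TransformerMixin, BaseEstimator):
    """Hypothetical augmenter: add one random constant offset to each spectrum."""

    def __init__(self, scale=0.1, random_state=None):
        self.scale = scale
        self.random_state = random_state

    def fit(self, X, y=None):
        # Nothing is learned from the data; the generator is created here so
        # that a fixed random_state gives reproducible augmentations.
        self.rng_ = np.random.default_rng(self.random_state)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        # One offset per row (spectrum), broadcast across all wavelengths.
        shifts = self.rng_.uniform(-self.scale, self.scale, size=(X.shape[0], 1))
        return X + shifts


spectra = np.ones((3, 50))  # toy data: three flat spectra
augmented = RandomBaselineShift(scale=0.2, random_state=42).fit_transform(spectra)
```

Since the class follows the ```fit```/```transform``` convention, it can be chained with other steps in a ```scikit-learn``` pipeline.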
4444

4545

4646
![chemotools in the Python machine learning environment.\label{fig:1}](../assets/images/overview_2.png)
@@ -90,15 +90,15 @@ spectra_augmented = np.array([augmentation_pipeline.fit_transform(spectrum) for
9090

9191
In addition to the transformers, ```chemotools``` also implements selectors. Selectors are mathematical functions used to select the relevant features from the spectral dataset based on a given criterion. Selectors are used to select the features that contain the chemical information of the sample, making the models more robust and generalizable.
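The idea of a selector can be sketched with a small ```scikit-learn```-compatible class that keeps only the features inside a spectral range. ```RangeSelector```, its parameters, and the toy wavenumber axis are hypothetical names chosen for this illustration and should not be read as the ```chemotools``` API.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class RangeSelector(TransformerMixin, BaseEstimator):
    """Hypothetical selector: keep features whose axis value lies in [start, end]."""

    def __init__(self, axis_values, start, end):
        self.axis_values = axis_values
        self.start = start
        self.end = end

    def fit(self, X, y=None):
        axis = np.asarray(self.axis_values)
        # Boolean mask over the columns, computed once and reused in transform.
        self.mask_ = (axis >= self.start) & (axis <= self.end)
        return self

    def transform(self, X):
        return np.asarray(X)[:, self.mask_]


wavenumbers = np.linspace(400, 1800, 8)  # toy axis: 400, 600, ..., 1800
spectra = np.arange(16, dtype=float).reshape(2, 8)  # two toy spectra
selected = RangeSelector(wavenumbers, start=800, end=1400).fit_transform(spectra)
```

Here the criterion is a fixed spectral window; other criteria (for example, a list of indices) would follow the same ```fit```/```transform``` pattern.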
9292

93-
Beyond its mathematical prowess, ```chemotools``` goes a step further by providing real-world spectral datasets [@cabaneros1]. Accompanied by guides demonstrating the integration of scikit-learn and ```chemotools``` for training regression and classification models, these datasets immerse learners in practical applications. This hands-on approach bridges theoretical concepts and real-world implementation, nurturing a deeper understanding of potential challenges in real-world scenarios.
93+
Beyond its mathematical features, ```chemotools``` goes a step further by providing real-world spectral datasets [@cabaneros1]. Accompanied by guides demonstrating the integration of scikit-learn and ```chemotools``` for training regression and classification models, these datasets immerse learners in practical applications. This hands-on approach bridges theoretical concepts and real-world implementation, nurturing a deeper understanding of potential challenges in real-world scenarios.
9494

95-
For those seeking detailed insights, the documentation page (https://paucablop.github.io/chemotools/) meticulously outlines all available mathematical functions within chemotools. This comprehensive resource serves as a guide for users exploring the extensive capabilities of the library.
95+
The documentation page (https://paucablop.github.io/chemotools/) meticulously outlines all available mathematical functions within chemotools. This comprehensive resource serves as a guide for users exploring the extensive capabilities of the library.
9696

9797
# Adoption and applications
9898

99-
The ultimate objective of developing *chemometric* and machine learning models is either to gain insights about complex datasets and/or to train models that can be used in production applications (\autoref{fig:4}). From a research and development perspective, ```chemotools``` offers a wide range of transformers and selectors that, combined with the rest of the Python machine learning environment, enables researchers to investigate and understand their spectral datasets. From an industrial point of view, ```chemotools``` allows users to streamline the deployment of their trained models into production environments adhering to the frameworks developed by the machine learning community in Python (\autoref{fig:1}).
99+
The ultimate objective of developing *chemometric* and machine learning models is to gain insights about complex datasets, to train models that can be used in production applications, or both (\autoref{fig:4}). From a research and development perspective, ```chemotools``` offers a wide range of transformers and selectors that, combined with the rest of the Python machine learning environment, enable researchers to investigate and understand their spectral datasets. From an industrial point of view, ```chemotools``` allows users to streamline the deployment of their trained models into production environments, adhering to standard frameworks developed by the machine learning community in Python (\autoref{fig:1}).
100100

101-
Beyond its practical applications, ```chemotools``` has being utilized as an educational tool at universities for both Master's (MSc) and Doctoral (PhD) levels. Its incorporation into academic curricula provides a valuable way to enable the students to benefit from hands-on experience on real-world datasets gaining practical insights into the application of sophisticated techniques for preprocessing and analyzing spectral data. The tool's user-friendly interface, coupled with comprehensive documentation, has proven and enriching learning experience for students pursuing higher education in fields related to analytical chemistry and *chemometrics*.
101+
Beyond its practical applications, ```chemotools``` has been used as an educational tool at universities at both Master's (MSc) and Doctoral (PhD) levels. Integrating chemotools into academic curricula using Jupyter notebooks offers students a valuable opportunity to gain hands-on experience with real-world datasets, providing practical insights into the application of sophisticated techniques for preprocessing and analyzing spectral data. The tool's user-friendly interface, coupled with comprehensive documentation, has proven an enriching learning experience for students pursuing education in fields related to analytical chemistry, process analytical technology, data science or *chemometrics*.
102102

103103
![Applications of ```chemotools```.\label{fig:4}](../assets/images/applications.png)
104104
