Commit 5f7b455

Merge pull request #602 from Steinbeck-Lab/development
Docs: Updates
2 parents ee24a4f + ff986f6 commit 5f7b455

14 files changed, +331 -111 lines changed


.github/workflows/deploy-doc.yml

Lines changed: 32 additions & 21 deletions
@@ -1,57 +1,68 @@
-# GitHub Actions workflow to deploy documentation to GitHub Pages
-
 name: Docs Deployment - GitHub Pages
 
-# Trigger on manual dispatch or push to main branch
+# Trigger on manual dispatch or push to main branch with docs changes
 on:
   workflow_dispatch: {}
   push:
     branches:
       - main
+    paths:
+      - 'docs/**'
+      - 'package*.json'
+      - '.github/workflows/deploy-doc.yml'
 
 jobs:
   deploy:
-    # Job to build and deploy docs
     runs-on: ubuntu-latest
     permissions:
       pages: write
       id-token: write
+      contents: read
     environment:
       name: github-pages
       url: ${{ steps.deployment.outputs.page_url }}
     steps:
-      # Checkout repository
-      - uses: actions/checkout@v4
+      # Checkout repository with full history
+      - name: Checkout code
+        uses: actions/checkout@v4
         with:
           fetch-depth: 0
 
-      # Setup Node.js environment
-      - uses: actions/setup-node@v4
+      # Setup Node.js with caching
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
         with:
-          node-version: 18
+          node-version: 20
           cache: npm
 
-      # Install dependencies
-      - run: npm ci
+      # Install dependencies with clean install
+      - name: Install dependencies
+        run: npm ci
 
       # Build documentation
-      - name: Build
+      - name: Build documentation
         run: npm run docs:build
 
-      # Debug build output
-      - name: Debug - Check build output
-        run: ls -l docs/.vitepress/dist
+      # Verify build output exists
+      - name: Verify build output
+        run: |
+          if [ ! -d "docs/.vitepress/dist" ]; then
+            echo "Build output directory not found!"
+            exit 1
+          fi
+          ls -la docs/.vitepress/dist
 
       # Configure GitHub Pages
-      - uses: actions/configure-pages@v3
+      - name: Setup Pages
+        uses: actions/configure-pages@v4
 
-      # Upload static site artifact
-      - uses: actions/upload-pages-artifact@v3
+      # Upload build artifacts
+      - name: Upload Pages artifact
+        uses: actions/upload-pages-artifact@v3
         with:
           path: docs/.vitepress/dist
 
       # Deploy to GitHub Pages
-      - name: Deploy
+      - name: Deploy to GitHub Pages
         id: deployment
-        uses: actions/deploy-pages@v4
-
+        uses: actions/deploy-pages@v4
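
Note on the new `paths` filter: the docs deployment now runs on a push to `main` only if at least one changed file matches one of the three listed patterns. The sketch below illustrates that decision in Python. It is a simplified model, not GitHub's implementation: it assumes `**` matches across directory separators and a single `*` does not, and the helper names `glob_to_regex` and `workflow_triggers` are hypothetical.

```python
import re

# Patterns copied from the `paths` filter added in deploy-doc.yml.
DOC_PATHS = ["docs/**", "package*.json", ".github/workflows/deploy-doc.yml"]


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a simplified path glob into an anchored regular expression.

    Assumption: `**` matches any characters including `/`,
    while a single `*` matches any characters except `/`.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def workflow_triggers(changed_files, patterns=DOC_PATHS) -> bool:
    """Return True if any changed file matches any pattern."""
    regexes = [glob_to_regex(p) for p in patterns]
    return any(rx.match(path) for path in changed_files for rx in regexes)


if __name__ == "__main__":
    print(workflow_triggers(["docs/architecture.md"]))
    print(workflow_triggers(["app/main.py"]))
```

Under these assumptions, a push that touches only application code would no longer redeploy the documentation, while a change to the workflow file itself still would.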

.github/workflows/dev-build.yml

Lines changed: 8 additions & 3 deletions
@@ -2,7 +2,7 @@
 # - Test linting with pylint.
 # - Fetch Latest release.
 # - Build the latest docker image in development which needs test to pass first.
-# - Push the docker image to Github Artifact Registry-Dev.
+# - Push the docker image to Docker Hub.
 #
 # Maintainers:
 # - name: Nisha Sharma
@@ -28,15 +28,20 @@ jobs:
       - name: Check out the repo
         uses: actions/checkout@v4
 
+      # Login to Docker Hub
       - name: Log in to Docker Hub
         uses: docker/login-action@v3
         with:
           username: ${{ env.DOCKER_HUB_USERNAME }}
           password: ${{ env.DOCKER_HUB_PASSWORD }}
 
+      # Set up Docker Buildx
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
       # Build and push main API Docker image
       - name: Build and push Docker image
-        uses: docker/build-push-action@v5.3.0
+        uses: docker/build-push-action@v5
         with:
           context: .
           file: ./Dockerfile
@@ -56,7 +61,7 @@ jobs:
 
       # Build frontend Docker image only if frontend files changed
       - name: Build and push frontend Docker image
-        uses: docker/build-push-action@v5.3.0
+        uses: docker/build-push-action@v5
         if: steps.frontend-changes.outputs.frontend == 'true'
         with:
           context: ./frontend

.github/workflows/prod-build.yml

Lines changed: 4 additions & 5 deletions
@@ -1,7 +1,7 @@
 # This workflow will perform the following actions when a release is published:
 # - Fetch Latest release.
 # - Build the latest docker image in production.
-# - Push the docker image to Github Artifact Registry-Prod.
+# - Push the docker image to Docker Hub.
 #
 # Maintainers:
 # - name: Nisha Sharma
@@ -52,7 +52,7 @@ jobs:
 
       # Build and push backend image
       - name: Build and push full Docker image
-        uses: docker/build-push-action@v4
+        uses: docker/build-push-action@v5
         with:
           context: .
           file: ./Dockerfile
@@ -65,7 +65,7 @@ jobs:
 
       # Build and push lite version of backend image
       - name: Build and push lite Docker image
-        uses: docker/build-push-action@v4
+        uses: docker/build-push-action@v5
         with:
           context: .
           file: ./Dockerfile.lite
@@ -87,8 +87,7 @@ jobs:
 
       # Build frontend Docker image only if frontend files changed
       - name: Build and push frontend Docker image
-        uses: docker/[email protected]
-        #if: steps.frontend-changes.outputs.frontend == 'true'
+        uses: docker/build-push-action@v5
         with:
           context: ./frontend
           file: ./frontend/Dockerfile

.github/workflows/test.yml

Lines changed: 30 additions & 11 deletions
@@ -1,14 +1,18 @@
 name: test
 
 on:
-  push:
-    branches:
-      - main
-      - development
   pull_request:
-    branches:
-      - main
-      - development
+    branches: [main, development]
+    paths-ignore:
+      - 'CHANGELOG.md'
+      - 'package.json'
+      - 'package-lock.json'
+  push:
+    branches: [main, development]
+    paths-ignore:
+      - 'CHANGELOG.md'
+      - 'package.json'
+      - 'package-lock.json'
 
 jobs:
   test:
@@ -17,15 +21,24 @@ jobs:
       CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
     strategy:
       matrix:
-        python-version: ["3.10"]
+        python-version: ["3.10", "3.11"]
     steps:
-      - uses: actions/[email protected]
+      # Get the source code
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      # Setup Python environment with caching
       - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v3
+        uses: actions/setup-python@v5
         with:
           python-version: ${{ matrix.python-version }}
+          cache: 'pip'
+
+      # Debug environment variables
       - name: Print all environment variables
         run: printenv
+
+      # Install all required dependencies
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
@@ -39,12 +52,18 @@ jobs:
           wget -O surge "https://github.com/StructureGenerator/surge/releases/download/v1.0/surge-linux-v1.0"
           chmod +x surge
           sudo mv surge /usr/bin
+
+      # Run code quality checks
       - name: Analysing the code with pylint
         run: |
           flake8 --per-file-ignores="__init__.py:F401" --ignore E402,E501,W503 $(git ls-files '*.py') .
+
+      # Execute tests with coverage
       - name: Run tests and collect coverage
         run: |
           python3 -m pytest --cov=./ --cov-report=xml
+
+      # Upload coverage data to Codecov
       - name: Upload coverage reports to Codecov
         uses: codecov/[email protected]
         env:
@@ -54,4 +73,4 @@ jobs:
           fail_ci_if_error: true
           flags: service
           name: codecov-umbrella
-          verbose: true
+          verbose: true

docs/.vitepress/config.ts

Lines changed: 1 addition & 0 deletions
@@ -50,6 +50,7 @@ export default defineConfig({
           { text: 'Depict', link: '/depict' },
           { text: 'Tools', link: '/tools' },
           { text: 'OCSR', link: '/ocsr' },
+          { text: 'Python Documentation', link: 'https://cheminformatics-microservice.readthedocs.io/en/latest/' },
         ]
       },
       {

docs/architecture.md

Lines changed: 69 additions & 9 deletions
@@ -2,22 +2,82 @@
 outline: deep
 ---
 
-# Architecture
-
-Cheminformatics toolkits are based on different underlying programming languages. RDkit is based on C++ and Python, CDK is based on JAVA, and OpenBabel is based on C++, to name a few. These toolkits require different environmental setups for their integration. Being able to package and use any or all of these toolkits through a unified API would offer several advantages in the ease of use and integration into the existing workflows or applications. In addition, including these toolkits with the application code can be challenging because of compatibility issues and could become a maintenance nightmare. To address these issues, we have reached out to two typical software development techniques: containerization and microservices.
+# Architecture
 
 <p align="center">
   <img align="center" src="/architecture.png" alt="Logo" width="90%">
 </p>
 
-Microservices, also known as microservice architecture, is a software development approach that involves building applications as a collection of small, independent services that can be deployed, scaled and maintained independently. Each microservice performs a specific business function and communicates with other microservices using well-defined APIs. Containers are lightweight and isolated environments that package applications and their dependencies, allowing them to run consistently across different systems and environments. Containers provide a consistent and reproducible execution environment, ensuring that applications work the same way across development, testing, and production environments. Docker is a leading platform for containerization, providing a comprehensive set of tools and services for creating and managing containers. CM is containerized using Docker and is distributed publicly via the docker hub, a cloud-based registry provided by Docker that allows developers to store, share, and distribute Docker images.
 
-REST (Representational State Transfer) API is widely used and preferred for application development due to several advantages it offers in terms of simplicity and ease of use, scalability, and performance. REST API also offers platform and language independence, flexibility extensibility and wide range compatibility in its integrations. We have chosen FAST API, a modern, fast, and highly efficient web framework for building APIs with Python. It allows you to create robust and scalable APIs quickly and easily. Our REST API is built on the OpenAPI Specification 3.1.0 (OpenAPI, formerly known as Swagger, is an open standard for defining, documenting, and designing RESTful APIs. It allows you to describe the endpoints, request/response payloads, authentication methods, and other details of your API in a machine-readable format), which improves the functionality of REST APIs by offering standard documentation, promoting interoperability, enabling code generation, simplifying validation, and integrating with various tools and libraries.
+<p align="center">
+  <img align="center" src="/abstract.png" alt="Logo" width="90%">
+</p>
+
+
+## Backend Architecture
+- Written in **Python** using the **FastAPI** framework.
+- Provides **RESTful APIs** for cheminformatics operations.
+- **Hybrid toolkit integration**:
+  - Native Python toolkits: **RDKit**, **Open Babel**.
+  - Java toolkits (e.g., **CDK**, **SRU**, **OPSIN**) accessed via **JPype**.
+- **Docker containerization** ensures reproducibility and consistent deployment.
+- Designed for **scalability** and future extensibility without disrupting current services.
+- Publicly available at https://api.naturalproducts.net/latest/docs.
+
+## Frontend Architecture
+- Built using **React** for strong community support and sustainability.
+- **Component-based design** with modular service layers and **custom hooks**.
+- Styled using **Tailwind CSS** for responsive UI.
+- Communicates with backend via RESTful API using **Axios**.
+- API interactions are abstracted through a **dedicated service layer**.
+- Structured into **functionally specialized pages**.
+- Publicly available at **https://app.naturalproducts.net/**.
+
+## Toolkit Integration Challenge
+- Toolkits use different languages:
+  - **RDKit/OpenBabel**: C++/Python
+  - **CDK**: Java
+- Integrating multiple toolkits is complex due to **compatibility and environment setup**.
+- Solution: use **containerization** and **microservices** for abstraction and flexibility.
+
+## Microservices & Containerization
+- Microservices: each function is a **separate, scalable, maintainable unit**.
+- Containers provide isolated, reproducible environments.
+- **Docker** is used for packaging and sharing toolkits.
+- CM (Cheminformatics Microservice) is **publicly available via Docker Hub**.
+
+## API Design
+- Uses **REST API** based on **OpenAPI Specification 3.1.0**.
+- Built with **FastAPI** for speed, efficiency, and automatic documentation.
+- Promotes **interoperability**, **validation**, and **tool integration**.
+
+## Functional Scope
+- Provides:
+  - **Format conversions**
+  - **Optical Structure Recognition (OSR)**
+  - **Chemical data standardization**
+  - **Descriptor calculations**
+- Integrated tools:
+  - Cheminformatics: **RDKit**, **CDK**, **OpenBabel**
+  - Deep learning: **DECIMER**, **STOUT**
+- Enables **scalable**, **interoperable**, and **efficient** cheminformatics applications.
+
+## Architectural Decisions
+- Combines **FastAPI** + **Docker** for seamless deployment.
+- Maintains flexibility via **microservice architecture**:
+  - Individual services can be updated without affecting others.
+- Contrary to usual practice, **all toolkits are packaged in one container**:
+  - Simplifies deployment and avoids **complex orchestration**.
+
+## Deployment & Monitoring
+- **Dockerfile**, **docker-compose YAML**, and deployment scripts are available on GitHub.
+- **HELM charts** provided for Kubernetes-based deployments.
+- **Monitoring and visualization**:
+  - **Prometheus** for metrics collection and alerting.
+  - **Grafana** for real-time dashboards and usage visualization.
+
+
 
-This Cheminformatics Microservice project utilizes the containerized microservices approach to package chemistry toolkits and state-of-the-art deep learning tools to provide various functionalities from format conversions, OSR and chemical data standardises accessible via standard REST API. Cheminformatics Microservice comes pre-packaged with toolkits RDKit, CDK, OpenBabel and deep learning tools (DECIMER, STOUT) for handling chemical data - OSR, format conversions, and descriptor calculation. This enables efficient handling of large data volumes and improved performance and development of cheminformatics applications that are scalable and interoperable.
 
-Combining FastAPI with Docker will also simplify the deployment process, making it easier to distribute and run your API in various environments. Moreover, the microservice architecture can help improve the maintainability and flexibility of cheminformatics applications. Changes to one microservice can be made without affecting the other services, which reduces the risk of introducing bugs or errors. It also allows developers to modify or update individual services without having to rewrite the entire application.
 
-It's important to note that the cheminformatics toolkits distributed with CM are all packaged under one container. We consciously chose to go against the usual notion/practice that containers are supposed to do one thing, so every cheminformatics toolkit needs to be packaged as a separate microservice. This is to avoid unnecessary complexity of container orchestration across multiple containers while the containers, as such, can scale indefinitely as they are stateless.
 
-CM Docker file, a docker-compose YAML file, and other deployment scripts are available on the GitHub repository for anyone to orchestrate their deployment and manage multiple Docker containers as a single unit. HELM charts are also available for users to deploy the CM docker container and its dependencies to their Kubernetes cluster. Prometheus (is a monitoring and alerting tool that collects and stores time-series data metrics from various targets in real-time. It has a flexible query language and powerful data model that allows you to aggregate, analyze, and alert on your metrics data.) and Grafana (a popular open-source data visualization tool that works seamlessly with Prometheus and other data sources. It provides a rich set of features for creating and sharing dynamic, customizable dashboards that display metrics in real time) popular open-source tools are implemented for logging, monitoring, and visualizing usage statistics in a standalone or distributed system.
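
Note on the "all toolkits in one container" decision described in the rewritten architecture page: one way to picture it is a single process that routes each request to the chosen toolkit through a small registry, so no cross-container orchestration is needed. The Python sketch below only illustrates that idea. The registry `TOOLKITS`, the `process` function, and the stand-in handlers are hypothetical and do not come from the repository; a real handler would call RDKit or Open Babel directly, or CDK through JPype.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for toolkit calls. They only tag the input so the
# routing can be observed; they perform no real chemistry.
def _via_rdkit(smiles: str) -> str:
    return f"rdkit:{smiles}"


def _via_cdk(smiles: str) -> str:
    return f"cdk:{smiles}"


def _via_openbabel(smiles: str) -> str:
    return f"openbabel:{smiles}"


# One registry inside one process: every toolkit is reachable without
# leaving the container.
TOOLKITS: Dict[str, Callable[[str], str]] = {
    "rdkit": _via_rdkit,
    "cdk": _via_cdk,
    "openbabel": _via_openbabel,
}


def process(smiles: str, toolkit: str = "rdkit") -> str:
    """Route a request to the selected toolkit handler."""
    handler = TOOLKITS.get(toolkit.lower())
    if handler is None:
        raise ValueError(
            f"unknown toolkit {toolkit!r}; expected one of {sorted(TOOLKITS)}"
        )
    return handler(smiles)


if __name__ == "__main__":
    print(process("CCO"))
    print(process("CCO", toolkit="cdk"))
```

Adding a toolkit in this model means registering one more handler rather than deploying one more service, which is the trade-off the "Architectural Decisions" section describes.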

docs/contributors.md

Lines changed: 8 additions & 6 deletions
@@ -4,9 +4,11 @@ outline: deep
 
 # Contributors
 
-Please check our GitHub Contributors page for more details - https://github.com/Steinbeck-Lab/cheminformatics-microservice/graphs/contributors
-
-
-### Contributions overview (last 30 days)
-
-![Alt](https://repobeats.axiom.co/api/embed/5731b0fc10956ab00ceae59e7da402a6322fd73d.svg "Repobeats analytics image")
+- Dr. Kohulan Rajan
+- Venkata Chandrasekhar
+- Prof. Dr. Christoph Steinbeck
+- Nisha Sharma
+- Sri Ram Sagar Kanakam
+- Felix Baensch
+- Dr. Jonas Schaub
+- Noura Rayya
