Commits
35 commits
- 4f486be: Add MLflow tracking uri in ioparameters (xiaoyachong, Mar 12, 2025)
- ef9388d: Merge remote-tracking branch 'upstream/prefect' into xiaoya-add-featu… (xiaoyachong, Mar 12, 2025)
- b07d28e: Add mlflow tracking uri in ioparameters (xiaoyachong, Mar 12, 2025)
- 51a6d24: Merge remote-tracking branch 'upstream/master' into xiaoya-add-featur… (xiaoyachong, Mar 12, 2025)
- 5c64d8b: Add mlflow tracking uri in ioparameters (xiaoyachong, Mar 12, 2025)
- dd212da: Add mlflow tracking uri in ioparameters (xiaoyachong, Mar 12, 2025)
- a154326: Update readme for mlflow instruction (xiaoyachong, Mar 13, 2025)
- 944c4d5: Update readme for mlflow instruction (xiaoyachong, Mar 13, 2025)
- 2e7f2fc: Add mlflow to infrastruture check and update env variables (taxe10, Apr 8, 2025)
- e99cf4c: :books: Update docker related documentation (taxe10, Apr 8, 2025)
- a397148: :bug: Fix displaying results (taxe10, Apr 8, 2025)
- 5c3db5b: Merge remote-tracking branch 'upstream/master' into xiaoya-add-featur… (xiaoyachong, Apr 27, 2025)
- 61e0bb0: Merge branch 'xiaoya-add-feature-mlflow' of https://github.com/xiaoya… (xiaoyachong, Apr 27, 2025)
- 119cdc6: add mlflow build-in auth (xiaoyachong, Apr 28, 2025)
- 18c09c0: add mlflow build-in auth (xiaoyachong, Apr 28, 2025)
- bc3f43b: add mlflow build-in auth (xiaoyachong, Apr 28, 2025)
- e0ab708: add mlflow build-in auth (xiaoyachong, Apr 28, 2025)
- ce31f52: add tmp dir for mlflow to handle large model (xiaoyachong, May 26, 2025)
- 473d6f7: add mlflow algorithm registry (xiaoyachong, Jun 27, 2025)
- 20f2893: add mlflow algorithm registry (xiaoyachong, Jun 27, 2025)
- 926b650: add mlflow algorithm registry (xiaoyachong, Jun 27, 2025)
- ae18bcd: update job params list (xiaoyachong, Jun 29, 2025)
- 78dc2cb: use black to reformat the code (xiaoyachong, Jun 30, 2025)
- 506a53c: update kaleido and plotly versions (xiaoyachong, Aug 4, 2025)
- 4658d52: update algorithm registry (xiaoyachong, Aug 30, 2025)
- fbe649b: update algorithm registry (xiaoyachong, Aug 30, 2025)
- db556e2: upgrade prefect version (xiaoyachong, Sep 13, 2025)
- a24c527: fix log display (xiaoyachong, Sep 17, 2025)
- 90686a9: fix log display (xiaoyachong, Sep 17, 2025)
- 736ef47: fix log display (xiaoyachong, Sep 17, 2025)
- e7c3d7a: fix log display (xiaoyachong, Sep 17, 2025)
- 1bd7919: import packages from mlex_utils (xiaoyachong, Nov 2, 2025)
- be9e290: run isort and black (xiaoyachong, Nov 2, 2025)
- 909a463: load dvc html from mlflow (xiaoyachong, Nov 23, 2025)
- 2db09da: remove credentials (xiaoyachong, Nov 23, 2025)
44 changes: 27 additions & 17 deletions .env.example
@@ -1,4 +1,3 @@
# Services
TILED_SINGLE_USER_API_KEY=<unique api key>

PREFECT_DB_PW=<unique password>
@@ -11,32 +10,40 @@ TILED_DB_USER=tiled_user
TILED_DB_NAME=tiled
TILED_DB_SERVER=tiled_db

# Tiled setup
TILED_KEY=<your-key>
DEFAULT_TILED_URI=http://tiled:8000
DEFAULT_TILED_SUB_URI=""
RESULTS_TILED_URI=http://tiled:8000
RESULTS_TILED_API_KEY=<your-key>
MLFLOW_DB_PW=<unique password>
MLFLOW_DB_USER=mlflow_user
MLFLOW_DB_NAME=mlflow

# Directory setup
# Directories
READ_DIR=/path/to/read/data
WRITE_DIR=/path/to/read/data/write
WRITE_DIR=/path/to/write/results

# Static Tiled setup [Optional]
STATIC_TILED_URI=
STATIC_TILED_API_KEY=
# Default Tiled setup
DEFAULT_TILED_URI=http://tiled:8000
DEFAULT_TILED_SUB_URI=
DATA_TILED_KEY=<your_data_tiled_key>
RESULTS_TILED_URI=http://tiled:8000
RESULTS_TILED_API_KEY=<your_results_tiled_key>

# Prefect
PREFECT_API_URL=http://prefect:4200/api
FLOW_NAME="Parent flow/launch_parent_flow"
TIMEZONE="US/Pacific"
PREFECT_TAGS='["data-clinic"]'
FLOW_TYPE=docker
PREFECT_TAGS='["latent-space-explorer"]'

# MLFlow
MLFLOW_TRACKING_URI=http://mlflow:5000
# MLflow Authentication
MLFLOW_TRACKING_USERNAME=admin
MLFLOW_TRACKING_PASSWORD=<secure password>
MLFLOW_FLASK_SERVER_SECRET_KEY=<random secret key>

MODE=dev
# Mode
MODE="deployment"

# Docker jobs
CONTAINER_NETWORK=mle_net
# Job settings
# Docker/Podman jobs
CONTAINER_NETWORK="mle_net"

# Slurm jobs
PARTITIONS_CPU='["p_cpu1", "p_cpu2"]'
@@ -48,3 +55,6 @@ RESERVATIONS_GPU='["r_gpu1", "r_gpu2"]'
MAX_TIME_GPU="1:00:00"
SUBMISSION_SSH_KEY="~/.ssh/id_rsa"
FORWARD_PORTS='["8888:8888"]'

#algorithm registry in mlflow
ALGORITHM_JSON_PATH="../src/assets/default_models.json"
2 changes: 2 additions & 0 deletions .gitignore
@@ -168,3 +168,5 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

basic_auth.ini
64 changes: 61 additions & 3 deletions README.md
@@ -1,12 +1,12 @@
# mlex_data_clinic
# MLExchange Data Clinic

## Description
This app provides a training/testing platform for latent space exploration with
unsupervised deep-learning approaches.

## Running as a Standalone Application (Using Docker)

The **Prefect server, Tiled server, the application, and the Prefect worker job** all run within a **single Docker container**. This eliminates the need to start the servers separately.
The **Prefect server, Tiled server and the application** are all defined within a **single Docker Compose file**. Each service runs in its own Docker container, simplifying the setup process while maintaining modularity.

However, the **Prefect worker** must be run separately on your local machine (refer to step 5).

@@ -31,13 +31,71 @@ Then **update the** `.env` file with the correct values.

**Important Note:** Due to the current tiled configuration, ensure that the `WRITE_DIR` is a subdirectory of the `READ_DIR` if the same tiled server is used for both reading data and writing results.

#### MLFlow Configuration in .env

When setting `MLFLOW_TRACKING_URI` in the `.env` file:

- If you run the [MLFlow server](https://github.com/xiaoyachong/mlex_mlflow) locally, you can set it to:
> **Reviewer comment:** We should add mlflow to docker-compose.yml in this repo to avoid cloning an extra repo

```
MLFLOW_TRACKING_URI="http://mlflow-server:5000"
```
This works because the MLFlow server also runs in the `mle_net` Docker network.

- If you run MLFlow server on vaughan and use SSH port forwarding:
```
ssh -S forward -L 5000:localhost:5000 <your-username>@vaughan.als.lbl.gov
```
Then you can set it to:
```
MLFLOW_TRACKING_URI="http://host.docker.internal:5000"
```

You also need to set `MLFLOW_TRACKING_USERNAME` and `MLFLOW_TRACKING_PASSWORD` in the `.env` file, and update `admin_username` and `admin_password` in `basic_auth.ini` to match.

Create a `basic_auth.ini` file using `basic_auth.ini.example` as a reference:

```sh
cp basic_auth.ini.example basic_auth.ini
```


### 3 Build and Start the Application

#### 3.1 Algorithm Registry Setup in MLFlow

Before starting the application, you need to register your algorithms in MLflow. This is a one-time setup process:

1. Start only the MLflow services:
```sh
docker compose up -d mlflow mlflow_db
```

2. Wait a few seconds for MLflow to initialize, then register the algorithms:
```sh
cd scripts
python save_mlflow_algorithms.py
```

This script will:
- Connect to the MLflow server
- List any existing algorithms
- Register all algorithms from the JSON file specified by the `ALGORITHM_JSON_PATH` environment variable
- Show the registration status for each algorithm

> **Note:** By default, `ALGORITHM_JSON_PATH` points to `./all_models.json`, which is the combination of the models defined in Data Clinic and Latent Space Explorer. You can customize this by setting the environment variable in your `.env` file.
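The registration flow above can be pictured with a short sketch. The JSON shape and field names below are assumptions for illustration only (the actual schema lives in the file pointed to by `ALGORITHM_JSON_PATH`), and the real script presumably registers each entry against the MLflow server rather than printing:

```python
import json

# Hypothetical stand-in for the file referenced by ALGORITHM_JSON_PATH;
# the real all_models.json may use a different schema.
registry_json = """
[
  {"model_name": "pca", "version": "1.0.0"},
  {"model_name": "autoencoder", "version": "1.0.0"}
]
"""

algorithms = json.loads(registry_json)

# The real script would talk to the MLflow registry here, e.g. creating one
# registered model per entry and skipping names that already exist.
names = [entry["model_name"] for entry in algorithms]
print(f"would register {len(names)} algorithms: {', '.join(names)}")
```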

#### 3.2 Start the Full Application

After successfully registering the algorithms, you can start the complete application:

```sh
docker compose up -d
```

This command will:
- Start all services defined in your docker-compose.yml
- Run the containers in the background (detached mode, via the `-d` flag)
- Use the algorithms registered in MLflow

### 4 Verify Running Containers

5 changes: 5 additions & 0 deletions basic_auth.ini.example
@@ -0,0 +1,5 @@
[mlflow]
default_permission = READ
database_uri = sqlite:////mlflow_auth/basic_auth.db
admin_username = <MLFLOW_TRACKING_USERNAME>
admin_password = <MLFLOW_TRACKING_PASSWORD>
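MLflow's built-in auth app treats this file as a standard INI, so it can be sanity-checked with `configparser` before starting the server. A small sketch, with the example copied inline (placeholder credentials, not real ones):

```python
import configparser

# Inline copy of basic_auth.ini.example so the snippet is self-contained;
# in practice you would call config.read("basic_auth.ini") instead.
ini_text = """
[mlflow]
default_permission = READ
database_uri = sqlite:////mlflow_auth/basic_auth.db
admin_username = admin
admin_password = change-me
"""

config = configparser.ConfigParser()
config.read_string(ini_text)
print(config["mlflow"]["admin_username"], config["mlflow"]["default_permission"])
```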
61 changes: 59 additions & 2 deletions docker-compose.yml
@@ -2,8 +2,9 @@ version: '3.7'

services:
prefect:
image: prefecthq/prefect:2.14-python3.11
image: prefecthq/prefect:3.4.2-python3.11
command: prefect server start
container_name: prefect-server
environment:
- PREFECT_SERVER_API_HOST=0.0.0.0
- PREFECT_API_DATABASE_CONNECTION_URL=postgresql+asyncpg://${PREFECT_DB_USER}:${PREFECT_DB_PW}@prefect_db:5432/${PREFECT_DB_NAME} # Needed if using postgres and not sqlite
@@ -19,6 +20,7 @@ services:

prefect_db:
image: postgres:14.5-alpine
container_name: prefect-db
environment:
- POSTGRES_USER=${PREFECT_DB_USER}
- POSTGRES_PASSWORD=${PREFECT_DB_PW}
@@ -33,6 +35,7 @@ services:
tiled:
# see the file ./tiled/deploy/config.yml for detailed configuration of tiled
image: ghcr.io/bluesky/tiled:v0.1.0a118
container_name: tiled-server
ports:
- "127.0.0.1:8000:8000"
environment:
@@ -52,6 +55,7 @@

tiled_db:
image: postgres:14.5-alpine
container_name: tiled-db
environment:
- POSTGRES_USER=${TILED_DB_USER}
- POSTGRES_PASSWORD=${TILED_DB_PW}
@@ -63,6 +67,49 @@
networks:
mle_net:

mlflow:
image: ghcr.io/mlflow/mlflow:v2.22.0
container_name: mlflow-server
command: >
/bin/sh -c "pip install --no-cache-dir psycopg2-binary 'mlflow[auth]' &&
mlflow server
--backend-store-uri postgresql://${MLFLOW_DB_USER}:${MLFLOW_DB_PW}@mlflow_db:5432/${MLFLOW_DB_NAME}
--host 0.0.0.0
--port 5000
--app-name basic-auth"
ports:
- "127.0.0.1:5000:5000"
environment:
- POSTGRES_USER=${MLFLOW_DB_USER}
- POSTGRES_PASSWORD=${MLFLOW_DB_PW}
- POSTGRES_DB=${MLFLOW_DB_NAME}
- MLFLOW_TRACKING_USERNAME=${MLFLOW_TRACKING_USERNAME}
- MLFLOW_TRACKING_PASSWORD=${MLFLOW_TRACKING_PASSWORD}
- MLFLOW_FLASK_SERVER_SECRET_KEY=${MLFLOW_FLASK_SERVER_SECRET_KEY}
- MLFLOW_AUTH_CONFIG_PATH=/basic_auth.ini
depends_on:
- mlflow_db
networks:
- mle_net
volumes:
- ./data/mlflow_storage:/mlartifacts:rw # Persist MLflow models, logs, and artifacts
- ./data/mlflow_auth:/mlflow_auth:rw # Dedicated volume for auth database
- ./basic_auth.ini:/basic_auth.ini

mlflow_db:
image: postgres:14.5-alpine # Lightweight PostgreSQL version
container_name: mlflow-db
restart: unless-stopped
environment:
- POSTGRES_USER=${MLFLOW_DB_USER}
- POSTGRES_PASSWORD=${MLFLOW_DB_PW}
- POSTGRES_DB=${MLFLOW_DB_NAME}
volumes:
- ./data/mlflow_db:/var/lib/postgresql/data:rw
- ./data/mlflow_storage:/mlartifacts:rw # Persist PostgreSQL database
networks:
- mle_net

data_clinic:
restart: "unless-stopped"
container_name: "data_clinic"
@@ -88,7 +135,6 @@ services:
FLOW_NAME: '${FLOW_NAME}'
TIMEZONE: "${TIMEZONE}"
PREFECT_TAGS: "${PREFECT_TAGS}"
FLOW_TYPE: "${FLOW_TYPE}"
CONTAINER_NETWORK: "${CONTAINER_NETWORK}"
# Slurm jobs
PARTITIONS_CPU: "${PARTITIONS_CPU}"
@@ -102,10 +148,21 @@
# Mode
MODE: "development"
USER: ${USER}
# MLflow
MLFLOW_TRACKING_URI: '${MLFLOW_TRACKING_URI}'
MLFLOW_TRACKING_USERNAME: '${MLFLOW_TRACKING_USERNAME}'
MLFLOW_TRACKING_PASSWORD: '${MLFLOW_TRACKING_PASSWORD}'
MLFLOW_CACHE_DIR: "/mlflow_cache"
volumes:
- $READ_DIR:/tiled_storage
- ./src:/app/work/src
- ./data/mlflow_cache:/mlflow_cache
ports:
- 127.0.0.1:8072:8070
depends_on:
- mlflow
- tiled
- prefect
networks:
- mle_net

23 changes: 23 additions & 0 deletions frontend.py
@@ -1,6 +1,29 @@
# Configure logging at the earliest possible point
import logging
import os
import sys
from uuid import uuid4

# Set up basic configuration
logging.basicConfig(
level=logging.INFO, # Use DEBUG to see all logs
format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
handlers=[logging.StreamHandler(sys.stdout)],
)

# Create logger for this module
logger = logging.getLogger("dataclinic")
logger.info("Logging configured. Frontend module initializing.")

# Explicitly set level for the dataclinic namespace
logging.getLogger("dataclinic").setLevel(logging.INFO)

# Force propagation for all existing dataclinic loggers
for name in logging.root.manager.loggerDict:
if name.startswith("dataclinic."):
logging.getLogger(name).propagate = True


from dash import MATCH, Input, Output, html
from dotenv import load_dotenv

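The propagation pattern in this hunk can be reproduced in isolation; the child logger name below is hypothetical, chosen only to demonstrate that records from `dataclinic.*` loggers bubble up to the handler configured by `basicConfig`:

```python
import logging
import sys

# Same configuration shape as the frontend.py changes above.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    handlers=[logging.StreamHandler(sys.stdout)],
)

parent = logging.getLogger("dataclinic")
parent.setLevel(logging.INFO)

# Hypothetical child module logger; it has no level or handler of its own,
# so it inherits INFO from "dataclinic" and propagates to the root handler.
child = logging.getLogger("dataclinic.callbacks")
child.propagate = True

child.info("child messages reach the root handler via propagation")
```

Because child loggers default to `NOTSET`, setting the level once on the `dataclinic` parent is enough for every `dataclinic.*` logger created later.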
11 changes: 6 additions & 5 deletions pyproject.toml
@@ -26,18 +26,19 @@ dependencies = [
"dash-extensions==0.0.71",
"flask==3.0.0",
"Flask-Caching",
"kaleido",
"kaleido<=0.2.1",
"humanhash3",
"mlex_file_manager@git+https://github.com/mlexchange/mlex_file_manager.git",
"mlex_utils[all]@git+https://github.com/mlexchange/mlex_utils.git",
"mlex_utils[all]@git+https://github.com/xiaoyachong/mlex_utils.git@xiaoya-update-prefect3",
"mlflow",
"numpy>=1.19.5",
"pandas",
"Pillow",
"plotly>=5.21.0",
"plotly>=5.21.0,<6.0.0",
"plotly-express",
"pyFAI==2023.9.0",
"pyFAI==2025.3.0",
"python-dotenv",
"requests==2.26.0",
"requests",
"diskcache==5.6.3"
]

Empty file added scripts/__init__.py