Commit

Merge pull request #11 from ThalesGroup/v0.5.0
Towards version 0.5.0!
BaptisteMorisse authored Dec 1, 2023
2 parents 4de2d2d + daf19b5 commit b0290a1
Showing 33 changed files with 1,580 additions and 1,278 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -4,4 +4,4 @@ datasets
saved_models_path
experiments
*.code-workspace
**/.ruffcache
**/.ruffcache
40 changes: 40 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,40 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files

  # Using this mirror lets us use mypyc-compiled black, which is about 2x faster
  - repo: https://github.com/psf/black-pre-commit-mirror
    rev: 23.11.0
    hooks:
      - id: black

  - repo: https://github.com/astral-sh/ruff-pre-commit
    # Ruff version.
    rev: v0.1.4
    hooks:
      # Run the linter.
      - id: ruff
        args: [ --fix ]
      # Run the formatter.
      - id: ruff-format
  - repo: https://github.com/pycqa/isort
    rev: 5.12.0
    hooks:
      - id: isort
        args: ["--profile", "black", "--filter-files"]

  - repo: https://github.com/python-poetry/poetry
    rev: 1.7.1 # add version here
    hooks:
      - id: poetry-check
      - id: poetry-lock
        args: ["--no-update"]
      - id: poetry-install
        args: ["--sync"]
14 changes: 12 additions & 2 deletions Changelog.md
@@ -2,6 +2,16 @@

## [Unreleased]


## [Version 0.5.0]

* Renaming the commands of client and server parts. Improving the help printed by Typer.
* Adding a new check command for both Server and Client sides, so that the validity of the provided configuration file can be checked beforehand.
* **REFACTO**: the main commands of the pybiscus app are now located in src/commands. For instance, the previous version of `src/flower/client_fabric.py` is split into the Typer part src/commands/app_client.py and the new version of `src/flower/client_fabric.py`. This helps structure the code into more distinct blocks.
* The Unet3D Module comes now with a better Dice Loss and a Dice metric instead of the Accuracy (not suitable in the context of segmentation of 3D images).
* Small change on the `weighted_average` function, to take care of the change of keywords.
* **NEW FEATURE**: using Trogon to add a Terminal User Interface command to the Pybiscus app. This helps new users to browse through help, existing commands and their syntax.
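
To illustrate the Unet3D entry above, here is a minimal sketch of why Accuracy can mislead on segmentation while a Dice score does not. It is not the code shipped in this commit: `dice_score`, `accuracy`, and the toy masks are hypothetical names and data, and the masks are flattened binary lists rather than 3D tensors.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int], eps: float = 1e-8) -> float:
    """Dice coefficient 2|P∩T| / (|P| + |T|) for flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    # eps keeps the ratio defined (and equal to 1) when both masks are empty.
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


def accuracy(pred: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of voxels whose predicted label matches the target."""
    return sum(p == t for p, t in zip(pred, target)) / len(target)


# Hypothetical flattened volume: 2 foreground voxels out of 100.
target = [1, 1] + [0] * 98
all_background = [0] * 100  # a model that never predicts foreground

print(accuracy(all_background, target))    # high, although every foreground voxel is missed
print(dice_score(all_background, target))  # close to 0
```

A differentiable Dice loss is commonly taken as `1 - dice_score` computed on predicted probabilities instead of hard labels; whether the Unet3D module does exactly that is not shown in this diff.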

## [Version 0.4.0]

* the Server has now the possibility to save the weights of the model at the end of the FL session.
@@ -24,5 +34,5 @@
* getting rid of load_data_paroma, and replacing it by direct use of LightningDataModule.
* updating config files accordingly
* moving logging of evaluate function to evaluate inside FabricStrategy; more coherence with aggregate_fit and aggregate_evaluate.
* upgrading the config for local training with key 'trainer', making all Trainer arguments virtually available
* adding a constraint on deepspeed library due to some issues with the installation of the wheel. Issue with poetry? In poetry, version is 0.9.0 but in installing the wheel built by poetry, it is 0.11.1...
* upgrading the config for local training with key 'trainer', making all Trainer arguments virtually available
* adding a constraint on deepspeed library due to some issues with the installation of the wheel. Issue with poetry? In poetry, version is 0.9.0 but in installing the wheel built by poetry, it is 0.11.1...
23 changes: 21 additions & 2 deletions README.md
@@ -1,14 +1,29 @@
# Pybiscus: a flexible Federated Learning Framework

## Get started
## Introduction

Pybiscus is a simple tool to perform Federated Learning on various models and datasets.
Pybiscus is a simple tool to perform Federated Learning on various models and datasets.
It aims to automate the FL pipeline as much as possible, and allows adding virtually any kind of dataset and model.

Pybiscus is built on top of Flower, a mature Federated Learning framework; Typer (script and CLI parts) and Lightning/Fabric for all the Machine Learning machinery.

You can simply test Pybiscus by downloading the latest wheel available and install it.

## Get started

You can simply test Pybiscus by downloading the latest wheel available in the dist folder and installing it in a virtual environment:
```bash
python -m pip install virtualenv
python -m virtualenv .venv
source .venv/bin/activate
(.venv) python -m pip install pybiscus_paroma-0.5.0-py3-none-any.whl
```

and you are good to go! The package comes with an app named `pybiscus_paroma_app` that you can use in the virtual environment. You can then check that everything went well by launching a local training:
```bash
(.venv) pybiscus_paroma_app local train-config configs/local_train.yml
```

## Documentation

Documentation is available at [docs](docs/).
@@ -17,6 +32,10 @@ Documentation is available at [docs](docs/).

If you are interested in contributing to the Pybiscus project, start by reading the [Contributing guide](/CONTRIBUTING.md).

## Who uses Pybiscus

Pybiscus is under active development at Thales, both for internal use and in some collaborative projects. One major use is in the European project [PAROMA-MED](https://paroma-med.eu), dedicated to Federated Learning in the context of medical data distributed among several hospitals.

## License

The License is Apache 2.0. You can find all the relevant information here [LICENSE](/LICENSE.md)
26 changes: 26 additions & 0 deletions ROADMAP.md
@@ -0,0 +1,26 @@

# Features

## Work in Progress

- [ ] Improving the Documentation part and docstrings of the code.
- [ ] Adding some other datasets and models.
- [ ] Adding a test suite using pytest
- [ ] Thoughts on Checkpointing.
- [ ] Adding commands to deal with data downloading and/or preprocessing?
- [ ] Integration of Differential Privacy.
- [ ] Integration of FHE
- [ ] Adding other strategies:
    * FedProx
    * ...
- [ ] Refacto: splitting flower directory into flower and typer parts. Separating commands themselves from code of Flower.
- [ ] Better logging: fusion between logging through Rich Console and logging from Flower?
- [x] Use Pydantic to control config files and the good use of the different tools.
- [x] Adding the possibility to save the weights of the model. Can be done using Fabric.
- [x] Working on Docker part.
- [x] Using only LightningDataModule, and getting rid of load_data.
- [x] Logging with tensorboard.

## Road Map

Here is a list of more mid/long term ideas to implement in Pybiscus for Federated Learning.
3 changes: 2 additions & 1 deletion configs/client_1.yml
@@ -1,4 +1,5 @@
cid: 1
pre_train_val: true
fabric:
accelerator: gpu
devices:
@@ -18,4 +19,4 @@ data:
dir_val: ${root_dir}/datasets/client1/val/
dir_test: None
batch_size: 32
server_adress: localhost:22222
server_adress: localhost:22222
2 changes: 1 addition & 1 deletion configs/client_2.yml
@@ -18,4 +18,4 @@ data:
dir_val: ${root_dir}/datasets/client2/val/
dir_test: None
batch_size: 32
server_adress: localhost:22222
server_adress: localhost:22222
2 changes: 1 addition & 1 deletion container/Dockerfile
@@ -61,7 +61,7 @@ WORKDIR "${HOME}/app"
COPY ./ ./
# RUN poetry build
RUN pip install -U pip wheel setuptools
RUN pip install --no-cache-dir pybiscus-0.4.0-py3-none-any.whl
RUN pip install --no-cache-dir pybiscus-0.5.0-py3-none-any.whl

##############################################################################

3 changes: 2 additions & 1 deletion container/configs/client_1.yml
@@ -1,4 +1,5 @@
cid: 1
pre_train_val: true
fabric:
accelerator: gpu
root_dir: ${oc.env:PWD}
@@ -16,4 +17,4 @@ data:
dir_val: ${root_dir}/datasets/client1/val/
dir_test: None
batch_size: 32
server_adress: server:22222
server_adress: server:22222
2 changes: 1 addition & 1 deletion container/configs/client_2.yml
@@ -16,4 +16,4 @@ data:
dir_val: ${root_dir}/datasets/client2/val/
dir_test: None
batch_size: 32
server_adress: server:22222
server_adress: server:22222
2 changes: 1 addition & 1 deletion container/configs/local_train.yml
@@ -15,4 +15,4 @@ model:
input_shape: 3
mid_shape: 6
n_classes: 10
lr: 0.001
lr: 0.001
2 changes: 1 addition & 1 deletion container/configs/server.yml
@@ -21,4 +21,4 @@ server_adress: "[::]:22222"
num_rounds: 10
client_configs:
- ${root_dir}/configs/client_1.yml
- ${root_dir}/configs/client_2.yml
- ${root_dir}/configs/client_2.yml
Binary file added container/pybiscus-0.5.0-py3-none-any.whl
Binary file not shown.
2 changes: 1 addition & 1 deletion container/scripts/launch_client_1.sh
@@ -10,4 +10,4 @@ docker run \
--net-alias client-1 \
--user $uid:$gid \
--shm-size 50G \
pybiscus:app.v0.4.0 client launch-config configs/client_1.yml
pybiscus:app.v0.5.0 client launch configs/client_1.yml
2 changes: 1 addition & 1 deletion container/scripts/launch_client_2.sh
@@ -10,4 +10,4 @@ docker run \
--net-alias client-2 \
--user $uid:$gid \
--shm-size 50G \
pybiscus:app.v0.4.0 client launch-config configs/client_2.yml
pybiscus:app.v0.5.0 client launch configs/client_2.yml
2 changes: 1 addition & 1 deletion container/scripts/launch_local_train.sh
@@ -8,4 +8,4 @@ docker run \
-v ${PWD}/container/configs:/app/configs \
--user $uid:$gid \
--shm-size 50G \
pybiscus:app.v0.4.0 local train-config configs/local_train.yml
pybiscus:app.v0.5.0 local train-config configs/local_train.yml
2 changes: 1 addition & 1 deletion container/scripts/launch_server.sh
@@ -10,4 +10,4 @@ docker run \
--net-alias server \
--user $uid:$gid \
--shm-size 50G \
pybiscus:app.v0.4.0 server launch-config configs/server.yml
pybiscus:app.v0.5.0 server launch configs/server.yml
Binary file added dist/pybiscus-0.5.0-py3-none-any.whl
Binary file not shown.
Binary file added dist/pybiscus-0.5.0.tar.gz
Binary file not shown.
8 changes: 4 additions & 4 deletions docs/configuration.md
@@ -22,23 +22,23 @@ The keyword `devices` is waiting for either a list of integers (the id of the de
...
fabric:
accelerator: cpu
# devices:
# devices:
...
```

The keyword `devices` is left intentionally commented, as Fabric will automatically find a suitable device corresponding to the `cpu` choice.
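
As a rough sketch of what leaving `devices` commented means in practice: once the YAML is parsed, the key is simply absent, and a loader can fall back to `"auto"`, which to my knowledge is also Fabric's own default for `devices`. The helper name `fabric_kwargs` and the dictionaries below are hypothetical; how Pybiscus actually forwards this section to Fabric is not shown in this diff.

```python
from typing import Any, Dict


def fabric_kwargs(config: Dict[str, Any]) -> Dict[str, Any]:
    """Build keyword arguments for Fabric from a parsed config dictionary."""
    fabric_cfg = config.get("fabric") or {}
    devices = fabric_cfg.get("devices")
    return {
        "accelerator": fabric_cfg.get("accelerator", "auto"),
        # A commented-out `devices:` key is absent after parsing, so fall back to "auto".
        "devices": "auto" if devices is None else devices,
    }


# Hypothetical parsed configs: one CPU config without devices, one GPU config with a device id.
cpu_config = {"fabric": {"accelerator": "cpu"}}
gpu_config = {"fabric": {"accelerator": "gpu", "devices": [0]}}

print(fabric_kwargs(cpu_config))
print(fabric_kwargs(gpu_config))
```

The resulting dictionary could then be unpacked into the Fabric constructor, for example `Fabric(**fabric_kwargs(cpu_config))`, assuming Lightning is installed.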

## Models

Please look at
Please look at
::: src.flower.server_fabric.evaluate_config
options:
heading_level: 3

and
and

::: src.flower.server_fabric.launch_config

## Data

## Others
## Others
26 changes: 13 additions & 13 deletions docs/getting_started.md
@@ -6,7 +6,7 @@ A simple tool to perform Federated Learning on various models and datasets. Buil
You have two ways of using Pybiscus. Either by cloning the repo and installing (via Poetry) all dependencies, and working on the code itself; or by just downloading the wheel and installing as a package.

### User Mode
The wheel in `dist/pybiscus-0.4.0-py3-none-any.whl` is the packaged version of Pybiscus. You can download it and do
The wheel in `dist/pybiscus-0.5.0-py3-none-any.whl` is the packaged version of Pybiscus. You can download it and do
```bash
pyenv local 3.9.12
python -m pip install virtualenv
@@ -29,22 +29,22 @@ this command will show you some documentation on how to use the app. There are t

Note that the package is still actively under development, and even if we try as much as possible to not break things, it could happen!

To work, the app needs only config files for the server and the clients. Any number of clients can be launched, using the same command `client launch-config`.
To work, the app needs only config files for the server and the clients. Any number of clients can be launched, using the same command `client launch`.
or, now with config files (examples are provided in `configs/`):
```bash
pybiscus_app server launch-config path-to-config/server.yml
pybiscus_app client launch-config path-to-config/client_1.yml
pybiscus_app client launch-config path-to-config/client_2.yml
pybiscus_app server launch path-to-config/server.yml
pybiscus_app client launch path-to-config/client_1.yml
pybiscus_app client launch path-to-config/client_2.yml
```
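
Since any number of clients can be started with the same `client launch` command, the command lines can be generated from a list of config files. The sketch below only builds and prints them; `launch_commands` is a hypothetical helper, and the paths are the placeholder ones used above.

```python
import shlex
from typing import List, Sequence


def launch_commands(
    server_config: str,
    client_configs: Sequence[str],
    app: str = "pybiscus_app",
) -> List[List[str]]:
    """Return one argument list for the server, then one per client config."""
    commands = [[app, "server", "launch", server_config]]
    commands += [[app, "client", "launch", cfg] for cfg in client_configs]
    return commands


# Placeholder paths, as in the shell example.
for cmd in launch_commands(
    "path-to-config/server.yml",
    ["path-to-config/client_1.yml", "path-to-config/client_2.yml"],
):
    print(shlex.join(cmd))
```

Each argument list can be handed to `subprocess.Popen` to start the processes from a single script, with the server started first.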

Here is the API for the server, for instance:
::: src.flower.server_fabric.launch_config

### Dev Mode

We strongly suggest the use of both pyenv and poetry.
We strongly suggest the use of both pyenv and poetry.

* Pyenv is a tool to manage properly Python versions, and you can find install instructions here https://github.com/pyenv/pyenv#installation.
* Pyenv is a tool to manage properly Python versions, and you can find install instructions here https://github.com/pyenv/pyenv#installation.

* Poetry is a dependency tool, way better than the usual "pip install -r requirements.txt" paradigm, and manages virtual environments too. It is easy to use, well documented, and the install instructions are here https://python-poetry.org/docs/#installation.

@@ -61,7 +61,7 @@ and you are good to go! We suggest to create a directory `experiments` to hold c
To build the image (which is quite heavy as of now), do the following
```bash
cd container
docker build . -t pybiscus:app.v0.4.0
docker build . -t pybiscus:app.v0.5.0
```


@@ -74,7 +74,7 @@ docker build \
--build-arg http_proxy=$HTTP_PROXY \
--build-arg https_proxy=$HTTPS_PROXY \
--build-arg no_proxy=$NO_PROXY \
. -t pybiscus:app.v0.4.0
. -t pybiscus:app.v0.5.0
```

Then, again only if you have to go through a proxy for internet access, the different containers will need internet access to download the data.
@@ -98,21 +98,21 @@ to ne noProxy config.
and voila! The docker image is aimed at running only the pybiscus_app itself. In order to facilitate the use of docker (which can be quite verbose), some scripts are available in container/scripts. To launch a local training, you just need to update `container/scripts/launch_local_train.sh` and `container/configs/local_train.yml` according to where your datasets are located. Then, simply run
```bash
bash container/scripts/launch_local_train.sh
```
```

It is as simple as running
```bash
docker run -t --gpus device=(some_device) -v "$(pwd)":/app/datasets pybiscus:app --help
```

to get the help of the app. The short version is, `docker run -t pybiscus:app.v0.4.0` is equivalent to running `pybiscus_app`. As for the app itself, the docker image can launch either client, server or local components.
to get the help of the app. The short version is, `docker run -t pybiscus:app.v0.5.0` is equivalent to running `pybiscus_app`. As for the app itself, the docker image can launch either client, server or local components.

To launch a "true" Federated learning, you need first to create a docker network for the containers to communicate:
```bash
docker network create federated
```

then
then
```bash
bash container/scripts/launch_server.sh
```
@@ -121,7 +121,7 @@ followed by (in other terminal)
```bash
bash container/scripts/launch_client_1.sh
```
and
and
```bash
bash container/scripts/launch_client_2.sh
```