Diffusion models are a type of generative model whose approach differs from GANs, VAEs, and flow-based models. In this repository, I re-implemented a diffusion model from scratch to run some experiments:
- Diffusion Model: Training with simple loss
- Inference with DDPM and DDIM
- Using (label, image, text) as condition for diffusion model
- Latent diffusion: Image space to latent space with VAE
- Stable diffusion: Latent + Condition Diffusion
- Classifier-free guidance
- Sketch2Image: using condition as sketch image
- Medical Image Segmentation: using condition as medical image
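The "simple loss" in the first item usually refers to the noise-prediction MSE objective from the DDPM paper. The following is a minimal NumPy sketch of that objective, not the repository's training code: the linear schedule values, the function names, and the `zero_model` placeholder are assumptions made for illustration.

```python
import numpy as np

def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (start/end values are assumed, not taken from the repo)."""
    return np.linspace(beta_start, beta_end, num_steps)

def q_sample(x0, t, alpha_bar, noise):
    """Forward process: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise."""
    a = alpha_bar[t].reshape(-1, *([1] * (x0.ndim - 1)))  # broadcast over image dims
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise

def simple_loss(model, x0, alpha_bar, rng):
    """MSE between the sampled noise and the model's noise prediction."""
    t = rng.integers(0, len(alpha_bar), size=x0.shape[0])  # one timestep per sample
    noise = rng.standard_normal(x0.shape)
    x_t = q_sample(x0, t, alpha_bar, noise)
    return np.mean((noise - model(x_t, t)) ** 2)

betas = linear_beta_schedule()
alpha_bar = np.cumprod(1.0 - betas)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 1, 32, 32))        # stand-in batch of 32x32 images
zero_model = lambda x_t, t: np.zeros_like(x_t)  # hypothetical placeholder network
loss = simple_loss(zero_model, x0, alpha_bar, rng)
```

In a real run the placeholder would be replaced by the UNet, and the loss would be minimized with a gradient-based optimizer.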
On the other hand, Flow Matching currently outperforms diffusion models on several tasks. In this repository, I used the library from the Meta research team to run some experiments.
git clone https://github.com/hanhNK1604/diffusion.git
cd diffusion
conda create -n diffusion python=3.10
conda activate diffusion
pip install -r requirements.txt
Set CUDA_VISIBLE_DEVICES and WANDB_API_KEY before training:
export CUDA_VISIBLE_DEVICES=0
export WANDB_API_KEY=???
cd src
python train.py
Several config files and folders should be modified to run the desired experiment, such as configs/data, logger, model, trainer, and train.yaml.
...
-
Generation task:
-
Segmentation task:
- Self Attention
- Cross Attention
- Spatial Transformer
- ResNet Block
- VGG Block
- DenseNet Block
- Inception Block
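The Self Attention and Cross Attention blocks listed above both rest on scaled dot-product attention; they differ only in where the keys and values come from. This is a minimal single-head NumPy sketch under assumed shapes and random weights, not the repository's module:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(query_src, context, w_q, w_k, w_v):
    """Single-head scaled dot-product attention.

    query_src: (n_q, d_model), context: (n_kv, d_ctx).
    Self attention passes the same array for both arguments.
    """
    q = query_src @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)
d_model, d_ctx, d_head = 8, 6, 4              # hypothetical sizes
tokens = rng.standard_normal((16, d_model))   # e.g. a flattened 4x4 feature map
cond = rng.standard_normal((5, d_ctx))        # e.g. 5 condition tokens

w_q = rng.standard_normal((d_model, d_head))
w_k = rng.standard_normal((d_model, d_head))
w_v = rng.standard_normal((d_model, d_head))
# Self attention: queries, keys and values all come from the image tokens
self_out, self_w = attention(tokens, tokens, w_q, w_k, w_v)

w_k_c = rng.standard_normal((d_ctx, d_head))
w_v_c = rng.standard_normal((d_ctx, d_head))
# Cross attention: queries from the image tokens, keys and values from the condition
cross_out, cross_w = attention(tokens, cond, w_q, w_k_c, w_v_c)
```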
- Time
- Label: animal (dog, cat), number (0,1,...9), gender (male, female)
- Image: Sketch2Image, Segmentation
- Text: not implemented
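For the Time condition, a common choice in diffusion UNets is a sinusoidal embedding of the timestep. Whether this repository uses exactly this form is an assumption; the function name and `max_period` value below are illustrative.

```python
import numpy as np

def sinusoidal_embedding(timesteps, dim, max_period=10000.0):
    """Map integer timesteps to (len(timesteps), dim) sin/cos features; dim must be even."""
    half = dim // 2
    # Geometrically spaced frequencies from 1 down to roughly 1 / max_period
    freqs = np.exp(-np.log(max_period) * np.arange(half) / half)
    args = np.asarray(timesteps, dtype=np.float64)[:, None] * freqs[None, :]
    return np.concatenate([np.sin(args), np.cos(args)], axis=-1)

emb = sinusoidal_embedding([0, 1, 500, 999], dim=128)
```

The resulting vector is typically passed through a small MLP and added to the feature maps inside each residual block.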
- DDPM: Denoising Diffusion Probabilistic Models
- DDIM: Denoising Diffusion Implicit Models
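The two samplers differ in how one reverse step is taken: DDPM adds fresh noise at every step, while DDIM (with eta = 0) is deterministic and can skip timesteps. The sketch below shows a single step of each in NumPy, assuming a linear beta schedule and a noise prediction `eps` supplied by some model; it is not the repository's sampler code.

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)  # assumed linear schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def ddpm_step(x_t, eps, t, rng):
    """One stochastic DDPM step from t to t-1, using variance sigma_t^2 = beta_t."""
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alphas[t])
    if t == 0:
        return mean  # no noise is added at the final step
    return mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)

def ddim_step(x_t, eps, t, t_prev):
    """One deterministic DDIM step (eta = 0) from t to t_prev < t.

    Pass t_prev = -1 for the final step, which returns the predicted x0.
    """
    a_t = alpha_bar[t]
    a_prev = alpha_bar[t_prev] if t_prev >= 0 else 1.0
    x0_pred = (x_t - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)
    return np.sqrt(a_prev) * x0_pred + np.sqrt(1.0 - a_prev) * eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal((2, 1, 32, 32))
eps = rng.standard_normal(x0.shape)
t = 999
x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
# Given the true noise, a single DDIM jump to the end recovers x0
x0_rec = ddim_step(x_t, eps, t, -1)
```

In practice `eps` comes from the trained network, and DDIM is run over a short subsequence of timesteps (for example 50 instead of 1000).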
- Unet: Encoder, Decoder
- Unconditional Diffusion Model
- Conditional diffusion model (label, image, text; the text embedder model still needs to be implemented)
- Variational autoencoder: Vanilla (only works for reconstruction), VQ
- Latent diffusion model
- Stable diffusion model
- Classifier-free guidance: not working
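For reference, classifier-free guidance in its standard formulation has two parts: labels are randomly dropped during training so one network learns both conditional and unconditional predictions, and the two predictions are blended at sampling time. The NumPy sketch below illustrates that formulation only; it is not a fix for the repository's implementation, and `null_label`, `drop_prob`, and the scale convention are assumptions.

```python
import numpy as np

def drop_labels(labels, null_label, drop_prob, rng):
    """Training time: replace each label with the null label with probability drop_prob."""
    mask = rng.random(labels.shape[0]) < drop_prob
    return np.where(mask, null_label, labels)

def cfg_noise(eps_cond, eps_uncond, guidance_scale):
    """Sampling time: eps = eps_uncond + s * (eps_cond - eps_uncond).

    With this convention s = 0 is unconditional, s = 1 is purely conditional,
    and s > 1 pushes the sample further toward the condition.
    """
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

rng = np.random.default_rng(0)
labels = np.array([0, 1, 2, 3])
null_label = 10  # hypothetical extra class index meaning "no condition"
train_labels = drop_labels(labels, null_label, drop_prob=0.1, rng=rng)

eps_c = np.ones((2, 3))   # stand-in conditional prediction
eps_u = np.zeros((2, 3))  # stand-in unconditional prediction
guided = cfg_noise(eps_c, eps_u, guidance_scale=3.0)
```

Some papers write the same blend as (1 + w) * eps_cond - w * eps_uncond, which corresponds to s = 1 + w here.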
In this repository, the Flow Matching model uses the same unconditional and conditional UNet backbones as the diffusion model.
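Because the backbone is shared, the main difference lies in the training target: instead of predicting noise, the network predicts a velocity. The following is a plain NumPy sketch of conditional flow matching with a linear path and Euler sampling. It deliberately does not use the Meta library's API, and all function names here are hypothetical.

```python
import numpy as np

def cfm_training_pair(x0, x1, t):
    """Linear path x_t = (1 - t) * x0 + t * x1 with target velocity x1 - x0."""
    t = t.reshape(-1, *([1] * (x1.ndim - 1)))  # broadcast over image dims
    x_t = (1.0 - t) * x0 + t * x1
    return x_t, x1 - x0

def flow_matching_loss(velocity_model, x1, rng):
    """MSE between the predicted velocity and the target velocity."""
    x0 = rng.standard_normal(x1.shape)  # noise sample at t = 0
    t = rng.random(x1.shape[0])         # uniform time in [0, 1)
    x_t, target = cfm_training_pair(x0, x1, t)
    return np.mean((velocity_model(x_t, t) - target) ** 2)

def euler_sample(velocity_model, x0, num_steps=10):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed Euler steps."""
    x = x0
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = np.full(x.shape[0], i * dt)
        x = x + dt * velocity_model(x, t)
    return x

rng = np.random.default_rng(0)
data = rng.standard_normal((4, 1, 32, 32))        # stand-in data batch (x1)
const_model = lambda x, t: np.full_like(x, 2.0)   # hypothetical placeholder network
fm_loss = flow_matching_loss(const_model, data, rng)
sample = euler_sample(const_model, np.zeros((4, 1, 32, 32)))
```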
MRI images are important in the medical field: they give doctors information for diagnosing brain diseases such as cancer. I have implemented age-conditioned MRI generation to produce synthetic data.
Dataset | Image-Size | FID (features=2048, ddim -> ddpm) |
---|---|---|
Mnist | 32x32 | 2.65 -> 0.89 |
Fashion-Mnist | 32x32 | 3.31 -> 2.42 |
Cifar10 | 32x32 | 5.54 -> 3.58 |
Dataset | Image-Size | FID (features=2048, ddim -> ddpm) |
---|---|---|
Mnist | 32x32 | 3.91 -> 1.16 |
Fashion-Mnist | 32x32 | 3.10 -> 2.15 |
Cifar10 | 32x32 | 5.66 -> 3.37 |
Gender | 64x64 | 3. |
CelebA | 64x64 | 3. |
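The FID values in the tables above compare the statistics of features extracted from real and generated images; lower is better. The sketch below computes the underlying Fréchet distance between two Gaussians in NumPy. It assumes features have already been extracted (the `features=2048` in the header presumably refers to 2048-dimensional Inception features, which is an assumption) and uses small random arrays as stand-ins, so it is not how the reported numbers were produced.

```python
import numpy as np

def frechet_distance(feats_real, feats_fake):
    """||mu1 - mu2||^2 + Tr(S1) + Tr(S2) - 2 * Tr(sqrt(S1 S2)) for (n, d) feature arrays."""
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    sigma1 = np.cov(feats_real, rowvar=False)
    sigma2 = np.cov(feats_fake, rowvar=False)
    # Tr(sqrt(S1 S2)) equals the sum of square roots of the eigenvalues of S1 S2,
    # which are real and non-negative for covariance matrices
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = np.sum((mu1 - mu2) ** 2)
    return float(diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt)

rng = np.random.default_rng(0)
real = rng.standard_normal((500, 8))  # stand-in features, 8-dim instead of 2048
fake = real + 1.0                     # same covariance, mean shifted by 1 per dim

fd_same = frechet_distance(real, real)   # close to 0
fd_shift = frechet_distance(real, fake)  # close to 8, the squared mean shift
```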
- Image Super Resolution (Low Resolution, Reconstructed Image, High Resolution)
- Sketch2Image (Sketch, Fake, Real)