- Mar-15-25: Preprint is available.
- Mar-13-25: Public release of the code and models.
- Mar-12-25: Paper accepted at ICLRw-2025.
- Contextual & Negative Prompting: Guides the diffusion process to generate domain-specific images while suppressing undesired content.
- Hard Cosine Similarity Filtration: Uses CLIP embeddings to filter out generated samples that do not meet semantic alignment criteria.
- Composite Image Mixing: Combines real and generative images using both pixel-wise and patch-wise strategies.
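The filtration and mixing steps listed above can be illustrated with a minimal NumPy sketch. It is not the repository's implementation. The function names, the `threshold`, `lam`, `patch`, and `p_gen` parameters, and the choice of a single reference embedding are all assumptions made for illustration. Plain arrays stand in for CLIP embeddings and images.

```python
import numpy as np


def cosine_filter(gen_emb, ref_emb, threshold=0.3):
    """Return a boolean mask that keeps generated samples whose embedding
    has cosine similarity >= threshold with the reference embedding.

    gen_emb: (N, D) array standing in for CLIP embeddings of generated images.
    ref_emb: (D,) array standing in for a reference embedding (assumption).
    threshold: hypothetical cut-off; "hard" means samples below it are dropped.
    """
    gen = gen_emb / np.linalg.norm(gen_emb, axis=1, keepdims=True)
    ref = ref_emb / np.linalg.norm(ref_emb)
    sims = gen @ ref  # cosine similarity of each generated sample
    return sims >= threshold


def pixel_mix(real, gen, lam=0.5):
    """Pixel-wise mixing: convex blend of a real and a generated image."""
    return lam * real + (1.0 - lam) * gen


def patch_mix(real, gen, patch=16, p_gen=0.5, rng=None):
    """Patch-wise mixing: replace each patch of the real image with the
    matching patch of the generated image with probability p_gen."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = real.shape[:2]
    out = real.copy()
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            if rng.random() < p_gen:
                out[y:y + patch, x:x + patch] = gen[y:y + patch, x:x + patch]
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.random((32, 32, 3))
    gen = rng.random((32, 32, 3))
    keep = cosine_filter(rng.normal(size=(4, 8)), rng.normal(size=8))
    print("kept samples:", int(keep.sum()))
    print(pixel_mix(real, gen).shape, patch_mix(real, gen, rng=rng).shape)
```

In this sketch, only samples that pass `cosine_filter` would be handed to `pixel_mix` or `patch_mix`. Both mixing functions expect the real and generated images to share the same shape.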
- Clone this repository and navigate to the DiffCoRe-Mix folder:
git clone https://github.com/khawar-islam/DiffCoRe-Mix.git
cd DiffCoRe-Mix
- Install the package:
conda create -n DiffCoreMix python=3.9.19 -y
conda activate DiffCoreMix
- Download the pre-trained CosXL model:
https://huggingface.co/cocktailpeanut/c/blob/main/cosxl.safetensors
- To run the augmentation process, use:
python main.py --dataset <DATASET_NAME> --output_folder <PATH_TO_OUTPUT_FOLDER> --aug_per <AUGMENTATION_PERCENTAGE>
- For instance, to augment the CUB200 dataset at a 30% augmentation rate:
python main.py --dataset cub200 --output_folder /path/to/cub200/train --aug_per 0.3
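When the augmentation is launched from a script rather than typed by hand, the command can be assembled programmatically. The following is a small sketch using only the flags shown above. `build_command` is a hypothetical helper, not part of the repository, and the range check assumes `--aug_per` is a fraction between 0 and 1, as the 0.3 value for 30% suggests.

```python
import shlex


def build_command(dataset, output_folder, aug_per):
    """Build the argument list for one main.py augmentation run."""
    # Assumption: --aug_per is a fraction in [0, 1] (0.3 means 30%).
    if not 0.0 <= aug_per <= 1.0:
        raise ValueError("aug_per must be between 0 and 1")
    return [
        "python", "main.py",
        "--dataset", dataset,
        "--output_folder", output_folder,
        "--aug_per", str(aug_per),
    ]


if __name__ == "__main__":
    cmd = build_command("cub200", "/path/to/cub200/train", 0.3)
    # Print the shell-quoted command for inspection.
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to start the run.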
If you use DiffCoRe-Mix in your research, please cite our paper:
@inproceedings{islam2025context,
title={Context-Guided Responsible Data Augmentation with Diffusion Models},
author={Islam, Khawar and Akhtar, Naveed},
booktitle={ICLR 2025 Workshop on Navigating and Addressing Data Problems for Foundation Models},
year={2025}
}

