Dawid Kopeć, Wojciech Kozłowski, Maciej Wizerkaniuk, Dawid Krutul, Jan Kocoń, and Maciej Zięba

WUST, Wybrzeże Stanisława Wyspiańskiego 27, 50-370 Wrocław, Poland

{wojciech.kozlowski, jan.kocon, maciej.zieba}@pwr.edu.pl
This repository contains the implementation of our novel super-resolution (SR) method, as presented in our paper published at ICCS 2025. The repository is designed with modularity and flexibility in mind, leveraging PyTorch Lightning for training, Hydra for configuration management, and Weights & Biases (W&B) for experiment tracking.
In this work, we present SupResDiffGAN, a novel hybrid architecture that combines the strengths of Generative Adversarial Networks (GANs) and diffusion models for super-resolution tasks. By leveraging latent space representations and reducing the number of diffusion steps, SupResDiffGAN achieves significantly faster inference times than other diffusion-based super-resolution models while maintaining competitive perceptual quality. To prevent discriminator overfitting, we propose adaptive noise corruption, ensuring a stable balance between the generator and the discriminator during training. Extensive experiments on benchmark datasets show that our approach outperforms traditional diffusion models such as SR3 and I2SB in efficiency and image quality. This work bridges the performance gap between diffusion- and GAN-based methods, laying the foundation for real-time applications of diffusion models in high-resolution image generation.
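To make the adaptive noise corruption idea concrete, below is a minimal, illustrative sketch of an accuracy-driven noise schedule for discriminator inputs. The class name, thresholds, and update rule are assumptions made for illustration only; the exact mechanism used by SupResDiffGAN is defined in the paper and in this repository's model code.

```python
# Minimal sketch of the adaptive noise corruption idea: Gaussian noise is
# added to the discriminator's inputs, and its strength increases when the
# discriminator becomes too accurate and decreases otherwise. The update
# rule, thresholds, and class name are illustrative assumptions, not the
# repository's actual implementation.
import torch


class AdaptiveNoiseCorruption:
    def __init__(self, sigma: float = 0.05, step: float = 0.01,
                 target_accuracy: float = 0.8) -> None:
        self.sigma = sigma                      # current noise strength
        self.step = step                        # adjustment speed
        self.target_accuracy = target_accuracy  # accuracy that triggers more noise

    def update(self, discriminator_accuracy: float) -> None:
        # Strengthen corruption when the discriminator is winning, relax it otherwise.
        if discriminator_accuracy > self.target_accuracy:
            self.sigma = min(self.sigma + self.step, 1.0)
        else:
            self.sigma = max(self.sigma - self.step, 0.0)

    def __call__(self, images: torch.Tensor) -> torch.Tensor:
        # Corrupt discriminator inputs (real or generated) with Gaussian noise.
        return images + self.sigma * torch.randn_like(images)
```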
- SupResDiffGAN: a new approach for the Super-Resolution task 🚀✨
Perceptual quality (LPIPS, lower is better). The best and second-best results are highlighted in bold and underline, respectively. Methods are categorized into Diffusion-based and GAN-based to reflect their distinct architectural frameworks.
| Model / Dataset | Imagenet | Celeb | Div2k | RealSR-nikon | RealSR-canon | Set14 | Urban100 |
|---|---|---|---|---|---|---|---|
| Metric | LPIPS ↓ | LPIPS ↓ | LPIPS ↓ | LPIPS ↓ | LPIPS ↓ | LPIPS ↓ | LPIPS ↓ |
| GAN-based methods | |||||||
| SRGAN | 0.3452 | 0.2441 | 0.3327 | 0.3464 | 0.3050 | 0.2901 | 0.3156 |
| ESRGAN | 0.2320 | 0.1903 | 0.2649 | 0.3380 | 0.3053 | 0.2375 | 0.2408 |
| Real-ESRGAN | 0.2123 | 0.1690 | 0.2562 | 0.3309 | 0.3020 | 0.2301 | 0.2285 |
| Diffusion-based methods | |||||||
| SR3 | 0.3519 | 0.2229 | 0.3396 | 0.4018 | 0.4008 | 0.3015 | 0.2428 |
| I2SB | 0.3755 | 0.2221 | 0.3309 | 0.4069 | 0.3867 | 0.3169 | 0.2635 |
| ResShift | 0.5360 | 0.3275 | 0.4724 | 0.4959 | 0.4671 | 0.4832 | 0.4822 |
| SupResDiffGAN | 0.3079 | 0.1875 | 0.2876 | 0.3970 | 0.3853 | 0.2789 | 0.2570 |
Inference efficiency (time per batch in seconds, lower is better). The best and second-best results are highlighted in bold and underline, respectively. Methods are categorized into Diffusion-based and GAN-based to reflect their distinct architectural frameworks.
| Model / Dataset | Imagenet | Celeb | Div2k | RealSR-nikon | RealSR-canon | Set14 | Urban100 |
|---|---|---|---|---|---|---|---|
| Metric | Time per batch [s] | Time per batch [s] | Time per batch [s] | Time per batch [s] | Time per batch [s] | Time per batch [s] | Time per batch [s] |
| GAN-based methods | |||||||
| SRGAN | 0.0671 | 0.0109 | 0.0193 | 0.0367 | 0.0113 | 0.0888 | 0.0070 |
| ESRGAN | 0.2188 | 0.0870 | 0.2316 | 0.2711 | 0.1504 | 0.2049 | 0.0821 |
| Real-ESRGAN | 0.1392 | 0.0816 | 0.1899 | 0.2468 | 0.1427 | 0.2361 | 0.1013 |
| Diffusion-based methods | |||||||
| SR3 | 1.9953 | 0.3072 | 7.6377 | 8.4242 | 3.6420 | 0.8627 | 1.5028 |
| I2SB | 1.6776 | 0.1184 | 6.7292 | 7.0910 | 3.1629 | 1.8049 | 1.2395 |
| ResShift | 2.2466 | 0.4394 | 8.6647 | 8.9677 | 4.1880 | 0.5983 | 1.6762 |
| SupResDiffGAN | 0.2954 | 0.1832 | 0.9333 | 1.0021 | 0.6114 | 0.3542 | 0.3206 |
Two representative SupResDiffGAN outputs: (top) 4× face super-resolution at 128×128→512×512 pixels; (bottom) 4× natural-image super-resolution at 125×93→500×372 pixels.
Qualitative comparison of visual performance on two example images from ImageNet. Low-quality inputs are on the left, while results from bicubic upscaling and seven SR models (SRGAN, ESRGAN, Real-ESRGAN, SR3, ResShift, I2SB, and ours) are on the right.
- Python >= 3.9
- PyTorch Lightning == 2.2.2
- CUDA-enabled GPU (recommended for training)
- Clone the repository:

  ```bash
  git clone https://github.com/Dawir7/SupResDiffGAN.git
  cd SupResDiffGAN
  ```

- Create a virtual environment and activate it:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install the required dependencies:

  ```bash
  pip install -r requirements-gpu.txt
  ```
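Optionally, you can verify that the installed PyTorch build can see your GPU before training. This is only a small sanity check, not part of the installation steps above:

```python
# Quick check that PyTorch detects a CUDA-capable GPU (recommended for training).
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```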
This repository uses Hydra for managing configurations. Configuration files are located in the `conf/` directory. You can override any configuration parameter directly from the command line:

```bash
python train_model.py model.name=ESRGAN dataset.batch_size=16 trainer.max_epochs=50
```

More information about overriding parameters can be found in the Hydra documentation (Basic Override syntax).
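For reference, a minimal Hydra entry point looks like the sketch below. This is an illustrative example, not the repository's actual `train_model.py`; it only shows how command-line overrides such as `model.name` or `trainer.max_epochs` end up in the composed configuration object.

```python
# Minimal Hydra entry point (illustrative): command-line overrides are merged
# into the config loaded from conf/config.yaml before main() is called.
import hydra
from omegaconf import DictConfig, OmegaConf


@hydra.main(version_base=None, config_path="conf", config_name="config")
def main(cfg: DictConfig) -> None:
    # e.g. `python this_script.py model.name=ESRGAN trainer.max_epochs=50`
    # changes cfg.model.name and cfg.trainer.max_epochs here.
    print(OmegaConf.to_yaml(cfg))


if __name__ == "__main__":
    main()
```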
- `config.yaml`: Default configuration file.
- `config_srgan.yaml`: Configuration for SRGAN.
- `config_esrgan.yaml`: Configuration for ESRGAN.
- `config_real_esrgan.yaml`: Configuration for Real-ESRGAN.
- `config_sr3.yaml`: Configuration for SR3.
- `config_i2sb.yaml`: Configuration for I2SB.
- `config_resshift.yaml`: Configuration for ResShift.
- `config_supresdiffgan.yaml`: Configuration for SupResDiffGAN.
- `config_supresdiffgan_without_adv.yaml`: Configuration for SupResDiffGAN without a discriminator or adversarial loss.
- `config_supresdiffgan_simple_gan.yaml`: Configuration for SupResDiffGAN with a discriminator but without Gaussian noise augmentation.
This repository integrates Weights & Biases (W&B) for experiment tracking. Follow these steps to get started:
- Login to W&B:

  ```bash
  wandb login
  ```

- Track Experiments:

  - Metrics, losses, and visualizations are automatically logged to your W&B project.
  - Customize the W&B project name in the configuration file in use, e.g. (a usage sketch follows this list):

    ```yaml
    wandb_logger:
      project: 'your_project' # your wandb project
      entity: 'your_entity' # your wandb entity
    ```

- View Results:

  - Visit https://wandb.ai and navigate to your project to view experiment results.
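As a rough illustration, the sketch below shows how a `wandb_logger` section like the one above is typically turned into a PyTorch Lightning logger; the repository's actual wiring may differ.

```python
# Illustrative only: building a PyTorch Lightning WandbLogger from the
# wandb_logger config fields. Extra keyword arguments such as `entity`
# are forwarded to wandb.init().
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import WandbLogger

wandb_logger = WandbLogger(project="your_project", entity="your_entity")
trainer = Trainer(logger=wandb_logger, max_epochs=50)
```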
This section outlines how to download the necessary datasets for training and evaluating the SupResDiffGAN model. We provide a convenient bash script to automate the download process.
- Activated virtual environment (as described in the Installation section).
- Note: If you haven't installed all GPU requirements using `requirements-gpu.txt`, the minimal libraries required for downloading the CelebA and ImageNet datasets are listed in `requirements-data.txt`. You can install these specifically using:

  ```bash
  pip install -r requirements-data.txt
  ```
The `get_data.sh` script will download the specified datasets to the appropriate directories (the exact locations are defined within the script). Please ensure you have sufficient disk space before running the script.
Notes:
- The specific implementation and sources for each dataset download are defined within the `get_data.sh` script. Refer to the script for more details on the download process for each dataset.
- Due to the potentially long download and processing times for some datasets, especially ImageNet and large RealSR variants, it is highly recommended to run the script within a terminal multiplexer such as `tmux` or `screen`. This allows the process to continue even if your SSH connection is interrupted.
- Crucially, each dataset is subject to its own license terms and conditions. By using any of these datasets, you are solely responsible for understanding and complying with the respective dataset's license. We, as the authors of this code repository, assume no responsibility for your usage of these datasets or any potential license violations. It is your responsibility to ensure your use adheres to the terms set forth by the dataset providers.
We strongly recommend that you familiarize yourself with the licensing terms of any dataset you choose to use before downloading and incorporating it into your workflow. Links to the official licenses are typically available on the dataset providers' websites.
- Ensure you are in the repository's root directory:

  ```bash
  cd SupResDiffGAN
  ```

- Run the `get_data.sh` script with the desired dataset flags. The script accepts the following flags:

  - `-i` or `--imagenet`: Downloads the ImageNet dataset.
  - `-c` or `--celeba`: Downloads the CelebA dataset.
  - `-d` or `--div2k`: Downloads the Div2k dataset.
  - `-r` or `--realsr`: Downloads the RealSR dataset.
  - `-s` or `--set14`: Downloads the Set14 dataset.
  - `-u` or `--urban100`: Downloads the Urban100 dataset.
- Download the ImageNet dataset:

  ```bash
  bash get_data.sh -i
  ```

- Download the ImageNet and CelebA datasets:

  ```bash
  bash get_data.sh -i -c
  ```

- Download supported datasets using full names:

  ```bash
  bash get_data.sh --celeba --div2k
  ```
To train a model, use the `train_model.py` script. Example:

```bash
python train_model.py -cn "config_supresdiffgan"
```

To evaluate a trained model, use the `evaluate_model.py` script. Example:

```bash
python evaluate_model.py "config_supresdiffgan"
```

More about configs in CONFIGS.md.
More about usage of Hydra flags: Hydra documentation
We provide pre-trained weights for SupResDiffGAN to facilitate evaluation and fine-tuning. These weights are trained on ImageNet and can be used for inference or as a starting point for further training.
To use a pre-trained model, specify the path to the checkpoint file in the `load_model` field of the configuration file. For example, in `config.yaml`:
```yaml
model:
  load_model: 'path/to/your/checkpoint_file.pth' # Path to the pre-trained model checkpoint
```

If you use this repository in your research, please cite our paper:
```bibtex
@inproceedings{kopec2025supresdiffgan,
  title={SupResDiffGAN: A New Approach for the Super-Resolution Task},
  author={Kope{\'c}, Dawid and Koz{\l}owski, Wojciech and Wizerkaniuk, Maciej and Krutul, Dawid and Koco{\'n}, Jan and Zi{\k{e}}ba, Maciej},
  booktitle={Proceedings of the International Conference on Computational Science (ICCS)},
  year={2025}
}
```
We would like to acknowledge the following repositories and works that served as inspiration or baselines for our research:
- PyTorch-GAN: A collection of PyTorch implementations of GANs.
- Real-ESRGAN-bicubic: A bicubic version of Real-ESRGAN for super-resolution tasks.
- Real-ESRGAN: A practical algorithm for general image restoration.
- ResShift: A novel approach for image super-resolution.
- I2SB: A diffusion-based method for image-to-image super-resolution.
We are grateful for the contributions of these projects to the field of super-resolution and deep learning.
This repository is licensed under the Academic Free License (AFL) v3.0. See the LICENSE.txt file for the full license text.
By using this repository, you agree to comply with the terms of the Academic Free License and any applicable third-party licenses.
Some parts of this repository are modified or adapted from other open-source projects mentioned in the Acknowledgement 🙏 section. These parts retain their original licenses, which are included in their respective directories. Please refer to the following for more details:
- Real-ESRGAN: Licensed under the BSD 3-Clause License. See the RealESRGAN/LICENSE.txt file for the full license text.
- ResShift: Licensed under the S-Lab License 1.0. See the ResShift/LICENSE.txt file for the full license text.
- I2SB: Licensed under the NVIDIA Source Code License. See the I2SB/LICENSE.txt file for the full license text.



