A comprehensive image enhancement pipeline that combines Real-ESRGAN super-resolution with GFPGAN facial restoration to dramatically improve image quality and restore facial details in low-resolution photographs.
This project implements a two-stage image enhancement process:
- Super-Resolution Enhancement: Uses Real-ESRGAN to upscale images by 4x while preserving fine details
- Facial Restoration: Applies GFPGAN to specifically enhance and restore facial features in the upscaled images
The system is designed to handle various image types but excels particularly with photographs containing human faces, making it ideal for restoring old photographs, enhancing low-quality images, and improving facial clarity.
- 4x Super-Resolution: Upscale images to 4 times their original resolution
- Advanced Facial Enhancement: Restore and enhance facial features using generative adversarial networks
- Automatic Model Management: Downloads and manages required model weights automatically
- GPU Acceleration: Leverages CUDA when available for faster processing
- Batch Processing Support: Process multiple images efficiently
- Memory Optimization: Implements tiling and padding for handling large images
graph TD
A[Input Image] --> B[RealESRGAN Super-Resolution]
B --> C[4x Upscaled Image]
C --> D[GFPGAN Face Enhancement]
D --> E[Final Enhanced Image]
F[Model Weights] --> G[SRVGGNetCompact]
F --> H[GFPGAN Model]
G --> B
H --> D
I[Configuration] --> J[Hardware Detection]
J --> K[CUDA Available?]
K -->|Yes| L[GPU Processing]
K -->|No| M[CPU Processing]
L --> B
M --> B
flowchart LR
subgraph "Input Processing"
A1[Image Loading] --> A2[Format Validation]
A2 --> A3[Preprocessing]
end
subgraph "Enhancement Pipeline"
B1[RealESRGAN] --> B2[Super-Resolution]
B2 --> B3[GFPGAN]
B3 --> B4[Face Restoration]
end
subgraph "Model Management"
C1[Weight Download] --> C2[Model Initialization]
C2 --> C3[Hardware Optimization]
end
A3 --> B1
C3 --> B1
C3 --> B3
B4 --> D1[Output Generation]
- Python 3.7+
- CUDA-compatible GPU (optional, but recommended for faster processing)
- At least 4GB of available disk space for model weights
pip install realesrgan gfpgan
pip install transformers accelerate safetensors diffusers
pip install torch torchvision opencv-python pillow requests
- Clone the repository:
git clone https://github.com/officiallyutso/ai-enhanced-image-restoration.git
cd ai-enhanced-image-restoration
- Install dependencies:
pip install -r requirements.txt
- Run the setup script to download model weights:
python setup.py
from image_enhancer import ImageEnhancer
# Initialize the enhancer
enhancer = ImageEnhancer()
# Enhance a single image
enhancer.enhance_image('input.jpg', 'output.jpg')
# Custom configuration
enhancer = ImageEnhancer(
scale_factor=4,
tile_size=512,
use_gpu=True,
face_enhancement=True
)
# Process with specific settings
result = enhancer.process_image(
input_path='low_res_image.jpg',
output_path='enhanced_image.jpg',
enhance_faces=True,
upscale_background=True
)
graph LR
A[Input Image] --> B[SRVGGNetCompact]
B --> C[Feature Extraction]
C --> D[Upsampling Layers]
D --> E[Reconstruction]
E --> F[4x Upscaled Output]
subgraph "Network Architecture"
G[3 Input Channels] --> H[64 Feature Channels]
H --> I[32 Convolution Layers]
I --> J[PReLU Activation]
J --> K[4x Upscale Factor]
end
sequenceDiagram
participant I as Input Image
participant D as Face Detector
participant G as GFPGAN Model
participant R as Real-ESRGAN
participant O as Output
I->>D: Detect faces
D->>G: Extract face regions
G->>G: Generate enhanced faces
G->>R: Upscale background
R->>O: Composite final image
O->>I: Return enhanced result
gantt
title Image Enhancement Timeline
dateFormat X
axisFormat %s
section Initialization
Model Loading :0, 3
section Processing
Super-Resolution :3, 8
Face Detection :8, 9
Face Enhancement :9, 12
Composition :12, 13
section Output
Image Saving :13, 14
graph LR
A[Original Image] -->|Load| B[Memory: 1x]
B -->|Super-Resolution| C[Memory: 4x]
C -->|Face Processing| D[Memory: 6x Peak]
D -->|Optimization| E[Memory: 4x]
E -->|Output| F[Memory: 1x]
The system automatically downloads the following pre-trained models:
Model | Size | Purpose | Download Source |
---|---|---|---|
realesr-general-x4v3.pth | ~65MB | Super-resolution | Real-ESRGAN v0.2.5.0 |
GFPGANv1.4.pth | ~348MB | Face restoration | GFPGAN v1.3.0 |
- CPU: Multi-core processor (4+ cores recommended)
- RAM: 8GB system memory
- Storage: 1GB free space for models and temporary files
- GPU: NVIDIA GPU with 6GB+ VRAM
- CPU: 8+ core processor
- RAM: 16GB+ system memory
- Storage: SSD with 2GB+ free space
ai-enhanced-image-restoration/
├── src/
│ ├── models/
│ │ ├── realesrgan_wrapper.py
│ │ └── gfpgan_wrapper.py
│ ├── utils/
│ │ ├── image_utils.py
│ │ └── model_utils.py
│ └── image_enhancer.py
├── weights/
│ ├── realesr-general-x4v3.pth
│ └── GFPGANv1.4.pth
├── examples/
│ ├── basic_usage.py
│ └── batch_processing.py
├── tests/
│ └── test_enhancement.py
├── requirements.txt
├── setup_models.py
└── README.md
ImageEnhancer(scale=4, tile_size=0, use_gpu=True, model_path='weights/')
enhance_image(input_path: str, output_path: str, enhance_faces: bool = True) -> bool
Enhances a single image with super-resolution and optional face restoration.
batch_enhance(input_dir: str, output_dir: str, file_extensions: list = ['.jpg', '.png']) -> dict
Processes multiple images in a directory.
get_enhancement_stats(input_path: str) -> dict
Returns processing statistics and image quality metrics.
Input Resolution | GPU Processing | CPU Processing | Output Quality |
---|---|---|---|
256x256 | 2.3s | 12.8s | Excellent |
512x512 | 4.1s | 28.2s | Excellent |
1024x1024 | 8.7s | 65.4s | Excellent |
2048x2048 | 18.3s | 156.7s | Excellent |
graph LR
A[Input PSNR: 22.5dB] --> B[RealESRGAN: 28.3dB]
B --> C[GFPGAN: 31.7dB]
D[Input SSIM: 0.65] --> E[RealESRGAN: 0.82]
E --> F[GFPGAN: 0.91]
# Reduce tile size for large images
enhancer = ImageEnhancer(tile_size=256)
# Manual model download
python setup_models.py --force-download
- Ensure faces are clearly visible in the input image
- Input resolution should be at least 64x64 pixels per face
- Avoid heavily compressed or artifacted input images
- Fork the repository
- Create a feature branch:
git checkout -b feature-name
- Make your changes and add tests
- Submit a pull request with a clear description
# Clone your fork
git clone https://github.com/yourusername/ai-enhanced-image-restoration.git
# Install development dependencies
pip install -r requirements-dev.txt
# Run tests
python -m pytest tests/
- Architecture: SRVGGNetCompact with 64 feature channels
- Upscaling Factor: 4x resolution enhancement
- Activation Function: PReLU for better gradient flow
- Memory Optimization: Tile-based processing for large images
- Model Version: GFPGANv1.4 for optimal face restoration
- Background Handling: Integrated Real-ESRGAN for non-face regions
- Face Detection: Automatic face localization and enhancement
- Blending: Seamless integration of enhanced faces with background
This project is licensed under the MIT License. See the LICENSE file for details.
- Real-ESRGAN: Xintao Wang et al. for the super-resolution framework
- GFPGAN: Tencent ARC Lab for the face restoration technology
- PyTorch Community: For the underlying deep learning infrastructure
If you use this project in your research, please cite:
@software{ai_enhanced_image_restoration,
title={AI-Enhanced Image Restoration},
author={Utso Sarkar},
year={2025},
url={https://github.com/officiallyutso/ai-enhanced-image-restoration}
}
Author: Utso Sarkar
GitHub: github.com/officiallyutso
Repository: ai-enhanced-image-restoration
For questions, issues, or contributions, please open an issue on the GitHub repository.