IMAGHarmony: Controllable Image Editing with Consistent Object Quantity and Layout

🗓️ Release

[2025/5/30] 🔥 We released the technical report of IMAGHarmony.
[2025/5/28] 🔥 We released the training and inference code of IMAGHarmony.
[2025/5/17] 🎉 We launched the project page of IMAGHarmony.

💡 Introduction

IMAGHarmony tackles the challenge of controllable image editing in multi-object scenes, where existing models struggle to keep object quantity and spatial layout consistent. To this end, IMAGHarmony introduces a structure-aware framework for quantity-and-layout consistent image editing (QL-Edit), enabling precise control over object count, category, and arrangement. We propose a harmony-aware (HA) module that jointly models object structure and semantics, and a preference-guided noise selection (PNS) strategy that stabilizes generation by selecting semantically aligned initial noise. Our method is trained and evaluated on HarmonyBench, a newly curated benchmark with diverse editing scenarios.
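
As a rough illustration of the noise selection idea described above, the sketch below draws several candidate seeds, scores each one, and keeps the best. It is a generic best-of-N selection under stated assumptions, not the PNS implementation from the paper: `select_seed` and `toy_score` are hypothetical names, and `toy_score` is only a placeholder for whatever preference or alignment score the real pipeline computes.

```python
import random
from typing import Callable, Sequence


def select_seed(candidate_seeds: Sequence[int],
                score_fn: Callable[[int], float]) -> int:
    """Return the candidate seed whose score is highest."""
    if not candidate_seeds:
        raise ValueError("need at least one candidate seed")
    return max(candidate_seeds, key=score_fn)


def toy_score(seed: int) -> float:
    """Placeholder score (assumption): a deterministic pseudo-random value.

    A real scorer would generate from the seed's initial noise and measure
    how well the result matches the target quantity and layout.
    """
    return random.Random(seed).random()


seeds = list(range(8))
best = select_seed(seeds, toy_score)
print(f"selected seed: {best}")
```

The chosen seed would then be used to initialize the noise for the final generation run, so that sampling starts from a candidate that already scored well.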

🚀 HarmonyBench Dataset Demo

🚀 Examples

Dual-Category Editing

🔧 Requirements

Python>=3.8
PyTorch>=2.0.0
CUDA>=11.8

conda create --name IMAGHarmony python=3.8.18
conda activate IMAGHarmony

# Install requirements
pip install -r requirements.txt
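
After installation, a quick check like the one below can confirm that the environment meets the minimum versions listed above. This is a minimal sketch, not part of the repository: `parse_version`, `meets_minimum`, and `check_environment` are hypothetical helper names, and the check reports PyTorch and CUDA as unavailable if `torch` cannot be imported.

```python
import re
import sys
from typing import Dict, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the leading numeric components, e.g. '2.1.0+cu118' -> (2, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two version strings component by component."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))  # pad so '3.8' compares like '3.8.0'
    need += (0,) * (width - len(need))
    return have >= need


def check_environment() -> Dict[str, bool]:
    """Report whether Python, PyTorch, and CUDA satisfy the README minimums."""
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    report = {"python": meets_minimum(python_version, "3.8")}
    try:
        import torch
    except ImportError:
        report["torch"] = False
        report["cuda"] = False
        return report
    report["torch"] = meets_minimum(torch.__version__, "2.0.0")
    cuda_version = torch.version.cuda  # None on CPU-only builds
    report["cuda"] = cuda_version is not None and meets_minimum(cuda_version, "11.8")
    return report


if __name__ == "__main__":
    print(check_environment())
```

Any `False` entry in the printed report points to the component that needs upgrading before training or inference.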

🌐 Download Models

You can download our models from Hugging Face. The other component models can be downloaded from their original repositories.

🚀 How to train

# Please download the HarmonyBench data first, or prepare your own images,
# and modify the path in run.sh
# Write the caption of each image in your train.json file
# Start training

sh run.sh
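
If you prepare your own images, the caption file mentioned above can be generated with a short script. The sketch below is an assumption-heavy example: the `image_file` and `text` keys follow the JSON layout used by the IP-Adapter training script, and it has not been confirmed that `train.py` in this repository expects the same keys, so check the dataset class before relying on it. The helper names and the sample file names are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def build_records(captions: Dict[str, str]) -> List[dict]:
    """Turn {image file name: caption} into a list of training records.

    Key names are an assumption borrowed from IP-Adapter's training data format.
    """
    return [
        {"image_file": name, "text": caption}
        for name, caption in sorted(captions.items())
    ]


def write_train_json(captions: Dict[str, str], out_path: str) -> List[dict]:
    """Write the records to out_path as UTF-8 JSON and return them."""
    records = build_records(captions)
    Path(out_path).write_text(
        json.dumps(records, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return records


if __name__ == "__main__":
    # Hypothetical images and captions; written to a temporary directory
    # so the demo never overwrites a real train.json.
    sample = {"0001.jpg": "three cats on a sofa", "0002.jpg": "two dogs in a park"}
    with tempfile.TemporaryDirectory() as tmp_dir:
        target = Path(tmp_dir) / "train.json"
        write_train_json(sample, str(target))
        print(target.read_text(encoding="utf-8"))
```

Captions that state the object count and category explicitly, as in the sample, match the kind of quantity and layout control the project targets.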

🚀 How to test

# Please convert your checkpoints
python convert_bin.py
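
For orientation, the sketch below shows what a checkpoint conversion of this kind usually involves in IP-Adapter-style projects: the training checkpoint is split by key prefix into an `image_proj` group and an `ip_adapter` group, and everything else is dropped. This is an assumption based on the conversion described in the IP-Adapter repository, not a description of what `convert_bin.py` actually does; in particular, any weights belonging to the HA module may be handled differently. `split_state_dict` is a hypothetical name, and plain dictionaries stand in for tensors so the example runs without PyTorch.

```python
from typing import Any, Dict, Mapping

IMAGE_PROJ_PREFIX = "image_proj_model."
ADAPTER_PREFIX = "adapter_modules."


def split_state_dict(state: Mapping[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Group checkpoint entries by prefix and strip the prefix from each key."""
    image_proj: Dict[str, Any] = {}
    ip_adapter: Dict[str, Any] = {}
    for key, value in state.items():
        if key.startswith(IMAGE_PROJ_PREFIX):
            image_proj[key[len(IMAGE_PROJ_PREFIX):]] = value
        elif key.startswith(ADAPTER_PREFIX):
            ip_adapter[key[len(ADAPTER_PREFIX):]] = value
        # Other entries (for example frozen UNet weights) are left out.
    return {"image_proj": image_proj, "ip_adapter": ip_adapter}


# Toy checkpoint: integers stand in for weight tensors.
demo_state = {
    "unet.conv_in.weight": 0,
    "image_proj_model.proj.weight": 1,
    "adapter_modules.1.to_k_ip.weight": 2,
}
converted = split_state_dict(demo_state)
print(sorted(converted["image_proj"]), sorted(converted["ip_adapter"]))
```

With real checkpoints, the input mapping would come from `torch.load(..., map_location="cpu")` and the result would be written with `torch.save`.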

# Please fill in your paths in test.py,
# then run

python test.py

Alternatively, you can try it in the Gradio demo:

python demo.py

Acknowledgement

We would like to thank the contributors to the InstantStyle and IP-Adapter repositories for their open research and exploration.

The IMAGHarmony code is available for both academic and commercial use. Users are permitted to generate images using this tool, provided they comply with local laws and exercise responsible use. The developers disclaim all liability for any misuse or unlawful activity by users.

Citation

If you find IMAGHarmony useful for your research and applications, please cite using this BibTeX:

@misc{shen2025imagharmonycontrollableimageediting,
      title={IMAGHarmony: Controllable Image Editing with Consistent Object Quantity and Layout}, 
      author={Fei Shen and Yutong Gao and Jian Yu and Xiaoyu Du and Jinhui Tang},
      year={2025},
      eprint={2506.01949},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.01949}, 
}

🕒 TODO List

👉 Our other projects:

IMAGEdit: Training-Free Controllable Video Editing with Consistent Object Layout. [Controllable multi-object video editing]
IMAGDressing: Controllable dressing generation. [Controllable dressing generation]
IMAGGarment: Fine-grained controllable garment generation. [Controllable garment generation]
IMAGHarmony: Controllable image editing with consistent object layout. [Controllable multi-object image editing]
IMAGPose: Pose-guided person generation with high fidelity. [Controllable multi-mode person generation]
RCDMs: Rich-contextual conditional diffusion for story visualization. [Controllable story generation]
PCDMs: Progressive conditional diffusion for pose-guided image synthesis. [Controllable person generation]
V-Express: Explores strong and weak conditional relationships for portrait video generation. [Controllable digital human generation]
FaceShot: Talkingface plugin for any character. [Controllable anime digital human generation]
CharacterShot: Controllable and consistent 4D character animation framework. [Controllable 4D character generation]
StyleTailor: An Agent for personalized fashion styling. [Personalized fashion agent]
SignVip: Controllable sign language video generation. [Controllable sign language generation]

📨 Contact

If you have any questions, please feel free to contact us at [email protected] and [email protected].
