Exhibited at the Art Gallery of Ontario, March 2025
HEADS UP: This project is in constant iteration, mainly on the CVZone/Python side. The latest Python iteration includes a gesture rule classifier snippet that switches from one text prompt to another inside TD (the updated TouchDesigner file is not included yet).
The Microscopic is a real-time generative AI system exhibited as a public video mapping installation. It merges multimodal user input with diffusion-based image generation to create immersive, reactive visuals. The project was designed to test how human gestures can condition and control AI imagery in live settings, using a custom pipeline that combines gesture recognition with diffusion model conditioning.
- Explore gesture-based control as a creative input for AI visual systems.
- Investigate image conditioning using predefined animations and structure-aware prompts.
- Demonstrate live integration between user input and Stable Diffusion using the TouchDiffusion plugin for TouchDesigner.
This system implements a rule-based classifier using cvzone.HandTrackingModule, OpenCV, and autopy to translate webcam-captured hand landmarks into real-time interaction modes:
- Virtual Mouse – Controls the mouse using the index fingertip position.
- Zoom Mode – Recognizes two-hand gestures to scale and reposition images.
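As a point of reference, here is a minimal, hypothetical sketch of the virtual-mouse mode, assuming a webcam at index 0 and a 640x480 capture; the full classifier in VirtualMouse_GestureControl_v02.py adds the zoom mode and the UDP output described further down.

```python
# Minimal virtual-mouse sketch (webcam index and thresholds are assumptions).
import cv2
import numpy as np
import autopy
from cvzone.HandTrackingModule import HandDetector

cam_w, cam_h = 640, 480
screen_w, screen_h = autopy.screen.size()
smoothing = 5                      # higher = smoother but slower pointer
prev_x, prev_y = 0, 0

cap = cv2.VideoCapture(0)
cap.set(3, cam_w)
cap.set(4, cam_h)
detector = HandDetector(detectionCon=0.8, maxHands=1)

while True:
    ok, img = cap.read()
    if not ok:
        break
    hands, img = detector.findHands(img)
    if hands:
        hand = hands[0]
        lm = hand["lmList"]                    # 21 landmarks, [x, y, z] each
        fingers = detector.fingersUp(hand)     # e.g. [0, 1, 0, 0, 0]

        # Index finger only: move the pointer, mapped from camera to screen space
        if fingers[1] == 1 and fingers[2] == 0:
            x = np.interp(lm[8][0], (0, cam_w), (0, screen_w))
            y = np.interp(lm[8][1], (0, cam_h), (0, screen_h))
            prev_x += (x - prev_x) / smoothing   # interpolated (smoothed) movement
            prev_y += (y - prev_y) / smoothing
            mx = max(0, min(prev_x, screen_w - 1))
            my = max(0, min(prev_y, screen_h - 1))
            autopy.mouse.move(mx, my)

        # Index + middle fingers close together: click
        elif fingers[1] == 1 and fingers[2] == 1:
            length, _, img = detector.findDistance(lm[8][0:2], lm[12][0:2], img)
            if length < 40:                     # pixel threshold, tune per setup
                autopy.mouse.click()

    cv2.imshow("Virtual Mouse", img)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```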
- cvzone
- OpenCV
- autopy
- socket (for UDP communication with TouchDesigner)
- Smoothed pointer movement using interpolation.
- Click recognition based on hand pose (index + middle fingers close together).
- Zoom level control by measuring distance between both hands.
- UDP data transmission of gesture states and zoom values to TouchDesigner (sketched below).
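The zoom mode and the UDP link can be reduced to the following sketch. The host (127.0.0.1), port (7000), the two-hand trigger pose, and the comma-separated message format are illustrative assumptions; match them to whatever the receiving DAT in your TouchDesigner project expects.

```python
# Sketch of zoom mode + UDP output (host, port, and message format are assumptions).
import socket
import cv2
from cvzone.HandTrackingModule import HandDetector

TD_ADDRESS = ("127.0.0.1", 7000)          # assumed host/port for TouchDesigner
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

cap = cv2.VideoCapture(0)
detector = HandDetector(detectionCon=0.8, maxHands=2)
start_dist = None
zoom = 0

while True:
    ok, img = cap.read()
    if not ok:
        break
    hands, img = detector.findHands(img)

    # Zoom mode: both hands visible with thumb + index raised on each
    if len(hands) == 2 and \
       detector.fingersUp(hands[0])[:2] == [1, 1] and \
       detector.fingersUp(hands[1])[:2] == [1, 1]:
        length, _, img = detector.findDistance(hands[0]["center"],
                                               hands[1]["center"], img)
        if start_dist is None:
            start_dist = length               # reference distance when the gesture starts
        zoom = int(length - start_dist)       # positive = hands moving apart
    else:
        start_dist = None

    # Send gesture state and zoom value to TouchDesigner as a simple CSV string
    mode = "zoom" if start_dist is not None else "idle"
    sock.sendto(f"{mode},{zoom}".encode(), TD_ADDRESS)

    cv2.imshow("Zoom Mode", img)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```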
See the full code example in VirtualMouse_GestureControl_v02.py.
We used TouchDiffusion, a real-time implementation of Stable Diffusion in TouchDesigner.
- Noise Map: The default randomness input.
- Author-Controlled RGB Animation: A high-contrast, particle-based animation was used as a second conditioning map. This helped the model "preserve the structure" while allowing creative variation.
- Gesture Input: Gestures sent via UDP dynamically transformed or influenced the diffusion parameters during runtime.
This conditioning approach allowed the diffusion model to maintain coherence with the structured reference (e.g., particle animations) while introducing stylistic variation based on the noise.
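The prompt switching mentioned in the HEADS UP note at the top works the same way as the gesture rules above. The sketch below is a condensed, hypothetical version of that idea: the fingers-up pattern returned by detector.fingersUp(hand) selects a text prompt, which is pushed to TouchDesigner over the same UDP link. The prompt strings, host, and port are placeholders, not the values used in the exhibition.

```python
# Condensed prompt-switching rule classifier (prompts, host, and port are placeholders).
# `fingers` is the list returned by detector.fingersUp(hand) in the sketches above.
import socket

TD_ADDRESS = ("127.0.0.1", 7001)
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Rule table: fingers-up pattern -> text prompt for TouchDiffusion
PROMPTS = {
    (0, 1, 0, 0, 0): "microscopic cells, soft light, organic texture",
    (0, 1, 1, 0, 0): "crystalline structures, macro photography, high contrast",
    (1, 1, 1, 1, 1): "fluid particles dispersing on a dark background",
}

def send_prompt_for(fingers):
    """Look up the prompt for the current hand pose and send it to TouchDesigner."""
    prompt = PROMPTS.get(tuple(fingers))
    if prompt is not None:
        sock.sendto(f"prompt,{prompt}".encode(), TD_ADDRESS)
```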
While The Microscopic was designed as a standalone installation, its architecture opens doors to future uses in:
- Theater and Live Performance (gesture-driven control of VFX, lighting, sound)
- Prototype testing for multimodal AI interaction
- Creative gaming and memory-based physical interaction systems
The combination of authored animations, structured prompts, and reactive gesture input makes this system ideal for immersive, performative applications!
- TouchDiffusion by @olegchomp
- cvzone
- OpenCV, autopy, and TouchDesigner community
- Paketa12 on YouTube
- I'm running Python 3.8.0 because MediaPipe didn't seem to run on later versions (at least for me).
- I'm using the main version of TouchDiffusion. I tried to install the portable version, but it simply wouldn't run. Don't get discouraged if either version doesn't work properly at first; reinstalling the main version did the trick for me.
- The uploaded TouchDesigner file contains the essential nodes for decoding incoming OSC data. However, I keep working on more recent versions for research purposes.