
Gesture-controlled real-time AI animation system that combines rule-based hand tracking and UDP control with Stable Diffusion via TouchDesigner. Explores multimodal interaction for generative art, live performance, and reactive installations.

Gesture-Controlled Real-Time AI Animation

Exhibited at the Art Gallery of Ontario, March 2025

HEADS UP: This project is under constant iteration, mainly on the CVZone/Python side. The latest Python iteration includes a gesture rule classifier snippet that switches from one text prompt to another inside TouchDesigner (the updated TouchDesigner file is not included yet).

Project Overview

The Microscopic is a real-time generative AI system exhibited as a public video-mapping installation. It merges multimodal user input with diffusion-based image generation to create immersive, reactive visuals. The project was designed to test how human gestures can condition and control AI imagery in live settings, using a custom pipeline that combines gesture recognition with diffusion-model conditioning.

Objectives

  • Explore gesture-based control as a creative input for AI visual systems.
  • Investigate image conditioning using predefined animations and structure-aware prompts.
  • Demonstrate live integration between user input and Stable Diffusion using the TouchDiffusion plugin for TouchDesigner.

Gesture Control System

This system implements a rule-based classifier using cvzone.HandTrackingModule, OpenCV, and autopy to translate webcam-captured hand landmarks into real-time interaction modes (a minimal sketch follows the list below):

Modes Implemented:

  1. Virtual Mouse – Controls the mouse using the index fingertip position.
  2. Zoom Mode – Recognizes two-hand gestures to scale and reposition images.
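As a rough illustration, a rule-based mode classifier along these lines can be built on cvzone's HandDetector. This is a minimal sketch, not the repo's exact logic: the finger patterns and window name are illustrative assumptions, and the real rules live in VirtualMouse_GestureControl_v02.py.

```python
import cv2
from cvzone.HandTrackingModule import HandDetector

cap = cv2.VideoCapture(0)                      # default webcam, assumed device 0
detector = HandDetector(detectionCon=0.8, maxHands=2)

while True:
    success, img = cap.read()
    if not success:
        break
    hands, img = detector.findHands(img)       # list of hand dicts with landmarks

    if len(hands) == 2:
        mode = "ZOOM"                           # two hands -> scale/reposition
    elif len(hands) == 1:
        # fingersUp returns [thumb, index, middle, ring, pinky] as 0/1 flags
        fingers = detector.fingersUp(hands[0])
        if fingers[1] == 1 and fingers[2] == 0:
            mode = "MOUSE_MOVE"                 # index only -> move the pointer
        elif fingers[1] == 1 and fingers[2] == 1:
            mode = "MOUSE_CLICK"                # index + middle -> click candidate
        else:
            mode = "IDLE"
    else:
        mode = "IDLE"

    cv2.putText(img, mode, (20, 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow("Gesture modes", img)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```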

Key Libraries:

  • cvzone
  • OpenCV
  • autopy
  • socket (for UDP communication with TouchDesigner)

Code Features:

  • Smoothed pointer movement using interpolation.
  • Click recognition based on hand pose (index + middle fingertips close together).
  • Zoom level control by measuring the distance between both hands.
  • UDP transmission of gesture states and zoom values to TouchDesigner.

See the full implementation in VirtualMouse_GestureControl_v02.py; a condensed sketch of the smoothing, click, and UDP logic follows.
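The sketch below illustrates the listed features under stated assumptions: the camera resolution, margin, smoothing factor, distance thresholds, message format, and the TouchDesigner host/port (127.0.0.1:7000) are all placeholders, not values from the repo.

```python
import socket

import autopy
import cv2
import numpy as np
from cvzone.HandTrackingModule import HandDetector

CAM_W, CAM_H = 640, 480
FRAME_MARGIN = 100                         # dead zone at the frame edges
SMOOTHING = 7                              # higher = smoother but laggier pointer
TD_ADDRESS = ("127.0.0.1", 7000)           # assumed host/port of the UDP In DAT

cap = cv2.VideoCapture(0)
cap.set(3, CAM_W)
cap.set(4, CAM_H)
detector = HandDetector(detectionCon=0.8, maxHands=2)
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
screen_w, screen_h = autopy.screen.size()
ploc_x = ploc_y = 0.0                      # previous pointer location

while True:
    success, img = cap.read()
    if not success:
        break
    hands, img = detector.findHands(img)

    if len(hands) == 1:
        lm = hands[0]["lmList"]
        x1, y1 = lm[8][0], lm[8][1]        # index fingertip (landmark 8)

        # Map camera space to screen space, then smooth by interpolating
        # toward the target instead of jumping to it.
        x3 = np.interp(x1, (FRAME_MARGIN, CAM_W - FRAME_MARGIN), (0, screen_w))
        y3 = np.interp(y1, (FRAME_MARGIN, CAM_H - FRAME_MARGIN), (0, screen_h))
        cloc_x = ploc_x + (x3 - ploc_x) / SMOOTHING
        cloc_y = ploc_y + (y3 - ploc_y) / SMOOTHING
        autopy.mouse.move(min(cloc_x, screen_w - 1), min(cloc_y, screen_h - 1))
        ploc_x, ploc_y = cloc_x, cloc_y

        # Click when index and middle fingertips come close together.
        length, _, img = detector.findDistance(lm[8][0:2], lm[12][0:2], img)
        clicked = length < 40
        if clicked:
            autopy.mouse.click()
        sock.sendto(f"mouse,{cloc_x:.0f},{cloc_y:.0f},{int(clicked)}".encode(),
                    TD_ADDRESS)

    elif len(hands) == 2:
        # Zoom level from the distance between the two index fingertips.
        p1 = hands[0]["lmList"][8][0:2]
        p2 = hands[1]["lmList"][8][0:2]
        zoom, _, img = detector.findDistance(p1, p2, img)
        sock.sendto(f"zoom,{zoom:.0f}".encode(), TD_ADDRESS)

    cv2.imshow("Virtual mouse", img)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
```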

Diffusion Image Generation

We used TouchDiffusion, a real-time implementation of Stable Diffusion in TouchDesigner.

Conditioning Strategy

  • Noise Map: The default randomness input.
  • Author-Controlled RGB Animation: A high-contrast, particle-based animation was used as a second conditioning map. This helped the model "preserve the structure" while allowing creative variation.
  • Gesture Input: Gestures sent via UDP dynamically transformed or influenced the diffusion parameters during runtime (decoded on the TouchDesigner side, as sketched below).
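On the receiving side, TouchDesigner can decode the incoming messages in the callbacks of a UDP In DAT and route them to whatever operators condition TouchDiffusion. The sketch below is hypothetical: the operator names ('zoom_level', 'prompt_index', 'prompt'), the message format, and the prompt strings are assumptions, not the shipped network.

```python
# udpin1_callbacks.py -- hypothetical callbacks for a UDP In DAT in TouchDesigner.
# Assumes the Python sender formats messages as "zoom,<dist>" or "mouse,x,y,click".

PROMPTS = ["microscopic cells, high detail", "flowing particle field"]

def onReceive(dat, rowIndex, message, bytes, peer):
    parts = message.strip().split(",")
    if parts[0] == "zoom":
        # Normalize the raw hand distance and expose it on a Constant CHOP,
        # where it can drive the diffusion conditioning parameters.
        op("zoom_level").par.value0 = float(parts[1]) / 300.0
    elif parts[0] == "mouse" and parts[3] == "1":
        # A click gesture toggles between two text prompts for the model.
        switch = op("prompt_index")
        current = int(switch.par.value0)
        switch.par.value0 = 1 - current
        op("prompt").text = PROMPTS[1 - current]
    return
```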

This conditioning approach allowed the diffusion model to maintain coherence with the structured reference (e.g., particle animations) while introducing stylistic variation based on the noise.

Future Possibilities

While The Microscopic was designed as a standalone installation, its architecture opens doors to future uses in:

  • Theater and Live Performance (gesture-driven control of VFX, lighting, sound)
  • Prototype testing for multimodal AI interaction
  • Creative gaming and memory-based physical interaction systems

The combination of authored animations, structured prompts, and reactive gesture input makes this system ideal for immersive, performative applications!

Considerations

  1. I'm running Python 3.8.0 because MediaPipe didn't seem to run on later versions (at least for me).
  2. I'm using the main TouchDiffusion version. I tried to install the portable version, but it simply wouldn't run. Don't get discouraged if either version doesn't work properly at first; reinstalling the main version worked for me.
  3. The uploaded TouchDesigner file contains the essential nodes for decoding incoming OSC data. However, I keep working on more recent versions for research purposes.
