Blender Agent Framework and Notebook #58

Closed
wants to merge 15 commits into from
kovtcharov
Collaborator

Overview

This pull request introduces a Blender Agent system for natural language-driven 3D content creation using local LLMs on Ryzen AI hardware. It provides a framework for controlling Blender through conversational AI, making 3D modeling accessible to users without technical expertise.

🚀 Key Features

Core Agent System

  • BlenderAgent: Advanced agent with full tool registry and complex scene manipulation
  • BlenderAgentSimple: Streamlined agent for basic object creation with minimal parameters
  • MCP Integration: Model Context Protocol (MCP) for communication with Blender
  • LLM-Driven: Support for both local LLMs and OpenAI API integration

Intelligent 3D Operations

  • Natural language to 3D object creation
  • Automated scene setup and management
  • Material and shader generation

Modular Architecture

  • SceneManager: Scene reset, clearing, and diagnostics
  • ObjectManager: Sphere creation, lighting, and texture loading
  • MaterialManager: Ground and atmosphere material generation
  • RenderManager: Volume rendering and color grading
  • ViewManager: Viewport settings and workspace management

🏗️ Architecture Components

Agent Framework (src/gaia/agents/base/)

├── agent.py          # Base agent class with tool registry
├── console.py        # Rich console output and progress indicators
├── tools.py          # Tool registration and decorator system
└── __init__.py       # Agent module exports

Key Features

🔄 Intelligent State Management

  • Execution States: PLANNING → EXECUTING_PLAN → COMPLETION with automatic transitions
  • Plan Enforcement: Forces LLM to create detailed plans before tool execution
  • Error Recovery: Built-in retry logic and graceful failure handling
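
The state progression above can be sketched as a small transition function. This is a minimal illustration, not the framework's actual code: the `AgentState` enum, the `next_state` helper, and its parameters are hypothetical names chosen for the example; only the three state names come from the description.

```python
from enum import Enum, auto


class AgentState(Enum):
    """The three execution states named in the description."""
    PLANNING = auto()
    EXECUTING_PLAN = auto()
    COMPLETION = auto()


def next_state(state, has_plan, steps_remaining):
    """Pick the next state from the current one and the plan's progress (hypothetical helper)."""
    if state is AgentState.PLANNING:
        # Plan enforcement: stay in PLANNING until the LLM has produced a plan.
        return AgentState.EXECUTING_PLAN if has_plan else AgentState.PLANNING
    if state is AgentState.EXECUTING_PLAN:
        # Keep executing while steps remain, then finish.
        if steps_remaining > 0:
            return AgentState.EXECUTING_PLAN
        return AgentState.COMPLETION
    # COMPLETION is terminal.
    return AgentState.COMPLETION
```

Keeping the transition logic in one pure function makes the "automatic transitions" easy to unit-test without an LLM or a running Blender instance.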

🛠️ Dynamic Tool Integration

  • Auto-Discovery: Tools automatically registered and documented for LLM prompts
  • Parameter Validation: Automatic validation of required vs optional arguments
  • Safe Execution: Error handling with conversation context preservation
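
One way to picture decorator-based registration with required/optional validation is the sketch below. It is an assumption about the general shape, not the contents of tools.py: `TOOL_REGISTRY`, `tool`, `call_tool`, and the stub `create_object` signature are all hypothetical.

```python
import inspect

TOOL_REGISTRY = {}  # hypothetical registry: tool name -> metadata


def tool(func):
    """Register a function as a tool and record its required/optional parameters."""
    params = inspect.signature(func).parameters
    required = [n for n, p in params.items() if p.default is inspect.Parameter.empty]
    optional = [n for n in params if n not in required]
    TOOL_REGISTRY[func.__name__] = {
        "function": func,
        "description": (func.__doc__ or "").strip(),  # could be pasted into the LLM prompt
        "required": required,
        "optional": optional,
    }
    return func


def call_tool(name, args):
    """Validate arguments against the registry, then run the tool without raising."""
    entry = TOOL_REGISTRY.get(name)
    if entry is None:
        return {"status": "error", "error": f"unknown tool: {name}"}
    missing = [p for p in entry["required"] if p not in args]
    unknown = [a for a in args if a not in entry["required"] + entry["optional"]]
    if missing or unknown:
        return {"status": "error", "error": f"missing={missing} unknown={unknown}"}
    try:
        return {"status": "success", "result": entry["function"](**args)}
    except Exception as exc:  # report the failure instead of ending the conversation
        return {"status": "error", "error": str(exc)}


@tool
def create_object(kind, location=(0, 0, 0)):
    """Create a primitive of the given kind at a location (stub for illustration)."""
    return {"name": kind.capitalize(), "location": list(location)}
```

Returning an error dictionary rather than raising lets the agent feed the failure back to the LLM as context for its next step.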

🧠 Robust LLM Processing

  • Multi-Provider Support: Local LLM servers and OpenAI API compatibility
  • Advanced JSON Parsing: Multiple extraction strategies with regex fallbacks
  • Streaming Support: Real-time response display with progress indicators
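
As a rough illustration of "multiple extraction strategies with regex fallbacks", the sketch below tries three increasingly lenient approaches. The function name `extract_json` and the specific strategies are assumptions made for this example; the PR's actual parser may differ.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built programmatically

# Matches a JSON object wrapped in a fenced block, with or without a "json" tag.
FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL)


def extract_json(text):
    """Try progressively looser strategies to pull JSON out of LLM output."""
    # Strategy 1: the whole response is valid JSON.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Strategy 2: a JSON object inside a fenced code block.
    match = FENCED_JSON.search(text)
    if match:
        try:
            return json.loads(match.group(1))
        except json.JSONDecodeError:
            pass
    # Strategy 3: everything between the first "{" and the last "}".
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(text[start:end + 1])
        except json.JSONDecodeError:
            pass
    return None  # caller can send the model a format reminder and retry
```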

📋 Multi-Step Execution

  • Plan Creation: LLM generates detailed execution plans with tool sequences
  • Sequential Processing: Step-by-step execution with progress tracking
  • Context Management: Full conversation history with performance statistics

Blender Integration (src/gaia/agents/Blender/)

├── agent.py              # Main BlenderAgent implementation
├── agent_simple.py       # Simplified BlenderAgentSimple
├── app.py               # Demo and interactive applications
├── core/
│   ├── materials.py     # MaterialManager for shaders
│   ├── objects.py       # ObjectManager for 3D operations
│   ├── render.py        # RenderManager for output
│   ├── scene.py         # SceneManager for environment
│   └── view.py          # ViewManager for viewport
└── mcp_client.py        # Blender MCP communication client

🎯 Specialized 3D Assistant

  • Domain Expertise: Inherits robust Agent framework with Blender-specific tools and workflows
  • MCP Communication: Direct integration with Blender via the Model Context Protocol (MCP)
  • Natural Language Interface: Converts conversational requests into precise 3D operations
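
To make the client side of this communication concrete, here is a minimal sketch that assumes the Blender addon accepts JSON commands over a TCP socket. The `type`/`params` message shape, the `send_command` helper, and the host and port in the usage comment are assumptions for illustration; check the addon's source for the real wire format.

```python
import json
import socket


def send_command(sock, command_type, params=None, bufsize=8192):
    """Send one JSON command and read until a complete JSON reply arrives."""
    # Message shape is an assumption: {"type": ..., "params": {...}}.
    request = {"type": command_type, "params": params or {}}
    sock.sendall(json.dumps(request).encode("utf-8"))
    buffer = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            raise ConnectionError("connection closed before a full reply arrived")
        buffer += chunk
        try:
            return json.loads(buffer.decode("utf-8"))
        except ValueError:
            continue  # reply is incomplete (or split mid-character); keep reading


# Usage (host and port are assumptions; use the values your addon reports):
# with socket.create_connection(("localhost", 9876)) as sock:
#     print(send_command(sock, "get_scene_info"))
```

Passing the socket in, rather than opening it inside the function, keeps the helper testable with `socket.socketpair()` and no running Blender.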

🛠️ Core Tool Suite

  • Scene Management: clear_scene(), get_scene_info(), diagnose_scene()
  • Object Operations: create_object(), modify_object(), delete_object(), get_object_info()
  • Material System: set_material_color() with RGBA support
  • Code Execution: execute_blender_code() for advanced operations

🧠 Intelligent Workflow System

  • Enforced Planning: Requires detailed plans before tool execution
  • Common Workflows: Pre-defined patterns for colored objects, scene clearing, object modification
  • Smart Object Tracking: Automatically tracks Blender-assigned object names and updates future plan steps
  • JSON Format Enforcement: Strict response validation with format reminders
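
Smart object tracking matters because Blender may rename an object on creation (for example, appending `.001` when the requested name is taken), which would invalidate later plan steps. A minimal sketch of the idea follows; the plan layout (`tool`, `tool_args`), the argument names, and `execute_plan` itself are hypothetical.

```python
def execute_plan(plan, run_tool):
    """Run plan steps in order, rewriting later steps when Blender renames an object."""
    results = []
    for index, step in enumerate(plan):
        result = run_tool(step["tool"], step["tool_args"])
        results.append(result)
        requested = step["tool_args"].get("name")
        actual = result.get("name") if isinstance(result, dict) else None
        if requested and actual and actual != requested:
            # Point every later reference to the requested name at the actual one.
            for later in plan[index + 1:]:
                for key, value in later["tool_args"].items():
                    if value == requested:
                        later["tool_args"][key] = actual
    return results
```

With a plan that creates "Cube" and then colors "Cube", a tool result reporting the name "Cube.001" causes the coloring step to target "Cube.001" instead.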

📋 Advanced Scene Creation

  • Multi-Step Execution: Plans and executes complex scene compositions
  • Object Relationship Management: Handles dependencies between objects and materials
  • Interactive Scene Builder: create_interactive_scene() for complex multi-object scenes
  • Error Recovery: Graceful handling of Blender API failures

Infrastructure (src/gaia/)

├── interface/
│   └── mcp_blender_client.py  # MCP server integration
├── llm/
│   └── llm_client.py          # LLM communication layer

NOTE: The MCP client-server implementation was obtained from the blender-mcp project (https://github.com/ahujasid/blender-mcp). We thank the authors for the implementation.

💡 Usage Examples

Simple Object Creation

from gaia.agents.Blender import BlenderAgentSimple

agent = BlenderAgentSimple()
response = agent.process_query("Create a red cube at position (2, 0, 0)")

Complex Multi-Turn Building

from gaia.agents.Blender import BlenderAgent

agent = BlenderAgent()
agent.process_query("Create a red cube, make sure to clear the scene first")

🧪 Testing Framework

Comprehensive Test Suite

  • Unit Tests: Individual component validation
  • Integration Tests: End-to-end MCP server communication
  • Mocked Tests: LLM response simulation
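
A mocked test of the kind listed above might look like the following sketch, which replaces the LLM with a `unittest.mock.Mock` returning a canned reply. `TinyAgent`, its `generate` call, and the reply format are hypothetical stand-ins, not the classes under test in this PR.

```python
import json
from unittest.mock import Mock


class TinyAgent:
    """Hypothetical stand-in for an agent that asks an LLM which tool to call."""

    def __init__(self, llm):
        self.llm = llm

    def process_query(self, query):
        reply = json.loads(self.llm.generate(query))
        return {"tool": reply["tool"], "args": reply.get("tool_args", {})}


def test_agent_uses_llm_reply():
    # Simulate the LLM: no server or model is needed for this test.
    llm = Mock()
    llm.generate.return_value = json.dumps(
        {"tool": "create_object", "tool_args": {"type": "CUBE"}}
    )
    agent = TinyAgent(llm)
    result = agent.process_query("Create a cube")
    assert result == {"tool": "create_object", "args": {"type": "CUBE"}}
    llm.generate.assert_called_once_with("Create a cube")
```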

Test Structure

tests/
├── conftest.py              # Test configuration and fixtures
├── test_agent.py            # BlenderAgent tests
├── test_agent_simple.py     # BlenderAgentSimple tests
└── test_mcp_client.py       # MCP communication tests

Running Tests

# Unit tests only
pytest tests/ -m "not integration"

# All tests (requires MCP server)
pytest tests/

📚 Documentation & Workshop Materials

Workshop Implementation (workshop/)

  • Interactive Notebook: Step-by-step LLM agent building
  • Ryzen AI Integration: Local hardware optimization guides
  • MCP Server Setup: Blender addon installation and configuration

⚡ Performance & Compatibility

Local LLM Support

  • Optimized for Ryzen AI hardware using the NPU/iGPU hybrid mode
  • Configurable model endpoints
  • Performance monitoring and statistics
  • Fallback to OpenAI API when needed
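
The fallback behavior can be sketched as trying providers in order and returning the first success. This is an illustration only: `generate_with_fallback` and the provider callables are hypothetical, and the real client may decide when to fall back differently.

```python
def generate_with_fallback(prompt, providers):
    """Try each (name, callable) provider in order and return the first success."""
    errors = []
    for name, generate in providers:
        try:
            return name, generate(prompt)
        except Exception as exc:  # e.g. the local server is not running
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Listing the local provider first keeps inference on the Ryzen AI hardware whenever it is available and uses the hosted API only after a failure.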

Blender Compatibility

  • Supports Blender 3.6+ via MCP addon
  • Thread-safe command execution
  • Error handling and recovery
  • Scene state validation

🔧 Configuration

VS Code Integration

  • Debug configuration for Blender agent
  • Custom terminal profiles for Windows
  • Notebook word wrap settings

Dependencies

  • Updated: Pydantic ≥ 2.9.2 for enhanced validation
  • Added: bpy package requirement for Blender integration
  • New Agent: Blender agent in setup.py configuration

🚨 Breaking Changes

  1. New Agent Type: Introduction of Blender-specific agent requires updated imports
  2. MCP Dependency: Requires MCP server addon installation in Blender
  3. Logging Changes: Default log levels changed from DEBUG to INFO for reduced verbosity
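
If the previous DEBUG output is still wanted after the logging change, it can usually be restored per logger with the standard library. The logger name `gaia` below is an assumption; use whatever name the package's modules actually log under.

```python
import logging

# Root logger at INFO, matching the new default described above.
logging.basicConfig(level=logging.INFO)

# Opt back in to verbose output for one package ("gaia" is an assumed logger name).
logging.getLogger("gaia").setLevel(logging.DEBUG)
```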

🎯 Success Criteria

This implementation successfully demonstrates:

  • ✅ Natural language to 3D object conversion
  • ✅ Modular and extensible architecture
  • ✅ Comprehensive testing and documentation
  • ✅ Local LLM optimization for Ryzen AI
  • ✅ Error handling and logging

@kovtcharov kovtcharov requested review from itomek and vgodsoe May 22, 2025 23:14
@kovtcharov kovtcharov self-assigned this May 22, 2025
@kovtcharov kovtcharov changed the title from "Blender Agent, Framework and Notebook" to "Blender Agent Framework and Notebook" May 22, 2025

@vgodsoe vgodsoe left a comment


This is stellar work! Nicely done.
Left a few suggestions to help with the new user going through the workflow.

kovtcharov and others added 5 commits June 4, 2025 12:30
Co-authored-by: Victoria Godsoe

## Overview

This hands-on workshop will guide participants through building local LLM agents using the Ryzen AI stack to interact with the Blender 3D creation suite. Participants will learn how to create an agent that can understand natural language instructions and translate them into Blender operations, all running locally on Ryzen AI hardware using the powerful NPU/iGPU hybrid execution mode.

Suggested change
- This hands-on workshop will guide participants through building local LLM agents using the Ryzen AI stack to interact with the Blender 3D creation suite. Participants will learn how to create an agent that can understand natural language instructions and translate them into Blender operations, all running locally on Ryzen AI hardware using the powerful NPU/iGPU hybrid execution mode.
+ This hands-on workshop will guide participants through building local LLM agents using Ryzen AI software to interact with the Blender 3D creation suite. Participants will learn how to create an agent that can understand natural language instructions and translate them into Blender operations, all running locally on Ryzen AI hardware using the powerful NPU/iGPU hybrid execution mode.

- **Extensible**: Easily build custom agents for specific use cases like 3D modeling in Blender
- **Agent Framework**: Comprehensive base classes with tool registration, planning, execution, and observability

GAIA works with Lemonade, a specialized LLM server that optimizes model execution on Ryzen AI hardware through NPU/iGPU hybrid mode, allowing LLMs to run as efficiently and fast as possible.

Suggested change
- GAIA works with Lemonade, a specialized LLM server that optimizes model execution on Ryzen AI hardware through NPU/iGPU hybrid mode, allowing LLMs to run as efficiently and fast as possible.
+ GAIA works with Lemonade Server, a specialized LLM server that optimizes model execution on Ryzen AI hardware through NPU/iGPU hybrid mode, allowing LLMs to run as efficiently and fast as possible.

@kovtcharov
Collaborator Author

Closing as this was already included in the v0.8.4 release. FYI, @vgodsoe.

@kovtcharov kovtcharov closed this Jun 10, 2025