
Commit b2b3d3a

Merge pull request #62 from trycua/fix/cua-lint

Add dev container, fix lints

2 parents 13d9ec5 + afce3b9


45 files changed: +1026 additions, -1882 deletions

.dockerignore (37 additions, 0 deletions)

@@ -0,0 +1,37 @@
+# Version control
+.git
+.github
+.gitignore
+
+# Environment and cache
+.venv
+.env
+.env.local
+__pycache__
+*.pyc
+*.pyo
+*.pyd
+.Python
+.pytest_cache
+.pdm-build
+
+# Distribution / packaging
+dist
+build
+*.egg-info
+
+# Development
+.vscode
+.idea
+*.swp
+*.swo
+
+# Docs
+docs/site
+
+# Notebooks
+notebooks/.ipynb_checkpoints
+
+# Docker
+Dockerfile
+.dockerignore

Dockerfile (55 additions, 0 deletions)

@@ -0,0 +1,55 @@
+FROM python:3.11-slim
+
+# Set environment variables
+ENV PYTHONUNBUFFERED=1 \
+    PYTHONDONTWRITEBYTECODE=1 \
+    PIP_NO_CACHE_DIR=1 \
+    PIP_DISABLE_PIP_VERSION_CHECK=1 \
+    PYTHONPATH="/app/libs/core:/app/libs/computer:/app/libs/agent:/app/libs/som:/app/libs/pylume:/app/libs/computer-server"
+
+# Install system dependencies for ARM architecture
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    git \
+    build-essential \
+    libgl1-mesa-glx \
+    libglib2.0-0 \
+    libxcb-xinerama0 \
+    libxkbcommon-x11-0 \
+    cmake \
+    pkg-config \
+    curl \
+    iputils-ping \
+    net-tools \
+    sed \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
+
+# Set working directory
+WORKDIR /app
+
+# Copy the entire project temporarily
+# We'll mount the real source code over this at runtime
+COPY . /app/
+
+# Create a simple .env.local file for build.sh
+RUN echo "PYTHON_BIN=python" > /app/.env.local
+
+# Modify build.sh to skip virtual environment creation
+RUN sed -i 's/python -m venv .venv/echo "Skipping venv creation in Docker"/' /app/scripts/build.sh && \
+    sed -i 's/source .venv\/bin\/activate/echo "Skipping venv activation in Docker"/' /app/scripts/build.sh && \
+    sed -i 's/find . -type d -name ".venv" -exec rm -rf {} +/echo "Skipping .venv removal in Docker"/' /app/scripts/build.sh && \
+    chmod +x /app/scripts/build.sh
+
+# Run the build script to install dependencies
+RUN cd /app && ./scripts/build.sh
+
+# Clean up the source files now that dependencies are installed
+# When we run the container, we'll mount the actual source code
+RUN rm -rf /app/* /app/.??*
+
+# Note: This Docker image doesn't contain the lume executable (macOS-specific)
+# Instead, it relies on connecting to a lume server running on the host machine
+# via host.docker.internal:3000
+
+# Default command
+CMD ["bash"]
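The three `sed` substitutions in the Dockerfile rewrite build.sh in place so it runs inside the container without creating or removing a virtualenv. The effect can be sketched with plain string replacement; the build.sh contents below are hypothetical stand-ins, not the real script:

```python
# Illustration of the Dockerfile's sed step: venv-related commands in
# build.sh become no-op echo lines so the script runs in the container.
# The script body here is a made-up stand-in for the real build.sh.
script = """\
python -m venv .venv
source .venv/bin/activate
pdm install
"""

patched = (
    script
    .replace("python -m venv .venv", 'echo "Skipping venv creation in Docker"')
    .replace("source .venv/bin/activate", 'echo "Skipping venv activation in Docker"')
)
print(patched)
```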

docs/Developer-Guide.md (62 additions, 11 deletions)

@@ -4,24 +4,29 @@
 
 The project is organized as a monorepo with these main packages:
 - `libs/core/` - Base package with telemetry support
-- `libs/pylume/` - Python bindings for Lume
-- `libs/computer/` - Core computer interaction library
+- `libs/computer/` - Computer-use interface (CUI) library
 - `libs/agent/` - AI agent library with multi-provider support
-- `libs/som/` - Computer vision and NLP processing library (formerly omniparser)
-- `libs/computer-server/` - Server implementation for computer control
-- `libs/lume/` - Swift implementation for enhanced macOS integration
+- `libs/som/` - Set-of-Mark parser
+- `libs/computer-server/` - Server component for VM
+- `libs/lume/` - Lume CLI
+- `libs/pylume/` - Python bindings for Lume
 
 Each package has its own virtual environment and dependencies, managed through PDM.
 
 ### Local Development Setup
 
-1. Clone the repository:
+1. Install Lume CLI:
+```bash
+/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"
+```
+
+2. Clone the repository:
 ```bash
 git clone https://github.com/trycua/cua.git
 cd cua
 ```
 
-2. Create a `.env.local` file in the root directory with your API keys:
+3. Create a `.env.local` file in the root directory with your API keys:
 ```bash
 # Required for Anthropic provider
 ANTHROPIC_API_KEY=your_anthropic_key_here
@@ -30,7 +35,7 @@ ANTHROPIC_API_KEY=your_anthropic_key_here
 OPENAI_API_KEY=your_openai_key_here
 ```
 
-3. Run the build script to set up all packages:
+4. Run the build script to set up all packages:
 ```bash
 ./scripts/build.sh
 ```
@@ -41,9 +46,9 @@ This will:
 - Set up the correct Python path
 - Install development tools
 
-4. Open the workspace in VSCode or Cursor:
+5. Open the workspace in VSCode or Cursor:
 ```bash
-# Using VSCode or Cursor
+# For Cua Python development
 code .vscode/py.code-workspace
 
 # For Lume (Swift) development
@@ -56,9 +61,55 @@ Using the workspace file is strongly recommended as it:
 - Enables debugging configurations
 - Maintains consistent settings across packages
 
+### Docker Development Environment
+
+As an alternative to running directly on your host machine, you can use Docker for development. This approach has several advantages:
+
+- Ensures consistent development environment across different machines
+- Isolates dependencies from your host system
+- Works well for cross-platform development
+- Avoids conflicts with existing Python installations
+
+#### Prerequisites
+
+- Docker installed on your machine
+- Lume server running on your host (port 3000): `lume serve`
+
+#### Setup and Usage
+
+1. Build the development Docker image:
+```bash
+./scripts/run-docker-dev.sh build
+```
+
+2. Run an example in the container:
+```bash
+./scripts/run-docker-dev.sh run computer_examples.py
+```
+
+3. Get an interactive shell in the container:
+```bash
+./scripts/run-docker-dev.sh run --interactive
+```
+
+4. Stop any running containers:
+```bash
+./scripts/run-docker-dev.sh stop
+```
+
+#### How it Works
+
+The Docker development environment:
+- Installs all required Python dependencies in the container
+- Mounts your source code from the host at runtime
+- Automatically configures the connection to use host.docker.internal:3000 for accessing the Lume server on your host machine
+- Preserves your code changes without requiring rebuilds (source code is mounted as a volume)
+
+> **Note**: The Docker container doesn't include the macOS-specific Lume executable. Instead, it connects to the Lume server running on your host machine via host.docker.internal:3000. Make sure to start the Lume server on your host before running examples in the container.
+
 ### Cleanup and Reset
 
-If you need to clean up the environment and start fresh:
+If you need to clean up the environment (non-docker) and start fresh:
 
 ```bash
 ./scripts/cleanup.sh
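The `.env.local` file described in the guide is a plain KEY=value file, one entry per line. A minimal sketch of parsing that format, assuming only the simple shape shown above (this is illustrative, not the project's actual loader):

```python
# Minimal .env.local-style parser: KEY=value per line, '#' comments and
# blank lines skipped. Illustrative only; not the project's real loader.
def parse_env(text: str) -> dict[str, str]:
    env: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

sample = """\
# Required for Anthropic provider
ANTHROPIC_API_KEY=your_anthropic_key_here
OPENAI_API_KEY=your_openai_key_here
"""
print(parse_env(sample))
```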

examples/agent_examples.py (11 additions, 9 deletions)

@@ -5,13 +5,13 @@
 import logging
 import traceback
 from pathlib import Path
-from datetime import datetime
 import signal
 
 from computer import Computer
 
 # Import the unified agent class and types
-from agent import ComputerAgent, AgentLoop, LLMProvider, LLM
+from agent import AgentLoop, LLMProvider, LLM
+from agent.core.computer_agent import ComputerAgent
 
 # Import utility functions
 from utils import load_dotenv_files, handle_sigint
@@ -23,18 +23,19 @@
 
 async def run_omni_agent_example():
     """Run example of using the ComputerAgent with OpenAI and Omni provider."""
-    print(f"\n=== Example: ComputerAgent with OpenAI and Omni provider ===")
+    print("\n=== Example: ComputerAgent with OpenAI and Omni provider ===")
+
     try:
         # Create Computer instance with default parameters
        computer = Computer(verbosity=logging.DEBUG)
 
        # Create agent with loop and provider
        agent = ComputerAgent(
            computer=computer,
-            # loop=AgentLoop.OMNI,
-            loop=AgentLoop.ANTHROPIC,
-            # model=LLM(provider=LLMProvider.OPENAI, name="gpt-4.5-preview"),
-            model=LLM(provider=LLMProvider.ANTHROPIC, name="claude-3-7-sonnet-20250219"),
+            # loop=AgentLoop.ANTHROPIC,
+            loop=AgentLoop.OMNI,
+            model=LLM(provider=LLMProvider.OPENAI, name="gpt-4.5-preview"),
+            # model=LLM(provider=LLMProvider.ANTHROPIC, name="claude-3-7-sonnet-20250219"),
            save_trajectory=True,
            trajectory_dir=str(Path("trajectories")),
            only_n_most_recent_images=3,
@@ -69,14 +70,15 @@ async def run_omni_agent_example():
            print(f"Task {i} completed")
 
    except Exception as e:
-        logger.error(f"Error in run_anthropic_agent_example: {e}")
+        logger.error(f"Error in run_omni_agent_example: {e}")
        traceback.print_exc()
        raise
    finally:
        # Clean up resources
        if computer and computer._initialized:
            try:
-                await computer.stop()
+                # await computer.stop()
+                pass
            except Exception as e:
                logger.warning(f"Error stopping computer: {e}")
 
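The example passes `only_n_most_recent_images=3` to the agent, which suggests that older screenshots are pruned from the conversation history while text messages survive. A hedged sketch of that pruning idea, using a made-up message structure rather than the agent library's real types:

```python
# Sketch of image-retention pruning: keep only the last `keep` image
# messages, drop older ones, leave text messages untouched.
# The dict-based message format here is hypothetical.
def trim_images(messages: list[dict], keep: int = 3) -> list[dict]:
    image_idx = [i for i, m in enumerate(messages) if m["type"] == "image"]
    drop = set(image_idx[:-keep]) if keep else set(image_idx)
    return [m for i, m in enumerate(messages) if i not in drop]

history = [
    {"type": "image", "id": 1},
    {"type": "text", "body": "clicked"},
    {"type": "image", "id": 2},
    {"type": "image", "id": 3},
    {"type": "image", "id": 4},
]
trimmed = trim_images(history, keep=3)  # oldest screenshot (id=1) dropped
```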

examples/computer_examples.py (6 additions, 4 deletions)

@@ -28,6 +28,8 @@
 async def main():
    try:
        print("\n=== Using direct initialization ===")
+
+        # Create computer with configured host
        computer = Computer(
            display="1024x768",  # Higher resolution
            memory="8GB",  # More memory
@@ -48,10 +50,10 @@ async def main():
        print(f"Accessibility tree: {accessibility_tree}")
 
        # Screen Actions Examples
-        print("\n=== Screen Actions ===")
-        screenshot = await computer.interface.screenshot()
-        with open("screenshot_direct.png", "wb") as f:
-            f.write(screenshot)
+        # print("\n=== Screen Actions ===")
+        # screenshot = await computer.interface.screenshot()
+        # with open("screenshot_direct.png", "wb") as f:
+        #     f.write(screenshot)
 
        screen_size = await computer.interface.get_screen_size()
        print(f"Screen size: {screen_size}")
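The example configures `Computer(display="1024x768", ...)`. A display spec of that shape splits cleanly into numeric dimensions; `parse_display` below is a hypothetical helper for illustration, not part of the computer library:

```python
# Illustrative parsing of a "WIDTHxHEIGHT" display spec like the one
# passed to Computer(display=...). parse_display is a made-up helper.
def parse_display(spec: str) -> tuple[int, int]:
    width, height = spec.lower().split("x")
    return int(width), int(height)

size = parse_display("1024x768")
```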

libs/agent/agent/__init__.py (2 additions, 4 deletions)

@@ -48,9 +48,7 @@
        # Other issues with telemetry
        logger.warning(f"Error initializing telemetry: {e}")
 
-from .core.factory import AgentFactory
-from .core.agent import ComputerAgent
 from .providers.omni.types import LLMProvider, LLM
-from .types.base import Provider, AgentLoop
+from .types.base import AgentLoop
 
-__all__ = ["AgentFactory", "Provider", "ComputerAgent", "AgentLoop", "LLMProvider", "LLM"]
+__all__ = ["AgentLoop", "LLMProvider", "LLM"]
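The diff narrows `__all__` to `["AgentLoop", "LLMProvider", "LLM"]`; `__all__` controls which names `from agent import *` exposes. A self-contained toy module (the name `toy_agent` is made up) shows the effect:

```python
# Demonstrates __all__ semantics with a synthetic module: only names
# listed in __all__ are exported by a wildcard import.
import sys
import types

toy = types.ModuleType("toy_agent")
exec(
    "AgentLoop = 'loop'\n"
    "LLMProvider = 'provider'\n"
    "ComputerAgent = 'agent'\n"
    "__all__ = ['AgentLoop', 'LLMProvider']",
    toy.__dict__,
)
sys.modules["toy_agent"] = toy

ns: dict = {}
exec("from toy_agent import *", ns)
public = {k for k in ns if not k.startswith("__")}
# ComputerAgent is defined in the module but excluded from the wildcard import
```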

libs/agent/agent/core/__init__.py (3 additions, 5 deletions)

@@ -1,6 +1,5 @@
 """Core agent components."""
 
-from .base_agent import BaseComputerAgent
 from .loop import BaseLoop
 from .messages import (
     create_user_message,
@@ -12,7 +11,7 @@
     ImageRetentionConfig,
 )
 from .callbacks import (
-    CallbackManager,
+    CallbackManager,
     CallbackHandler,
     BaseCallbackManager,
     ContentCallback,
@@ -21,9 +20,8 @@
 )
 
 __all__ = [
-    "BaseComputerAgent",
-    "BaseLoop",
-    "CallbackManager",
+    "BaseLoop",
+    "CallbackManager",
     "CallbackHandler",
     "BaseMessageManager",
     "ImageRetentionConfig",
