Docs review enhancements #2419
lanryweezy wants to merge 15 commits into VSCodium:master from lanryweezy:docs-review-enhancements
Reviewed and applied enhancements to README.md and files in the docs/ directory. Key changes:
- Improved clarity and accuracy of explanations.
- Updated outdated information and links (e.g., Node version in howto-build.md, installation link in getting-started.md).
- Added troubleshooting tips and clarifications (e.g., Remote SSH for Alpine, KDE global menu).
- Enhanced explanations of VSCodium-specific behaviors (e.g., PAT usage, portable mode differences, telemetry handling, extension marketplace).
- Standardized formatting for consistency where appropriate.
Allows agents to use LLMs via OpenRouter in addition to local Ollama. Key changes:
- `ILlmCommsService` and `LlmCommsService` updated to support multiple LLM providers (`ollama`, `openrouter`).
- `LlmCommsService.sendMessage` now constructs requests (endpoint, headers, payload) based on the specified provider.
- For OpenRouter, it includes `Authorization: Bearer <API_KEY>` and `HTTPReferer` headers. The API key is conceptually sourced from environment variables (`OPENROUTER_API_KEY`).
- Agent model configuration (in `*.agent.definition.json`, or how `AgentRunnerService` interprets model strings) now supports provider prefixes (e.g., `openrouter/some-model-name`, `ollama/another-model`). `AgentRunnerService` parses this to determine the provider and model for `LlmCommsService`.
- Error handling in `LlmCommsService` adapted for potential differences in OpenRouter error responses.
- Conceptual test performed to verify the new multi-provider logic in `LlmCommsService` and `AgentRunnerService`.
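The provider-prefix parsing and provider-aware request construction described in this commit could look roughly like the following sketch. The function names (`parseModelString`, `buildLlmRequest`) and the referer value are illustrative, not the actual service API; the endpoint URLs are the standard OpenRouter and Ollama chat endpoints.

```typescript
type LlmProvider = "ollama" | "openrouter";

interface ChatMessage { role: string; content: string }

interface LlmRequest {
  endpoint: string;
  headers: Record<string, string>;
  body: { model: string; messages: ChatMessage[] };
}

// Model strings may carry a provider prefix, e.g. "openrouter/some-model-name".
// Unprefixed strings fall back to local Ollama.
function parseModelString(spec: string): { provider: LlmProvider; model: string } {
  const slash = spec.indexOf("/");
  const head = slash === -1 ? "" : spec.slice(0, slash);
  if (head === "ollama" || head === "openrouter") {
    return { provider: head, model: spec.slice(slash + 1) };
  }
  return { provider: "ollama", model: spec };
}

function buildLlmRequest(
  provider: LlmProvider,
  model: string,
  messages: ChatMessage[],
): LlmRequest {
  if (provider === "openrouter") {
    // API key conceptually sourced from the environment, per the commit notes.
    const apiKey = process.env.OPENROUTER_API_KEY ?? "";
    return {
      endpoint: "https://openrouter.ai/api/v1/chat/completions",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${apiKey}`,
        "HTTP-Referer": "https://example.invalid/weezy", // illustrative referer value
      },
      body: { model, messages },
    };
  }
  // Default: local Ollama chat endpoint.
  return {
    endpoint: "http://localhost:11434/api/chat",
    headers: { "Content-Type": "application/json" },
    body: { model, messages },
  };
}
```

Keeping request construction in one pure function like this makes the per-provider differences easy to unit-test without any network access.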
Implements the initial phase of Modular Context Protocol (MCP) integration, allowing Weezy to act as an MCP provider for project task context. Key changes:
- Defined `McpContextPayload` and `McpTaskContextPayload` interfaces for a standardized MCP context structure.
- Designed the `IMcpContextService` interface and implemented a basic `McpContextService`:
  - It can retrieve project tasks by reading `PROJECT_TASKS.json` from the active workspace's `.weezy/` directory using `IFileService` and `IWorkspaceContextService`.
  - Formats task data into the `McpTaskContextPayload` structure.
- Implemented `McpHttpServer` (conceptual, using the Node.js `http` module):
  - Starts a local HTTP server (e.g., on `localhost:11435`).
  - Exposes an endpoint (e.g., `/mcp/context`) that accepts `type` (e.g., `weezy_project_tasks`) and `project_id` query parameters.
  - Uses `McpContextService` to fetch and return the context as a JSON response.
  - Includes basic request validation and error handling (400, 404, 500 status codes).
- Integrated the `McpHttpServer` lifecycle (conceptual `start()` and `stop()` methods) to be managed by Weezy's main application lifecycle service (`IMainProcessLifecycleService`).
- Performed a conceptual test of the MCP endpoint, verifying it can serve task data in the defined MCP format.
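The endpoint shape described above (route, query parameters, status codes) can be sketched as a pure handler plus a thin Node `http` wrapper. The route and parameters follow the commit description; the function names (`handleMcpRequest`, `startMcpServer`) and the `ContextLookup` shape are assumptions for illustration.

```typescript
import * as http from "http";

// Resolves a context type + project id to a payload, or undefined if unknown.
// In the real service this role is played by McpContextService.
type ContextLookup = (type: string, projectId: string) => object | undefined;

// Pure request handler: easy to test without binding a socket.
function handleMcpRequest(
  url: string,
  lookup: ContextLookup,
): { status: number; body: object } {
  const parsed = new URL(url, "http://localhost:11435");
  if (parsed.pathname !== "/mcp/context") {
    return { status: 404, body: { error: "not found" } };
  }
  const type = parsed.searchParams.get("type");
  const projectId = parsed.searchParams.get("project_id");
  if (!type || !projectId) {
    return { status: 400, body: { error: "missing type or project_id" } };
  }
  const context = lookup(type, projectId);
  if (!context) {
    return { status: 404, body: { error: "unknown context type" } };
  }
  return { status: 200, body: context };
}

// Conceptual server wiring on the port mentioned in the commit.
function startMcpServer(lookup: ContextLookup): http.Server {
  return http
    .createServer((req, res) => {
      const { status, body } = handleMcpRequest(req.url ?? "/", lookup);
      res.writeHead(status, { "Content-Type": "application/json" });
      res.end(JSON.stringify(body));
    })
    .listen(11435, "localhost");
}
```

Separating the handler from the server keeps the 400/404/200 logic testable, and `startMcpServer`'s return value gives the lifecycle service a handle to close on shutdown.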
Refactors `McpContextService` to source data for the `weezy_project_tasks` context type directly from `IAgentTaskStoreService` instead of reading `PROJECT_TASKS.json`. Key changes:
- `McpContextService` now injects `IAgentTaskStoreService`.
- For the `weezy_project_tasks` context:
  - It calls `agentTaskStoreService.getAllTasks()`.
  - Filters tasks based on a conceptual `projectId` (requires `IAgentTask` to include `projectId` or be implicitly scoped).
  - Maps `IAgentTask` objects to the `McpTaskInfo` structure for the MCP payload.
- Removed the direct dependency on `PROJECT_TASKS.json` for this context type in `McpContextService`.
- Assumed the `IAgentTask` interface will be updated to include a `projectId` for proper filtering. If not, the filtering logic might need adjustment or rely on workspace context.
- Conceptual test verified that the MCP endpoint now serves task data from the Memento-backed task store.
Introduces a system for AI agents to request and receive input from the human user via the AIAssistantPanel. Key changes:
- Designed the `IUserInputService` interface with methods like `requestChoice` and `requestText`. Includes eventing (`onInputRequested`, `onInputReceived`) for UI communication.
- Implemented `UserInputService` (basic, in-memory for pending requests) to manage input requests and their corresponding Promises.
- Enhanced `AIAssistantPanel`:
  - Subscribes to `IUserInputService.onInputRequested` to display prompts and input elements (e.g., choice buttons, text input field).
  - Calls `IUserInputService.submitChoice`/`submitText` when the user provides input.
  - Disables/enables normal chat input while awaiting specific agent input.
- Made `UserRequestInputTool` functional:
  - It now uses `IUserInputService` to make requests and `await` the user's response.
  - The user's input is returned as the tool's result.
- Modified `AgentRunnerService`'s ReAct loop:
  - It now `await`s the `user.requestInput` tool's execution, effectively pausing the agent's task until user input is received.
  - The user's response becomes the "Observation" for the agent's next LLM call.
- Updated the `SupervisorAgent.agent.definition.json` prompt to guide it on using `user.requestInput` for architecture review/approval.
- Performed a conceptual test of the end-to-end user input flow.
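The core of the pending-request mechanism above is a map from request id to a Promise resolver: the tool `await`s the Promise, and the panel resolves it. A minimal sketch, with method names following the commit text but internals (the `Map`, the `requestId` parameter) assumed:

```typescript
// In-memory pending-request store, as described in the commit.
// The real service additionally fires onInputRequested/onInputReceived events.
class UserInputService {
  private pending = new Map<string, (value: string) => void>();

  // Called from the agent side (UserRequestInputTool); the returned Promise
  // stays unresolved until the user responds in the AIAssistantPanel.
  requestText(requestId: string, prompt: string): Promise<string> {
    void prompt; // a real implementation would surface this to the UI
    return new Promise<string>((resolve) => this.pending.set(requestId, resolve));
  }

  // Called from the UI side when the user submits input.
  // Returns false if no request with that id is pending.
  submitText(requestId: string, value: string): boolean {
    const resolve = this.pending.get(requestId);
    if (!resolve) {
      return false;
    }
    this.pending.delete(requestId);
    resolve(value);
    return true;
  }
}
```

Because the agent's ReAct loop `await`s the tool, and the tool `await`s this Promise, the agent task pauses for free; no polling is needed.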
Enhances PMBot to perform detailed task planning by breaking down high-level epics into smaller, actionable sub-tasks and persisting them. Key changes:
- `PMBot.agent.definition.json` updated:
  - The prompt now guides PMBot through a ReAct loop to read an architecture plan (from `architecture_plan_path`), identify epics/high-level tasks, generate detailed sub-tasks for each, and use `pm.upsertTaskInJsonFile` to save them.
  - Instructs PMBot to include a linking field (e.g., `parentEpicId`) when creating sub-tasks.
  - PMBot delegates a summary (e.g., sub-task count) back to SupervisorAgent upon completion.
- `PMUpsertTaskInJsonFileTool` (`pm.upsertTaskInJsonFile`):
  - Confirmed its existing capabilities are sufficient to add new tasks with additional fields like `parentEpicId` and to manage `PROJECT_TASKS.json` (reading, parsing, adding/updating tasks, writing back).
  - Assumes task objects can accommodate new fields like `parentEpicId`, and potentially others like `acceptanceCriteria` or `estimatedEffort`, if the LLM generates them based on the prompt.
- `AgentRunnerService.executeAgentTask` for PMBot:
  - Implemented the ReAct loop, enabling PMBot to iteratively process epics from the plan, generate sub-tasks using its LLM, and call `pm.upsertTaskInJsonFile` for each.
- `SupervisorAgent.agent.definition.json` prompt tweaked to correctly interpret the summary from PMBot after detailed planning.
- Conceptual test performed, verifying the flow of PMBot reading a plan, generating and persisting sub-tasks to `PROJECT_TASKS.json`, and reporting back to SupervisorAgent.
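The upsert behavior attributed to `pm.upsertTaskInJsonFile` (match on id, update if present, append otherwise, pass extra fields like `parentEpicId` through untouched) can be sketched as a pure function over the parsed task array. The `Task` shape with an open index signature is an assumption; only `id` is required here.

```typescript
// Tasks are open-ended objects keyed by id; extra fields such as
// parentEpicId, acceptanceCriteria, or estimatedEffort pass through as-is.
interface Task { id: string; [key: string]: unknown }

function upsertTask(tasks: Task[], task: Task): Task[] {
  const idx = tasks.findIndex((t) => t.id === task.id);
  if (idx === -1) {
    // New task: append.
    return [...tasks, task];
  }
  // Existing task: merge the incoming fields over the stored ones.
  const next = tasks.slice();
  next[idx] = { ...tasks[idx], ...task };
  return next;
}
```

The real tool wraps this kind of logic with the read-parse-write cycle against `PROJECT_TASKS.json`; keeping the merge step pure makes the update semantics easy to verify.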
Introduces a DeveloperAgent capable of fetching tasks and creating basic placeholder file/directory structures for them. Key changes:
- `DeveloperAgent.agent.definition.json` created:
  - Role: Software Developer (generic for now).
  - Tools: `pm.getTasks` (new), `project.scaffoldDirectory`, `file.write`.
  - `can_call`: `PMBot` (to update task status).
  - The prompt guides the LLM through a ReAct loop: fetch tasks, select one, plan scaffolding (directories/files), use tools to create them, and update task status via PMBot.
- New `PMGetTasksTool` (`pm.getTasks`) implemented:
  - Reads `PROJECT_TASKS.json` from `projectPath` (using `file.read` via the injected `IAgentToolsService`).
  - Supports basic filtering (e.g., by status).
  - Returns the list of tasks.
  - Registered in `AgentToolsService`.
- `AgentRunnerService.executeAgentTask` updated to include a ReAct loop for `DeveloperAgent`.
- `SupervisorAgent.agent.definition.json` conceptually updated to potentially delegate tasks to `DeveloperAgent` after detailed planning by `PMBot`.
- Conceptual test performed, verifying DeveloperAgent's ability to fetch a task, create placeholder directories and files (e.g., for a React component), and update task status in `PROJECT_TASKS.json`.
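The "basic filtering (e.g., by status)" in `pm.getTasks` amounts to an optional filter applied after the JSON is read and parsed. A minimal sketch, assuming `PROJECT_TASKS.json` holds an array of objects with at least `id` and `status` fields (the `TaskRecord`/`getTasks` names are illustrative):

```typescript
// Assumed minimal shape of a record in PROJECT_TASKS.json.
interface TaskRecord { id: string; status: string }

// No filter returns everything; a status filter keeps only matching tasks.
function getTasks(all: TaskRecord[], filter?: { status?: string }): TaskRecord[] {
  if (!filter?.status) {
    return all;
  }
  return all.filter((t) => t.status === filter.status);
}
```

In the real tool, `all` would come from parsing the output of `file.read` on `projectPath/PROJECT_TASKS.json`.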
Enhances DeveloperAgent to generate actual code content using its LLM and write it to files, moving beyond basic scaffolding. Key changes:
- `DeveloperAgent.agent.definition.json` updated:
  - The ReAct-loop prompt now explicitly instructs the LLM to generate code content for the assigned task (e.g., React component code, Python function code).
  - Guides the LLM to use the `file.write` tool, providing the generated code as the `content` parameter and specifying the target `filePath`.
  - Emphasizes that the LLM should generate complete, ready-to-write code blocks based on the task requirements.
  - After successful code generation and file writing, the agent is prompted to update task status via PMBot (e.g., to `code_generation_complete`).
- `AgentRunnerService` (for the `DeveloperAgent` ReAct loop):
  - Confirmed that the existing ReAct loop structure and `file.write` tool handling are sufficient for processing potentially large code content strings generated by the LLM.
  - Observations from `file.write` (e.g., success message) are fed back to the LLM for its next decision.
- Conceptual test performed, verifying that DeveloperAgent can take a task, use its LLM to generate code, and use `file.write` to save this code into the appropriate file in the project workspace.
Enhances DeveloperAgent to automatically format and lint code after generation, and to attempt to fix lint errors iteratively. Key changes:
- `DeveloperAgent.agent.definition.json` updated:
  - The ReAct prompt now instructs the LLM to sequentially call `code.formatFile` and then `code.lintFile` after an initial `file.write` (code generation).
  - If `code.lintFile` returns errors, the prompt guides the LLM to analyze the errors and the current code (potentially using `file.read`), generate a corrected version, and use `file.write` (or `file.edit`) to apply the fix.
  - This creates a lint-fix-recheck sub-loop that runs until no errors remain or the maximum number of iterations is hit.
- `CodeFormatFileTool` (`code.formatFile`) and `CodeLintFileTool` (`code.lintFile`):
  - Reviewed their implementations. They use `TerminalSandboxService` for `npx prettier` and `npx eslint`.
  - `CodeLintFileTool`'s output parsing (for JSON or exit codes) is crucial for providing structured error details to the LLM. Confirmed its `outputSchema` reflects this structure.
  - Considered adding an auto-fix option to `code.lintFile` (e.g., `eslint --fix`), but decided to keep the fix logic with the LLM for now, allowing it to decide between rewriting or targeted edits (once `file.edit` is more robustly used by LLMs).
- `AgentRunnerService` (for the `DeveloperAgent` ReAct loop):
  - Confirmed the existing ReAct loop handles the iterative nature of this format-lint-fix cycle, driven by the LLM's actions.
  - Observations from `code.formatFile` and `code.lintFile` (especially structured lint errors) are passed back to the LLM for decision-making.
- Conceptual test performed, verifying the iterative workflow of code generation, formatting, linting, and LLM-guided error correction.
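The lint-fix-recheck sub-loop has a simple control-flow skeleton. In the sketch below the `lint` and `fix` steps are injected as functions (in the real flow they are the `code.lintFile` tool and an LLM call followed by `file.write`/`file.edit`), so the bounded-retry logic can be shown without running prettier or eslint. All names here are illustrative.

```typescript
// Structured lint outcome, standing in for code.lintFile's parsed output.
interface LintResult { errors: string[] }

// Run lint; while errors remain and iterations are left, ask `fix` for a
// corrected version and re-lint. Returns the final code and whether it is clean.
async function lintFixLoop(
  code: string,
  lint: (code: string) => Promise<LintResult>,
  fix: (code: string, errors: string[]) => Promise<string>,
  maxIterations = 3,
): Promise<{ code: string; clean: boolean }> {
  let current = code;
  for (let i = 0; i < maxIterations; i++) {
    const result = await lint(current);
    if (result.errors.length === 0) {
      return { code: current, clean: true };
    }
    current = await fix(current, result.errors);
  }
  // Out of iterations: report the final state honestly.
  const final = await lint(current);
  return { code: current, clean: final.errors.length === 0 };
}
```

The iteration cap matters: without it, an LLM that keeps producing the same broken fix would loop forever, which is why the commit caps the sub-loop at "max iterations".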
Empowers QAAgent to be LLM-driven for generating test scripts, writing them to files, executing the tests, and reporting results. Key changes:
- `QAAgent.agent.definition.json` refined:
  - The ReAct prompt was updated to guide the LLM through: reading the code under test (`file.read`), generating test code, writing test scripts (`file.write`), running tests (`test.runFileTests` or `test.runProjectTests`), analyzing results, and using `report.generate` (stub) before delegating to `PMBot`.
  - Emphasis on the LLM generating test code appropriate for the project's stack (context provided in the task input).
- `test.runFileTests` and `test.runProjectTests` tools reviewed:
  - Confirmed their existing capability to run tests via `TerminalSandboxService`.
  - Acknowledged that robust parsing of diverse test runner outputs for the LLM remains an area for future enhancement. For now, the LLM may need to parse raw stdout/stderr or rely on exit codes and basic summary parsing.
- `AgentRunnerService.executeAgentTask` for `QAAgent`:
  - Confirmed the ReAct loop structure supports the sequence of actions for test generation and execution, passing the necessary observations (code content, file write status, test execution output) back to the LLM.
- `DeveloperAgent.agent.definition.json` (delegation):
  - Prompt updated to guide DeveloperAgent, after completing code generation, formatting, and linting, to DELEGATE to `QAAgent`, providing necessary context such as code file paths and the task description.
- Conceptual test performed, validating the end-to-end flow from DeveloperAgent delegation to QAAgent generating tests, executing them, and reporting results.
Enhances DeveloperAgent to use the more sophisticated `file.edit` tool for applying targeted fixes based on QA feedback or linting errors, rather than always rewriting entire files. Key changes:
- `DeveloperAgent.agent.definition.json` refined:
  - The ReAct prompt was updated to handle bug-fix tasks and iterative improvements. When provided with QA reports or lint errors, it instructs the LLM to:
    - Use `file.read` to load the problematic code.
    - Analyze the issue and plan a targeted fix.
    - Choose an appropriate `file.edit` operation (e.g., `replace_lines`, `insert_lines`, `replace_string`) and formulate its arguments (line numbers, content, find/replace strings).
    - After applying the fix with `file.edit`, re-run linters/formatters (`code.formatFile`, `code.lintFile`).
    - Delegate back to `QAAgent` for re-testing if the fix was for QA-reported issues.
- `SupervisorAgent.agent.definition.json` (and conceptually `PMBot`):
  - Prompts updated to guide them on how to delegate bug-fixing tasks to `DeveloperAgent`, including providing context such as the path to the problematic code file, details of the test failures or lint errors, and potentially the path to the relevant test file.
- `FileEditTool` (`file.edit`):
  - Confirmed its description and parameter list (as provided to the LLM via `AgentToolsService.formatToolDefinitionsForPrompt`) are clear enough for the LLM to specify operations and arguments correctly.
- Conceptual test performed, verifying the workflow: QAAgent reports a bug, Supervisor/PMBot delegates to DeveloperAgent, DeveloperAgent uses `file.read` to analyze and `file.edit` to apply a targeted fix, then re-lints/formats and delegates back to QAAgent for re-testing.
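The three `file.edit` operations named above can be modeled as a discriminated union applied to the file's text. The operation names come from the commit; the exact argument names (`startLine`, `afterLine`, `find`/`replace`) and 1-based, inclusive line-number convention are assumptions for this sketch.

```typescript
// One edit request, discriminated by operation name.
type EditOp =
  | { op: "replace_lines"; startLine: number; endLine: number; content: string }
  | { op: "insert_lines"; afterLine: number; content: string }
  | { op: "replace_string"; find: string; replace: string };

function applyEdit(text: string, edit: EditOp): string {
  const lines = text.split("\n");
  switch (edit.op) {
    case "replace_lines":
      // Line numbers assumed 1-based and inclusive.
      lines.splice(
        edit.startLine - 1,
        edit.endLine - edit.startLine + 1,
        ...edit.content.split("\n"),
      );
      return lines.join("\n");
    case "insert_lines":
      // afterLine = 0 inserts at the top of the file.
      lines.splice(edit.afterLine, 0, ...edit.content.split("\n"));
      return lines.join("\n");
    case "replace_string":
      // Replace every occurrence of the find string.
      return text.split(edit.find).join(edit.replace);
  }
}
```

Targeted operations like these are exactly why the commit prefers `file.edit` over wholesale `file.write` rewrites: the LLM only has to emit the changed region, which is cheaper and less likely to clobber unrelated code.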
Improves agent resilience by refining error reporting within the ReAct loop and updating agent prompts to guide LLMs on handling recoverable errors. Key changes:
- Identified common recoverable tool error patterns (e.g., file not found, network timeouts, non-critical CLI tool execution errors).
- `IAgentToolResult` and tool error reporting: ensured tools clearly distinguish between operational failures (the tool itself failed, e.g., a network error for `http.call`) and successful execution with a "negative" domain outcome (e.g., `code.lintFile` finding lint errors). The `error` field in `IAgentToolResult` is used for operational failures.
- `AgentRunnerService` ReAct loop modified:
  - When a tool returns `success: false` with an `error` message in `IAgentToolResult`, this error information is now explicitly formatted and included in the "Observation:" provided to the LLM for its next iteration.
  - The prompt to the LLM now also clearly indicates that its previous action resulted in an error, alongside the error details.
- Agent prompts updated for error handling:
  - All LLM-driven agents' `initial_prompt_template` files (ReAct guidance section) now include instructions on how to interpret and react to error observations.
  - LLMs are guided to analyze the error and then choose to: (a) retry the same tool with modified arguments, (b) try an alternative tool, or (c) DELEGATE to `SupervisorAgent` if the error seems unrecoverable or they are stuck.
- Introduced a conceptual `RETRY_ATTEMPTS_PER_STEP` (e.g., 1-2) in `AgentRunnerService` for a specific failed action before forcing a different action or escalation. The LLM is informed of the remaining retries.
- Conceptual test performed for scenarios like `http.call` timeouts and `file.read` on non-existent files, verifying that the LLM (guided by the new prompts and error-inclusive observations) can attempt retries or make more informed decisions about escalation.
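Folding a failed tool result into the next observation, including the remaining retry budget and the allowed recovery options, could look like the sketch below. The `IAgentToolResult` shape (`success`/`output`/`error`) follows the commit text; the `formatObservation` function and the exact wording are illustrative.

```typescript
// Assumed minimal shape of a tool result, per the commit's description.
interface AgentToolResult {
  success: boolean;
  output?: unknown;
  error?: string; // set only for operational failures
}

// Build the "Observation:" line fed back to the LLM after a tool runs.
function formatObservation(
  tool: string,
  result: AgentToolResult,
  retriesLeft: number,
): string {
  if (result.success) {
    return `Observation: ${tool} succeeded: ${JSON.stringify(result.output)}`;
  }
  // Operational failure: surface the error, the retry budget, and the
  // recovery options the prompts teach the LLM to choose between.
  return (
    `Observation: ${tool} FAILED with error: ${result.error ?? "unknown error"}. ` +
    `Retries left for this step: ${retriesLeft}. ` +
    `You may retry with modified arguments, try an alternative tool, ` +
    `or DELEGATE to SupervisorAgent if you are stuck.`
  );
}
```

Note that a lint run that *finds* errors would still go through the `success` branch here: lint findings are domain output, not an operational failure, which is exactly the distinction the commit draws.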