
Conversation

@majdyz
Contributor

@majdyz majdyz commented Nov 14, 2025

Summary

This PR implements a comprehensive Human In The Loop (HITL) block that allows agents to pause execution and wait for human approval/modification of data before continuing.

Screen.Recording.Nov.20.2025.from.Online.Video.Cutter.mp4

Key Features

  • Added WAITING_FOR_REVIEW status to AgentExecutionStatus enum
  • Created PendingHumanReview database table for storing review requests
  • Implemented HumanInTheLoopBlock that extracts input data and creates review entries
  • Added API endpoints at /api/executions/review for fetching and reviewing pending data (see the sketch after this list)
  • Updated execution manager to properly handle waiting status and resume after approval

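As a concrete illustration of the review API above, here is a hedged client-side sketch: the path and field names follow the ReviewActionRequest model and router prefix in this PR, while the base URL, port, and bearer-token auth are assumptions for illustration only.

import httpx

async def approve_review(node_exec_id: str, reviewed_data: dict, token: str) -> dict:
    # Base URL/port and auth scheme are assumptions, not part of this PR
    async with httpx.AsyncClient(base_url="http://localhost:8006") as client:
        resp = await client.post(
            f"/api/executions/review/{node_exec_id}/action",
            json={
                "action": "approve",             # or "reject"
                "reviewed_data": reviewed_data,  # optional reviewer edits (approve only)
                "message": "Looks good",         # optional reviewer note
            },
            headers={"Authorization": f"Bearer {token}"},
        )
        resp.raise_for_status()
        return resp.json()
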
Frontend Components

  • PendingReviewCard for individual review handling
  • PendingReviewsList for multiple reviews
  • FloatingReviewsPanel for graph builder integration
  • Integrated review UI into 3 locations: legacy library, new library, and graph builder

Technical Implementation

  • Added proper type safety throughout with SafeJson handling
  • Optimized database queries using count functions instead of full data fetching
  • Fixed imports to be top-level instead of local
  • All formatters and linters pass

Test plan

  • Test Human In The Loop block creation in graph builder
  • Test block execution pauses and creates pending review
  • Test review UI appears in all 3 locations
  • Test data modification and approval workflow
  • Test rejection workflow
  • Test execution resumes after approval

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added Human-In-The-Loop review workflows to pause executions for human validation.
    • Users can approve or reject pending tasks, optionally editing submitted data and adding a message.
    • New "Waiting for Review" execution status with UI indicators across run lists, badges, and activity views.
    • Review management UI: pending review cards, list view, and a floating reviews panel for quick access.

@majdyz majdyz requested a review from a team as a code owner November 14, 2025 05:40
@majdyz majdyz requested review from Pwuts and Swiftyos and removed request for a team November 14, 2025 05:40
@github-project-automation github-project-automation bot moved this to 🆕 Needs initial review in AutoGPT development kanban Nov 14, 2025
@netlify

netlify bot commented Nov 14, 2025

Deploy Preview for auto-gpt-docs-dev canceled.

  • 🔨 Latest commit: 0422173
  • 🔍 Latest deploy log: https://app.netlify.com/projects/auto-gpt-docs-dev/deploys/6920841afa818c0008b53749

@coderabbitai

coderabbitai bot commented Nov 14, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds a human-in-the-loop review flow: schema/model changes, data APIs, a HumanInTheLoop block that pauses execution to create/fetch reviews, executor integration to enqueue and resume executions, and REST endpoints plus frontend components to list and act on pending reviews.

Changes

  • Schema & Data Model (backend/schema.prisma): Adds the PendingHumanReview model, ReviewStatus enum, and WAITING_FOR_REVIEW agent execution status; adds relations and indexes.
  • Human review data layer (backend/data/human_review.py): New HITL data API: a ReviewResult model and functions for get/create/upsert, querying pending reviews, updating actions, and the processed flag, plus helpers to surface NodeExecutionResult lists. Ownership and status transitions are enforced.
  • Human-in-the-Loop block (backend/blocks/human_in_the_loop.py, backend/util/test.py): New HumanInTheLoopBlock with Input/Output schemas; creates/reads reviews via the DB client, yields results or sets the node/graph to WAITING_FOR_REVIEW; test scaffolding updated to include graph_version.
  • Execution status & queries (backend/data/execution.py): Adds WAITING_FOR_REVIEW support to VALID_STATUS_TRANSITIONS (see the sketch after this list); updates signatures and behavior of update_node_execution_status, update_node_execution_status_batch, and get_node_executions; introduces a _build_node_execution_where_clause() helper and stronger transition validation.
  • Database manager wrappers (backend/executor/database.py): Exposes HITL functions (get_or_create_human_review, get_unprocessed_review_node_executions, has_pending_reviews_for_graph_exec, update_review_processed_status) on DatabaseManager, DatabaseManagerClient, and DatabaseManagerAsyncClient.
  • Execution manager integration (backend/executor/manager.py): Enqueues unprocessed review-driven node executions, includes WAITING_FOR_REVIEW in the initial dispatch, treats WAITING_FOR_REVIEW executions as resumable RUNNING, and sets the final graph status to WAITING_FOR_REVIEW when applicable.
  • Review REST API models (backend/server/v2/executions/review/model.py): Adds PendingHumanReviewModel, ReviewActionRequest (with SafeJson handling, size/depth validation, and action consistency), and ReviewActionResponse.
  • Review REST API routes (backend/server/v2/executions/review/routes.py): New /review router: list pending reviews, list reviews for an execution, and a POST action endpoint to approve/reject (ownership checks, atomic update, resume helper).
  • REST API wiring (backend/server/rest_api.py): Includes the review router (note: a duplicated include is present in the diff).
  • Frontend types & hooks (frontend/src/lib/autogpt-server-api/types.ts, frontend/src/hooks/usePendingReviews.ts): Adds WAITING_FOR_REVIEW to frontend types; introduces usePendingReviews and usePendingReviewsForExecution hooks that normalize query results.
  • Frontend UI components & integration (multiple files across frontend components and app views, e.g., FloatingReviewsPanel.tsx, PendingReviewCard.tsx, PendingReviewsList.tsx, Flow editor files, Runs views, badges, icons, status maps, navbar activity item): New floating reviews UI, PendingReviewCard, and list components; status mapping and visuals updated to represent WAITING_FOR_REVIEW (purple, pause icon); panel/list integrated into flow editors and run/detail views; refetch wired on review completion.
  • OpenAPI / frontend API schema (frontend/src/app/api/openapi.json): Adds review endpoints and models to the OpenAPI spec; adds WAITING_FOR_REVIEW to the AgentExecutionStatus enum; documents request/response schemas.
  • Store submission delete change (backend/server/v2/store/db.py, backend/server/v2/store/routes.py): delete_store_submission now returns a StoreSubmission model instead of a boolean; routes updated accordingly with a new response model and error responses.
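
For reviewers new to the transition table, the gating pattern described for backend/data/execution.py works roughly as below. This is an illustrative sketch, not the exact table from the diff, and only a subset of transitions is shown.

from enum import Enum

class ExecutionStatus(str, Enum):
    INCOMPLETE = "INCOMPLETE"
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    WAITING_FOR_REVIEW = "WAITING_FOR_REVIEW"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    TERMINATED = "TERMINATED"

# target status -> statuses it may legally be reached from (subset shown)
VALID_STATUS_TRANSITIONS = {
    ExecutionStatus.RUNNING: {ExecutionStatus.QUEUED, ExecutionStatus.WAITING_FOR_REVIEW},
    ExecutionStatus.WAITING_FOR_REVIEW: {ExecutionStatus.RUNNING},
    ExecutionStatus.COMPLETED: {ExecutionStatus.RUNNING},
}

def assert_transition(current: ExecutionStatus, target: ExecutionStatus) -> None:
    if current not in VALID_STATUS_TRANSITIONS.get(target, set()):
        raise ValueError(f"Illegal status transition: {current} -> {target}")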

Sequence Diagram(s)

sequenceDiagram
    participant Graph as Graph Execution
    participant Block as HumanInTheLoopBlock
    participant DB as DatabaseManager / Data Layer
    participant API as Review API
    participant User as User (frontend)

    Graph->>Block: execute(node_exec_id, input_data)
    Block->>DB: get_or_create_human_review(user_id, node_exec_id, graph_exec_id, payload...)
    DB-->>Block: ReviewResult | None (None => pending)
    alt review pending
        Block->>Graph: yield pause / set node status WAITING_FOR_REVIEW
        Graph->>DB: has_pending_reviews_for_graph_exec(graph_exec_id)
        Note right of DB: Frontend polls/queries DB via API
        User->>API: GET /review/pending or /review/execution/{id}
        API->>DB: get_pending_reviews_for_user / get_pending_reviews_for_execution
        DB-->>API: PendingHumanReviewModel[]
        User->>API: POST /review/{node_exec_id}/action (approve/reject)
        API->>DB: update_review_action(...)  --(atomic)-->
        DB-->>API: updated review
        API->>DB: update_graph_execution_stats / resume helper
        DB-->>Graph: queue/resume node execution
    else review ready
        Block->>Graph: yield reviewed_data/status/review_message
    end
    Graph-->>User: execution updates / final status

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas requiring extra attention:

  • backend/data/execution.py: VALID_STATUS_TRANSITIONS changes and new return/value semantics for update functions.
  • backend/executor/manager.py: queueing and resume logic with WAITING_FOR_REVIEW and add_unprocessed_reviews_to_queue().
  • backend/server/v2/executions/review/routes.py & data/human_review.py: atomicity/ownership checks when applying review actions and marking processed.
  • backend/server/rest_api.py: duplicated router inclusion noted in diff.
  • Frontend components/hooks: ensure refetch and status rank changes propagate consistently across views.

Suggested labels

Review effort 3/5

Suggested reviewers

  • Swiftyos
  • Pwuts

Poem

🐇 I paused the flow to sip a clover,
A human glance to make it sober.
With purple badge and gentle thump,
Approve or nudge, then onward jump —
Execution hops, the rabbit hums, "All clear, go rover!"

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 62.71%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)

  • Title check: ✅ Passed. The pull request title clearly and concisely summarizes the main feature: adding a Human In The Loop block with review workflow functionality.
  • Description Check: ✅ Passed. Check skipped - CodeRabbit's high-level summary is enabled.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@majdyz majdyz marked this pull request as draft November 14, 2025 05:40
@github-actions github-actions bot added platform/frontend AutoGPT Platform - Front end platform/backend AutoGPT Platform - Back end platform/blocks labels Nov 14, 2025
@majdyz
Contributor Author

majdyz commented Nov 14, 2025

@coderabbitai full review

@coderabbitai

coderabbitai bot commented Nov 14, 2025

✅ Actions performed

Full review triggered.

@qodo-merge-pro

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 Security concerns

Access control:
The resume endpoint _resume_graph_execution updates execution status without verifying the requesting user's ownership of the target graph execution. This could allow a user to influence another user's executions if they guess or obtain an ID. Enforce user scoping by fetching with the caller's user_id and validating ownership before status changes.

⚡ Recommended focus areas for review

Possible Issue

The block reads and writes review payloads with mixed shapes (sometimes a dict containing 'data' and 'message', sometimes raw data). This dual-shape handling can produce inconsistencies across creation, approval, and consumption paths. Validate the data contract for PendingHumanReview.data end-to-end and consider enforcing a consistent schema.

if existing_review and existing_review.status == "APPROVED":
    # Return the approved data (which may have been modified by the reviewer)
    # The data field now contains the approved/modified data from the review
    if (
        isinstance(existing_review.data, dict)
        and "data" in existing_review.data
    ):
        # Extract the actual data from the review data structure
        approved_data = existing_review.data["data"]
    else:
        # Fallback to the stored data directly
        approved_data = existing_review.data

    approved_data = convert(approved_data, type(input_data.data))
    yield "reviewed_data", approved_data
    yield "status", "approved"
    yield "review_message", existing_review.reviewMessage or ""

    # Clean up the review record as it's been processed
    await PendingHumanReview.prisma().delete(where={"id": existing_review.id})
    return

elif existing_review and existing_review.status == "REJECTED":
    # Return rejection status without data
    yield "status", "rejected"
    yield "review_message", existing_review.reviewMessage or ""

    # Clean up the review record
    await PendingHumanReview.prisma().delete(where={"id": existing_review.id})
    return

# No existing approved review, create a pending review
review_data = {
    "data": input_data.data,
    "message": input_data.message,
    "editable": input_data.editable,
}

await PendingHumanReview.prisma().upsert(
    where={"nodeExecId": node_exec_id},
    data={
        "create": {
            "userId": user_id,
            "nodeExecId": node_exec_id,
            "graphExecId": graph_exec_id,
            "graphId": graph_id,
            "graphVersion": graph_version,
            "data": SafeJson(review_data),
            "status": "WAITING",
        },
        "update": {"data": SafeJson(review_data), "status": "WAITING"},
    },
)

# This will effectively pause the execution here
# The execution will be resumed when the review is approved
# The manager will detect the pending review and set the status to WAITING_FOR_REVIEW
return
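
One way to pin the contract down end-to-end, as suggested, is to validate against a single pydantic model at both the write and read sites. A sketch (the model name is hypothetical):

from typing import Any, Optional
from pydantic import BaseModel

class HumanReviewPayload(BaseModel):
    # Hypothetical canonical shape for PendingHumanReview.data
    data: Any
    message: Optional[str] = None
    editable: bool = True

# Write side: SafeJson(HumanReviewPayload(data=..., message=..., editable=...).model_dump())
# Read side (replaces the dict-probing in the excerpt above):
payload = HumanReviewPayload.model_validate(existing_review.data)
approved_data = payload.data  # one shape everywhere; no "data" key checks needed
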
Async Misuse

_on_graph_execution appears to be a sync path but uses db_client.get_node_executions_count without awaiting; if this is an async call, it will return a coroutine and the numeric comparison will break. Confirm whether this path is async and ensure await is used, or provide a sync wrapper.

waiting_nodes_count = db_client.get_node_executions_count(
    graph_exec_id=graph_exec.graph_exec_id,
    statuses=[ExecutionStatus.WAITING_FOR_REVIEW],
)
if waiting_nodes_count > 0:
    execution_status = ExecutionStatus.WAITING_FOR_REVIEW
else:
    execution_status = ExecutionStatus.COMPLETED
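
If the path is indeed async, the awaited form is the minimal fix (a sketch; routing through the sync DatabaseManagerClient wrapper is the alternative):

waiting_nodes_count = await db_client.get_node_executions_count(
    graph_exec_id=graph_exec.graph_exec_id,
    statuses=[ExecutionStatus.WAITING_FOR_REVIEW],
)
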
Access Control

_resume_graph_execution fetches execution meta using an empty user_id and updates status without verifying ownership. Ensure user authorization is enforced before resuming executions to prevent cross-tenant interference.

async def _resume_graph_execution(graph_exec_id: str) -> None:
    """Resume a graph execution by updating its status."""
    try:
        from backend.data.execution import ExecutionStatus, update_graph_execution_stats
        from backend.util.clients import get_database_manager_async_client

        # Get the graph execution details
        db = get_database_manager_async_client()
        graph_exec = await db.get_graph_execution_meta(
            user_id="", execution_id=graph_exec_id  # We'll validate user_id separately
        )

        if not graph_exec:
            logger.error(f"Graph execution {graph_exec_id} not found")
            return

        # Update the graph execution status to QUEUED so the scheduler picks it up
        await update_graph_execution_stats(
            graph_exec_id=graph_exec_id, status=ExecutionStatus.QUEUED
        )

        logger.info(f"Resumed graph execution {graph_exec_id}")

    except Exception as e:
        logger.error(f"Failed to resume graph execution {graph_exec_id}: {e}")

@netlify

netlify bot commented Nov 14, 2025

Deploy Preview for auto-gpt-docs canceled.

  • 🔨 Latest commit: 0422173
  • 🔍 Latest deploy log: https://app.netlify.com/projects/auto-gpt-docs/deploys/6920841aa5cbb60008065f98

@deepsource-io

deepsource-io bot commented Nov 14, 2025

Here's the code health analysis summary for commits 0edc669..0422173. View details on DeepSource ↗.

Analysis Summary

  • JavaScript: ✅ Success. ❗ 23 occurrences introduced, 🎯 5 occurrences resolved. View Check ↗
  • Python: ✅ Success. ❗ 15 occurrences introduced, 🎯 3 occurrences resolved. View Check ↗

💡 If you’re a repository administrator, you can configure the quality gates from the settings.

@AutoGPT-Agent

Thank you for implementing the Human In The Loop block with review workflow! The feature looks comprehensive and well-designed, with all necessary components for both backend and frontend. The code follows good practices like proper typing and error handling.

Before we can merge this PR, please:

  • Check off all the items in your test plan (or update it if some tests aren't applicable)
  • Confirm that you've executed the test plan as described

Once the checklist is completed, we can proceed with the review of your implementation.

@AutoGPT-Agent

Thank you for this comprehensive PR implementing the Human In The Loop (HITL) block! This is an excellent addition that will enable more interactive workflows.

Feedback

  • Test Plan: Your PR includes a good test plan, but the checkboxes aren't checked off. Please make sure to run through all the tests and check them off before this can be approved.

  • Function Warning: I noticed you've added a warning comment for the get_node_executions_count function about not having a user_id check, which is good practice. Make sure that all routes using this function properly validate the user ID first.

  • Execution Status Flow: The new WAITING_FOR_REVIEW status has been properly integrated into the execution flow, including the relevant state transitions. This looks well thought out.

  • UI Integration: The floating review panel and integration into multiple parts of the UI seems well-designed and consistent.

  • Database Schema: The new PendingHumanReview model looks properly structured with the necessary relations and indexes.

Overall, this is a well-implemented feature. Please complete the test plan before this can be approved for merging.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (20)
autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/AgentRunsView/components/SelectedRunView/components/RunStatusBadge.tsx (1)

39-43: Consider a distinct icon for better visual differentiation.

The WAITING_FOR_REVIEW status uses the same PauseCircleIcon as the RUNNING status (line 34), differing only in color (blue vs yellow). While the color distinction is present, using a unique icon might improve visual scanning and reduce potential confusion between these two states.

Consider alternatives like ClockIcon (already used for QUEUED but semantically fits "waiting"), EyeIcon, or UserCircleIcon to better convey human review.

autogpt_platform/backend/backend/blocks/human_in_the_loop.py (4)

17-51: Block docstring and IO schemas are clear; optional status typing refinement

The docstring and Input/Output schemas clearly express the HITL semantics and defaults. As a small type-safety improvement, you could narrow status from str to an enum/constant set (e.g., shared status constants or a Literal["approved", "rejected"]) so downstream logic can rely on the exact allowed values.
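
A minimal version of that refinement (a sketch; where the alias lives is a naming choice):

from typing import Literal

# Narrow the block's Output 'status' field from str to the exact allowed values
HITLReviewStatus = Literal["approved", "rejected"]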


83-108: Approved-path logic matches spec; consider guarding convert failures

The APPROVED branch correctly:

  • Retrieves the review for this node,
  • Extracts the actual data from the stored JSON (with a reasonable fallback),
  • Converts it back to the original input type, yields outputs, and then deletes the review.

Depending on how convert behaves when the reviewer significantly changes the shape/type of data, you may want to defensively handle conversion failures (e.g., wrap in try/except and either surface a clearer error or fall back to the raw approved_data) to avoid opaque runtime errors.
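
One defensive shape for that (a sketch; the fallback policy is a design choice, not something this PR specifies):

try:
    approved_data = convert(approved_data, type(input_data.data))
except Exception as e:
    # Keep the raw approved payload rather than failing opaquely;
    # approved_data still holds the reviewer-provided value at this point.
    logger.warning(f"Could not convert reviewed data to the input type: {e}")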


110-117: Verify whether reviewed_data should be emitted on rejection

In the REJECTED path you only yield status and review_message, while the Output schema also defines reviewed_data. If the block framework or downstream consumers assume that all schema fields are present, this could be a subtle source of errors; if they treat missing fields as acceptable, it’s fine.

Consider either:

  • Yielding a sentinel value (e.g., None or the original input_data.data) for reviewed_data on rejection, or
  • Explicitly documenting that reviewed_data is only present when status == "approved".

119-140: WAITING upsert behavior works but could be more status-aware

Creating/updating a PendingHumanReview with status: "WAITING" on first run matches the pause-and-review requirement. Two small refinements to consider:

  1. If an existing review already has status == "WAITING", you can early-return instead of upserting again to avoid extra writes when the block is re-entered without any human action (sketched below).
  2. If additional statuses are ever added on the Prisma side (e.g., “IN_REVIEW”), this unconditional upsert would overwrite them. Making the upsert conditional on not existing_review or existing_review.status == "WAITING" and using the generated Prisma enum/typed status constants instead of raw strings would reduce the chance of status drift or accidental overwrites.
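
A minimal version of the first refinement (sketch):

if existing_review and existing_review.status == "WAITING":
    # A review is already pending for this node; skip the redundant upsert
    return
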
autogpt_platform/frontend/src/app/(platform)/build/components/legacy-builder/Flow/Flow.tsx (1)

1012-1015: Redundant undefined coercion.

Line 1013 uses executionId={flowExecutionID || undefined}, but flowExecutionID is already typed as GraphExecutionID | undefined (line 110-112). The || undefined is redundant.

Apply this diff to simplify:

       <FloatingReviewsPanel
-        executionId={flowExecutionID || undefined}
+        executionId={flowExecutionID}
         className="fixed bottom-24 right-4"
       />
autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-run-status-chip.tsx (1)

26-26: Note: Temporary status mapping causes label inconsistency.

WAITING_FOR_REVIEW is mapped to "queued", which will display as "Queued" in this component, while other components (e.g., NodeExecutionBadge.tsx line 26) display "Waiting for Review". The TODO comment indicates this is temporary, but consider prioritizing a consistent label to avoid user confusion.

Consider adding a dedicated status entry for WAITING_FOR_REVIEW:

 export const agentRunStatusMap: Record<
   GraphExecutionMeta["status"],
   AgentRunStatus
 > = {
   INCOMPLETE: "draft",
   COMPLETED: "success",
   FAILED: "failed",
   QUEUED: "queued",
   RUNNING: "running",
   TERMINATED: "stopped",
-  WAITING_FOR_REVIEW: "queued", // Map to queued for now
+  WAITING_FOR_REVIEW: "review",
   // TODO: implement "draft" - https://github.com/Significant-Gravitas/AutoGPT/issues/9168
 };

And update the statusData record:

const statusData: Record<
  AgentRunStatus,
  { label: string; variant: keyof typeof statusStyles }
> = {
  // ... existing entries ...
  review: { label: "Waiting for Review", variant: "info" },
};
autogpt_platform/backend/backend/executor/manager.py (1)

577-591: Make PendingHumanReview lookup more robust and observable

The PendingHumanReview check correctly drives WAITING_FOR_REVIEW vs COMPLETED, but a couple of improvements would make this safer:

  • The broad except Exception currently swallows all errors and silently treats the node as COMPLETED, which can mask schema or connectivity issues and effectively disable HITL without any signal. At minimum, log the exception (e.g., via log_metadata.exception(...)) before defaulting to COMPLETED.
  • Importing and querying PendingHumanReview directly from Prisma here bypasses the existing database abstraction (DatabaseManagerAsyncClient). Consider exposing a small helper on the DB manager (e.g., has_pending_human_review(node_exec_id)) and calling that instead, so executor code stays decoupled from Prisma models and field names.

These changes preserve the graceful fallback to COMPLETED but improve debuggability and maintainability.
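
A sketch of the suggested DB-manager helper (name and placement hypothetical):

async def has_pending_human_review(node_exec_id: str) -> bool:
    # count() avoids fetching the full row just to test existence
    count = await PendingHumanReview.prisma().count(
        where={"nodeExecId": node_exec_id, "status": "WAITING"}
    )
    return count > 0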

autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/AgentRunsView/components/SelectedRunView/SelectedRunView.tsx (1)

18-20: Hook integration for execution-specific pending reviews is sound

Using usePendingReviewsForExecution(runId) here cleanly co-locates review data with the selected run view. Consider, as a small enhancement, surfacing error from the hook into the UI (e.g., a lightweight inline message in the Reviews tab) so users get feedback if the review API fails, instead of seeing an empty/unchanged state.

Also applies to: 39-43

autogpt_platform/frontend/src/components/organisms/FloatingReviewsPanel/FloatingReviewsPanel.tsx (1)

1-93: Tighten hook usage and close logic in FloatingReviewsPanel

Overall behavior is good, but a couple of small refinements would make this more robust:

  • usePendingReviewsForExecution(executionId || "") will still instantiate the underlying query even when executionId is undefined. It would be cleaner if usePendingReviewsForExecution itself treated a falsy graphExecId as “no-op” (returning an empty list and no network call), so callers like this component don’t need to pass an empty string.
  • In handleReviewComplete, you close the panel if pendingReviews.length <= 1 before the refetch result is applied. That’s usually fine, but if new reviews arrive concurrently, you could close while there are still pending reviews. If that matters, consider basing the close decision on the refetch result (e.g., via a then callback) instead of the pre-refetch length.

These are minor and can be addressed later, but they’ll make the panel’s behavior more predictable around edge cases.

autogpt_platform/frontend/src/hooks/usePendingReviews.ts (1)

1-32: Hooks are clean; consider making executionId optional-friendly

Both hooks nicely normalize the API response into { pendingReviews, isLoading, error, refetch }, which simplifies consumers. Given that some callers (e.g., FloatingReviewsPanel) may not always have an execution ID, it could be useful to let usePendingReviewsForExecution accept an optional graphExecId and short‑circuit to { pendingReviews: [], isLoading: false, error: undefined } when it’s absent, instead of requiring callers to pass an empty string.

autogpt_platform/frontend/src/components/organisms/PendingReviewCard/PendingReviewCard.tsx (2)

21-27: Clarify and centralize assumptions about review.data shape

You’re assuming review.data is either a { data, message } object or raw JSON, and duplicating shape checks in both the initial reviewData state and the “Instructions” section. This works, but it’s a bit fragile and any-heavy.

Consider extracting a small helper/type guard (e.g., isStructuredReviewPayload(review.data)) that returns { instructions, editableData }. That would:

  • Avoid repeated "data" in review.data / "message" in review.data checks.
  • Make it easier to evolve the payload shape without touching multiple call sites.

Also applies to: 95-105


73-83: Default reject message may overwrite user intent

For rejects you always send a message, defaulting to "Rejected by user" when reviewMessage is empty. That’s fine as a UX choice, but it does mean the backend can’t distinguish “no comment” from a generic comment.

If you want to preserve that distinction (for analytics or cleaner audit logs), consider sending undefined when the textarea is blank and having the backend fill in any default message.

autogpt_platform/backend/schema.prisma (1)

42-63: PendingHumanReview schema and relations look coherent; consider enum for status

The new PendingHumanReview model and its relations to User, AgentGraphExecution, and AgentNodeExecution look consistent:

  • @@index([userId, status]) and @@index([graphExecId, status]) align with the query patterns in the review routes.
  • @@unique([nodeExecId]) enforces the “one pending review per node execution” invariant nicely.
  • Adding WAITING_FOR_REVIEW to AgentExecutionStatus fits the new execution flow.

One small improvement would be to model PendingHumanReview.status as an enum (e.g., PendingHumanReviewStatus) instead of a bare String, to avoid typos and keep it aligned with the literals used in the API layer. This is optional but would tighten type-safety.

Also applies to: 347-356, 359-406, 409-435, 473-495

autogpt_platform/backend/backend/server/v2/executions/review/routes.py (2)

22-57: Pending reviews listing endpoint matches schema and index usage

/review/pending filters on { userId, status: "WAITING" } and orders by createdAt DESC, which lines up with the Prisma indices you added and the intended UX. The mapping into PendingHumanReviewResponse is direct and clear.

If you find yourself adding more endpoints that return this shape, consider a small helper to avoid repeating the same field mapping.


59-95: Execution-scoped pending listing is consistent with the global listing

/review/execution/{graph_exec_id} mirrors the global listing logic while scoping by graphExecId and sorting ascending by createdAt, which makes sense for a per-run review timeline. The response mapping again matches PendingHumanReviewResponse.

Same comment as above: you could factor the Prisma→response mapping into a shared function to reduce duplication.

autogpt_platform/frontend/src/app/api/openapi.json (4)

4756-4786: Add pagination and clarify response guarantees for pending reviews.

  • Consider page and page_size query params to avoid unbounded payloads when many reviews are queued.
  • Keep response typed, but document it returns only status=WAITING items to align with the path name.

Apply parameters:

       "get": {
         "tags": ["v2", "executions", "execution-review", "private"],
         "summary": "Get Pending Reviews",
         "description": "Get all pending reviews for the current user.",
         "operationId": "getV2Get pending reviews",
+        "parameters": [
+          {
+            "name": "page",
+            "in": "query",
+            "required": false,
+            "schema": { "type": "integer", "minimum": 1, "default": 1, "title": "Page" }
+          },
+          {
+            "name": "page_size",
+            "in": "query",
+            "required": false,
+            "schema": { "type": "integer", "minimum": 1, "maximum": 100, "default": 25, "title": "Page Size" }
+          }
+        ],

4788-4835: Constrain path parameter type for execution id.

Add an explicit format/pattern for graph_exec_id to improve validation and client generation.

           {
             "name": "graph_exec_id",
             "in": "path",
             "required": true,
-            "schema": { "type": "string", "title": "Graph Exec Id" }
+            "schema": { "type": "string", "format": "uuid", "title": "Graph Exec Id" }
           }

7997-8020: Tighten ReviewActionRequest: SafeJson + conditional required fields.

  • Use SafeJson for reviewed_data instead of empty schema {}.
  • Require reviewed_data when action=approve (and optionally allow message when reject).
       "ReviewActionRequest": {
         "properties": {
           "action": {
             "type": "string",
             "enum": ["approve", "reject"],
             "title": "Action",
             "description": "Action to take"
           },
           "reviewed_data": {
-            "anyOf": [{}, { "type": "null" }],
+            "anyOf": [
+              { "$ref": "#/components/schemas/SafeJson" },
+              { "type": "null" }
+            ],
             "title": "Reviewed Data",
             "description": "Modified data (only for approve action)"
           },
           "message": {
             "anyOf": [{ "type": "string" }, { "type": "null" }],
             "title": "Message",
             "description": "Optional message from the reviewer"
           }
         },
         "type": "object",
-        "required": ["action"],
+        "required": ["action"],
+        "allOf": [
+          {
+            "if": { "properties": { "action": { "const": "approve" } }, "required": ["action"] },
+            "then": { "required": ["action", "reviewed_data"] }
+          }
+        ],
         "title": "ReviewActionRequest",
         "description": "Request model for reviewing data."
       },

4756-4890: Normalize operationId casing across all endpoints—not just these three.

Verification found 75+ operationIds with spaces throughout the entire spec, not just the three in this section. While the examples you cited (getV2Get pending reviews, getV2Get pending reviews for execution, postV2Review data) are confirmed, this is a systematic issue affecting many v1 and v2 endpoints.

Recommend a batch normalization pass across the entire openapi.json to convert all operationIds to camelCase or PascalCase without spaces (e.g., getV1InitiateOAuthFlow, postV2ReviewData, etc.). This will improve compatibility with code generators and Orval configurations.
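
A one-off normalization pass could be as small as this sketch (it camel-cases the space-separated tail of each operationId; collision checking is left out):

import json

SPEC_PATH = "frontend/src/app/api/openapi.json"
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "options", "head"}

with open(SPEC_PATH) as f:
    spec = json.load(f)

for path_item in spec["paths"].values():
    for method, op in path_item.items():
        if method in HTTP_METHODS and "operationId" in op:
            head, *rest = op["operationId"].split(" ")
            # e.g. "getV2Get pending reviews" -> "getV2GetPendingReviews"
            op["operationId"] = head + "".join(w[:1].upper() + w[1:] for w in rest)

with open(SPEC_PATH, "w") as f:
    json.dump(spec, f, indent=2)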

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a054740 and b8cb244.

📒 Files selected for processing (23)
  • autogpt_platform/backend/backend/blocks/human_in_the_loop.py (1 hunks)
  • autogpt_platform/backend/backend/data/execution.py (3 hunks)
  • autogpt_platform/backend/backend/executor/database.py (4 hunks)
  • autogpt_platform/backend/backend/executor/manager.py (3 hunks)
  • autogpt_platform/backend/backend/server/rest_api.py (2 hunks)
  • autogpt_platform/backend/backend/server/v2/executions/review/model.py (1 hunks)
  • autogpt_platform/backend/backend/server/v2/executions/review/routes.py (1 hunks)
  • autogpt_platform/backend/schema.prisma (5 hunks)
  • autogpt_platform/frontend/src/app/(platform)/build/components/FlowEditor/Flow/Flow.tsx (2 hunks)
  • autogpt_platform/frontend/src/app/(platform)/build/components/FlowEditor/nodes/CustomNode/components/NodeExecutionBadge.tsx (2 hunks)
  • autogpt_platform/frontend/src/app/(platform)/build/components/FlowEditor/nodes/CustomNode/helpers.ts (1 hunks)
  • autogpt_platform/frontend/src/app/(platform)/build/components/legacy-builder/Flow/Flow.tsx (2 hunks)
  • autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/AgentRunsView/components/RunsSidebar/components/RunListItem.tsx (1 hunks)
  • autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/AgentRunsView/components/SelectedRunView/SelectedRunView.tsx (4 hunks)
  • autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/AgentRunsView/components/SelectedRunView/components/RunStatusBadge.tsx (2 hunks)
  • autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-run-status-chip.tsx (1 hunks)
  • autogpt_platform/frontend/src/app/api/openapi.json (4 hunks)
  • autogpt_platform/frontend/src/components/organisms/FloatingReviewsPanel/FloatingReviewsPanel.tsx (1 hunks)
  • autogpt_platform/frontend/src/components/organisms/PendingReviewCard/PendingReviewCard.tsx (1 hunks)
  • autogpt_platform/frontend/src/components/organisms/PendingReviewsList/PendingReviewsList.tsx (1 hunks)
  • autogpt_platform/frontend/src/hooks/useAgentGraph.tsx (1 hunks)
  • autogpt_platform/frontend/src/hooks/usePendingReviews.ts (1 hunks)
  • autogpt_platform/frontend/src/lib/autogpt-server-api/types.ts (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: chromatic
  • GitHub Check: test
  • GitHub Check: types
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.13)
  • GitHub Check: test (3.12)
  • GitHub Check: Check PR Status
🔇 Additional comments (21)
autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/AgentRunsView/components/SelectedRunView/components/RunStatusBadge.tsx (1)

80-82: LGTM! Appropriate text formatting for multi-word status.

The conditional handling correctly displays "Waiting for Review" in a human-readable format, while preserving the existing lowercase transformation for other statuses. Combined with the capitalize class on line 78, this will render as "Waiting For Review", which is appropriate for a multi-word status label.

autogpt_platform/backend/backend/blocks/human_in_the_loop.py (2)

1-14: Imports and helpers look appropriate

The imported dependencies are minimal and aligned with the block’s responsibilities (Prisma model, block base classes, SchemaField, SafeJson, and convert); nothing appears unused or missing for this implementation.


53-70: Constructor wiring and test metadata look consistent

The block id, description, category, schemas, and test_input/test_output all align with the documented behavior of an approval path and should integrate cleanly with the existing block infrastructure.

autogpt_platform/frontend/src/app/(platform)/build/components/FlowEditor/nodes/CustomNode/components/NodeExecutionBadge.tsx (1)

10-10: LGTM!

The status styling and text rendering for WAITING_FOR_REVIEW is consistent with the existing status handling pattern.

Also applies to: 26-26

autogpt_platform/frontend/src/app/(platform)/build/components/FlowEditor/Flow/Flow.tsx (1)

16-21: LGTM!

The integration of FloatingReviewsPanel is clean and follows React best practices. Reading flowExecutionID from URL params and passing it to the panel component is the correct approach for this feature.

Also applies to: 77-77

autogpt_platform/frontend/src/app/(platform)/build/components/FlowEditor/nodes/CustomNode/helpers.ts (1)

7-7: LGTM!

The status styling for WAITING_FOR_REVIEW follows the existing pattern correctly.

autogpt_platform/frontend/src/hooks/useAgentGraph.tsx (1)

360-366: LGTM! Status ranking logic correctly prioritizes WAITING_FOR_REVIEW.

The updated status ranking makes sense for the review workflow:

  • WAITING_FOR_REVIEW at rank 1 ensures it's prominently displayed when any node awaits review
  • Higher priority than QUEUED (rank 2) is appropriate - review state needs more immediate attention
  • All other statuses shift down by 1 to accommodate the new status

The exclusion of WAITING_FOR_REVIEW from the terminal status check (lines 549-559) is also correct, as this status represents a paused state rather than a completed execution.

autogpt_platform/frontend/src/lib/autogpt-server-api/types.ts (1)

280-281: LGTM!

The addition of WAITING_FOR_REVIEW to the status union types is straightforward and enables the new review workflow status across the frontend.

Also applies to: 418-419

autogpt_platform/backend/backend/server/rest_api.py (2)

32-32: Review router import wiring looks consistent

Importing backend.server.v2.executions.review.routes alongside other v2 routers keeps server wiring centralized and consistent; no issues here.


290-294: Execution review router mount is correct

Mounting the review router under prefix="/api/executions" with tags ["v2", "executions"] matches existing naming and keeps review endpoints grouped logically under executions.

autogpt_platform/backend/backend/executor/manager.py (2)

677-686: WAITING_FOR_REVIEW → RUNNING resume flow looks correct

Handling ExecutionStatus.WAITING_FOR_REVIEW by switching the graph back to RUNNING and persisting the status via update_graph_execution_state is consistent with the new VALID_STATUS_TRANSITIONS (WAITING_FOR_REVIEW→RUNNING) and mirrors the existing resume paths for FAILED/TERMINATED.


1033-1041: Final graph status uses node-level WAITING_FOR_REVIEW appropriately

Deriving the final graph execution status by checking get_node_executions_count(..., statuses=[ExecutionStatus.WAITING_FOR_REVIEW]) ensures that any node waiting for review pauses the entire graph (status WAITING_FOR_REVIEW), while still returning COMPLETED when no such nodes exist. This is a clear and efficient integration point for the HITL workflow.

autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/AgentRunsView/components/SelectedRunView/SelectedRunView.tsx (1)

80-84: Conditional Reviews tab UX is reasonable

Gating both the “Reviews” tab trigger and its content on pendingReviews.length > 0 avoids showing an empty tab and keeps the experience simple; wiring onReviewComplete={refetchReviews} ensures the list stays in sync after actions. No functional issues here.

Also applies to: 109-123

autogpt_platform/frontend/src/components/organisms/PendingReviewsList/PendingReviewsList.tsx (1)

1-50: PendingReviewsList component is well-scoped and typed

The list cleanly handles both the empty state and populated state, uses PendingHumanReviewResponse[] for type safety, and forwards onReviewComplete to each PendingReviewCard with stable key={review.id}. This is a solid reusable building block for review UIs.

autogpt_platform/backend/backend/data/execution.py (1)

95-122: WAITING_FOR_REVIEW transitions and node count helper align with executor logic

Adding ExecutionStatus.WAITING_FOR_REVIEW as an allowed source for RUNNING and as a target from RUNNING in VALID_STATUS_TRANSITIONS matches the executor’s resume path and ensures update_graph_execution_stats enforces the intended RUNNING ↔ WAITING_FOR_REVIEW lifecycle. The new get_node_executions_count mirrors get_node_executions filter semantics (graph_exec_id/node_id/block_ids/statuses/time range) and is a good fit for cheap existence checks like the final graph status decision in ExecutionProcessor._on_graph_execution. The explicit “no user_id check” docstring is also consistent with other internal-only helpers.

Also applies to: 1009-1038

autogpt_platform/backend/backend/executor/database.py (4)

7-26: New get_node_executions_count import is consistent with data layer usage

Importing get_node_executions_count alongside the other execution helpers keeps this module aligned with backend.data.execution; no issues spotted.


125-143: Service exposure of get_node_executions_count matches existing pattern

Wiring get_node_executions_count = _(get_node_executions_count) mirrors the other execution APIs and should integrate cleanly with the app service RPC mechanism.


200-210: Sync client exposure looks correct

DatabaseManagerClient.get_node_executions_count = _(d.get_node_executions_count) follows the same endpoint_to_sync pattern as neighboring methods; no concerns.


241-252: Async client passthrough is consistent

DatabaseManagerAsyncClient.get_node_executions_count = d.get_node_executions_count matches how other async methods are surfaced; this should be usable wherever async access is preferred.

autogpt_platform/backend/backend/server/v2/executions/review/model.py (1)

7-41: Review models align with DB shape and API usage

PendingHumanReviewResponse and ReviewActionRequest cleanly mirror the Prisma model and the frontend expectations (ids, data, status, timestamps, and optional reviewed_data / message). Literal types for both status and action should help catch misuse at compile time.

Looks good.

autogpt_platform/backend/backend/server/v2/executions/review/routes.py (1)

97-167: Review mutation logic is sound; data patch behavior is well-defined

The approve/reject endpoint correctly:

  • Ensures the review exists and belongs to the current user.
  • Rejects non-WAITING reviews with a clear 400.
  • For approve, conditionally patches only the data field of a structured payload (when review.data is a dict with a data key), falling back to replacing the entire payload otherwise.
  • Persists the updated status, data, message, and reviewedAt timestamp using SafeJson.

This gives a predictable “patch vs. replace” behavior and keeps the DB consistent with what the frontend sends.

No blocking issues here.

@majdyz
Contributor Author

majdyz commented Nov 14, 2025

@ntindle thanks for the review; this PR is still a draft. I only vibe-coded it and asked for a PR so that rabbit can start reviewing it

@ntindle
Member

ntindle commented Nov 14, 2025

Same page

majdyz and others added 2 commits November 14, 2025 15:25
…security and workflow improvements

## Summary
- Complete implementation of Human In The Loop (HITL) block for pausing execution pending human review
- Fix all critical security vulnerabilities and architectural issues identified in PR reviews
- Refactor codebase to follow project patterns and improve maintainability

## Backend Changes
- Add `ReviewStatus` enum to Prisma schema with proper indexing for performance
- Implement comprehensive authorization checks with graph ownership verification
- Add atomic database transactions to prevent race conditions
- Fix critical `user_id=""` bug that prevented review resumption
- Add `wasEdited` field to track data modifications during review
- Implement proper input validation with size/depth limits to prevent DoS attacks
- Create service layer separation between business logic and database operations
- Fix Pydantic v2 validator compatibility issues
- Add proper error handling and remove silent failures
- Update execution status transitions to support WAITING_FOR_REVIEW state

## Frontend Changes
- Fix WAITING_FOR_REVIEW color consistency across UI components (purple theme)
- Add missing WAITING_FOR_REVIEW status handling in ActivityItem.tsx
- Generate updated OpenAPI client with proper type safety
- Remove unsafe `as any` type casting with proper type guards

## API Improvements
- Add structured `ReviewActionResponse` model for better type generation
- Implement comprehensive request validation with security checks
- Add proper OpenAPI schema generation for better developer experience
- Support both legacy and structured data formats for backward compatibility

## Security Enhancements
- Add authorization checks to verify graph ownership before review access
- Implement size limits (1MB) and nesting depth validation (10 levels)
- Add SQL injection protection and input sanitization
- Use atomic database operations to prevent concurrent modification issues

## Testing
- Add comprehensive unit tests covering security, validation, and business logic
- Test all edge cases including race conditions and data validation
- Verify API endpoints with proper error handling and status codes

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…ph_version parameter

The HumanInTheLoopBlock test was failing because the test framework wasn't providing the required `graph_version` parameter that the block's run method expects.

Changes:
- Add `graph_version: 1` to test framework's `extra_exec_kwargs` in backend/util/test.py
- Add test mocks for HumanInTheLoopBlock to avoid database/service dependencies during testing
- Add conditional logic to use mocks in test environment while preserving production functionality

The block now passes all tests while maintaining full production functionality for the human review workflow.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@github-actions
Contributor

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

@AutoGPT-Agent

Thanks for submitting this comprehensive PR implementing the Human In The Loop functionality! The feature looks well-designed with consideration for both backend and frontend integration.

Missing Required Elements

Before this PR can be approved, please add the standard PR checklist from the template. For a substantial code change like this, the checklist needs to be completed to ensure all quality steps have been followed.

Technical Review

The implementation looks solid, with:

  • Proper user_id validation in backend functions
  • Clear separation of concerns between data layer and API endpoints
  • Well-structured UI components for displaying and handling reviews
  • Good error handling and logging throughout

I particularly like the attention to security in the review validation logic - the checks for JSON serialization, size limits, and nesting depth are excellent practices.

Once you add the required PR checklist and mark the appropriate items as completed, this PR should be good to go!

@majdyz majdyz marked this pull request as ready for review November 18, 2025 08:13
@qodo-merge-pro

You are nearing your monthly Qodo Merge usage quota. For more information, please visit here.

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 Security concerns

Input validation hardening:
API review action enforces size and type checks, which is good. However, delete_store_submission now returns previously fetched submission data, including fields like reviewer_id and internal_comments. Confirm no sensitive fields are exposed unintentionally to regular users via this response.

⚡ Recommended focus areas for review

Possible Issue

In _on_graph_execution, add_unprocessed_reviews_to_queue calls db_client.get_unprocessed_review_node_executions without awaiting. If this is async (as defined in the data layer), this will enqueue a coroutine rather than results, breaking the polling loop logic.

    """Add nodes with unprocessed reviews to execution queue."""
    node_executions = db_client.get_unprocessed_review_node_executions(
        graph_exec.graph_exec_id
    )
    for node_exec in node_executions:
        node_entry = node_exec.to_node_execution_entry(
            graph_exec.user_context
        )
        execution_queue.add(node_entry)
    return len(node_executions)

while not execution_queue.empty() or add_unprocessed_reviews_to_queue() > 0:
    if cancel.is_set():
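
If the data-layer call is indeed async, the helper itself would need to be a coroutine and awaited in the loop condition (or routed through the sync DatabaseManagerClient). A sketch of the async form:

async def add_unprocessed_reviews_to_queue() -> int:
    """Add nodes with unprocessed reviews to the execution queue."""
    node_executions = await db_client.get_unprocessed_review_node_executions(
        graph_exec.graph_exec_id
    )
    for node_exec in node_executions:
        execution_queue.add(node_exec.to_node_execution_entry(graph_exec.user_context))
    return len(node_executions)
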
Return Type

The run method returns None when entering review (result is None). BlockOutput is typically expected to be a generator or to return outputs; verify the executor tolerates a bare return (None) without marking the node complete or causing runtime errors.

# Check if we're waiting for human input
if result is None:
    logger.info(
        f"HITL block pausing execution for node {node_exec_id} - awaiting human review"
    )
    try:
        # Set node status to REVIEW so execution manager can't mark it as COMPLETED
        # The VALID_STATUS_TRANSITIONS will then prevent any unwanted status changes
        await db_client.update_node_execution_status(
            node_exec_id=node_exec_id,
            status=ExecutionStatus.REVIEW,
        )
        # Execution pauses here until API routes process the review
        return
    except Exception as e:
        logger.error(
            f"Failed to update node status for HITL block {node_exec_id}: {str(e)}"
        )
        raise
Transition Filter

update_node_execution_status uses a unique where with both id and executionStatus in an update call. Prisma "where" for update_unique typically only allows unique fields; including executionStatus may cause the update to fail silently, triggering the fallback find_unique path unexpectedly.

if res := await AgentNodeExecution.prisma().update(
    where=cast(
        AgentNodeExecutionWhereUniqueInput,
        {
            "id": node_exec_id,
            "executionStatus": {"in": [s.value for s in allowed_from]},
        },
    ),
    data=_get_update_status_data(status, execution_data, stats),
    include=EXECUTION_RESULT_INCLUDE,
):
    return NodeExecutionResult.from_db(res)

if res := await AgentNodeExecution.prisma().find_unique(
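
If the non-unique filter is in fact rejected, prisma-client-py's update_many accepts arbitrary where clauses; it returns a count, so the row must be re-read for the include. A hedged sketch, assuming _get_update_status_data produces only scalar field updates (update_many does not support nested writes):

updated_count = await AgentNodeExecution.prisma().update_many(
    where={
        "id": node_exec_id,
        "executionStatus": {"in": [s.value for s in allowed_from]},
    },
    data=_get_update_status_data(status, execution_data, stats),
)
if updated_count:
    res = await AgentNodeExecution.prisma().find_unique(
        where={"id": node_exec_id}, include=EXECUTION_RESULT_INCLUDE
    )
    return NodeExecutionResult.from_db(res)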

@github-actions github-actions bot added the conflicts Automatically applied to PRs with merge conflicts label Nov 19, 2025
@github-actions
Contributor

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

@github-actions
Contributor

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

@github-actions github-actions bot removed the conflicts Automatically applied to PRs with merge conflicts label Nov 20, 2025
…e UI updates

## Backend Changes
- Fix HITL block to use proper async_update_node_execution_status wrapper for websocket events
- Update human review data layer with batch processing for better performance
- Add comprehensive test coverage for human review operations
- Streamline review processing workflow for execution resumption

## Frontend Changes
- Fix review status icon to use eyes instead of pause for better UX
- Enable real-time execution status updates in both new and legacy Flow components
- Pass execution status directly to FloatingReviewsPanel for immediate reactivity
- Fix tab switching and visibility issues in review interfaces
- Improve review workflow with proper status propagation

## Key Improvements
- Real-time websocket updates ensure UI reflects REVIEW status immediately
- Better separation between running and idle states (REVIEW is idle, not running)
- Enhanced error handling and validation in review processing
- Consistent execution status handling across different UI components

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@AutoGPT-Agent

Thank you for this comprehensive implementation of the Human In The Loop functionality! This is a valuable addition to the platform that will enable more flexible agent workflows with user interaction.

The code implementation looks thorough and well-designed, with proper separation of concerns across the backend and frontend components. I particularly appreciate:

  • The thoughtful database schema design with appropriate indexes
  • Comprehensive error handling in the API routes
  • The consideration of security with proper user_id validation
  • Frontend integration in multiple places (graph builder, library, legacy views)

Before merging, please complete the test plan checklist in the PR description to confirm that all the listed test scenarios have been verified:

  • Test Human In The Loop block creation in graph builder
  • Test block execution pauses and creates pending review
  • Test review UI appears in all 3 locations
  • Test data modification and approval workflow
  • Test rejection workflow
  • Test execution resumes after approval

Once those boxes are checked to confirm testing has been completed, this PR will be ready to merge.

@majdyz majdyz requested review from kcze and ntindle November 20, 2025 13:38
@majdyz majdyz requested review from 0ubbe and removed request for Pwuts and Swiftyos November 20, 2025 13:58
…tion

- Restore store/routes.py delete endpoint to return boolean instead of StoreSubmission
- Restore store/db.py delete function signature and error handling
- These changes were accidentally included in HITL feature development
- Only HITL-related functionality should be in this branch

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@AutoGPT-Agent

Thank you for this comprehensive implementation of the Human In The Loop block! The feature looks well-designed and thoroughly implemented across backend, frontend, and database layers.

However, before we can approve this PR, please add the standard PR checklist to your description and complete it. This checklist is required for all PRs that contain code changes.

The checklist should include:

  • Confirmation that you've clearly listed your changes
  • Your test plan (which you already have in your description)
  • Verification that you've tested according to the plan

Once you've added the checklist and checked off the relevant items, this PR should be ready for approval. The actual implementation looks solid and well-tested.

majdyz and others added 2 commits November 20, 2025 21:13
- Add HITL block ID (8b2a7b3c-6e9d-4a5f-8c1b-2e3f4a5b6c7d) to BETA_BLOCKS feature flag
- Block will be hidden from users by default unless beta-blocks flag is configured
- This allows controlled rollout of the Human-in-the-Loop functionality
- Beta users can access the block while it's being tested and refined

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@AutoGPT-Agent

Thank you for this comprehensive implementation of the Human In The Loop block! The code looks well-structured with proper handling of user permissions and a good separation of concerns between frontend and backend components.

Before this can be merged, please complete the test plan checklist in your PR description by checking off the test items. All checkboxes in the PR description need to be checked for PR approval according to our guidelines.

Otherwise, the implementation is solid with proper user_id validation in backend functions, clear UI components for handling reviews, and appropriate database schema changes.

majdyz and others added 4 commits November 21, 2025 20:51
…-Gravitas/AutoGPT into feat/human-in-the-loop-block
- Backend: Fix message handling and unify API structure
  - Fix HITL block to properly yield review messages for both approved and rejected reviews
  - Unify review API structure with single `reviews` array using `approved` boolean field
  - Remove separate approved_reviews/rejected_review_ids in favor of cleaner unified approach

- Frontend: Complete UI/UX overhaul for review interface
  - Replace plain JSON textarea with type-aware input components matching run dialog styling
  - Add "Approve All" and "Reject All" buttons with smart disabled states
  - Show rejection reason input only when excluding items (simplified UX)
  - Fix Reviews tab auto-population when execution status changes to REVIEW
  - Add proper local state management for real-time input updates
  - Use design system Input components for consistent rounded styling

Key improvements:
- No more JSON syntax errors for string inputs
- Professional appearance matching platform standards
- Intuitive workflow with conditional UI elements
- Type-safe unified API structure
- Real-time input updates with proper state management

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Remove reviewData prop from PendingReviewCard usage
- Fix TypeScript error after prop removal
- Component now extracts data directly from review payload

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…nents

Remove unnecessary comments and simplify code across HITL components:
- PendingReviewsList: Remove verbose comments, simplify logic flow
- FloatingReviewsPanel: Remove excessive commenting
- PendingReviewCard: Clean up type guard comments
- usePendingReviews: Remove redundant JSDoc comments

This improves code readability while maintaining all functionality.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@AutoGPT-Agent

This PR implements a well-designed Human In The Loop block with comprehensive integration across the platform. The code quality looks good, and I particularly appreciate the thorough test coverage.

However, before we can merge this PR, please update the description to include the completed checklist from our PR template. This is required for all significant code changes. Please ensure all items in the checklist are checked off to confirm you've verified each requirement.

Once the checklist is added and completed, this PR should be ready for approval.

Contributor

@Abhi1992002 Abhi1992002 Nov 22, 2025


@majdyz We don’t need to make any changes in useFlow or useFlowRealtime — everything is already being saved in the node store automatically.

We just need to create a Review Panel component that imports nodeExecutionResult from the node store and checks whether it needs to be shown based on the PendingReview boolean. Then, inside the same folder, we can use the autogenerated client to fetch the PendingHumanReview payloads using the node execution ID we get from the node store.

So apart from this component, we don't need to change anything in the new builder's code. The most important thing is that the architecture is modular — to add new functionality, we only need to create a new file, and the existing infrastructure provides everything required. Since it uses autogenerated models, there's no need to modify anything outside this file. If backend models change, the node store will store the updated model datatypes automatically.

Outside this new file, you only need to add two lines to add the design for the review status in the custom node.
