Feature: Generate Action Graph from Interaction Log + Scene Snapshots #10

Open
@abrichr

Summary

Implement a module that constructs an Action Graph from an interaction log and corresponding scene snapshots. The Action Graph models UI states as nodes and UI actions as edges, capturing transitions between visual states triggered by user or agent interactions.

The module supports both real and synthetic data sources.


Motivation

  • Provides a unified, structured representation of recorded UI behavior over time.
  • Enables downstream planning, summarization, visualization, and analysis.
  • Forms the backbone of OmniMCP’s process abstraction stack:
    Parser → Segments → Tracks → Scene Graph → Action Graph → Plan → Actions → API
  • Can optionally use an Interaction Log (real or synthetic) to help derive the Action Graph.
  • Can later be converted into symbolic process logs for use with PM4Py or other process mining tools.

Diagram

graph TD
  Parser --> Segments
  Segments --> Tracks
  Tracks --> SceneGraph
  SceneGraph --> ActionGraph
  InteractionLog -.-> ActionGraph
  ActionGraph --> Plan
  Plan --> Actions
  Actions --> API

  InteractionLog[Interaction Log]

Scope

Inputs

  • interaction_log (optional): List of structured user/agent interactions (click, type, scroll, etc.), each with:
    • timestamp or step
    • action_type
    • element_id or selector
    • (optional) element_description, bounding_box, value
  • scene_snapshots: List of scene graph snapshots (UI state summaries or raw graph objects), aligned with interaction steps.
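The two inputs above could be typed roughly as follows. This is a minimal sketch, not a settled schema: the class names `Interaction` and `SceneSnapshot`, the integer `step`, and the `(x, y, width, height)` layout of `bounding_box` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class Interaction:
    """One logged user/agent interaction; field names follow the Inputs list."""
    step: int                                   # "timestamp or step"; an int step is assumed
    action_type: str                            # e.g. "click", "type", "scroll"
    element_id: Optional[str] = None            # element_id or selector
    element_description: Optional[str] = None   # optional
    # Assumed layout: (x, y, width, height); the issue does not specify one.
    bounding_box: Optional[Tuple[float, float, float, float]] = None
    value: Optional[Any] = None                 # optional, e.g. typed text


@dataclass(frozen=True)
class SceneSnapshot:
    """Hypothetical wrapper for one scene graph snapshot, keyed by step."""
    step: int
    description: str            # UI state summary
    raw: Optional[Any] = None   # raw scene graph object, if available


click = Interaction(step=2, action_type="click", element_id="login_button")
snapshot = SceneSnapshot(step=2, description="Login page with both fields filled")
```

Sharing a `step` key is one simple way to keep snapshots aligned with interactions; a timestamp-based alignment would need a tolerance window instead.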

Outputs

  • action_graph: A JSON or in-memory object with:
    • nodes: One per unique UI state (identified, e.g., by a hash or a semantic description of the scene)
    • edges: One per interaction, with:
      • source_node_id
      • target_node_id
      • action_type, element_id, timestamp
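One possible in-memory shape for this output, with JSON export, is sketched below. The class names `Node`, `Edge`, and `ActionGraph` and the `to_json` method are hypothetical; only the field names come from the list above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Node:
    id: str
    description: str  # hash or semantic description of the UI state


@dataclass
class Edge:
    source_node_id: str
    target_node_id: str
    action_type: str
    element_id: Optional[str] = None
    timestamp: Optional[float] = None


@dataclass
class ActionGraph:
    nodes: List[Node] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)

    def to_json(self, indent: int = 2) -> str:
        # asdict() recurses into the nested Node and Edge dataclasses.
        return json.dumps(asdict(self), indent=indent)


graph = ActionGraph(
    nodes=[
        Node("n0", "Login page with empty fields"),
        Node("n1", "Login page with email filled"),
    ],
    edges=[Edge("n0", "n1", "type", element_id="email", timestamp=0)],
)
print(graph.to_json())
```

Plain dataclasses keep the model free of third-party dependencies; a graph library could be layered on top later if traversal or layout is needed.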

Features

  • Node deduplication: similar scene snapshots map to the same node
  • Edge labeling with action metadata
  • Optional use of interaction log for state transition alignment
  • Support for synthetic logs to bootstrap development and testing
  • Easy export to JSON for visualization/debugging
  • Integration-ready for prompt-based planners and optional PM4Py pipeline
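Node deduplication could start as simply as the sketch below, which hashes a normalized scene description. Note that this is exact matching after normalization (case, punctuation, whitespace), not a true fuzzy hash; `scene_key` and `NodeRegistry` are hypothetical names.

```python
import hashlib
import re
from typing import Dict


def scene_key(description: str) -> str:
    """Lowercase, collapse punctuation and whitespace, then hash the result."""
    normalized = re.sub(r"[^a-z0-9]+", " ", description.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class NodeRegistry:
    """Assigns one stable node id per distinct normalized scene description."""

    def __init__(self) -> None:
        self._ids: Dict[str, str] = {}

    def node_id_for(self, description: str) -> str:
        key = scene_key(description)
        if key not in self._ids:
            self._ids[key] = f"n{len(self._ids)}"
        return self._ids[key]


registry = NodeRegistry()
a = registry.node_id_for("Login page with empty fields")
b = registry.node_id_for("  login page, with EMPTY fields. ")
c = registry.node_id_for("Dashboard with welcome message")
print(a, b, c)  # a and b share an id; c gets a new one
```

Descriptions that differ in wording rather than formatting would still produce separate nodes here, so a similarity-based key would be needed to merge them.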

Example

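Given the following interaction log, in which each entry's scene describes the UI state before the action is applied: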

[
  { "step": 0, "action": "type", "element_id": "email", "value": "[email protected]", "scene": "Login page with empty fields" },
  { "step": 1, "action": "type", "element_id": "password", "value": "hunter2", "scene": "Login page with email filled" },
  { "step": 2, "action": "click", "element_id": "login_button", "scene": "Login page with both fields filled" },
  { "step": 3, "action": "wait", "duration": 2, "scene": "Dashboard with welcome message" }
]

The resulting Action Graph is shown below. The final wait step adds no edge because no later snapshot follows it:

{
  "nodes": [
    { "id": "n0", "description": "Login page with empty fields" },
    { "id": "n1", "description": "Login page with email filled" },
    { "id": "n2", "description": "Login page with both fields filled" },
    { "id": "n3", "description": "Dashboard with welcome message" }
  ],
  "edges": [
    { "source": "n0", "target": "n1", "action": "type", "element": "email", "step": 0 },
    { "source": "n1", "target": "n2", "action": "type", "element": "password", "step": 1 },
    { "source": "n2", "target": "n3", "action": "click", "element": "login_button", "step": 2 }
  ]
}
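A minimal sketch of the construction that would produce this graph from the log above: it assumes each entry's `scene` is the state before its action, so each edge links an entry's scene to the next entry's scene. The function name `build_action_graph` is hypothetical, nodes are deduplicated by exact description match only, and the edge keys follow this example's short names (`source`, `target`, `action`, `element`, `step`) rather than the longer ones listed under Outputs.

```python
from typing import Any, Dict, List


def build_action_graph(log: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    node_ids: Dict[str, str] = {}  # scene description -> node id
    nodes: List[Dict[str, Any]] = []
    edges: List[Dict[str, Any]] = []

    # First pass: one node per distinct scene, in order of first appearance.
    for entry in log:
        scene = entry["scene"]
        if scene not in node_ids:
            node_ids[scene] = f"n{len(nodes)}"
            nodes.append({"id": node_ids[scene], "description": scene})

    # Second pass: one edge per action that has a following snapshot.
    for current, following in zip(log, log[1:]):
        edges.append({
            "source": node_ids[current["scene"]],
            "target": node_ids[following["scene"]],
            "action": current["action"],
            "element": current.get("element_id"),
            "step": current["step"],
        })
    return {"nodes": nodes, "edges": edges}


log = [
    # "value" is a placeholder here; it is not used when building the graph.
    {"step": 0, "action": "type", "element_id": "email", "value": "user@example.com",
     "scene": "Login page with empty fields"},
    {"step": 1, "action": "type", "element_id": "password", "value": "hunter2",
     "scene": "Login page with email filled"},
    {"step": 2, "action": "click", "element_id": "login_button",
     "scene": "Login page with both fields filled"},
    {"step": 3, "action": "wait", "duration": 2,
     "scene": "Dashboard with welcome message"},
]
graph = build_action_graph(log)
```

Because nodes are keyed by scene, a log that revisits a state yields an edge back to the existing node, which is how loops would appear in the graph.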

Tasks

  • Define ActionGraph data model (nodes, edges)
  • Implement graph construction logic
  • Handle node deduplication (exact match or fuzzy hash of scene descriptions)
  • Add support for synthetic log generation (for testing)
  • Add JSON export + optional visualization hooks
  • Unit tests with synthetic and real logs
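For the synthetic log task, one option is a seeded random walk over a small hand-written transition table, as sketched here. The `TRANSITIONS` table, its scene and element names, and `generate_synthetic_log` are all made up for illustration; the emitted entries use the same keys as the Example section.

```python
import random
from typing import Any, Dict, List, Tuple

# Hypothetical toy UI: scene -> list of (action, element_id, next scene).
TRANSITIONS: Dict[str, List[Tuple[str, str, str]]] = {
    "Login page": [
        ("type", "email", "Login page"),
        ("click", "login_button", "Dashboard"),
    ],
    "Dashboard": [
        ("click", "settings_link", "Settings"),
        ("click", "logout_button", "Login page"),
    ],
    "Settings": [("click", "back_button", "Dashboard")],
}


def generate_synthetic_log(
    num_steps: int, seed: int = 0, start: str = "Login page"
) -> List[Dict[str, Any]]:
    """Random walk over TRANSITIONS; the same seed yields the same log."""
    rng = random.Random(seed)
    scene = start
    log: List[Dict[str, Any]] = []
    for step in range(num_steps):
        action, element_id, next_scene = rng.choice(TRANSITIONS[scene])
        # "scene" records the state before the action, as in the Example section.
        log.append({"step": step, "action": action,
                    "element_id": element_id, "scene": scene})
        scene = next_scene
    return log


synthetic_log = generate_synthetic_log(10, seed=42)
```

A fixed seed keeps unit tests reproducible, and the table's self-loop and back edges exercise node deduplication and repeated states.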

Notes

  • Later extensions may include loops, branches, and hierarchical grouping to derive high-level process graphs.
  • LLM prompting may be used to generate symbolic descriptions for nodes or edge annotations.
  • Should remain model-agnostic; planner integration will be handled in downstream stages.
