History Event Changes #1368

@shangyian

This proposal simplifies our history event model by reducing what's captured in ActivityType and EntityType to only high-level entities. It should make the audit log of state transitions easier to work with, especially if we're thinking about user subscriptions to specific event streams.

EntityType Changes

Many of our current EntityType values represent internal structures or relationships that are ultimately tied to a node. For example, AVAILABILITY, BACKFILL, COLUMN_ATTRIBUTE, DEPENDENCY, LINK, MATERIALIZATION, PARTITION, and QUERY are all either metadata on a node or sub-resources owned by a node. If a change originates from or affects a node, we treat it as a NODE event and use metadata to express what specifically changed.

This means we can reduce EntityType to:

from enum import StrEnum  # StrEnum requires Python 3.11+

class EntityType(StrEnum):
    NODE = "node"
    NAMESPACE = "namespace"
    ENGINE = "engine"
    CATALOG = "catalog"
    TAG = "tag"
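To make the collapse concrete, here is a rough sketch of how legacy entity types could be mapped onto the reduced enum. The helper name `collapse_entity_type`, the lowercase legacy string values, and the idea of returning the legacy name as a hint are all assumptions for illustration, not part of the current codebase. Note that QUERY has no corresponding entry in the proposed change_type list, so the hint is only a starting point.

```python
from enum import Enum
from typing import Optional, Tuple


class EntityType(str, Enum):
    """Reduced entity types from the proposal.

    A str-mixin Enum stands in for StrEnum so the sketch also runs on Python < 3.11.
    """
    NODE = "node"
    NAMESPACE = "namespace"
    ENGINE = "engine"
    CATALOG = "catalog"
    TAG = "tag"


# Legacy values the proposal describes as node metadata or node-owned sub-resources.
# Lowercase string values are an assumption.
NODE_OWNED_LEGACY_TYPES = {
    "availability", "backfill", "column_attribute", "dependency",
    "link", "materialization", "partition", "query",
}


def collapse_entity_type(legacy_value: str) -> Tuple[EntityType, Optional[str]]:
    """Hypothetical helper: map a legacy entity type to (new entity type, hint)."""
    value = legacy_value.lower()
    if value in NODE_OWNED_LEGACY_TYPES:
        # Node-owned resources become NODE events; the legacy name is kept as a hint
        # for what should go into the event details.
        return EntityType.NODE, value
    # Everything else is expected to already be a high-level entity.
    return EntityType(value), None
```

For example, `collapse_entity_type("MATERIALIZATION")` yields `(EntityType.NODE, "materialization")`, while `collapse_entity_type("namespace")` yields `(EntityType.NAMESPACE, None)`.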

ActivityType Changes

We can replace the current mix of CRUD and specialized ActivityType values with a small set of CRUD-style actions. Specific behaviors can be captured through additional metadata on the event:

class ActivityType(StrEnum):
    CREATE = "create"
    DELETE = "delete"
    RESTORE = "restore"
    UPDATE = "update"

The remaining actions, such as TAG and SET_ATTRIBUTE, would all be treated as UPDATE events, with additional metadata stored in details to indicate the specific type of update. For example:

{
  "activity_type": "update",         // one of: create, delete, restore, update
  "entity_type": "node",             // high-level entity like node, namespace, etc.
  "entity_id": 123,
  "created_at": "2025-04-17T10:00:00Z",
  "user": "djuser",
  "details": {
    "change_type": "set_attribute",  // optional; further classifies the update, see below
    "metadata": {
      ... // change-specific fields
    }
  }
}
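A minimal sketch of how an event in this shape might be assembled on the server side. The function name `make_update_event`, its parameters, and the `column` metadata field are hypothetical; only the top-level keys and the `details.change_type` / `details.metadata` layout come from the example above.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def make_update_event(
    entity_type: str,
    entity_id: int,
    user: str,
    change_type: Optional[str] = None,
    metadata: Optional[Dict[str, Any]] = None,
    created_at: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build an UPDATE event in the proposed shape."""
    details: Dict[str, Any] = {}
    if change_type is not None:
        # change_type is optional; it only narrows down what kind of update this was.
        details["change_type"] = change_type
    if metadata:
        details["metadata"] = metadata
    if created_at is None:
        # Default to the current UTC time in the same format as the example payload.
        created_at = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "activity_type": "update",
        "entity_type": entity_type,
        "entity_id": entity_id,
        "created_at": created_at,
        "user": user,
        "details": details,
    }


event = make_update_event(
    "node", 123, "djuser",
    change_type="set_attribute",
    metadata={"column": "user_id"},  # hypothetical change-specific field
    created_at="2025-04-17T10:00:00Z",
)
```

Keeping `change_type` out of the top level means consumers that only care about CRUD-level activity never need to look inside `details`.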

Proposed change_type values, based on what we support today:

| Change Type | Notes |
| --- | --- |
| TAG | Adding a tag to the node |
| AVAILABILITY | Posting availability for a node |
| BACKFILL | Starting a backfill for a node |
| COLUMN_ATTRIBUTE | Updating a column attribute on the node |
| DEPENDENCY | When an upstream change triggers an update to this node |
| LINK | Changing a dimension link on the node |
| MATERIALIZATION | Adding or updating materialization for the node |
| PARTITION | Setting a partition column on the node |
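As a sketch of how these values could support subscriptions to specific event streams, the change types can be modeled as an enum and used in a simple filter. The `ChangeType` class name, the lowercase string values, and the `matches_subscription` function are assumptions for illustration; nothing here reflects an existing API.

```python
from enum import Enum
from typing import Any, Dict, Iterable, Optional


class ChangeType(str, Enum):
    """Proposed change_type values (lowercase string values are an assumption)."""
    TAG = "tag"
    AVAILABILITY = "availability"
    BACKFILL = "backfill"
    COLUMN_ATTRIBUTE = "column_attribute"
    DEPENDENCY = "dependency"
    LINK = "link"
    MATERIALIZATION = "materialization"
    PARTITION = "partition"


def matches_subscription(
    event: Dict[str, Any],
    entity_type: str,
    change_types: Optional[Iterable[ChangeType]] = None,
) -> bool:
    """Hypothetical filter: does this event belong to a subscriber's stream?"""
    if event.get("entity_type") != entity_type:
        return False
    if change_types is None:
        # No change_type filter: the subscriber wants every event for this entity type.
        return True
    wanted = {ct.value for ct in change_types}
    return event.get("details", {}).get("change_type") in wanted


events = [
    {"activity_type": "update", "entity_type": "node",
     "details": {"change_type": "materialization"}},
    {"activity_type": "update", "entity_type": "node",
     "details": {"change_type": "tag"}},
    {"activity_type": "create", "entity_type": "namespace", "details": {}},
]

# A subscriber interested only in materialization and backfill changes on nodes.
stream = [
    e for e in events
    if matches_subscription(e, "node", [ChangeType.MATERIALIZATION, ChangeType.BACKFILL])
]
```

With only five entity types and four activity types, a filter like this stays small; the finer-grained distinctions live entirely in `details.change_type`.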
