Description
This proposal simplifies our history event model by reducing what's captured in ActivityType and EntityType to only high-level entities. It should make the audit log of state transitions easier to work with, especially if we're thinking about user subscriptions to specific event streams.
EntityType Changes
Many of our current EntityType values represent internal structures or relationships that are ultimately tied to a node. For example, AVAILABILITY, BACKFILL, COLUMN_ATTRIBUTE, DEPENDENCY, LINK, MATERIALIZATION, PARTITION, and QUERY are all either metadata on a node or sub-resources owned by a node. If a change originates from or affects a node, we treat it as a NODE event and use metadata to express what specifically changed.
This means we can reduce EntityType to:
from enum import StrEnum  # available in Python 3.11+

class EntityType(StrEnum):
    NODE = "node"
    NAMESPACE = "namespace"
    ENGINE = "engine"
    CATALOG = "catalog"
    TAG = "tag"
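
To illustrate the reduction, here is a minimal sketch of how legacy entity types might be collapsed into the smaller set. The names `NODE_OWNED_LEGACY_TYPES`, `KEPT_ENTITY_TYPES`, and `collapse_entity_type` are hypothetical, and the sketch assumes legacy values are the lowercase form of their enum names. Carrying the legacy name through as the change type is also an assumption, not something the proposal specifies.

```python
from typing import Optional, Tuple

# Hypothetical helper: these names are illustrative and do not exist in the codebase.
# Legacy entity types the proposal describes as node metadata or node-owned sub-resources.
NODE_OWNED_LEGACY_TYPES = {
    "availability", "backfill", "column_attribute", "dependency",
    "link", "materialization", "partition", "query",
}

# Entity types that remain after the proposed reduction.
KEPT_ENTITY_TYPES = {"node", "namespace", "engine", "catalog", "tag"}


def collapse_entity_type(legacy_type: str) -> Tuple[str, Optional[str]]:
    """Return (new_entity_type, change_type) for a legacy entity type."""
    value = legacy_type.lower()
    if value in NODE_OWNED_LEGACY_TYPES:
        # Node-owned types become NODE events; the legacy name is kept
        # as the change type (an assumption made for this sketch).
        return "node", value
    if value in KEPT_ENTITY_TYPES:
        return value, None
    raise ValueError(f"unknown entity type: {legacy_type!r}")
```

For example, `collapse_entity_type("BACKFILL")` yields `("node", "backfill")`, while `collapse_entity_type("namespace")` is returned unchanged with no change type.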
ActivityType Changes
We can replace the current mix of CRUD and specialized ActivityType values with a small set of CRUD-style actions. Specific behaviors can be captured through additional metadata on the event:
class ActivityType(StrEnum):
    CREATE = "create"
    DELETE = "delete"
    RESTORE = "restore"
    UPDATE = "update"
The remaining actions, like TAG and SET_ATTRIBUTE, would all be treated as UPDATE events, with additional metadata stored in details to indicate the specific type of update. For example:
{
  "activity_type": "update", // one of: create, delete, restore, update
  "entity_type": "node", // high-level entity like node, namespace, etc.
  "entity_id": 123,
  "created_at": "2025-04-17T10:00:00Z",
  "user": "djuser",
  "details": {
    "change_type": "set_attribute", // optional; further classifies the update, see below
    "metadata": {
      ... // change-specific fields
    }
  }
}
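
Since one motivation is user subscriptions to specific event streams, the sketch below shows how a consumer might filter events of this shape. The function `matches_subscription` and its parameters are hypothetical. It assumes events arrive as plain dictionaries with the fields shown above, and that `details.change_type` may be absent.

```python
from typing import Any, Dict, Optional


def matches_subscription(
    event: Dict[str, Any],
    entity_type: str,
    activity_type: Optional[str] = None,
    change_type: Optional[str] = None,
) -> bool:
    """Return True if the event falls inside the subscribed stream."""
    if event["entity_type"] != entity_type:
        return False
    if activity_type is not None and event["activity_type"] != activity_type:
        return False
    if change_type is not None:
        # change_type is optional, so an event without it never matches
        # a subscription that asks for a specific change type.
        return event.get("details", {}).get("change_type") == change_type
    return True


# Same shape as the example event, with the elided metadata left empty.
event = {
    "activity_type": "update",
    "entity_type": "node",
    "entity_id": 123,
    "created_at": "2025-04-17T10:00:00Z",
    "user": "djuser",
    "details": {"change_type": "set_attribute", "metadata": {}},
}

all_node_events = matches_subscription(event, "node")
only_backfills = matches_subscription(event, "node", change_type="backfill")
```

With only high-level entity types, a subscription needs at most three fields to describe a stream, rather than one rule per specialized entity or activity type.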
Proposed change_type values, based on what we support today:
| Change Type | Notes |
|---|---|
| TAG | Adding a tag to the node |
| AVAILABILITY | Posting availability for a node |
| BACKFILL | Starting a backfill for a node |
| COLUMN_ATTRIBUTE | Updating a column attribute on the node |
| DEPENDENCY | When an upstream change triggers an update to this node |
| LINK | Changing a dimension link on the node |
| MATERIALIZATION | Adding or updating materialization for the node |
| PARTITION | Setting a partition column on the node |
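
The change types above could be captured in an enum so that invalid values are rejected when an event is built. This is only a sketch: `ChangeType` and `node_update_event` are hypothetical names, the lowercase string values are an assumption modeled on the lowercase `change_type` in the example event, and `(str, Enum)` stands in for `StrEnum` so the snippet also runs on older Python versions.

```python
from enum import Enum
from typing import Any, Dict, Optional


class ChangeType(str, Enum):
    """Hypothetical enum of the proposed change_type values."""
    TAG = "tag"
    AVAILABILITY = "availability"
    BACKFILL = "backfill"
    COLUMN_ATTRIBUTE = "column_attribute"
    DEPENDENCY = "dependency"
    LINK = "link"
    MATERIALIZATION = "materialization"
    PARTITION = "partition"


def node_update_event(
    entity_id: int,
    user: str,
    change_type: str,
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build an UPDATE event on a node; raises ValueError for unknown change types."""
    return {
        "activity_type": "update",
        "entity_type": "node",
        "entity_id": entity_id,
        "user": user,
        # created_at is omitted here; this sketch assumes it is set on write.
        "details": {
            "change_type": ChangeType(change_type).value,
            "metadata": metadata or {},
        },
    }


backfill_event = node_update_event(123, "djuser", "backfill")
```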