Self Checks
- I have read the Contributing Guide and Language Policy.
- This is only for bug reports; if you would like to ask a question, please head to Discussions.
- I have searched for existing issues, including closed ones.
- I confirm that I am using English to submit this report, otherwise it will be closed.
- [Chinese & non-English users] Please submit in English, otherwise the issue will be closed :)
- Please do not modify this template :) and fill in all the required fields.
Dify version
v1.11.4
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
Prerequisites
- A running Dify environment (API + Redis + PostgreSQL/MySQL)
- A Workflow application containing a Trigger Plugin node
- The Workflow has been saved before, with corresponding workflow_plugin_trigger records in the database
Reproduction Steps
Step 1: Get test data
# Find an existing trigger node_id from workflow config or logs
# Assume node_id = "1767869326455"
# Confirm the record exists in database
psql -c "SELECT * FROM workflow_plugin_trigger WHERE node_id = '1767869326455';"Step 2: Simulate data inconsistency state
# 1. Check current cache content (if exists)
redis-cli GET "plugin_trigger_nodes:{app_id}:1767869326455"
# 2. Delete DB record but keep the cache (simulating stale cache after transaction rollback)
psql -c "DELETE FROM workflow_plugin_trigger WHERE node_id = '1767869326455';"
# 3. Manually create a stale cache entry (if cache doesn't exist)
redis-cli SET "plugin_trigger_nodes:{app_id}:1767869326455" '{"record_id":"fake-id-123","node_id":"1767869326455","provider_id":"test/provider","event_name":"TEST_EVENT","subscription_id":"old-subscription-id"}' EX 3600Step 3: Trigger sync operation
# Save the workflow via API or UI
curl -X POST "http://localhost:5001/console/api/apps/{app_id}/workflows/draft" \
-H "Authorization: Bearer {token}" \
-H "Content-Type: application/json" \
-d '{"graph": ...}'Step 4: Observe the issue
Check API logs:
INFO - Found trigger node: node_id=1767869326455, subscription_id=xxx
INFO - Total trigger nodes found in workflow graph: 1
INFO - Creating 0 new trigger records <-- BUG: Found 1 node but creating 0 records
INFO - Sync completed: created=0, updated=0, deleted=0
Step 5: Confirm the bug
# Database still has no record
psql -c "SELECT COUNT(*) FROM workflow_plugin_trigger WHERE node_id = '1767869326455';"
# Returns 0
# But stale cache still exists
redis-cli GET "plugin_trigger_nodes:{app_id}:1767869326455"
# Returns the fake data
✔️ Expected Behavior
- The sync process should create the missing database record regardless of cache state.
- The database should be the source of truth, not Redis cache.
- Stale cache entries should be detected and cleaned before creating new records.
- Cache should only be updated AFTER a successful database commit (see the ordering sketch below).
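A minimal sketch of that ordering, assuming the SQLAlchemy session and redis_client already used by the service. This is an illustration, not the actual implementation: WorkflowPluginTrigger stands in for the ORM model backing the workflow_plugin_trigger table, and the payload fields are assumed from this report.
import json

def create_trigger_record(session, redis_client, app_id, node_info):
    # WorkflowPluginTrigger is assumed to be the ORM model backing
    # the workflow_plugin_trigger table; field names follow this report.
    record = WorkflowPluginTrigger(
        node_id=node_info["node_id"],
        provider_id=node_info["provider_id"],
        event_name=node_info["event_name"],
    )
    session.add(record)
    session.commit()  # commit first: the database is the source of truth
    # Write the cache only after the commit has succeeded, so a failed
    # transaction can never leave a cache entry without a matching DB row.
    redis_client.set(
        f"plugin_trigger_nodes:{app_id}:{node_info['node_id']}",
        json.dumps({"record_id": str(record.id), "node_id": node_info["node_id"]}),
        ex=3600,
    )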
❌ Actual Behavior
- When Redis cache exists but DB record is missing (stale cache), the sync process incorrectly skips record creation.
- The code trusts cache existence as proof of DB record existence, which is incorrect.
- No stale cache detection or cleanup mechanism exists.
- Cache is written BEFORE database commit, causing potential data inconsistency on transaction failure.
Root Cause Analysis
Issue 1: Cache-first decision logic (Lines 208-230)
not_found_in_cache: list[Mapping[str, Any]] = []
for node_info in nodes_in_graph:
    node_id = node_info["node_id"]
    if not redis_client.get(f"{cls.__PLUGIN_TRIGGER_NODE_CACHE_KEY__}:{app.id}:{node_id}"):
        not_found_in_cache.append(node_info)
        continue  # Cache hit = skip, never added to processing list
nodes_not_found = [
    node_info for node_info in not_found_in_cache  # Only processes cache-miss nodes
    if node_info["node_id"] not in nodes_id_in_db
]
Issue 2: Cache written before DB commit (Lines 232-252)
for node_info in nodes_not_found:
    session.flush()
    redis_client.set(...)  # Cache written first
session.commit()  # If this fails, cache is now stale
Issue 3: Distributed lock not properly acquired (Line 217)
redis_client.lock(f"...lock", timeout=10) # Only creates lock object, never calls acquire()How Stale Cache is Produced
- Transaction failure after cache write: the cache is written during flush(), but if commit() fails, the cache becomes stale.
- Distributed lock not effective: the lock object is created but acquire() is never called, allowing race conditions where one thread writes the cache while another rolls back the database.
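For reference, a sketch of how all three issues could be addressed together, reusing the names from the excerpts above. This is a sketch only, not a patch: build_trigger_record is a hypothetical helper, and the cache payload is elided as in the original excerpt. The redis-py lock context manager acquires on entry and releases on exit.
with redis_client.lock(
    f"plugin_trigger_nodes:apps:{app.id}:lock",
    timeout=10,
    blocking_timeout=10,
):  # Issue 3: the context manager actually acquires (and releases) the lock
    nodes_to_create = []
    for node_info in nodes_in_graph:
        node_id = node_info["node_id"]
        cache_key = f"{cls.__PLUGIN_TRIGGER_NODE_CACHE_KEY__}:{app.id}:{node_id}"
        if node_id in nodes_id_in_db:
            continue  # DB row exists; nothing to create
        # Issue 1: the database decides. A cache hit without a DB row is
        # stale, so it is cleaned up before the record is re-created.
        redis_client.delete(cache_key)
        nodes_to_create.append(node_info)
    for node_info in nodes_to_create:
        session.add(build_trigger_record(node_info))  # hypothetical helper
    session.commit()  # Issue 2: commit before any cache write
    for node_info in nodes_to_create:
        redis_client.set(...)  # write cache only after a successful commit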
Additional Context
| Scenario | Current Behavior | Expected Behavior |
|---|---|---|
| Cache exists, DB missing | Skip creation (trust cache) | Clear stale cache + create record |
| DB commit fails | Cache already written (inconsistent) | Cache not written |
| Concurrent sync | Lock ineffective (race condition) | Properly serialized |
Affected File
- File: api/services/trigger/trigger_service.py
- Method: TriggerService.sync_plugin_trigger_relationships()
Related Keys
- Cache Key: plugin_trigger_nodes:{app_id}:{node_id}
- Lock Key: plugin_trigger_nodes:apps:{app_id}:lock
- DB Table: workflow_plugin_trigger
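For triage, a hypothetical diagnostic that scans the cache keys above and flags any entry whose node_id has no row in workflow_plugin_trigger; the connection parameters are placeholders.
import psycopg2
import redis

r = redis.Redis(host="localhost", port=6379)
conn = psycopg2.connect("dbname=dify user=postgres")

with conn.cursor() as cur:
    for key in r.scan_iter("plugin_trigger_nodes:*"):
        name = key.decode()
        if name.endswith(":lock"):
            continue  # skip the lock key, which shares the same prefix
        node_id = name.rsplit(":", 1)[-1]
        cur.execute(
            "SELECT 1 FROM workflow_plugin_trigger WHERE node_id = %s",
            (node_id,),
        )
        if cur.fetchone() is None:
            print(f"stale cache entry: {name} (no matching DB row)")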