[Bug] Redis cache inconsistency prevents plugin trigger records from being created #31130

@qiaofenlin

Description

Self Checks

  • I have read the Contributing Guide and Language Policy.
  • This is only for bug reports; if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report, otherwise it will be closed.
  • [Chinese & Non-English Users] Please submit in English, otherwise the issue will be closed :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

v1.11.4

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Prerequisites

  1. A running Dify environment (API + Redis + PostgreSQL/MySQL)
  2. A Workflow application containing a Trigger Plugin node
  3. The workflow has been saved previously, so corresponding workflow_plugin_trigger records exist in the database

Reproduction Steps

Step 1: Get test data

# Find an existing trigger node_id from workflow config or logs
# Assume node_id = "1767869326455"

# Confirm the record exists in database
psql -c "SELECT * FROM workflow_plugin_trigger WHERE node_id = '1767869326455';"

Step 2: Simulate data inconsistency state

# 1. Check current cache content (if exists)
redis-cli GET "plugin_trigger_nodes:{app_id}:1767869326455"

# 2. Delete DB record but keep the cache (simulating stale cache after transaction rollback)
psql -c "DELETE FROM workflow_plugin_trigger WHERE node_id = '1767869326455';"

# 3. Manually create a stale cache entry (if cache doesn't exist)
redis-cli SET "plugin_trigger_nodes:{app_id}:1767869326455" '{"record_id":"fake-id-123","node_id":"1767869326455","provider_id":"test/provider","event_name":"TEST_EVENT","subscription_id":"old-subscription-id"}' EX 3600

Step 3: Trigger sync operation

# Save the workflow via API or UI
curl -X POST "http://localhost:5001/console/api/apps/{app_id}/workflows/draft" \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{"graph": ...}'

Step 4: Observe the issue

Check API logs:

INFO - Found trigger node: node_id=1767869326455, subscription_id=xxx
INFO - Total trigger nodes found in workflow graph: 1
INFO - Creating 0 new trigger records    <-- BUG: Found 1 node but creating 0 records
INFO - Sync completed: created=0, updated=0, deleted=0

Step 5: Confirm the bug

# Database still has no record
psql -c "SELECT COUNT(*) FROM workflow_plugin_trigger WHERE node_id = '1767869326455';"
# Returns 0

# But stale cache still exists
redis-cli GET "plugin_trigger_nodes:{app_id}:1767869326455"
# Returns the fake data

✔️ Expected Behavior

  1. The sync process should create the missing database record regardless of cache state.
  2. The database should be the source of truth, not Redis cache.
  3. Stale cache entries should be detected and cleaned before creating new records.
  4. Cache should only be updated AFTER successful database commit.

❌ Actual Behavior

  1. When Redis cache exists but DB record is missing (stale cache), the sync process incorrectly skips record creation.
  2. The code trusts cache existence as proof of DB record existence, which is incorrect.
  3. No stale cache detection or cleanup mechanism exists.
  4. Cache is written BEFORE database commit, causing potential data inconsistency on transaction failure.

Root Cause Analysis

Issue 1: Cache-first decision logic (Lines 208-230)

not_found_in_cache: list[Mapping[str, Any]] = []
for node_info in nodes_in_graph:
    node_id = node_info["node_id"]
    if not redis_client.get(f"{cls.__PLUGIN_TRIGGER_NODE_CACHE_KEY__}:{app.id}:{node_id}"):
        not_found_in_cache.append(node_info)
        continue
    # Cache hit: the node is skipped here and never reaches the processing list

nodes_not_found = [
    node_info for node_info in not_found_in_cache  # only cache-miss nodes are ever checked against the DB
    if node_info["node_id"] not in nodes_id_in_db
]
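
A minimal sketch of the DB-first decision logic the expected behavior describes, reusing the identifiers from the excerpt above (this is an illustration of the proposed fix, not code from the repository): the DB membership check runs for every node, and a cache hit without a matching DB record is treated as stale and deleted.

# Sketch of a DB-first check: the cache never decides whether a record
# exists; it is only cleaned up when it disagrees with the database.
nodes_not_found: list[Mapping[str, Any]] = []
for node_info in nodes_in_graph:
    node_id = node_info["node_id"]
    cache_key = f"{cls.__PLUGIN_TRIGGER_NODE_CACHE_KEY__}:{app.id}:{node_id}"
    if node_id in nodes_id_in_db:
        continue  # the DB record exists; nothing to create
    if redis_client.get(cache_key):
        redis_client.delete(cache_key)  # cache hit without a DB record: stale, remove it
    nodes_not_found.append(node_info)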

Issue 2: Cache written before DB commit (Lines 232-252)

for node_info in nodes_not_found:
    session.flush()
    redis_client.set(...)  # Cache written first
session.commit()           # If this fails, cache is now stale
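
A sketch of the corrected ordering, again reusing the surrounding names; build_trigger_record, cache_key_for, and serialize are hypothetical helpers standing in for whatever the service actually does:

for node_info in nodes_not_found:
    session.add(build_trigger_record(node_info))  # hypothetical: construct the ORM row
session.commit()  # if this raises, no cache entry has been written yet
for node_info in nodes_not_found:
    # write-through only after the transaction is durable
    redis_client.set(cache_key_for(node_info), serialize(node_info), ex=3600)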

Issue 3: Distributed lock not properly acquired (Line 217)

redis_client.lock(f"...lock", timeout=10)  # Only creates lock object, never calls acquire()
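
For contrast, a sketch of how a redis-py lock is actually held: either call acquire() explicitly, or use the lock as a context manager, which acquires on entry and releases on exit (perform_sync is a hypothetical stand-in for the guarded critical section):

lock = redis_client.lock(f"plugin_trigger_nodes:apps:{app.id}:lock", timeout=10)
with lock:  # blocks until the lock is acquired; released automatically on exit
    perform_sync()  # hypothetical: the per-app sync work that must be serialized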

How Stale Cache is Produced

  1. Transaction failure after cache write: The cache entry is written right after flush() but before commit(); if commit() fails, the entry becomes stale.
  2. Distributed lock not effective: The lock object is created but acquire() is never called, allowing race conditions where one thread writes cache while another rolls back the database.

Additional Context

Scenario                 | Current Behavior                      | Expected Behavior
Cache exists, DB missing | Skip creation (trust cache)           | Clear stale cache + create record
DB commit fails          | Cache already written (inconsistent)  | Cache not written
Concurrent sync          | Lock ineffective (race condition)     | Properly serialized

Affected File

  • api/services/trigger/trigger_service.py
  • Method: TriggerService.sync_plugin_trigger_relationships()

Related Keys

  • Cache Key: plugin_trigger_nodes:{app_id}:{node_id}
  • Lock Key: plugin_trigger_nodes:apps:{app_id}:lock
  • DB Table: workflow_plugin_trigger
