revert(Session): remove database_ref_ to fix Alpine/musl crashes

Revert Session class to the simple design from b5835fa (Dec 2025),
removing the Napi::ObjectReference database_ref_ that was added in
commit a151cb6 and subsequent "fixes" (fb283df through 611d330).

Root cause: N-API reference manipulation during GC finalization corrupts
V8's JIT page allocations on Alpine/musl. The database_ref_ field caused
napi_delete_reference to be called from Session destructors during GC,
which is documented as unsafe in nodejs/node-addon-api#660.

The simple design works because:
- Session uses raw DatabaseSync* pointer (no ObjectReference)
- db.close() calls DeleteAllSessions() which cleans up SQLite sessions
- Session operations check database_->IsOpen() before proceeding
- This matches both Node.js upstream (uses weak references) and
  better-sqlite3 (uses raw pointers)

Validated: 793 tests pass locally, 5 consecutive runs pass on Alpine.
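
The ownership rules listed in the commit message can be modeled without N-API or SQLite. The following is a minimal sketch under stated assumptions, not the project's code: `Database`, `Session`, `Invalidate()` and `ChangesetSize()` are hypothetical stand-ins, and a plain `bool` models whether the SQLite session handle is still set.

```cpp
#include <cassert>
#include <stdexcept>
#include <unordered_set>

class Database;

// Hypothetical stand-in for the Session wrapper: it keeps a raw,
// non-owning pointer to its database and no reference object.
class Session {
 public:
  explicit Session(Database* db);
  ~Session();

  // Stand-in for a real session operation. Throws if the session was
  // already deleted or the database is closed.
  int ChangesetSize() const;

  // Called from Database::DeleteAllSessions(); models dropping the
  // underlying SQLite session handle.
  void Invalidate() { active_ = false; }
  bool IsActive() const { return active_; }

 private:
  Database* database_ = nullptr;  // raw pointer, non-owning
  bool active_ = true;            // models `session_ != nullptr`
};

// Hypothetical stand-in for the database wrapper.
class Database {
 public:
  bool IsOpen() const { return open_; }
  void AddSession(Session* s) { sessions_.insert(s); }
  void RemoveSession(Session* s) { sessions_.erase(s); }

  // Invalidate every tracked session, then forget them all.
  void DeleteAllSessions() {
    for (Session* s : sessions_) s->Invalidate();
    sessions_.clear();
  }

  void Close() {
    DeleteAllSessions();
    open_ = false;
  }

 private:
  bool open_ = true;
  std::unordered_set<Session*> sessions_;
};

inline Session::Session(Database* db) : database_(db) {
  database_->AddSession(this);
}

// The destructor only unregisters a still-active session. After
// Close() it never touches the database pointer again.
inline Session::~Session() {
  if (active_) database_->RemoveSession(this);
}

inline int Session::ChangesetSize() const {
  // Check the session's own state first so that an invalidated
  // session never dereferences database_.
  if (!active_ || !database_->IsOpen()) {
    throw std::runtime_error("database is not open");
  }
  return 0;  // placeholder result for the sketch
}
```

In this model a session created against an open database works until `Close()` is called, after which every operation throws instead of touching freed state. As in the commit message, the sketch does not cover a database that is destroyed without `Close()` while sessions are still alive.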
 - **Key Constraints**: Must identify root cause (memory corruption, race condition, or CI environment issue)
 - **Success Validation**: 10 consecutive CI runs without native crashes

-## Current Status: FINAL FIX IMPLEMENTED
+## Current Status: ROOT CAUSE IDENTIFIED - REVERT REQUIRED

-**Five distinct root causes found and fixed.** Testing shows no crashes in local Alpine Docker tests.
+**The "fixes" since commit `a151cb6` have made things WORSE.** The root cause is adding `Napi::ObjectReference database_ref_` to Session class. This causes N-API reference manipulation during GC finalization, which corrupts V8 on Alpine/musl.

-### Evidence of Original Flakiness
-
-Same commit (`80fa40d`) showed different failures across runs:
 **Key insight**: Calling `Napi::ObjectReference::Reset()` during GC finalization (destructor path) is NOT safe on Alpine/musl. Even the ObjectReference destructor calling `napi_delete_reference` can cause issues.

-**The bug**: When `db.close()` calls `DeleteAllSessions()`, it directly sets `session->session_ = nullptr`, causing `Session::Delete()` to return early and never call `database_ref_.Reset()`.
+### 4. The Stable Version Had No `database_ref_`

-**Fix**: Call `database_ref_.Reset()` in `DeleteAllSessions()` after cleaning up each session.
+At commit `50354e1` (stable), Session class was simple:
+```cpp
+class Session : public Napi::ObjectWrap<Session> {
+  sqlite3_session *session_ = nullptr;
+  DatabaseSync *database_ = nullptr; // Just a raw pointer, NO ObjectReference
+};
+```

-**Commit**: `fb283df`
+---

-### Bug 2c: Mutex Deadlock Causing SIGSEGV
+## Timeline of Changes

-**The bug**: `DeleteAllSessions()` held `sessions_mutex_` while calling `database_ref_.Reset()`. Reset can trigger GC, which finalizes other Session objects, which call `Delete()` → `RemoveSession()` → tries to lock already-held mutex → **undefined behavior**.
+| Date | Commit | Description | CI Status |
+|------|--------|-------------|-----------|
+| Dec 17 | `50354e1` | Stable release 0.3.0 | ✅ Green |
+| Jan 12 | - | Last confirmed green CI | ✅ Green |
+| Jan 26 | `a151cb6` | Added `database_ref_` to Session | ❌ Crashes started |
+| Jan 26 | `fb283df` | Release database_ref_ in DeleteAllSessions | ❌ Still crashing |
+| Jan 27 | `dadbb86` | Release mutex before GC operations | ❌ Still crashing |
+| Jan 27 | `3a6aaff` | Prevent double-free in DeleteAllSessions | ❌ Still crashing |
+| Jan 27 | `e1a3fcb` | Prevent dangling pointers | ❌ Still crashing |
+| Jan 29 | `413c93f` | Hold refs during cleanup | ❌ Still crashing |
+| Jan 29 | `611d330` | Skip Reset() in destructor | ❌ Still crashing |

-**Fix**: Release mutex before the cleanup loop.
+**Every "fix" commit has failed to resolve the issue because they all keep `database_ref_`.**

-**Commit**: `dadbb86`
+---

-### Bug 2d: Dangling Pointers During Iteration
+## Recommended Fix: Revert to Simple Design

-**The bug**: Calling `database_ref_.Reset()` during iteration could trigger GC, which may finalize Session JS objects still in the iteration list, creating dangling pointers.
+### Option 1: Full Revert (Recommended)

-**Initial fix attempt**: Removed `Reset()` calls entirely from `DeleteAllSessions()`.
+Revert Session-related changes back to `b5835fa` (before `a151cb6`):
-### Bug 2e: V8 JIT Corruption on Alpine During Jest Cleanup
+The simple design:
+- Session uses raw `DatabaseSync*` pointer (no ObjectReference)
+- `DeleteAllSessions()` just cleans up SQLite sessions
+- Session::Delete() removes from database's session list
+- No N-API reference manipulation in destructors

-**The bug**: When Sessions are GC'd during Jest cleanup (process exit), their `Napi::ObjectReference` destructors call `Reset()`. On Alpine/musl, this corrupts V8's JIT page allocations.
+### Option 2: Match Node.js Upstream Pattern

-**How it manifested**:
+If we want Session to survive database GC, implement weak reference pattern:
+1. Don't hold strong reference to database
+2. Check if database is still valid before operations
-4. V8 JIT corruption: `Check failed: it != jit_page_->allocations_.end()`
-5. **SIGSEGV** or **SIGABRT**
+### Why Raw Pointers Are OK

-**Partial Fix**: Hold strong references to Session JS objects during `DeleteAllSessions()` cleanup, preventing GC from destroying them while we iterate.
+The stable version used raw pointers and worked because:
+1. `db.close()` calls `DeleteAllSessions()` which cleans up SQLite sessions
+2. Session operations check `database_->IsOpen()` before proceeding
+3. If user GC's database without calling close() while holding sessions, that's undefined behavior (same as better-sqlite3)
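
For contrast with the raw-pointer rules above, the weak-reference idea from Option 2 can be illustrated in plain C++. This is only an analogy under stated assumptions: it uses `std::weak_ptr` rather than any N-API weak reference, and `Database`, `WeakSession` and `CanOperate()` are hypothetical names, not part of the project or of Node.js.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the database wrapper.
class Database {
 public:
  bool IsOpen() const { return open_; }
  void Close() { open_ = false; }

 private:
  bool open_ = true;
};

// Hypothetical session that observes its database without owning it.
class WeakSession {
 public:
  explicit WeakSession(const std::shared_ptr<Database>& db) : database_(db) {}

  // Step 1: no strong reference is stored, so the session never keeps
  // the database alive.
  // Step 2: every operation first checks that the database still
  // exists and is still open.
  bool CanOperate() const {
    std::shared_ptr<Database> db = database_.lock();
    return db != nullptr && db->IsOpen();
  }

 private:
  std::weak_ptr<Database> database_;
};
```

Unlike the raw-pointer design, a session in this model can detect that its database was destroyed without `Close()`, at the cost of an extra liveness check on every operation.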
-**The bug**: Even with the `DeleteAllSessions()` fix, crashes still occurred when Sessions were GC'd without an explicit `db.close()` call. The root cause: `Session::Delete()` was calling `database_ref_.Reset()` during GC finalization, which corrupts V8's JIT on Alpine/musl.
+All these changes should be reverted:

-**Key Insight**: The `Napi::ObjectReference` destructor is designed to be GC-safe. But **explicitly** calling `Reset()` during GC finalization is NOT safe on Alpine/musl.
+**src/sqlite_impl.h:**
+- Added `Napi::ObjectReference database_ref_;` to Session class
+- Added `bool in_destructor_ = false;` flag

-**The Fix**: Add an `in_destructor_` flag to Session. When `Delete()` is called from the destructor, skip the explicit `Reset()` call and let the ObjectReference destructor handle cleanup naturally.
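
The re-entrant locking hazard described under Bug 2c in the removed text (cleanup running while `sessions_mutex_` is held, then calling back into `RemoveSession()`) can be modeled in isolation. The sketch below is an assumption-laden illustration, not the reverted code: `SessionRegistry`, its integer ids, and the `cleanup` callback are hypothetical, and it only shows the "release the mutex before the cleanup loop" idea. It does not address the crashes this change attributes to `database_ref_`.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical registry of session ids guarded by a non-recursive mutex.
class SessionRegistry {
 public:
  void Add(int id) {
    std::lock_guard<std::mutex> lock(mutex_);
    ids_.push_back(id);
  }

  void Remove(int id) {
    std::lock_guard<std::mutex> lock(mutex_);
    for (auto it = ids_.begin(); it != ids_.end(); ++it) {
      if (*it == id) {
        ids_.erase(it);
        return;
      }
    }
  }

  // `cleanup` may call back into Remove(). Locking a std::mutex that
  // the same thread already holds is undefined behavior, so the list
  // is moved out under the lock and cleanup runs after the lock is
  // released.
  template <typename Cleanup>
  std::size_t DeleteAll(Cleanup cleanup) {
    std::vector<int> snapshot;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      snapshot.swap(ids_);
    }  // mutex released here, before any callback runs
    for (int id : snapshot) cleanup(id);
    return snapshot.size();
  }

  std::size_t Size() {
    std::lock_guard<std::mutex> lock(mutex_);
    return ids_.size();
  }

 private:
  std::mutex mutex_;
  std::vector<int> ids_;
};
```

Because the shared list is already empty when the callbacks run, a re-entrant `Remove()` finds nothing to erase and returns without contending for a lock the caller still holds.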