System info:
- OS: Fedora Linux 42
- CPU: 12th Gen Intel® Core™ i5-12400F (12)
- Node: 24.11.1
- @libsql/client: 0.15.15
Steps to reproduce:
- Create the following table:

```sql
create table if not exists user_indexed (
  id integer primary key autoincrement,
  first_name text not null,
  last_name text not null,
  email text not null,
  birthday text,
  is_pro integer not null default 0,
  deleted_at text,
  created_at text default CURRENT_TIMESTAMP,
  updated_at text
) strict;
```

- Run the following seed script (`@faker-js/faker` is used to generate random data):
```js
import { memoryUsage } from 'node:process';
import { createClient } from '@libsql/client';
import { faker } from '@faker-js/faker';

const MEMORY_LIMIT = 2_147_483_648; // 2GB
const USER_COUNT = 500_000;
const BATCH_SIZE = 20_000;

const db = createClient({
  url: 'file:db.sqlite'
});

// set pragmas
db.execute('pragma journal_mode = wal;');

// generate user data
const users = [];
for (let i = 0; i < USER_COUNT; i++) {
  const firstName = faker.person.firstName();
  const lastName = faker.person.lastName();
  const email = faker.internet.email({ firstName, lastName }).toLowerCase();
  const birthday = faker.date.birthdate().toISOString();
  const is_pro = faker.number.int({ min: 0, max: 1 });
  const today = (new Date()).toISOString();
  const created_at = faker.date.between({ from: '2010-01-01', to: today }).toISOString();
  const updated_at = faker.date.between({ from: created_at, to: today }).toISOString();
  const is_deleted = faker.datatype.boolean();
  const deleted_at = is_deleted ? faker.date.between({ from: updated_at, to: today }).toISOString() : null;
  users.push({
    firstName,
    lastName,
    email,
    birthday,
    is_pro,
    deleted_at,
    created_at,
    updated_at
  });
}

// create insert statements
const stmts = [];
for (const user of users) {
  stmts.push({
    sql: `
      insert into user_indexed (
        first_name,
        last_name,
        email,
        birthday,
        is_pro,
        deleted_at,
        created_at,
        updated_at
      ) values (
        :firstName,
        :lastName,
        :email,
        :birthday,
        :is_pro,
        :deleted_at,
        :created_at,
        :updated_at
      )
    `,
    args: user
  });
}

try {
  console.time('seeding');
  let count = 0;
  while (stmts.length > 0) {
    const memUsageInBytes = memoryUsage.rss();
    console.log(`memory usage: ${formatBytes(memUsageInBytes)}`);
    if (memUsageInBytes >= MEMORY_LIMIT) {
      console.error('exceeded memory limit. exiting');
      process.exit(1);
    }
    const batch = stmts.splice(0, BATCH_SIZE);
    await db.batch(batch);
    count += BATCH_SIZE;
    console.log(`seeded ${count} rows...`);
  }
  console.log('DB has been seeded');
  console.timeEnd('seeding');
} catch (e) {
  throw e;
}

function formatBytes(bytes) {
  const kb = 1024;
  const mb = kb * 1024;
  const gb = mb * 1024;
  if (bytes < kb) {
    return bytes + ' B';
  } else if (bytes < mb) {
    return (bytes / kb).toFixed(2) + ' KB';
  } else if (bytes < gb) {
    return (bytes / mb).toFixed(2) + ' MB';
  } else {
    return (bytes / gb).toFixed(2) + ' GB';
  }
}
```

- Track the memory usage. From my results, it seems to balloon with every batch of insertions:
```text
$ node scripts/seed-leak.js
memory usage: 372.71 MB
seeded 20000 rows...
memory usage: 483.02 MB
seeded 40000 rows...
memory usage: 623.32 MB
seeded 60000 rows...
memory usage: 734.32 MB
seeded 80000 rows...
memory usage: 852.62 MB
seeded 100000 rows...
memory usage: 963.37 MB
seeded 120000 rows...
memory usage: 1.06 GB
seeded 140000 rows...
memory usage: 1.18 GB
seeded 160000 rows...
memory usage: 1.28 GB
seeded 180000 rows...
memory usage: 1.40 GB
seeded 200000 rows...
memory usage: 1.51 GB
seeded 220000 rows...
memory usage: 1.63 GB
seeded 240000 rows...
memory usage: 1.75 GB
seeded 260000 rows...
memory usage: 1.85 GB
seeded 280000 rows...
memory usage: 1.97 GB
seeded 300000 rows...
memory usage: 2.08 GB
exceeded memory limit. exiting
```
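To put a number on "balloons with every batch": averaging the RSS deltas over the first few readings in the log above (values copied verbatim from the run) shows roughly constant growth of ~118 MB per 20,000-row batch, i.e. on the order of 6 KB per inserted row:

```javascript
// RSS readings (MB) from the log above, one per 20,000-row batch.
const rssMb = [372.71, 483.02, 623.32, 734.32, 852.62, 963.37];

// Per-batch growth, then its average.
const deltas = rssMb.slice(1).map((v, i) => v - rssMb[i]);
const avg = deltas.reduce((a, b) => a + b, 0) / deltas.length;
console.log(avg.toFixed(1)); // 118.1 (MB per 20,000-row batch)
```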
Some observations that may be important:
- I did not experience the memory leak when replacing `@libsql/client` with `[email protected]`.
- I did not experience the memory leak when replacing `@libsql/client` with `libsql`, which has a better-sqlite3-compatible API (however, it was considerably slower than better-sqlite3).
- In both cases, I wrapped the inserts in transactions.
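One way to narrow the leak down (a sketch, not part of the original report): build each batch lazily with a generator instead of materializing all 500,000 statement objects up front. If memory still grows under this variant, the retained `users`/`stmts` arrays can be ruled out and the growth points at `db.batch` itself. The `makeStmt` callback here is a hypothetical stand-in for the insert-statement construction above:

```javascript
// Yield one batch of statement objects at a time; each batch becomes
// garbage-collectable as soon as the consumer is done with it.
function* statementBatches(total, batchSize, makeStmt) {
  for (let start = 0; start < total; start += batchSize) {
    const batch = [];
    for (let i = start; i < Math.min(start + batchSize, total); i++) {
      batch.push(makeStmt(i));
    }
    yield batch;
  }
}

// Usage sketch: in the seed script the loop body would be
// `await db.batch(batch)`; a counter stands in for it here.
let seeded = 0;
for (const batch of statementBatches(50_000, 20_000, (i) => ({ sql: 'insert ...', args: { id: i } }))) {
  seeded += batch.length;
}
console.log(seeded); // 50000
```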