This repository was archived by the owner on Jan 7, 2025. It is now read-only.
# LMDB Manual
###### \[in package LMDB\]
## LMDB ASDF System
- Version: 0.1
- Description: Bindings to LMDB, the Lightning Memory-mapped Database.
- Licence: MIT, see COPYING.
- Author: Fernando Borretti <[email protected]>, James Anderson <[email protected]>, Gábor Melis <[email protected]>
- Maintainer: Fernando Borretti <[email protected]>
- Homepage: [https://github.com/antimer/lmdb](https://github.com/antimer/lmdb)
- Bug tracker: [https://github.com/antimer/lmdb/issues](https://github.com/antimer/lmdb/issues)
- Source control: [GIT]([email protected]:antimer/lmdb.git)
## Links
Here is the [official repository](https://github.com/antimer/lmdb)
and the [HTML
documentation](http://melisgl.github.io/mgl-pax-world/lmdb-manual.html)
for the latest version.
## Introduction
[LMDB](http://www.lmdb.tech/doc/), the Lightning Memory-mapped
Database, is an [ACID](https://en.wikipedia.org/wiki/ACID) key-value
database with
[MVCC](https://en.wikipedia.org/wiki/Multiversion_concurrency_control).
It is a small C library ("C lmdb" from now on), around which LMDB is
a Common Lisp wrapper. LMDB covers most of C lmdb's functionality,
has a simplified API, much needed @LMDB/SAFETY checks, and
comprehensive documentation.
Compared to other key-value stores, LMDB's distinguishing features
are:
- Transactions span multiple keys.
- Embedded. It has no server but can be used concurrently not only
by multiple threads but by multiple OS processes, too.
- Extremely high read performance: millions of transactions per
second.
- Very low maintenance.
Other notable things:
- With its default - the most durable - settings, it has average
write performance, which is bottlenecked by `fsync()`.
- Readers don't block readers or writers, but there is at most one
writer at a time.
- Extremely simple, crash-proof design.
- The entire database (called *environment*) is backed by a single
memory-mapped file, with a
[copy-on-write](https://en.wikipedia.org/wiki/Copy-on-write)
[B+ tree](https://en.wikipedia.org/wiki/B%2B_tree).
- No transaction log.
- It is very much like [Berkeley
DB](https://en.wikipedia.org/wiki/Berkeley_DB) done right, without
the fluff and much improved administration.
Do read the [Caveats](http://www.lmdb.tech/doc/), though. On the
Lisp side, this library **will not work with virtual threads**
because LMDB's write locking is tied to native threads.
Using LMDB is easy:
```
(with-temporary-env (*env*)
  (let ((db (get-db "test")))
    (with-txn (:write t)
      (put db "k1" #(2 3))
      (print (g3t db "k1")) ; => #(2 3)
      (del db "k1"))))
```
More typically, the environment and databases are opened once so
that multiple threads and transactions can access them:
```
(defvar *test-db*)

(unless *env*
  (setq *env* (open-env "/tmp/lmdb-test-env/" :if-does-not-exist :create))
  (setq *test-db* (get-db "test" :value-encoding :utf-8)))

(with-txn (:write t)
  (put *test-db* 1 "hello")
  (print (g3t *test-db* 1)) ; => "hello"
  (del *test-db* 1))
```
Note how :VALUE-ENCODING sneaked in above. This was done to make G3T
return a string instead of an octet vector.
LMDB treats keys and values as opaque byte arrays to be hung on a B+
tree, and only requires a comparison function to be defined over
keys. LMDB knows how to serialize the types `(UNSIGNED-BYTE 64)` and
[STRING][type] (which are often used as keys so sorting must work as
expected). Serialization of the rest of the datatypes is left to the
client. See @LMDB/ENCODINGS for more.
## Design and implementation
### Safety
The lmdb C API trusts client code to respect its rules. Being C,
managing object lifetimes is the biggest burden. There are also
rules that are documented, but not enforced. This Lisp wrapper tries
to enforce these rules itself and manage object lifetimes in a safe
way to avoid data corruption. How and what it does is described in
the following.
##### Environments
- OPEN-ENV checks that the same path is not used in multiple open
environments to prevent locking issues documented in
[Caveats](http://www.lmdb.tech/doc/).
- CLOSE-ENV waits until all @ACTIVE-TRANSACTIONs are finished before
actually closing the environment. Alternatively, if OPEN-ENV was
called with :SYNCHRONIZED NIL, to avoid the overhead of
synchronization, the environment is closed only when garbage
collected.
##### Transactions
- Checks are made to detect illegal operations on parent
transactions (see LMDB-ILLEGAL-ACCESS-TO-PARENT-TXN-ERROR).
- Access to closed transactions is reliably detected.
- C lmdb allows read transactions to be used in multiple threads.
The synchronization cost of performing this safely (i.e. without
risking access to closed and freed transaction objects) is
significant so this is not supported.
##### Databases
- [mdb\_dbi\_open()](http://www.lmdb.tech/doc/group__mdb.html#gac08cad5b096925642ca359a6d6f0562a)
is wrapped by GET-DB in a transaction and is protected by a mutex
to comply with C lmdb's requirements:
> A transaction that opens a database must finish (either
> commit or abort) before another transaction may open it.
> Multiple concurrent transactions cannot open the same
> database.
- [mdb\_dbi\_close()](http://www.lmdb.tech/doc/group__mdb.html#ga52dd98d0c542378370cd6b712ff961b5)
is too dangerous to be exposed as explained in the GET-DB
documentation.
- For similar reasons, DROP-DB is wrapped in WITH-ENV.
- [mdb\_env\_set\_mapsize()](http://www.lmdb.tech/doc/group__mdb.html#gaa2506ec8dab3d969b0e609cd82e619e5),
[mdb\_env\_set\_max\_readers()](http://www.lmdb.tech/doc/group__mdb.html#gae687966c24b790630be2a41573fe40e2),
and [mdb\_env\_set\_maxdbs()](http://www.lmdb.tech/doc/group__mdb.html#gaa2fc2f1f37cb1115e733b62cab2fcdbc)
are only available through OPEN-ENV because they either require that
there are no write transactions or do not work on open environments.
##### Cursors
- As even read transactions are restricted to a single thread, so
are cursors. Using a cursor from a thread other than the one in
which it was created (i.e. the thread of its transaction) raises
LMDB-CURSOR-THREAD-ERROR. In return for this restriction, access
to cursors belonging to closed transactions is reliably detected.
##### Signal handling
The C lmdb library handles system calls being interrupted (`EINTR`
and `EAGAIN`), but unwinding the stack from interrupts in the middle
of LMDB calls can leave the in-memory data structures such as
transactions inconsistent. If this happens, their further use risks
data corruption. For this reason, calls to LMDB are performed with
interrupts disabled. For SBCL, this means SB-SYS:WITHOUT-INTERRUPTS.
It is an error when compiling LMDB if an equivalent facility is not
found in the Lisp implementation. A warning is signalled if no
substitute is found for SB-SYS:WITH-INTERRUPTS because this makes
the body of WITH-ENV, WITH-TXN, WITH-CURSOR and similar
uninterruptible.
Operations that do not modify the database (G3T, CURSOR-FIRST,
CURSOR-VALUE, etc) are async unwind safe, and for performance they
are called without the above provisions.
Note that the library is not reentrant, so don't call LMDB from
signal handlers.
### Deviations from the C lmdb API
The following are the most prominent deviations and omissions from
the C lmdb API in addition to those listed in @LMDB/SAFETY.
##### Environments
- [mdb\_reader\_list()](http://www.lmdb.tech/doc/group__mdb.html#ga8550000cd0501a44f57ee6dff0188744)
is not implemented.
- [mdb\_env\_copy()](http://www.lmdb.tech/doc/group__mdb.html#ga5d51d6130325f7353db0955dbedbc378)
and its close kin are not yet implemented.
##### Transactions
- Read-only WITH-TXNs are turned into noops when "nested" (unless
IGNORE-PARENT).
##### Databases
- [mdb\_set\_compare()](http://www.lmdb.tech/doc/group__mdb.html#ga68e47ffcf72eceec553c72b1784ee0fe)
and [mdb\_set\_dupsort()](http://www.lmdb.tech/doc/group__mdb.html#gacef4ec3dab0bbd9bc978b73c19c879ae)
are not exposed. If they are needed, implement a foreign comparison
function and call LIBLMDB:SET-COMPARE or LIBLMDB:SET-DUPSORT
directly or perhaps change the encoding of the data.
- Working with multiple contiguous values with DUPFIXED is not yet
implemented. This functionality would belong in PUT, CURSOR-PUT,
CURSOR-NEXT and CURSOR-VALUE.
- PUT, CURSOR-PUT do not support the
[`RESERVE`](http://www.lmdb.tech/doc/group__mdb__put.html#gac0545c6aea719991e3eae6ccc686efcc)
flag.
## Library versions
- [function] LMDB-FOREIGN-VERSION
Return the version of the C lmdb library as a string like `0.9.26`.
Wraps [mdb\_version()](http://www.lmdb.tech/doc/group__mdb.html#ga0e5d7298fc39b3c187fffbe30264c968).
- [function] LMDB-BINDING-VERSION
Return a string representing the version of C lmdb based on which
the CFFI bindings were created. The version string has the same
format as LMDB-FOREIGN-VERSION.
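Since both functions return strings in the same format, a startup check can compare them. The following is a minimal sketch, assuming the LMDB system is loaded and the code is evaluated in a package that uses the LMDB package, as in the other examples in this manual. `CHECK-LMDB-VERSIONS` is a hypothetical helper, not part of the library.

```lisp
;; Sketch: warn when the loaded C library differs from the version the
;; CFFI bindings were created for.
(defun check-lmdb-versions ()   ; hypothetical helper, not part of LMDB
  (let ((foreign (lmdb-foreign-version))
        (binding (lmdb-binding-version)))
    (unless (string= foreign binding)
      (warn "C lmdb is ~A, but the bindings were created for ~A."
            foreign binding))
    ;; Return both version strings for inspection.
    (values foreign binding)))
```

Whether a mismatch is actually harmful depends on the versions involved, so the sketch only warns.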
## Environments
An environment (class ENV) is basically a single memory-mapped file
holding all the data, plus some flags determining how we interact
with it. An environment can have multiple databases (class DB), each of
which is a B+ tree within the same file. An environment is like a
database in a relational db, and the databases in it are like tables
and indices. The terminology comes from [Berkeley
DB](https://docs.oracle.com/cd/E17276_01/html/programmer_reference/env.html).
### Environments reference
- [class] ENV
An environment object through which a memory-mapped
data file can be accessed. Always to be created by OPEN-ENV.
- [reader] ENV-PATH ENV (:PATH)
The location of the memory-mapped file and the
environment lock file.
- [reader] ENV-MAX-DBS ENV (:MAX-DBS)
The maximum number of named databases in the
environment. Currently a moderate number is cheap, but a huge
number gets expensive: 7-120 words per transaction, and every
GET-DB does a linear search of the opened databases.
- [reader] ENV-MAX-READERS ENV (:MAX-READERS)
The maximum number of threads/reader slots. See
the documentation of the [reader lock
table](http://lmdb.tech/doc/group__readers.html) for more.
- [reader] ENV-MAP-SIZE ENV (:MAP-SIZE)
Specifies the size of the data file in bytes.
- [reader] ENV-MODE ENV (:MODE)
- [reader] ENV-FLAGS ENV (:FLAGS)
A plist of the options as captured by OPEN-ENV.
For example, `(:FIXED-MAP NIL :SUBDIR T ...)`.
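As an illustration of the readers above, the following sketch opens a temporary environment with a few non-default OPEN-ENV arguments and prints what the readers return. It assumes the LMDB system is loaded and its symbols are accessible; the argument values are arbitrary.

```lisp
;; Sketch: inspect an environment through its readers.
(with-temporary-env (*env* :max-dbs 4 :map-size (* 16 1024 1024))
  (format t "path:     ~A~%" (env-path *env*))
  (format t "max dbs:  ~A~%" (env-max-dbs *env*))
  (format t "map size: ~A bytes~%" (env-map-size *env*))
  ;; ENV-FLAGS is a plist, so individual flags can be read with GETF.
  (format t "subdir:   ~A~%" (getf (env-flags *env*) :subdir)))
```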
### Opening and closing environments
- [variable] *ENV-CLASS* ENV
The default class OPEN-ENV instantiates. Must be a subclass of ENV.
This provides a way to associate application specific data with ENV
objects.
- [function] OPEN-ENV PATH &KEY (CLASS \*ENV-CLASS\*) (IF-DOES-NOT-EXIST :ERROR) (SYNCHRONIZED T) (MAX-DBS 1) (MAX-READERS 126) (MAP-SIZE (\* 1024 1024)) (MODE 436) (SUBDIR T) (SYNC T) (META-SYNC T) READ-ONLY (TLS T) (READ-AHEAD T) (LOCK T) (MEM-INIT T) FIXED-MAP WRITE-MAP MAP-ASYNC
Create an ENV object through which the LMDB environment can be
accessed and open it. To prevent corruption, an error is signalled
if the same data file is opened multiple times. However, the checks
performed do not work on remote filesystems (see ENV-PATH).
LMDB-ERROR is signalled if opening the environment fails for any
other reason.
Unless explicitly noted, none of the arguments persist (i.e. they
are not saved in the data file).
PATH is the filesystem location of the environment files (see SUBDIR
below for more). Do not use LMDB data files on remote filesystems,
even between processes on the same host. This breaks `flock()` on
some OSes, possibly memory map sync, and certainly sync between
programs on different hosts.
IF-DOES-NOT-EXIST determines what happens if ENV-PATH does not
exist:
- :ERROR: An error is signalled.
- :CREATE: A new memory-mapped file is created ensuring that all
containing directories exist.
- `NIL`: Return NIL without doing anything.
See CLOSE-ENV for the description of SYNCHRONIZED.
- MAX-DBS: The maximum number of named databases in the environment.
Currently a moderate number is cheap, but a huge number gets
expensive: 7-120 words per transaction, and every GET-DB does a
linear search of the opened databases.
- MAP-SIZE: Specifies the size of the data file in bytes. The new
size takes effect immediately for the current process, but will
not be persisted to any others until a write transaction has been
committed by the current process. Also, only map size increases
are persisted into the environment. If the map size is increased
by another process, and data has grown beyond the range of the
current mapsize, starting a new transaction (see WITH-TXN) will
signal LMDB-MAP-RESIZED-ERROR. If zero is specified for MAP-SIZE,
then the persisted size is used from the data file. Also see
LMDB-MAP-FULL-ERROR.
- MODE: Unix file mode for files created. The default is `#o664`.
Has no effect when opening an existing environment.
The rest of the arguments correspond to LMDB environment flags and
are available in the plist ENV-FLAGS.
- SUBDIR: If SUBDIR, then the path is a directory which holds the
`data.mdb` and the `lock.mdb` files. If SUBDIR is NIL, the path
is the filename of the data file and the lock file has the same
name plus a `-lock` suffix.
- SYNC: If NIL, don't `fsync` after commit. This optimization means
a system crash can corrupt the database or lose the last
transactions if buffers are not yet flushed to disk. The risk is
governed by how often the system flushes dirty buffers to disk and
how often SYNC-ENV is called. However, if the filesystem preserves
write order (very few do) and the WRITE-MAP (currently
unsupported) flag is not used, transactions exhibit
ACI (atomicity, consistency, isolation) properties and only lose
D (durability). I.e. database integrity is maintained, but a
system crash may undo the final transactions.
- META-SYNC: If NIL, flush system buffers to disk only once per
transaction, but omit the metadata flush. Defer that until the
system flushes files to disk, the next commit of a non-read-only
transaction or SYNC-ENV. This optimization maintains database
integrity, but a system crash may undo the last committed
transaction. I.e. it preserves the ACI (atomicity, consistency,
isolation) but not D (durability) database property.
- READ-ONLY: Map the data file in read-only mode. It is an error to
try to modify anything in it.
- TLS: Setting it to NIL allows each OS thread to have multiple
read-only transactions (see WITH-TXN's IGNORE-PARENT argument). It
also allows transactions not to be tied to a single thread,
but that's quite dangerous, see @LMDB/SAFETY.
- READ-AHEAD: If NIL, turn off readahead as in `madvise(MADV_RANDOM)`.
Most operating systems perform read-ahead on read requests by
default. Setting READ-AHEAD to NIL turns it off if the OS supports
it. Turning it off may help random read performance when the DB is
larger than RAM and system RAM is full. This option is not
implemented on Windows.
- LOCK: Data corruption lurks here. If NIL, don't do any locking. If
concurrent access is anticipated, the caller must manage all
concurrency itself. For proper operation the caller must enforce
single-writer semantics, and must ensure that no readers are using
old transactions while a writer is active. The simplest approach
is to use an exclusive lock so that no readers may be active at
all when a writer begins.
- MEM-INIT: If NIL, don't initialize `malloc`ed memory before
writing to unused spaces in the data file. By default, memory for
pages written to the data file is obtained using `malloc`. While
these pages may be reused in subsequent transactions, freshly
`malloc`ed pages will be initialized to zeroes before use. This
avoids persisting leftover data from other code (that used the
heap and subsequently freed the memory) into the data file. Note
that many other system libraries may allocate and free memory from
the heap for arbitrary uses. E.g., stdio may use the heap for file
I/O buffers. This initialization step has a modest performance
cost so some applications may want to disable it using this flag.
This option can be a problem for applications which handle
sensitive data like passwords, and it makes memory checkers like
Valgrind noisy. This flag is not needed with WRITE-MAP, which
writes directly to the mmap instead of using malloc for pages.
- FIXED-MAP (experimental): This flag must be specified when
creating the environment and is stored persistently in the data
file. If successful, the memory map will always reside at the same
virtual address and pointers used to reference data items in the
database will be constant across multiple invocations. This option
may not always work, depending on how the operating system has
allocated memory to shared libraries and other uses.
Unsupported flags (an error is signalled if they are changed from
their default values):
- WRITE-MAP: Use a writable memory map unless READ-ONLY is set. This
is faster and uses fewer mallocs, but loses protection from
application bugs like wild pointer writes and other bad updates
into the database. Incompatible with nested transactions. This may
be slightly faster for DBs that fit entirely in RAM, but is slower
for DBs larger than RAM. Do not mix processes with and without
WRITE-MAP on the same environment. This can defeat
durability (SYNC-ENV, etc).
- MAP-ASYNC: When using WRITE-MAP, use asynchronous flushes to disk.
As with SYNC NIL, a system crash can then corrupt the database or
lose the last transactions. Calling SYNC-ENV ensures on-disk database
integrity until next commit.
Open environments have a finalizer attached to them that takes care
of freeing foreign resources. Thus, the common idiom:
```
(setq *env* (open-env "some-path"))
```
is okay for development, too. No need to always do WITH-ENV,
which does not mesh with threads anyway.
Wraps [mdb\_env\_create()](http://www.lmdb.tech/doc/group__mdb.html#gaad6be3d8dcd4ea01f8df436f41d158d4)
and [mdb\_env\_open()](http://www.lmdb.tech/doc/group__mdb.html#ga32a193c6bf4d7d5c5d579e71f22e9340).
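A typical call overrides only a few of the defaults. The sketch below assumes the LMDB system is loaded and its symbols are accessible. The path, the sizes, and the names `*EXAMPLE-ENV*` and `OPEN-ENV-IF-EXISTS` are hypothetical and chosen for illustration only.

```lisp
;; Sketch: open (and create if needed) an environment with room for
;; several named databases and a larger memory map.
(defvar *example-env*
  (open-env "/tmp/lmdb-example-env/"
            :if-does-not-exist :create      ; create directories as needed
            :max-dbs 8                      ; up to 8 named databases
            :map-size (* 1024 1024 1024)))  ; 1 GiB instead of the 1 MiB default

;; With :IF-DOES-NOT-EXIST NIL, a missing environment yields NIL
;; instead of signalling an error.
(defun open-env-if-exists (path)   ; hypothetical helper
  (open-env path :if-does-not-exist nil))
```

Because opening the same path twice is an error, the DEFVAR form is meant to be evaluated once per Lisp image.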
- [function] CLOSE-ENV ENV &KEY FORCE
Close ENV and free the memory. Closing an already closed ENV has no effect.
Since accessing @LMDB/TRANSACTIONS, @LMDB/DATABASES and
@LMDB/CURSORS after closing their environment would risk database
corruption, CLOSE-ENV makes sure that they are not in use. There are
two ways this can happen:
- If ENV was opened :SYNCHRONIZED (see OPEN-ENV), then CLOSE-ENV
waits until there are no @ACTIVE-TRANSACTIONs in ENV before
closing it. This requires synchronization and introduces some
overhead, which might be noticeable for workloads involving lots of
quick read transactions. It is an LMDB-ERROR to attempt to close
an environment in a WITH-TXN to avoid deadlocks.
- On the other hand, if SYNCHRONIZED was NIL, then - unless FORCE is
true - calling CLOSE-ENV signals an LMDB-ERROR to avoid the
@LMDB/SAFETY issues involved in closing the environment.
Environments opened with :SYNCHRONIZED NIL are only closed when
they are garbage collected and their finalizer is run. Still, for
production it might be worth it to gain the last bit of
performance.
Wraps [mdb\_env\_close()](http://www.lmdb.tech/doc/group__mdb.html#ga4366c43ada8874588b6a62fbda2d1e95).
- [variable] *ENV* NIL
The default ENV for macros and functions that take an environment
argument.
- [macro] WITH-ENV (ENV PATH &REST OPEN-ENV-ARGS) &BODY BODY
Bind the variable ENV to a new environment returned by OPEN-ENV
called with PATH and OPEN-ENV-ARGS, execute BODY, and CLOSE-ENV. The
following example binds the default environment:
```
(with-env (*env* "/tmp/lmdb-test" :if-does-not-exist :create)
  ...)
```
- [function] OPEN-ENV-P ENV
See if ENV is open, i.e. OPEN-ENV has been called on it without a
corresponding CLOSE-ENV.
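When WITH-ENV does not fit, OPEN-ENV and CLOSE-ENV can be paired by hand. The following sketch assumes the LMDB system is loaded and its symbols are accessible. The path is hypothetical, and the environment is opened with the default :SYNCHRONIZED T so that CLOSE-ENV may be called directly.

```lisp
;; Sketch: explicit open/close with OPEN-ENV-P checks around it.
(let ((env (open-env "/tmp/lmdb-close-example/" :if-does-not-exist :create)))
  (format t "open before close: ~A~%" (open-env-p env))
  (unwind-protect
       (let ((db (get-db "test" :env env)))
         (with-txn (:env env :write t)
           (put db "k" #(1))))
    ;; Called outside of any WITH-TXN, because closing the environment
    ;; within a transaction is an error.
    (close-env env))
  (format t "open after close: ~A~%" (open-env-p env)))
```

Per the descriptions above, the first line printed is expected to report a true value and the second NIL.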
### Miscellaneous environment functions
- [function] CHECK-FOR-STALE-READERS &OPTIONAL (ENV \*ENV\*)
Check for stale entries in the reader lock table. See
[Caveats](http://www.lmdb.tech/doc/). This function is called
automatically by OPEN-ENV. If other OS processes or threads
accessing ENV abort without closing read transactions, call this
function periodically to get rid of them. Alternatively, close all
environments accessing the data file.
Wraps [mdb\_reader\_check()](http://www.lmdb.tech/doc/group__mdb.html#ga366923d08bb384b3d9580a98edf5d668).
- [function] ENV-STATISTICS &OPTIONAL (ENV \*ENV\*)
Return statistics about ENV as a plist.
- :PAGE-SIZE: The size of a database page in bytes.
- :DEPTH: The height of the B-tree.
- :BRANCH-PAGES: The number of internal (non-leaf) pages.
- :LEAF-PAGES: The number of leaf pages.
- :OVERFLOW-PAGES: The number of overflow pages.
- :ENTRIES: The number of data items.
Wraps [mdb\_env\_stat()](http://www.lmdb.tech/doc/group__mdb.html#gaf881dca452050efbd434cd16e4bae255).
- [function] ENV-INFO &OPTIONAL (ENV \*ENV\*)
Return information about ENV as a plist.
- :MAP-ADDRESS: Address of memory map, if fixed (see OPEN-ENV's
FIXED-MAP).
- :MAP-SIZE: Size of the memory map in bytes.
- :LAST-PAGE-NUMBER: Id of the last used page.
- :LAST-TXN-ID: Id of the last committed transaction.
- :MAXIMUM-READERS: The number of reader slots.
- :N-READERS: The number of reader slots currently used.
Wraps [mdb\_env\_info()](http://www.lmdb.tech/doc/group__mdb.html#ga18769362c7e7d6cf91889a028a5c5947).
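Both ENV-STATISTICS and ENV-INFO return plists, so their entries can be picked out with GETF. The sketch below assumes the LMDB system is loaded and its symbols are accessible. Comparing the last used page number with the number of pages that fit in the map is only a rough indication of how full the map is.

```lisp
;; Sketch: print a few statistics of a temporary environment.
(with-temporary-env (*env*)
  (let ((db (get-db "test")))
    (with-txn (:write t)
      (dotimes (i 10)
        (put db i #(0))))
    (let ((stats (env-statistics))
          (info (env-info)))
      (format t "page size: ~A bytes, tree depth: ~A~%"
              (getf stats :page-size) (getf stats :depth))
      (format t "reader slots in use: ~A of ~A~%"
              (getf info :n-readers) (getf info :maximum-readers))
      ;; Rough estimate of map usage in pages.
      (format t "last used page: ~A of about ~A~%"
              (getf info :last-page-number)
              (floor (getf info :map-size) (getf stats :page-size))))))
```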
- [function] SYNC-ENV &OPTIONAL (ENV \*ENV\*)
Flush the data buffers to disk as in calling `fsync()`. If ENV was
opened with :SYNC NIL or :META-SYNC NIL, this may be handy to force
flushing the OS buffers to disk, which avoids potential durability
and integrity issues.
Wraps [mdb\_env\_sync()](http://www.lmdb.tech/doc/group__mdb.html#ga85e61f05aa68b520cc6c3b981dba5037).
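One possible use is batching: commit many small transactions without flushing, then flush once. The sketch below assumes the LMDB system is loaded and its symbols are accessible. It relies on the per-transaction :SYNC argument of WITH-TXN rather than on opening the environment with :SYNC NIL, and the batch size is arbitrary.

```lisp
;; Sketch: skip the flush on each commit, then flush once at the end.
(with-temporary-env (*env*)
  (let ((db (get-db "test")))
    (dotimes (i 100)
      ;; No flushing of buffers takes place after these commits.
      (with-txn (:write t :sync nil)
        (put db i #(0))))
    ;; A single explicit flush for the whole batch.
    (sync-env)))
```

Until SYNC-ENV returns, a system crash may lose the transactions of the batch, so this trade-off only suits data that can be rewritten.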
- [function] ENV-MAX-KEY-SIZE &OPTIONAL (ENV \*ENV\*)
Return the maximum size of keys and @DUPSORT data in bytes. Depends
on the compile-time constant `MDB_MAXKEYSIZE` in the C library. The
default is 511. If this limit is exceeded LMDB-BAD-VALSIZE-ERROR is
signalled.
Wraps [mdb\_env\_get\_maxkeysize()](http://www.lmdb.tech/doc/group__mdb.html#gaaf0be004f33828bf2fb09d77eb3cef94).
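An application that builds keys from untrusted input may prefer to check the limit itself and report a more specific error. The sketch below assumes the LMDB system is loaded and its symbols are accessible, and that KEY is already an octet vector so that its length equals its size in bytes. `PUT-WITH-KEY-CHECK` is a hypothetical helper, not part of the library.

```lisp
;; Sketch: reject oversized keys before handing them to PUT.
(defun put-with-key-check (db key value)   ; hypothetical helper
  "Like PUT, but signal an error early if the octet vector KEY is
longer than the environment allows."
  (let ((max (env-max-key-size)))
    (when (> (length key) max)
      (error "Key is ~D bytes long, but the limit is ~D bytes."
             (length key) max))
    (put db key value)))

(with-temporary-env (*env*)
  (let ((db (get-db "test")))
    (with-txn (:write t)
      (put-with-key-check db #(1 2 3) #(4)))))
```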
- [macro] WITH-TEMPORARY-ENV (ENV &REST OPEN-ENV-ARGS) &BODY BODY
Run BODY with an open temporary environment bound to ENV. In more
detail, create an environment in a fresh temporary directory in an
OS specific location. OPEN-ENV-ARGS is a list of keyword arguments
and values for OPEN-ENV. This macro is intended for testing and
examples.
```
(with-temporary-env (*env*)
  (let ((db (get-db "test")))
    (with-txn (:write t)
      (put db "k1" #(2 3))
      (print (g3t db "k1")) ; => #(2 3)
      (del db "k1"))))
```
Since data corruption in temporary environments is not a concern,
unlike WITH-ENV, WITH-TEMPORARY-ENV closes the environment even if
it was opened with :SYNCHRONIZED NIL (see OPEN-ENV and
CLOSE-ENV).
## Transactions
The LMDB environment supports transactional reads and writes. By
default, these provide the standard ACID (atomicity, consistency,
isolation, durability) guarantees. Writes from a transaction are not
immediately visible to other transactions. When the transaction is
committed, all its writes become visible atomically for future
transactions even if Lisp crashes or there is a power failure. If the
transaction is aborted, its writes are discarded.
Transactions span the entire environment (see ENV). All the updates
made in the course of an update transaction - writing records across
all databases, creating databases, and destroying databases - are
either completed atomically or rolled back.
Write transactions can be nested. Child transactions see the
uncommitted writes of their parent. The child transaction can commit
or abort, at which point its writes become visible to the parent
transaction or are discarded. If the parent aborts, all of the
writes performed in the context of the parent, including those from
committed child transactions, are discarded.
- [macro] WITH-TXN (&KEY (ENV '\*ENV\*) WRITE IGNORE-PARENT (SYNC T) (META-SYNC T)) &BODY BODY
Start a transaction in ENV, execute BODY. Then, if the transaction
is open (see OPEN-TXN-P) and BODY returned normally, attempt to
commit the transaction. Next, if BODY performed a non-local exit or
committing failed, but the transaction is still open, then abort it.
It is explicitly allowed to call COMMIT-TXN or ABORT-TXN within
WITH-TXN.
Transactions provide ACID guarantees (with SYNC and META-SYNC both
on). They span the entire environment; they are not specific to
individual DBs.
- If WRITE is NIL, the transaction is read-only and no writes (e.g.
PUT) may be performed in the transaction. On the flipside, many
read-only transactions can run concurrently (see ENV-MAX-READERS),
while write transactions are mutually exclusive. Furthermore, the
single write transaction can also run concurrently with read
transactions, just keep in mind that read transactions hold on to
the state of the environment at the time of their creation and
thus prevent pages since replaced from being reused.
- If IGNORE-PARENT is true, then in an enclosing WITH-TXN, instead
of creating a child transaction, start an independent transaction.
- If SYNC is NIL, then no flushing of buffers will take place after
a commit as if the environment had been opened with :SYNC NIL.
- Likewise, META-SYNC is the per-transaction equivalent of the
OPEN-ENV's META-SYNC.
Also see @LMDB/NESTING-TRANSACTIONS.
Wraps [mdb\_txn\_begin()](http://www.lmdb.tech/doc/group__mdb.html#gad7ea55da06b77513609efebd44b26920).
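The commit and abort behaviour described above can be observed with two write transactions, one returning normally and one unwinding. This is a minimal sketch, assuming the LMDB system is loaded and its symbols are accessible. The keys, values and the simulated error are illustrative.

```lisp
;; Sketch: normal return commits, a non-local exit aborts.
(with-temporary-env (*env*)
  (let ((db (get-db "test")))
    ;; BODY returns normally, so the transaction is committed.
    (with-txn (:write t)
      (put db "k1" #(1)))
    ;; The error unwinds out of WITH-TXN, so the transaction is aborted.
    (ignore-errors
      (with-txn (:write t)
        (put db "k2" #(2))
        (error "Simulated failure")))
    ;; A read-only transaction sees only the committed write: "k1" is
    ;; present, while looking up "k2" is expected to return NIL.
    (with-txn ()
      (print (g3t db "k1"))
      (print (g3t db "k2")))))
```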
- [glossary-term] active transaction
The active transaction in some environment and thread is the
transaction of the innermost WITH-TXN being executed in the thread
that belongs to the environment. In most cases, this is simply the
enclosing WITH-TXN, but if WITH-TXNs with different :ENV arguments
are nested, then it may not be:
```
(with-temporary-env (env)
  (let ((db (get-db "db" :env env)))
    (with-temporary-env (inner-env)
      (with-txn (:env env :write t)
        (with-txn (:env inner-env)
          (put db #(1) #(2)))))))
```
In the above example, DB is known to belong to ENV so although the
immediately enclosing transaction belongs to INNER-ENV, PUT is
executed in context of the outer, write transaction because that's
the innermost in ENV.
Operations that require a transaction always attempt to use the
active transaction even if it is not open (see OPEN-TXN-P).
- [function] OPEN-TXN-P &OPTIONAL ENV
See if there is an active transaction and it is open, i.e.
COMMIT-TXN or ABORT-TXN have not been called on it. Also, RESET-TXN
without a corresponding RENEW-TXN closes the transaction.
- [function] TXN-ID
The ID of TXN. IDs are integers incrementing from 1. For a
read-only transaction, this corresponds to the snapshot being read;
concurrent readers will frequently have the same transaction ID.
Only committed write transactions increment the ID. If a transaction
aborts, the ID may be re-used by the next writer.
- [function] COMMIT-TXN &OPTIONAL ENV
Commit the innermost enclosing transaction (or @ACTIVE-TRANSACTION
belonging to ENV if ENV is specified) or signal an error if it is
not open. If TXN is not nested in another transaction, committing
makes updates performed visible to future transactions. If TXN is a
child transaction, then committing makes updates visible to its
parent only. For read-only transactions, committing releases the
reference to a historical version of the environment, allowing reuse of
pages replaced since.
Wraps [mdb\_txn\_commit()](http://www.lmdb.tech/doc/group__mdb.html#ga846fbd6f46105617ac9f4d76476f6597).
- [function] ABORT-TXN &OPTIONAL ENV
Close TXN by discarding all updates performed, which will then not
be visible to either parent or future transactions. Aborting an
already closed transaction is a noop. Always succeeds.
Wraps [mdb\_txn\_abort()](http://www.lmdb.tech/doc/group__mdb.html#ga73a5938ae4c3239ee11efa07eb22b882).
- [function] RENEW-TXN &OPTIONAL ENV
Renew TXN that was reset by RESET-TXN. This acquires a new reader
lock that had been released by RESET-TXN. After renewal, it is as if
TXN had just been started.
Wraps [mdb\_txn\_renew()](http://www.lmdb.tech/doc/group__mdb.html#ga6c6f917959517ede1c504cf7c720ce6d).
- [function] RESET-TXN &OPTIONAL ENV
Abort the open, read-only TXN, release the reference to the
historical version of the environment, but make it faster to start
another read-only transaction with RENEW-TXN. This is accomplished
by not deallocating some data structures, and keeping the slot in
the reader table. Cursors opened within the transaction must not be
used again, except if renewed (see CURSOR-RENEW). If TXN is an open,
read-only transaction, this function always succeeds.
Wraps [mdb\_txn\_reset()](http://www.lmdb.tech/doc/group__mdb.html#ga02b06706f8a66249769503c4e88c56cd).
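As a minimal sketch of how the functions above interact, the following checks OPEN-TXN-P around ABORT-TXN, RESET-TXN and RENEW-TXN. It assumes the lmdb system is loaded and the current package uses LMDB, as in this document's other examples. `txn-lifecycle-demo` is a hypothetical helper name, not part of the library.

```lisp
;; Assumes the lmdb system is loaded and the current package uses LMDB.
;; TXN-LIFECYCLE-DEMO is a hypothetical helper, not part of the library.
(defun txn-lifecycle-demo ()
  (with-temporary-env (*env*)
    (let ((db (get-db "test" :value-encoding :uint64)))
      (with-txn (:write t)
        (put db "k" 1)
        (assert (open-txn-p))
        ;; Discard the write and close the transaction.
        (abort-txn)
        (assert (not (open-txn-p)))
        ;; Aborting an already closed transaction is a noop.
        (abort-txn))
      (with-txn ()
        ;; The aborted write is not visible.
        (assert (null (g3t db "k")))
        ;; RESET-TXN closes the read-only transaction ...
        (reset-txn)
        (assert (not (open-txn-p)))
        ;; ... and RENEW-TXN makes it usable again.
        (renew-txn)
        (assert (open-txn-p))
        (g3t db "k")))))
```

Since nothing was committed under `"k"`, the final G3T should find nothing.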
### Nesting transactions
When WITH-TXNs are nested (i.e. one is executed in the dynamic
extent of another), we speak of nested transactions. Transactions can
be nested to arbitrary levels. Child transactions may be committed
or aborted independently from their parent transaction (the
immediately enclosing WITH-TXN). Committing a child transaction only
makes the updates made by it visible to the parent. If the parent
then aborts, the child's updates are aborted too. If the parent
commits, all child transactions that were not aborted are committed,
too.
Actually, the C lmdb library only supports nesting write
transactions. To simplify usage, the Lisp side turns read-only
WITH-TXNs nested in another WITH-TXN into noops.
```
(with-temporary-env (*env*)
  (let ((db (get-db "test" :value-encoding :uint64)))
    ;; Create a top-level write transaction.
    (with-txn (:write t)
      (put db "p" 0)
      ;; First child transaction
      (with-txn (:write t)
        ;; Writes of the parent are visible in children.
        (assert (= (g3t db "p") 0))
        (put db "c1" 1))
      ;; Parent sees what the child committed (but it's not visible to
      ;; unrelated transactions).
      (assert (= (g3t db "c1") 1))
      ;; Second child transaction
      (with-txn (:write t)
        ;; Sees writes from the parent that came from the first child.
        (assert (= (g3t db "c1") 1))
        (put db "c1" 2)
        (put db "c2" 2)
        (abort-txn)))
    ;; Create a top-level read transaction to check what was committed.
    (with-txn ()
      ;; Since the second child aborted, its writes are discarded.
      (assert (= (g3t db "p") 0))
      (assert (= (g3t db "c1") 1))
      (assert (null (g3t db "c2"))))))
```
COMMIT-TXN, ABORT-TXN, and RESET-TXN all close the
@ACTIVE-TRANSACTION (see OPEN-TXN-P). When the active transaction is
not open, database operations such as G3T, PUT, DEL signal
LMDB-BAD-TXN-ERROR. Furthermore, any @LMDB/CURSORS created in the
context of the transaction will no longer be valid (but see
CURSOR-RENEW).
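The behaviour described above can be sketched as follows, assuming the lmdb system is loaded and the current package uses LMDB. `closed-txn-demo` is a hypothetical helper name. It commits explicitly and then attempts a read in the now closed transaction, translating the expected condition into a keyword.

```lisp
;; Assumes the lmdb system is loaded and the current package uses LMDB.
;; CLOSED-TXN-DEMO is a hypothetical helper, not part of the library.
(defun closed-txn-demo ()
  (with-temporary-env (*env*)
    (let ((db (get-db "test" :value-encoding :uint64)))
      (with-txn (:write t)
        (put db "k" 1)
        ;; Commit explicitly; the active transaction is now closed.
        (commit-txn)
        ;; Database operations in a closed transaction are documented
        ;; to signal LMDB-BAD-TXN-ERROR.
        (handler-case (progn (g3t db "k") :no-error)
          (lmdb-bad-txn-error () :bad-txn))))))
```

Going by the description above, the call should return `:BAD-TXN` rather than `:NO-ERROR`.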
An LMDB parent transaction and its cursors must not issue operations
other than COMMIT-TXN and ABORT-TXN while there are active child
transactions. As the Lisp side does not expose transaction objects
directly, performing @LMDB/BASIC-OPERATIONS in the parent
transaction is not possible, but it is possible with @LMDB/CURSORS
as they are tied to the transaction in which they were created.
IGNORE-PARENT true overrides the default nesting semantics of
WITH-TXN and creates a new top-level transaction, which is not a
child of the enclosing WITH-TXN.
- Since LMDB is single-writer, on nesting an IGNORE-PARENT write
transaction in another write transaction, LMDB-BAD-TXN-ERROR is
signalled to avoid the deadlock.
- Nesting a read-only WITH-TXN with IGNORE-PARENT in another
read-only WITH-TXN signals LMDB-BAD-RSLOT-ERROR with the TLS
option because it would create two read-only transactions in the
same thread.
Nesting a read transaction in another transaction would be an
LMDB-BAD-RSLOT-ERROR according to the C lmdb library, but a
read-only WITH-TXN with IGNORE-PARENT NIL nested in another WITH-TXN
is turned into a noop so this edge case is papered over.
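A hedged sketch of the two nesting cases that are easiest to observe, assuming the lmdb system is loaded and the current package uses LMDB. `nested-read-demo` and `ignore-parent-write-demo` are hypothetical helper names. The first relies on a nested read-only WITH-TXN being a noop, so its body runs in the enclosing write transaction. The second nests an IGNORE-PARENT write transaction in another write transaction.

```lisp
;; Assumes the lmdb system is loaded and the current package uses LMDB.
;; Both functions are hypothetical helpers, not part of the library.
(defun nested-read-demo ()
  (with-temporary-env (*env*)
    (let ((db (get-db "test" :value-encoding :uint64)))
      (with-txn (:write t)
        (put db "k" 1)
        ;; Read-only WITH-TXN nested without IGNORE-PARENT: a noop, so
        ;; the body sees the uncommitted write of the enclosing
        ;; transaction.
        (with-txn ()
          (g3t db "k"))))))

(defun ignore-parent-write-demo ()
  (with-temporary-env (*env*)
    (with-txn (:write t)
      ;; A second top-level writer in the same thread would deadlock,
      ;; so LMDB-BAD-TXN-ERROR is documented to be signalled instead.
      (handler-case
          (with-txn (:write t :ignore-parent t)
            :started)
        (lmdb-bad-txn-error () :bad-txn)))))
```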
## Databases
### The unnamed database
LMDB has a default, unnamed database backed by a B+ tree. This db
can hold normal key-value pairs and named databases. The unnamed
database can be accessed by passing NIL as the database name to
GET-DB. There are some restrictions on the flags of the unnamed
database, see LMDB-INCOMPATIBLE-ERROR.
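For illustration, a minimal sketch of storing a normal key-value pair in the unnamed database, assuming the lmdb system is loaded and the current package uses LMDB. `unnamed-db-demo` is a hypothetical helper name, and the :UTF-8 encodings are chosen here only so that strings round-trip.

```lisp
;; Assumes the lmdb system is loaded and the current package uses LMDB.
;; UNNAMED-DB-DEMO is a hypothetical helper, not part of the library.
(defun unnamed-db-demo ()
  (with-temporary-env (*env*)
    ;; NIL as the name opens the unnamed database.
    (let ((db (get-db nil :key-encoding :utf-8 :value-encoding :utf-8)))
      (with-txn (:write t)
        (put db "greeting" "hello")
        (g3t db "greeting")))))
```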
### DUPSORT
A prominent feature of LMDB is the ability to associate multiple
sorted values with keys, which is enabled by the DUPSORT argument of
GET-DB. Just as a named database is a B+ tree associated with a
key (its name) in the B+ tree of the unnamed database, so do these
sorted duplicates form a B+ tree under a key in a named or the
unnamed database. Among the @LMDB/BASIC-OPERATIONS, PUT and DEL are
equipped to deal with duplicate values, but G3T is too limited, and
@LMDB/CURSORS are needed to make full use of DUPSORT.
When using this feature the limit on the maximum key size applies to
duplicate data, as well. See ENV-MAX-KEY-SIZE.
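A small sketch of what the basic operations can do with duplicates, assuming the lmdb system is loaded and the current package uses LMDB. `dupsort-demo` is a hypothetical helper name. It also assumes that DEL accepts a :VALUE keyword to delete a single duplicate, and that G3T returns the first of the sorted values, as `mdb_get()` does in the C library.

```lisp
;; Assumes the lmdb system is loaded and the current package uses LMDB.
;; DUPSORT-DEMO is a hypothetical helper, not part of the library.
(defun dupsort-demo ()
  (with-temporary-env (*env*)
    (let ((db (get-db "dups" :dupsort t :value-encoding :uint64)))
      (with-txn (:write t)
        ;; Store two values under the same key.
        (put db "k" 1)
        (put db "k" 2)
        (let ((first-before (g3t db "k")))
          ;; Assumption: :VALUE restricts the deletion to one duplicate.
          (del db "k" :value 1)
          ;; Collect the first value before and after the deletion.
          (list first-before (g3t db "k")))))))
```

Enumerating all duplicates of a key is beyond G3T, which is why @LMDB/CURSORS are needed for full use of DUPSORT.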
### Database API
- [variable] *DB-CLASS* DB
The default class that GET-DB instantiates. Must be a subclass of DB.
This provides a way to associate application specific data with DB
objects.
- [function] GET-DB NAME &KEY (CLASS \*DB-CLASS\*) (ENV \*ENV\*) (IF-DOES-NOT-EXIST :CREATE) KEY-ENCODING VALUE-ENCODING INTEGER-KEY REVERSE-KEY DUPSORT INTEGER-DUP REVERSE-DUP DUPFIXED
Open the database with NAME in the open environment ENV, and return
a DB object. If NAME is NIL, then the @LMDB/THE-UNNAMED-DATABASE is
opened.
If GET-DB is called with the same name multiple times, the returned
DB objects will be associated with the same database (although they
may not be EQ). The first time GET-DB is called with any given name
and environment, it must not be from an open transaction. This is
because GET-DB starts a transaction itself to comply with C lmdb's
requirements on
[mdb\_dbi\_open()](http://www.lmdb.tech/doc/group__mdb.html#gac08cad5b096925642ca359a6d6f0562a) (see
@LMDB/SAFETY). Since dbi handles are cached within ENV, subsequent
calls do not involve `mdb_dbi_open()` and are thus permissible
within transactions.
CLASS designates the class which will be instantiated. See *DB-CLASS*.
If IF-DOES-NOT-EXIST is :CREATE, then a new named database is
created. If IF-DOES-NOT-EXIST is :ERROR, then an error is signalled
if the database does not exist.
KEY-ENCODING and VALUE-ENCODING are both one of NIL, :UINT64,
:OCTETS or :UTF-8. KEY-ENCODING is set to :UINT64 when INTEGER-KEY
is true. VALUE-ENCODING is set to :UINT64 when INTEGER-DUP is true.
Note that changing the encoding does *not* reencode already existing
data. See @LMDB/ENCODINGS for the full semantics.
GET-DB may be called more than once with the same NAME and ENV, and
the returned DB objects will have the same underlying C lmdb
database, but they may have different KEY-ENCODING and
VALUE-ENCODING.
The following flags are for database creation; they do not have any
effect in subsequent calls (except for the
@LMDB/THE-UNNAMED-DATABASE).
- INTEGER-KEY: Keys in the database are C `unsigned` or `size_t`
integers encoded in native byte order. Keys must all be either
`unsigned` or `size_t`; they cannot be mixed in a single database.
- REVERSE-KEY: Keys are strings to be compared in reverse order,
from the end of the strings to the beginning. By default, keys are
treated as strings and compared from beginning to end.
- DUPSORT: Duplicate keys may be used in the database (or, from
another perspective, keys may have multiple values, stored in
sorted order). By default, keys must be unique and may have only a
single value. Also, see @DUPSORT.
- INTEGER-DUP: This option specifies that duplicate data items are
binary integers, similarly to INTEGER-KEY. Only matters if
DUPSORT.
- REVERSE-DUP: This option specifies that duplicate data items
should be compared as strings in reverse order. Only matters if
DUPSORT.
- DUPFIXED: This flag may only be used in combination with DUPSORT. When
true, data items for this database must all be the same size,
which allows further optimizations in storage and retrieval.
Currently, the wrapper functions that could take advantage of
this (e.g. PUT, CURSOR-PUT, CURSOR-NEXT and CURSOR-VALUE), do not.
No function to close a database (an equivalent to
[mdb\_dbi\_close()](http://www.lmdb.tech/doc/group__mdb.html#ga52dd98d0c542378370cd6b712ff961b5))
is provided due to subtle races and corruption it could cause when
an `MDB_dbi` (unsigned integer, similar to an fd) is assigned by a
subsequent open to another named database.
Wraps [mdb\_dbi\_open()](http://www.lmdb.tech/doc/group__mdb.html#gac08cad5b096925642ca359a6d6f0562a).
- [class] DB
A database in an environment (class ENV). Must always
be created with GET-DB.
- [reader] DB-NAME DB (:NAME)
The name of the database.
- [reader] DB-KEY-ENCODING DB (:KEY-ENCODING)
The ENCODING that was passed as KEY-ENCODING to
GET-DB.
- [reader] DB-VALUE-ENCODING DB (:VALUE-ENCODING)
The ENCODING that was passed as VALUE-ENCODING
to GET-DB.
- [function] DROP-DB NAME PATH &KEY OPEN-ENV-ARGS (DELETE T)
Empty the database with NAME in the environment denoted by PATH. If
DELETE is true, then also delete the database. Since closing a database is
dangerous (see GET-DB), DROP-DB opens and closes the environment
itself.
Wraps [mdb\_drop()](http://www.lmdb.tech/doc/group__mdb.html#gab966fab3840fc54a6571dfb32b00f2db).
- [function] DB-STATISTICS DB
Return statistics about the database.
Wraps [mdb\_stat()](http://www.lmdb.tech/doc/group__mdb.html#gae6c1069febe94299769dbdd032fadef6).
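A brief sketch of the database API above, assuming the lmdb system is loaded and the current package uses LMDB. `db-api-demo` is a hypothetical helper name. It opens a database outside of any transaction, reads back the DB object's name and encodings, and shows IF-DOES-NOT-EXIST :ERROR on a database that was never created. The condition is caught as a plain ERROR because its exact type is not stated in this section.

```lisp
;; Assumes the lmdb system is loaded and the current package uses LMDB.
;; DB-API-DEMO is a hypothetical helper, not part of the library.
(defun db-api-demo ()
  (with-temporary-env (*env*)
    ;; Open databases before starting a transaction: the first GET-DB
    ;; for a given name must not happen in an open transaction.
    (let ((db (get-db "test" :key-encoding :utf-8
                             :value-encoding :uint64))
          (missing (handler-case
                       (get-db "no-such-db" :if-does-not-exist :error)
                     (error () :signalled))))
      (list (db-name db)
            (db-key-encoding db)
            (db-value-encoding db)
            missing))))
```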
## Encoding and decoding data
In the C lmdb library, keys and values are opaque byte vectors
only ever inspected internally to maintain the sort order (of keys
and also duplicate values if @DUPSORT). The client is given the
freedom and the responsibility to choose how to perform conversion
to and from byte vectors.
LMDB exposes this full flexibility while at the same time providing
reasonable defaults for the common cases. In particular, with the
KEY-ENCODING and VALUE-ENCODING arguments of GET-DB, the
data (meaning the key or value here) encoding can be declared
explicitly.
Even if the encoding is undeclared, it is recommended to use a
single type for keys (and duplicate values) to avoid unexpected
conflicts that could arise, for example, when the UTF-8 encoding of
a string and the :UINT64 encoding of an integer coincide. The same
consideration doubly applies to named databases, which share the key
space with normal key-value pairs in the default database (see
@LMDB/THE-UNNAMED-DATABASE).
Together, :UINT64 and :UTF-8 cover the common cases for keys. They
trade off dynamic typing for easy sortability (using the default C
lmdb behaviour). On the other hand, when sorting is not a
concern (either for keys or values), serialization may be done more
freely. For this purpose, using an encoding of :OCTETS or NIL with
[cl-conspack](https://github.com/conspack/cl-conspack) is
recommended because it works with complex objects, it encodes object
types, it is fast and space-efficient, has a stable specification
and an alternative implementation in C. For example:
```
(with-temporary-env (*env*)
  (let ((db (get-db "test")))
    (with-txn (:write t)
      (put db "key1" (cpk:encode (list :some "stuff" 42)))
      (cpk:decode (g3t db "key1")))))
=> (:SOME "stuff" 42)
```
Note that multiple DB objects with different encodings can be
associated with the same C lmdb database, which declutters the code:
```
(defvar *cpk-encoding*
  (cons #'cpk:encode (alexandria:compose #'cpk:decode #'mdb-val-to-octets)))
(with-temporary-env (*env*)
  (let ((next-id-db (get-db "test" :key-encoding *cpk-encoding*
                                   :value-encoding :uint64))
        (db (get-db "test" :key-encoding *cpk-encoding*
                           :value-encoding *cpk-encoding*)))
    (with-txn (:write t)
      (let ((id (or (g3t next-id-db :next-id) 0)))
        (put next-id-db :next-id (1+ id))
        (put db id (list :some "stuff" 42))
        (g3t db id)))))
=> (:SOME "stuff" 42)