I make a no-CGO Go SQLite driver, by compiling the amalgamation to Wasm, then loading the result with wazero.
To compile SQLite, I use wasi-sdk which uses wasi-libc, based on musl. I'd heard that musl is slow(er than glibc), which is true, to a point.
musl uses SWAR on a size_t to implement various functions in string.h. This is fine, except size_t is just 32-bit on Wasm, whereas most CPUs these days are 64-bit.
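For readers unfamiliar with the trick, here is a minimal sketch of SWAR zero-byte detection on a size_t. It is written in the spirit of what musl does, not copied from it: `has_zero_byte` and `find_zero` are hypothetical names, and the word is loaded with `memcpy` on a length-bounded buffer so the sketch never reads past the end.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define ONES  ((size_t)-1 / UCHAR_MAX)      /* 0x0101...01 */
#define HIGHS (ONES * (UCHAR_MAX / 2 + 1))  /* 0x8080...80 */

/* Nonzero iff at least one byte of w is zero. */
static int has_zero_byte(size_t w) {
    return ((w - ONES) & ~w & HIGHS) != 0;
}

/* Hypothetical helper: index of the first zero byte in p[0..n), or n if none.
 * Each step tests sizeof(size_t) bytes at once: 4 on Wasm, 8 on most hosts. */
static size_t find_zero(const unsigned char *p, size_t n) {
    size_t i = 0;
    for (; n - i >= sizeof(size_t); i += sizeof(size_t)) {
        size_t w;
        memcpy(&w, p + i, sizeof w);  /* alignment-safe word load */
        if (has_zero_byte(w)) break;  /* a zero is somewhere in this word */
    }
    /* Locate the exact byte (and handle the tail) one byte at a time. */
    for (; i < n && p[i] != 0; i++) {}
    return i;
}
```

The word size is the whole story here: the same loop handles half as many bytes per iteration when size_t is 32-bit.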
I found that implementing a few of those functions with Wasm SIMD128 can make them go around 4x faster.
Other functions don't even use SWAR; redoing those can make them 16x faster.
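To give an idea of the shape of the SIMD128 versions, here is a rough sketch of a memchr-style search. This is not the code from the repo: `find_byte` is a hypothetical name, and it assumes clang's `wasm_simd128.h` with `-msimd128`. It only loads full 16-byte chunks that lie inside the buffer, and falls back to a plain byte loop when SIMD128 is not available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __wasm_simd128__
#include <wasm_simd128.h>
#endif

/* Hypothetical helper: pointer to the first byte equal to c in s[0..n),
 * or NULL if there is none. */
static void *find_byte(const void *s, int c, size_t n) {
    const unsigned char *p = s;
    unsigned char b = (unsigned char)c;
#ifdef __wasm_simd128__
    const v128_t needle = wasm_i8x16_splat((int8_t)b);
    while (n >= 16) {
        v128_t chunk = wasm_v128_load(p);          /* unaligned load is fine */
        v128_t eq = wasm_i8x16_eq(chunk, needle);  /* 0xFF in matching lanes */
        uint32_t mask = wasm_i8x16_bitmask(eq);    /* one bit per lane */
        if (mask != 0) {
            /* Lowest set bit is the first matching byte. */
            return (void *)(p + __builtin_ctz(mask));
        }
        p += 16;
        n -= 16;
    }
#endif
    /* Scalar tail, and the whole search when SIMD128 is unavailable. */
    for (; n != 0; p++, n--) {
        if (*p == b) return (void *)p;
    }
    return NULL;
}
```

Because the length is known up front, nothing here reads out of bounds; the string functions without a length (strlen and friends) need more care.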
I found that using SIMD intrinsics (rather than SWAR) seemingly makes it easier to avoid UB, but the code would definitely benefit from more eyeballs. The code (MIT licensed) is at: https://github.com/ncruces/go-sqlite3/blob/main/sqlite3/libc/string.h
I also reimplemented qsort, but that's another issue.
See this for some benchmarks on both x86-64 and Aarch64 (wazero only).
Reviewing this, testing it across other runtimes and platforms, and figuring out where it fits in your build infra are all significant time investments, so I'm opening this issue just to see if there's even interest in something like this (and hopefully to find someone willing to pick up some of the work I'm less familiar with).