Using SIMD for string.h #580

@ncruces

I make a no-CGO Go SQLite driver, by compiling the amalgamation to Wasm, then loading the result with wazero.

To compile SQLite, I use wasi-sdk which uses wasi-libc, based on musl. I'd heard that musl is slow(er than glibc), which is true, to a point.

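musl uses SWAR (SIMD within a register) on a size_t to implement various functions in string.h. This is fine, except that size_t is only 32 bits wide on Wasm, whereas most CPUs these days are 64-bit, so each step handles 4 bytes instead of the 8 the hardware could process.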
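For readers unfamiliar with SWAR, here is a minimal sketch of the classic zero-byte test on a 32-bit word. It is not musl's actual code: the names `has_zero_byte32` and `swar_find_zero` are made up for illustration, and the helper takes an explicit length so it never reads past the buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES32  0x01010101u  /* 0x01 in every byte */
#define HIGHS32 0x80808080u  /* 0x80 in every byte */

/* Returns 1 if and only if at least one byte of x is zero. */
static int has_zero_byte32(uint32_t x) {
    return ((x - ONES32) & ~x & HIGHS32) != 0;
}

/* Hypothetical helper: index of the first zero byte in s[0..n),
 * or n if there is none. Tests 4 bytes per iteration. */
static size_t swar_find_zero(const unsigned char *s, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        uint32_t w;
        memcpy(&w, s + i, sizeof w);  /* alignment-safe 4-byte load */
        if (has_zero_byte32(w))
            break;                    /* a zero byte is somewhere in this word */
    }
    /* Pin down the exact byte, or finish the tail shorter than 4 bytes. */
    for (; i < n && s[i] != 0; i++) {
    }
    return i;
}
```

With a 64-bit word the same loop would cover 8 bytes per iteration, which is the gap described above.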

I found that implementing a few of those functions with Wasm SIMD128 can make them go around 4x faster.

Other functions don't even use SWAR; redoing those can make them 16x faster.
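To give an idea of the approach, here is a rough sketch of a memchr-style search that compares 16 bytes at a time with Wasm SIMD128 intrinsics from clang's `wasm_simd128.h`. It is not the code from the linked repository: the name `simd_memchr_sketch` is hypothetical, the SIMD path is only compiled when `__wasm_simd128__` is defined (for example with `-msimd128`), and it only loads whole 16-byte chunks that lie inside the buffer, so it should not read out of bounds.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__wasm_simd128__)
#include <wasm_simd128.h>
#endif

/* Hypothetical name; intended to follow the contract of memchr. */
static void *simd_memchr_sketch(const void *s, int c, size_t n) {
    const unsigned char *p = (const unsigned char *)s;
    const unsigned char ch = (unsigned char)c;

#if defined(__wasm_simd128__)
    /* Broadcast the needle into all 16 lanes. */
    const v128_t needle = wasm_i8x16_splat((int8_t)ch);
    while (n >= 16) {
        /* Unaligned 16-byte load, then lane-wise equality. */
        v128_t eq = wasm_i8x16_eq(wasm_v128_load(p), needle);
        if (wasm_v128_any_true(eq)) {
            /* One bit per lane; the lowest set bit is the first match. */
            uint32_t mask = wasm_i8x16_bitmask(eq);
            return (void *)(p + __builtin_ctz(mask));
        }
        p += 16;
        n -= 16;
    }
#endif

    /* Scalar tail (and the whole search when SIMD128 is unavailable). */
    for (; n > 0; p++, n--) {
        if (*p == ch)
            return (void *)p;
    }
    return NULL;
}
```

Functions without a known length, such as strlen, need more care around where loads may end, which is where extra review would be most useful.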

I found that using SIMD intrinsics (rather than SWAR) seemingly makes it easier to avoid UB, but the code would definitely benefit from more eyeballs. The code (MIT licensed) is at: https://github.com/ncruces/go-sqlite3/blob/main/sqlite3/libc/string.h

I also reimplemented qsort, but that's another issue.

See this for some benchmarks on both x86-64 and Aarch64 (wazero only).

Reviewing this, testing it across other runtimes and platforms, and figuring out where it fits in your build infra are all a significant time investment, so I'm opening this issue just to see if there's even interest in something like this (and hopefully to find someone willing to pick up some of the work I'm less familiar with).
