Initial GoLang Support #211

Merged
merged 7 commits into from
Feb 23, 2025
@ashvardanian ashvardanian commented Feb 23, 2025

Together with @MarkReedZ we've added basic GoLang bindings to StringZilla, which look surprisingly fast compared to native GoLang strings. We currently use the new cgo annotations available in Go 1.24:

Cgo has gained new capabilities in Go 1.24, supporting new C function annotations to improve runtime performance. Among them, #cgo noescape cFunctionName is used to inform the compiler that the memory passed to cFunctionName will not escape; #cgo nocallback cFunctionName indicates that this C function will not call back any Go functions. In addition, Cgo's inspection of multiple incompatible declarations of C functions has become more stringent. When there are incompatible declarations in different files, errors are now detected and reported earlier and more accurately.

I was using an Intel Sapphire Rapids machine on AWS for preliminary testing and benchmarking. I've precompiled StringZilla with dynamic dispatch enabled, linked to the thin GoLang binding layer:

~/StringZilla/golang$ CGO_CFLAGS="-I$(pwd)/../include" \
        CGO_LDFLAGS="-L$(pwd)/../build_golang -lstringzilla_shared" \
        LD_LIBRARY_PATH="$(pwd)/../build_golang:$LD_LIBRARY_PATH" \
        go run ../scripts/bench.go --input ../leipzig1M.txt --split lines --seed 42

... and compared to native GoLang strings on some key operations:

Benchmarking on `../leipzig1M.txt` with seed 42.
Total input length: 129644797
Total lines: 1000000
Average line length: 128.64
Running benchmark using `testing.Benchmark`.
strings.Contains              :      309           3818144 ns/op
sz.Contains                   :      664           1881251 ns/op
strings.Index                 :      325           3669081 ns/op
sz.Index                      :      624           1990093 ns/op
strings.LastIndex             :       12          85201713 ns/op
sz.LastIndex                  :      494           2306318 ns/op
strings.IndexAny              :  6321228             181.0 ns/op
sz.IndexAny                   : 10608960             112.6 ns/op
strings.Count                 :      156           8015292 ns/op
sz.Count (non-overlap)        :      285           4206698 ns/op
sz.Count (overlap)            :      284           4204370 ns/op

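In these runs StringZilla was roughly 1.6x to 2x faster on most operations and about 37x faster on LastIndex. So if you are processing a lot of text in Go, try doing so with StringZilla and stay tuned for the upcoming 4.0 release #201 🥳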

@ashvardanian ashvardanian merged commit b818d1e into main Feb 23, 2025
7 checks passed
@ashvardanian ashvardanian deleted the main-golang branch February 23, 2025 16:32