opt:(encoder) use std strconv.AppendInt for better performance on arm #789

Merged
merged 1 commit into from
Apr 21, 2025

Conversation

AsterDY
Collaborator

@AsterDY AsterDY commented Apr 13, 2025

What type of PR is this?

Check the PR title.

  • This PR title matches the format: <type>(optional scope): <description>
  • The description of this PR title is user-oriented and clear enough for others to understand.
  • Attach the PR updating the user documentation if the current PR requires user awareness at the usage level. User docs repo

(Optional) Translate the PR title into Chinese.

(Optional) More detailed description for this PR(en: English/zh: Chinese).

en: According to the benchmark below (run on arm64), strconv.AppendInt() performs better than the native implementation in alg.I64toa:

goos: darwin
goarch: arm64
pkg: github.com/bytedance/sonic/internal/encoder/alg
cpu: Apple M3 Pro
BenchmarkI64toa/sonic-1-12              48961936                22.01 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-1-12                329461662                3.600 ns/op           0 B/op          0 allocs/op
BenchmarkI64toa/sonic-16-12             55322376                21.48 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-16-12               438972664                2.730 ns/op           0 B/op          0 allocs/op
BenchmarkI64toa/sonic-256-12            55268763                21.42 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-256-12              205196311                5.884 ns/op           0 B/op          0 allocs/op
BenchmarkI64toa/sonic-4096-12           54943064                21.90 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-4096-12             207751256                5.740 ns/op           0 B/op          0 allocs/op
BenchmarkI64toa/sonic-65536-12          54216253                21.70 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-65536-12            175604583                6.849 ns/op           0 B/op          0 allocs/op
BenchmarkI64toa/sonic-1048576-12        53292178                22.54 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-1048576-12          134011879                8.747 ns/op           0 B/op          0 allocs/op
BenchmarkI64toa/sonic-16777216-12       53392560                22.75 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-16777216-12         136341493                9.195 ns/op           0 B/op          0 allocs/op
BenchmarkI64toa/sonic-268435456-12      47283418                24.78 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-268435456-12        125863491                9.415 ns/op           0 B/op          0 allocs/op
BenchmarkI64toa/sonic-4294967296-12             46340396                24.68 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-4294967296-12               125847438                9.580 ns/op           0 B/op          0 allocs/op
BenchmarkI64toa/sonic-68719476736-12            48397423                24.95 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-68719476736-12              100000000               10.32 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/sonic-1099511627776-12          48338691                24.72 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-1099511627776-12            100000000               11.15 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/sonic-17592186044416-12         47782588                25.10 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-17592186044416-12           100000000               11.25 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/sonic-281474976710656-12        50566502                24.11 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-281474976710656-12          100000000               11.70 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/sonic-4503599627370496-12       45584118                24.80 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-4503599627370496-12         96807373                12.33 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/sonic-72057594037927936-12      49537647                23.57 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-72057594037927936-12        95043858                12.13 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/sonic-1152921504606846976-12            49055928                25.24 ns/op            0 B/op          0 allocs/op
BenchmarkI64toa/std-1152921504606846976-12              91301910                12.96 ns/op            0 B/op          0 allocs/op
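
For readers unfamiliar with the standard-library call being benchmarked, here is a minimal sketch of appending a base-10 int64 to a reusable buffer with strconv.AppendInt. The helper name appendInt64 is hypothetical and is not taken from the sonic code base; the actual change in this PR may wire the call in differently.

```go
package main

import (
	"fmt"
	"strconv"
)

// appendInt64 is a hypothetical helper (not sonic's API): it appends the
// base-10 text of v to buf and returns the extended slice.
func appendInt64(buf []byte, v int64) []byte {
	return strconv.AppendInt(buf, v, 10)
}

func main() {
	// Pre-size the buffer so the append does not need to grow it;
	// an int64 needs at most 20 bytes including the sign.
	buf := make([]byte, 0, 32)

	buf = appendInt64(buf, 4096)
	buf = append(buf, ',')
	buf = appendInt64(buf, -1152921504606846976)

	fmt.Println(string(buf))
}
```

Because strconv.AppendInt writes into the caller's buffer, it fits an encoder that already owns its output slice and reuses it across values.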

zh(optional):

(Optional) Which issue(s) this PR fixes:

(optional) The PR that updates user documentation:

@AsterDY AsterDY merged commit 07d1345 into main Apr 21, 2025
84 checks passed
@AsterDY AsterDY deleted the opt/encoder_vm branch April 21, 2025 10:49