Skip to content

Merging v1.3 development changes into master #420

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 245 commits into from
May 27, 2025
Merged

Merging v1.3 development changes into master #420

merged 245 commits into from
May 27, 2025

Conversation

ugorji
Copy link
Owner

@ugorji ugorji commented May 27, 2025

codec v1.3 uses generics and monomorphization to get much improved performance without the need for seldom-used codecgen. Performance numbers are impressive.

Removing codecgen dramatically reduces the work involved in making changes (where each change has to be effectively supported in codecgen mode).

Ugorji Nwoke and others added 30 commits March 1, 2025 11:06
Actions:
- remove codecgen (we can get most of the perf guarantees by better inlining/generics)
- remove other modules: go/ go/codec/bench go/codec/codecgen (keeping only go/codec)
- remove sort helpers (use generics and slices.XXX)
- remove cross-lib benchmarks out into repo: go-codec-bench
- refactor testing and benchmarks for easier sharing with go-codec-bench
…rmance

Key changes:
- try to parameterize around: (En|De)coder --> (En|Dec)Driver --> (IO/Byte)(Reader|Writer)
- global shared state is managed in Handle
- Do not store state: only support restoring to initial state
- For side encode/decode, each (En|De)coder will keep a byte encoder/decoder for side functions

We now have simple format working. We will then upgrade other drivers and enable them.
This allows us easily exclude files when working on a specific format.
Use generics to refactor some code out to helper parameterized
functions for sharing with other formats.
decoder.init(Handle) -> driver.init(Handle) --> reader.init()
decoder.reset() --> driver.reset() ?? is this needed??? maybe not. We only ever call .reset
decoder.reset(bytes/io) --> driver.reset(bytes/io) --> reader.reset(bytes/io)

driver.init() will call Make() on its reader/writer
decoder.init() will call Make() on its driver

Move side(En|De)coder,  rtidFn(s) into the shared(En|De)coder, so that it is used by drivers, decoder.
- remove isBytes() from decReader and encWriter.
  Instead, at creation of a decoder, we will (based on the context),
  determine whether bytes or io.
- clean up how code is shared between formats.
  keep all those shared code at bottom of file,
  so we just change the names.
We don't need to build a different one for each encoder/decoder.

Instead, all encoder/decoder share it globally, and it is
initialized at startup.
It is mostly used for extensions. Folks without extensions
shoulc not have to pay the price for it.
- multiple panicHdl.errorf callers should be replaced by a method that doesn't call fmt.Errorf
- sideEncode/sideDecode should use a reflect.Value directly (as opposed to wrapping it in a interface{})
- instead of creating a new string for a single byte, look it up in a pre-allocated string and substring that
This dramatically reduces alloc and improves run time.
Per GC guide, it's best to put pointers and values containing pointers
(e.g. interfaces, some slices of pointers, fields that are values containing pointers, etc)
together around the top of a struct.

This reduces GC pressure, since a value will only be scanned up to the last
pointer containing part of it.
ugorji added 29 commits May 20, 2025 14:53
…stly

- readn{2,3,4,8} return was escaping to heap as we called readb internally
- - now call readxb, so we just return a slice and convert that directly to an array
- fillbuf was previously pre-allocating if it saw that
  less than bufsize may be available on a write.
- - now, fillbuf only allocates when z.wc == len, and not enough left
    in capacity to handle another bufsize write.

Both changes improved performance for binary handles using bufio by a fair amount (~10-15%).
Instead, benchmarks pass in -tzc or configure testZeroCopy=true directly
- detach2Bytes no longer tries to decode into a secondary slice
- decodeBytesInto will let you know if you decoded into a diff slice (dBytesIntoState), and
  takes a mustFit parameter to ensure that you can copy the bytes into it (or raise an error)
- introduce tmpCopyBytes for copying field names (of < 56 chars) into a temp slice for transient use
- encode(...): do not handle nil or reflect.Value (encodeValue handles all that)
- introduce usableStructFieldNameBytes that may copy the struct field to prevent overwrite
- simplify isCanTransient and its use
- use aliases in helper_(unsafe|not_unsafe)
- simplify oneShotAddrRV to just check if flagCanTransient is set
- use regular for-loop and not range-over-int
- update go.mod to 1.21
@ugorji ugorji merged commit c121472 into master May 27, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant