cpunion commented Jan 16, 2026

Summary

This PR implements a register-based closure context (ctx) mechanism to significantly improve closure call performance by avoiding TLS/global memory access when possible. This addresses issue #1497.

Background

Previously, closure contexts were passed exclusively through a global or TLS slot, which required memory loads on every closure call. This approach worked but had performance overhead and potential thread-safety issues in certain scenarios.

Changes

Core Implementation

  1. Architecture-Specific Reserved Registers

    • Added per-architecture reserved registers for closure ctx:
      • amd64: r12
      • arm64: x26
      • 386: esi
      • riscv64: x27
      • Other platforms fall back to TLS/global
  2. Closure Representation

    • Maintains {fn: *func, data: unsafe.Pointer} layout
    • fn: original Go signature (no explicit ctx parameter)
    • data: nil for plain functions, heap-allocated context for free variables
  3. Calling Convention

    • Before call: write_ctx(data) - writes to reserved register or fallback
    • Callee reads ctx at function entry (if needed) and caches in local
    • After call: restore_ctx() - restores previous ctx to avoid corrupting caller
    • This is critical for nested closures and callbacks that may re-enter Go
  4. Register Preservation

    • Added -ffixed-* flags to clang to prevent LLVM register allocation from clobbering the ctx register
    • Applies to both native use() and UseTarget() paths
  5. Wrapper Functions

    • Only created for adaptation cases:
      • Legacy explicit-ctx functions: __llgo_stub.<fn>$ctx
      • Raw function pointers: __llgo_stub._llgo_func$<hash>
    • Regular function declarations are not wrapped
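The representation and calling convention above can be sketched in plain Go. This is only a simulation under stated assumptions: the package-level variable `ctxReg` stands in for the reserved register, and `closure`, `writeCtx`, `restoreCtx`, `call`, and the `*Env` types are hypothetical names chosen for illustration, not the code llgo generates.

```go
package main

import (
	"fmt"
	"unsafe"
)

// ctxReg stands in for the reserved register (for example r12 on amd64).
// Assumption: an ordinary variable is enough to model the save/restore logic.
var ctxReg unsafe.Pointer

// closure mirrors the {fn, data} layout: fn keeps the plain Go signature,
// data is nil when the function has no free variables.
type closure struct {
	fn   func(int) int
	data unsafe.Pointer
}

// writeCtx installs a new ctx and returns the previous value so the
// caller can put it back after the call.
func writeCtx(p unsafe.Pointer) unsafe.Pointer {
	prev := ctxReg
	ctxReg = p
	return prev
}

// restoreCtx puts the caller's ctx back.
func restoreCtx(prev unsafe.Pointer) { ctxReg = prev }

// call performs write_ctx, the call itself, then restore_ctx.
func call(c closure, arg int) int {
	prev := writeCtx(c.data)
	defer restoreCtx(prev)
	return c.fn(arg)
}

type addEnv struct{ base int }

// addBody reads ctx at entry and caches it in a local.
func addBody(x int) int {
	env := (*addEnv)(ctxReg)
	return env.base + x
}

func makeAdder(base int) closure {
	return closure{fn: addBody, data: unsafe.Pointer(&addEnv{base: base})}
}

type scaleEnv struct {
	inner closure
	scale int
}

// scaleBody caches its own ctx before making a nested closure call,
// which temporarily overwrites ctxReg.
func scaleBody(x int) int {
	env := (*scaleEnv)(ctxReg)
	y := call(env.inner, x)
	return y * env.scale
}

func makeScaled(inner closure, scale int) closure {
	return closure{fn: scaleBody, data: unsafe.Pointer(&scaleEnv{inner: inner, scale: scale})}
}

func main() {
	c := makeScaled(makeAdder(10), 3)
	fmt.Println(call(c, 5))    // (10 + 5) * 3
	fmt.Println(ctxReg == nil) // the outermost caller's ctx is back in place
}
```

The nested call inside `scaleBody` is the case the restore step protects: without `restoreCtx`, any later read of `ctxReg` in the outer closure would see the inner closure's context.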

Test Coverage

Added comprehensive test cases in cl/_testgo/ and cl/_testcall/:

  • Basic closure calls
  • Nested closures (including sync.Once.Do patterns)
  • Interface method closures
  • Goroutine with closures
  • Defer with closures
  • Reflect-based closures
  • C function calls within closures (register pollution tests)
  • Closure value passing
  • Method expressions and values

Documentation

Updated doc/closure.md with:

  • Design goals and rationale
  • Representation details
  • Calling convention
  • Architecture-specific register mapping
  • Performance considerations

Performance Impact

  • Reduced closure call overhead by eliminating TLS/global memory access on supported platforms
  • On platforms without a reserved register, ctx falls back to a thread-local (TLS) slot where the OS supports it
  • Bare-metal/wasm falls back to process-global slot
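The platform selection described in this PR can be summarized as a lookup with two fallbacks. The following is a minimal sketch: the register names come from the list under "Architecture-Specific Reserved Registers", while `ctxRegisters`, `ctxLocation`, and the `hasTLS` parameter are hypothetical names used only for illustration.

```go
package main

import "fmt"

// ctxRegisters lists the reserved ctx register per architecture,
// as stated in the PR description.
var ctxRegisters = map[string]string{
	"amd64":   "r12",
	"arm64":   "x26",
	"386":     "esi",
	"riscv64": "x27",
}

// ctxLocation reports where the closure ctx would live for a target.
// Assumption: hasTLS is false for bare-metal and wasm targets.
func ctxLocation(goarch string, hasTLS bool) string {
	if reg, ok := ctxRegisters[goarch]; ok {
		return "register " + reg
	}
	if hasTLS {
		return "TLS slot"
	}
	return "process-global slot"
}

func main() {
	fmt.Println(ctxLocation("arm64", true))
	fmt.Println(ctxLocation("arm", true))
	fmt.Println(ctxLocation("wasm", false))
}
```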

Testing

All existing tests pass. New test cases specifically validate:

  • Closure context preservation across nested calls
  • Register pollution scenarios (C function calls, nested closures)
  • Interface method closures
  • Goroutine safety with closures
  • sync.Once and sync.WaitGroup compatibility

Issue References

Fixes #1497

@gemini-code-assist

Note

The number of changes in this pull request is too large for Gemini Code Assist to generate a summary.

xgopilot bot commented Jan 16, 2026

Code Review Summary

This PR implements a well-designed register-based closure context mechanism. The architecture is sound with appropriate register selection (callee-saved registers), proper save/restore semantics, and comprehensive test coverage.

Strengths:

  • Comprehensive test coverage across all closure patterns
  • Register pollution tests for both C interop and nested closures
  • Proper thread safety through per-goroutine register isolation
  • Efficient wrapper generation with tail call optimization

Minor Issues to Address:

  • Unused noop() function in regpollute/in.go
  • Missing direct method call tests in go/in.go vs defer/in.go
  • Documentation could clarify wrapper naming variations

Overall, this is a solid implementation ready for merge after addressing minor issues.

// Test scenario: calling another closure before reading free variables
// This may cause x26 register to be clobbered

func noop() {}

This noop() function is declared but never used in the test file. Consider either removing it or adding a test case that uses it (e.g., a noop call between closure operations to test register preservation across non-closure calls).

// go with interface method value
im := i.Add
go im(8)
}

For consistency with defer/in.go, consider adding direct method call tests:

// go with direct method call (not via method value)
go s.Add(7)
go sv.Inc(8)

The defer test includes these (lines 63-64), so having parity would ensure the go statement handles direct method calls the same way.
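A parity test along these lines might look like the sketch below. The type `S` and its methods are assumptions modeled on the snippet above (a pointer-receiver `Add` and a value-receiver `Inc`); the real definitions in go/in.go may differ, and the `sync.WaitGroup` is added here only so the program waits for its goroutines.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// S is a hypothetical receiver type; both fields are pointers so that
// value copies of S still share the same counter and wait group.
type S struct {
	wg    *sync.WaitGroup
	total *atomic.Int64
}

// Add has a pointer receiver.
func (s *S) Add(n int64) {
	defer s.wg.Done()
	s.total.Add(n)
}

// Inc has a value receiver and adds n+1.
func (s S) Inc(n int64) {
	defer s.wg.Done()
	s.total.Add(n + 1)
}

// run starts goroutines through direct method calls and through a
// method value, then returns the accumulated total.
func run() int64 {
	var wg sync.WaitGroup
	var total atomic.Int64
	s := &S{wg: &wg, total: &total}
	sv := S{wg: &wg, total: &total}

	wg.Add(3)
	go s.Add(7)  // direct method call, pointer receiver
	go sv.Inc(8) // direct method call, value receiver
	im := s.Add  // method value, for comparison
	go im(1)
	wg.Wait()
	return total.Load()
}

func main() {
	fmt.Println(run()) // 7 + (8 + 1) + 1
}
```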

cpunion force-pushed the closure-ctxreg branch 2 times, most recently from 92e598c to 26e9b99 on January 16, 2026 00:21

- Fix crosscompile.use() to set LLVMTarget/GOOS/GOARCH for non-wasm cross-compilation
- Add unified compiler(compilingCSource bool) method that handles --target flag
- Add NativeCCompile flag to compile C files for host OS during cross-compilation
- Clone flag slices in compiler() to prevent side effects
- Remove redundant targetArgs(), cTargetArgs(), cCompiler() methods
- Enable NativeCCompile in cabi tests to avoid sysroot issues on Darwin

This fixes TestBuild failures when cross-compiling from Darwin to Linux.

Add aliases for common LLVM target triple arch names:
- x86_64 -> amd64
- aarch64 -> arm64
- armv5/armv6/armv7 -> arm
- i386/i686 -> 386
- wasm32 -> wasm
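The alias table can be sketched as a small normalization step applied to the first component of a target triple. This is an illustration only: `archAliases`, `canonicalArch`, and `archFromTriple` are hypothetical names, and the PR's own helper (`getCanonicalArchName`) may have a different signature and cover more cases.

```go
package main

import (
	"fmt"
	"strings"
)

// archAliases maps LLVM triple arch names to GOARCH names,
// following the list in the commit message.
var archAliases = map[string]string{
	"x86_64":  "amd64",
	"aarch64": "arm64",
	"armv5":   "arm",
	"armv6":   "arm",
	"armv7":   "arm",
	"i386":    "386",
	"i686":    "386",
	"wasm32":  "wasm",
}

// canonicalArch returns the GOARCH-style name for an arch string,
// or the input unchanged when no alias is known.
func canonicalArch(name string) string {
	if goarch, ok := archAliases[name]; ok {
		return goarch
	}
	return name
}

// archFromTriple normalizes the arch component of a target triple.
func archFromTriple(triple string) string {
	arch, _, _ := strings.Cut(triple, "-")
	return canonicalArch(arch)
}

func main() {
	fmt.Println(archFromTriple("x86_64-unknown-linux-gnu"))
	fmt.Println(archFromTriple("armv7-unknown-linux"))
}
```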

This fixes TestABI failures where llgo was not applying cabi type
transformations because arch names from LLVM target triples (e.g.
x86_64-unknown-linux-gnu, armv7-unknown-linux) were not matched
in the switch statement.

- Add target_arch.go with getCanonicalArchName helper for normalizing arch names
- Update ctxreg.go to export ReserveFlags for cross-compilation register handling
- Update ssa/expr.go and ssa_test.go for related changes
- Fix explicitTargetTriple to detect cross-compilation by GOOS/GOARCH
- Add NativeCCompile documentation (only for cabi_test)
- Restore TestRun* tests and add TestFromTestcall
- Add filterLinkerWarnings to filter ld64.lld warnings in test output
- Add expect.txt.new to .gitignore
- Fix import alias (envllvm -> llvm)

codecov bot commented Jan 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.66%. Comparing base (f6337d4) to head (a6aef86).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1568      +/-   ##
==========================================
- Coverage   91.01%   89.66%   -1.36%     
==========================================
  Files          45       46       +1     
  Lines       11971    12213     +242     
==========================================
+ Hits        10896    10951      +55     
- Misses        899     1075     +176     
- Partials      176      187      +11     

In the new register-based closure ABI, context is passed via register,
not as a function parameter. The reflect.Value.call was incorrectly
prepending the context pointer to the FFI argument list when calling
closures, causing argument misalignment and nil pointer dereference.
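The misalignment can be pictured with a toy model of a call frame. Everything here is hypothetical (`frame`, `buildFrame`, `firstString`) and greatly simplified compared with the real FFI path in reflect; it only shows why a prepended ctx shifts every positional argument by one slot.

```go
package main

import "fmt"

// frame is a toy stand-in for what the callee receives: a ctx
// "register" plus positional argument slots.
type frame struct {
	ctxReg any
	args   []any
}

// buildFrame assembles a call frame. With prependCtx set it mimics the
// old explicit-ctx convention by placing ctx in the first argument slot.
func buildFrame(ctx any, args []any, prependCtx bool) frame {
	f := frame{ctxReg: ctx}
	if prependCtx {
		f.args = append(f.args, ctx)
	}
	f.args = append(f.args, args...)
	return f
}

// firstString models a callee whose first declared parameter is a string.
func firstString(f frame) (string, bool) {
	if len(f.args) == 0 {
		return "", false
	}
	s, ok := f.args[0].(string)
	return s, ok
}

func main() {
	ctx := &struct{ captured int }{captured: 42}
	args := []any{"hello", 1}

	// Prepending ctx: the callee finds the ctx pointer where it expects a string.
	_, ok := firstString(buildFrame(ctx, args, true))
	fmt.Println(ok)

	// Register-based ABI: ctx travels separately, arguments line up.
	s, ok := firstString(buildFrame(ctx, args, false))
	fmt.Println(s, ok)
}
```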

Fixes text/template println error when executing templates.

Map riscv32 to riscv64 in goarchFromTriple so ESP32-C3 and other
RISC-V 32-bit targets can use the x27 context register instead of
falling back to a global variable.

Remove leading unsafe.Pointer parameter from MakeFunc FFI signature
and update argument indexing to match the new register-based closure
calling convention. All closures now use register-based context instead
of passing context as a function parameter.

Fixes texttemplate and reflectmake demos.

Fix single-element struct handling for riscv32 ABI by properly
flattening pointer and i32 types to i32, and float types to float
on ilp32f/ilp32d ABIs.

This fixes TestABI failures for esp32c3, riscv32_ilp32, riscv32_ilp32f,
and riscv32_ilp32d targets.

Development

Successfully merging this pull request may close these issues.

Discussion: Closure Context Passing - Comparing Stub, Register, and Official Go Approaches