feat: deferral extension CUDA tracegen #2601
stephenh-axiom-xyz wants to merge 2 commits into develop-v2.0.0-beta
Claude finished @stephenh-axiom-xyz's task in 8m 24s
Code Review: Deferral Extension CUDA Tracegen
Overall this is a well-structured PR that follows the established CUDA tracegen patterns in the codebase. The kernel/Rust split is clean, ABI bindings are consistent, and the GPU prover extension wiring correctly mirrors the CPU path. Below are my findings.

Issues

1. In-place [...]

    cub::DeviceReduce::ReduceByKey(
        d_temp_storage, temp_storage_bytes,
        d_records_fp16, d_records_fp16,  // same buffer for input and output keys
        d_counts, d_counts,              // same buffer for input and output values
        ...
    );

CUB's [...]

2. Canonicity [...]

    #ifdef CUDA_DEBUG
        assert(found);
    #endif

If all bytes of the input match [...]

3. Poseidon2 buffer capacity formula

    pub fn poseidon2_buffer_capacity(max_trace_height: usize) -> usize {
        max_trace_height.next_power_of_two() * 2 * (DIGEST_SIZE * 2)
    }

This yields [...]

4. Missing [...]

    const uint32_t row_idx = blockIdx.x * blockDim.x + threadIdx.x;
    RowSlice row(trace + row_idx, height);
    if (row_idx >= num_records) {
        row.fill_zero(0, sizeof(DeferralCallCols<uint8_t>));
        return;
    }

When [...]

Minor Observations

5. [...]

    assert(record_idx < capacity && "DeferralPoseidon2Buffer overflow");

This uses an unconditional [...]

6. Unused [...]
7. Builder aliasing and API changes: The [...]

    cfg_if::cfg_if! {
        if #[cfg(feature = "cuda")] {
            pub use self::cuda::Rv32DeferralGpuBuilder as Rv32DeferralBuilder;
        } else {
            pub use self::Rv32DeferralCpuBuilder as Rv32DeferralBuilder;
        }
    }

Tests correctly use [...]

Things That Look Good
Summary

The PR is solid and follows established patterns. The main item worth addressing is #2 (canonicity assert gating): while the AIR constraints should catch invalid inputs, making the assert unconditional would provide better error diagnostics. Items #1 and #3 are worth documenting with comments. Everything else looks correct.
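As a worked illustration of the capacity formula quoted in item #3, the following minimal sketch evaluates it for a few trace heights. The value `DIGEST_SIZE = 8` is an assumption made only for this example (the review does not state it), and the sample heights are arbitrary; the function body is copied from the snippet above.

```rust
// Assumption for illustration only: the review does not state DIGEST_SIZE.
const DIGEST_SIZE: usize = 8;

/// Formula as quoted in the review: round the height up to a power of two,
/// then multiply by 2 and by twice the digest size.
pub fn poseidon2_buffer_capacity(max_trace_height: usize) -> usize {
    max_trace_height.next_power_of_two() * 2 * (DIGEST_SIZE * 2)
}

fn main() {
    // Arbitrary sample heights, chosen to show the effect of rounding.
    for height in [1usize, 1000, 1024, 1025] {
        println!(
            "max_trace_height = {:>4} -> capacity = {}",
            height,
            poseidon2_buffer_capacity(height)
        );
    }
}
```

Under that assumption, heights 1000 and 1024 give the same result (32768), while 1025 doubles it (65536), because `next_power_of_two` rounds the height up before the constant factor of `2 * (DIGEST_SIZE * 2)` is applied. A comment of this kind next to the formula may be one way to address the documentation point raised in the summary.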
Resolves INT-6241.
Overview
This PR adds CUDA trace generation support for the deferral extension and wires it into the deferral GPU prover path. It also updates test call sites and CUDA workflows so the new path is exercised in CI.
1) Deferral Extension: CUDA Tracegen Implementation
Files
- `extensions/deferral/circuit/build.rs`
- `extensions/deferral/circuit/Cargo.toml`
- `extensions/deferral/circuit/src/lib.rs`
- `extensions/deferral/circuit/src/cuda_abi.rs`
- `extensions/deferral/circuit/src/extension/mod.rs`
- `extensions/deferral/circuit/src/extension/cuda.rs`
- `extensions/deferral/circuit/src/call/mod.rs`
- `extensions/deferral/circuit/src/call/cuda.rs`
- `extensions/deferral/circuit/src/count/mod.rs`
- `extensions/deferral/circuit/src/count/cuda.rs`
- `extensions/deferral/circuit/src/output/mod.rs`
- `extensions/deferral/circuit/src/output/cuda.rs`
- `extensions/deferral/circuit/src/poseidon2/mod.rs`
- `extensions/deferral/circuit/src/poseidon2/cuda.rs`
- `extensions/deferral/circuit/cuda/include/def_types.h`
- `extensions/deferral/circuit/cuda/include/def_poseidon2_buffer.cuh`
- `extensions/deferral/circuit/cuda/src/call.cu`
- `extensions/deferral/circuit/cuda/src/count.cu`
- `extensions/deferral/circuit/cuda/src/output.cu`
- `extensions/deferral/circuit/cuda/src/poseidon2.cu`
- `extensions/deferral/circuit/cuda/src/canonicity.cuh`
What changed
- [...] `openvm-rv32im-circuit/cuda` for GPU builder/prover compatibility.
- [...] `count`, `call`, `output`, and `poseidon2`.
- [...] `DeferralGpuProverExt` so the deferral extension now installs all required GPU chips.
Reviewer focus
- [...] `count`, poseidon2 record index/counters).
2) Builder API and Call-Site Updates
Files
- `extensions/deferral/circuit/src/extension/mod.rs`
- `crates/continuations/src/tests/e2e.rs`
- `extensions/deferral/tests/src/lib.rs`
- `crates/sdk/src/tests.rs`
What changed
- [...] `Rv32DeferralCpuBuilder`.
- [...] `Rv32DeferralBuilder` that resolves to `Rv32DeferralGpuBuilder` with CUDA and `Rv32DeferralCpuBuilder` without CUDA.
- [...] `Rv32DeferralBuilder` instead of a CPU-only builder.
- [...] `#[cfg(not(feature = "cuda"))]` gate from deferral integration tests so they can run in CUDA builds.
- [...] `Sdk::new(...)` instead of `CpuSdk::new(...)`, removing a CUDA-related TODO.
Reviewer focus
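The feature-dependent `Rv32DeferralBuilder` alias listed under "What changed" can be sketched roughly as follows. This is a simplified illustration, not the PR's code: the two builder structs are empty hypothetical stand-ins, and plain `#[cfg]` attributes with a type alias replace the `cfg_if!` macro and `pub use ... as` re-export so the example compiles without external crates.

```rust
// Hypothetical stand-in for the real CPU builder type.
#[derive(Debug, Default)]
pub struct Rv32DeferralCpuBuilder;

// Hypothetical stand-in for the real GPU builder; only compiled with `cuda`.
#[cfg(feature = "cuda")]
#[derive(Debug, Default)]
pub struct Rv32DeferralGpuBuilder;

// The alias resolves to the GPU builder when the `cuda` feature is enabled...
#[cfg(feature = "cuda")]
pub type Rv32DeferralBuilder = Rv32DeferralGpuBuilder;

// ...and to the CPU builder otherwise.
#[cfg(not(feature = "cuda"))]
pub type Rv32DeferralBuilder = Rv32DeferralCpuBuilder;

/// Returns the fully qualified name of the type the alias resolved to.
pub fn builder_name() -> &'static str {
    std::any::type_name::<Rv32DeferralBuilder>()
}

fn main() {
    // Call sites name only the alias, so the same code builds in both modes.
    let _builder = Rv32DeferralBuilder::default();
    println!("selected builder: {}", builder_name());
}
```

Because call sites refer only to the alias, tests written against `Rv32DeferralBuilder` can be compiled with or without the `cuda` feature without source changes, which appears to be the motivation for switching the test call sites away from a CPU-only builder.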
3) CUDA Workflow Updates
Files
- `.github/workflows/extension-tests.cuda.yml`
- `.github/workflows/continuations.cuda.yml`
- `.github/workflows/sdk.cuda.yml`
What changed
- [...] `deferral` to CUDA extension test matrix and `paths-filter` in `extension-tests.cuda.yml`.
- [...] `root-prover` alongside `cuda` in continuations CUDA workflow.
- [...] `--profile=heavy` from SDK CUDA workflow nextest invocation (kept `--test-threads=1`).
Reviewer focus
- [...] `cuda`, `root-prover`) matches intended test surface.