Note: MatX is approaching a 1.0 release with several major updates. 1.0 will include CUDA JIT capabilities that allow better kernel fusion and overall improvements in kernel runtimes. Along with the JIT capabilities, most files have changes that enable more efficient kernels. MatX 1.0 will require C++20 support in both the CUDA and host compilers, and CUDA 11.8 will no longer be supported.
Notable Changes:
- apply() and apply_idx() operators for writing lambda-based custom operators
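
As a rough illustration of the new lambda-based custom operators, the sketch below applies a device lambda element-wise over two tensors. It assumes apply() takes the callable followed by its input operators and that apply_idx() behaves similarly but also passes element indices to the callable; the exact signatures may differ, so consult the operator documentation.

```cpp
// Minimal sketch of the lambda-based operators (assumed signatures; see the
// MatX operator docs for the exact API). Requires nvcc with --extended-lambda
// for the __device__ lambda.
#include <matx.h>

int main() {
  using namespace matx;
  cudaExecutor exec{};

  auto a   = make_tensor<float>({16});
  auto b   = make_tensor<float>({16});
  auto out = make_tensor<float>({16});

  (a = 2.0f).run(exec);                      // broadcast a scalar
  (b = range<0>({16}, 0.0f, 1.0f)).run(exec); // 0, 1, 2, ...

  // apply(): run the lambda element-wise over the inputs
  // (assumed form: apply(callable, op...)).
  (out = apply([] __device__ (float x, float y) { return x * y + 1.0f; }, a, b)).run(exec);

  // apply_idx() is described as index-aware, i.e. the callable also receives
  // the element indices; its exact signature is not shown here.

  cudaDeviceSynchronize();
  print(out);
  return 0;
}
```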
Full Changelog
- Add profiling unit tests and fix timer safety by @cliffburdick in #1060
- Fixed-size reductions by @cliffburdick in #1061
- Fix gcc warning by @cliffburdick in #1062
- Added enum documentation for all operators by @cliffburdick in #1063
- Support ND operators and transforms to/from python by @cliffburdick in #1064
- Add prerun_done_ flag to prevent duplicate PreRun executions in transform operators by @cliffburdick in #1065
- Fix some iterator issues that come up with CCCL ToT by @miscco in #1066
- Properly use an `if constexpr` to guard segmented CUB algorithms by @miscco in #1067
- Fix cuTENSORNet/cuDSS library path and update to new cuTensorNet API by @cliffburdick in #1069
- Added apply() operator by @cliffburdick in #1072
- Update stdd docs by @cliffburdick in #1076
- Update release container to CUDA 13.0.1 by @tmartin-gh in #1068
- Add apply_idx operator for index-aware computations by @cliffburdick in #1077
- Fix missing include of `<cuda/std/utility>` by @miscco in #1078
Full Changelog: v0.9.3...v0.9.4