Helper function to print tensor inside CUDA kernels, designed for debugging inside CUDAGraph.
pip install git+https://github.com/flashinfer-ai/debug-print.git
See example.py.
Helper function to print tensor inside CUDA kernels, designed for debugging inside CUDAGraph.
pip install git+https://github.com/flashinfer-ai/debug-print.git
See example.py.