debug-print Helper function to print tensor inside CUDA kernels, designed for debugging inside CUDAGraph. Installation pip install git+https://github.com/flashinfer-ai/debug-print.git Usage See example.py.