lweitkamp/qlora_dequantize_triton

QLoRA Dequantization Kernels in Triton

My attempts at a double-dequantization kernel for QLoRA weights (see: Unsloth Puzzles). Results are compared against the bitsandbytes CUDA dequantization. See my blog post for more.
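For readers unfamiliar with the term, here is a minimal pure-Python sketch of what "double dequantization" computes. It is not the Triton kernel from this repo: it is a slow reference under stated assumptions. The function name `double_dequantize`, its argument names, and the high-nibble-first packing order are assumptions chosen for illustration. The `NF4` table holds the NormalFloat4 codebook values, rounded.

```python
# NF4 codebook (values rounded); index = 4-bit code stored in the weight tensor.
NF4 = [
    -1.0, -0.6961928, -0.5250731, -0.3949175,
    -0.2844414, -0.1847734, -0.0910500, 0.0,
    0.0795803, 0.1609302, 0.2461123, 0.3379152,
    0.4407098, 0.5626170, 0.7229568, 1.0,
]


def double_dequantize(packed, absmax_q, code2, absmax2, offset,
                      blocksize=64, blocksize2=256):
    """Hypothetical reference implementation, not the repo's kernel.

    packed    : bytes, two 4-bit codes per byte (assumed high nibble first)
    absmax_q  : 8-bit codes, one per block of `blocksize` weights
    code2     : float codebook used to decode `absmax_q`
    absmax2   : floats, one per group of `blocksize2` entries of `absmax_q`
    offset    : float added back to every decoded absmax
    """
    out = []
    for i, byte in enumerate(packed):
        for j, code in enumerate((byte >> 4, byte & 0x0F)):
            block = (2 * i + j) // blocksize
            # First dequantization: recover this block's absmax scale.
            absmax = code2[absmax_q[block]] * absmax2[block // blocksize2] + offset
            # Second dequantization: scale the 4-bit codebook value.
            out.append(NF4[code] * absmax)
    return out


# Toy example with tiny block sizes so the arithmetic is easy to follow.
toy = double_dequantize(
    packed=bytes([0xF0, 0x7F]),   # codes 15, 0, 7, 15
    absmax_q=[1, 2],
    code2=[0.0, 0.5, 1.0],
    absmax2=[2.0],
    offset=0.5,
    blocksize=2,
    blocksize2=2,
)
```

A fused kernel does the same two lookups and multiplies per element, but in parallel on the GPU and without materializing the intermediate absmax tensor.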

[Figure: Performance Speedup]
