Issues: flashinfer-ai/flashinfer
#682 · Deprecation Notice: Python 3.8 Wheel Support to End in future... · opened Dec 18, 2024 by yzh119 · Open
#1034 · Can flashinfer's CutlassSegmentGEMMSM90Run function be used for LoRA computing on H20? · opened Apr 23, 2025 by chenhongyu2048
#1027 · flashinfer.decode.single_decode_with_kv_cache: Floating point exception (core dumped) · opened Apr 20, 2025 by MenHimChan
#1023 · [Bug] FP8 scaling factors (k_scale/v_scale) not taking effect in BatchPrefillWithPagedKVCacheWrapper · opened Apr 17, 2025 by cscyuge
#1022 · Low performance of POD Attention compared to BatchPrefillWithPagedKVCache · opened Apr 17, 2025 by Edenzzzz
#978 · top_k_top_p_sampling_from_logits incompatible with torch.compile + CUDAGraph · opened Mar 28, 2025 by sharvil
#938 · CUDA error: too many blocks in cooperative launch(72) when use multi GPU · opened Mar 13, 2025 by Zzzer0o
#920 · RMSNorm failed with error code no kernel image is available for execution on the device · opened Mar 7, 2025 by nanmi
#919 · [Bugfix] CUDAGraph Compatibility in AppendPagedKVCacheKernel for Variable-Length Inputs · opened Mar 7, 2025 by SungBalance
#916 · ValueError: Invalid mode: forward_mode=<ForwardMode.TARGET_VERIFY: 6> · opened Mar 6, 2025 by tchaton
#915 · flashinfer.prefill.single_prefill_with_kv_cache meets error when running on A100 · opened Mar 6, 2025 by FengzhuoZhang