Pull requests: nod-ai/shark-ai
Pull requests list
Expand .create() and .add_to_archive() to handle ShardedTensors with InferenceTensor shards
#1884 opened Jul 18, 2025 by Alex-Vasile
Add an MLIR kernel for properly handling FP4 casting
#1882 opened Jul 18, 2025 by KyleHerndon
[sharktank][llm] Restrict page dimension to device_block_count
#1880 opened Jul 18, 2025 by Groverkss
[sharktank] Add sharding of fp4 quantized toy Llama theta
#1876 opened Jul 18, 2025 by sogartar
[sharktank] Make sharded-split tensor support quantized tensors
#1875 opened Jul 18, 2025 by sogartar
Bump IREE requirement pins to 3.6.0rc20250718
#1874 opened Jul 18, 2025 by shark-pr-automator (bot)
[sharktank] Yet another refactor of assert_tensor_close for tree support and integer dtypes
#1858 opened Jul 17, 2025 by sogartar
Bump aiohttp from 3.11.3 to 3.12.14 in /shortfin
Labels: dependencies, python
#1824 opened Jul 15, 2025 by dependabot (bot)
shortfin_apps.sd / builder: Change driver name from amdgpu to hip.
#1811 opened Jul 12, 2025 by monorimet
Cast attn_output to h dtype in paged_llama_attention_block
#1761 opened Jul 4, 2025 by sebvince