nod-ai / shark-ai Public

Fork 59
Star 42

Pull requests: nod-ai/shark-ai

Labels 16 Milestones 1

129 Open 1,454 Closed

Pull requests list

Expand .create() and .add_to_archive() to handle ShardedTensors with InferenceTensor shards

#1884 opened Jul 18, 2025 by Alex-Vasile

Use int8 for handling float8_e4m3fn compatibility

#1883 opened Jul 18, 2025 by KyleHerndon

Add an MLIR kernel for properly handling FP4 casting

#1882 opened Jul 18, 2025 by KyleHerndon

[sharktank][llm] Restrict page dimension to device_block_count

#1880 opened Jul 18, 2025 by Groverkss

[sharktank] Add sharding of fp4 quantized toy Llama theta

#1876 opened Jul 18, 2025 by sogartar

[sharktank] Make sharded-split tensor support quantized tensors

#1875 opened Jul 18, 2025 by sogartar

Bump IREE requirement pins to 3.6.0rc20250718

#1874 opened Jul 18, 2025 by shark-pr-automator bot

[sharktank] Add FP4 quantized tensor split and cat

#1873 opened Jul 18, 2025 by sogartar

Refactor decoder with stateful tools

#1871 opened Jul 18, 2025 by rsuderman

Minor fix for token selector reservation

#1867 opened Jul 17, 2025 by rsuderman

[sharktank] Yet another refactor of assert_tensor_close for tree support and integer dtypes

#1858 opened Jul 17, 2025 by sogartar

[sharktank] Add toy Llama FP4 quantization

#1857 opened Jul 17, 2025 by sogartar

Add native scorer

#1842 opened Jul 15, 2025 by zeeshanhaque21 • Draft

Fix view override for QuantizedTensor

#1835 opened Jul 15, 2025 by paulzzy

Cleanup scheduler to update rather than add / remove

#1832 opened Jul 15, 2025 by rsuderman • Draft

Start adding fp4 quantization accuracy test

#1829 opened Jul 15, 2025 by aviator19941 • Draft

Lisal.mooncake update write back

#1827 opened Jul 15, 2025 by lisaliu1

Bump aiohttp from 3.11.3 to 3.12.14 in /shortfin

Labels: dependencies, python

#1824 opened Jul 15, 2025 by dependabot bot

Set required Python version to ">=3.11"

#1815 opened Jul 14, 2025 by marbre

shortfin_apps.sd / builder: Change driver name from amdgpu to hip.

#1811 opened Jul 12, 2025 by monorimet

Sharktank extend rotary mask

#1809 opened Jul 11, 2025 by stbaione

Rewrite token selection strategy

#1793 opened Jul 9, 2025 by zeeshanhaque21 • Draft

Add a parameter that enables QuaRot for GEMMs

#1779 opened Jul 9, 2025 by KyleHerndon

[tuner] add support for attention op

#1772 opened Jul 8, 2025 by bangtianliu

Cast attn_output to h dtype in paged_llama_attention_block

#1761 opened Jul 4, 2025 by sebvince
