Skip to content

Sliding Window Forward #151

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jul 10, 2025
Merged

Sliding Window Forward #151

merged 10 commits into from
Jul 10, 2025

Conversation

micmelesse
Copy link
Collaborator

@micmelesse micmelesse commented Jul 10, 2025

Sliding Window implmentation for fwd, varlen_fwd & fwd_kvcache

micmelesse added 10 commits July 1, 2025 10:25
test case

set up debug inputs

add fwd ref

one mask ref

fwd first pass

save

ref doesnot work for bigger seqlens

save new version

some causal cases failing

found bad cases

working new attn

new atten works

new attn_fwd works

reorg n_extra_tokens

use seqlen_delta_qk

ref fwd works

add sliding window to bwd ref

test kvcache

decode ref work with everything except sliding window

add debug code for 12 failing sliding window cases for decode

attention_decode_forward_ref_impl mostly works except for alibi

fix alibi in attention_decode_forward_ref_impl

ref works with normal, varlen & kvcache

move stuff around

figure out masking

old attn inner

two inner functions

remove load_fn

do Lk - Lq like ref

unify IS_CAUSAL code in epilogue

clean up

add args

rm inference stuff

simplify compute_masking

simpler compute mask

stub out returning front masking variables

remove pointer pass

compute ptrs inloop

compute block min and max

window stub inside inner mask loop

trying to use attn_fwd_mask causes issues

fix compiler bug when front masking

gen specifc types

add sliding window and debug statements

use identity for v

add more taste cases

add comments

save

use k_max_token for clarity

disable debug configs

basic NON-CAUSAL SLIDING WINDOW

non causal sliding window works on the all the shapes

non sliding window working in fwd

clean up fused bwd

seperate old fwd_prefill

move configs to utils.py
@micmelesse
Copy link
Collaborator Author

main test suite passing

image

@micmelesse
Copy link
Collaborator Author

test.py passes

image

@micmelesse
Copy link
Collaborator Author

Perf is not changed for llama config
image

@micmelesse micmelesse marked this pull request as ready for review July 10, 2025 18:14
@micmelesse micmelesse merged commit 60c7334 into main_perf Jul 10, 2025
1 check failed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant