[Core][Ref] PagedAttention reference implementation #28815
Open
PiotrKrzem wants to merge 213 commits into openvinotoolkit:master from PiotrKrzem:feature/paged_reference
+1,958
−46
Changes from 77 commits
Commits
3c6e10f
[ADD] Paged Attention Reference impl
PiotrKrzem e6faae8
[ADD] Testing suite, add missing params to PagedAttn, fix KV caching
PiotrKrzem 6d25c4a
[FIX] Merge conflict
PiotrKrzem 52d4b63
Merge branch 'master' into feature/paged_reference
PiotrKrzem fbc22b0
[ADD] Update reference to correctly compute both outputs with RoPE ro…
PiotrKrzem 69d4796
[FIX] Remove staged artifacts
PiotrKrzem d3b31fc
Merge branch 'master' into feature/paged_reference
PiotrKrzem b8c9742
[FIX] Add RoPE to testing suite
PiotrKrzem 687d6aa
[FIX] Add missing test cases:
PiotrKrzem 0986e6d
[FIX] Test build
PiotrKrzem dfe28b0
[FIX] Single op graph
PiotrKrzem c582c82
[FIX] Visitor test
PiotrKrzem abbea22
[FIX] Use reference_tests::Tensor in tests
PiotrKrzem 6ccf30d
[FIX] test case name:
PiotrKrzem a9ebf4f
[FIX] Separate extension, update dependencies
PiotrKrzem 7c2c449
[FIX] Compilation errrors
PiotrKrzem fc38fcb
[FIX] Refactor for unit testing
PiotrKrzem 3e41603
[FIX] Re-add 40 tests with computed RoPE
PiotrKrzem 4cbb5fd
[FIX] Remove from ops16, apply review comments
PiotrKrzem ad45c38
[FIX] Build errors from refactor
PiotrKrzem b0d4c0c
[FIX] Inline funcs to supress warning
PiotrKrzem ddd129e
[FIX] Clang
PiotrKrzem 5c21ba1
[FIX] Comparison dtype error
PiotrKrzem 1c12a5d
[FIX] Add unit funcs to named namespace
PiotrKrzem 8091215
[FIX] Rename namespace
PiotrKrzem e1ff55d
[FIX] clang
PiotrKrzem 7bf188a
[FIX] template func lookup
PiotrKrzem 7f0acd1
Merge branch 'master' into feature/paged_reference
PiotrKrzem 6a22c16
[FIX] Add func to headers to avoid unused warn
PiotrKrzem 2d460e0
[FIX] GPU build namespace err
PiotrKrzem 4aa34e5
Merge branch 'master' into feature/paged_reference
mlukasze 1626784
[FIX] Explicitly call ref func
PiotrKrzem c90519c
[FIX] Clang
PiotrKrzem 80cb9a9
[FIX] Param list err
PiotrKrzem 08c7a1f
[FIX] Params err pt2
PiotrKrzem 64a0fd2
[FIX] Review comments exc internal namespace
PiotrKrzem 8225f28
[FIX] Remove internal namespace
PiotrKrzem d6ee9ef
[FIX] Remove PagedAttn from internal opset
PiotrKrzem cf3e05a
[FIX] Remove from internal namespace pt2
PiotrKrzem 933c596
[FIX} Cleanup of v16 and remaining artifacts
PiotrKrzem 9d6c9a1
Merge branch 'master' into feature/paged_reference
mlukasze 145fbd3
Merge branch 'master' into feature/paged_reference
PiotrKrzem 4d1cf62
[FIX] Tests
PiotrKrzem b679e8f
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem 4087530
Update src/core/reference/include/openvino/reference/paged_attention.hpp
mmikolajcz 4184529
Merge branch 'openvinotoolkit:master' into feature/paged_reference
PiotrKrzem 97033ac
Update src/core/src/op/paged_attention.cpp
PiotrKrzem b6015fd
Update src/core/src/op/paged_attention.cpp
PiotrKrzem 8709d3a
[FIX] Minor fixes to tests and ref
PiotrKrzem 2818126
[FIX] Build testing suite
PiotrKrzem c70fe1d
[FIX] Multiplies initial value
PiotrKrzem a437d1d
Fix some issues with reference test cases. Some issues are still ther…
mmikolajcz 0b5aeaf
Initial draft of functional shared single layer tests for PagedAttention
mmikolajcz b018c5a
Improve PagedAttention test structure and test case naming
mmikolajcz 1624be2
Apply changes to reference impl
mmikolajcz 0e649d4
Apply requested changes
mmikolajcz eb85af4
Add scale func tests
mmikolajcz 4dbb868
Merge branch 'master' into feature/paged_reference
PiotrKrzem 0de5177
[FIX] k,v heads, fixed alibi formula, cache copy, minor review fixes
PiotrKrzem fa43f64
[FIX] Struct for params, ultimate code purify
PiotrKrzem a627646
[FIX] int32 build errors
PiotrKrzem b132ee2
[FIX] size_t iont32_t mismatch fix
PiotrKrzem f64ee26
[FIX] Key int32_t error
PiotrKrzem f948bfd
[FIX] Tests compilation vec2str
PiotrKrzem 2a360e3
Split k and v head size
mmikolajcz 03dcf55
[ADD] Cache manager simulation, eviction, 2 new inputs, 2 new outputs…
PiotrKrzem fdf61ca
Merge branch 'master' into feature/paged_reference
PiotrKrzem d33e037
[FIX] Build bugfix
PiotrKrzem 9375a33
[FIX] sliding_window unused
PiotrKrzem 6484555
[FIX] 5th output build errors
PiotrKrzem d25e073
[FIX] Type prop tests with new inputs
PiotrKrzem fb44591
[FIX] Compatibility rank checks, minor code fixes, tests classes fixes
PiotrKrzem 38fcda4
Merge branch 'master' into feature/paged_reference
PiotrKrzem b94f552
[FIX] Type prop tests after merge
PiotrKrzem 16825a3
[FIX] Namespace error
PiotrKrzem 9094308
[FIX] Namespace error
PiotrKrzem ae2c84c
[FIX] Clang
PiotrKrzem 14f9c31
[REVERT] Revert 5 outputs, review comments
PiotrKrzem 100321f
[ADD] Debug prints, new requested test cases
PiotrKrzem dc31bb9
[FIX] Commented tests for clarity:
PiotrKrzem fcfd3e3
Merge branch 'master' into feature/paged_reference
PiotrKrzem c8b8010
[WIP] Add cache manager on apr with genai
PiotrKrzem 1cea130
Merge branch 'master' into feature/paged_reference
mlukasze f682490
[FIX] Compilation errors
PiotrKrzem 5af811d
[FIX] Cache eviction with working block logic
PiotrKrzem 49b0316
[ADD] Inserter of cache into models, replace all key_cache. and value…
PiotrKrzem ec0d3d6
[FIX] Rewire and improve compiled model and sync infer request for ca…
PiotrKrzem b1a04bb
[FIX] Clean code, minor logic fixes
PiotrKrzem 657d401
[FIX] CompiledModel dependency
PiotrKrzem 0998633
[FIX] Build errors
PiotrKrzem adb2d81
[FIX] Clang
PiotrKrzem ba6de21
[FIX] Remove relocation artifact
PiotrKrzem c26b31a
Merge branch 'master' into feature/paged_reference
PiotrKrzem 5a4d8db
[FIX] set_out
PiotrKrzem 7dbfc79
Merge branch 'master' into feature/paged_reference
PiotrKrzem 36752fe
[FIX] Shape inference
PiotrKrzem 826f550
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem 10a68e2
Merge branch 'master' into feature/paged_reference
PiotrKrzem 5d30421
Merge branch 'master' into feature/paged_reference
PiotrKrzem 6123418
[FIX] Name to iName to index insertion of cache
PiotrKrzem 155373b
[ADD/FIX] Tests for CM, fix building errors, clang
PiotrKrzem e2a7678
[FIX] Gods of Cmake please let this work
PiotrKrzem 52607db
[FIX] Clang
PiotrKrzem 5fab303
[FIX] Android build error
PiotrKrzem 4fe3c45
[FIX] Android build pt2
PiotrKrzem c02e65d
[FIX] Cmake pt 2
PiotrKrzem b8961af
Merge branch 'master' into feature/paged_reference
PiotrKrzem 4875de5
[FIX] Android clang error pt3
PiotrKrzem e75886c
git pushMerge branch 'feature/paged_reference' of https://github.com/…
PiotrKrzem bdaad83
Merge branch 'master' into feature/paged_reference
PiotrKrzem 5cfef8c
Merge branch 'master' into feature/paged_reference
PiotrKrzem a0d1257
Merge branch 'master' into feature/paged_reference
PiotrKrzem fda736b
Merge branch 'master' into feature/paged_reference
mlukasze 1a5ec8d
[WIP][ADD] CacheManager version 2 with globally managed single memory…
PiotrKrzem b2135ef
Merge branch 'master' into feature/paged_reference
PiotrKrzem 0fd4677
[FIX] Remove redundane cache
PiotrKrzem 7552cce
[FIX] Move reference cache to core
PiotrKrzem f40f1f0
[FIX] PagedCache build errors pt1
PiotrKrzem 0578a3d
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem 41e6a39
Merge branch 'master' into feature/paged_reference
PiotrKrzem c2db68a
Merge branch 'master' into feature/paged_reference
mlukasze 086f358
[FIX] Use node as the key ID, fix memory access error, fix build erro…
PiotrKrzem cc14b54
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem 33fa48d
[FIX] Build fixes pt 3
PiotrKrzem e06a603
[FIX] Review comments, remove Ref tests for CPU tests, fix liner errors
PiotrKrzem 3abb9f5
[ADD] Ref vs CPU test
PiotrKrzem 7f0eff3
Merge branch 'master' into feature/paged_reference
PiotrKrzem 23fc025
[FIX] Linker errors from circular dependencies, cache uninitialized e…
PiotrKrzem 2ef1e13
git pushMerge branch 'feature/paged_reference' of https://github.com/…
PiotrKrzem 4eea05a
[FIX] Compile node error
PiotrKrzem f66f366
[FIX] Build error with new ID
PiotrKrzem e141585
[FIX] Link PCM to OV_API
PiotrKrzem 894881c
[FIX] Shape infer with new inputs
PiotrKrzem b9c0938
[FIX] Arguemnts list error
PiotrKrzem 3d87b42
[ADD] Debug message for C++ shape infer
PiotrKrzem d032535
[FIX] Macos whitespace
PiotrKrzem 50374ba
[ADD] Debug flags for PA inputs
PiotrKrzem af5d423
[FIX] Debug prints
PiotrKrzem 5a6aa18
[FIX] More debug prints
PiotrKrzem 08a5e27
[FIX] Even more debug prints
PiotrKrzem bd00bbe
Merge branch 'master' into feature/paged_reference
PiotrKrzem e27262c
[FIX] 21 inputs shape infer error
PiotrKrzem dc87a3d
[FIX] Tensor accessor
PiotrKrzem c6c4183
[FIX] Remove check for static shape for past lens
PiotrKrzem 287b160
[FIX] Allow 2-5 rank cache, limit to 4 rank for ref
PiotrKrzem 4f4d435
Merge branch 'master' into feature/paged_reference
PiotrKrzem 94e1653
Update attach_cache_manager_to_paged_attention.cpp
PiotrKrzem 8f7faf4
Update attach_cache_manager_to_paged_attention.hpp
PiotrKrzem 5bdd66c
Update paged_attention.hpp
PiotrKrzem 7b3fd7d
Update paged_attention.hpp
PiotrKrzem 4b4e302
Update paged_cache_manager.cpp
PiotrKrzem 537ea9b
Update paged_attention.hpp
PiotrKrzem 41d69c2
Merge branch 'master' into feature/paged_reference
PiotrKrzem e4a143c
Merge branch 'master' into feature/paged_reference
PiotrKrzem 1e380f8
Merge branch 'master' into feature/paged_reference
PiotrKrzem fb4b9f7
[FIX] Force undo changes, fix without namespace
PiotrKrzem c884ca2
[FIX] Clang
PiotrKrzem 144d549
Update simplify_shape_of_sub_graph.hpp
PiotrKrzem 89df626
Merge branch 'master' into feature/paged_reference
PiotrKrzem 12d874c
[DEBUG] Temporary revert of changes to check conditional compilation CI
PiotrKrzem 3258736
Merge branch 'master' into feature/paged_reference
PiotrKrzem c51c607
[FIX] Double down by style aligning to other common opt
PiotrKrzem beb5c11
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem 6675ee3
[FIX] Namespace change fix for CM
PiotrKrzem 5aafcbe
[FIX] Style
PiotrKrzem 03d2cfe
[FIX] Ref uses util PCM
PiotrKrzem 1a95889
Try fix CC build 1
praasz 6f92adb
Fix CC build 2
praasz 8e6cbaa
Try fix CC build 3
praasz 0325f02
[WIP][FIX] Review suggestions pt1
PiotrKrzem a905c13
[WIP][FIX] Review suggestions pt2
PiotrKrzem 3a17578
[WIP][FIX] Review suggestions pt3
PiotrKrzem 762f8e8
[WIP][FIX] Clang
PiotrKrzem c4cf062
[WIP][FIX] Convert fix, ov alignedbuffer introduction
PiotrKrzem 04dea11
[WIP][FIX] Clang
PiotrKrzem f382256
Merge branch 'master' into feature/paged_reference
PiotrKrzem 5a785a0
Merge branch 'master' into feature/paged_reference
PiotrKrzem f8c21be
Merge branch 'master' into feature/paged_reference
mlukasze f69e0e2
Merge branch 'master' into feature/paged_reference
PiotrKrzem e1f1d46
Update src/tests/functional/base_func_tests/src/base/utils/generate_i…
PiotrKrzem 275824d
[FIX][WIP] Resolve remaining majority of issues, blocked by relocatio…
PiotrKrzem f40b062
[FIX] Style
PiotrKrzem df9940d
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem 198b0fb
[FIX][WIP] Opaque ptr, conversions, minor fixes
PiotrKrzem 2962170
[FIX] Clang
PiotrKrzem 5cd5b22
Merge branch 'master' into feature/paged_reference
PiotrKrzem b64dda2
Merge branch 'master' into feature/paged_reference
PiotrKrzem b978c8c
[WIP][FIX] Build
PiotrKrzem 6face9e
[FIX] Clang
PiotrKrzem f833c4b
Merge branch 'master' into feature/paged_reference
PiotrKrzem 652f72c
Merge branch 'master' into feature/paged_reference
PiotrKrzem 9fc5ec3
[FIX] CPU Ref tests XAttn
PiotrKrzem 97cb81b
[FIX] Shape inference of 3rd shape, CPU tests
PiotrKrzem b5d173e
Merge branch 'master' into feature/paged_reference
PiotrKrzem f6f20ae
[FIX] C4273 build error
PiotrKrzem 942696c
[FIX] Clang
PiotrKrzem d9fa322
[FIX] Clang pt2
PiotrKrzem 0732b5e
[FIX] Clang3
PiotrKrzem e15e1f2
[FIX][WIP] CPU reference tests
PiotrKrzem 152fb68
[FIX][WIP] Fix for cache management pt2 for CPU func tests
PiotrKrzem 9e2576c
[FIX][WIP] Disable SDPA transformation for Ref comparison
PiotrKrzem 89f8ec1
[FIX][WIP] Fix shape building and simplify code for PA CPU tests
PiotrKrzem ac05f85
[FIX] Shape inference critical error for dynamic evictable sizes
PiotrKrzem 6481a3a
Merge branch 'master' into feature/paged_reference
PiotrKrzem 1320ec6
[FIX] Rewrite tests for clear comparison
PiotrKrzem c680446
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem c89e8ea
[FIX] Revert old test class to master
PiotrKrzem 956751c
[FIX] C4273 fix for Windows build pt2
PiotrKrzem 0a21789
[FIX] Provide void handle definiton inclass
PiotrKrzem f24bdda
[FIX] Clang, test fixture includes
PiotrKrzem 85ada87
[FIX] Neutralize quantization transformation for KV cache
PiotrKrzem 5385ca9
[FIX] Test only statis data types:
PiotrKrzem 2e94de1
Merge branch 'openvinotoolkit:master' into feature/paged_reference
PiotrKrzem
Why is required?
The getters/setters should be for attributes?
is not the same as node::get_output_element_type()