Add ggml_roll #1274

Merged
merged 2 commits into from
Jun 18, 2025
Acly (Collaborator) commented Jun 13, 2025

This adds ggml_roll (the equivalent of torch.roll) as a new operation. It shifts tensor elements along each dimension, wrapping around at the boundaries.

It is used by, e.g., the Swin Transformer to implement the shift for its window attention.

Comment on lines +1805 to +1814
// Move tensor elements by an offset given for each dimension. Elements that
// are shifted beyond the last position are wrapped around to the beginning.
GGML_API struct ggml_tensor * ggml_roll(
        struct ggml_context * ctx,
        struct ggml_tensor * a,
        int shift0,
        int shift1,
        int shift2,
        int shift3);
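
The wrap-around behavior described in the comment above can be illustrated with a small reference routine in plain C. This is only a sketch of the semantics, not the code from this PR: `wrap_index`, `roll_1d`, and `roll_2d` are hypothetical names, the data is assumed to be contiguous `float`, and dimension 0 is assumed to be the fastest-varying one.

```c
#include <assert.h>

// Hypothetical helper: map any integer onto the range [0, n).
int wrap_index(int i, int n) {
    assert(n > 0);
    int r = i % n;
    return r < 0 ? r + n : r;
}

// Reference 1-D roll: the element at index i moves to (i + shift) mod n.
// Elements pushed past the end reappear at the beginning.
void roll_1d(const float * src, float * dst, int n, int shift) {
    const int s = wrap_index(shift, n);
    for (int i = 0; i < n; ++i) {
        dst[(i + s) % n] = src[i];
    }
}

// Reference 2-D roll with one shift per dimension.
// Assumed layout: element (i0, i1) is stored at index i1*ne0 + i0.
void roll_2d(const float * src, float * dst,
             int ne0, int ne1, int shift0, int shift1) {
    const int s0 = wrap_index(shift0, ne0);
    const int s1 = wrap_index(shift1, ne1);
    for (int i1 = 0; i1 < ne1; ++i1) {
        for (int i0 = 0; i0 < ne0; ++i0) {
            const int j0 = (i0 + s0) % ne0;
            const int j1 = (i1 + s1) % ne1;
            dst[j1*ne0 + j0] = src[i1*ne0 + i0];
        }
    }
}
```

With these definitions, rolling `{1, 2, 3, 4, 5}` by 2 gives `{4, 5, 1, 2, 3}`, and a shift of -1 is treated the same as a shift of n - 1.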

Member

Should we change the signature like so:

Suggested change
- // Move tensor elements by an offset given for each dimension. Elements that
- // are shifted beyond the last position are wrapped around to the beginning.
- GGML_API struct ggml_tensor * ggml_roll(
-         struct ggml_context * ctx,
-         struct ggml_tensor * a,
-         int shift0,
-         int shift1,
-         int shift2,
-         int shift3);
+ // Move tensor elements by an offset given for each dimension. Elements that
+ // are shifted beyond the last position are wrapped around to the beginning.
+ // If tensor b is provided, the shifts become: shiftX' = shiftX + b[X]
+ GGML_API struct ggml_tensor * ggml_roll(
+         struct ggml_context * ctx,
+         struct ggml_tensor * a,
+         struct ggml_tensor * b, // optional I64 [GGML_MAX_DIMS]
+         int shift0,
+         int shift1,
+         int shift2,
+         int shift3);

In theory, this would allow the shifts to be passed as dynamic graph input data, which keeps the graph static in terms of nodes (and op params). This could be useful for graph reuse.
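
As a rough illustration of the `shiftX' = shiftX + b[X]` rule, the sketch below combines static shifts with optional runtime shifts. None of this is in the PR: `effective_shifts` is a hypothetical helper, `MAX_DIMS` is a local stand-in for `GGML_MAX_DIMS` (assumed to be 4), and reducing the result modulo the dimension size is an assumption about how the wrap would be handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DIMS 4 // stand-in for GGML_MAX_DIMS, assumed to be 4

// Hypothetical helper: merge the static op params with optional runtime
// shifts read from b. b may be NULL, in which case only the static shifts
// apply. Each result is reduced to [0, ne[d]) so it can be used directly
// as a wrap-around offset.
void effective_shifts(const int shift[MAX_DIMS], const int64_t * b,
                      const int64_t ne[MAX_DIMS], int64_t out[MAX_DIMS]) {
    for (int d = 0; d < MAX_DIMS; ++d) {
        assert(ne[d] > 0);
        int64_t s = (int64_t) shift[d] + (b != NULL ? b[d] : 0);
        s %= ne[d];
        if (s < 0) {
            s += ne[d];
        }
        out[d] = s;
    }
}
```

Under this scheme the static shifts stay baked into the op params, while `b` can change between evaluations without altering the graph structure.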

Member

I wouldn't bother with this unless there is some use case that you are thinking about; otherwise, this is likely to end up increasing complexity for no real reason. If you decide to add this, for clarity I would recommend adding a separate function that takes only the tensor, rather than trying to do everything with a single function and optional parameters.

Member

Ok, let's merge as is for now.

slaren merged commit 9e4bee1 into ggml-org:master on Jun 18, 2025
4 checks passed
3 participants