update op of cat #769
I think this modification no longer supports input tensors that are not contiguous.
Thank you for your suggestion. I have updated the code, but I'm not sure whether it meets your requirements. If convenient, please share any further suggestions.
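The contiguity concern raised above can be illustrated with a small plain-Python model of a strided view. This is only a sketch and not code from this PR: the helper names `logical_elements` and `is_contiguous` are hypothetical, and the model covers just the 2-D row-major case. It shows why a copy that assumes a dense layout returns elements in the wrong order for a transposed view.

```python
# Illustrative only: a plain-Python model of a 2-D strided tensor view.
# These helpers are hypothetical and do not come from FlagGems or PyTorch.

def logical_elements(buffer, shape, strides, offset=0):
    """Return the elements of a 2-D view in row-major logical order."""
    rows, cols = shape
    row_stride, col_stride = strides
    return [
        buffer[offset + r * row_stride + c * col_stride]
        for r in range(rows)
        for c in range(cols)
    ]


def is_contiguous(shape, strides):
    """True when the strides match a dense row-major layout (2-D only)."""
    _, cols = shape
    return strides == (cols, 1)


# A 2x3 row-major tensor stored as [0, 1, 2, 3, 4, 5].
buffer = [0, 1, 2, 3, 4, 5]

# Its transpose is a 3x2 view over the same buffer with strides (1, 3).
t_shape, t_strides = (3, 2), (1, 3)

# A copy that walks the buffer linearly assumes a dense layout.
naive_copy = buffer[: t_shape[0] * t_shape[1]]

# A stride-aware gather yields the logical order of the view.
correct_copy = logical_elements(buffer, t_shape, t_strides)
```

Here `naive_copy` and `correct_copy` differ, which is the kind of mismatch a concatenation kernel has to avoid, either by honoring strides or by making each input contiguous first.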
lgtm
for j in range(4):
    if j < num_tensors_in_batch:
        tensor = tensors_in_batch[j].contiguous()
I recommend reusing flag_gems.contiguous to ensure that the Triton kernel runs, even when it is called explicitly.
Where is flag_gems.contiguous? I haven't seen it used anywhere.
src/flag_gems/ops/cat.py
Outdated
@triton.jit
def copy_func(x):
    return x

def copy_func_kernel(
Is the copy_func_kernel still useful? If not, I suggest deleting it.
Sorry, I didn't notice that. I will remove it!
PR Category
Type of Change
Description
Issue
Progress
Performance