Update cudnn convolution kernel #10440



@linzs148 linzs148 commented Mar 6, 2024

No description provided.

@linzs148 linzs148 requested a review from mosout March 6, 2024 07:23
@linzs148 linzs148 added the op label Mar 6, 2024

github-actions bot commented Mar 6, 2024

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

@linzs148 linzs148 changed the title Update cudnn convolution kernel Add cudnn-frontend dependency Mar 6, 2024
@linzs148 linzs148 changed the title Add cudnn-frontend dependency Update cudnn convolution kernel Mar 17, 2024
oneflow/core/device/cudnn_util.h (outdated, resolved)
void Compute(user_op::KernelComputeContext* ctx, user_op::OpKernelState*,
const user_op::OpKernelCache* cache) const override {
// process context data
auto input = ctx->Tensor4ArgNameAndIndex("in", 0);

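Immutable objects such as `in` should be declared with `const auto*`, and mutable objects such as `tmp_buffer` with `auto*`. Please apply the same rule to the similar places below.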
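For illustration, the convention requested here can be sketched as follows. This is a minimal, self-contained example and not the PR's code: `Tensor`, `Context`, `TensorByName`, and `CopyAndSum` are hypothetical stand-ins for the OneFlow types and calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor handle (not OneFlow's user_op::Tensor).
struct Tensor {
  std::vector<float> data;
  const float* dptr() const { return data.data(); }
  float* mut_dptr() { return data.data(); }
};

// Hypothetical stand-in for a kernel context that hands out tensor pointers.
struct Context {
  Tensor in;
  Tensor tmp_buffer;
  Tensor* TensorByName(bool want_input) { return want_input ? &in : &tmp_buffer; }
};

// Copies the input into the scratch buffer and returns the sum of its elements.
float CopyAndSum(Context* ctx) {
  const auto* input = ctx->TensorByName(true);  // read-only: const Tensor*
  auto* tmp_buffer = ctx->TensorByName(false);  // written to: Tensor*
  // input->mut_dptr();  // would not compile: input points to a const Tensor
  tmp_buffer->data.assign(input->dptr(), input->dptr() + input->data.size());
  float sum = 0.0f;
  for (const float v : tmp_buffer->data) { sum += v; }
  return sum;
}
```

Writing `const auto*` rather than plain `auto` also makes it visible at the declaration that the variable is a pointer, and that the pointee is not meant to be modified.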

.SetIsMatchedHob(user_op::HobDeviceType() == DeviceType::kCUDA \
&& user_op::HobEnvBool("ONEFLOW_KERNEL_ENABLE_CUDNN_V8", false)) \
.SetInferTmpSizeFn([](user_op::InferContext* ctx) -> size_t { \
auto& input = ctx->InputTensorDesc("in", 0); \

Use `const` here; same below.
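A small sketch of what this changes for a reference binding, assuming the accessor returns a const reference (`TensorDesc`, `InferContext`, `InputDesc`, and `NumAxes` here are hypothetical stand-ins, not OneFlow's definitions). In that case `auto&` already deduces a const reference, so `const auto&` does not change the type; it only states the intent at the declaration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for illustration only.
struct TensorDesc {
  int ndim = 4;
};

struct InferContext {
  TensorDesc desc;
  const TensorDesc& InputDesc() const { return desc; }
};

int NumAxes(InferContext* ctx) {
  auto& implicit_ref = ctx->InputDesc();        // deduced as const TensorDesc&
  const auto& explicit_ref = ctx->InputDesc();  // same type, intent spelled out
  static_assert(std::is_same<decltype(implicit_ref), const TensorDesc&>::value,
                "auto& keeps the const of the returned reference");
  static_assert(std::is_same<decltype(explicit_ref), const TensorDesc&>::value,
                "const auto& names the same type explicitly");
  return implicit_ref.ndim == explicit_ref.ndim ? explicit_ref.ndim : -1;
}
```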

oneflow/core/device/cudnn_conv_util.h (outdated, resolved)

private:
void Compute(user_op::KernelComputeContext* ctx) const override {
auto input = ctx->Tensor4ArgNameAndIndex("x", 0);

Same as above.

.SetIsMatchedHob(user_op::HobDeviceType() == DeviceType::kCUDA
&& user_op::HobEnvBool("ONEFLOW_KERNEL_ENABLE_CUDNN_V8", false))
.SetInferTmpSizeFn([](user_op::InferContext* ctx) -> size_t {
auto& input = ctx->InputTensorDesc("x", 0);

Same as above.


mosout commented Apr 25, 2024

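Also, `auto` is used in many other places; please go through them and add `const` wherever it can be added. Parameters in some function declarations that are not modified in the function body should also be passed as `const&`.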
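As a rough sketch of the `const&` suggestion (the `Shape` alias and both functions are hypothetical and are not taken from this PR), a parameter that is only read can be taken by const reference, which avoids a copy and lets the compiler reject accidental writes:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical shape type for illustration.
using Shape = std::vector<int64_t>;

// `shape` is only read, so it is taken by const reference.
int64_t ElemCount(const Shape& shape) {
  int64_t count = 1;
  for (const auto dim : shape) { count *= dim; }
  return count;
}

// Locals that never change after initialization are declared const as well.
size_t BufferBytes(const Shape& shape, size_t bytes_per_elem) {
  const auto elem_count = ElemCount(shape);
  return static_cast<size_t>(elem_count) * bytes_per_elem;
}
```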

4 participants