
[BUG] Functionality TensorOp 80+ s8 * s8 + s32 => {s32, s8} not working #1981

Open
IzanCatalan opened this issue Dec 11, 2024 · 0 comments
Labels
? - Needs Triage bug Something isn't working

IzanCatalan commented Dec 11, 2024

Describe the bug
Hi, I have found that fprop conv2d is not working with integer data types on an NVIDIA A100. According to https://github.com/NVIDIA/cutlass/blob/main/media/docs/functionality.md, Ampere GPUs (SM80) can perform convolution with integers, and this is also exercised by the test https://github.com/NVIDIA/cutlass/blob/main/test/unit/conv/device/conv2d_fprop_implicit_gemm_s8nhwc_s8nhwc_s32nhwc_tensor_op_s32_sm80.cu

However, when I modify the example https://github.com/NVIDIA/cutlass/blob/main/examples/16_ampere_tensorop_conv2dfprop/ampere_tensorop_conv2dfprop.cu to use that same configuration, it does not work, and all I get is the following output:

Got cutlass error: Error Invalid Problem at: 656
This error refers to the following lines:

  ImplicitGemm implicit_gemm_op;

  size_t workspace_size = implicit_gemm_op.get_workspace_size(arguments);

  // Allocate workspace memory
  cutlass::device_memory::allocation<int8_t> workspace(workspace_size);

  result.status = implicit_gemm_op.can_implement(arguments);
  CUTLASS_CHECK(result.status);

I debugged a bit inside the ImplicitGemm class, and the error comes from lines 107-109:

  Status status = UnderlyingKernel::Mma::IteratorA::can_implement(args.problem_size);
  if (Status::kSuccess != status) {
    return status;
  }

Is this behaviour expected? According to the functionality readme (and the test), int8 convolution should be supported, so why does it not seem to work? Which data types are available for fprop conv2d on the Ampere (SM80) and Volta (SM70) architectures?
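For concreteness, the SM80 s8 unit test linked above instantiates its kernel roughly as follows. This is a sketch modeled on that test file; the exact tile shapes, stage count, and epilogue vector width are assumptions taken from typical SM80 s8 configurations, not a verified drop-in replacement for example 16:

```cpp
// Sketch of an s8 * s8 + s32 fprop configuration for SM80, modeled on
// conv2d_fprop_implicit_gemm_s8nhwc_s8nhwc_s32nhwc_tensor_op_s32_sm80.cu.
using ElementA = int8_t;             // activations
using ElementB = int8_t;             // filters
using ElementC = int32_t;            // output
using ElementAccumulator = int32_t;  // accumulate in s32

using Conv2dFpropKernel = typename cutlass::conv::kernel::DefaultConv2dFprop<
    ElementA, cutlass::layout::TensorNHWC,
    ElementB, cutlass::layout::TensorNHWC,
    ElementC, cutlass::layout::TensorNHWC,
    ElementAccumulator,
    cutlass::arch::OpClassTensorOp,
    cutlass::arch::Sm80,
    cutlass::gemm::GemmShape<128, 128, 64>,  // threadblock tile (assumed)
    cutlass::gemm::GemmShape<64, 64, 64>,    // warp tile (assumed)
    cutlass::gemm::GemmShape<16, 8, 32>,     // SM80 s8 tensor-op instruction
    cutlass::epilogue::thread::LinearCombinationClamp<
        ElementC,
        128 / cutlass::sizeof_bits<ElementC>::value,  // assumed vector width
        ElementAccumulator, ElementAccumulator>,
    cutlass::gemm::threadblock::GemmIdentityThreadblockSwizzle<>,
    3,                                       // pipeline stages (assumed)
    cutlass::arch::OpMultiplyAddSaturate,
    cutlass::conv::IteratorAlgorithm::kOptimized
>::Kernel;

using ImplicitGemm =
    cutlass::conv::device::ImplicitGemmConvolution<Conv2dFpropKernel>;
```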

Could the same data-type problem occur if, instead of configuring the convolution in C++ as in example 16, I use CUTLASS from Python?

And related to the last question: for Python, is the same ImplicitGemm class used, or is a different class invoked when a conv2d is performed?

Thanks.

Izan.

@IzanCatalan IzanCatalan added ? - Needs Triage bug Something isn't working labels Dec 11, 2024