leading dimension of sgemm in blas #34
Comments
Yes. If I may guess what you're trying to do: you're trying to multiply a row-major matrix with another row-major matrix, am I right? The BLAS definition for the cuBLAS library closely follows the gonum BLAS interface definitions, including all of the conditions. The main reason for doing so is compatibility: it's designed so that you can drop it in as a replacement for gonum's BLAS or OpenBLAS... I understand this makes row-major matrix multiplication more difficult. The current workaround is to use
Actually, that was all conjecture about what you were trying to do. I'm also considering an alternative, which is to follow the cuBLAS specs more closely with respect to checks. So, if you could share the code you wrote that led to the error, I'd be grateful.
This should be fixed now, @snowwalf. Can you check?
In simple terms, I think: if you are using a row-major representation, the leading dimension is the number of columns; conversely, in a column-major representation it is the number of rows.
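To make the rule above concrete, here is a minimal Go sketch of how the leading dimension enters the offset calculation in each layout. The helper names `rowMajorIndex` and `colMajorIndex` are made up for illustration and are not part of gorgonia/cu or gonum.

```go
package main

import "fmt"

// rowMajorIndex returns the flat offset of element (i, j) in a row-major
// matrix. ld is the leading dimension: the distance between the starts of
// consecutive rows, which equals the column count for a tightly packed matrix.
func rowMajorIndex(i, j, ld int) int { return i*ld + j }

// colMajorIndex returns the flat offset of element (i, j) in a column-major
// matrix. Here ld is the distance between the starts of consecutive columns,
// which equals the row count for a tightly packed matrix.
func colMajorIndex(i, j, ld int) int { return j*ld + i }

func main() {
	// The same 2x3 matrix in both layouts:
	//   1 2 3
	//   4 5 6
	rowMajor := []float32{1, 2, 3, 4, 5, 6} // ld = 3 (columns)
	colMajor := []float32{1, 4, 2, 5, 3, 6} // ld = 2 (rows)

	// Element (1, 2) is 6 in both cases.
	fmt.Println(rowMajor[rowMajorIndex(1, 2, 3)], colMajor[colMajorIndex(1, 2, 2)])
}
```

A leading dimension larger than the packed size is also legal; it simply means the matrix is a view into a bigger buffer.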
yup
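$$C = \alpha \, \mathrm{op}(A) \, \mathrm{op}(B) + \beta C$$

where $\alpha$ and $\beta$ are scalars, and $A$, $B$ and $C$ are matrices stored in column-major format with dimensions $\mathrm{op}(A)$: $m \times k$, $\mathrm{op}(B)$: $k \times n$ and $C$: $m \times n$, respectively. Also, for matrix $A$

$$\mathrm{op}(A) = \begin{cases} A & \text{if transa == CUBLAS\_OP\_N} \\ A^{T} & \text{if transa == CUBLAS\_OP\_T} \\ A^{H} & \text{if transa == CUBLAS\_OP\_C} \end{cases}$$

and $\mathrm{op}(B)$ is defined similarly for matrix $B$.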
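As a reading aid for the definition above, the following is a naive pure-Go reference of the same operation on column-major `float32` slices. It is only a sketch: `gemmColMajor` is a hypothetical helper, not the cuBLAS or gorgonia/cu routine, it uses booleans instead of the `CUBLAS_OP_*` constants, and it covers only the real-valued transpose case (no conjugate transpose).

```go
package main

import "fmt"

// gemmColMajor computes C = alpha*op(A)*op(B) + beta*C, where all matrices
// are stored column-major. op(X) is X when the trans flag is false and X^T
// when it is true. op(A) is m x k, op(B) is k x n, and C is m x n.
func gemmColMajor(transA, transB bool, m, n, k int, alpha float32,
	a []float32, lda int, b []float32, ldb int,
	beta float32, c []float32, ldc int) {

	// opA returns element (i, l) of op(A).
	opA := func(i, l int) float32 {
		if transA {
			return a[i*lda+l] // A is stored k x m, so read A(l, i)
		}
		return a[l*lda+i] // A is stored m x k, so read A(i, l)
	}
	// opB returns element (l, j) of op(B).
	opB := func(l, j int) float32 {
		if transB {
			return b[l*ldb+j] // B is stored n x k, so read B(j, l)
		}
		return b[j*ldb+l] // B is stored k x n, so read B(l, j)
	}

	for j := 0; j < n; j++ {
		for i := 0; i < m; i++ {
			var sum float32
			for l := 0; l < k; l++ {
				sum += opA(i, l) * opB(l, j)
			}
			c[j*ldc+i] = alpha*sum + beta*c[j*ldc+i]
		}
	}
}

func main() {
	a := []float32{1, 4, 2, 5, 3, 6}    // 2x3 [[1 2 3] [4 5 6]], lda = 2
	b := []float32{7, 9, 11, 8, 10, 12} // 3x2 [[7 8] [9 10] [11 12]], ldb = 3
	c := make([]float32, 4)             // 2x2, ldc = 2

	gemmColMajor(false, false, 2, 2, 3, 1, a, 2, b, 3, 0, c, 2)
	fmt.Println(c) // column-major [[58 64] [139 154]]
}
```

Note that in this column-major setting with no transposes, the smallest valid leading dimensions are `lda = m`, `ldb = k` and `ldc = m`.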
Code line https://github.com/gorgonia/cu/blob/master/blas/blas.go#L3514
It seems that ldc will always be n.
However, according to the Nvidia cuBLAS API documentation ( https://docs.nvidia.com/cuda/cublas/index.html#cublas-lt-t-gt-gemm ), ldc will always be m.
The same applies to lda and ldb.
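The two rules differ because they assume different storage orders. The sketch below tabulates the smallest legal leading dimensions for a no-transpose GEMM under each convention, assuming the gonum-style interface is row-major and the cuBLAS documentation is column-major. `minLeadingDims` is a hypothetical helper written for this comparison; it is not the actual check in blas.go.

```go
package main

import "fmt"

// atLeastOne mirrors the max(1, x) lower bound used in BLAS-style checks.
func atLeastOne(x int) int {
	if x < 1 {
		return 1
	}
	return x
}

// minLeadingDims returns the smallest legal lda, ldb and ldc for
// C (m x n) = A (m x k) * B (k x n) with no transposes.
// Row-major: the leading dimension must cover one row (the column count).
// Column-major: the leading dimension must cover one column (the row count).
func minLeadingDims(rowMajor bool, m, n, k int) (lda, ldb, ldc int) {
	if rowMajor {
		return atLeastOne(k), atLeastOne(n), atLeastOne(n)
	}
	return atLeastOne(m), atLeastOne(k), atLeastOne(m)
}

func main() {
	m, n, k := 2, 5, 3
	fmt.Println(minLeadingDims(true, m, n, k))  // row-major: 3 5 5
	fmt.Println(minLeadingDims(false, m, n, k)) // column-major: 2 3 2
}
```

Under these assumptions, "ldc is n" and "ldc is m" are both correct statements, each for its own layout.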
So when I follow the Nvidia API doc, the parameter check in blas.go fails. When I use the opposite rule for lda/ldb/ldc from the doc, an error like "On entry to SGEMM parameter number 10 had an illegal value" is returned.
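For context, one well-known way to reconcile the two layouts is the identity C^T = B^T · A^T: a row-major m×n buffer with leading dimension n is byte-for-byte the same as a column-major n×m buffer holding its transpose. The pure-Go sketch below demonstrates the identity with a stand-in column-major routine. Both `colMajorGemm` and `rowMajorMul` are hypothetical helpers; this is not a description of what gorgonia/cu does internally, nor necessarily the fix that was applied.

```go
package main

import "fmt"

// colMajorGemm computes C = A*B (no transposes) on column-major slices.
// It stands in for a column-major BLAS routine.
func colMajorGemm(m, n, k int, a []float32, lda int, b []float32, ldb int, c []float32, ldc int) {
	for j := 0; j < n; j++ {
		for i := 0; i < m; i++ {
			var sum float32
			for l := 0; l < k; l++ {
				sum += a[l*lda+i] * b[j*ldb+l]
			}
			c[j*ldc+i] = sum
		}
	}
}

// rowMajorMul computes C = A*B for row-major A (m x k), B (k x n) and
// C (m x n) by asking the column-major routine for C^T = B^T * A^T.
// The operands are swapped, m and n are swapped, and each leading
// dimension is the row-major column count of that buffer.
func rowMajorMul(m, n, k int, a, b, c []float32) {
	colMajorGemm(n, m, k, b, n, a, k, c, n)
}

func main() {
	a := []float32{1, 2, 3, 4, 5, 6}    // 2x3 row-major [[1 2 3] [4 5 6]]
	b := []float32{7, 8, 9, 10, 11, 12} // 3x2 row-major [[7 8] [9 10] [11 12]]
	c := make([]float32, 4)             // 2x2 row-major

	rowMajorMul(2, 2, 3, a, b, c)
	fmt.Println(c) // row-major [[58 64] [139 154]]
}
```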
This confuses me. Could you please help?
Thanks!