Releases: ROCm/rocSPARSE

rocSPARSE 3.4.0 for ROCm 6.4.1

20 May 13:16
4953add

rocSPARSE code for ROCm 6.4.1 did not change. The library was rebuilt for the updated ROCm 6.4.1 stack.

rocSPARSE 3.4.0 for ROCm 6.4.0

11 Apr 13:35
4953add

Added

  • Added support for rocsparse_matrix_type_triangular in rocsparse_spsv
  • Added test filters smoke, regression, and extended for emulation tests.
  • Added rocsparse_[s|d|c|z]csritilu0_compute_ex routines for iterative ILU
  • Added rocsparse_[s|d|c|z]csritsv_solve_ex routines for iterative triangular solve
  • Added GPU_TARGETS to replace the now deprecated AMDGPU_TARGETS in cmake files
  • Added BSR format to the SpMM generic routine rocsparse_spmm
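
To make the BSR addition concrete, here is a minimal pure-Python sketch of a BSR-times-dense product. It is not rocSPARSE code and does not call rocsparse_spmm: the function name bsr_spmm and its parameters are hypothetical, and it assumes zero-based indices, square blocks, and row-major storage inside each block.

```python
def bsr_spmm(mb, nb, block_dim, bsr_row_ptr, bsr_col_ind, bsr_val, B, alpha=1.0):
    """Return C = alpha * A * B for a BSR matrix A and a dense matrix B.

    A has mb block rows and nb block columns of square blocks of size
    block_dim. bsr_val[k] is the k-th block as a row-major list of
    block_dim * block_dim values (an assumption of this sketch).
    B is a list of rows and must have nb * block_dim rows.
    """
    if len(B) != nb * block_dim:
        raise ValueError("B must have nb * block_dim rows")
    n = len(B[0]) if B else 0
    C = [[0.0] * n for _ in range(mb * block_dim)]
    for bi in range(mb):
        # Visit every stored block in block row bi.
        for k in range(bsr_row_ptr[bi], bsr_row_ptr[bi + 1]):
            bj = bsr_col_ind[k]
            block = bsr_val[k]
            for r in range(block_dim):
                row = bi * block_dim + r
                for c in range(block_dim):
                    a = alpha * block[r * block_dim + c]
                    b_row = B[bj * block_dim + c]
                    for j in range(n):
                        C[row][j] += a * b_row[j]
    return C


# A 4x4 matrix made of two 2x2 diagonal blocks, multiplied by the identity.
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
dense_a = bsr_spmm(2, 2, 2, [0, 1, 2], [0, 1],
                   [[1, 2, 3, 4], [5, 0, 0, 5]], identity)
```

Multiplying by the identity returns the dense form of A, which is a quick way to check that the block layout is interpreted as intended.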

Changed

  • By default, the rocSPARSE shared library is now built with the --offload-compress compiler option, which compresses the fat binary. This significantly reduces the size of the shared library binary.

Optimized

  • Improved the performance of rocsparse_spmm when used with row order for B and C dense matrices and the row split algorithm, rocsparse_spmm_alg_csr_row_split.
  • Improved the adaptive CSR sparse matrix-vector multiplication algorithm when the sparse matrix has many empty rows at the beginning or at the end of the matrix. This improves the routines rocsparse_spmv and rocsparse_spmv_ex when the adaptive algorithm rocsparse_spmv_alg_csr_adaptive is used.
  • Improved stream CSR sparse matrix-vector multiplication algorithm when the sparse matrix size (number of rows) decreases. This improves the routines rocsparse_spmv and rocsparse_spmv_ex when the stream algorithm rocsparse_spmv_alg_csr_stream is used.
  • Compared to rocsparse_[s|d|c|z]csritilu0_compute, the routines rocsparse_[s|d|c|z]csritilu0_compute_ex introduce a number of free iterations. A free iteration is an iteration that skips the evaluation of the stopping criteria (if enabled). This allows the user to tune the algorithm for performance improvements.
  • Compared to rocsparse_[s|d|c|z]csritsv_solve, the routines rocsparse_[s|d|c|z]csritsv_solve_ex introduce a number of free iterations. A free iteration is an iteration that skips the evaluation of the stopping criteria. This allows the user to tune the algorithm for performance improvements.
  • Improved user documentation
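
The idea of free iterations can be illustrated with a small pure-Python sketch. It is not the rocSPARSE implementation and does not use the _ex routines. The function name jacobi_lower_solve, the Jacobi-style sweep, and the rule "evaluate the stopping criterion once after every free_iter unchecked iterations" are all assumptions chosen for illustration.

```python
def jacobi_lower_solve(L, b, tol=1e-10, max_iter=100, free_iter=0):
    """Iteratively solve L y = b for a dense lower-triangular L (list of rows).

    Returns (y, iterations_run, stopping_checks_performed).
    """
    n = len(b)
    y = [0.0] * n
    checks = 0
    for it in range(1, max_iter + 1):
        # Jacobi sweep: every entry is updated from the previous iterate only.
        y = [(b[i] - sum(L[i][j] * y[j] for j in range(i))) / L[i][i]
             for i in range(n)]
        # Free iterations skip the residual evaluation entirely.
        if it % (free_iter + 1) != 0:
            continue
        checks += 1
        residual = max(abs(b[i] - sum(L[i][j] * y[j] for j in range(i + 1)))
                       for i in range(n))
        if residual <= tol:
            return y, it, checks
    return y, max_iter, checks


L = [[2, 0, 0], [1, 1, 0], [0, 3, 4]]
b = [2, 3, 10]
every_step = jacobi_lower_solve(L, b, free_iter=0)   # checks after each sweep
two_free = jacobi_lower_solve(L, b, free_iter=2)     # checks every third sweep
one_free = jacobi_lower_solve(L, b, free_iter=1)     # checks every second sweep
```

In this sketch, free_iter=2 reaches the same solution in the same three sweeps with one residual evaluation instead of three, while free_iter=1 performs one extra sweep because the check falls on even iterations. This is the trade-off the free-iteration count is meant to tune.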

Resolved issues

  • Fixed an issue in rocsparse_spgemm, rocsparse_[s|d|c|z]csrgemm, and rocsparse_[s|d|c|z]bsrgemm where incorrect results could be produced when rocSPARSE was built with optimization level O0. This was caused by a bug in the hash tables that could allow keys to be inserted twice.
  • Fixed an issue in the routine rocsparse_spgemm when using rocsparse_spgemm_stage_symbolic and rocsparse_spgemm_stage_numeric, where the routine would crash when alpha and beta were passed as host pointers and where beta != 0.
  • Fixed an issue in rocsparse_bsrilu0 where the algorithm was running out of bounds of the bsr_val array.
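
To show why a duplicated key matters in a hash-based sparse product, the sketch below accumulates one row of C = A * B in an open-addressing table. It is a sequential illustration only, not the rocSPARSE kernel or its fix. The names hash_insert and spgemm_row, the linear probing scheme, and the modulo hash are assumptions.

```python
EMPTY = -1  # marker for an unused slot (column indices are assumed non-negative)


def hash_insert(keys, vals, col, val):
    """Add val to the entry for column col, creating the entry if needed."""
    size = len(keys)
    slot = col % size
    for _ in range(size):
        if keys[slot] == col:
            # Key already present: accumulate instead of inserting it again.
            vals[slot] += val
            return
        if keys[slot] == EMPTY:
            keys[slot] = col
            vals[slot] = val
            return
        slot = (slot + 1) % size  # linear probing
    raise RuntimeError("hash table is full")


def spgemm_row(row, a_ptr, a_ind, a_val, b_ptr, b_ind, b_val, table_size):
    """Return row `row` of C = A * B as a sorted list of (column, value)."""
    keys = [EMPTY] * table_size
    vals = [0.0] * table_size
    for k in range(a_ptr[row], a_ptr[row + 1]):
        j = a_ind[k]
        for l in range(b_ptr[j], b_ptr[j + 1]):
            hash_insert(keys, vals, b_ind[l], a_val[k] * b_val[l])
    return sorted((c, v) for c, v in zip(keys, vals) if c != EMPTY)


# A = [[1, 2], [0, 3]], B = [[4, 0, 5], [6, 0, 7]] in CSR form.
a_ptr, a_ind, a_val = [0, 2, 3], [0, 1, 1], [1, 2, 3]
b_ptr, b_ind, b_val = [0, 2, 4], [0, 2, 0, 2], [4, 5, 6, 7]
row0 = spgemm_row(0, a_ptr, a_ind, a_val, b_ptr, b_ind, b_val, table_size=2)
```

Each column must own exactly one slot. If the same column were stored in two slots, the row would contain two partial sums for that column, and both the nonzero count and the values of C would be wrong.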

Upcoming changes

  • Deprecated rocsparse_[s|d|c|z]csritilu0_compute routines. Users should use the newly added rocsparse_[s|d|c|z]csritilu0_compute_ex routines going forward.
  • Deprecated rocsparse_[s|d|c|z]csritsv_solve routines. Users should use the newly added rocsparse_[s|d|c|z]csritsv_solve_ex routines going forward.
  • Deprecated the use of AMDGPU_TARGETS in cmake files. Users should use GPU_TARGETS going forward.

rocSPARSE 3.3.0 for ROCm 6.3.3

19 Feb 17:47
9f64dd5

rocSPARSE code for ROCm 6.3.3 did not change. The library was rebuilt for the updated ROCm 6.3.3 stack.

rocSPARSE 3.3.0 for ROCm 6.3.2

28 Jan 15:44
9f64dd5

rocSPARSE code for ROCm 6.3.2 did not change. The library was rebuilt for the updated ROCm 6.3.2 stack.

rocSPARSE 3.3.0 for ROCm 6.3.1

20 Dec 16:13
ed5da19

rocSPARSE code for ROCm 6.3.1 did not change. The library was rebuilt for the updated ROCm 6.3.1 stack.

rocSPARSE 3.3.0 for ROCm 6.3.0

04 Dec 22:38
ed5da19

Added

  • Add rocsparse_create_extract_descr, rocsparse_destroy_extract_descr, rocsparse_extract_buffer_size, rocsparse_extract_nnz, and rocsparse_extract APIs to allow extraction of the upper or lower part of sparse CSR or CSC matrices.
  • Support for the gfx1151, gfx1200, and gfx1201 architectures.
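
As a rough picture of what extracting the upper or lower part of a CSR matrix means, here is a pure-Python sketch. It does not call the rocsparse_extract APIs and makes no claim about their signatures. The function csr_extract_triangle and its lower and keep_diag parameters are hypothetical, and zero-based indexing is assumed.

```python
def csr_extract_triangle(row_ptr, col_ind, val, lower=True, keep_diag=True):
    """Return the lower or upper triangular part of a CSR matrix as CSR arrays."""
    out_ptr = [0]
    out_ind, out_val = [], []
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_ind[k]
            in_part = (j < i) if lower else (j > i)
            if in_part or (keep_diag and j == i):
                out_ind.append(j)
                out_val.append(val[k])
        # The running length is the row pointer of the next row.
        out_ptr.append(len(out_ind))
    return out_ptr, out_ind, out_val


# A = [[1, 2, 0], [3, 4, 5], [0, 6, 7]]
row_ptr, col_ind, val = [0, 2, 5, 7], [0, 1, 0, 1, 2, 1, 2], [1, 2, 3, 4, 5, 6, 7]
lower_part = csr_extract_triangle(row_ptr, col_ind, val, lower=True)
strict_upper = csr_extract_triangle(row_ptr, col_ind, val, lower=False, keep_diag=False)
```

The last entry of the returned row pointer array is the nonzero count of the extracted part. That count has to be known before the output arrays can be allocated, which is one plausible reason for a separate nnz query step.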

Changed

  • Change the default compiler from hipcc to amdclang in install script and cmake files.
  • Change address sanitizer build targets so that only gfx908:xnack+, gfx90a:xnack+, gfx940:xnack+, gfx941:xnack+, and gfx942:xnack+ are built when BUILD_ADDRESS_SANITIZER=ON is configured.

Optimized

  • Improved user documentation

Resolved issues

  • Fixed the csrmm merge path algorithm so that the diagonal is clamped to the correct range.
  • Fixed a race condition in bsrgemm that could on rare occasions cause incorrect results.
  • Fixed an issue in hyb2csr where the CSR row pointer array was not being properly filled when n=0, coo_nnz=0, or ell_nnz=0.
  • Fixed the scaling in rocsparse_Xhybmv when only y=beta*y is performed, for example when alpha==0 in y=alpha*Ax+beta*y.
  • Fixed rocsparse_Xgemmi failures when the y grid dimension is too large. This occurred when n >= 65536.
  • Fixed the gfortran dependency for the azurelinux operating system.
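
The alpha==0 case of y=alpha*Ax+beta*y can be sketched in pure Python as follows. This is an illustration of the expected arithmetic, not the rocSPARSE kernel. The function name hybmv, the list-of-rows ELL layout, and the use of -1 as the ELL padding column are assumptions.

```python
def hybmv(alpha, ell_col, ell_val, coo_row, coo_col, coo_val, x, beta, y):
    """Return alpha * A * x + beta * y for A stored as an ELL part plus a COO part."""
    # Scale y first. When alpha == 0 this is the whole result,
    # and neither A nor x is read.
    out = [beta * yi for yi in y]
    if alpha == 0:
        return out
    # ELL part: fixed number of entries per row, padded with column -1.
    for i, (cols, vals) in enumerate(zip(ell_col, ell_val)):
        for j, v in zip(cols, vals):
            if j >= 0:
                out[i] += alpha * v * x[j]
    # COO part: the entries that did not fit into the ELL width.
    for i, j, v in zip(coo_row, coo_col, coo_val):
        out[i] += alpha * v * x[j]
    return out


# A = [[1, 0, 2], [0, 3, 0], [4, 5, 6]] with ELL width 2 and one COO entry.
ell_col = [[0, 2], [1, -1], [0, 1]]
ell_val = [[1, 2], [3, 0], [4, 5]]
coo = ([2], [2], [6])
full = hybmv(2, ell_col, ell_val, *coo, [1, 1, 1], 0.5, [10, 20, 30])
scale_only = hybmv(0, ell_col, ell_val, *coo, [1, 1, 1], 0.5, [10, 20, 30])
```

With alpha set to 0 the result depends only on beta and y, so the output must still be scaled even though the matrix contributes nothing.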

rocSPARSE 3.2.1 for ROCm 6.2.4

06 Nov 19:55
2d0f575

Added

  • Support for the gfx1151 architecture

rocSPARSE 3.2.0 for ROCm 6.2.2

27 Sep 16:01
b293299

rocSPARSE code for ROCm 6.2.2 did not change. The library was rebuilt for the updated ROCm 6.2.2 stack.

rocSPARSE 3.2.0 for ROCm 6.2.1

20 Sep 19:58
b293299

rocSPARSE code for ROCm 6.2.1 did not change. The library was rebuilt for the updated ROCm 6.2.1 stack.

rocSPARSE 3.2.0 for ROCm 6.2.0

02 Aug 16:15
b293299

Additions

  • New Merge-Path algorithm to SpMM, supporting CSR format
  • SpSM now supports row order
  • rocsparseio I/O functionality has been added to the library
  • rocsparse_set_identity_permutation has been added
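
For readers unfamiliar with merge-path work partitioning on CSR, the following pure-Python sketch shows the general technique: the row end offsets and the nonzero indices are treated as two sorted lists, and a binary search along a diagonal finds where each worker starts. It is a generic illustration, and rocSPARSE's SpMM kernels may be organized differently. The names merge_path_search and merge_path_partition are hypothetical.

```python
def merge_path_search(diagonal, row_end, nnz):
    """Return (rows_consumed, nonzeros_consumed) where the diagonal crosses the path."""
    num_rows = len(row_end)
    # Clamp the diagonal to the valid range [0, num_rows + nnz].
    diagonal = max(0, min(diagonal, num_rows + nnz))
    lo = max(diagonal - nnz, 0)
    hi = min(diagonal, num_rows)
    while lo < hi:
        mid = (lo + hi) // 2
        # The second list is the implicit sequence 0, 1, ..., nnz - 1.
        if row_end[mid] <= diagonal - mid - 1:
            lo = mid + 1
        else:
            hi = mid
    return lo, diagonal - lo


def merge_path_partition(row_ptr, num_parts):
    """Split rows plus nonzeros into num_parts nearly equal chunks of work."""
    row_end = row_ptr[1:]
    nnz = row_ptr[-1]
    total = len(row_end) + nnz
    per_part = -(-total // num_parts)  # ceiling division
    return [merge_path_search(p * per_part, row_end, nnz)
            for p in range(num_parts + 1)]


# 3 rows with 2, 0, and 3 nonzeros.
two_parts = merge_path_partition([0, 2, 2, 5], 2)
three_parts = merge_path_partition([0, 2, 2, 5], 3)
```

Each consecutive pair of coordinates bounds one worker's share, so an empty row costs one unit of work just like a nonzero does. This keeps the load balanced even when row lengths vary widely.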

Changes

  • Adjusted rocSPARSE dependencies to related HIP packages
  • Binary size has been reduced
  • A namespace has been wrapped around internal rocSPARSE functions and kernels
  • rocsparse_csr_set_pointers, rocsparse_csc_set_pointers, and rocsparse_bsr_set_pointers now allow the column indices and values arrays to be nullptr if nnz is 0
  • gfx803 target has been removed from address sanitizer builds

Optimizations

  • Improved user manual
  • Improved contribution guidelines
  • SpMV adaptive and LRB algorithms have been further optimized on CSR format
  • Improved performance of SpMV adaptive with symmetrically stored matrices on CSR format

Fixes

  • Compilation errors with BUILD_ROCSPARSE_ILP64=ON have been resolved