Spectral Embedding #871

aamijar · 2025-05-04T00:40:49Z

No description provided.

copy-pr-bot · 2025-05-04T00:40:53Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


cpp/include/cuvs/preprocessing/spectral/spectral_embedding.hpp

cpp/include/cuvs/preprocessing/spectral/spectral_embedding_types.hpp

cpp/src/preprocessing/spectral/spectral_embedding.cu

cjnolet · 2025-05-05T15:42:00Z

cpp/src/preprocessing/spectral/spectral_embedding.cu

+  raft::copy(knn_coo.cols(), knn_cols.data_handle(), nnz, stream);
+  raft::copy(knn_coo.vals(), d_distances.data_handle(), nnz, stream);
+
+  raft::sparse::COO<float> coo_no_zeros(stream);  // Don't pre-allocate dimensions


It's okay to use this API in internal code, but we need to make sure we're using the new raft sparse coo_matrix_view in any public APIs. raft::sparse::COO is deprecated and we'll eventually have to change this call. Please create an issue in RAFT for updating this call to use the new raft sparse API types and reference that issue here for completeness. That way we can do a simple grep to find this and fix it.

I'm currently working on using the new coo_matrix types and adding support for the relevant functions in raft. Tracking here: rapidsai/raft#2659 and rapidsai/raft#2656.
However, I'm running into some CUDA invalid memory accesses and I can't work out why they're happening. It was passing with my gtests, but when I connected it with cuML it didn't pass the pytests. I've narrowed it down to a failure in the gtests when I call transform twice in a row.

Okay, I think I found the issue: I needed to initialize some vectors with zeros.
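The thread does not say which vectors were affected, so the following is only a host-side sketch of this class of bug, not the actual fix. It assumes a hypothetical helper, `accumulate_degrees`, that sums edge weights per row into a caller-owned buffer. If the buffer is not reset before accumulating, a second call builds on whatever the first call (or the allocator) left behind. On the device the same pattern applies to buffers such as `rmm::device_uvector`, which does not zero its memory on allocation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): sums the weights of each row of a
// COO graph into `degrees`, a buffer the caller owns and may reuse.
inline void accumulate_degrees(const std::vector<int>& rows,
                               const std::vector<float>& vals,
                               std::vector<float>& degrees)
{
  // Reset the output first. Without this, a second call would add on top of
  // the values left in the buffer by the previous call.
  std::fill(degrees.begin(), degrees.end(), 0.0f);

  for (std::size_t i = 0; i < rows.size(); ++i) {
    degrees[rows[i]] += vals[i];
  }
}
```

Calling the helper twice on the same buffer gives the same result both times, which is the property the "transform twice in a row" gtest exercises.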

cpp/src/preprocessing/spectral/spectral_embedding.cu

cjnolet · 2025-05-05T15:43:27Z

cpp/src/preprocessing/spectral/spectral_embedding.cu

+  const int one = sym_coo.nnz;
+  raft::copy(row_ind.data_handle() + row_ind.size() - 1, &one, 1, stream);
+
+  auto csr_structure = raft::make_device_compressed_structure_view<int, int, int>(


You are using the new type here. Why not also use the coo version above? That'll save us a lot of refactoring time in the future.

The new csr matrix types are supported in the laplacian and lanczos functions. The new coo matrix types aren't supported in the functions where I need them, which is why I am using the legacy ones. I'm trying to migrate to the new types, but I'm currently stuck debugging #871 (comment).
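For context on the snippet quoted in this thread: a CSR row-offset array has `n_rows + 1` entries and its last entry equals `nnz`, which is what copying `sym_coo.nnz` into the final slot of `row_ind` sets. A minimal host-side sketch of building such offsets from COO row indices, using a hypothetical helper name and plain `std::vector` rather than the raft types, might look like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): builds CSR row offsets from the row
// indices of a COO matrix with `n_rows` rows.
inline std::vector<int> coo_rows_to_csr_offsets(const std::vector<int>& coo_rows, int n_rows)
{
  // Zero-initialized, one slot per row plus the trailing nnz slot.
  std::vector<int> offsets(static_cast<std::size_t>(n_rows) + 1, 0);

  // Count the entries in each row, shifted by one position.
  for (int r : coo_rows) {
    offsets[r + 1] += 1;
  }

  // An inclusive prefix sum turns the counts into offsets.
  // After this loop, offsets[n_rows] equals the number of nonzeros.
  for (int r = 0; r < n_rows; ++r) {
    offsets[r + 1] += offsets[r];
  }
  return offsets;
}
```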

cpp/src/preprocessing/spectral/spectral_embedding.cu

cpp/include/cuvs/preprocessing/spectral/spectral_embedding.hpp

cpp/src/preprocessing/spectral/detail/spectral_embedding.cuh

cpp/tests/preprocessing/spectral_embedding.cu

cjnolet · 2025-05-08T17:18:11Z

cpp/include/cuvs/preprocessing/spectral/spectral_embedding.hpp

Now that you've been able to wrap this through cuML, can you provide a sense of the speedup for different datasets? It would be super helpful to know what we are working with here.

cpp/include/cuvs/preprocessing/spectral/spectral_embedding.hpp

aamijar · 2025-05-08T23:15:08Z

cpp/include/cuvs/preprocessing/spectral/spectral_embedding.hpp

+  uint64_t seed;
+};
+
+// template <typename IndexTypeT, typename ValueTypeT>


enable support for <uint32_t/uint64_t, float/double>
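One common way to support a fixed set of index and value type combinations is a function template with explicit instantiations, one per combination. The sketch below illustrates that pattern in plain C++ with a hypothetical function, `sum_row`; it is not the PR's API and makes no claim about how the templated parameters in the header will eventually be shaped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical function (not from the PR): sums the values of one row of a
// COO matrix, templated on the index and value types.
template <typename IndexT, typename ValueT>
ValueT sum_row(const std::vector<IndexT>& rows, const std::vector<ValueT>& vals, IndexT row)
{
  ValueT total = ValueT{0};
  for (std::size_t i = 0; i < rows.size(); ++i) {
    if (rows[i] == row) { total += vals[i]; }
  }
  return total;
}

// Explicit instantiations for the four combinations named in the comment.
template float sum_row<std::uint32_t, float>(const std::vector<std::uint32_t>&,
                                             const std::vector<float>&,
                                             std::uint32_t);
template double sum_row<std::uint32_t, double>(const std::vector<std::uint32_t>&,
                                               const std::vector<double>&,
                                               std::uint32_t);
template float sum_row<std::uint64_t, float>(const std::vector<std::uint64_t>&,
                                             const std::vector<float>&,
                                             std::uint64_t);
template double sum_row<std::uint64_t, double>(const std::vector<std::uint64_t>&,
                                               const std::vector<double>&,
                                               std::uint64_t);
```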

cpp/src/preprocessing/spectral/spectral_embedding.cu

init spectral embedding

7c0c136

github-actions bot added cpp CMake labels May 4, 2025

aamijar added feature request New feature or request non-breaking Introduces a non-breaking change labels May 4, 2025

aamijar self-assigned this May 4, 2025

aamijar added 3 commits May 4, 2025 05:23

remove unused

123878b

update license year

f5d6915

return 0

d181676

aamijar mentioned this pull request May 4, 2025

Spectral Embedding rapidsai/cuml#6581

Draft

use raft matrix_vector_op

de5d56f