support heterogenous fanout type #4608
base: branch-24.12
Conversation
Some thoughts on changing the API a bit.
```cpp
  raft::random::RngState& rng_state,
  bool return_hops,
  bool with_replacement                           = true,
  prior_sources_behavior_t prior_sources_behavior = prior_sources_behavior_t::DEFAULT,
  bool dedupe_sources     = false,
  bool do_expensive_check = false);

#if 0
/* FIXME:
   There are two options to support heterogeneous fanout
```
Here's another option to explore.
Create a new function called neighbor_sample. Create it off of the biased sampling API, but with the following changes:
- the biases become optional instead of required. Then it can do either uniform or biased sampling in the same call, just by whether the biases are included or not
- the fanout and heterogeneous fanout as you have defined them. Or we might explore using std::variant, where it would take either a host_span or a tuple of host spans and make the right choice internally
- move the rng_state parameter to be right after the handle (before the graph_view). This feels like a better standard place for the parameter.

We can then mark the existing uniform_neighbor_sample and biased_neighbor_sample as deprecated. When we implement this, the internal C++ implementation can just call the new neighbor_sample with the parameters properly configured. This makes it a non-breaking change (eventually we'll drop the old functions) while still increasing code reuse.
Thoughts @seunghwak ?
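To make the std::variant idea concrete, here is a minimal host-side sketch. The types and names (fanout_t, fanout_for) are hypothetical, not the actual cugraph API, and std::vector stands in for raft::host_span: a variant holds either a per-hop fanout or a per-hop-per-edge-type fanout, and a single accessor makes the right choice internally.

```cpp
#include <cstdint>
#include <variant>
#include <vector>

// Illustrative types only -- cugraph would use raft::host_span instead of
// std::vector, and these names are placeholders.
using homogeneous_fanout_t = std::vector<int32_t>;  // fanout[hop]
struct heterogeneous_fanout_t {
  int32_t num_edge_types;
  std::vector<int32_t> values;  // values[hop * num_edge_types + edge_type]
};
using fanout_t = std::variant<homogeneous_fanout_t, heterogeneous_fanout_t>;

// One entry point picks the right fanout internally, so a single
// neighbor_sample function could serve both call styles.
int32_t fanout_for(fanout_t const& fanout, int32_t hop, int32_t edge_type)
{
  if (auto const* homo = std::get_if<homogeneous_fanout_t>(&fanout)) {
    return (*homo)[hop];  // same fanout for every edge type
  }
  auto const& hetero = std::get<heterogeneous_fanout_t>(fanout);
  return hetero.values[hop * hetero.num_edge_types + edge_type];
}
```

The optional biases would follow the same pattern: an absent optional selects the uniform path, a present one selects the biased path.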
- the biases become optional instead of required. Then it can do either uniform or biased in the same call just by whether the biases are included or not
=> In this case, we may update the existing non-heterogeneous fanout type sampling functions as well. i.e. combine the uniform & biased sampling functions. Not sure about the optimal balancing point between creating too many functions vs creating a function with too many input parameters.
Yeah... I guess we should avoid creating an overly busy function (one function handling all the different types of sampling by excessively using std::variant & std::optional in the input arguments), but we should also avoid creating too many functions... Not sure where the optimal balance point is...
In theory, adding new parameters exponentially increases code complexity (to handle all possible combinations of optional parameters), so we should prefer creating separate functions. If supporting an additional optional parameter requires only a minor change in the API and implementation, we may create one generic function (or one complex function in the detail namespace that handles all the different options, with multiple public functions calling it, if that helps reduce code replication).
```diff
@@ -368,6 +410,7 @@ cugraph_error_code_t cugraph_uniform_neighbor_sample(
   const cugraph_type_erased_device_array_view_t* label_to_comm_rank,
   const cugraph_type_erased_device_array_view_t* label_offsets,
   const cugraph_type_erased_host_array_view_t* fan_out,
+  const cugraph_sample_heterogeneous_fanout_t* heterogeneous_fanout,
```
Perhaps we take the same approach here. Create a new C API function called neighbor_sample, following the biased function definition. Add this parameter. Deprecate the other functions. In the implementation we can just check for nullptr (NULL).
```diff
@@ -150,7 +173,7 @@ neighbor_sample_impl(
   std::vector<size_t> level_sizes{};
   int32_t hop{0};
-  for (auto&& k_level : fan_out) {
+  for (auto&& k_level : (*fan_out)) {
```
This isn't actually sufficient yet... but I'm more worried about the API right now.

In the case of heterogeneous sampling, this loop will need two levels of for loop: an outer loop iterating by hop and an inner loop iterating by type.

I'd be inclined to add a setup loop that iterates over the types and generates the masks - and perhaps identifies the maximum number of hops to drive the outer loop. You'll need to get k_level from the right type/hop combination... so this for construct won't work at all; it will need to look different.
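A host-side sketch of that loop shape, assuming per-type fanouts are stored as one vector per edge type (plan_hops and this representation are illustrative only; the real implementation operates on device data and would generate edge masks in the setup pass):

```cpp
#include <algorithm>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical layout: per_type_fanout[t][h] is the fanout for edge type t
// at hop h; per-type fanout lengths may differ.
std::vector<std::tuple<int32_t, int32_t, int32_t>>  // (hop, edge_type, k)
plan_hops(std::vector<std::vector<int32_t>> const& per_type_fanout)
{
  // Setup pass over the types: find the maximum number of hops to drive the
  // outer loop (this is also where per-type masks would be built).
  size_t max_hops = 0;
  for (auto const& f : per_type_fanout) { max_hops = std::max(max_hops, f.size()); }

  std::vector<std::tuple<int32_t, int32_t, int32_t>> plan{};
  for (size_t hop = 0; hop < max_hops; ++hop) {           // outer: by hop
    for (size_t t = 0; t < per_type_fanout.size(); ++t) { // inner: by type
      if (hop < per_type_fanout[t].size()) {              // sizes vary by type
        plan.emplace_back(hop, t, per_type_fanout[t][hop]);
      }
    }
  }
  return plan;
}
```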
Right, I only added this so it would compile. I will revisit this approach once we lock down the API's interface. It only supports the non-heterogeneous type for now.
```diff
@@ -192,7 +215,7 @@ neighbor_sample_impl(
   if (labels) { (*level_result_label_vectors).push_back(std::move(*labels)); }

   ++hop;
-  if (hop < fan_out.size()) {
+  if (hop < (*fan_out).size()) {
```
fan_out size will (potentially) vary by type.
Right, I only added this so it would compile. I will revisit this approach once we lock down the API's interface. It only supports the non-heterogeneous type for now.
```python
# FIXME: Add expensive check to ensure all dict values are lists
# Convert to a tuple of sequence (edge type size and fanout values)
edge_type_size = []
[edge_type_size.append(len(s)) for s in list(fanout_vals.values())]
```
Does this iterate over the edge types in the dictionary in order? We need to make sure that this is constructed with edge type 0 first, followed by edge type 1, etc.
Right. I converted the heterogeneous fanout type to a sorted ordered dictionary.
```python
edge_type_size = []
[edge_type_size.append(len(s)) for s in list(fanout_vals.values())]
edge_type_fanout_vals = list(chain.from_iterable(list(fanout_vals.values())))
fanout_vals = (
```
Per my earlier suggestions, I think we want this to be a CSR structure, so converting from a list of sizes to a list of offsets is perhaps best done here.
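For illustration, the sizes-to-offsets conversion is just a prefix sum. A minimal host-side sketch with std::vector (not the PR's actual structures):

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Convert per-edge-type fanout-list sizes into CSR-style offsets:
// offsets[t] .. offsets[t + 1] delimits edge type t's fanout values.
std::vector<int32_t> sizes_to_offsets(std::vector<int32_t> const& sizes)
{
  std::vector<int32_t> offsets(sizes.size() + 1, 0);
  std::partial_sum(sizes.begin(), sizes.end(), offsets.begin() + 1);
  return offsets;
}
```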
We changed this back to a dense structure... so I think this code isn't right.
This still seems wrong to me. If you want to support fanout_vals as a dictionary I think we need to convert it to a dense array to get the right values. Do you have a python test for this path that we can verify?
```diff
@@ -314,8 +316,21 @@ def uniform_neighbor_sample(
         fanout_vals = fanout_vals.get().astype("int32")
     elif isinstance(fanout_vals, cudf.Series):
         fanout_vals = fanout_vals.values_host.astype("int32")
+    elif isinstance(fanout_vals, dict):
```
Same comments as above
Small change...
Review part 1
```cpp
  raft::handle_t const& handle,
  rmm::device_uvector<vertex_t>&& vertices,
  rmm::device_uvector<value0_t>&& values_0,
  rmm::device_uvector<value1_t>&& values_1);
```
I am not sure about this function.

We have shuffle_values, which works for arbitrary value types, and we have additional functions for commonly used type combinations defined in this header file (and explicitly instantiated for reuse in multiple places). This function works just for value0_t = float or double and value1_t = int32_t.

I have a few suggestions.
- Just use shuffle_values if you don't think you will call this function in multiple places for the same type combination.
- Be more explicit about what values_0 and values_1 are. For example, we are using weights, edge_id, and edge_weights for the shuffle functions for edges. Just seeing this declaration, callers might be misled that this function will work for arbitrary value0_t and value1_t.
- At the very minimum, we need to document what type combinations are supported.
I believe this function was added so that we didn't need to convert a big .cpp file to a .cu file just to shuffle values.
My suggestion would be to use option 2. If we later find other uses for this function we can revisit this.
This function is no longer used anywhere, though, so I can remove it for now. The only reason I added it was to shuffle the triplet (vertices, labels, and rank), but after discussing with @ChuckHastings we don't need to shuffle the latter. Should I just remove this function for now since there is no use case?
Yes. If we don't need the function I would delete it.
I removed the function since it is unused
```cpp
void transform_increment(rmm::cuda_stream_view const& stream_view,
                         raft::device_span<value_t> d_span,
                         value_t value);
```
Similar here. I am not sure whether creating a thrust wrapper for arbitrary types is a good idea. For commonly used types, we can clearly cut compile time and binary size by doing this. In that case, I am inclined to name the functions to indicate the supported types, or at least properly document them.

For example, for the sort function here,
- We may rename the function to sort_vertices, or at least sort_ints, to indicate that this works only for integers, and document the supported integer types (e.g. int32_t, int64_t). If we explicitly instantiate this function for floating point numbers as well, then we may create sort_floats as well.
- Or, at the very minimum, we need to document the supported types.
And our general convention is to pass the stream as the last parameter.

Here, we are passing the handle in some functions and the stream in others (and, when we do pass the stream, it is the last parameter). Better to be consistent. I think we should pass the stream as the last parameter consistently for the functions defined in this header file, to allow calling these functions in multi-stream executions.
@ChuckHastings Any thoughts on this?
Stream should be the last parameter, I think. We should do some review of the code and identify other places where we should be passing stream instead of handle. I think passing the handle into the algorithm is great, since it gives us access to everything. But I had to do some complex things in MTMG to get some of the lower level functions working in a multi-stream environment because we use the handle too much. I think we should look at many of the non-public functions and explore passing the comms object and stream instead of passing the handle.
Regarding these wrappers for thrust calls, I think we'll end up with higher quality code if we have function names that are more precise about what we're doing. I think sort_ints might be sufficiently precise... I imagine there are other integer data types that we would want to sort.
I went for option 1
```cpp
 *
 */
template <typename value_t>
void sort(raft::handle_t const& handle, raft::device_span<value_t> d_span);
```
raft::device_span<value_t> d_span

I guess "span" here really does not convey any additional information. The type already specifies that this is a raft::device_span, so naming the variable span provides nothing extra. It's like saying int32_t integer. Better to rename this (e.g. values).
That might have been my old code that Joseph's modifying to use a span... so that naming issue is probably my fault.
```cpp
  std::optional<raft::device_span<int32_t const>> label_to_output_comm_rank,
  raft::host_span<int32_t const> fan_out,
  sampling_flags_t sampling_flags,
  bool do_expensive_check = false);
```
A tedious thing... but I think it is more natural to list homogeneous sampling functions first.
This header file declares the sampling functions in the order of (heterogeneous, uniform), (heterogeneous, biased), (homogeneous, uniform), and (homogeneous, biased). Better list more basic sampling functions first.
```cpp
 * offsets), identifying the randomly selected edges. src is the source vertex, dst is the
 * destination vertex, weight (optional) is the edge weight, edge_id (optional) identifies the edge
 * id, edge_type (optional) identifies the edge type, hop identifies which hop the edge was
 * encountered in. The offsets array (optional) identifies the offset for each label.
```
Better state that the size of the src, dst, weight, edge_id, edge_type, or hop vector is # sampled edges while the size of the offsets vector is # labels + 1.
```cpp
 * If @p starting_vertex_offsets is not specified then no organization is applied to the output, the
 * offsets values in the return set will be std::nullopt.
 *
 * If @p starting_vertex_offsets is specified the offsets array will be populated. The offsets array
```
We don't have starting_vertex_offsets anymore.
Better update the documentation as well.
```cpp
 * @tparam edge_t Type of edge identifiers. Needs to be an integral type.
 * @tparam weight_t Type of edge weights. Needs to be a floating point type.
 * @tparam edge_type_t Type of edge type. Needs to be an integral type.
 * @tparam bias_t Type of bias. Needs to be an integral type.
```
No bias_t in this function
```cpp
 * @param starting_vertex_labels Optional device span of labels associated with each starting vertex
 * for the sampling.
 * @param label_to_output_comm_rank Optional device span identifying which rank should get each
 * vertex label. This should be the same on each rank.
```
should get each vertex label => should get sampling outputs of each vertex label?
```cpp
 * level. The fanout value at hop x is given by the expression 'fanout[x*num_edge_types +
 * edge_type_id]'
 * @param num_edge_types Number of edge types where a value of 1 translates to homogeneous neighbor
 * sample whereas a value greater than 1 translates to heterogeneous neighbor sample.
```
What happens if the user passes 1 here? Throw an exception and ask the caller to use the homogeneous version instead? Or internally call the homogeneous version? Or take the heterogeneous code path (which might be less efficient)?
Review part 2/3
```cpp
namespace cugraph {
namespace detail {

rmm::device_uvector<int32_t> convert_starting_vertex_offsets_to_labels(
```
starting_vertex_offsets => starting_vertex_label_offsets?
And should we really define this function and create one additional layer of indirection? What this function does is just call expand_sparse_offsets, so why not directly call expand_sparse_offsets?
Right, we can directly call expand_sparse_offsets, but I believe the additional layer of indirection was added to better describe the operation being performed, which is converting the starting_vertex_label_offsets to labels. And this is done through the expand_sparse_offsets method. @ChuckHastings any comment on this?
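For context, a host-side analogue of the operation being discussed (the real expand_sparse_offsets runs on device): label offsets such as [0, 2, 5] expand to per-vertex labels [0, 0, 1, 1, 1].

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Expand label offsets into a per-vertex label array: vertex positions
// offsets[i] .. offsets[i + 1] all receive label i.
std::vector<int32_t> offsets_to_labels(std::vector<int32_t> const& offsets)
{
  std::vector<int32_t> labels(offsets.back());
  for (size_t i = 0; i + 1 < offsets.size(); ++i) {
    std::fill(labels.begin() + offsets[i], labels.begin() + offsets[i + 1],
              static_cast<int32_t>(i));
  }
  return labels;
}
```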
```cpp
  label_t{0},
  thrust::maximum<label_t>());

label_map.resize(max_label + 1, handle.get_stream());
```
```cpp
rmm::device_uvector<int32_t> label_map(0, handle.get_stream());
...
label_map.resize(max_label + 1, handle.get_stream());
```
=>
```cpp
rmm::device_uvector<int32_t> label_map(max_label + 1, handle.get_stream());
```
And shouldn't the caller already know # labels? Should we really compute this here?
I believe we can infer the max/number of labels by looking at the size of the first device array in label_to_output_comm_rank: max_label = std::get<0>(label_to_output_comm_rank) - 1, or rmm::device_uvector<int32_t> label_map(label_to_output_comm_rank, handle.get_stream()). @ChuckHastings is it a safe assumption to make?
```cpp
  starting_vertex_labels ? std::make_optional(std::vector<rmm::device_uvector<label_t>>{})
                         : std::nullopt;

level_result_src_vectors.reserve((fan_out).size());
```
(fan_out).size() => fan_out.size()
```cpp
if (weights) { (*level_result_weight_vectors).push_back(std::move(*weights)); }
if (edge_ids) { (*level_result_edge_id_vectors).push_back(std::move(*edge_ids)); }
if (edge_types) { (*level_result_edge_type_vectors).push_back(std::move(*edge_types)); }
if (labels) { (*level_result_label_vectors).push_back(std::move(*labels)); }
```
No need to detach edge mask here?
Review part 3/3
```cpp
int32_t num_edge_types{1};
bool flag_replacement{true};

bool check_correctness{true};
```
We are not testing with edge masking; is this because we currently don't allow attaching two masks?
If that's the case, better add a FIXME statement. Once we add a primitive to support heterogeneous sampling, we won't need to attach two masks (or collapse two masks to one).
and is this because we currently don't allow attaching two masks?
Right. If you recall, I briefly mentioned that in one of our 1-on-1s a few weeks ago. I am adding a FIXME.
I added a fixme
```cpp
  mg_graph_view,
  std::optional<raft::device_span<vertex_t const>>{std::nullopt},
  rng_state,
  // 20,
```
Delete commented out code.
closes #4589
closes #4591