Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Jitify versions of binaryops for non-homogeneous types #892

Merged
merged 65 commits into from
Mar 9, 2019
Merged
Show file tree
Hide file tree
Changes from 51 commits
Commits
Show all changes
65 commits
Select commit Hold shift + click to select a range
0421c02
The core implementation. Ported from PR 94 on libgdf.
devavret Feb 6, 2019
d66bfda
Jitify binops google tests.
devavret Feb 7, 2019
c23fda5
Cleanups
devavret Feb 7, 2019
f84a330
Merge branch 'branch-0.6' into enh-ext-jitify-binops
devavret Feb 11, 2019
ca64bde
Merge branch 'fea-ext-test-utils-improvements' into enh-ext-jitify-bi…
devavret Feb 11, 2019
061d4e5
Merge branch 'fea-ext-test-utils-improvements' into enh-ext-jitify-bi…
devavret Feb 12, 2019
c9791b1
Refactored scalar implementation, added a test util to work with scal…
devavret Feb 12, 2019
590bb5c
Merge branch 'fea-ext-test-utils-improvements' into enh-ext-jitify-bi…
devavret Feb 13, 2019
9017317
The tests are all clean now and only use our own utils.
devavret Feb 13, 2019
df64885
Added cython bindings
devavret Feb 16, 2019
8fd5081
Cleanup types and doc.
devavret Feb 16, 2019
beec867
Use jit binop in cuDF for heterogeneous types.
devavret Feb 16, 2019
7fe29b8
Fixed bug where jit kernel assumes we always have null mask.
devavret Feb 16, 2019
8e54098
fixed another bug where null mask existence was assumed, now in cuDF
devavret Feb 17, 2019
ada453e
Enabled global JIT cache and confirmed performance and correctness. t…
devavret Feb 17, 2019
88c57c1
Added remaining binary ops were previously unimplemented
devavret Feb 18, 2019
b6650ce
Added pytests for the two new ops
devavret Feb 18, 2019
f9dea20
renamed Vax and Vay to Lhs and Rhs.
devavret Feb 20, 2019
48302f7
Documentation cleanup in types.
devavret Feb 20, 2019
d553db0
change name of namespace gdf to cudf.
devavret Feb 20, 2019
4413a78
changed error handling to use GDF_REQUIRE
devavret Feb 20, 2019
ca9e43e
documentation for launcher
devavret Feb 20, 2019
a55f111
Some cleanup of traits.
devavret Feb 20, 2019
b3673fc
Merge branch 'bug-binops-nullmask-and' into enh-ext-jitify-binops
devavret Feb 21, 2019
52892b6
Merge branch 'fea-ext-test-utils-improvements' into enh-ext-jitify-bi…
devavret Feb 21, 2019
96f83e7
modified `scalar_wrapper` to work with updated type_dispatcher
devavret Feb 21, 2019
fea6751
fix merge conflict that wasn't detected earlier
devavret Feb 21, 2019
bf4dc61
Removed valid mask binary op from JIT kernel
devavret Feb 21, 2019
358e0c0
fix bug in null mask calculation
devavret Feb 22, 2019
877ac61
Merge branch 'branch-0.6' into enh-ext-jitify-binops
devavret Feb 25, 2019
ca0b31f
We didn't need common_type after all
devavret Feb 25, 2019
b269ee4
Removed passing valid mask pointers to jit kernels
devavret Feb 25, 2019
322e4a5
Jit kernel now uses gdf_size_type
devavret Feb 25, 2019
35c2c46
Changed `getTypeName` to use compiler generated string instead of har…
devavret Feb 26, 2019
d93fa0b
style fix
devavret Feb 26, 2019
df641c4
Added modulo to index because we support it now.
devavret Feb 26, 2019
aaae814
fixing pytests.
devavret Feb 26, 2019
02e2a9e
Merge branch 'branch-0.6' into enh-ext-jitify-binops
devavret Feb 26, 2019
e0c8fab
fix inlining for util functions.
devavret Feb 26, 2019
be31178
Merge branch 'fea-ext-test-utils-improvements' into enh-ext-jitify-bi…
devavret Feb 27, 2019
c5cbd8b
documentation update
devavret Feb 27, 2019
b5c729b
changed casts and traits in jit code
devavret Feb 27, 2019
febbad6
more doc changes.
devavret Feb 27, 2019
d0e30b2
removed changelog duplicates (artifacts of merge conflicts)
devavret Feb 27, 2019
dde39a7
refactored the mask calculation.
devavret Feb 27, 2019
86ea59b
removed redundant null_count calculation and fixed a bug in calculati…
devavret Feb 28, 2019
11b67dc
fix rebuilding of launcher.cpp everytime because types.h.jit was alwa…
devavret Feb 28, 2019
9520853
Changed gdf_scalar to use union and is_valid bool member.
devavret Mar 4, 2019
be21879
added nvidia license
devavret Mar 4, 2019
3946e1c
Merge branch 'fea-ext-test-utils-improvements' into enh-ext-jitify-bi…
devavret Mar 5, 2019
78a3cc8
Merge branch 'branch-0.6' into enh-ext-jitify-binops
devavret Mar 5, 2019
99a943f
added TODO #1119
devavret Mar 6, 2019
500e188
Merge branch 'branch-0.6' into enh-ext-jitify-binops
devavret Mar 6, 2019
ec1116b
documentation cleanup
devavret Mar 6, 2019
c0243e6
one more doc change that i missed.
devavret Mar 7, 2019
9c729c7
changed JITIFY thread safe macro in CMakeLists
devavret Mar 7, 2019
4626f7d
Changed union members to typedef'd versions
devavret Mar 7, 2019
b5d1f78
Merge branch 'branch-0.6' into enh-ext-jitify-binops
devavret Mar 7, 2019
7fe729a
include the changed location of `error_utils.hpp`
devavret Mar 7, 2019
79d5111
updated jitify
devavret Mar 7, 2019
8c11073
Changed Cython binding map to dict.
devavret Mar 7, 2019
87e0b41
Clean Cmakelists
devavret Mar 7, 2019
0d797fe
Merge branch 'branch-0.6' into enh-ext-jitify-binops
kkraus14 Mar 8, 2019
09ffe36
Use rapids' fork of Jitify submodule.
devavret Mar 8, 2019
2bfa21e
now pointing to a branch in Forked Jitify
devavret Mar 8, 2019
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .gitmodules
Original file line number Diff line number Diff line change
Expand Up @@ -10,3 +10,6 @@
path = thirdparty/rmm
url = https://github.com/rapidsai/rmm.git
branch = branch-0.6
[submodule "thirdparty/jitify"]
path = thirdparty/jitify
url = https://github.com/NVIDIA/jitify.git
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we pin this to a specific branch / tag / commit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please pin to a commit to avoid build issues in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this do? Or do I need to specify it in .gitmodules.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Collaborator

@kkraus14 kkraus14 Mar 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to specify it in .gitmodules otherwise a git submodule update --init --recursive --remote will update it to the latest commit of master.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't seem to find a way to pin this to a commit/tag in the .gitmodules file. I read this and update seems to be the only option. But details on update say that the only way to do this is to use none. Is that what I should do?

2 changes: 1 addition & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@

## Improvements

- PR #892 Add support for heterogeneous types in binary ops with JIT
- PR #730 Improve performance of `gdf_table` constructor
- PR #561 Add Doxygen style comments to Join CUDA functions
- PR #813 unified libcudf API functions by replacing gpu_ with gdf_
Expand All @@ -57,7 +58,6 @@
- PR #909 CSV Reader: Avoid host->device->host copy for header row data
- PR #916 Improved unit testing and error checking for `gdf_column_concat`
- PR #941 Replace `numpy` call in `Series.hash_encode` with `numba`
- PR #943 Updated `count_nonzero_mask` to return `num_rows` when the mask is null
- PR #942 Added increment/decrement operators for wrapper types
- PR #943 Updated `count_nonzero_mask` to return `num_rows` when the mask is null
- PR #952 Added trait to map C++ type to `gdf_dtype`
Expand Down
35 changes: 34 additions & 1 deletion cpp/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -141,6 +141,7 @@ include_directories("${ARROW_INCLUDE_DIR}"
"${CMAKE_SOURCE_DIR}/include"
"${CMAKE_SOURCE_DIR}/src"
"${CMAKE_SOURCE_DIR}/thirdparty/cub"
"${CMAKE_SOURCE_DIR}/thirdparty/jitify"
"${CMAKE_SOURCE_DIR}/thirdparty/moderngpu/src"
"${CMAKE_SOURCE_DIR}/thirdparty/rmm/include"
"${ZLIB_INCLUDE_DIRS}")
Expand Down Expand Up @@ -189,6 +190,13 @@ add_library(cudf SHARED
src/groupby/groupby.cu
src/groupby/new_groupby.cu
src/binary/binary_ops.cu
src/binary/jit/code/kernel.cpp
src/binary/jit/code/operation.cpp
src/binary/jit/code/traits.cpp
src/binary/jit/core/binop.cpp
src/binary/jit/core/launcher.cpp
src/binary/jit/util/operator.cpp
src/binary/jit/util/type.cpp
src/bitmask/bitmask_ops.cu
src/bitmask/valid_ops.cu
src/compaction/stream_compaction_ops.cu
Expand All @@ -213,6 +221,24 @@ add_library(cudf SHARED
#Override RPATH for cudf
SET_TARGET_PROPERTIES(cudf PROPERTIES BUILD_RPATH "\$ORIGIN")

###################################################################################################
# - jitify ----------------------------------------------------------------------------------------

add_executable(stringify "${CMAKE_SOURCE_DIR}/thirdparty/jitify/stringify.cpp")
execute_process(WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
COMMAND ${CMAKE_COMMAND} -E make_directory ${CMAKE_BINARY_DIR}/include)

add_custom_command(OUTPUT ${CMAKE_BINARY_DIR}/include/types.h.jit
WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/include
COMMAND ${CMAKE_BINARY_DIR}/stringify cudf/types.h > ${CMAKE_BINARY_DIR}/include/types.h.jit
COMMENT "stringify header types.h"
DEPENDS stringify
MAIN_DEPENDENCY ${CMAKE_CURRENT_SOURCE_DIR}/include/cudf/types.h)

add_custom_target(stringify_run DEPENDS ${CMAKE_BINARY_DIR}/include/types.h.jit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DEPENDS does not have to be a path, it can be a CMake variable.

For example:

add_custom_command(OUTPUT  TYPES_JIT
	                   WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/include
	                   COMMAND ${CMAKE_BINARY_DIR}/stringify cudf/types.h > ${CMAKE_BINARY_DIR}/include/types.h.jit
                           ...)
...
add_custom_target(stringify_run DEPENDS TYPES_JIT)
...

As long as the MAIN_DEPENDENCY is properly specified to be ${CMAKE_CURRENT_SOURCE_DIR}/include/cudf/types.h, CMake will update jit_types if that file changes.

Note that this is helpful if later you expect your command to have multiple outputs.

I would also consider renaming stringify_run to something more descriptive. I'm not really sure what the purpose of this target is off the cuff, and that obfuscates our build process for new developers, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this but it didn't work. Running make rebuilds launcher.cpp every time.

add_custom_command(OUTPUT STRINGIFIED_HEADERS
                   WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/include
                   COMMAND ${CMAKE_BINARY_DIR}/stringify cudf/types.h > ${CMAKE_BINARY_DIR}/include/types.h.jit
                   COMMENT "Run stringify on header types.h to convert it to c-str for use in JIT compiled code"
                   DEPENDS stringify
                   MAIN_DEPENDENCY ${CMAKE_CURRENT_SOURCE_DIR}/include/cudf/types.h)

add_custom_target(stringified_headers DEPENDS STRINGIFIED_HEADERS)

add_dependencies(cudf stringified_headers)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry what does the cmake code above have to do with launcher.cpp?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is for making and running an executable that comes with Jitify. This executable, called stringify, converts source files into c-strings and writes them to another source file. Now those stringified source files can be used in JIT code compilation. In our specific use case, we wanted to be able to use the definitions in types.h in the JIT kernels. So we use stringify to convert types.h into a string and store it in types.h.jit in the build/include directory. That types.h.jit is included in launcher.cpp. I wanted a way to use CMake to run stringify only when types.h is changed. With @mt-jones ' suggestion, CMake runs it all the time and every time i run make, launcher.cpp is rebuilt because the include is touched.


add_dependencies(cudf stringify_run)

###################################################################################################
# - build options ---------------------------------------------------------------------------------

Expand All @@ -230,11 +256,18 @@ if(HT_LEGACY_ALLOCATOR)
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --define-macro HT_LEGACY_ALLOCATOR")
endif(HT_LEGACY_ALLOCATOR)

option(JITIFY_THREAD_SAFE "Use a global cache for JIT compiled kernels" ON)
if(JITIFY_THREAD_SAFE)
message(STATUS "Using global cache for JIT compiled kernels")
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --define-macro JITIFY_THREAD_SAFE")
endif(JITIFY_THREAD_SAFE)


###################################################################################################
# - link libraries --------------------------------------------------------------------------------

target_link_libraries(cudf rmm "${ARROW_LIB}" ${ZLIB_LIBRARIES} NVStrings)
# TODO: better nvrtc linking with optional variables
target_link_libraries(cudf rmm "${ARROW_LIB}" ${ZLIB_LIBRARIES} NVStrings nvrtc)

###################################################################################################
# - python cffi bindings --------------------------------------------------------------------------
Expand Down
60 changes: 60 additions & 0 deletions cpp/include/cudf/functions.h
Original file line number Diff line number Diff line change
Expand Up @@ -1960,6 +1960,66 @@ gdf_error gdf_extract_datetime_second(gdf_column *input, gdf_column *output);

/* binary operators */

/**
* @brief Binary operation function between gdf_scalar and gdf_column structs.
*
* The function performs the binary operation of a gdf_scalar operand and a
* gdf_column operand.
*
* The desired output type needs to be specified in out->dtype
*
* If the valid field in the gdf_column output is not nullptr, then the valid
* mask from rhs gdf_column is copied into the data pointer to by out->valid
*
* @param out (gdf_column) Output of the operation.
* @param lhs (gdf_scalar) First operand of the operation.
* @param rhs (gdf_column) Second operand of the operation.
* @param ope (enum) The binary operator that is going to be used in the operation.
* @return GDF_SUCCESS if the operation was successful, otherwise an appropriate
* error code
*/
gdf_error gdf_binary_operation_s_v(gdf_column* out, gdf_scalar* lhs, gdf_column* rhs, gdf_binary_operator ope);

/**
* @brief Binary operation function between gdf_column and gdf_scalar structs.
*
* The function performs the binary operation of a gdf_column operand and a
* gdf_scalar operand.
*
* The desired output type needs to be specified in out->dtype
*
* If the valid field in the gdf_column output is not nullptr, then the valid
* mask from lhs gdf_column is copied into the data pointer to by out->valid
*
* @param out (gdf_column) Output of the operation.
* @param lhs (gdf_column) First operand of the operation.
* @param rhs (gdf_scalar) Second operand of the operation.
* @param ope (enum) The binary operator that is going to be used in the operation.
* @return GDF_SUCCESS if the operation was successful, otherwise an appropriate
* error code
*/
gdf_error gdf_binary_operation_v_s(gdf_column* out, gdf_column* lhs, gdf_scalar* rhs, gdf_binary_operator ope);

/**
* @brief Binary operation function between two gdf_column structs.
*
* The function performs the binary operation of two gdf_column operands.
*
* The desired output type needs to be specified in out->dtype
*
* If the valid field in the gdf_column output is not nullptr, then it will be
* filled with the bitwise AND of the valid masks of lhs and rhs gdf_column's
*
* @param out (gdf_column) Output of the operation.
* @param lhs (gdf_column) First operand of the operation.
* @param rhs (gdf_column) Second operand of the operation.
* @param ope (enum) The binary operator that is going to be used in the operation.
* @return GDF_SUCCESS if the operation was successful, otherwise an appropriate
* error code
*/
gdf_error gdf_binary_operation_v_v(gdf_column* out, gdf_column* lhs, gdf_column* rhs, gdf_binary_operator ope);


/* arith */

gdf_error gdf_add_generic(gdf_column *lhs, gdf_column *rhs, gdf_column *output);
Expand Down
52 changes: 52 additions & 0 deletions cpp/include/cudf/types.h
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,35 @@ typedef struct {
// here we can also hold info for decimal datatype or any other datatype that requires additional information
} gdf_dtype_extra_info;

/**---------------------------------------------------------------------------*
* @union gdf_data
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be gdf_datum, signifying the singular rather than the plural?

gdf_data sounds super- vague, and is also confusing IMHO since it can be interpreted to mean lots of data, or all data in a column etc.

* @brief Union used for scalar type.
* It stores a unique value for scalar type.
* It has a direct relationship with the gdf_dtype.
*---------------------------------------------------------------------------**/
typedef union {
int8_t si08; /**< GDF_INT8 */
int16_t si16; /**< GDF_INT16 */
int32_t si32; /**< GDF_INT32 */
int64_t si64; /**< GDF_INT64 */
float fp32; /**< GDF_FLOAT32 */
double fp64; /**< GDF_FLOAT64 */
int32_t dt32; /**< GDF_DATE32 */
int64_t dt64; /**< GDF_DATE64 */
int64_t tmst; /**< GDF_TIMESTAMP */
} gdf_data;

/**---------------------------------------------------------------------------*
* @brief A struct to hold a scalar (single) value and its type information
*
*---------------------------------------------------------------------------**/
typedef struct {
gdf_data data; /**< Pointer to the scalar data */
gdf_dtype dtype; /**< The datatype of the scalar's data */
bool is_valid; /**< False if the value is null */
} gdf_scalar;


typedef struct gdf_column_{
void *data; /**< Pointer to the columns data */
gdf_valid_type *valid; /**< Pointer to the columns validity bit mask where the 'i'th bit indicates if the 'i'th row is NULL */
Expand Down Expand Up @@ -156,6 +185,29 @@ typedef enum {
GDF_NUM_COLORS, /** Add new colors above this line */
} gdf_color;


/**---------------------------------------------------------------------------*
* @brief Types of binary operations that can be performed on data.
*
*---------------------------------------------------------------------------**/
typedef enum {
GDF_ADD, /**< operator + */
GDF_SUB, /**< operator - */
GDF_MUL, /**< operator * */
GDF_DIV, /**< operator / using common type of lhs and rhs */
GDF_TRUE_DIV, /**< operator / after promoting type to floating point*/
GDF_FLOOR_DIV, /**< operator / after promoting to float and then flooring the result */
GDF_MOD, /**< operator % */
GDF_POW, /**< lhs ^ rhs */
GDF_EQUAL, /**< operator == */
GDF_NOT_EQUAL, /**< operator != */
GDF_LESS, /**< operator < */
GDF_GREATER, /**< operator > */
GDF_LESS_EQUAL, /**< operator <= */
GDF_GREATER_EQUAL, /**< operator >= */
} gdf_binary_operator;


/* --------------------------------------------------------------------------*/
/**
* @brief This struct holds various information about how an operation should be
Expand Down
37 changes: 37 additions & 0 deletions cpp/src/binary/jit/code/code.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
/*
* Copyright (c) 2019, NVIDIA CORPORATION.
*
* Copyright 2018-2019 BlazingDB, Inc.
* Copyright 2018 Christian Noboa Mardini <[email protected]>
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#ifndef GDF_BINARY_OPERATION_JIT_CODE_CODE_H
#define GDF_BINARY_OPERATION_JIT_CODE_CODE_H

namespace cudf {
namespace binops {
namespace jit {
namespace code {

extern const char* kernel;
extern const char* traits;
extern const char* operation;

}
}
}
}

#endif
69 changes: 69 additions & 0 deletions cpp/src/binary/jit/code/kernel.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
/*
* Copyright (c) 2019, NVIDIA CORPORATION.
*
* Copyright 2018-2019 BlazingDB, Inc.
* Copyright 2018 Christian Noboa Mardini <[email protected]>
* Copyright 2018 Rommel Quintanilla <[email protected]>
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

namespace cudf {
namespace binops {
namespace jit {
namespace code {

const char* kernel =
R"***(
#include "operation.h"
#include "cudf/types.h"

template <typename TypeOut, typename TypeLhs, typename TypeRhs, typename TypeOpe>
__global__
void kernel_v_s(gdf_size_type size,
TypeOut* out_data, TypeLhs* lhs_data, gdf_data rhs_data) {
int tid = threadIdx.x;
int blkid = blockIdx.x;
int blksz = blockDim.x;
int gridsz = gridDim.x;

int start = tid + blkid * blksz;
int step = blksz * gridsz;

for (gdf_size_type i=start; i<size; i+=step) {
out_data[i] = TypeOpe::template operate<TypeOut, TypeLhs, TypeRhs>(lhs_data[i], *reinterpret_cast<TypeRhs*>(&rhs_data));
}
}

template <typename TypeOut, typename TypeLhs, typename TypeRhs, typename TypeOpe>
__global__
void kernel_v_v(gdf_size_type size,
TypeOut* out_data, TypeLhs* lhs_data, TypeRhs* rhs_data) {
int tid = threadIdx.x;
int blkid = blockIdx.x;
int blksz = blockDim.x;
int gridsz = gridDim.x;

int start = tid + blkid * blksz;
int step = blksz * gridsz;

for (gdf_size_type i=start; i<size; i+=step) {
out_data[i] = TypeOpe::template operate<TypeOut, TypeLhs, TypeRhs>(lhs_data[i], rhs_data[i]);
}
}
)***";

} // namespace code
} // namespace jit
} // namespace binops
} // namespace cudf
Loading