Add CUDA unit test and fix CUDA build configuration #3568

dkjung · 2025-11-18T05:55:23Z

This PR enhances CUDA support and adds CUDA unit tests.

Key Changes:

CUDA Unit Test Addition (unittest_cuda.cpp):
- Comprehensive unit tests for RMSNorm CUDA kernel
- Test cases for various dimension sizes and epsilon values
- CUDA kernel execution time measurement functionality
CUDA Build Configuration Improvements:
- Reorganized CUDA operations subdirectory inclusion from nntrainer/meson.build to nntrainer/tensor/meson.build
- Added CUDA test target to test/unittest/meson.build
- Fixed CUDA linking issues by adding proper link arguments (-NOIMPLIB, -NOEXP)
- Added CUDA dependencies handling in unit test build configuration
Prevent Unnecessary File Generation:
- Modified linker options to prevent .lib and .exp file generation during unittest_cuda.exe build on Windows

Expected Benefits:

- Improved test coverage for CUDA functionality
- More stable CUDA build configuration
- Cleaner CUDA builds on Windows, with no stray .lib and .exp files

These changes contribute to strengthening CUDA support in the build system and improving reliability of CUDA operations.
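As a rough illustration of what the RMSNorm tests listed under Key Changes need to check, the sketch below computes RMSNorm on the CPU and compares two buffers within a tolerance. It is not taken from unittest_cuda.cpp: the names `rmsnorm_reference` and `all_close`, the row-major layout, and the formula `y = x / sqrt(mean(x^2) + epsilon) * gamma` are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for RMSNorm (not part of this PR).
// Assumes row-major input with `width` elements per row and computes
//   y[r][c] = x[r][c] / sqrt(mean(x[r]^2) + epsilon) * gamma[c]
std::vector<float> rmsnorm_reference(const std::vector<float> &input,
                                     const std::vector<float> &gamma,
                                     std::size_t width, float epsilon) {
  assert(width > 0 && input.size() % width == 0 && gamma.size() == width);
  std::vector<float> output(input.size());
  const std::size_t rows = input.size() / width;
  for (std::size_t row = 0; row < rows; ++row) {
    const float *x = input.data() + row * width;
    // Accumulate the sum of squares in double to limit rounding error.
    double sum_sq = 0.0;
    for (std::size_t c = 0; c < width; ++c)
      sum_sq += static_cast<double>(x[c]) * x[c];
    const float mean_sq = static_cast<float>(sum_sq / width);
    const float inv_rms = 1.0f / std::sqrt(mean_sq + epsilon);
    for (std::size_t c = 0; c < width; ++c)
      output[row * width + c] = x[c] * inv_rms * gamma[c];
  }
  return output;
}

// Element-wise comparison with an absolute tolerance.
bool all_close(const std::vector<float> &a, const std::vector<float> &b,
               float tolerance) {
  if (a.size() != b.size())
    return false;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (std::fabs(a[i] - b[i]) > tolerance)
      return false;
  return true;
}
```

A test could run the CUDA kernel for each width and epsilon value, copy the result back to the host, and pass it to `all_close` together with the output of `rmsnorm_reference`.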

Adds CUDA context management files (cuda_context.h and cuda_context.cpp) that provide similar functionality to the existing OpenCL context. The changes include:
- CudaContext class inheriting from Context and Singleton
- CUDA kernel management and execution interfaces
- Build system updates to support CUDA with enable-cuda option
- Conditional linking of CUDA runtime library for both Windows and Linux
- Addition of enable-cuda option in meson_options.txt

Signed-off-by: Daekyoung Jung <[email protected]>

This commit adds CUDA context management files (cuda_context.h and cuda_context.cpp) that provide similar functionality to the existing OpenCL context. The changes include:
- Implementation of CudaContext class inheriting from Context and Singleton
- CUDA kernel management and execution interface
- Build system updates to support CUDA with enable-cuda meson_options
- Conditional linking of CUDA runtime library for both Windows and Linux
- Addition of enable-cuda option in meson_options.txt
- Implementation of RMSNorm CUDA kernel and build configuration

Signed-off-by: Daekyoung Jung <[email protected]>

This commit includes the following changes:
1. Add new CUDA unit test file (unittest_cuda.cpp) with RMSNorm CUDA kernel tests
2. Reorganize CUDA operations directory structure by moving subdir inclusion from nntrainer/meson.build to nntrainer/tensor/meson.build
3. Add CUDA test target in test/unittest/meson.build
4. Fix CUDA linking issues by adding proper link arguments (-NOIMPLIB, -NOEXP) to prevent generation of unnecessary .lib and .exp files
5. Add CUDA dependencies handling in unit test build configuration

The changes ensure proper CUDA support in the build system and add comprehensive unit tests for CUDA operations.

Signed-off-by: Daekyoung Jung <[email protected]>

myungjoo · 2025-11-20T09:33:28Z

nntrainer/cuda_context.h

+  /**
+   * @brief Get the name of the context
+   */
+  std::string getName() override { return "cuda"; }


Even if you register CudaContext, base_properties.h is not ready for it:

nntrainer/nntrainer/utils/base_properties.h

Lines 786 to 794 in ba47d7f

/**
 * @brief Enumeration of Run Engine type
 */
struct ComputeEngineTypeInfo {
  using Enum = ml::train::LayerComputeEngine;
  static constexpr std::initializer_list<Enum> EnumList = {Enum::CPU, Enum::GPU,
                                                           Enum::QNN};
  static constexpr const char *EnumStr[] = {"cpu", "gpu", "qnn"};
};
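To make the gap concrete, here is a self-contained sketch (C++17) of what a type-info table with a "cuda" entry could look like. It is only an illustration: `LayerComputeEngine` below is a local stand-in for `ml::train::LayerComputeEngine`, the `CUDA` enumerator is an assumption that does not exist at ba47d7f, and `parse_engine` and `engine_name` are hypothetical helpers, not nntrainer APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Local stand-in for ml::train::LayerComputeEngine.
// The CUDA enumerator is an assumption made for this sketch.
enum class LayerComputeEngine { CPU, GPU, QNN, CUDA };

struct ComputeEngineTypeInfo {
  using Enum = LayerComputeEngine;
  // EnumList and EnumStr must stay the same length and in the same order.
  static constexpr std::array<Enum, 4> EnumList = {Enum::CPU, Enum::GPU,
                                                   Enum::QNN, Enum::CUDA};
  static constexpr std::array<std::string_view, 4> EnumStr = {"cpu", "gpu",
                                                              "qnn", "cuda"};
};

// Hypothetical helper: look up an engine by name, e.g. the string
// returned by CudaContext::getName(). Returns false for unknown names.
inline bool parse_engine(std::string_view name, LayerComputeEngine &out) {
  for (std::size_t i = 0; i < ComputeEngineTypeInfo::EnumStr.size(); ++i) {
    if (ComputeEngineTypeInfo::EnumStr[i] == name) {
      out = ComputeEngineTypeInfo::EnumList[i];
      return true;
    }
  }
  return false;
}

// Hypothetical helper: map an engine back to its string form.
inline std::string_view engine_name(LayerComputeEngine engine) {
  for (std::size_t i = 0; i < ComputeEngineTypeInfo::EnumList.size(); ++i) {
    if (ComputeEngineTypeInfo::EnumList[i] == engine)
      return ComputeEngineTypeInfo::EnumStr[i];
  }
  assert(false && "engine missing from EnumList");
  return {};
}
```

With only "cpu", "gpu" and "qnn" in the table, a lookup for "cuda" fails, which is why registering the context alone is not enough.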

myungjoo · 2025-11-20T09:35:55Z

nntrainer/meson.build

+if get_option('enable-cuda')
+  nntrainer_headers += meson.current_source_dir() / 'cuda_context.h'
+  nntrainer_common_sources += 'cuda_context.cpp'
+  extra_defines += '-DENABLE_CUDA=1'


This define is ignored: extra_defines is already consumed before this line runs.

In /meson.build

nntrainer/meson.build

Lines 704 to 707 in ba47d7f

message('extra defines are:' + ' '.join(extra_defines))
foreach defs: extra_defines
  add_project_arguments(defs, language: ['c', 'cpp'])
endforeach

happens before subdir('nntrainer').

You need to do this at /meson.build, before L704

myungjoo · 2025-11-20T09:36:56Z

nntrainer/engine.cpp

+  auto &cuda_context = nntrainer::CudaContext::Global();
+
+  registerContext("cuda", &cuda_context);
+#endif


Add #include "cuda_context.h" in this engine.cpp file.

myungjoo · 2025-11-20T09:37:55Z

test/unittest/meson.build

+  cuda_deps = []
+  cuda_link_args = []
+  if target[0] == 'unittest_cuda' and get_option('enable-cuda')
+    cuda_deps = [cuda_dep]


Is `cuda_dep` here in scope of the `cuda_dep` of /nntrainer/tensor/meson.build? It doesn't appear so.

myungjoo · 2025-11-20T09:38:51Z

nntrainer/tensor/cuda_operations/meson.build

+  foreach kernel : cuda_sources
+    obj_name = kernel.replace('.cu', '.o')
+    obj = custom_target(obj_name,
+      command: [nvcc, '-c', '-Xcompiler', '/MD', '@INPUT@', '-o', '@OUTPUT@'],


Do not use /MD unconditionally. It should depend on the compiler.

myungjoo · 2025-11-20T09:39:31Z

nntrainer/tensor/cuda_operations/meson.build

@@ -0,0 +1,34 @@
+# Find CUDA compiler
+dep = dependency('cuda', version : '>=13', modules : ['cublas'])


Even if CUDA is available, this might fail (at least on most Linux distros). Check whether this actually works on developer machines.

jijoongmoon · 2025-11-29T04:57:01Z

nntrainer/cuda_context.cpp

+
+void CudaContext::add_default_object() {
+  // Register default layers that support CUDA
+  registerFactory(nntrainer::createLayer<FullyConnectedLayer>,


Is this just for testing, or is it intentional? It looks like this registers the CPU FullyConnectedLayer, not a CUDA layer.
If it is only for testing, please leave a comment saying so.

jijoongmoon · 2025-11-29T05:29:14Z

Please add [ Wait for #3567 ] to the title.

dkjung requested review from DonghakPark, EunjuYang, SeoHyungjun, again4you, anyj0527, baek2sm, djeong20, gichan-jang, haehun, jaeyun-jung, jihochu, jijoongmoon, leemgs, lhs8928, myungjoo, skykongkong8, songgot and wooksong as code owners November 18, 2025 05:55

github-actions bot added the Need Review label Nov 18, 2025

dkjung added 2 commits November 20, 2025 15:21

dkjung force-pushed the feature/cuda3 branch from c9d6e4b to 078308c Compare November 20, 2025 06:38

myungjoo reviewed Nov 20, 2025


jijoongmoon reviewed Nov 29, 2025
