[Draft][ONNX] Enable Android Build Support for ONNX Model with Protobuf Integration #3574
Added cast layer (type casting).
- There is no model unit test yet.
- In some cases, optimization may be considered (such as fusing only the layer immediately in front), but for now only the basic implementation is included; optimization will be applied later through the ONNX graph optimization work.

**Self evaluation:**
1. Build test: [*]Passed [ ]Failed [ ]Skipped
2. Run test: [ ]Passed [ ]Failed [*]Skipped

Signed-off-by: Seungbaek Hong <[email protected]>
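For reference, a cast layer maps directly onto ONNX's `Cast` operator: the forward pass converts every element of the input tensor to the target type. A minimal element-wise sketch of that conversion, with a hypothetical helper name (this is not nntrainer's actual tensor API):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical element-wise cast, mirroring what a cast layer's forward
// pass does: convert each element of the input buffer to the target type.
template <typename From, typename To>
std::vector<To> cast_buffer(const std::vector<From> &in) {
  std::vector<To> out(in.size());
  std::transform(in.begin(), in.end(), out.begin(),
                 [](From v) { return static_cast<To>(v); });
  return out;
}
```

The backward pass would cast the incoming derivative the other way, which is why no weights or extra state are needed.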
To map operation units one-to-one with ONNX, I reverted the weight layer to a structure that holds only one weight.

**Self evaluation:**
1. Build test: [*]Passed [ ]Failed [ ]Skipped
2. Run test: [*]Passed [ ]Failed [ ]Skipped

Signed-off-by: Seungbaek Hong <[email protected]>
- Added the "How to run ONNX Model with NNTrainer" document (how-to-run-onnx-model.md).
- Since operations other than the add operation have not yet been merged into the main branch, the list of supported and planned operations has not been documented. The documentation will be updated once several additional operations are added to the main branch.

Signed-off-by: Seungbaek Hong <[email protected]>
- Refactor ONNX interpreter operator registration using a handler structure instead of if-else statements
- Add logic to read and convert layer properties
- Add simplified Attention Block example (removing rope, gqa, etc.)
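The handler-style registration replacing the if-else chain can be sketched as a map from ONNX op type to a handler callable. The names below (`Node`, `dispatch`, the registered handlers) are illustrative assumptions, not the actual interpreter API:

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical node type standing in for onnx::NodeProto.
struct Node {
  std::string op_type;
};

// Each ONNX op type maps to a handler instead of branching with
// if-else; unknown ops fail with a clear error message.
using OpHandler = std::function<std::string(const Node &)>;

std::map<std::string, OpHandler> &handler_registry() {
  static std::map<std::string, OpHandler> registry = {
    {"MatMul", [](const Node &) { return std::string("matmul"); }},
    {"Softmax", [](const Node &) { return std::string("activation"); }},
    {"Cast", [](const Node &) { return std::string("cast"); }},
  };
  return registry;
}

std::string dispatch(const Node &node) {
  auto it = handler_registry().find(node.op_type);
  if (it == handler_registry().end())
    throw std::runtime_error("unsupported ONNX op: " + node.op_type);
  return it->second(node);
}
```

Adding support for a new operator then means registering one handler rather than editing a long conditional.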
The execution result of the example app is as follows:
```
================================================================================
Layer name Layer type Output dimension Input layer
================================================================================
input input 1:1:1:8
--------------------------------------------------------------------------------
input/generated_out multiout 1:1:1:8 input
--------------------------------------------------------------------------------
onnx__matmul_88 weight 1:1:8:8
--------------------------------------------------------------------------------
onnx__matmul_83 weight 1:1:8:8
--------------------------------------------------------------------------------
v_proj_matmul matmul 1:1:1:8 input/generated_out
onnx__matmul_83
--------------------------------------------------------------------------------
reshape_2 reshape 1:1:1:8 v_proj_matmul
--------------------------------------------------------------------------------
transpose_1 permute 1:1:1:8 reshape_2
--------------------------------------------------------------------------------
onnx__matmul_82 weight 1:1:8:8
--------------------------------------------------------------------------------
k_proj_matmul matmul 1:1:1:8 input/generated_out
onnx__matmul_82
--------------------------------------------------------------------------------
reshape_1 reshape 1:1:1:8 k_proj_matmul
--------------------------------------------------------------------------------
transpose_2 permute 1:1:8:1 reshape_1
--------------------------------------------------------------------------------
onnx__matmul_66 weight 1:1:8:8
--------------------------------------------------------------------------------
q_proj_matmul matmul 1:1:1:8 input/generated_out
onnx__matmul_66
--------------------------------------------------------------------------------
reshape reshape 1:1:1:8 q_proj_matmul
--------------------------------------------------------------------------------
transpose permute 1:1:1:8 reshape
--------------------------------------------------------------------------------
matmul matmul 1:1:1:1 transpose
transpose_2
--------------------------------------------------------------------------------
softmax activation 1:1:1:1 matmul
--------------------------------------------------------------------------------
cast cast 1:1:1:1 softmax
--------------------------------------------------------------------------------
matmul_1 matmul 1:1:1:8 cast
transpose_1
--------------------------------------------------------------------------------
transpose_3 permute 1:1:1:8 matmul_1
--------------------------------------------------------------------------------
reshape_3 reshape 1:1:1:8 transpose_3
--------------------------------------------------------------------------------
o_proj_matmul matmul 1:1:1:8 reshape_3
onnx__matmul_88
================================================================================
```
**Self evaluation:**
1. Build test: [*]Passed [ ]Failed [ ]Skipped
2. Run test: [*]Passed [ ]Failed [ ]Skipped
Signed-off-by: Seungbaek Hong <[email protected]>
Added "gather, slice, negative" layers to support the ONNX model.

These layer implementations were added to create graph connections during the development of the ONNX interpreter and will soon be replaced by actual layer implementations, along with unit tests, in other PRs.

**Self evaluation:**
1. Build test: [*]Passed [ ]Failed [ ]Skipped
2. Run test: [*]Passed [ ]Failed [ ]Skipped

Signed-off-by: Seungbaek Hong <[email protected]>
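As background, the gather operation these placeholder layers stand in for selects elements along an axis by index, as in ONNX's `Gather`. A 1-D sketch with a hypothetical helper (not the actual layer code):

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// 1-D gather: out[i] = data[indices[i]], with bounds checking.
std::vector<float> gather1d(const std::vector<float> &data,
                            const std::vector<std::size_t> &indices) {
  std::vector<float> out;
  out.reserve(indices.size());
  for (std::size_t idx : indices) {
    if (idx >= data.size())
      throw std::out_of_range("gather index out of range");
    out.push_back(data[idx]);
  }
  return out;
}
```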
- This is a draft PR; I'll revise the commit message.
The execution result of the example app is as follows (Llama 1B):
```
================================================================================
Layer name Layer type Output dimension Input layer
================================================================================
onnx__add_6 input 1:1:1:1
--------------------------------------------------------------------------------
onnx__add_6/generat multiout 1:1:1:1 onnx__add_6
--------------------------------------------------------------------------------
sin input 1:1:1:64
--------------------------------------------------------------------------------
sin/generated_out_0 multiout 1:1:1:64 sin
--------------------------------------------------------------------------------
...
--------------------------------------------------------------------------------
model_norm_cast_1 cast 1:1:1:2048 model_norm_mul
--------------------------------------------------------------------------------
model_norm_mul_1 multiply 1:1:1:2048 model_norm_weight
model_norm_cast_1
--------------------------------------------------------------------------------
lm_head_matmul matmul 1:1:1:50304 model_norm_mul_1
onnx__matmul_3531
================================================================================
```
(Approximately 8,800 lines, omitted; the total number of layers is estimated to be around 4,000.)
**Self evaluation:**
1. Build test: [*]Passed [ ]Failed [ ]Skipped
2. Run test: [*]Passed [ ]Failed [ ]Skipped
Signed-off-by: Seungbaek Hong <[email protected]>
The input layer was identified during compilation based on the number of input connections (or a property). However, a weight layer also has no input connection, which caused some issues; this commit resolves them.

**Self evaluation:**
1. Build test: [*]Passed [ ]Failed [ ]Skipped
2. Run test: [*]Passed [ ]Failed [ ]Skipped

Signed-off-by: Seungbaek Hong <[email protected]>
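The ambiguity described here is that "has no input connection" matches both input layers and weight layers, so the check has to consult the layer type as well. A condensed illustration with hypothetical types (not the actual graph code):

```cpp
#include <string>
#include <vector>

// Hypothetical node standing in for a graph layer node.
struct GraphNode {
  std::string type;
  std::vector<std::string> input_connections;
};

// Deciding "is this an input layer?" purely by connection count would
// also match weight layers; checking the type resolves the ambiguity.
bool is_input_layer(const GraphNode &n) {
  return n.input_connections.empty() && n.type != "weight";
}
```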
Signed-off-by: Sumon Nath <[email protected]>
…uf and abseil-cpp libraries with architecture-specific compilation using the Android NDK.
- Implemented the ONNX Qwen model inference pipeline with proper JNI layer configuration and Android.mk setup for running models on device.

Signed-off-by: Niket Agarwal <[email protected]>
```
namespace nntrainer {

void SliceLayer::finalize(InitLayerContext &context) {
  unsigned int axis = std::get<props::Axis>(slice_props).get();
```
You are accessing the local variable `axis` and not updating/accessing `this->axis`. Don't you need to update `this->axis`?
Also, shadowing isn't good. Please do not shadow class members with local variables; it will soon be prohibited.
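The hazard the comment points at: declaring a local `axis` hides the member, so the member never receives the value. A condensed illustration with a hypothetical class (not `SliceLayer` itself):

```cpp
// Condensed illustration of the member-shadowing hazard: the local
// declaration hides the member, so the member never gets the value.
struct Layer {
  unsigned int axis = 0;

  void finalize_shadowed(unsigned int prop_axis) {
    unsigned int axis = prop_axis; // shadows the member; this->axis stays 0
    (void)axis;
  }

  void finalize_fixed(unsigned int prop_axis) {
    axis = prop_axis; // assigns the member (equivalently: this->axis = ...)
  }
};
```

Compiling with `-Wshadow` surfaces this class of bug as a warning.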
```
  break;
case 2:
  outDeriv.addValue(b, i, j, selected, inDerivValue, 1);
default:
```
What if axis == 3?
Also, the default case should handle error/corner cases.
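The pattern the reviewer is asking for: each case ends with `break` (or `return`), and `default` rejects unexpected values instead of silently falling through. A sketch with a hypothetical function:

```cpp
#include <stdexcept>

// Every case returns explicitly, and default reports the unexpected
// axis value rather than silently falling through.
int stride_for_axis(int axis) {
  switch (axis) {
  case 1:
    return 100;
  case 2:
    return 10;
  case 3:
    return 1;
  default:
    throw std::invalid_argument("unexpected axis value");
  }
}
```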
```
  break;
case 3:
  output.setValue(b, i, j, k, input.getValue(b, i, j, selected));
default:
```
Is it OK to ignore axis == 0?
That exception case is handled in the finalize function:

```
if (axis < 1 || axis > 3) {
  throw std::invalid_argument(
    "The axis property of GatherLayer should be between 1 and 3.");
}
```
myungjoo left a comment:
Do not add generated files (onnx.pb.cc/.h).
```
case onnx::TensorProto::FLOAT16:
  return "FP16";
case onnx::TensorProto::INT64:
  return "FP32";
```
Is INT64 --> FP32 OK? (8 bytes --> 4 bytes)
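The concern is well founded: `float` carries a 24-bit significand, so int64 values above 2^24 are not all representable exactly. A quick demonstration with a hypothetical helper:

```cpp
#include <cstdint>

// float carries a 24-bit significand, so consecutive int64 values
// above 2^24 can collapse to the same float on conversion.
bool survives_float_roundtrip(std::int64_t v) {
  return static_cast<std::int64_t>(static_cast<float>(v)) == v;
}
```

Index tensors (a common use of INT64 in ONNX) usually stay far below 2^24, which may be why the mapping appears to work in practice, but the narrowing is silent.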
This PR enables Android build support for ONNX models by integrating protobuf and abseil-cpp libraries with architecture-specific compilation using Android NDK. It implements the ONNX Qwen3 model inference pipeline with proper JNI layer configuration and Android.mk setup for running models on device.
Key Changes:
Current Status:
Inference is currently working on Qwen3 models on Android devices, but we are facing accuracy issues that are being investigated. The core Android build infrastructure and ONNX model loading are functional.
**Self evaluation:**
1. Build test: [ ]Passed [ ]Failed [X]Skipped
2. Run test: [ ]Passed [ ]Failed [X]Skipped

Signed-off-by: Niket Agarwal <[email protected]>