[Draft][ONNX] Enable Android Build Support for ONNX Model with Protobuf Integration #3574
Added cast layer (type casting).
- There is no model unit test yet.
- In some cases, optimization may be considered (such as fusing only the layer immediately in front), but for now only the basic implementation is included; optimization will be applied later through the ONNX graph optimization work.

**Self evaluation:**
1. Build test: [*]Passed [ ]Failed [ ]Skipped
2. Run test: [ ]Passed [ ]Failed [*]Skipped

Signed-off-by: Seungbaek Hong <[email protected]>
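For reference, a cast layer maps directly onto ONNX's `Cast` operator: the forward pass converts every element of the input tensor to the target type. A minimal element-wise sketch of that conversion, with a hypothetical helper name (this is not nntrainer's actual tensor API):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical element-wise cast, mirroring what a cast layer's forward
// pass does: convert each element of the input buffer to the target type.
template <typename From, typename To>
std::vector<To> cast_buffer(const std::vector<From> &in) {
  std::vector<To> out(in.size());
  std::transform(in.begin(), in.end(), out.begin(),
                 [](From v) { return static_cast<To>(v); });
  return out;
}
```

The backward pass would cast the incoming derivative the other way, which is why no weights or extra state are needed.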
To map operation units one-to-one with ONNX, I reverted the weight layer to a structure that holds only one weight.

**Self evaluation:**
1. Build test: [*]Passed [ ]Failed [ ]Skipped
2. Run test: [*]Passed [ ]Failed [ ]Skipped

Signed-off-by: Seungbaek Hong <[email protected]>
- Added the "How to run ONNX Model with NNTrainer" document (how-to-run-onnx-model.md).
- Since operations other than the add operation have not yet been merged into the main branch, the list of supported and planned operations has not been documented. The documentation will be updated once several additional operations are added to the main branch.

Signed-off-by: Seungbaek Hong <[email protected]>
- Refactor ONNX interpreter operator registration using a handler structure instead of if-else statements
- Add logic to read and convert layer properties
- Add simplified Attention Block example (removing rope, gqa, etc.)
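The handler-style registration replacing the if-else chain can be sketched as a map from ONNX op type to a handler callable. The names below (`Node`, `dispatch`, the registered handlers) are illustrative assumptions, not the actual interpreter API:

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical node type standing in for onnx::NodeProto.
struct Node {
  std::string op_type;
};

// Each ONNX op type maps to a handler instead of branching with
// if-else; unknown ops fail with a clear error message.
using OpHandler = std::function<std::string(const Node &)>;

std::map<std::string, OpHandler> &handler_registry() {
  static std::map<std::string, OpHandler> registry = {
    {"MatMul", [](const Node &) { return std::string("matmul"); }},
    {"Softmax", [](const Node &) { return std::string("activation"); }},
    {"Cast", [](const Node &) { return std::string("cast"); }},
  };
  return registry;
}

std::string dispatch(const Node &node) {
  auto it = handler_registry().find(node.op_type);
  if (it == handler_registry().end())
    throw std::runtime_error("unsupported ONNX op: " + node.op_type);
  return it->second(node);
}
```

Adding support for a new operator then means registering one handler rather than editing a long conditional.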
The execution result of the example app is as follows:
```
================================================================================
Layer name Layer type Output dimension Input layer
================================================================================
input input 1:1:1:8
--------------------------------------------------------------------------------
input/generated_out multiout 1:1:1:8 input
--------------------------------------------------------------------------------
onnx__matmul_88 weight 1:1:8:8
--------------------------------------------------------------------------------
onnx__matmul_83 weight 1:1:8:8
--------------------------------------------------------------------------------
v_proj_matmul matmul 1:1:1:8 input/generated_out
onnx__matmul_83
--------------------------------------------------------------------------------
reshape_2 reshape 1:1:1:8 v_proj_matmul
--------------------------------------------------------------------------------
transpose_1 permute 1:1:1:8 reshape_2
--------------------------------------------------------------------------------
onnx__matmul_82 weight 1:1:8:8
--------------------------------------------------------------------------------
k_proj_matmul matmul 1:1:1:8 input/generated_out
onnx__matmul_82
--------------------------------------------------------------------------------
reshape_1 reshape 1:1:1:8 k_proj_matmul
--------------------------------------------------------------------------------
transpose_2 permute 1:1:8:1 reshape_1
--------------------------------------------------------------------------------
onnx__matmul_66 weight 1:1:8:8
--------------------------------------------------------------------------------
q_proj_matmul matmul 1:1:1:8 input/generated_out
onnx__matmul_66
--------------------------------------------------------------------------------
reshape reshape 1:1:1:8 q_proj_matmul
--------------------------------------------------------------------------------
transpose permute 1:1:1:8 reshape
--------------------------------------------------------------------------------
matmul matmul 1:1:1:1 transpose
transpose_2
--------------------------------------------------------------------------------
softmax activation 1:1:1:1 matmul
--------------------------------------------------------------------------------
cast cast 1:1:1:1 softmax
--------------------------------------------------------------------------------
matmul_1 matmul 1:1:1:8 cast
transpose_1
--------------------------------------------------------------------------------
transpose_3 permute 1:1:1:8 matmul_1
--------------------------------------------------------------------------------
reshape_3 reshape 1:1:1:8 transpose_3
--------------------------------------------------------------------------------
o_proj_matmul matmul 1:1:1:8 reshape_3
onnx__matmul_88
================================================================================
```
**Self evaluation:**
1. Build test: [*]Passed [ ]Failed [ ]Skipped
2. Run test: [*]Passed [ ]Failed [ ]Skipped
Signed-off-by: Seungbaek Hong <[email protected]>
Added "gather, slice, negative" layers to support the ONNX model.

These layer implementations were added to create graph connections during the development of the ONNX interpreter and will soon be replaced by actual layer implementations, along with unit tests, in other PRs.

**Self evaluation:**
1. Build test: [*]Passed [ ]Failed [ ]Skipped
2. Run test: [*]Passed [ ]Failed [ ]Skipped

Signed-off-by: Seungbaek Hong <[email protected]>
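As background, the gather operation these placeholder layers stand in for selects elements along an axis by index, as in ONNX's `Gather`. A 1-D sketch with a hypothetical helper (not the actual layer code):

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// 1-D gather: out[i] = data[indices[i]], with bounds checking.
std::vector<float> gather1d(const std::vector<float> &data,
                            const std::vector<std::size_t> &indices) {
  std::vector<float> out;
  out.reserve(indices.size());
  for (std::size_t idx : indices) {
    if (idx >= data.size())
      throw std::out_of_range("gather index out of range");
    out.push_back(data[idx]);
  }
  return out;
}
```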
- This is a draft PR; I'll revise the commit message.
The execution result of the example app is as follows (Llama 1B):
```
================================================================================
Layer name Layer type Output dimension Input layer
================================================================================
onnx__add_6 input 1:1:1:1
--------------------------------------------------------------------------------
onnx__add_6/generat multiout 1:1:1:1 onnx__add_6
--------------------------------------------------------------------------------
sin input 1:1:1:64
--------------------------------------------------------------------------------
sin/generated_out_0 multiout 1:1:1:64 sin
--------------------------------------------------------------------------------
...
--------------------------------------------------------------------------------
model_norm_cast_1 cast 1:1:1:2048 model_norm_mul
--------------------------------------------------------------------------------
model_norm_mul_1 multiply 1:1:1:2048 model_norm_weight
model_norm_cast_1
--------------------------------------------------------------------------------
lm_head_matmul matmul 1:1:1:50304 model_norm_mul_1
onnx__matmul_3531
================================================================================
```
(Approximately 8,800 lines, omitted; the total number of layers is estimated to be around 4,000.)
**Self evaluation:**
1. Build test: [*]Passed [ ]Failed [ ]Skipped
2. Run test: [*]Passed [ ]Failed [ ]Skipped
Signed-off-by: Seungbaek Hong <[email protected]>
The input layer was identified during compilation based on the number of input connections (or a property). However, a weight layer also has no input connection, which caused some issues; this commit resolves them.

**Self evaluation:**
1. Build test: [*]Passed [ ]Failed [ ]Skipped
2. Run test: [*]Passed [ ]Failed [ ]Skipped

Signed-off-by: Seungbaek Hong <[email protected]>
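The ambiguity described here is that "has no input connection" matches both input layers and weight layers, so the check has to consult the layer type as well. A condensed illustration with hypothetical types (not the actual graph code):

```cpp
#include <string>
#include <vector>

// Hypothetical node standing in for a graph layer node.
struct GraphNode {
  std::string type;
  std::vector<std::string> input_connections;
};

// Deciding "is this an input layer?" purely by connection count would
// also match weight layers; checking the type resolves the ambiguity.
bool is_input_layer(const GraphNode &n) {
  return n.input_connections.empty() && n.type != "weight";
}
```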
Signed-off-by: Sumon Nath <[email protected]>
…uf and abseil-cpp libraries with architecture-specific compilation using the Android NDK.
- Implemented the ONNX Qwen model inference pipeline with proper JNI layer configuration and Android.mk setup for running models on device.

Signed-off-by: Niket Agarwal <[email protected]>
```
namespace nntrainer {

void SliceLayer::finalize(InitLayerContext &context) {
  unsigned int axis = std::get<props::Axis>(slice_props).get();
```
You are accessing the local variable `axis` and not updating/accessing `this->axis`. Don't you need to update `this->axis`?
Also, shadowing isn't good. Please do not shadow class members with local variables; it will soon be prohibited.
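The hazard the comment points at: declaring a local `axis` hides the member, so the member never receives the value. A condensed illustration with a hypothetical class (not `SliceLayer` itself):

```cpp
// Condensed illustration of the member-shadowing hazard: the local
// declaration hides the member, so the member never gets the value.
struct Layer {
  unsigned int axis = 0;

  void finalize_shadowed(unsigned int prop_axis) {
    unsigned int axis = prop_axis; // shadows the member; this->axis stays 0
    (void)axis;
  }

  void finalize_fixed(unsigned int prop_axis) {
    axis = prop_axis; // assigns the member (equivalently: this->axis = ...)
  }
};
```

Compiling with `-Wshadow` surfaces this class of bug as a warning.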
```
  break;
case 2:
  outDeriv.addValue(b, i, j, selected, inDerivValue, 1);
default:
```
What if axis == 3?
Also, the default case should handle error/corner cases.
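The pattern the reviewer is asking for: each case ends with `break` (or `return`), and `default` rejects unexpected values instead of silently falling through. A sketch with a hypothetical function:

```cpp
#include <stdexcept>

// Every case returns explicitly, and default reports the unexpected
// axis value rather than silently falling through.
int stride_for_axis(int axis) {
  switch (axis) {
  case 1:
    return 100;
  case 2:
    return 10;
  case 3:
    return 1;
  default:
    throw std::invalid_argument("unexpected axis value");
  }
}
```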
```
  break;
case 3:
  output.setValue(b, i, j, k, input.getValue(b, i, j, selected));
default:
```
Is it OK to ignore axis == 0?
That exception case is handled in the finalize function:

```
if (axis < 1 || axis > 3) {
  throw std::invalid_argument(
    "The axis property of GatherLayer should be between 1 and 3.");
}
```
myungjoo left a comment:
Do not add generated files (onnx.pb.cc/.h).
```
case onnx::TensorProto::FLOAT16:
  return "FP16";
case onnx::TensorProto::INT64:
  return "FP32";
```
Is INT64 --> FP32 OK? (8 bytes --> 4 bytes)
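The concern is well founded: `float` carries a 24-bit significand, so int64 values above 2^24 are not all representable exactly. A quick demonstration with a hypothetical helper:

```cpp
#include <cstdint>

// float carries a 24-bit significand, so consecutive int64 values
// above 2^24 can collapse to the same float on conversion.
bool survives_float_roundtrip(std::int64_t v) {
  return static_cast<std::int64_t>(static_cast<float>(v)) == v;
}
```

Index tensors (a common use of INT64 in ONNX) usually stay far below 2^24, which may be why the mapping appears to work in practice, but the narrowing is silent.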
This PR enables Android build support for ONNX models by integrating protobuf and abseil-cpp libraries with architecture-specific compilation using Android NDK. It implements the ONNX Qwen3 model inference pipeline with proper JNI layer configuration and Android.mk setup for running models on device.
Key Changes:
Current Status:
Inference is currently working on Qwen3 models on Android devices, but we are facing accuracy issues that are being investigated. The core Android build infrastructure and ONNX model loading are functional.
**Self evaluation:**
1. Build test: [ ]Passed [ ]Failed [X]Skipped
2. Run test: [ ]Passed [ ]Failed [X]Skipped

Signed-off-by: Niket Agarwal <[email protected]>