[1/N] Android JNI llama cache temperature in class #10287
base: main
Conversation
Pull Request Overview
This PR introduces caching of the temperature parameter in the Android JNI layer for llama models. The changes add a new member variable to store the temperature, initialize it in the constructor, and update the generation configuration to read the cached value.
- Added a new member variable (temperature_) to cache the temperature.
- Assigned the constructor's temperature parameter to temperature_.
- Updated the generation configuration to use temperature_.
Comments suppressed due to low confidence (1)
extension/android/jni/jni_layer_llama.cpp:186
- Please verify that removing the temperature parameter from the MTKLlamaRunner constructor is intentional. If the temperature was previously required by the runner, additional changes in MTKLlamaRunner or its usage may be needed.
tokenizer_path->toStdString().c_str());
So far, for LLMs we can use the new config, with the temperature cached from the ctor. For llava, we keep the old workflow.
Test: instrumentation test