Could you give me a brief summary on how this works and what are the most important code parts?
This will probably also help other users a lot, which will help you get more people to use and develop your app better :D
I'm trying to figure it out too, and sorry, I don't really know the answer. I noticed that the llama.cpp repo is referenced, but its source isn't actually included in this repo. There is also generated_bindings_llama.dart, which has comments saying not to change this generated file. So I wonder if they used some kind of script to map llama.cpp to the generated file?
One issue I'm having is that the program closes when remaining_tokens approaches 0. I'm trying to figure out how to stop that from happening. So far, resetting the remaining_tokens value and lowering the n_predict param to be less than n_ctx hasn't solved the issue. Did you run into the same problem when you ran it?
Since it is possible to include native binary code in Android apps, llama.cpp has been compiled for the Android platform. The authors have done this using a separate repo https://github.com/Bip-Rep/llama.cpp where you can find the build script (it is quite simple).
The resulting llama.cpp build is included directly in the sherpa git repo under assets/libs/, and you already found the binaryIsolate() function, which is where it is loaded into the app at runtime.
This app is written in Flutter/Dart, and to access the binary library it uses Dart's FFI functionality. ffigen is used to automatically generate the glue code that lets you call C/C++ code from Dart, which you can see in generated_bindings_llama.dart.
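To make the FFI part a bit more concrete, here is a minimal sketch of the kind of glue that ffigen writes for you, done by hand for a single function. It is not taken from sherpa: it binds the C standard library's `abs` instead of anything from llama.cpp, and it assumes a Linux, macOS, or Android host where libc symbols are visible through `DynamicLibrary.process()`. The names `AbsNative`, `AbsDart`, and `nativeAbs` are made up for illustration.

```dart
import 'dart:ffi';

// Native (C) view of the function: int abs(int).
typedef AbsNative = Int32 Function(Int32);

// Dart view of the same function after the FFI converts the types.
typedef AbsDart = int Function(int);

/// Looks up the C function `abs` in the current process and calls it.
///
/// Assumption: libc is already loaded into the process, which is the
/// usual case on Linux, macOS, and Android. An app that ships its own
/// .so would use DynamicLibrary.open(<path to the library>) instead.
int nativeAbs(int value) {
  final DynamicLibrary lib = DynamicLibrary.process();
  final AbsDart abs = lib.lookupFunction<AbsNative, AbsDart>('abs');
  return abs(value);
}
```

Calling `nativeAbs(-42)` from `main()` runs the C implementation rather than Dart code. As far as I can tell, generated_bindings_llama.dart follows the same pattern, just with one typedef pair and one lookup per function declared in the llama.cpp header.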
I mean seriously, how the f$ck did you get LLama on Android to run? :O
I tried to look at the code myself, but I don't fully understand it. I think this function is where the actual magic happens: https://github.com/Bip-Rep/sherpa/blob/main/lib/lib.dart#L337