How does this work?!?!?!? #8

Leichesters opened this issue May 6, 2023 · 2 comments

I mean seriously, how the f$ck did you get LLaMA to run on Android? :O

I tried to look at the code myself, but I don't fully understand it. I think this function is where the actual magic happens: https://github.com/Bip-Rep/sherpa/blob/main/lib/lib.dart#L337

Could you give a brief summary of how this works and which code parts are the most important?
This would probably also help other users a lot, which in turn would bring you more people to use and improve your app :D


tcnevin commented May 6, 2023

I'm trying to figure it out too. Sorry, I don't really know the answer. I noticed the llama.cpp repo is referenced, but its source isn't actually included in this repo. There is also a generated_bindings_llama.dart file with comments saying not to change the generated file. So I wonder if they used some kind of script to map llama.cpp's API into that generated file?

One issue I'm having is that the program closes when remaining_tokens approaches 0. I'm trying to figure out how to stop that from happening. So far, resetting the remaining_tokens value and lowering the n_predict param to be less than n_ctx hasn't solved the issue. Did you run into the same problem?


dsd commented Jun 16, 2023

Since it is possible to include native binary code in Android apps, llama.cpp has been compiled for the Android platform. The authors have done this using a separate repo https://github.com/Bip-Rep/llama.cpp where you can find the build script (it is quite simple).

The resulting llama.cpp build is included directly in the sherpa git repo under assets/libs/, and you already found the binaryIsolate() function, where this library is loaded into the app at runtime.
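For a rough picture of what loading a bundled native library looks like in Dart, here is a minimal sketch using dart:ffi's DynamicLibrary. The library name `libllama.so` and the function shown are illustrative assumptions, not taken from the sherpa source:

```dart
import 'dart:ffi';
import 'dart:io';

// Hypothetical sketch: open the prebuilt llama.cpp shared library.
// On Android, bundled .so files are resolved by name from the app's
// native library directory.
DynamicLibrary loadLlama() {
  if (Platform.isAndroid) {
    return DynamicLibrary.open('libllama.so'); // assumed library name
  }
  // On other platforms, symbols may be linked into the process itself.
  return DynamicLibrary.process();
}
```

In the actual app, this handle is then passed to the generated bindings class, which looks up the exported C symbols on it.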

This app is written in Flutter/Dart, and to access the binary library it uses Dart's FFI functionality. ffigen is used to automatically create the glue code that lets Dart call C/C++ functions, which you can see in generated_bindings_llama.dart.
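For context on the generated file: ffigen is driven by a YAML config (an ffigen.yaml file or a section of pubspec.yaml). A minimal sketch of a config that could produce a file like generated_bindings_llama.dart might look like this; the class name, output path, and header location are assumptions, not taken from the sherpa repo:

```yaml
# Hypothetical ffigen config; names and paths are illustrative only.
name: NativeLibrary          # Dart wrapper class for the C symbols
description: Bindings for llama.cpp
output: 'lib/generated_bindings_llama.dart'
headers:
  entry-points:
    - 'llama.cpp/llama.h'    # assumed path to the C header
```

Running `dart run ffigen --config ffigen.yaml` regenerates the bindings from the header, which is why the generated file carries a "do not edit" warning.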
