Intro
v0.1.0 has been released!
Here's a tweet thread explaining the update in detail: https://twitter.com/cocktailpeanut/status/1635394517615652866
v0.1.0 Goal
- Fix as many bugs as possible so that installation succeeds as reliably as possible
- Make the Web UI more usable
Changelog
Important Fixes
- Deterministic Install: using virtualenv to make sure that everything related to python/pip installs deterministically (Thanks to @EliasVincent #19)
- ALL models work: previously, only the 7B model was working. The following cases have all been fixed, and ALL models (7B, 13B, 30B, 65B) should now work (thanks to @rizenfrmtheashes #9):
  - If you were getting gibberish when you queried the 13B model or others, this was what was happening.
  - In other cases, the models didn't even install to begin with.
- More efficient (thanks to @marcuswestin #16):
  - Clone or pull: installation now clones the llama.cpp repository, or pulls the latest changes if it already exists.
  - Only download when a file doesn't exist: previously, installing would always re-download everything from the llama-dl CDN. Now the code checks whether the files exist before attempting to download; if a file already exists, it immediately skips to the next step.
  - Only create a model when it doesn't exist: the model creation steps are skipped if the model already exists.
- Custom workspace folder: dalai now supports a custom workspace folder. Previously it always used the `$HOME` directory; now you can pass in a custom parameter to point to an existing llama.cpp workspace folder (see the sketch below).
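As a rough sketch of the custom workspace feature, assuming the workspace path is handed to the `Dalai` constructor (these notes only say a "custom parameter" can be passed, so the exact form is an assumption):

```js
// Hypothetical sketch: point dalai at an existing llama.cpp workspace folder
// instead of the default $HOME location. Passing the path as a constructor
// argument is an assumption; the notes only mention a "custom parameter".
const Dalai = require("dalai");

const dalai = new Dalai("/path/to/existing/llama.cpp-workspace");
```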
API
- ALL customization flags exposed: the API now exposes every configuration attribute supported by llama.cpp (see the example at the end of this section):
  - `top_k`
  - `top_p`
  - `n_predict`
  - `seed`
  - `temp`
  - `repeat_last_n`
  - `repeat_penalty`
- `install()` API documented: you can now install models programmatically.
- [NEW] `installed()` API: returns all the currently installed models.
- End marker: previously it was impossible to know when a streaming response had finished. Now every response session ends with `\n\n<end>` to mark the end, so you know the response has finished when you see `\n\n<end>`, and you can even write code to programmatically trigger other actions based on it (thanks to @marcuswestin #16).
  - This feature can be skipped by passing `skip_end: true` in the request payload.
- url: previously, when connecting to a remote dalai server, you specified the URL in the constructor (like `new Dalai("ws://localhost:3000")`). But the constructor is not the right place to take the URL as an input, since the `url` is only used when a client makes a request to a server. The `url` attribute has therefore been moved to the `request()` method.
  - Now, to make a request to a remote dalai server, you can simply attach a `url` parameter to your request payload (example: `url: "ws://localhost:3000"`); see the example below.
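Putting the API changes above together, here is a rough sketch of a programmatic install plus a remote request. The attribute names come from this changelog; the module import and the `(payload, callback)` shape of `request()` are assumptions rather than something these notes spell out.

```js
// Rough sketch based on the attributes listed in this changelog.
// The require() path and the (payload, callback) shape of request() are assumptions.
const Dalai = require("dalai");
const dalai = new Dalai();

async function main() {
  // programmatic install, and listing what is already installed
  await dalai.install("7B");
  const models = await dalai.installed();
  console.log("installed models:", models);

  let response = "";
  dalai.request({
    url: "ws://localhost:3000",  // target a remote dalai server (new in 0.1.0)
    model: "7B",
    prompt: "The quick brown fox",
    // every llama.cpp flag is now exposed
    top_k: 40,
    top_p: 0.9,
    n_predict: 128,
    seed: -1,
    temp: 0.8,
    repeat_last_n: 64,
    repeat_penalty: 1.3,
    // skip_end: true,           // uncomment to suppress the end marker
  }, (token) => {
    response += token;
    // the stream is finished once the end marker arrives
    if (response.endsWith("\n\n<end>")) {
      console.log("done:", response.replace("\n\n<end>", ""));
    }
  });
}

main();
```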
Web UI
- You can now customize all configurations in the dashboard (previously you could only enter the prompt):
  - `top_k`
  - `top_p`
  - `n_predict`
  - `seed`
  - `temp`
  - `repeat_last_n`
  - `repeat_penalty`
  - `model`
Dev experience
Thanks to @marcuswestin #16:
- `./dalai` shell script added: you can clone the repository and run the command locally instead of using `npx dalai`.
  - For example, after cloning the repository you can locally run `~/dalai install` as the equivalent of `npx dalai install`.
- Better log statements and exception handling
How to get this update
This version is 0.1.0
Method 1
You can upgrade by specifying the version:
`npx dalai@0.1.0 llama`
Method 2
The reason you need to specify the version is that npx caches packages. Normally you could just run `npx dalai install`, but npx seems to cache everything, so you will need to delete the npx cache:
`rm -rf ~/.npm/_npx`
and then install:
`npx dalai llama`