Commit a414982

README: Cortex's instructions
1 parent 526934b commit a414982

File tree

1 file changed: +7 -1 lines changed


README.md: +7 -1
@@ -37,7 +37,7 @@ echo "Translate into German: thank you" | ./ask-llm.py
 
 ## Using Local LLM Servers
 
-Supported local LLM servers include [llama.cpp](https://github.com/ggerganov/llama.cpp), [Jan](https://jan.ai), [Ollama](https://ollama.com), [LocalAI](https://localai.io), [LM Studio](https://lmstudio.ai), and [Msty](https://msty.app).
+Supported local LLM servers include [llama.cpp](https://github.com/ggerganov/llama.cpp), [Jan](https://jan.ai), [Ollama](https://ollama.com), [Cortex](https://cortex.so), [LocalAI](https://localai.io), [LM Studio](https://lmstudio.ai), and [Msty](https://msty.app).
 
 To utilize [llama.cpp](https://github.com/ggerganov/llama.cpp) locally with its inference engine, load a quantized model like [Llama-3.2 3B](https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF) or [Phi-3.5 Mini](https://huggingface.co/bartowski/Phi-3.5-mini-instruct-GGUF). Then set the `LLM_API_BASE_URL` environment variable:
 ```bash
@@ -58,6 +58,12 @@ export LLM_API_BASE_URL=http://127.0.0.1:11434/v1
 export LLM_CHAT_MODEL='llama3.2'
 ```
 
+To use [Cortex](https://cortex.so) local inference, pull a model (such as `llama3.2` or `phi-3.5`, among [many others](https://cortex.so/models/)) and ensure that its API server is running, and then configure these environment variables:
+```bash
+export LLM_API_BASE_URL=http://localhost:39281/v1
+export LLM_CHAT_MODEL='llama3.2:3b-gguf-q4-km'
+```
+
 For [LocalAI](https://localai.io), initiate its container and adjust the environment variable `LLM_API_BASE_URL`:
 ```bash
 docker run -ti -p 8080:8080 localai/localai llama-3.2-3b-instruct:q4_k_m
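
All of these servers expose an OpenAI-compatible API under the configured base URL, so the two environment variables added in this commit are all a client needs. As a quick sanity check (a minimal sketch assuming the standard OpenAI-style `/chat/completions` route and the Cortex values from the diff above), the new settings can be exercised directly with `curl`:

```bash
# Point at the Cortex server and the model pulled earlier (values taken from the diff above).
export LLM_API_BASE_URL=http://localhost:39281/v1
export LLM_CHAT_MODEL='llama3.2:3b-gguf-q4-km'

# Send a single chat-completion request to the OpenAI-compatible endpoint.
curl -s "$LLM_API_BASE_URL/chat/completions" \
  -H 'Content-Type: application/json' \
  -d '{
        "model": "'"$LLM_CHAT_MODEL"'",
        "messages": [{"role": "user", "content": "Translate into German: thank you"}]
      }'
```

If the response contains a `choices` array with the model's reply, the same environment variables should work unchanged for the tools described in the README.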
