Idea: Target Jan's llama.cpp build #20

@iwr-redmond

Description

The folks at @menloresearch have recently updated their builds to include CUDA and Vulkan binaries compiled with support for common CPU instruction sets. Together with the existing Metal builds, this means all common GPUs are now supported, with optimized CPU offload on Windows and Linux.

It may be helpful to target these binaries rather than upstream llama.cpp, as they are well-tested and widely deployed in Jan. Because they are built with the GGML_CPU_ALL_VARIANTS=ON flag, both the GPU and non-GPU builds ship with optimizations for most common CPU types.

The current Menlo release of llama.cpp, and the first to include these optimized builds, is b6765. The new Windows and Linux builds are those with common_cpus in the filenames.
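For tooling that wants to pick these binaries up automatically, one option is to query the GitHub releases API and keep only the assets whose filenames contain the common_cpus marker. A minimal sketch, assuming the builds are published as release assets on the menloresearch/llama.cpp repository (the asset filenames shown in comments are illustrative, not verified):

```python
import json
import urllib.request

# Assumed releases endpoint for the Menlo fork; b6765 is the release
# tag mentioned above.
RELEASES_API = (
    "https://api.github.com/repos/menloresearch/llama.cpp/releases/tags/b6765"
)

def pick_common_cpu_assets(assets):
    """Filter GitHub release assets down to the optimized builds.

    The common_cpus substring in the filename marks the builds produced
    with GGML_CPU_ALL_VARIANTS=ON.
    """
    return [a for a in assets if "common_cpus" in a["name"]]

def fetch_release_assets(url=RELEASES_API):
    """Fetch the asset list for a release via the GitHub REST API.

    Requires network access and is subject to unauthenticated API
    rate limits.
    """
    with urllib.request.urlopen(url) as resp:
        release = json.load(resp)
    return release.get("assets", [])
```

A caller would then download each matching asset's `browser_download_url`, falling back to upstream llama.cpp releases for platforms (such as macOS/Metal) that do not use the common_cpus naming.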
