# Pull requests (for contributors)

- Test your changes:
  - Execute the full CI locally on your machine before publishing
  - Verify that the perplexity and the performance are not affected negatively by your changes (use `llama-perplexity` and `llama-bench`)
  - If you modified the `ggml` source, run the `test-backend-ops` tool to check whether different backend implementations of the `ggml` operators produce consistent results (this requires access to at least two different `ggml` backends)
  - If you modified a `ggml` operator or added a new one, add the corresponding test cases to `test-backend-ops`
- Consider allowing write access to your branch for faster reviews, as reviewers can push commits directly
- If your PR becomes stale, don't hesitate to ping the maintainers in the comments

# Pull requests (for collaborators)

# Coding guidelines

- Avoid adding third-party dependencies, extra files, extra headers, etc.
- Always consider cross-compatibility with other operating systems and architectures
- Avoid fancy-looking modern STL constructs, use basic `for` loops, avoid templates, keep it simple
- Vertical alignment makes things more readable and easier to batch edit
- Clean up any trailing whitespace, use 4 spaces for indentation, brackets on the same line, `void * ptr`, `int & a` (see the sketch at the end of this section)
- Use sized integer types (e.g. `int32_t`) in the public API
- Declare structs with `struct foo {}` instead of `typedef struct foo {} foo`
  - In C++ code omit the `struct` keyword whenever it is not necessary

  > [!NOTE]
  > This guideline is yet to be applied to the `llama.cpp` codebase. New code should follow this guideline.

- Try to follow the existing patterns in the code (indentation, spaces, etc.). In case of doubt use `clang-format` to format the added code
- Tensors store data in row-major order. We refer to dimension 0 as columns, 1 as rows, 2 as matrices
- Matrix multiplication is unconventional: `C = ggml_mul_mat(ctx, A, B)` means $C^T = A B^T \Leftrightarrow C = B A^T$

  *(figure "matmul": illustration of the transposed matrix-multiplication convention)*
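  To make the shape convention concrete, here is a minimal sketch of how the operand and result shapes relate (the dimension sizes and the standalone `main` are illustrative assumptions, not part of the guidelines):

  ```cpp
  #include "ggml.h"

  #include <cassert>

  int main() {
      struct ggml_init_params params = {
          /*.mem_size   =*/ 16*1024*1024,
          /*.mem_buffer =*/ NULL,
          /*.no_alloc   =*/ false,
      };
      struct ggml_context * ctx = ggml_init(params);

      // ne[0] = columns, ne[1] = rows (row-major storage)
      struct ggml_tensor * A = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 3); // 3 rows, 4 columns
      struct ggml_tensor * B = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 5); // 5 rows, 4 columns

      // A and B must match in dimension 0 (the shared inner dimension)
      struct ggml_tensor * C = ggml_mul_mat(ctx, A, B); // C^T = A B^T  <=>  C = B A^T

      assert(C->ne[0] == 3); // columns of C = rows of A
      assert(C->ne[1] == 5); // rows    of C = rows of B

      ggml_free(ctx);
      return 0;
  }
  ```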

- Preprocessor directives
  - (TODO: add guidelines with examples and apply them to the codebase)

    ```cpp
    #ifdef FOO
    #endif // FOO
    ```
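As referenced above, here is a minimal sketch illustrating the formatting conventions (4 spaces for indentation, brackets on the same line, vertical alignment, `void * ptr` / `int & a` style, sized integer types); the names are hypothetical and for illustration only:

```cpp
#include <cstdint>

// hypothetical struct, vertically aligned for readability and batch editing
struct my_buffer {
    int32_t   n_items;
    int32_t   capacity;
    void    * data;
};

// reference declared as `my_buffer & buf`, sized integer types in the interface
static void my_buffer_scale(my_buffer & buf, int32_t factor) {
    for (int32_t i = 0; i < buf.n_items; i++) {
        ((int32_t *) buf.data)[i] *= factor;
    }
}
```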

# Naming guidelines

- Use `snake_case` for function, variable and type names

- Naming usually optimizes for common prefix (see ggml-org/ggml#302 (comment))

  ```cpp
  // not OK
  int small_number;
  int big_number;

  // OK
  int number_small;
  int number_big;
  ```
- Enum values are always in upper case and prefixed with the enum name

  ```cpp
  enum llama_vocab_type {
      LLAMA_VOCAB_TYPE_NONE = 0,
      LLAMA_VOCAB_TYPE_SPM  = 1,
      LLAMA_VOCAB_TYPE_BPE  = 2,
      LLAMA_VOCAB_TYPE_WPM  = 3,
      LLAMA_VOCAB_TYPE_UGM  = 4,
      LLAMA_VOCAB_TYPE_RWKV = 5,
  };
  ```
- The general naming pattern is `<class>_<method>`, with `<method>` being `<action>_<noun>`

  ```cpp
  llama_model_init();           // class: "llama_model",         method: "init"
  llama_sampler_chain_remove(); // class: "llama_sampler_chain", method: "remove"
  llama_sampler_get_seed();     // class: "llama_sampler",       method: "get_seed"
  llama_set_embeddings();       // class: "llama_context",       method: "set_embeddings"
  llama_n_threads();            // class: "llama_context",       method: "n_threads"
  llama_adapter_lora_free();    // class: "llama_adapter_lora",  method: "free"
  ```

  - The `get` `<action>` can be omitted
  - The `<noun>` can be omitted if not necessary
  - The `_context` suffix of the `<class>` is optional
  - Use `init`/`free` for constructor/destructor `<action>`
- Use the `_t` suffix when a type is supposed to be opaque to the user - it's not relevant to them if it is a struct or anything else

  ```cpp
  typedef struct llama_context * llama_context_t;

  enum llama_pooling_type llama_pooling_type(const llama_context_t ctx);
  ```

  > [!NOTE]
  > This guideline is yet to be applied to the `llama.cpp` codebase. New code should follow this guideline.

- (TODO: abbreviations usage)

# Resources

The GitHub issues, PRs and discussions contain a lot of information that can be useful for getting familiar with the codebase. For convenience, some of the more important information is referenced from GitHub projects:

https://github.com/ggerganov/llama.cpp/projects