Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

contrib : add naming guidelines #11177

Merged
merged 14 commits into from
Jan 13, 2025
Merged

contrib : add naming guidelines #11177

merged 14 commits into from
Jan 13, 2025

Conversation

ggerganov
Copy link
Owner

@slaren
Copy link
Collaborator

slaren commented Jan 10, 2025

Following this, I believe that the sampler init functions are incorrectly named, e.g. llama_sampler_init_top_k should be llama_sampler_top_k_init, but it is not completely clear to me either.

@ggerganov
Copy link
Owner Author

ggerganov commented Jan 10, 2025

I think these are OK since the subject is llama_sampler. Probably the subject_verb_object guideline needs to be adjusted, because the object in this case is not really an object. It's more of a type.

It should also cover cases such as llama_sampler_init_from_file();. Open to suggestions.

@slaren
Copy link
Collaborator

slaren commented Jan 10, 2025

I think it would be useful to put it in terms more familiar to programmers. For example, if we assume an object model, then the naming convention could be defined as llama_<class>_<method>
method then could be defined as <action>_<noun>.
It would also be good to document common actions:
init: constructor
free: destructor
If action is omitted, it is assumed to be get. noun can be omitted when not necessary.
If class is omitted, it is assumed to be context.

@slaren
Copy link
Collaborator

slaren commented Jan 10, 2025

Other things to document:

  • Always use snake_case for function and variable names
  • Enum values are always in upper case and prefixed with the enum name
  • When to use typedef struct x {..} x; or plainly struct x {..};, or when to use the _t suffix in types (currently inconsistent)
  • Always use sized integer types in the public API for easier interop with other languages
  • Common abbreviations, such as n for number (I would discourage all use of abbreviations, but the current code uses them frequently)
  • Follow the existing code style, in case of doubt use clang-format to format the added code

@ggerganov
Copy link
Owner Author

Great suggestions - will update those tomorrow.

@ggerganov
Copy link
Owner Author

When to use typedef struct x {..} x; or plainly struct x {..};, or when to use the _t suffix in types (currently inconsistent)

I've added a guideline. Needs to be extended with the _t suffix usage - could you help with this?

@slaren
Copy link
Collaborator

slaren commented Jan 11, 2025

My recommendation would be to use _t typedefs when the types are supposed to be opaque to the user - it's not relevant to them if it is a struct or anything else. However, that's not the way it is done in llama.cpp and would require changing a lot of types.
https://www.kernel.org/doc/html/v4.10/process/coding-style.html#typedefs

@JohannesGaessler
Copy link
Collaborator

For blocks starting with #ifdef foo, add a comment like #endif // foo to the end of the block.

CONTRIBUTING.md Outdated Show resolved Hide resolved
@JohannesGaessler
Copy link
Collaborator

JohannesGaessler commented Jan 11, 2025

"Documentation is a community effort. When you need to look into the source code to figure out implementation details to figure out how to use an API consider adding a short summary to the header file for future reference. When you notice incorrect or outdated documentation, please update it."

@slaren
Copy link
Collaborator

slaren commented Jan 11, 2025

Might be good to also add a link to the C++ Core Guidelines, generally it's very good advice: https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines

@slaren
Copy link
Collaborator

slaren commented Jan 12, 2025

It would be good to also document file naming. For example, most headers use .h extension, but some use .hpp. That should be standardized.

CONTRIBUTING.md Outdated Show resolved Hide resolved
CONTRIBUTING.md Outdated Show resolved Hide resolved
@ggerganov ggerganov merged commit 7426a26 into master Jan 13, 2025
1 check passed
@ggerganov ggerganov deleted the gg/docs-naming branch January 13, 2025 12:46
- Clean-up any trailing whitespaces, use 4 spaces for indentation, brackets on the same line, `void * ptr`, `int & a`
- Naming usually optimizes for common prefix (see https://github.com/ggerganov/ggml/pull/302#discussion_r1243240963)
- Use sized integer types in the public API
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Use sized integer types in the public API
- Use sized integer types such as `int32_t` in the public API, e.g. `size_t` may also be appropriate for allocation sizes or byte offsets

# Documentation

- Documentation is a community effort
- When you need to look into the source code to figure out implementation details to figure out how to use an API consider adding a short summary to the header file for future reference
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I edited my original post and crossed out "to figure out implementation details" and replaced it with "to figure out how to use an API" since I think implementation details don't actually belong in the header. It seems both got copy-pasted; I guess this formulation also makes sense though it sounds awkward.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated: ca001f6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants