Refactor configs #383

Open: mikasenghaas wants to merge 48 commits into main from mika/refactor/config

mikasenghaas (Collaborator) commented Jun 10, 2025

This PR is a major refactor of the configs. Most importantly, it switches from pydantic_config to Pydantic's official solution, pydantic-settings, and renames and restructures large parts of the inference configs.

Nice things

  1. Natural nesting and more descriptive names: The configs are renamed and restructured so that they are easier to understand and maximally modular. This lets us pass only the relevant config chunk to functions and classes for initialization (e.g. PipelineConfig is passed to setup_pipeline).

  2. CLI help message: We can now display a help message via uv run src/zeroband/infer.py -h (or --help) that shows each argument's type, default value and description. This should make it much easier for a) new people to use the project and b) us (who are forgetful) to remember what the arguments we defined months ago are for.

  3. Nested TOML configs: We can now load multiple config files. This is super useful when a general config (e.g. @configs/inference/synthetic-2/default.toml) is shared across multiple specific configs (e.g. here the model-specific config @configs/inference/synthetic-2/qwen3-4b.toml):

uv run src/zeroband/infer.py @configs/inference/synthetic-2/default.toml @configs/inference/synthetic-2/qwen3-4b.toml
  4. Multiple sources: We can load configs from TOML files, CLI arguments and environment variables, and can easily define the precedence of sources. This is the hierarchy, from highest to lowest precedence: 1) CLI arguments, 2) config file values, 3) environment variables, 4) defaults.
PRIME_MODEL__NAME=Qwen/Qwen3-4B uv run src/zeroband/infer.py @qwen8b.toml @qwen14b.toml --model.name Qwen/Qwen3-32B

In this example, the CLI argument --model.name Qwen/Qwen3-32B takes precedence and the script uses Qwen/Qwen3-32B as the model name. If the CLI argument wasn't set, the second config file would take precedence and the script would use Qwen/Qwen3-14B. If the second config file wasn't passed, the first config file would take precedence and the script would use Qwen/Qwen3-8B. If no config file was passed, the environment variable would take precedence and the script would use Qwen/Qwen3-4B. Finally, if the environment variable wasn't set either, the default value Qwen/Qwen3-0.6B would be used.
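The lookup order described above can be sketched with collections.ChainMap from the standard library. This is only a model of the precedence rule, not the pydantic-settings machinery; the resolve helper, the flat "model.name" keys, and the assumed contents of the two config files (guessed from their file names) are illustrative.

```python
from collections import ChainMap


def resolve(cli: dict, config_files: list, env: dict, defaults: dict) -> ChainMap:
    """Build a lookup chain: CLI > config files (later file wins) > env > defaults."""
    # ChainMap searches its mappings left to right, so the highest-precedence
    # source goes first. Config files are reversed so the last one passed wins.
    return ChainMap(cli, *reversed(config_files), env, defaults)


defaults = {"model.name": "Qwen/Qwen3-0.6B"}
env = {"model.name": "Qwen/Qwen3-4B"}            # from PRIME_MODEL__NAME
config_files = [
    {"model.name": "Qwen/Qwen3-8B"},             # assumed content of qwen8b.toml
    {"model.name": "Qwen/Qwen3-14B"},            # assumed content of qwen14b.toml
]
cli = {"model.name": "Qwen/Qwen3-32B"}           # from --model.name

print(resolve(cli, config_files, env, defaults)["model.name"])
```

Dropping sources one at a time (empty cli, then fewer config files, then empty env) walks down the same chain the paragraph above describes.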

  5. Easy logging: It's a lot easier to log (nested) configs, both to stdout and to external services like W&B. For example, this single line
    logger.info(f"Initializing model and tokenizer ({config.model})")

prints the full model config:

06-11 16:55:14 [INFER] [INFO] Initializing model and tokenizer (name='Qwen/Qwen3-4B' dtype='auto' kv_cache_dtype='auto' max_model_len=16384 quantization=None enforce_eager=False device='auto' enable_thinking=True)

We can also easily define which config values are printed from a nested config using the repr argument of the Field class.
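The same idea exists in the standard library's dataclasses, which makes for a dependency-free sketch of hiding a field from the printed config. This is an analogue, not the Pydantic model from the PR: the ModelConfig below is a hypothetical dataclass, and the choice to hide enforce_eager is arbitrary.

```python
from dataclasses import dataclass, field


@dataclass
class ModelConfig:
    """Hypothetical stand-in for the model config; field names follow the log line above."""

    name: str = "Qwen/Qwen3-4B"
    dtype: str = "auto"
    max_model_len: int = 16384
    # repr=False keeps the value usable but leaves it out of the printed config.
    enforce_eager: bool = field(default=False, repr=False)


config = ModelConfig()
print(f"Initializing model and tokenizer ({config})")
```

The hidden field is still set and readable as config.enforce_eager; it is only omitted from the string that ends up in the logs.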

  6. No maintenance & more features: We do not need to maintain anything ourselves but get loads of nice (optional) features for free, like setting configs via a JSON string, from a .env file, etc.

“Breaking”/ annoying things

  1. Quite a few argument names have changed, so people will get slightly annoyed at me when they e.g. try to type --model-name but it is now --model.name.

  2. We define a new schema for setting configs via environment variables. All arguments are prefixed with PRIME_ and use __ to denote nesting. For example, --model.name is nested and the corresponding environment variable is PRIME_MODEL__NAME. This affects how we set the socket path in production from the protocol worker: the env variable PRIME_SOCKET_PATH does not work anymore. Instead we have to use PRIME_MONITOR__SOCKET__PATH, or simply pass it via CLI as --monitor.socket.path (preferred).
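To make the naming scheme concrete, here is a small standard-library sketch of the mapping between dotted CLI names and environment variables. The helpers env_name and parse_env are hypothetical and only mimic the convention described above (PRIME_ prefix, __ as nesting delimiter); the socket path value is made up.

```python
def env_name(dotted_path: str, prefix: str = "PRIME_", delimiter: str = "__") -> str:
    """Map a dotted CLI name such as 'model.name' to its environment variable."""
    return prefix + dotted_path.replace(".", delimiter).upper()


def parse_env(environ: dict, prefix: str = "PRIME_", delimiter: str = "__") -> dict:
    """Collect prefixed variables into a nested dict, splitting keys on the delimiter."""
    nested: dict = {}
    for key, value in environ.items():
        if not key.startswith(prefix):
            continue  # ignore unrelated variables
        parts = key[len(prefix):].lower().split(delimiter)
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


print(env_name("model.name"))
print(env_name("monitor.socket.path"))

# The old variable has a single underscore, so it is read as one flat key
# and never reaches monitor.socket.path.
print(parse_env({"PRIME_SOCKET_PATH": "/tmp/example.sock"}))
print(parse_env({"PRIME_MONITOR__SOCKET__PATH": "/tmp/example.sock"}))
```

Under these assumptions the sketch shows why the old variable stops working: with a single underscore there is no nesting, so the value lands under a flat socket_path key rather than under monitor, socket, path.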

mikasenghaas force-pushed the mika/refactor/config branch from 8d3d82d to 331b0f0 (June 11, 2025 17:01)
mikasenghaas marked this pull request as ready for review (June 11, 2025 18:47)
mikasenghaas force-pushed the mika/refactor/config branch from 84838c6 to 715e3df (June 11, 2025 19:54)
mikasenghaas (Collaborator, Author) commented:

CI e2e run works. Also tested that all of the commands in the README for distributed inference are updated and work.

[Screenshot, 2025-06-11 at 1:33:12 PM]

mikasenghaas changed the title from "Refactor inference configs" to "[PRI2-591] Refactor inference configs" (Jun 11, 2025)
mikasenghaas changed the title from "[PRI2-591] Refactor inference configs" to "Refactor inference configs" (Jun 11, 2025)
mikasenghaas changed the title from "Refactor inference configs" to "Refactor configs" (Jun 12, 2025)
mikasenghaas (Collaborator, Author) commented:

[Screenshot, 2025-06-11 at 6:37:15 PM]

Jackmin801 (Member) commented:

[image]

3 participants