-
Notifications
You must be signed in to change notification settings - Fork 47
Refactor configs #383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
mikasenghaas
wants to merge
48
commits into
main
Choose a base branch
from
mika/refactor/config
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Refactor configs #383
+1,299
−671
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
samsja
reviewed
Jun 10, 2025
samsja
reviewed
Jun 10, 2025
samsja
reviewed
Jun 10, 2025
8d3d82d
to
331b0f0
Compare
84838c6
to
715e3df
Compare
samsja
reviewed
Jun 12, 2025
samsja
reviewed
Jun 12, 2025
samsja
reviewed
Jun 12, 2025
samsja
reviewed
Jun 12, 2025
samsja
reviewed
Jun 12, 2025
samsja
reviewed
Jun 12, 2025
samsja
reviewed
Jun 12, 2025
Co-authored-by: samsja <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR is quite a major refactor of the configs. Most importantly it switches from
pydantic_config
to the Pydantic’s official solutionpydantic-settings
and renames and structures large parts of the inference configs.Nice things
Natural nesting and more descriptive names: The configs are renamed and restructured such that they easier to understand and maximally modular, allowing us to pass around only config chunks in functions/ classes for initialization (e.g.
PipelineConfig
is passed tosetup_pipeline
)CLI help message: We can now display a help message via
uv run src/zeroband/infer.py -h
(or—help
) showing arguments’ types, default values and descriptions. This should make it much easier for a) new people to use the project and b) us (who are forgetful) to remember what certain arguments are for we defined months agoNested TOML configs: We can now load multiple config files. This is super useful if there are some general config (e.g.
@configs/inference/synthetic-2/default.toml
) that is shared across multiple specific configs (e.g. here the model-specific config@configs/inference/synthetic-2/qwen3-4b.toml
)In this example, the CLI argument
--model.name Qwen/Qwen3-32B
will take precendence and the script will useQwen/Qwen3-32B
as the model name. If the CLI argument wasn't set, then the second config file would take precedence and the script would useQwen/Qwen-14B
as the model name. If the second config file wasn't set, then the first config file would take precedence and the script would useQwen/Qwen3-8B
as the model name. Finally, if the first config file wasn't set, then the environment variable would take precedence and the script would useQwen/Qwen-4B
as the model name. If the environment variable wasn't set, then the default value would be used and the script would useQwen/Qwen3-0.6B
as the model name.Prints the full model config
We can also easly define which config values should be printed from a nested config using the
repr
argument of theField
class..env
file, etc.“Breaking”/ annoying things
Quite some argument names are changed, so people will get slightly annoyed at me when they e.g. try to type
—model-name
but now it is—model.name
We define a new schema for setting configs via environment variables. All arguments are prefixed with
PRIME_
and use__
to denote nested model. For example,--model.name
is nested and the corresponding environment variable would bePRIME_MODEL__NAME
. This affects how we set the socket path in production from the protocol worker. The env variablePRIME_SOCKET_PATH
does not work anymore, instead we have to usePRIME_MONITOR__SOCKET__PATH
, or simply pass via CLI as--monitor.socket.path
(preferred)