docs: add boot process documentation #754
base: develop
Add documentation of Bottlerocket's boot sequence, including systemd target progression, service dependencies, and synchronization mechanisms.
Signed-off-by: Sean P. Kelly <[email protected]>
| sysinit.target |
| ↓ |
| fipscheck.target (FIPS mode only) |
This isn't the correct order, but the reason why is subtle. There's a `bootconfig-fips.conf` snippet in `release` that sets the default systemd target to `fipscheck.target`.
Since all the units needed by that target set `DefaultDependencies=no`, `sysinit.target` isn't added to the job queue until we reach `activate-preconfigured.target`.
| Bottlerocket's boot sequence progresses through six main stages, each represented by a systemd target: |
| |
| ``` |
| sysinit.target |
I would probably add `local-fs.target` as one of the most important prerequisites of `sysinit.target`.
Quite a lot of Bottlerocket-specific work happens during local storage setup 😀, while the sysinit phase itself is rather vanilla.
| |
| **Purpose:** Verify cryptographic module integrity when FIPS mode is enabled. |
| |
| **When it runs:** Only when the kernel command line includes `fips=1`. |
Not correct, because of the bootconfig override, which is also where `fips=1` comes from.
| - Creates `/etc/.fips-module-check-passed` sentinel file on success |
| - Blocks boot if FIPS checks fail |
| |
| **Transition:** Completes before `drivers.target` begins. |
In terms of coordinated state transitions, we have these:
- fipscheck -> preconfigured (optional)
- preconfigured -> configured
- configured -> multi-user
These are our "runlevels" or discrete stages. Targets don't work like runlevels; they just activate in response to something else causing them to be enqueued. `sysinit.target` is enqueued because services in `preconfigured.target` depend on it (because of default dependencies).
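The enqueueing behavior described in this thread can be illustrated with a toy model. This is only a sketch, not systemd's actual transaction logic: the `pulls` graph and the `hypothetical-*.service` names are invented for illustration, and the only systemd behavior it assumes is that a service with default dependencies implicitly requires `sysinit.target`.

```python
# Toy model only: systemd's real transaction building is more involved.
# A unit is enqueued if it is reachable from the requested target through
# "pull" edges (Wants=/Requires=, including implicit default dependencies).

def enqueued_units(requested, pulls):
    """Return every unit pulled into the queue when `requested` is enqueued."""
    seen = set()
    stack = [requested]
    while stack:
        unit = stack.pop()
        if unit in seen:
            continue
        seen.add(unit)
        stack.extend(pulls.get(unit, ()))
    return seen


# Hypothetical unit names; the real Bottlerocket unit graph is larger.
pulls = {
    # Units behind fipscheck.target set DefaultDependencies=no, so nothing
    # here pulls sysinit.target.
    "fipscheck.target": ["hypothetical-fips-check.service"],
    # A unit with default dependencies implicitly requires sysinit.target.
    "preconfigured.target": ["hypothetical-settings.service"],
    "hypothetical-settings.service": ["sysinit.target"],
}

fips_queue = enqueued_units("fipscheck.target", pulls)
preconfigured_queue = enqueued_units("preconfigured.target", pulls)
print("sysinit.target" in fips_queue)           # False
print("sysinit.target" in preconfigured_queue)  # True
```

In this model nothing "reaches" `sysinit.target` as a stage; it only shows up once a unit that depends on it is enqueued.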
| ### Stage 2: drivers.target |
| |
| **Purpose:** Load kernel modules and hardware drivers. |
| |
| **When it runs:** Always runs, after `basic.target` and before `preconfigured.target`. |
It might be better to write this in terms of what "pulls" on various targets. Many of them (`sysinit.target`, `drivers.target`, `network-online.target`) are pulled in parallel by the units in `preconfigured.target`.
| **Dependencies:** |
| |
| - Requires `basic.target` and `configured.target` |
| - This ensures all configuration is complete before workloads start |
Kind of, but we never actually enqueue `multi-user.target` until these dependencies are satisfied.
| ## Service Dependency Patterns |
| |
| ### Ordering Dependencies |
| |
| - `After=` - This service starts after the specified units |
| - `Before=` - This service starts before the specified units |
| |
| ### Requirement Dependencies |
| |
| - `Requires=` - This service requires the specified units (hard dependency) |
| - `Wants=` - This service wants the specified units (soft dependency) |
| - `RequiredBy=` - Reverse of `Requires=` (specified in `[Install]` section) |
| - `WantedBy=` - Reverse of `Wants=` (specified in `[Install]` section) |
I find the hard / soft dependency language insufficiently precise.
The "systemd as job queue" formulation would say:
- `Wants=`/`WantedBy=` - causes the wanted unit to be enqueued (pretend it happens in random order), only if and exactly when the wanting unit is enqueued
- `After=`/`Before=` - affects the order in which enqueued units are started
- `Requires=`/`RequiredBy=` - like `Wants=`, but the requiring unit won't be started if the required unit fails (systemd applies this when the requiring unit is also ordered `After=` the required one)
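A small simulation may make the job-queue formulation concrete. This is a sketch under stated assumptions, not systemd's scheduler: `run_jobs`, the unit names, and the simulated `failing` set are all hypothetical, and the model assumes that `Requires=` only prevents a start when the requiring unit is also ordered `After=` the required one.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_jobs(enqueued, after, requires, failing=()):
    """Toy scheduler: returns (started, not_started).

    enqueued: set of unit names already in the queue
    after:    unit -> units it is ordered After=
    requires: unit -> units it Requires=
    failing:  units that fail when started (simulated)
    """
    # Ordering edges only matter between units that are both in the queue.
    graph = {
        unit: {dep for dep in after.get(unit, ()) if dep in enqueued}
        for unit in sorted(enqueued)
    }
    started, not_started = [], set()
    for unit in TopologicalSorter(graph).static_order():
        # Requires= blocks a unit only when it is also ordered After= the
        # required unit and that unit did not start.
        blocked = any(
            dep in not_started and dep in graph[unit]
            for dep in requires.get(unit, ())
        )
        if blocked or unit in failing:
            not_started.add(unit)
        else:
            started.append(unit)
    return started, not_started


# Hypothetical units: b is ordered after a in both scenarios.
queue = {"a.service", "b.service"}
after = {"b.service": ["a.service"]}

# Wants=-style pull: a fails, b still starts.
print(run_jobs(queue, after, requires={}, failing={"a.service"}))
# Requires= plus After=: a fails, so b is not started.
print(run_jobs(queue, after, {"b.service": ["a.service"]}, failing={"a.service"}))
```

The model keeps "what is enqueued" (the `enqueued` set) separate from "in what order jobs run" (the `after` edges), which is the distinction the hard / soft wording blurs.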
| ### Early Boot Services |
| |
| Services that need to run very early use `DefaultDependencies=no` to avoid the standard dependency chain: |
| |
| - `migrator.service` |
| - `prepare-*.service` (filesystem preparation) |
| - `activate-preconfigured.service` |
While this is true, I don't think it's helpful: most of these services are special and have unique reasons for `DefaultDependencies=no`, either for performance or because they're essential to reach `sysinit.target`.
| Target relationships: |
| |
| ``` |
| drivers.target: |
`drivers.target` is kind of a category error here: we can still be running some units from it while starting other units that `preconfigured.target` wants. The others are much stronger synchronization points.
In general I don't really see targets as synchronization points. They are more of an abstraction over a bunch of units: "I need everything required to bring the network online, or to make the TPM2 device available, to also go into the queue when you put my job in."
They let you synchronize what you enqueue but not when, exactly; "when" is just "the same instant you enqueue some other job".
| ### Sentinel Files |
| |
| Services use sentinel files to track state across reboots: |
| |
| - `/var/lib/bottlerocket/early-boot-config.ran` - Prevents `early-boot-config.service` from running after first boot |
| - `/run/bootstrap-containers/<name>.ran` - Prevents bootstrap containers from re-running |
| - `/etc/.fips-module-check-passed` - Marks FIPS check completion |
| |
| Services use `ConditionPathExists=` or `ConditionPathExists=!` to check for these files. |
True, though I dislike this pattern; ideally it's more of a last resort.
If we had a `sentineldog` that could create either a persistent (`/.bottlerocket`) or ephemeral (`/etc`) marker, then that might be a nicer interface for programs that find themselves doing this.
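For readers unfamiliar with the sentinel-file pattern under discussion, here is a minimal sketch of its logic. It is illustrative only: `run_once` and the marker name are hypothetical and not part of Bottlerocket, and the code merely mimics what a unit with `ConditionPathExists=!<marker>` achieves when the program it runs writes the marker itself.

```python
import tempfile
from pathlib import Path


def run_once(sentinel: Path, action) -> bool:
    """Run `action` unless `sentinel` exists; create it after success.

    Mirrors a unit with ConditionPathExists=!<sentinel> whose program
    writes the marker itself once its work is done.
    """
    if sentinel.exists():
        return False  # condition not met: skip, as systemd would
    action()
    sentinel.parent.mkdir(parents=True, exist_ok=True)
    sentinel.touch()
    return True


# Usage with a throwaway directory standing in for the real marker location.
with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "example.ran"  # hypothetical marker name
    runs = []
    first = run_once(marker, lambda: runs.append("configured"))
    second = run_once(marker, lambda: runs.append("configured"))
    print(first, second, runs)  # True False ['configured']
```

Whether the marker survives a reboot depends only on where `sentinel` lives, which is the persistent versus ephemeral distinction raised in the comment above.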
Add documentation of Bottlerocket's boot sequence, including systemd target progression, service dependencies, and synchronization mechanisms.
Include keywords for semantic search discoverability.
Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.