Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: add option to force guests to use swiotlb in dedicated memory region for all I/O #5108

Draft
wants to merge 16 commits into
base: feature/secret-hiding
Choose a base branch
from

Conversation

roypat
Copy link
Contributor

@roypat roypat commented Mar 24, 2025

Changes

Add a new API field to /machine-config, initial_swiotlb_size, which causes Firecracker to carve out a region of guest memory of the specified size, mark is as dma-restricted in FDT, and force all virtio devices to use this memory for their vrings and buffers.

Reason

With secret hiding, I/O needs to be handled by swiotlb, as we cannot do I/O into guest memory that is direct map removed

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

@roypat roypat force-pushed the secret-freedom-v4 branch 14 times, most recently from 8fe8462 to 75f4ee1 Compare March 26, 2025 15:41
@roypat roypat force-pushed the feature/secret-hiding branch from 5573725 to ad06919 Compare March 27, 2025 07:40
@roypat roypat force-pushed the secret-freedom-v4 branch 12 times, most recently from e11ee8e to 7635ee8 Compare March 28, 2025 14:01
Copy link

codecov bot commented Mar 28, 2025

Codecov Report

Attention: Patch coverage is 62.88462% with 193 lines in your changes missing coverage. Please review.

Project coverage is 83.53%. Comparing base (cfd4d8b) to head (818a6ee).

Files with missing lines Patch % Lines
src/vmm/src/vm_memory_vendored.rs 41.61% 87 Missing ⚠️
src/vmm/src/vstate/vm.rs 55.05% 40 Missing ⚠️
src/vmm/src/builder.rs 60.00% 24 Missing ⚠️
src/vmm/src/resources.rs 83.33% 8 Missing ⚠️
src/vmm/src/devices/virtio/block/device.rs 0.00% 5 Missing ⚠️
src/vmm/src/devices/virtio/queue.rs 82.60% 4 Missing ⚠️
src/vmm/src/arch/x86_64/mod.rs 0.00% 3 Missing ⚠️
src/vmm/src/devices/virtio/balloon/device.rs 0.00% 3 Missing ⚠️
.../vmm/src/devices/virtio/block/vhost_user/device.rs 0.00% 3 Missing ⚠️
src/vmm/src/devices/virtio/block/virtio/device.rs 0.00% 3 Missing ⚠️
... and 5 more
Additional details and impacted files
@@                    Coverage Diff                    @@
##           feature/secret-hiding    #5108      +/-   ##
=========================================================
+ Coverage                  83.03%   83.53%   +0.50%     
=========================================================
  Files                        250      222      -28     
  Lines                      26911    24751    -2160     
=========================================================
- Hits                       22345    20676    -1669     
+ Misses                      4566     4075     -491     
Flag Coverage Δ
5.10-c5n.metal 83.27% <62.88%> (-0.28%) ⬇️
5.10-m5n.metal 83.28% <62.88%> (-0.28%) ⬇️
5.10-m6a.metal 82.44% <62.88%> (-0.28%) ⬇️
5.10-m6g.metal ?
5.10-m6i.metal 83.27% <62.88%> (-0.29%) ⬇️
5.10-m7a.metal-48xl 82.43% <62.88%> (-0.28%) ⬇️
5.10-m7g.metal ?
6.1-c5n.metal 83.32% <62.88%> (-0.28%) ⬇️
6.1-m5n.metal 83.31% <62.88%> (-0.29%) ⬇️
6.1-m6a.metal 82.48% <62.88%> (-0.29%) ⬇️
6.1-m6g.metal ?
6.1-m6i.metal 83.31% <62.88%> (-0.29%) ⬇️
6.1-m7a.metal-48xl 82.48% <62.88%> (-0.29%) ⬇️
6.1-m7g.metal ?
6.14-c5n.metal 83.31% <62.88%> (-0.28%) ⬇️
6.14-m5n.metal 83.31% <62.88%> (-0.29%) ⬇️
6.14-m6a.metal 82.48% <62.88%> (-0.28%) ⬇️
6.14-m6g.metal ?
6.14-m6i.metal 83.30% <62.88%> (-0.29%) ⬇️
6.14-m7a.metal-48xl 82.48% <62.88%> (-0.28%) ⬇️
6.14-m7g.metal ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@roypat roypat force-pushed the secret-freedom-v4 branch 2 times, most recently from 6c587d6 to c9019db Compare March 28, 2025 15:12
@roypat roypat force-pushed the secret-freedom-v4 branch 8 times, most recently from 3e042c6 to a4e86a0 Compare March 28, 2025 17:22
@roypat roypat force-pushed the feature/secret-hiding branch from bf38d65 to cfd4d8b Compare April 3, 2025 05:39
@roypat roypat force-pushed the secret-freedom-v4 branch 2 times, most recently from 674a87c to ba75dbc Compare April 3, 2025 12:28
roypat added 16 commits April 3, 2025 13:49
There's no need to test this through VmResources when it can be tested
in isolation. Also, everytime I touch MachineConfig I get confsued by
where the hell the tests are, cuz not only are they in a different
module, they're also one directory level away. So move the tests into
machine_config.rs, where it makes sense to have them.

Signed-off-by: Patrick Roy <[email protected]>
We will use the generified GuestRegionCollection type from
vm-memory#312, so temporarily copy it from the vm-memory PR into the
firecracker repository.

This commit must be dropped in favor of the upstream solution once
vm-memory#312 is merged, and before the feature branch is merged.

Signed-off-by: Patrick Roy <[email protected]>
Use a special KvmRegion type, which allows us to manage
kvm_userspace_memory_region2 inside of GuestRegionCollection.

This means we can stop using .zip(0u32..) to determine kvm slot indices,
which doesnt work anymore when multiple GuestMemoryMmaps are around,
which have no order imposed on them wrt which one is initialized first.

Signed-off-by: Patrick Roy <[email protected]>
Add a mem_config parameter to the PUT/PATCH /machine-config endpoints,
which allows configuration of the size of a swiotlb region to use.

Signed-off-by: Patrick Roy <[email protected]>
Add a special `GuestMemoryMmap` to `struct Vm` that has the semantics
that if its not empty, it represents the swiotlb area in the guest.

Have all virtio devices only get access to swiotlb regions if they
exist, only falling back onto having access to all of guest memory
otherwise. This means that in case of swiotlb regions being set up,
virtio devices cannot access generic memory, so if the guest decides to
(incorrectly) place I/O buffers outside of swiotlb we'll get a somewhat
meaningful error of "invalid guest address" (or similar), instead of
random things failing further down the line in case generic memory isn't
accessible by firecracker/the host kernel for some reason (e.g. if its
guest_memfd backed in the future).

When a VM with swiotlb regions is being snapshotted, the regions are
simply concatenated to the end of the snapshot memory file. This changes
the snapshot memory file format in the sense that regions in the file
are not necessarily sorted by guest address anymore.

Signed-off-by: Patrick Roy <[email protected]>
If swiotlb is requested through the API, this function will allocate a
single contiguous region intended for swiotlb use. It needs to be
continguous because we must be able to describe it using a single memory
range in FDT, as devices cannot be assigned to reserved memory
consisting of multiple regions (e.g. FDT assignment is to memory
regions, not to #reserved-memory nodes).

While we're at it, always use anon memory for the "normal" part of guest
memory if swiotlb is enabled, as if swiotlb is enabled we know that
vhost devices will never need to access that part of memory.

Signed-off-by: Patrick Roy <[email protected]>
Allocate the swiotlb region if requested via the /machine-config API
endpoint, and register it to the VM/KVM.

Signed-off-by: Patrick Roy <[email protected]>
describe the swiotlb region as a restricted-mem node in the flattened
device tree, and force all virtio devices to use it for their vrings and
buffers by setting their `memory-region` property.

Signed-off-by: Patrick Roy <[email protected]>
Have all virtio devices off VIRTIO_F_PLATFORM_ACCESS, which communicates
to the guest that DMA APIs should be used for allocating vrings and
virtio buffers. Since we have created a dedicated swiotlb region in the
geust and assigned all virtio devices to this, it means that all such
allocations will go through the swiotlb region.

Signed-off-by: Patrick Roy <[email protected]>
Support restoring VMs with swiotlb regions. For this, untangle the uffd
handshake from the actual restoration of memory regions, as we first
need to restore and register to the Vm _all_ memory regions, before we
can then send a single handshake containing both normal and swiotlb
regions to the Uffd handler.

While we're here, significantly simplify the jungle of error types.

Signed-off-by: Patrick Roy <[email protected]>
This will make it easier to validate in an integration test that virt
queues are actually placed in the swiotlb region, as a sort of sanity
check that on Firecracker's side the swiotlb regions are actually passed
to the virtio devices, and that the guest is honoring the request to
place virtio stuff into the swiotlb regions.

Signed-off-by: Patrick Roy <[email protected]>
This is needed for the guest to correctly parse the restricted-memory
nodes that firecracker now puts into FDT.

Signed-off-by: Patrick Roy <[email protected]>
these contain 6.1 guest kernels that have CONFIG_DMA_RESTRICTED_MEM
enabled and can thus parse swiotlb region descriptions from FDT.

Signed-off-by: Patrick Roy <[email protected]>
If the CI artifacts dont contain old firecracker releases, still succeed
at setting downloading them.

Signed-off-by: Patrick Roy <[email protected]>
On aarch64, additionally parametrize our throughput performance tests
(network, block and vsock) by swiotlb size, to get a feeling for the
performance impact bounce buffering will have. Try out different sizes
of swiotlb region to see whether we can identify a cut off.

Signed-off-by: Patrick Roy <[email protected]>
Add an integration test on ARM that configures an in-guest swiotlb
region to which all virtio devices should be bound. Asserts the guest
sees the swiotlb region, and also that block and net I/O work
(indirectly through loading of rootfs during boot and by running
commands via SSH). Also asserts that all virtio queue components have
been placed into swiotlb, by parsing firecracker log messages.

Also add a test for snapshotting VMs with swiotlb regions, which just
checks the network device after restoration to sanity check that the
swiotlb region was correctly restored.

Signed-off-by: Patrick Roy <[email protected]>
@roypat roypat force-pushed the secret-freedom-v4 branch from ba75dbc to 818a6ee Compare April 3, 2025 12:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant