Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework shared memory in light of revocation and explicit mmap permissions #2225

Open
wants to merge 11 commits into
base: dev
Choose a base branch
from

Conversation

brooksdavis
Copy link
Member

Sharing memory that can contain capabilities across address spaces is fraught with peril, but also somewhat common. Increase the degree to which the programmer must alter their code to express the intent share capabilities across address spaces with the aim of making such uses auditable.

TODO:

  • Add elfctl knob(s) to enable/disable cross address space sharing
  • minherit(INHERIT_NONE) when not covering a full reservation is probably a panic implementation post fork and thus should be disallowed.

bin/cheribsdtest/cheribsdtest_vm.c Outdated Show resolved Hide resolved
lib/libsys/shm_open.2 Outdated Show resolved Hide resolved
lib/libsys/shm_open.2 Show resolved Hide resolved
sys/kern/uipc_shm.c Outdated Show resolved Hide resolved
sys/kern/uipc_shm.c Outdated Show resolved Hide resolved
sys/kern/uipc_shm.c Outdated Show resolved Hide resolved
sys/kern/uipc_shm.c Outdated Show resolved Hide resolved
&reservation, shmfd->shm_size, max_prot);
if (rv != KERN_SUCCESS)
goto fail;
rv = vm_map_insert(map, shmfd->shm_object, 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a particular reason we have to map them in order to scan them, or is this just more convenient? In general I'd expect it to be significantly cheaper to scan the resident pages directly (and page in as needed, based on the capability tags stashed in swap block structures). That approach however does require more code, so I understand why you went with this approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing scan logic is more vm_map_entry-centric than might be ideal. Perhaps ‎vm_cheri_revoke_object_at should be refactored to not take the entry and instead take (and return) the fields it needs. Then we could sweep an arbitrary chunk of an object, mapped or not.

Though that approach is at odds with an experiment I'd love to have time to run: https://github.com/CTSRD-CHERI/paper-revocation3/issues/33

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it because it was easy and took advantage of existing infrastructure. Ideally we'd probably scan the pages directly at the end and thus do minimal work in the common case where the pages were already mapped.

sys/vm/vm_mmap.c Show resolved Hide resolved
brooksdavis and others added 11 commits October 17, 2024 22:28
While here, explicitly request PROT_CAP permissions.  In purecap
compilations of the test suite we store a full capability for
faulting addresses and thus need the ability to share them.
Alter the child side to use MAP_PRIVATE which can still leak
capabilities into the address space, just not back the other way.
Differentiate between system calls failing (e.g., due to ABI changes)
and tags being transfered.  Syscalls wrapped with cheribsdtest
CHERI/VERIFY macros exit with EX_SOFTWARE where success or tag transfer
now returns 0 or 1 respectively.
This uses a new shm_open_flags table which is positively defined to
the list of allowed flags as the majority of open(2) flags don't apply
to shm_open(2) or memfd_create(2). This will permit shm-specific O_
flags which reuse otherwise irrelevant values (e.g., O_DIRECTORY or
O_EXEC) to avoid consuming scarce O_ values.
We were previously unconditionally adding PROT_WRITE to the maxprot of
private mapping (because a private mapping can be written even if the
fd is read-only), but this might violate the user's PROT_MAX request.

While here, rename cap_maxprot to max_maxprot.  This is the intersection
of the maximum protections imposed by capsicum rights on the fd (not
really relevant for private mappings) and the user-required maximum
protections (which were not being obeyed).  In particular, cap_maxprot
is a misnomer after the introduction of PROT_MAX.

Add some regression test cases.  mmap__maxprot_shm fails without this
patch.

Note: Capsicum's CAP_MMAP_W is a bit ambiguous.  Should it be required
in order to create writeable private mappings?  Currently it is, even
though such mappings don't permit writes to the object referenced by the
fd.

Reported by:	brooks
Reviewed by:	brooks
MFC after:	1 month
Fixes:		c7841c6 ("Relax restrictions on private mappings of POSIX shm objects.")
Differential Revision:	https://reviews.freebsd.org/D46741

(cherry picked from commit 33c2c58f0a3db0a6d3996fa14ac7967274678771)
In order to preserve provenance and prevent colluding proceses from
hiding capabilities from the revoker, we must not share capabilities
across address spaces.  However, doing so is often useful and some
existing code makes use of of this functionality in relatively safe
ways.

As a compromise, allow sharing, but only if the programmer makes their
intent clear by making the following changes:
 - Require that PROT_CAP be requested explicitly in mmap to allow load
   or store of capabilities.
 - By default, attach each share memory object to the first address
   space (vmspace) that opens it and allow only that address space to
   map the object with PROT_CAP.  On fork, remove capability permissions
   from the mapping.
 - Add an O_SHARECAP flag for shm_open2 (aka shm_open and memfd_create)
   which allows sharing capabilities across address spaces, overriding
   the default.
Allow vm entries to be created with VM_INHERIT_NONE.  To be used in a
future commit which maps objects during a revocation pass.
During revocation, map all local shm objects into the address space to
ensure that they are scanned even when they are not (fully) mapped.
Mappings created with O_SHARECAP aren't mapped as revocation on them
is unreliable by design.

I had originally envisioned a model were we maintained a mapping of
only pages not currently mapped in the address space, but the required
bookkeeping seems quite complex and there isn't obvious machinery to
handle it.
mmap(..., MAP_ANON|MAP_SHARED, ...):
    Require PROT_CAP explicitly to enable capability support in
    shared, anonymous mappings.  When specified, set the MAP_SHARECAP
    cow flag which causes a backing object to be allocated and the
    OBJ_SHARECAP flag set to allow sharing capabilities across address
    spaces.

shmat:
    Always set OBJ_SHARECAP on SysV shared memory objects.  Use of them
    is straightforwardly auditable.  We might want to add an explict
    SHM_SHARECAP flag at some point rather than making this universal,
    but shmat is probably best left in the dustbin of history.
CheriABI: mostly disallow post-fork sharing via minherit().  Developers
should use mmap and MAP_SHARED instead.  Do allow no-op reqests and
sharing of mappings that either have no capabilities or where objects
have the OBJ_SHARECAP flag.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants