Description
Problem
Multi-threaded workloads with many syscalls stress the VMA subsystem a lot, because almost all syscalls verify their buffers for read/write access using the following functions:
- is_user_memory_readable()
- is_user_memory_writable()
- is_user_string_readable()
- is_user_memory_writable_no_skip()

All these functions call the test_user_memory() helper:
gramine/libos/src/bookkeep/libos_signal.c
Lines 393 to 405 in 4afc550

    /*
     * Tests whether whole range of memory `[addr; addr+size)` is readable, or, if `writable` is true,
     * writable. The intended usage of this function is checking memory pointers passed to system calls.
     * Note that this does not check the accesses to the memory themselves and is only meant to handle
     * invalid syscall arguments (e.g. LTP test suite checks syscall arguments validation).
     */
    static bool test_user_memory(const void* addr, size_t size, bool writable) {
        if (!access_ok(addr, size)) {
            return false;
        }
        return is_in_adjacent_user_vmas(addr, size, writable ? PROT_WRITE : PROT_READ);
    }

This helper in turn calls the is_in_adjacent_user_vmas() function:
gramine/libos/src/bookkeep/libos_vma.c
Lines 1204 to 1219 in 4afc550

    bool is_in_adjacent_user_vmas(const void* addr, size_t length, int prot) {
        uintptr_t begin = (uintptr_t)addr;
        uintptr_t end = begin + length;
        assert(begin <= end);
        struct adj_visitor_ctx ctx = {
            .prot = prot,
            .is_ok = true,
        };
        spinlock_lock(&vma_tree_lock);
        bool is_continuous = _traverse_vmas_in_range(begin, end, adj_visitor, &ctx);
        spinlock_unlock(&vma_tree_lock);
        return is_continuous && ctx.is_ok;
    }

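The important part is the spinlock_lock(&vma_tree_lock) / spinlock_unlock(&vma_tree_lock) pair: every such check takes vma_tree_lock exclusively, even though it only reads the VMA tree. In a multi-threaded app where many threads issue syscalls concurrently, contention on this lock becomes the bottleneck.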
Gramine introduced a workaround to sidestep this bottleneck, via the libos.check_invalid_pointers manifest option; it translates to the g_check_invalid_ptrs variable. However, this cannot be used in all cases:
- Some runtimes like Java rely on being able to check invalid pointers. Thus they cannot set libos.check_invalid_pointers = false; this would lead to Java apps failing.
- The is_user_memory_writable_no_skip() function does not honor the libos.check_invalid_pointers manifest option; this is because in certain situations Gramine really must decide whether the VMA is writable or read-only, see e.g. the ppoll() case which emulates how Linux works.
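
The two limitations above can be pictured with a simplified model. This is only a sketch under assumptions: it assumes the manifest option acts as an early-out in the regular checkers, and every function body and `sketch_`/`_stub` name below is a hypothetical stand-in, not Gramine's actual code. The counter exists only to make the locked lookups visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Models the variable behind libos.check_invalid_pointers. */
static bool g_check_invalid_ptrs = true;

/* Hypothetical counter: how many times the expensive, locked VMA lookup ran. */
static unsigned long g_vma_lookups = 0;

/* Stand-in for a cheap, lock-free range check (here: only address overflow). */
static bool cheap_range_check_stub(const void* addr, size_t size) {
    uintptr_t begin = (uintptr_t)addr;
    return begin + size >= begin;
}

/* Stand-in for the lookup that takes vma_tree_lock in the real code. */
static bool vma_lookup_stub(const void* addr, size_t size, bool writable) {
    uintptr_t begin = (uintptr_t)addr;
    uintptr_t end = begin + size;
    assert(begin <= end); /* guaranteed by the cheap check done by the callers */
    (void)writable;
    g_vma_lookups++;      /* in the real code, this is where contention occurs */
    return true;
}

/* Regular checker (assumed shape): may skip the locked lookup. */
bool sketch_is_user_memory_writable(const void* addr, size_t size) {
    if (!cheap_range_check_stub(addr, size))
        return false;
    if (!g_check_invalid_ptrs)
        return true; /* workaround active: no VMA lookup, no lock */
    return vma_lookup_stub(addr, size, /*writable=*/true);
}

/* "no_skip" checker (assumed shape): always performs the locked lookup. */
bool sketch_is_user_memory_writable_no_skip(const void* addr, size_t size) {
    if (!cheap_range_check_stub(addr, size))
        return false;
    return vma_lookup_stub(addr, size, /*writable=*/true);
}
```

In this model, setting `g_check_invalid_ptrs = false` removes the locked lookup from the regular checker only; the `no_skip` variant still pays for it on every call, and apps that need the check enabled pay for it everywhere.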
Solution
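Use the RW lock that was previously introduced in the Gramine codebase, so that the read-only VMA checks can run concurrently and only VMA modifications take the lock exclusively: https://github.com/gramineproject/gramine/blob/master/libos/include/libos_rwlock.h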
Example usage of this RW lock: f071450
Benchmark results
TODO