Oh no! These counters are not cache-aligned. That means a large amount of false sharing, which will cause a big perf decrease on multicore.
We essentially want each thread's atomics to be 128 bytes apart (on 128-byte-aligned boundaries). The typical way to accomplish this without blowing up cache and memory consumption is to put the counters in a struct and align the struct.
struct event_counts {
    _Atomic volatile sig_atomic_t software_interrupt_counts_sigalarm_reply;
    _Atomic volatile sig_atomic_t software_interrupt_counts_kernel;
    /* ... */
} __attribute__((aligned(128)));

/* One struct per worker thread; the alignment also pads sizeof to a multiple of 128. */
struct event_counts software_interrupt_counts[runtime_worker_threads_count];
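To sanity-check the layout, something like the sketch below could be compiled alongside the runtime. It is only an illustration under stated assumptions: `CACHE_PAD`, `WORKER_COUNT`, the shortened field names, and the helpers `record_kernel_event` and `counts_are_isolated` are hypothetical and not part of the existing code, and it relies on C11 plus the GCC/Clang `aligned` attribute.

```c
#include <assert.h>
#include <signal.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_PAD    128 /* padding size taken from the issue text */
#define WORKER_COUNT 4   /* hypothetical fixed worker count for this sketch */

/* Per-worker counters, grouped so one worker's counters share a block
 * and different workers never do. Field names shortened for the sketch. */
struct event_counts {
    _Atomic volatile sig_atomic_t sigalrm_reply;
    _Atomic volatile sig_atomic_t kernel;
} __attribute__((aligned(CACHE_PAD)));

/* Compile-time checks: the struct is aligned to CACHE_PAD and its size
 * is rounded up to a multiple of CACHE_PAD, so array elements stay aligned. */
static_assert(alignof(struct event_counts) == CACHE_PAD,
              "event_counts must be aligned to CACHE_PAD");
static_assert(sizeof(struct event_counts) % CACHE_PAD == 0,
              "event_counts size must be a multiple of CACHE_PAD");

static struct event_counts counts[WORKER_COUNT];

/* Hypothetical helper: bump one worker's kernel counter. */
static void record_kernel_event(size_t worker)
{
    atomic_fetch_add_explicit(&counts[worker].kernel, 1, memory_order_relaxed);
}

/* Runtime check: returns 1 if every element starts on a CACHE_PAD boundary
 * and consecutive elements are at least CACHE_PAD bytes apart, else 0. */
static int counts_are_isolated(void)
{
    for (size_t i = 0; i < WORKER_COUNT; i++) {
        if ((uintptr_t)&counts[i] % CACHE_PAD != 0)
            return 0;
        if (i > 0) {
            const char *prev = (const char *)&counts[i - 1];
            const char *curr = (const char *)&counts[i];
            if (curr - prev < CACHE_PAD)
                return 0;
        }
    }
    return 1;
}
```

Grouping the counters per worker costs one 128-byte block per thread rather than one per counter, which is the memory trade-off described above. The `static_assert` lines would fail the build if someone later changed the attribute or the padding constant inconsistently.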