There is a transparency violation in DynamoRIO's handling of SYS_futex that affects threads requeued to a different futex var.
The FUTEX_CMP_REQUEUE mode of the futex syscall allows a thread to be requeued to a different futex var than the one it was originally waiting at. However, if such a thread is interrupted by a signal whose handler was registered with SA_RESTART, the restarted futex syscall resumes waiting at the original futex var, rather than the one the thread was requeued to. This is true of native execution, as demonstrated by the small test program below.
This has transparency implications for DynamoRIO's own signals, particularly the detach signal (also shown by the test added in #7032). When DR detaches, threads that were requeued to a different futex var go back to waiting at the futex var named in their original futex syscall.
Native behavior can be observed with the following program, with and without -DWITHOUT_FUTEX_INTERRUPTION.
```
#include <linux/futex.h>
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

#include <cassert>
#include <iostream>
#include <string>

/* The futex the child waits at initially. */
static uint32_t child_futex_var = 0xf00d;
/* The futex the child is transferred to using FUTEX_CMP_REQUEUE. */
static uint32_t child_futex_var_other = 0x8bad;
/* The futex the parent waits at for the child to complete signal handling. */
static uint32_t parent_futex_var = 0xdead;

static void *child_futex_wait(void *) {
    std::cerr << "Child " << gettid() << " going to wait at futex\n";
    long res = syscall(SYS_futex, &child_futex_var, FUTEX_WAIT, /*#val=*/0xf00d,
                       /*timeout=*/nullptr, /*uaddr2=*/nullptr, /*val3=*/0);
    assert(res == 0);
    std::cerr << "Child " << gettid() << " released from futex\n";
    return NULL;
}

static void parent_futex_wake() {
    std::cerr << "Parent going to wake child from futex\n";
#ifdef WITHOUT_FUTEX_INTERRUPTION
    uint32_t *futex = &child_futex_var_other;
#else
    /* One may expect the child to be waiting at the child_futex_var_other, but
     * it turns out that the child is actually waiting at child_futex_var.
     * This is because when the SIGUSR1 sent to the child interrupts the child's
     * futex wait, the child restarts the futex wait at the original futex var,
     * not the one it was transferred to by the parent's FUTEX_CMP_REQUEUE.
     */
    uint32_t *futex = &child_futex_var;
#endif
    long res = syscall(SYS_futex, futex, FUTEX_WAKE, /*#wakeup=*/1,
                       /*timeout=*/nullptr, /*uaddr2=*/nullptr, /*val3=*/0);
    assert(res == 1);
}

static void parent_futex_reque() {
    long res;
    do {
        /* Repeat until the child is surely waiting at the futex. We'll know this
         * when the following call returns a 1, which means the child was
         * transferred to child_futex_var_other.
         */
        res = syscall(
            SYS_futex, &child_futex_var, FUTEX_CMP_REQUEUE, /*#wakeup_max=*/0,
            /*#requeue_max=*/1, /*uaddr2=*/&child_futex_var_other, /*val3=*/0xf00d);
        assert(res == 0 || res == 1);
    } while (res == 0);
}

static void signal_handler(int sig, siginfo_t *siginfo, ucontext_t *ucxt) {
    std::cerr << "Child " << gettid() << " got signal " << sig
              << ". Now back to __original__ futex wait.\n";
    /* Let the parent know that the child is done handling the signal. */
    long res = syscall(SYS_futex, &parent_futex_var, FUTEX_WAKE, /*#wakeup=*/1,
                       /*timeout=*/nullptr, /*uaddr2=*/nullptr, /*val3=*/0);
    assert(res == 1);
}

int main() {
    /* Register the signal handler. */
    int rc;
    struct sigaction act;
    act.sa_sigaction = (void (*)(int, siginfo_t *, void *))signal_handler;
    rc = sigfillset(&act.sa_mask); /* Block all signals within handler */
    assert(rc == 0);
    /* Without the SA_RESTART, an interrupted futex simply returns EINTR. */
    act.sa_flags = SA_SIGINFO | SA_RESTART;
    rc = sigaction(SIGUSR1, &act, NULL);
    assert(rc == 0);

    /* Create a child thread. */
    pthread_t child_thread;
    int res = pthread_create(&child_thread, NULL, child_futex_wait, NULL);
    assert(res == 0);

    /* Ensure that the child is waiting at a futex. */
    parent_futex_reque();

#ifdef WITHOUT_FUTEX_INTERRUPTION
    std::cerr << "Parent " << gettid() << " skipping signal send to child\n";
#else
    /* Send a signal to the child to interrupt its futex wait. */
    std::cerr << "Parent sending signal to child to interrupt futex wait\n";
    res = pthread_kill(child_thread, SIGUSR1);
    assert(res == 0);
    /* Wait for the child to handle the signal. */
    res = syscall(SYS_futex, &parent_futex_var, FUTEX_WAIT, /*#val1=*/0xdead,
                  /*timeout=*/nullptr, /*uaddr2=*/nullptr, /*val3=*/0);
    assert(res == 0);
    std::cerr << "Parent " << gettid() << " back after child handled signal\n";
#endif

    /* Wake up the child. */
    parent_futex_wake();
    pthread_join(child_thread, NULL);
    return 0;
}
```
This probably would not affect the correctness of most programs, as they would next wait on the second futex regardless. An app with recurring signals, such as an itimer, would already have to expect this behavior as well.
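As a rough illustration of why such programs tolerate the restart, here is a minimal sketch of a waiter that re-checks the futex word after every return from FUTEX_WAIT. It assumes the waker changes the word before waking or requeueing, as condition-variable-style protocols typically do; under that assumption a restarted wait on the original futex returns EAGAIN at once and the thread moves on. `wait_until_changed` and `run_wait_demo` are hypothetical names for illustration only, not part of DynamoRIO or the test program above.

```cpp
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <thread>

/* Hypothetical helper: block until *word no longer equals expected.
 * Every return from FUTEX_WAIT is treated as a hint to re-check the word:
 * 0 (woken), EAGAIN (word already changed), or EINTR (interrupted by a
 * signal whose handler lacks SA_RESTART).
 */
static void wait_until_changed(uint32_t *word, uint32_t expected) {
    while (__atomic_load_n(word, __ATOMIC_ACQUIRE) == expected) {
        long res = syscall(SYS_futex, word, FUTEX_WAIT, expected,
                           /*timeout=*/nullptr, /*uaddr2=*/nullptr, /*val3=*/0);
        if (res == -1 && errno != EAGAIN && errno != EINTR)
            std::abort(); /* Unexpected futex error. */
    }
}

/* Hypothetical demo: the waker updates the word first, then wakes. */
static uint32_t run_wait_demo() {
    uint32_t word = 0;
    std::thread waiter(wait_until_changed, &word, 0u);
    __atomic_store_n(&word, 1u, __ATOMIC_RELEASE);
    syscall(SYS_futex, &word, FUTEX_WAKE, /*#wakeup=*/1,
            /*timeout=*/nullptr, /*uaddr2=*/nullptr, /*val3=*/0);
    waiter.join();
    return __atomic_load_n(&word, __ATOMIC_ACQUIRE);
}
```

A program that instead relies on which futex the kernel has the thread queued at, as the test program above does, has no such re-check and is the case where the difference becomes visible.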
Adds a test where one of the threads is waiting on a futex when detach
occurs. PT traces for such futex syscalls have been observed to fail in
libipt decode. We also do not want such PT traces because they do not
represent real app behavior, as the syscall was interrupted by DR's
detach signal. #7027 added logic to skip them from the written trace.
This PR adds a regression test. Unfortunately this test still does not
reproduce the original libipt decode issue that was seen on a large app.
Most errors seen were on a modified kernel and only a few on a regular
futex. But it is still useful to add this test that ensures that the
thread-final interrupted syscall is skipped.
This test also uncovers a possible transparency violation seen in the
behavior of an interrupted-and-restarted futex call, where the blocked
thread doesn't remember that it was supposed to wait on a different
futex specified by a later FUTEX_CMP_REQUEUE call than the one specified
by it in the original futex syscall.
Since the new test requires Intel-PT, verified that it passes by running
it manually locally:
```
The following tests passed:
code_api|tool.drcacheoff.burst_syscall_pt_SUDO
The following tests passed:
code_api|tool.drcacheoff.kernel.simple_SUDO
code_api|tool.drcacheoff.kernel.opcode-mix_SUDO
code_api|tool.drcacheoff.kernel.syscall-mix_SUDO
code_api|tool.drcacheoff.kernel.invariant-checker_SUDO
```
Issue: #5505
Issue: #7034