Hello, I'm currently using UCX 1.81.1. Partway through its run, my application calls fork() to load data in a child process. By that point UCX has already been called multiple times, and the program hangs after forking even with the environment variable UCX_IB_FORK_INIT=y set. Reinitializing UCX didn't resolve the issue (see #4325).
I can reproduce the problem with a modified version of the example code: UCX in the child process does not work properly.
Is there a safe way to make UCX work in the child process (using RDMA over IB)?
Here is the code, modified from /examples/ucp_client_server.c around line 1130:
    /* Client-Server initialization */
    if (server_addr == NULL) {
        /* Server side */
        ret = run_server(ucp_context, ucp_worker, listen_addr, send_recv_type);
    } else {
        /* Client side */
        ret = run_client(ucp_context, server_addr, send_recv_type);
        pid = fork();
        if (pid == 0) {
            // not reinit -> Caught signal 11 (Segmentation fault: address not mapped to object at address 0x55e892795000)
            ret = init_context(&ucp_context1, &ucp_worker1, send_recv_type);
            if (ret != 0) {
                goto err;
            }
            printf("%p, %p", (void *)ucp_context, (void *)ucp_context1);
            // but this will hang
            ret = run_client(ucp_worker1, server_addr, send_recv_type);
        } else {
            waitpid(pid, NULL, 0);
        }
    }
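For comparison, one ordering that might sidestep the problem is to fork before the parent makes any UCX call, so the child never inherits registered memory or transport state and each process creates its own context. Below is a minimal sketch of that ordering only, not a tested UCX fix. It makes no UCX calls: `fake_ctx_t`, `init_fake_context()` and `fork_then_init()` are hypothetical names, with `init_fake_context()` standing in for `init_context()` from the example. It also assumes the application can be restructured so that the fork happens this early.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a UCX context: records which process created it. */
typedef struct {
    pid_t owner;
} fake_ctx_t;

/* Hypothetical stand-in for init_context(); a real program would set up UCX here. */
static int init_fake_context(fake_ctx_t *ctx)
{
    ctx->owner = getpid();
    return 0;
}

/*
 * Fork first, then let each process create its own context.
 * Returns 0 if both parent and child ended up with a context they
 * created themselves, -1 otherwise.
 */
static int fork_then_init(void)
{
    int fds[2];
    int status;
    pid_t pid;
    pid_t reported = -1;
    ssize_t n;
    fake_ctx_t ctx;

    if (pipe(fds) != 0) {
        return -1;
    }

    pid = fork(); /* nothing communication-related has been initialized yet */
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: create a private context and report its owner to the parent. */
        close(fds[0]);
        if (init_fake_context(&ctx) != 0) {
            _exit(1);
        }
        n = write(fds[1], &ctx.owner, sizeof(ctx.owner));
        close(fds[1]);
        _exit(n == (ssize_t)sizeof(ctx.owner) ? 0 : 1);
    }

    /* Parent: initialize its own context only after the fork. */
    close(fds[1]);
    if (init_fake_context(&ctx) != 0) {
        close(fds[0]);
        waitpid(pid, NULL, 0);
        return -1;
    }

    n = read(fds[0], &reported, sizeof(reported));
    close(fds[0]);

    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    if (n != (ssize_t)sizeof(reported) || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0) {
        return -1;
    }

    /* Each context must belong to the process that created it. */
    return (reported == pid && ctx.owner == getpid()) ? 0 : -1;
}
```

Calling `fork_then_init()` from `main()` before any other setup returns 0 when both processes hold a context they created themselves. In my real application the fork happens after UCX is already in use, which is the case this sketch does not cover.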