[countersyncd]: Fix netlink fd leakage and deadlock issue #4043
base: master
Signed-off-by: Ze Gan <[email protected]>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Pull request overview
This PR replaces the neli netlink library with genetlink and netlink-* libraries to fix deadlock and file descriptor leakage issues in the countersyncd component. The changes involve significant refactoring of socket management, introduction of a shared utilities module, and updates to the reconnection strategy.
Key Changes
- Migrated from `neli` to `genetlink`, `netlink-packet-core`, `netlink-packet-generic`, and `netlink-sys` libraries
- Added new `netlink_utils` module for shared family/group resolution functionality
- Introduced `SoftReconnect` command that performs health-check-based reconnection vs forcing immediate reconnection
- Updated socket configuration path and increased health timeout from 10s to 60s
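The difference between a forced and a health-check-based reconnect can be sketched as follows. This is a minimal illustration, not the code from this PR: only `NetlinkCommand`, `SoftReconnect`, and the 60s timeout come from the overview above. The `Reconnect` variant name, the `Connection` type, and the definition of "healthy" as "data was received within the timeout" are assumptions made for the example.

```rust
use std::time::{Duration, Instant};

/// Commands accepted by the netlink actor. `SoftReconnect` is named in the
/// PR; `Reconnect` is an assumed name for the forced variant.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NetlinkCommand {
    Reconnect,
    SoftReconnect,
}

/// Hypothetical connection state used only for this sketch.
pub struct Connection {
    last_data: Instant,
    health_timeout: Duration,
    pub reconnects: u32,
}

impl Connection {
    pub fn new(now: Instant, health_timeout: Duration) -> Self {
        Connection { last_data: now, health_timeout, reconnects: 0 }
    }

    /// Record that data arrived on the socket.
    pub fn on_data(&mut self, now: Instant) {
        self.last_data = now;
    }

    /// Assumption: the connection is healthy if data arrived within the timeout.
    pub fn is_healthy(&self, now: Instant) -> bool {
        now.saturating_duration_since(self.last_data) < self.health_timeout
    }

    /// Returns true if a reconnect was performed.
    pub fn handle(&mut self, cmd: NetlinkCommand, now: Instant) -> bool {
        let reconnect = match cmd {
            // Forced: always tear down and reopen the socket.
            NetlinkCommand::Reconnect => true,
            // Soft: reconnect only when the health check fails.
            NetlinkCommand::SoftReconnect => !self.is_healthy(now),
        };
        if reconnect {
            // A real actor would drop the old socket and open a new one here.
            self.reconnects += 1;
            self.last_data = now;
        }
        reconnect
    }
}
```

With a 60s timeout, a `SoftReconnect` issued 10s after the last message is a no-op, while one issued 61s after it triggers a reconnect.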
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| crates/countersyncd/src/message/netlink.rs | Added new SoftReconnect command variant to the NetlinkCommand enum |
| crates/countersyncd/src/actor/netlink_utils.rs | New shared utilities module for netlink family/group resolution, avoiding code duplication |
| crates/countersyncd/src/actor/mod.rs | Registered the new netlink_utils module |
| crates/countersyncd/src/actor/data_netlink.rs | Replaced neli socket types with netlink-sys Socket, updated connection logic, added health-check reconnection, changed to raw recv syscalls |
| crates/countersyncd/src/actor/control_netlink.rs | Replaced neli types with netlink-sys, updated resolver to use shared utilities module |
| crates/countersyncd/Cargo.toml | Replaced neli dependency with genetlink and netlink-* crates, added libc dependency |
| Cargo.toml | Updated workspace dependencies to include new netlink libraries, upgraded binrw to 0.15.0 |
| Cargo.lock | Reflected all dependency changes including removal of neli and addition of netlink libraries |
Comments suppressed due to low confidence (1)
crates/countersyncd/src/actor/data_netlink.rs:691
- The condition on line 689 checks `if size == 0`, but this check is unreachable because it is inside a branch guarded by `size if size > 0`. This dead code should be removed, as it can never execute and creates confusion about the control flow.
if size == 0 {
return Err(io::Error::new(
io::ErrorKind::UnexpectedEof,
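The pattern the comment describes can be illustrated with a small standalone sketch. It is not the code from `data_netlink.rs`: the function names, the error messages, and the handling of the zero and negative cases are hypothetical, chosen only to show why a zero check inside a `size if size > 0` arm never fires and what the cleaned-up shape might look like.

```rust
use std::io;

/// Shape described in the review comment: the zero check sits inside an arm
/// that is only entered when `size > 0`, so it can never be true.
pub fn with_dead_check(n: isize) -> io::Result<usize> {
    match n {
        size if size > 0 => {
            if size == 0 {
                // Unreachable: the guard above already excludes zero.
                return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "socket closed"));
            }
            Ok(size as usize)
        }
        0 => Err(io::Error::new(io::ErrorKind::UnexpectedEof, "socket closed")),
        _ => Err(io::Error::new(io::ErrorKind::Other, "recv failed")),
    }
}

/// Same behavior with the dead branch removed; each case has exactly one arm.
pub fn classify_recv(n: isize) -> io::Result<usize> {
    match n {
        size if size > 0 => Ok(size as usize),
        0 => Err(io::Error::new(io::ErrorKind::UnexpectedEof, "socket closed")),
        _ => Err(io::Error::new(io::ErrorKind::Other, "recv failed")),
    }
}
```

Both functions return the same result for every input, which is why the inner check can be dropped without changing behavior.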
Pull request overview
Copilot reviewed 7 out of 8 changed files in this pull request and generated 14 comments.
Signed-off-by: Ze Gan <[email protected]>
Force-pushed from 4415981 to f89a59d
What I did
Replaced the neli netlink library with genetlink and the netlink-* crates.
Why I did it
The original neli-based implementation had two issues: a deadlock and file descriptor (fd) leakage.
How I verified it
Verified with unit tests and on a real platform.
Details if related