-
Notifications
You must be signed in to change notification settings - Fork 25
issue: 4219010 XLIO support for kernel 6.10 #278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: vNext
Are you sure you want to change the base?
Conversation
e2450d0
to
e54e7f3
Compare
e54e7f3
to
224e851
Compare
bot:retest |
[question] Will this implementation handle multipath routes? AFAIR, in the previous manual parsing didn't support the RTA_MULTIPATH attribute and the routing was broken in this case. Will the libnl implementation use the 1st nexthop regardless of the type? |
bot:retest |
1 similar comment
bot:retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
review is not finished yet
} | ||
|
||
delete[] buf; | ||
__log_dbg("Done"); | ||
parse_tbl(cache_state); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the cache needs to be destroyed at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
#include <unistd.h> // getpid() | ||
|
||
#include <netlink/route/route.h> | ||
#include <netlink/route/rule.h> | ||
#include <netlink/route/link.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like link.h is unused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed it
@@ -152,113 +152,117 @@ void route_table_mgr::update_tbl(nl_data_t data_type) | |||
netlink_socket_mgr::update_tbl(data_type); | |||
} | |||
|
|||
void route_table_mgr::parse_entry(struct nlmsghdr *nl_header) | |||
void route_table_mgr::parse_entry(struct nl_object *nl_obj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add netlink/route/route.h and netlink/netlink.h includes.
Currently the route.h is included implicitly from netlink_event.h which can be rewritten in the future to generalize the netlink related code. Besides, we can do forward declarations for few netlink structures and avoid netlink in the XLIO headers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done :)
// p_val : custom object that contain parsed rule data. | ||
// return true if its not related to local or default table, false otherwise. | ||
void rule_table_mgr::parse_entry(struct nlmsghdr *nl_header) | ||
void rule_table_mgr::parse_entry(struct nl_object *nl_obj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add netlink/route/rule.h and netlink/netlink.h includes. See rationale at route_table_mgr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done :)
tests/gtest/tcp/tcp_send.cc
Outdated
@@ -184,6 +184,8 @@ TEST_F(tcp_send, null_iov_elements) | |||
vec[1].iov_len = 1000U; | |||
rcs = sendmsg(fd, &msg, 0); | |||
EXPECT_LE_ERRNO(rcs, 0); | |||
EXPECT_EQ(0, rcs); | |||
EXPECT_EQ(14, errno); // TODO - remove after check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it be removed as TODO says?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup - removed it
src/core/proto/rule_table_mgr.cpp
Outdated
break; | ||
// FRA_PRIORITY: Rule Priority | ||
uint32_t priority = rtnl_rule_get_prio(rule); | ||
if (priority) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is 0 a valid priority?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes - fixed it
src/core/proto/rule_table_mgr.cpp
Outdated
if (table_id) { | ||
val.set_table_id(table_id); | ||
} | ||
|
||
#if DEFINED_FRA_OIFNAME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this ifdef and remove its definition in configure.ac. All libnl-3 versions provide rtnl_rule_get_oif().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done :)
76995b1
to
b4415b6
Compare
bot:retest |
1 similar comment
bot:retest |
} | ||
|
||
// Gateway Address (Next Hop) | ||
struct rtnl_nexthop *nh = rtnl_route_nexthop_n(route, 0); // Assuming the first nexthop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since libnl treats non-multipath nexthop as multipath internally, can we assume that that the multipath loop below handles all the cases and current code block is redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
relying on this internal behavior may not be robust in the long term.
The internal implementation details of libnl could change in future updates, which might break our code if we assume this behavior.
To ensure the robustness and maintainability of our code, it is safer to explicitly handle non-multipath nexthop cases separately. This approach will protect us against any potential changes in libnl's internal handling of nexthops.
src/core/proto/rule_table_mgr.h
Outdated
#include "core/proto/netlink_socket_mgr.h" | ||
#include "route_rule_table_key.h" | ||
#include "rule_entry.h" | ||
#include "rule_val.h" | ||
#include <netlink/route/rule.h> | ||
#include <netlink/route/link.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like unused header
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed :)
b4415b6
to
ff8f15e
Compare
bot:retest |
ff8f15e
to
f414448
Compare
/review |
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
bot:retest |
3 similar comments
bot:retest |
bot:retest |
bot:retest |
bot:retest |
3 similar comments
bot:retest |
bot:retest |
bot:retest |
f414448
to
8b4ad1a
Compare
Kernel 6.10 netlink has breaked XLIO functionality. Transitioned to libnl - an abstraction that wraps netlink. This both solves the issue and makes us more robust. Signed-off-by: Tomer Cabouly <[email protected]>
8b4ad1a
to
3e71874
Compare
bot:retest |
1 similar comment
bot:retest |
Description
Kernel 6.10 netlink has breaked XLIO functionality. Transitioned to libnl - an abstraction that wraps netlink.
This both solves the issue and makes us more robust.
What
Build routing and rules table using libnl instead of netlink.
Why ?
solves https://redmine.mellanox.com/issues/4219010
How ?
Tested on kernel 5.21 and kernel 6.10
Change type
What kind of change does this PR introduce?
Check list