add support to handle ha notifications #3659
base: master
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
orchagent/notifications.cpp
Outdated
@@ -52,6 +52,18 @@ void on_twamp_session_event(uint32_t count, sai_twamp_session_event_notification
// which causes concurrency access to the DB
}

void on_ha_set_event(uint32_t count, sai_ha_set_event_data_t *data)
{
// don't use this event handler, because it runs by libsairedis in a separate thread
This won't work because the DPU uses ZMQ. Please check PR #3547 for what has to be done.
Hi @vivekrnv, thanks for bringing it up. Can you share the doc? Per the HA detailed design, no change is mentioned for the communication mode between syncd and orchagent.
There was a change done by @prabhataravind to move the communication channel to ZMQ. PFA: sonic-net/sonic-buildimage#21940
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
orchagent/dash/dashhaorch.cpp
Outdated
SWSS_LOG_NOTICE("DPU is pending on role activation for %s", key.c_str());
}

fvs.push_back({"ha_state", sai_ha_state_name.at(ha_scope_event[i].ha_state)});
I don't see ha_state in hld. Is it a new addition?
You are right. Will remove it.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
… state_activated_rq
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Notification changes lgtm
Hi @vivekrnv - appreciate your comments! Can you help review again?
… state_activated_rq
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
@zjswhhh, most PR checker issues are fixed as of today. Do you know whether the test failures are caused by this PR? Please investigate.
… state_activated_rq
/azpw
/azp run Azure.sonic-swss
Commenter does not have sufficient privileges for PR 3659 in repo sonic-net/sonic-swss
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
orchagent/dash/dashhaorch.cpp
Outdated
std::string data;
std::vector<swss::FieldValueTuple> values;

consumer.pop(op, data, values);
You can use the pops API to process multiple notifications at once.
Also, make sure doTask(NotificationConsumer &consumer) is called during the empty doTask(). E.g.:
sonic-swss/orchagent/orchdaemon.cpp
Line 882 in 90fcead
o->doTask();
void DashHaOrch::doTask()
{
    // Finish pending tasks
    doTask(*m_haSetNotificationConsumer);
    doTask(*m_haScopeNotificationConsumer);
}
Otherwise we might miss some notifications.
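To illustrate the batching idea in the comment above, here is a minimal sketch that uses toy stand-ins rather than the real swss-common classes. `ToyNotification`, `ToyConsumer`, and `drainAll` are hypothetical names invented for this example, and the op strings in the usage note are made up; only the contrast between a single-item `pop` and a batch `pops` is taken from the review.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a notification; not the swss-common type.
struct ToyNotification
{
    std::string op;
    std::string data;
};

// Hypothetical queue offering a single-item pop() and a batch pops().
class ToyConsumer
{
public:
    void push(ToyNotification n) { m_queue.push_back(std::move(n)); }

    // Removes one pending notification; returns false when the queue is empty.
    bool pop(ToyNotification &out)
    {
        if (m_queue.empty())
        {
            return false;
        }
        out = std::move(m_queue.front());
        m_queue.pop_front();
        return true;
    }

    // Moves every pending notification into 'out', preserving arrival order.
    void pops(std::deque<ToyNotification> &out)
    {
        out.clear();
        out.swap(m_queue);
    }

    bool empty() const { return m_queue.empty(); }

private:
    std::deque<ToyNotification> m_queue;
};

// Handles everything that is pending in one call and returns the ops seen.
std::vector<std::string> drainAll(ToyConsumer &consumer)
{
    std::deque<ToyNotification> batch;
    consumer.pops(batch);

    std::vector<std::string> handled;
    for (const auto &n : batch)
    {
        handled.push_back(n.op);
    }
    return handled;
}
```

With two notifications pushed (say, ops "ha_set_event" and "ha_scope_event"), a single `drainAll(consumer)` call returns both ops in order and leaves the queue empty, whereas a handler built on `pop` would need to be invoked once per notification.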
> You can use pops api and process multiple notifications at once.
Updated.
> Also, Make sure doTask(NotificationConsumer &consumer) is called during the empty doTask()
That's a great tip! I added the executors to m_consumerMap in the DashHaOrch constructor:
sonic-swss/orchagent/dash/dashhaorch.cpp
Lines 72 to 73 in eec055a
Orch::addExecutor(haSetNotificatier);
Orch::addExecutor(haScopeNotificatier);
So it should have been taken care of:
Lines 664 to 670 in db7d939
void Orch::doTask()
{
    for (auto &it : m_consumerMap)
    {
        it.second->drain();
    }
}
Don't think I need to override the method in DashHaOrch?
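The reasoning in this reply can be sketched with toy classes: if the base class drains every executor registered in its map, a derived class that only registers its executors inherits that behaviour without overriding `doTask()`. `ToyExecutor`, `ToyOrch`, and `ToyHaOrch` are hypothetical names for illustration, and the string keys are an assumption of this sketch; the real `Orch` and executor types are not reproduced here.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical executor: tracks pending items and clears them on drain().
class ToyExecutor
{
public:
    explicit ToyExecutor(int pending) : m_pending(pending) {}

    void drain()
    {
        m_drained += m_pending;
        m_pending = 0;
    }

    int pending() const { return m_pending; }
    int drained() const { return m_drained; }

private:
    int m_pending;
    int m_drained = 0;
};

// Hypothetical base class that owns the executor map.
class ToyOrch
{
public:
    virtual ~ToyOrch() = default;

    void addExecutor(const std::string &name, std::shared_ptr<ToyExecutor> e)
    {
        m_consumerMap[name] = std::move(e);
    }

    // Mirrors the loop quoted above: drain every registered executor.
    virtual void doTask()
    {
        for (auto &it : m_consumerMap)
        {
            it.second->drain();
        }
    }

protected:
    std::map<std::string, std::shared_ptr<ToyExecutor>> m_consumerMap;
};

// Derived class registers its executors and does not override doTask().
class ToyHaOrch : public ToyOrch
{
public:
    ToyHaOrch(std::shared_ptr<ToyExecutor> haSet,
              std::shared_ptr<ToyExecutor> haScope)
    {
        addExecutor("ha_set", std::move(haSet));
        addExecutor("ha_scope", std::move(haScope));
    }
};
```

Calling `doTask()` on a `ToyHaOrch` drains both registered executors through the inherited loop, which is the behaviour the reply relies on.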
Ah, I see. We should be good then.
… state_activated_rq
6ff9ebe to eec055a
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
@@ -25,6 +57,125 @@ DashHaOrch::DashHaOrch(DBConnector *db, const vector<string> &tables, DashOrch *

dash_ha_set_result_table_ = make_unique<Table>(app_state_db, APP_DASH_HA_SET_TABLE_NAME);
dash_ha_scope_result_table_ = make_unique<Table>(app_state_db, APP_DASH_HA_SCOPE_TABLE_NAME);

m_dpuStateDbConnector = make_unique<DBConnector>("DPU_STATE_DB", 0);
VS tests are failing, looks like we will need swsscommon changes: https://github.com/sonic-net/sonic-swss-common/blob/master/common/database_config.json
on it now.
… state_activated_rq
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
What I did
Handle ha_set and ha_scope SAI notifications.
HLD:
https://github.com/sonic-net/SONiC/blob/master/doc/smart-switch/high-availability/smart-switch-ha-detailed-design.md
https://github.com/sonic-net/SONiC/blob/master/doc/smart-switch/high-availability/smart-switch-ha-dpu-scope-dpu-driven-setup.md#72-ha-role-activation
Per offline discussion with @r12f, haorch will directly write DPU_STATE_DB instead of using zmq for
sign-off: Jing Zhang [email protected]
Why I did it
How I verified it
UTs.
Details if related