Skip to content

Conversation

@Ndancejic
Copy link
Contributor

Today we allow neighbor IPs which are outside a mux cable's subnet to be learned on any mux port. However, internal logic prevents successful processing of these IPs in certain scenarios. This removes the check that compares the ip to the Vlan subnet so that it will be processed as a regular mux neighbor.

What I did
Removed subnet checks from muxorch logic and created unit tests to validate the functionality when a neighbor is moving between 2 vlan subnets

Why I did it
mac move between vlan subnets causes dualtor_neighbor_check inconsistency

How I verified it
created a unit test to check functionality

Details if related
ADO: #35945298

Today we allow neighbor IPs which are outside a mux cable's subnet to be
learned on any mux port. However, internal logic prevents successful
processing of these IPs in certain scenarios. This removes the check
that compares the ip to the Vlan subnet so that it will be processed as
a regular mux neighbor.

Signed-off-by: Nikola Dancejic <[email protected]>
@Ndancejic Ndancejic requested a review from prsunny as a code owner November 20, 2025 23:46
Copilot AI review requested due to automatic review settings November 20, 2025 23:46
@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).


def check_tunnel_route_in_app_db(self, dvs, destinations, expected=True):
appdb = dvs.get_app_db()
def test_mac_move_vlan(

Check notice

Code scanning / CodeQL

Unused local variable Note test

Variable test_mac_move_vlan is not used.
Copilot finished reviewing on behalf of Ndancejic November 20, 2025 23:50
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request aims to remove subnet IP checks from muxorch logic to allow neighbors outside a mux cable's subnet to be processed correctly, fixing MAC move issues between VLAN subnets. However, the PR contains several critical bugs that prevent it from functioning correctly.

Key intended changes:

  • Remove isIpInSubnet() method and related subnet validation logic from C++ muxorch
  • Add comprehensive test coverage for MAC moves between different VLAN subnets
  • Update helper methods to support operations on multiple VLANs

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.

File Description
orchagent/muxorch.cpp Removes subnet IP validation logic, but introduces critical bugs in findMuxCableInSubnet() and updateNeighbor() that break multi-cable scenarios
tests/test_mux.py Adds new test for cross-VLAN MAC moves and updates helper methods, but contains structural issues with duplicate code, missing constants, and missing helper methods

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 1358 to 1362
for (auto it = mux_cable_tb_.begin(); it != mux_cable_tb_.end(); it++)
{
MuxCable* ptr = it->second.get();
if (ptr->isIpInSubnet(ip))
{
return ptr;
}
MuxCable* ptr = it->second.get();
return ptr;
}
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The findMuxCableInSubnet function now unconditionally returns the first mux cable in the table, regardless of which IP is being queried. This will cause incorrect behavior when multiple mux cables exist. The function should either:

  1. Return nullptr if the subnet check logic is truly being removed, or
  2. Implement an alternative method to find the correct mux cable (e.g., by port lookup from FDB).

Currently, this will match any IP to the first available mux cable, which is incorrect logic.

Copilot uses AI. Check for mistakes.
else:
self.check_tunnel_route_in_app_db(dvs, [prefix], expected=True)
for _ in range(10):
if not self.neigh_exists_in_asic(asicdb, ip_addr, vlan_id):
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method self.neigh_exists_in_asic() is called but is not defined in the TestMuxTunnelBase class. This method needs to be implemented to check if a neighbor exists in the ASIC database for the given VLAN.

Copilot uses AI. Check for mistakes.
Comment on lines +166 to +261
def test_mac_move_vlan(
self, dvs, setup, setup_vlan, setup_mux_cable, setup_peer_switch,
setup_tunnel, setup_mac_move_vlan_test, neighbor_cleanup, testlog
):
"""Validate neighbor move behavior between Vlan1000 and Vlan2000 mux ports"""

if expected:
appdb.wait_for_matching_keys(self.APP_TUNNEL_ROUTE_TABLE_NAME, destinations)
else:
appdb.wait_for_deleted_keys(self.APP_TUNNEL_ROUTE_TABLE_NAME, destinations)

def check_neigh_in_asic_db(self, asicdb, ip, expected=True):
rif_oid = self.get_vlan_rif_oid(asicdb)
switch_oid = self.get_switch_oid(asicdb)
neigh_key_map = {
"ip": ip,
"rif": rif_oid,
"switch_id": switch_oid
}
expected_key = json.dumps(neigh_key_map, sort_keys=True, separators=(',', ':'))
appdb_conn = swsscommon.DBConnector(swsscommon.APPL_DB, dvs.redis_sock, 0)
asicdb = dvs.get_asic_db()

if expected:
nbr_keys = asicdb.wait_for_matching_keys(self.ASIC_NEIGH_TABLE, [expected_key])
mac_hyphen = "00-00-00-00-aa-aa"
mac_colon = mac_hyphen.replace('-', ':')
vlan1000_id = int(self.VLAN_1000_ID)
vlan2000_id = int(self.VLAN_2000_ID)

for key in nbr_keys:
if ip in key:
return key
ip_test_data = [
("192.168.0.250", self.IPV4_MASK),
("fc02:1000::250", self.IPV6_MASK)
]

else:
def verify_neighbor(ip_addr, mask, vlan_id, mux_state):
prefix = ip_addr + mask
if mux_state == ACTIVE:
self.check_neigh_in_asic_db(asicdb, ip_addr, vlan_id=vlan_id)
self.check_tunnel_route_in_app_db(dvs, [prefix], expected=False)
else:
self.check_tunnel_route_in_app_db(dvs, [prefix], expected=True)
for _ in range(10):
if not self.neigh_exists_in_asic(asicdb, ip_addr, vlan_id):
break
time.sleep(0.2)
else:
pytest.fail(f"Neighbor {ip_addr} unexpectedly present in ASIC for VLAN {vlan_id}")

def cleanup_entries(ip_addr, mask):
self.del_neighbor(dvs, ip_addr, vlan=self.VLAN_1000)
self.del_neighbor(dvs, ip_addr, vlan=self.VLAN_2000)
self.del_fdb(dvs, mac_hyphen)
self.del_fdb(dvs, mac_hyphen, vlan=self.VLAN_2000)
self.check_tunnel_route_in_app_db(dvs, [ip_addr + mask], expected=False)

self.set_mux_state(appdb_conn, "Ethernet0", ACTIVE)
self.wait_for_mux_state(dvs, "Ethernet0", ACTIVE)
self.set_mux_state(appdb_conn, "Ethernet8", ACTIVE)
self.wait_for_mux_state(dvs, "Ethernet8", ACTIVE)

for ip_addr, mask in ip_test_data:
cleanup_entries(ip_addr, mask)

for state_eth4, state_eth12 in itertools.product([ACTIVE, STANDBY], repeat=2):
self.set_mux_state(appdb_conn, "Ethernet4", state_eth4)
self.wait_for_mux_state(dvs, "Ethernet4", state_eth4)
self.set_mux_state(appdb_conn, "Ethernet12", state_eth12)
self.wait_for_mux_state(dvs, "Ethernet12", state_eth12)

for ip_addr, mask in ip_test_data:
cleanup_entries(ip_addr, mask)

self.add_fdb(dvs, "Ethernet4", mac_hyphen)
self.add_neighbor(dvs, ip_addr, mac_colon)
verify_neighbor(ip_addr, mask, vlan1000_id, state_eth4)

self.del_fdb(dvs, mac_hyphen)
self.del_neighbor(dvs, ip_addr, vlan=self.VLAN_1000)
self.add_fdb(dvs, "Ethernet12", mac_hyphen, vlan=self.VLAN_2000)
self.add_neighbor(dvs, ip_addr, mac_colon, vlan=self.VLAN_2000)
verify_neighbor(ip_addr, mask, vlan2000_id, state_eth12)

self.del_fdb(dvs, mac_hyphen, vlan=self.VLAN_2000)
self.del_neighbor(dvs, ip_addr, vlan=self.VLAN_2000)
self.add_fdb(dvs, "Ethernet4", mac_hyphen)
self.add_neighbor(dvs, ip_addr, mac_colon)
verify_neighbor(ip_addr, mask, vlan1000_id, state_eth4)

toggle_eth4 = [STANDBY, ACTIVE] if state_eth4 == ACTIVE else [ACTIVE, STANDBY]
for new_state in toggle_eth4:
self.set_mux_state(appdb_conn, "Ethernet4", new_state)
self.wait_for_mux_state(dvs, "Ethernet4", new_state)
verify_neighbor(ip_addr, mask, vlan1000_id, new_state)

self.del_fdb(dvs, mac_hyphen)
self.del_neighbor(dvs, ip_addr, vlan=self.VLAN_1000)
self.add_fdb(dvs, "Ethernet12", mac_hyphen, vlan=self.VLAN_2000)
self.add_neighbor(dvs, ip_addr, mac_colon, vlan=self.VLAN_2000)

toggle_eth12 = [STANDBY, ACTIVE] if state_eth12 == ACTIVE else [ACTIVE, STANDBY]
for new_state in toggle_eth12:
self.set_mux_state(appdb_conn, "Ethernet12", new_state)
self.wait_for_mux_state(dvs, "Ethernet12", new_state)
verify_neighbor(ip_addr, mask, vlan2000_id, new_state)

cleanup_entries(ip_addr, mask)

self.set_mux_state(appdb_conn, "Ethernet4", ACTIVE)
self.wait_for_mux_state(dvs, "Ethernet4", ACTIVE)
self.set_mux_state(appdb_conn, "Ethernet12", ACTIVE)
self.wait_for_mux_state(dvs, "Ethernet12", ACTIVE)
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another duplicate definition of test_mac_move_vlan appears at lines 166-261. This is the third occurrence of this test method in the file (lines 59-156, 166-261, and 1813-1916). The test should only be defined once in the appropriate test class (TestMuxTunnel). These duplicate definitions at lines 166-261 should be removed.

Copilot uses AI. Check for mistakes.
Comment on lines 1513 to 1516
MuxCable* ptr = it->second.get();
if (ptr->isIpInSubnet(update.entry.ip_address))
{
ptr->updateNeighbor(update.entry, update.add);
return;
}
ptr->updateNeighbor(update.entry, update.add);
return;
}
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop through all mux cables now updates only the first mux cable and immediately returns, processing only one mux cable regardless of whether it's the correct one for the given neighbor. This breaks the neighbor update logic when multiple mux cables exist.

Either the loop should be removed entirely (if only one cable needs to be updated), or the correct mux cable should be identified before calling updateNeighbor.

Copilot uses AI. Check for mistakes.
Comment on lines +70 to +75
vlan1000_id = int(self.VLAN_1000_ID)
vlan2000_id = int(self.VLAN_2000_ID)

ip_test_data = [
("192.168.0.250", self.IPV4_MASK),
("fc02:1000::250", self.IPV6_MASK)
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constants VLAN_1000_ID, VLAN_2000_ID, IPV4_MASK, IPV6_MASK, VLAN_1000, and VLAN_2000 are referenced but not defined in the TestMuxTunnelBase class. These constants need to be added as class attributes for the test to function correctly.

Copilot uses AI. Check for mistakes.
self.del_fdb(dvs, mac_hyphen, vlan=self.VLAN_2000)
self.check_tunnel_route_in_app_db(dvs, [ip_addr + mask], expected=False)

self.set_mux_state(appdb_conn, "Ethernet0", ACTIVE)
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method self.wait_for_mux_state() is called but doesn't appear to be defined. This method needs to be implemented to wait for the mux state to reach the expected value before continuing with the test.

Copilot uses AI. Check for mistakes.
Comment on lines +59 to +156
def test_mac_move_vlan(
self, dvs, setup, setup_vlan, setup_mux_cable, setup_peer_switch,
setup_tunnel, setup_mac_move_vlan_test, neighbor_cleanup, testlog
):
"""Validate neighbor move behavior between Vlan1000 and Vlan2000 mux ports"""

def set_mux_state(self, appdb, ifname, state_change):
appdb_conn = swsscommon.DBConnector(swsscommon.APPL_DB, dvs.redis_sock, 0)
asicdb = dvs.get_asic_db()

mac_hyphen = "00-00-00-00-aa-aa"
mac_colon = mac_hyphen.replace('-', ':')
vlan1000_id = int(self.VLAN_1000_ID)
vlan2000_id = int(self.VLAN_2000_ID)

ip_test_data = [
("192.168.0.250", self.IPV4_MASK),
("fc02:1000::250", self.IPV6_MASK)
]

def verify_neighbor(ip_addr, mask, vlan_id, mux_state):
prefix = ip_addr + mask
if mux_state == ACTIVE:
self.check_neigh_in_asic_db(asicdb, ip_addr, vlan_id=vlan_id)
self.check_tunnel_route_in_app_db(dvs, [prefix], expected=False)
else:
self.check_tunnel_route_in_app_db(dvs, [prefix], expected=True)
for _ in range(10):
if not self.neigh_exists_in_asic(asicdb, ip_addr, vlan_id):
break
time.sleep(0.2)
else:
pytest.fail(f"Neighbor {ip_addr} unexpectedly present in ASIC for VLAN {vlan_id}")

def cleanup_entries(ip_addr, mask):
self.del_neighbor(dvs, ip_addr, vlan=self.VLAN_1000)
self.del_neighbor(dvs, ip_addr, vlan=self.VLAN_2000)
self.del_fdb(dvs, mac_hyphen)
self.del_fdb(dvs, mac_hyphen, vlan=self.VLAN_2000)
self.check_tunnel_route_in_app_db(dvs, [ip_addr + mask], expected=False)

self.set_mux_state(appdb_conn, "Ethernet0", ACTIVE)
self.wait_for_mux_state(dvs, "Ethernet0", ACTIVE)
self.set_mux_state(appdb_conn, "Ethernet8", ACTIVE)
self.wait_for_mux_state(dvs, "Ethernet8", ACTIVE)

for ip_addr, mask in ip_test_data:
cleanup_entries(ip_addr, mask)

for state_eth4, state_eth12 in itertools.product([ACTIVE, STANDBY], repeat=2):
self.set_mux_state(appdb_conn, "Ethernet4", state_eth4)
self.wait_for_mux_state(dvs, "Ethernet4", state_eth4)
self.set_mux_state(appdb_conn, "Ethernet12", state_eth12)
self.wait_for_mux_state(dvs, "Ethernet12", state_eth12)

for ip_addr, mask in ip_test_data:
# Step 2: learn on Ethernet4 / Vlan1000
self.add_fdb(dvs, "Ethernet4", mac_hyphen)
self.add_neighbor(dvs, ip_addr, mac_colon)
verify_neighbor(ip_addr, mask, vlan1000_id, state_eth4)

# Step 3: move to Ethernet12 / Vlan2000
self.del_fdb(dvs, mac_hyphen)
self.del_neighbor(dvs, ip_addr, vlan=self.VLAN_1000)
self.add_fdb(dvs, "Ethernet12", mac_hyphen, vlan=self.VLAN_2000)
self.add_neighbor(dvs, ip_addr, mac_colon, vlan=self.VLAN_2000)
verify_neighbor(ip_addr, mask, vlan2000_id, state_eth12)

# Step 4: move back to Ethernet4 / Vlan1000
self.del_fdb(dvs, mac_hyphen, vlan=self.VLAN_2000)
self.del_neighbor(dvs, ip_addr, vlan=self.VLAN_2000)
self.add_fdb(dvs, "Ethernet4", mac_hyphen)
self.add_neighbor(dvs, ip_addr, mac_colon)
verify_neighbor(ip_addr, mask, vlan1000_id, state_eth4)

toggle_eth4 = [STANDBY, ACTIVE] if state_eth4 == ACTIVE else [ACTIVE, STANDBY]
for new_state in toggle_eth4:
self.set_mux_state(appdb_conn, "Ethernet4", new_state)
self.wait_for_mux_state(dvs, "Ethernet4", new_state)
verify_neighbor(ip_addr, mask, vlan1000_id, new_state)

# Re-learn on Vlan2000 for Ethernet12 toggling
self.del_fdb(dvs, mac_hyphen)
self.del_neighbor(dvs, ip_addr, vlan=self.VLAN_1000)
self.add_fdb(dvs, "Ethernet12", mac_hyphen, vlan=self.VLAN_2000)
self.add_neighbor(dvs, ip_addr, mac_colon, vlan=self.VLAN_2000)

toggle_eth12 = [STANDBY, ACTIVE] if state_eth12 == ACTIVE else [ACTIVE, STANDBY]
for new_state in toggle_eth12:
self.set_mux_state(appdb_conn, "Ethernet12", new_state)
self.wait_for_mux_state(dvs, "Ethernet12", new_state)
verify_neighbor(ip_addr, mask, vlan2000_id, new_state)

cleanup_entries(ip_addr, mask)

self.set_mux_state(appdb_conn, "Ethernet4", ACTIVE)
self.wait_for_mux_state(dvs, "Ethernet4", ACTIVE)
self.set_mux_state(appdb_conn, "Ethernet12", ACTIVE)
self.wait_for_mux_state(dvs, "Ethernet12", ACTIVE)
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test method test_mac_move_vlan appears at lines 59-156 within the TestMuxTunnelBase class definition. This appears to be incorrectly placed, as looking at the diff structure, lines 59-156 seem to replace what should be class constant definitions. Additionally, the same test is properly defined later at lines 1813-1916 within the TestMuxTunnel class. This duplicate definition at lines 59-156 should be removed, and the proper class constants should be restored here instead.

Copilot uses AI. Check for mistakes.
Comment on lines +1534 to +1542
vlan_name = self.VLAN_2000
port = self.VLAN_2000_MEMBER

try:
config_db.create_entry("VLAN", vlan_name, {"vlanid": self.VLAN_2000_ID})
config_db.create_entry("VLAN_MEMBER", f"{vlan_name}|{port}", {"tagging_mode": "untagged"})
config_db.create_entry("VLAN_INTERFACE", vlan_name, {"NULL": "NULL"})
config_db.create_entry("VLAN_INTERFACE", f"{vlan_name}|{self.VLAN_2000_IPV4}", {"NULL": "NULL"})
config_db.create_entry("VLAN_INTERFACE", f"{vlan_name}|{self.VLAN_2000_IPV6}", {"NULL": "NULL"})
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fixture references self.VLAN_2000, self.VLAN_2000_MEMBER, self.VLAN_2000_ID, self.VLAN_2000_IPV4, and self.VLAN_2000_IPV6 but these constants are not defined in the TestMuxTunnelBase class. These need to be added as class attributes for the fixture to work correctly.

Copilot uses AI. Check for mistakes.

def check_tunnel_route_in_app_db(self, dvs, destinations, expected=True):
appdb = dvs.get_app_db()
def test_mac_move_vlan(
Copy link

Copilot AI Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable test_mac_move_vlan is not used.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants