
Bug: swss/orchagent "config reload" failed on armhf #24766

@yanmarkman

Is it platform specific

marvell

Importance or Severity

Critical

Description of the bug

Platform: marvell-prestera, ARCH: armhf, Board: Nokia-7215
Branch: TRIXIE (not branch 202505)

The board has a weak CPU.
"config reload" fails under high CPU load (for example, during the PTF test test_po_cleanup.py).

| Case | CPU load | Time to ready | Success rate |
|---|---|---|---|
| IDLE | 55% | 350 sec | 100% success |
| 2 "yes" processes | full (100%) | 870 sec | 5% success |
| 2 "yes" processes + workaround | full (100%) | 550 sec | 100% success |

Investigation (by trial):
"config reload" restarts the sonic.target superset service.
The start chain (dependency sequence) of that superset is asynchronous.
It works perfectly on fast CPUs, but on the weak-CPU armhf Nokia-7215
the asynchronous timing between swss/syncd and the other services
started in the set goes wrong.

While debugging, a workaround was found:

  • Stop the swss and syncd services immediately after "restart sonic.target"
  • Restart swss
Steps to Reproduce

1). Run the PTF test test_po_cleanup.py on the Nokia-7215 (armhf) board with TRIXIE

2). Manually, step by step:
taskset 0x1 yes > /dev/null 2>&1 &
taskset 0x2 yes > /dev/null 2>&1 &
config reload -y

(wait 15 min, then check:)
docker exec swss supervisorctl status orchagent
show interfaces status
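
Instead of a fixed 15 min wait, the check could be polled. The sketch below is one possible way to do that, assuming `supervisorctl status` prints `RUNNING` for a healthy orchagent; the function name `wait_for_orchagent` and the `STATUS_CMD` override variable are hypothetical and exist only for this example.

```shell
#!/usr/bin/env bash
# Poll orchagent's supervisor state until it reports RUNNING or a timeout expires.
# STATUS_CMD is an assumed override hook for dry runs; the default is the
# status command from the reproduction steps above.
wait_for_orchagent() {
    local timeout="${1:-900}"    # seconds; 900 matches the 15 min wait
    local interval="${2:-10}"    # seconds between polls
    local status_cmd="${STATUS_CMD:-docker exec swss supervisorctl status orchagent}"
    local elapsed=0

    while [ "$elapsed" -lt "$timeout" ]; do
        # Unquoted on purpose: the command string is split into words.
        if $status_cmd 2>/dev/null | grep -q 'RUNNING'; then
            echo "orchagent RUNNING after ~${elapsed}s"
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done

    echo "orchagent not RUNNING after ${timeout}s" >&2
    return 1
}
```

For example, `wait_for_orchagent 900 10 && show interfaces status` would list the interfaces only once orchagent is up, and return non-zero if it never reaches RUNNING.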

Actual Behavior and Expected Behavior

Relevant log output

Output of show version, show techsupport

Attach files (if any)

No response
