[Infra][add/remove topo] Improve vm_topology performance #16230

Open
wants to merge 3 commits into
base: master

@lolyu (Contributor) commented Dec 25, 2024

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

vm_topology builds the testbed connections (veth links, OVS bridges, etc.) on the test server by running Linux commands, which spend most of their time waiting on I/O.

  • vm_topology run time statistics with restart-ptf:
real    18m50.615s
user    0m0.009s
sys     0m0.099s

The user and sys times are negligible next to the real time, so vm_topology is I/O bound. Its run time can therefore be cut substantially by using a thread pool to run the I/O operations in parallel.

Signed-off-by: Longxiang [email protected]

How did you do it?

Introduce a thread pool to vm_topology so that the most time-consuming functions run in parallel.

  • vm_topology profile statistics for restart-ptf on dualtor-120:

Top three functions by total run time:

| function name | total run time |
| --- | --- |
| add_host_ports | 1040s |
| bind_fp_ports | 96.3s |
| init | 16.7s |

  • vm_topology profile statistics for remove-topo on dualtor-120:

Top three functions by total run time:

| function name | total run time |
| --- | --- |
| remove_host_ports | 165s |
| unbind_fp_ports | 40.6s |
| remove_injected_fp_ports_from_docker | 3.3s |

According to the statistics above, the following functions take most of the run time, so they are run in parallel on the thread pool:

  • add_host_ports
  • remove_host_ports
  • bind_fp_ports
  • unbind_fp_ports
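As a rough illustration of why a thread pool helps with I/O-bound work of this kind, the sketch below compares a serial loop with `concurrent.futures.ThreadPoolExecutor`. It is not the code in this PR: `add_host_port`, `run_serial`, `run_pooled`, and the port names are hypothetical, and `time.sleep` stands in for waiting on a Linux command.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def add_host_port(port, delay=0.05):
    """Hypothetical stand-in for one port's setup; the sleep models I/O wait."""
    time.sleep(delay)
    return "%s: up" % port


def run_serial(ports):
    """Handle the ports one after another."""
    return [add_host_port(port) for port in ports]


def run_pooled(ports, max_workers=4):
    """Handle the ports on a thread pool; map() keeps the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(add_host_port, ports))


def timed(func, ports):
    """Return (result, elapsed seconds) for one run of func(ports)."""
    start = time.perf_counter()
    result = func(ports)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    ports = ["eth%d" % i for i in range(8)]
    _, serial_time = timed(run_serial, ports)
    _, pooled_time = timed(run_pooled, ports)
    print("serial: %.2fs, pooled: %.2fs" % (serial_time, pooled_time))
```

Because the threads spend their time sleeping (or, in the real script, waiting on a subprocess), the GIL is released while they wait, so the pooled run should finish in roughly the serial time divided by the number of workers.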

Two new classes are introduced to support this:

  • class VMTopologyWorker: a worker class that runs tasks in either single-thread mode or thread-pool mode.
  • class ThreadBufferHandler: a logging handler that buffers the logs of each task submitted to the VMTopologyWorker and flushes them when the task ends. This keeps vm_topology logs grouped by task, so logs from different tasks are not interleaved.
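One way the two pieces could fit together is sketched below. This is an assumption-laden illustration, not the PR's implementation: `PoolWorker`, `BufferingThreadHandler`, `flush_current_thread`, `bind_port`, and `run_demo` are hypothetical names, and the real classes may be structured differently.

```python
import io
import logging
import threading
from concurrent.futures import ThreadPoolExecutor


class BufferingThreadHandler(logging.Handler):
    """Hypothetical handler: holds records per thread until explicitly flushed."""

    def __init__(self, target):
        super().__init__()
        self.target = target                  # handler that does the real output
        self._buffers = {}                    # thread ident -> list of records
        self._buffers_lock = threading.Lock()

    def emit(self, record):
        # emit() runs in the thread that made the logging call.
        with self._buffers_lock:
            self._buffers.setdefault(threading.get_ident(), []).append(record)

    def flush_current_thread(self):
        # Holding the lock while writing keeps each task's lines contiguous.
        with self._buffers_lock:
            for record in self._buffers.pop(threading.get_ident(), []):
                self.target.handle(record)


class PoolWorker:
    """Hypothetical worker: runs tasks inline (max_workers=0) or on a pool."""

    def __init__(self, handler, max_workers=0):
        self.handler = handler
        self.pool = ThreadPoolExecutor(max_workers) if max_workers > 0 else None
        self.futures = []

    def _run(self, func, *args):
        try:
            return func(*args)
        finally:
            self.handler.flush_current_thread()  # flush when the task ends

    def submit(self, func, *args):
        if self.pool is None:
            self._run(func, *args)
        else:
            self.futures.append(self.pool.submit(self._run, func, *args))

    def wait(self):
        for future in self.futures:
            future.result()                      # re-raises task exceptions
        self.futures.clear()

    def close(self):
        if self.pool is not None:
            self.pool.shutdown(wait=True)


def bind_port(log, port):
    """Hypothetical task that logs two lines."""
    log.info("start %s", port)
    log.info("done %s", port)


def run_demo(max_workers):
    """Run four tasks and return the emitted log lines."""
    stream = io.StringIO()
    target = logging.StreamHandler(stream)
    target.setFormatter(logging.Formatter("%(message)s"))
    handler = BufferingThreadHandler(target)
    log = logging.getLogger("vm_topology_sketch_%d" % max_workers)
    log.setLevel(logging.INFO)
    log.propagate = False
    log.addHandler(handler)
    worker = PoolWorker(handler, max_workers)
    for port in ["eth0", "eth1", "eth2", "eth3"]:
        worker.submit(bind_port, log, port)
    worker.wait()
    worker.close()
    log.removeHandler(handler)
    return stream.getvalue().splitlines()
```

In thread-pool mode the order of the tasks in the output may vary, but each task's "start" line is always followed directly by its own "done" line. In single-thread mode the output follows submission order.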

How did you verify/test it?

This PR was tested on a dualtor-120 testbed with a thread pool of 13 worker threads.

| operation | vm_topology run time without this PR | vm_topology run time with this PR |
| --- | --- | --- |
| remove-topo | 3m19.786s | 1m18.430s |
| restart-ptf | 18m50.615s | 3m58.963s |
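The speedup implied by these wall-clock times can be checked with a few lines of Python. `parse_duration` and `speedup` are hypothetical helpers written for this illustration; they assume the `time`-style `XmY.ZZZs` format shown above.

```python
import re


def parse_duration(text):
    """Convert a `time`-style duration such as '18m50.615s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError("unrecognized duration: %r" % text)
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


def speedup(before, after):
    """Ratio of the run time without the change to the run time with it."""
    return parse_duration(before) / parse_duration(after)


if __name__ == "__main__":
    print("remove-topo: %.2fx" % speedup("3m19.786s", "1m18.430s"))   # 2.55x
    print("restart-ptf: %.2fx" % speedup("18m50.615s", "3m58.963s"))  # 4.73x
```

That is, remove-topo runs about 2.5 times faster and restart-ptf about 4.7 times faster with the thread pool.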
  • vm_topology profile statistics for restart-ptf with this PR:

| function name | total run time without this PR | total run time with this PR |
| --- | --- | --- |
| add_host_ports | 1040s | 169s |
| bind_fp_ports | 96.3s | 39.3s |

  • vm_topology profile statistics for remove-topo with this PR:

| function name | total run time without this PR | total run time with this PR |
| --- | --- | --- |
| remove_host_ports | 165s | 68.8s |
| unbind_fp_ports | 40.6s | 8.40s |

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@mssonicbld (Collaborator) commented:

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

@lolyu marked this pull request as ready for review December 26, 2024 05:18
@lolyu force-pushed the improve_vm_topology branch from 12c0a77 to fa6b408 December 26, 2024 05:23
@lolyu force-pushed the improve_vm_topology branch from fa6b408 to 14b8df8 December 26, 2024 05:24
@lolyu changed the title from "Improve vm_topology performance" to "[Infra][add/remove topo] Improve vm_topology performance" Dec 26, 2024
Signed-off-by: Longxiang <[email protected]>
@lolyu requested review from wangxin and yxieca December 26, 2024 06:38