Commit

Merge pull request #248 from Azure/yusef-cps-scalenumbers-update
Yusef cps scalenumbers and oversubscription update
yusefMS06 authored Oct 7, 2022
2 parents 264ffd4 + ea3be09 commit 008151e
Showing 3 changed files with 23 additions and 21 deletions.
15 changes: 8 additions & 7 deletions documentation/general/dash-sonic-hld.md
@@ -99,7 +99,7 @@ Following are the minimal scaling requirements
| ACLs per ENI | 6x100K prefixes |
| ACLs per ENI | 6x10K SRC/DST ports |
| CA-PA Mappings | 10M |
| Active Connections/ENI | 1M (Bidirectional) |
| Active Connections/ENI | 1M (Bidirectional TCP or UDP) |

## 1.5 Design Considerations

@@ -114,12 +114,13 @@ DASH Sonic implementation is targeted for appliance scenarios and must handles m
7. Implementation must support ability to get all ACL rules/groups based on guid.
8. In normal operation, mappings churn most often, followed by routes, and ACLs churn least.
9. ENIs shall have an admin-state that enables normal connections and forwarding only *after* all configurations for an ENI are applied during initial creation. When the ENI is admin-state down, packets destined to this ENI shall be dropped. Order of operation/configuration shall be enforced by the controller. Sonic implementation shall honor the state set by the controller, and the ENI shall accept and forward traffic only if the admin-state is set to 'up'.
10. During VNET or ENI delete, implementation must support ability to delete all *mappings* or *routes* in a single API call.
11. Add and Delete APIs are idempotent. As an example, deleting an object that doesn't exists shall not return an error.
12. During a delete operation, if there is a dependency (E.g. mappings still present when a VNET is deleted), implementation shall return *error* and shall not perform any force-deletions or delete dependencies implicitly.
13. During a bulk operation, if any part/subset of API fails, implementation shall return *error* for the entire API. Sonic implementation shall validate the entire API as pre-checks before applying and return accordingly.
14. Implementation must have flexible memory allocation for ENI and not reserve max scale during initial create (e.g 100k routes). This is to allow oversubscription.
15. Implementation must not have silent failures for APIs. E.g accepting an API from controller, returning success and failing in the backend. This is orthogonal to the idempotency of APIs described above for ADD and Delete operations. Intent is to ensure SDN controller and Sonic implementation is in-sync
10. Each ENI must support 1M active bi-directional TCP connections or UDP flows; however, the connection pool can be oversubscribed. An oversubscription of 2:1 is expected, so the connection pool may be more efficient if implemented as one large table in which the ENI is part of the key.
11. During VNET or ENI delete, implementation must support ability to delete all *mappings* or *routes* in a single API call.
12. Add and Delete APIs are idempotent. As an example, deleting an object that doesn't exist shall not return an error.
13. During a delete operation, if there is a dependency (E.g. mappings still present when a VNET is deleted), implementation shall return *error* and shall not perform any force-deletions or delete dependencies implicitly.
14. During a bulk operation, if any part/subset of API fails, implementation shall return *error* for the entire API. Sonic implementation shall validate the entire API as pre-checks before applying and return accordingly.
15. Implementation must have flexible memory allocation for ENI and not reserve max scale during initial create (e.g 100k routes). This is to allow oversubscription.
16. Implementation must not have silent failures for APIs, e.g. accepting an API from the controller, returning success, and then failing in the backend. This is orthogonal to the idempotency of APIs described above for Add and Delete operations. The intent is to ensure the SDN controller and Sonic implementation are in sync.
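
The shared connection pool described in the new item 10 can be sketched as follows. This is only an illustration under stated assumptions, not the DASH or SONiC implementation: `FlowKey`, `SharedConnectionTable`, and all of their fields are hypothetical names, and "2:1 oversubscription" is interpreted here as a pool sized at half the sum of the per-ENI limits.

```python
from collections import Counter
from dataclasses import dataclass

PER_ENI_LIMIT = 1_000_000  # from the requirement: 1M active connections per ENI
OVERSUBSCRIPTION = 2       # 2:1, as stated in item 10


@dataclass(frozen=True)
class FlowKey:
    """Hypothetical table key: the ENI is part of the key, next to the 5-tuple."""
    eni_id: str
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # 6 = TCP, 17 = UDP


class SharedConnectionTable:
    """One large table shared by all ENIs instead of one fixed table per ENI."""

    def __init__(self, num_enis, per_eni_limit=PER_ENI_LIMIT,
                 oversubscription=OVERSUBSCRIPTION):
        self.per_eni_limit = per_eni_limit
        # Assumption: the pool holds the sum of per-ENI limits divided by
        # the oversubscription ratio.
        self.capacity = num_enis * per_eni_limit // oversubscription
        self._flows = {}
        self._per_eni = Counter()

    def insert(self, key, state="established"):
        """Track a connection; return False if a limit would be exceeded."""
        if key in self._flows:
            return True  # already tracked, nothing to do
        if self._per_eni[key.eni_id] >= self.per_eni_limit:
            return False  # this ENI reached its own limit
        if len(self._flows) >= self.capacity:
            return False  # the shared pool is exhausted
        self._flows[key] = state
        self._per_eni[key.eni_id] += 1
        return True

    def remove(self, key):
        """Drop a connection; removing an unknown key is a no-op."""
        if key in self._flows:
            del self._flows[key]
            self._per_eni[key.eni_id] -= 1

    def active(self, eni_id):
        """Number of connections currently tracked for one ENI."""
        return self._per_eni[eni_id]
```

In this sketch no memory is reserved per ENI up front: an ENI only consumes entries for connections it actually holds, which is what lets the sum of per-ENI limits exceed the pool size.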

# 2 Packet Flows

25 changes: 13 additions & 12 deletions documentation/general/program-scale-testing-requirements-draft.md
@@ -63,9 +63,10 @@ What we are looking for in a series of testing is how well the NIC
handles:

1. Connections/sec per ENI and per NIC
1. Number of active connections per ENI and per NIC
1. Number of flows per ENI and per NIC
1. Throughput under max connections per second load with the remaining
2. Number of active connections per ENI and per NIC
3. Number of flows per ENI and per NIC
4. The ENIs' connection pool can be oversubscribed. An oversubscription of 2:1 is expected, so the connection pool may be more efficient if implemented as one large table in which the ENI is part of the key. The connection table is the most appropriate table for oversubscription scenarios.
5. Throughput under max connections per second load with the remaining
bandwidth is filled with pre-learned connections that receive at
least one packet per second while driving the links to near 100%
utilization. This requires some work up front to get the right mix
@@ -74,10 +75,10 @@ handles:
jitter**. We therefore also run the test sufficiently long to see if
there were any queue build-ups which would eventually lead to drops
and distort both latency and jitter results.
1. Aging of (TCP connections) and (UDP bi-directional flows) such that
6. Aging of (TCP bi-directional connections) and (UDP bi-directional flows) such that
after the test is complete all connections are aged within the 1
second interval or any other interval we program.
1. We are expecting to cover below scenarios as follow-on tests:
7. We are expecting to cover below scenarios as follow-on tests:

a. Age arbitrary connections to verify that aging is also working
properly under maximum load.
@@ -102,7 +103,7 @@ Why are we running these tests?
expect the CPS to increase. **Any NIC for the application that
cannot achieve millions of connections/sec will automatically be
disqualified from further testing.**
1. Many NICs can create (a large number) of connections simply by
2. Many NICs can create (a large number) of connections simply by
adding more external memory for the connection table. For example, a
NIC can create 1M connections in its external table, however if
packets arrive across the entire connection set in a random order,
@@ -117,15 +118,15 @@ Why are we running these tests?
keepalives once every few minutes) is referred to as an idle
connection and is a useless parameter that should never be
advertised and will not be tested other than for conformance.**
1. Aging is also a vital component of tracking connections. Even under
3. Aging is also a vital component of tracking connections. Even under
the worst load the system must be able to age connections. All
packets will require either connection setup/teardown or policy
lookups/updates involving external memory and hence the memory
management of the connection table is extremely important. The tests
in this document will ensure that no matter what processing is going
on, the connection table will be maintained providing the proper
aging intervals to each connection.
1. We need to be able to enter/delete many new policies at any time
4. We need to be able to enter/delete many new policies at any time
regardless of load. For this reason, we will run the test without
updates to policy to get a baseline and then again with some
extensive policies being added/deleted during the same test. We will
@@ -304,10 +305,10 @@ both scenarios:
| ACLs prefixes | 10x100K | 64M | 128M | 256M | 512M |
| ACLs Ports | 10x10K | 6.4M | 12.8M | 25.6M | 51.2M |
| Mappings (CA to PA) | 160K | 10M | 20M | 40M | 80M |
| Act Con | 1M (bidir) | 64M | 128M | 256M | 512M |
| Act Con | 1M (bidir w/ connection pool capable of oversubscription) | 64M | 128M | 256M | 512M |
| CPS | | 3.75M | 7.5M | 15M | 30M |
| bg flows TCP | | 1M (bidir) | 2M | 4M | 8M |
| bg flows UDP | | 1M (bidir) | 2M | 4M | 8M |
| bg flows TCP | | 1M (bidir w/ connection pool capable of oversubscription) | 2M | 4M | 8M |
| bg flows UDP | | 1M (bidir w/ connection pool capable of oversubscription) | 2M | 4M | 8M |
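
The card-level figures in the rows above appear to follow from the per-ENI figures by simple multiplication. The sketch below reproduces that arithmetic. The column headings are not visible in this hunk, so the factor of 64 ENIs is an assumption inferred from the numbers (it matches the "32 Primary, 32 Secondary" V-Port figure elsewhere in this commit), as is the doubling from one column to the next. `ENIS_PER_CARD` and `scale_row` are hypothetical names.

```python
ENIS_PER_CARD = 64  # assumption: inferred from the table values, not stated in this hunk

# Per-ENI values as they appear in the first data column of the table.
per_eni = {
    "acl_prefixes": 10 * 100_000,     # 10x100K
    "acl_ports": 10 * 10_000,         # 10x10K
    "mappings": 160_000,              # 160K
    "active_connections": 1_000_000,  # 1M (bidir)
}


def scale_row(per_eni_value, columns=4):
    """Return the card-level values, doubling from one column to the next."""
    base = per_eni_value * ENIS_PER_CARD
    return [base * 2 ** i for i in range(columns)]


for name, value in per_eni.items():
    print(name, scale_row(value))
```

Under these assumptions the mappings row comes out at 10.24M, which the table appears to round to 10M; the other rows match the table exactly.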

- ACL rules per NSG = 1000
- Prefixes per ACL rule = 100
@@ -326,7 +327,7 @@ both scenarios:
- 48 \* 200k prefixes per NSG = 9.6M Prefixes
- 2M Mapping Table
1.   1 ENI Scenario
2.   1 ENI Scenario
- 1 ENI/VPort
- 1.6M routes
- 48 NSGs
4 changes: 2 additions & 2 deletions documentation/general/sdn-features-packet-transforms.md
@@ -73,12 +73,12 @@ applies to both IPV4 and IPV6 underlay and overlay*

| Syntax | Description | Notes |
| ----------- | ----------- |-------|
| Flow Scale <img style="width:200px"/>| <ul><li>1+ million flows per v-port (aka ENI)</li> <li>16 million per DPU/Card per 200G<ul><li>single encap IPv4 overlay and IPV6 underlay</li> <li>single encap IPv6 overlay and IPV6 underlay. (This can be lower)</li> <li>single encap IPV4</li> <li>Encap IPv6 and IPV4</li></ul></ul> *These are complex flows, details are below.* | |
| Flow Scale <img style="width:200px"/>| <ul><li>1+ million flows per v-port (aka ENI, with a connection pool capable of being oversubscribed)</li> <li>16 million per DPU/Card per 200G<ul><li>single encap IPv4 overlay and IPV6 underlay</li> <li>single encap IPv6 overlay and IPV6 underlay. (This can be lower)</li> <li>single encap IPV4</li> <li>Encap IPv6 and IPV4</li></ul></ul> *These are complex flows, details are below.* | |
| CPS | 4 million+ (max) | |
| Routes | 100k per v-port (max) | |
| ACLs | 100k IP-Prefixes, 10k Src/Dst ports per v-port (max) | |
| NAT | tbd | |
| V-Port (aka ENI or Source VM) | 32 Primary, 32 Secondary assuming 2x OverSub, 32GB RAM, No ISSU, 10k (theoretical max) | Each ENI 1M total active connections and 2M flows |
| V-Port (aka ENI or Source VM) | 32 Primary, 32 Secondary assuming 2x OverSub, 32GB RAM, No ISSU, 10k (theoretical max) | Each ENI must support 1M total active connections (TCP or UDP with connection pool capable of being oversubscribed) and 2M flows |
| Mappings (VMS deployed) | 10 million total mapping per DPU; mappings are the objects that help us bridge the customer's virtual space (private ip address assigned to each VM) with Azure's physical space (physical/routable addresses where the VMs are hosted) | |
| | For each VPC, we have a list of mappings of the form: PrivateAddress -> (Physical Address v4, Physical Address V6, Mac Address, etc...) | VPC can have up to 1M mappings |
