[PERF] Cluster Sharding perf issue #5203

@carl-camilleri-uom

Version Information
Version of Akka.NET? 1.4.23
Which Akka.NET Modules? Akka.Cluster.Sharding

Describe the performance issue

A minimum viable repo that reproduces the issue is at https://github.com/carlcamilleri/benchmark-akka-cluster

The cluster runs on two n2-standard-8 nodes in GCP (8 CPU @ 2.80 GHz and 32GB RAM each) with Windows Server 2019 ("instance-1" and "instance-2"). A third machine runs the benchmarks.

First, check which node hosts the actor for entity id 5:
curl http://instance-1:5000/5
Response: akka://ping-pong-cluster-system/system/sharding/PingPongActor/13/5(pid=2372,hostname=instance-2)

Therefore the actor for entity id 5 is hosted on instance-2.

Test (1), requests sent to instance-2, which hosts the actor (local):
wrk -t48 -c400 -d30s http://instance-2:5000/5

Running 30s test @ http://instance-2:5000/5
  48 threads and 400 connections
  Thread Stats   Avg       Stdev     Max      +/- Stdev
    Latency      19.54ms   70.10ms   1.09s    94.22%
    Req/Sec      2.41k     429.13    12.12k   86.40%
  3360471 requests in 30.10s, 666.60MB read
  Socket errors: connect 0, read 0, write 202, timeout 0
Requests/sec: 111645.29
Transfer/sec: 22.15MB

Test (2), requests sent to instance-1, which has to reach the actor on instance-2 (remote):
wrk -t48 -c400 -d30s http://instance-1:5000/5

Running 30s test @ http://instance-1:5000/5
  48 threads and 400 connections
  Thread Stats   Avg       Stdev     Max      +/- Stdev
    Latency      254.21ms  286.27ms  1.02s    78.54%
    Req/Sec      119.09    127.71    0.94k    85.22%
  73325 requests in 30.03s, 14.55MB read
  Socket errors: connect 0, read 0, write 216, timeout 0
Requests/sec: 2441.41
Transfer/sec: 495.91KB

Out of interest, I also repeated test (1) (i.e. the workload against the endpoint that calls the local actor) with serialize-messages = on. The result:

Running 30s test @ http://instance-2:5000/5
  48 threads and 400 connections
  Thread Stats   Avg       Stdev     Max      +/- Stdev
    Latency      24.77ms   90.16ms   1.15s    94.33%
    Req/Sec      1.89k     339.47    4.94k    90.42%
  2579946 requests in 30.08s, 511.77MB read
  Socket errors: connect 0, read 0, write 246, timeout 0
Requests/sec: 85760.95
Transfer/sec: 17.01MB

So Hyperion serialisation drops throughput from >111k to >85k req/s, which is probably expected.

Data and Specs

ASKing a local actor I get >111k req/s throughput, but ASKing a remote actor (on the other node) drops throughput to about 2.4k req/s.
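
To put the three wrk results side by side, here is a small sketch that computes the ratios from the Requests/sec figures reported above. The variable and function names are only illustrative (they are not part of the benchmark repo), and the numbers are copied from the runs in this issue, not re-measured.

```python
# Throughput figures (requests/sec) copied from the wrk runs in this issue.
local_rps = 111645.29             # test (1): request hits the node hosting the actor
local_serialized_rps = 85760.95   # test (1) repeated with serialize-messages = on
remote_rps = 2441.41              # test (2): request hits the other node


def slowdown(baseline: float, observed: float) -> float:
    """Return how many times lower `observed` throughput is than `baseline`."""
    return baseline / observed


def percent_drop(baseline: float, observed: float) -> float:
    """Return the throughput lost relative to `baseline`, as a percentage."""
    return (baseline - observed) / baseline * 100.0


if __name__ == "__main__":
    print(f"remote vs local:            {slowdown(local_rps, remote_rps):.1f}x lower")
    print(f"remote vs local+serialized: {slowdown(local_serialized_rps, remote_rps):.1f}x lower")
    print(f"serialization cost (local): {percent_drop(local_rps, local_serialized_rps):.1f}% drop")
```

On these numbers the remote path is roughly 46x slower than the local one, while turning on serialization locally costs roughly 23%. That suggests serialization on its own is unlikely to account for the whole gap, although the local test with serialize-messages = on is only a proxy for what happens on the wire.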

Expected behavior
Cross-machine communication in Cluster Sharding is expected to be faster.

Actual behavior
Cross-machine communication in Cluster Sharding seems to be extremely slow and unusable for my use case (an OLTP workload).

Environment
.NET 5.0
Windows Server 2019
n2-standard-8 machine in GCP (8 CPU @ 2.80 GHz and 32GB RAM)
