
Conversation

@mudit-saxena (Contributor)

Summary

This PR experiments with estimating the completion time for bootstrapping X% of the total partitions assigned to a host. This will validate the idea that prioritized partitions can be completed within Y hours.
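The estimate the summary describes can be sketched as a simple rate calculation. This is a minimal illustration only, not the PR's actual implementation; the class, method, and parameter names (and the sample numbers) are all hypothetical.

```java
import java.time.Duration;

public class BootstrapEtaSketch {
  // Estimate time to bootstrap the prioritized X% of partitions, given an
  // observed bootstrap rate. All names here are hypothetical; the PR's real
  // code may compute this very differently.
  static Duration estimateEta(int totalPartitions, int prioritizedPercent, double partitionsPerHour) {
    // Multiply before dividing so integer arithmetic does not truncate to 0.
    int prioritized = totalPartitions * prioritizedPercent / 100;
    double hours = prioritized / partitionsPerHour;
    return Duration.ofMinutes(Math.round(hours * 60));
  }

  public static void main(String[] args) {
    // e.g. 400 partitions on the host, prioritize 25%, observed rate 10 partitions/hour
    System.out.println(estimateEta(400, 25, 10.0)); // PT10H
  }
}
```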

Testing Done

./gradlew clean build && ./gradlew allJar

@DevenAhluwalia (Contributor) left a comment


This is a decently risky change in a critical section of replication.
Although this is controlled using cfg2, is there a way to further limit it to one host?

At the very least, IMO, we should test this change in Perf/Ei using a binary hot reload.

// for each smaller array of remote replicas, create active group trackers with consecutive group ids
for (List<RemoteReplicaInfo> remoteReplicaList : remoteReplicaSegregatedList) {
ActiveGroupTracker activeGroupTracker = new ActiveGroupTracker(currentGroupId, remoteReplicaList.stream()
int size = remoteReplicaList.size();

Let's extract this code piece so that this logic resides in a single place. Currently it is evaluated in two places, and any change in one place must be replicated to the other. Roughly:

List<RemoteReplicaInfo> foo(List<RemoteReplicaInfo> inList, boolean isReplicationPrioritizationEnabled,
    int replicationMaxPrioritizedReplicasPercent) {
  int size = inList.size();
  if (isReplicationPrioritizationEnabled) {
    // multiply before dividing: percent / 100 truncates to 0 in integer arithmetic
    int maxSize = replicationMaxPrioritizedReplicasPercent * size / 100;
    size = Math.min(size, maxSize);
  }
  return inList.subList(0, size);
}
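A runnable version of the reviewer's sketch, generic over the list element so it stands alone (the helper name and parameters are hypothetical, not from the PR). It also demonstrates why the multiplication must come before the division: `percent / 100 * size` evaluates to 0 in integer arithmetic for any percent below 100.

```java
import java.util.Arrays;
import java.util.List;

public class PrioritizedSubsetSketch {
  // Single place that trims a replica list to the prioritized percentage.
  static <T> List<T> prioritizedSubset(List<T> inList, boolean prioritizationEnabled, int maxPrioritizedPercent) {
    int size = inList.size();
    if (prioritizationEnabled) {
      // Multiply first: 75 / 100 * size would be 0 in integer arithmetic.
      int maxSize = maxPrioritizedPercent * size / 100;
      size = Math.min(size, maxSize);
    }
    return inList.subList(0, size);
  }

  public static void main(String[] args) {
    List<Integer> replicas = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
    System.out.println(prioritizedSubset(replicas, true, 75));  // [1, 2, 3, 4, 5, 6]
    System.out.println(prioritizedSubset(replicas, false, 75)); // [1, 2, 3, 4, 5, 6, 7, 8]
  }
}
```

Both call sites in the PR could then delegate to this one helper, so the percentage logic cannot drift between them.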

ActiveGroupTracker activeGroupTracker = new ActiveGroupTracker(currentGroupId, remoteReplicaList.stream()
int size = remoteReplicaList.size();
if (isReplicaPrioritzationEnabled) {
int maxSize = replicationMaxPrioritizedReplicas/100 * size;

You will not be able to get accurate data from this methodology. Replicas for a partition can be in different threads and different data node trackers, so you will stop one replica but not another. Also, after each iteration the list could change, and a different partition will be picked up.
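The objection above can be made concrete with a toy model. This is a deliberately simplified sketch, not Ambry's actual data structures: each remote replica is modeled as a (partition, replication thread) pair, and the per-thread cap is caricatured as "keep only the first replica on each thread". The point is that a partition whose replicas land on different threads can be capped on one thread yet keep replicating on another.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class CrossThreadSketch {
  // Simplified model: a remote replica is a (partition, thread) pair.
  // All names here are hypothetical, not Ambry classes.
  record Replica(String partition, String thread) {}

  // Apply a crude per-thread cap (keep only the first replica on each
  // thread) and return the partitions still replicating somewhere.
  static Set<String> stillReplicating(List<Replica> replicas) {
    Map<String, List<Replica>> byThread =
        replicas.stream().collect(Collectors.groupingBy(Replica::thread));
    Set<String> active = new TreeSet<>();
    for (List<Replica> perThread : byThread.values()) {
      active.add(perThread.get(0).partition());
    }
    return active;
  }

  public static void main(String[] args) {
    // P1 has replicas on two threads: thread-1's cap drops P1, but
    // thread-0 keeps it, so P1 is never actually stopped globally.
    List<Replica> replicas = List.of(
        new Replica("P1", "thread-0"), new Replica("P2", "thread-0"),
        new Replica("P3", "thread-1"), new Replica("P1", "thread-1"));
    System.out.println(stillReplicating(replicas)); // [P1, P3]
  }
}
```

Because prioritization is decided independently per thread, no single thread's cap can cleanly stop a partition, which is why per-host timing measured this way would be noisy.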

@gshantanu

What's the next step on this PR? It has been open for nearly a month.


5 participants