Skip to content

Conversation

@sodonnel
Copy link
Contributor

@sodonnel sodonnel commented Jan 24, 2025

What changes were proposed in this pull request?

When RM schedules a delete of a container on a datanode, it should keep track of that delete until either:

  1. A ICR / FCR is received which confirms the container is removed.
  2. The datanode goes dead.

Currently, RM expires the delete attempt after 10 minutes and while it should resend the command to the same datanode on retry it may not (eg HDDS-12115) or in other scenarios that cause the datanode ordering to change.

With this change, the expiry still occurs and the command can get dropped on the datanode, but in the ContainerReplicaPendingOps expiry thread, it no long removes the pending delete from the pending list. Instead it will trigger a notification to RM which will then resend the same command with a new deadline until it has been confirmed as successful. RM will subscribe to the notifications from ContainerReplicaPendingOps and re-run any expired delete commands.

This is to combat a recent problem we experienced where delete command hung for a very long time and RM issued new deletes to other DNs, resulting in all replicas of a container getting removed unexpectedly.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-12127

How was this patch tested?

Various unit tests modified and added. Manually tested the deletes are resent in docker but modifying the DN code to drop all delete container commands. Logs from docker compose:

scm-1       | 2025-01-24 12:40:16,891 [OverReplicatedProcessor] INFO replication.ReplicationManager: Sending command [deleteContainerCommand: containerID: 1, replicaIndex: 0, force: true] for container ContainerInfo{id=#1, state=CLOSED, stateEnterTime=2025-01-24T12:37:21.566460928Z, pipelineID=PipelineID=5c31c1f3-21c7-4595-b851-282eef3fa642, owner=omServiceIdDefault} to fdcb66e0-8400-4c7e-b110-6aca4f8ab610(ozone-datanode-2.ozone_default/172.20.0.7) with datanode deadline 1737722986891 and scm deadline 1737723016891

scm-1       | 2025-01-24 12:50:46,790 [ExpiredContainerReplicaOpScrubber] INFO replication.ReplicationManager: Sending command [deleteContainerCommand: containerID: 1, replicaIndex: 0, force: true] for container ContainerInfo{id=#1, state=CLOSED, stateEnterTime=2025-01-24T12:37:21.566460928Z, pipelineID=PipelineID=5c31c1f3-21c7-4595-b851-282eef3fa642, owner=omServiceIdDefault} to fdcb66e0-8400-4c7e-b110-6aca4f8ab610(ozone-datanode-2.ozone_default/172.20.0.7) with datanode deadline 1737723616789 and scm deadline 1737723646789

Notice that the first delete command was send by the OverReplicatedProcessor thread. The followup was sent by ExpiredContainerReplicaOpScrubber after the timeout expired proving the expiry notifications are working.

Original deadline 1737722986891 = Friday, 24 January 2025 12:49:46
New deadline 1737723616789 = Friday, 24 January 2025 13:00:16

The deadline was advanced as expected.

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sodonnel for the patch, LGTM.

Copy link
Contributor

@siddhantsangwan siddhantsangwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the patch.

@adoroszlai adoroszlai merged commit 04f6255 into apache:master Jan 28, 2025
42 checks passed
@adoroszlai
Copy link
Contributor

Thanks @sodonnel for the patch, @siddhantsangwan for the review.

sodonnel added a commit to sodonnel/hadoop-ozone that referenced this pull request Jan 30, 2025
…ete is confirmed or node is dead (apache#7746)

(cherry picked from commit 04f6255)

 Conflicts:
	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ContainerReplicaPendingOps.java
	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ReplicationManager.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestContainerReplicaPendingOps.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestECContainerReplicaCount.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestReplicationManager.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestReplicationManagerScenarios.java
	hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestBlockDeletion.java
sodonnel added a commit to sodonnel/hadoop-ozone that referenced this pull request Jan 30, 2025
…ete is confirmed or node is dead (apache#7746)

(cherry picked from commit 04f6255)

 Conflicts:
	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ContainerReplicaPendingOps.java
	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ReplicationManager.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestContainerReplicaPendingOps.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestECContainerReplicaCount.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestReplicationManager.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestReplicationManagerScenarios.java
	hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestBlockDeletion.java
sodonnel added a commit to sodonnel/hadoop-ozone that referenced this pull request Jan 30, 2025
…ete is confirmed or node is dead (apache#7746)

(cherry picked from commit 04f6255)

 Conflicts:
	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ContainerReplicaPendingOps.java
	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ReplicationManager.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestContainerReplicaPendingOps.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestECContainerReplicaCount.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestReplicationManager.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestReplicationManagerScenarios.java
	hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestBlockDeletion.java
nandakumar131 pushed a commit to nandakumar131/ozone that referenced this pull request Feb 10, 2025
ptlrs pushed a commit to ptlrs/ozone that referenced this pull request Mar 8, 2025
* CDPD-78092. HDDS-12114. Prevent delete commands running after a long lock wait and send ICR earlier (apache#7726)

(cherry picked from commit b6cc4af)

 Conflicts:
	hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
	hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueHandler.java

Change-Id: I62ffb7203f2af5be2901ef923f333de53bbc3656

* CDPD-78149. HDDS-12115. RM selects replicas to delete non-deterministically if nodes are overloaded (apache#7728)

(cherry picked from commit efd8adc)

 Conflicts:
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestRatisOverReplicationHandler.java

Change-Id: Ia3d54917c7c488a9b706f6ce941e7f466746d3bd

* CDPD-78286. HDDS-12135. Set RM default deadline to 12 minutes and datanode offset to 6 minutes (apache#7747)

(cherry picked from commit d7616ec)
Change-Id: I36f237705f5a94d453bcec72c32056c2be8f38ba

* CDPD-78213. HDDS-12127. RM should not expire pending deletes, but retry until delete is confirmed or node is dead (apache#7746)

(cherry picked from commit 04f6255)

 Conflicts:
	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ContainerReplicaPendingOps.java
        hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ReplicationManager.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/balancer/TestMoveManager.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestContainerReplicaPendingOps.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestECContainerReplicaCount.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestReplicationManager.java
	hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestReplicationManagerScenarios.java
	hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestBlockDeletion.java

Change-Id: Ic01591f72706f2473c63dd2e44c3f2a94fb70d43

---------

Co-authored-by: Stephen O'Donnell <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants