yugabyte · ddhodge · Mar 10, 2025 · Mar 10, 2025 · Mar 11, 2025 · Apr 11, 2025
diff --git a/docs/content/preview/yugabyte-cloud/cloud-clusters/disaster-recovery/_index.md b/docs/content/preview/yugabyte-cloud/cloud-clusters/disaster-recovery/_index.md
@@ -0,0 +1,86 @@
+---
+title: Configure disaster recovery for an Aeon cluster
+headerTitle: Disaster Recovery
+linkTitle: Disaster recovery
+description: Enable disaster recovery for clusters
+headContent: Fail over to a replica cluster in case of unplanned outages
+tags:
+  feature: early-access
+menu:
+  preview_yugabyte-cloud:
+    parent: cloud-clusters
+    identifier: disaster-recovery-aeon
+    weight: 500
+type: indexpage
+showRightNav: true
+---
+
+Use xCluster Disaster Recovery (DR) to recover from an unplanned outage (failover) or to perform a planned switchover. Planned switchover is commonly used for business continuity and disaster recovery testing, and failback after a failover.
+
+A DR configuration consists of the following:
+
+- a Source cluster, which serves both reads and writes.
+- a Target cluster, which can also serve reads.
+
+## RPO and RTO for failover and switchover
+
+Data from the Source is replicated asynchronously to the Target, which is read-only. Because replication is asynchronous, a DR failover results in a non-zero recovery point objective (RPO). In other words, data not yet committed on the Target _can be lost_ during a failover. The amount of data loss depends on the replication lag, which in turn depends on the network characteristics between the clusters. By contrast, during a switchover the RPO is zero and no data is lost, because the switchover waits for all data to be committed on the Target _before_ switching over.
+
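As a rough illustration of the relationship between replication lag and RPO, the following Python sketch estimates how many writes would be at risk during a failover. The function name, the steady write rate, and the lag values are hypothetical and are not taken from YugabyteDB Aeon; real workloads are burstier, so treat the result as an order-of-magnitude estimate only.

```python
def writes_at_risk(replication_lag_ms: float, writes_per_second: float) -> float:
    """Estimate writes not yet committed on the Target (hypothetical model).

    Assumes a steady write rate over the lag window.
    """
    if replication_lag_ms < 0 or writes_per_second < 0:
        raise ValueError("lag and write rate must be non-negative")
    return writes_per_second * (replication_lag_ms / 1000.0)


# Failover: writes newer than the lag window can be lost.
failover_loss = writes_at_risk(replication_lag_ms=250, writes_per_second=2000)

# Switchover: waits for the Target to catch up, so the effective lag is zero.
switchover_loss = writes_at_risk(replication_lag_ms=0, writes_per_second=2000)
```

With these assumed numbers, a 250 ms lag at 2000 writes per second puts about 500 writes at risk on failover, and none on switchover.
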
+The recovery time objective (RTO) for failover or switchover is very low, and is determined by how long it takes applications to switch their connections from one cluster to the other. Design applications so that this switch happens as quickly as possible.
+
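One way to keep the connection switch fast is to hold the active cluster endpoint in a single place, so that redirecting the application is one change. The sketch below assumes this design; the `ClusterRouter` class, the cluster names, and the `example.com` hostnames are hypothetical, and the code does not call any YugabyteDB Aeon API.

```python
from dataclasses import dataclass


@dataclass
class ClusterRouter:
    """Holds the endpoints of both clusters and which one is currently active."""

    endpoints: dict[str, str]
    active: str

    def current_endpoint(self) -> str:
        """Return the endpoint that new connections should use."""
        return self.endpoints[self.active]

    def switch_to(self, name: str) -> None:
        """Point new connections at another known cluster."""
        if name not in self.endpoints:
            raise KeyError(f"unknown cluster: {name}")
        self.active = name


# Hypothetical endpoints; replace with your own cluster hosts.
router = ClusterRouter(
    endpoints={
        "cluster-a": "cluster-a.example.com:5433",
        "cluster-b": "cluster-b.example.com:5433",
    },
    active="cluster-a",
)

# After a failover or switchover, redirect new connections in one step.
router.switch_to("cluster-b")
```

Connection code would call `router.current_endpoint()` each time it opens a connection, so the new role takes effect without redeploying the application.
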
+DR also allows the clusters to swap roles during planned switchover and unplanned failover.
+
+![Disaster recovery](/images/yb-platform/disaster-recovery/disaster-recovery.png)
+
+{{<lead link="../../../yugabyte-platform/back-up-restore-universes/disaster-recovery/#xcluster-dr-vs-xcluster-replication">}}
+[xCluster DR vs xCluster Replication](../../../yugabyte-platform/back-up-restore-universes/disaster-recovery/#xcluster-dr-vs-xcluster-replication)
+{{</lead>}}
+
+&nbsp;
+
+{{<index/block>}}
+
+  {{<index/item
+    title="Set up Disaster Recovery"
+    body="Designate a cluster to act as a Target."
+    href="disaster-recovery-setup/"
+    icon="fa-thin fa-umbrella">}}
+
+  {{<index/item
+    title="Unplanned failover"
+    body="Fail over to the Target in case of an unplanned outage."
+    href="disaster-recovery-failover/"
+    icon="fa-thin fa-cloud-bolt-sun">}}
+
+  {{<index/item
+    title="Planned switchover"
+    body="Switch over to the Target for planned testing and failback."
+    href="disaster-recovery-switchover/"
+    icon="fa-thin fa-toggle-on">}}
+
+  {{<index/item
+    title="Add and remove tables and indexes"
+    body="Perform DDL changes to databases in replication."
+    href="disaster-recovery-tables/"
+    icon="fa-thin fa-plus-minus">}}
+
+{{</index/block>}}
+
+## Schema changes
+
+Perform table and index-level schema changes on both clusters, in the following order:
+
+1. The Source cluster.
+2. The Target cluster.
+
+You don't need to make any changes to the DR configuration.
+
+{{<lead link="./disaster-recovery-tables/">}}
+To learn more, refer to [Manage tables and indexes](./disaster-recovery-tables/)
+{{</lead>}}
+
+## Limitations
+
+- Currently, automatic replication of DDL (SQL-level changes such as creating or dropping tables or indexes) is not supported. For more details on how to propagate DDL changes from the Source to the Target, see [Manage tables and indexes](./disaster-recovery-tables/).
+
+- If a database operation requires a full copy, any application sessions on that database on the Target are interrupted while the database is dropped and recreated. Your application should either retry connections or redirect reads to the Source.
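
The retry-or-redirect advice in the last limitation can be sketched as follows. This is a minimal illustration, not a YugabyteDB client API: `read_with_fallback` and its parameters are hypothetical, and the two callables stand in for whatever functions your application uses to read from the Target and from the Source.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def read_with_fallback(
    read_from_target: Callable[[], T],
    read_from_source: Callable[[], T],
    retries: int = 3,
    base_delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try the Target a few times with exponential backoff, then use the Source."""
    for attempt in range(retries):
        try:
            return read_from_target()
        except ConnectionError:
            # Session interrupted, for example while the database is recreated.
            sleep(base_delay_s * (2 ** attempt))
    # The Target is still unavailable: redirect this read to the Source.
    return read_from_source()
```

The sketch combines both options: it retries the Target first and falls back to the Source only after the retries are exhausted. Only `ConnectionError` is caught here, so map your driver's connection exceptions to it or adjust the `except` clause.
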
diff --git a/...w/yugabyte-cloud/cloud-clusters/disaster-recovery/disaster-recovery-failover.md b/...w/yugabyte-cloud/cloud-clusters/disaster-recovery/disaster-recovery-failover.md
@@ -0,0 +1,73 @@
+---
+title: Unplanned failover to a target Aeon cluster
+headerTitle: Unplanned failover
+linkTitle: Failover
+description: Unplanned failover to a target cluster
+headContent: Failover of application traffic to the DR target
+menu:
+  preview_yugabyte-cloud:
+    parent: disaster-recovery-aeon
+    identifier: disaster-recovery-failover-aeon
+    weight: 30
+type: docs
+---
+
+Unplanned failover is the process of switching application traffic to the Target cluster in case the Source cluster becomes unavailable. One of the common reasons for such a scenario is an outage of the primary region.
+
+## Perform failover
+
+Use the following procedure to perform an unplanned failover to the Target and resume applications.
+
+If the Source fails or becomes unavailable, do the following:
+
+1. Stop the application traffic to ensure no more updates are attempted.
+
+1. Navigate to your Source cluster **Disaster Recovery** tab.
+
+1. Note the **Potential data loss on failover** to understand the extent of possible data loss as a result of the outage, and determine if the extent of data loss is acceptable for your situation.
+
+    - The potential data loss is computed as the safe time lag that existed at the current safe time on the Target.
+    - Use the **Tables** tab to understand which specific tables have the highest safe time lag and replication lag.
+
+    For more information on replication metrics, refer to [Replication](../../../../launch-and-manage/monitor-and-alert/metrics/replication/).
+
+1. To proceed, click **Switchover** and choose **Failover**.
+
+1. Enter the name of the Target and click **Failover**.
+
+1. Click **Restart Replication**.
+
+1. Resume the application traffic on the new Source.
+
+At this point, the DR configuration is halted and needs to be repaired.
+
+![Disaster recovery failed](/images/yb-platform/disaster-recovery/disaster-recovery-failed.png)
+
+## Repair DR after failover
+
+There are two options to repair a DR configuration that has failed over:
+
+- If the original Source has recovered and is fully functional with no active alerts, you can configure DR to use the cluster as a Target.
+- If the original Source cannot be recovered, create a new cluster and configure it to act as the Target (see [Prerequisites](../disaster-recovery-setup/#prerequisites)).
+
+In both cases, repairing DR involves making a full copy of the databases through the backup-restore process.
+
+To repair DR, do the following:
+
+1. Navigate to your (new) Source cluster **Disaster Recovery** tab.
+
+1. Click **Repair DR** to display the **Repair DR** dialog.
+
+    ![Repair DR](/images/yb-platform/disaster-recovery/disaster-recovery-repair.png)
+
+1. If the current Target (formerly the Source) has recovered and is fully functional with no active alerts, choose **Reuse the current Target**.
+
+    To use a new cluster as the Target, choose **Select a new cluster as Target** and select the cluster.
+
+1. Click **Initiate Repair**.
+
+After the repair is complete, if you want the current Target (the former Source if you chose to reuse it, or the new cluster you added to DR) to become the Source, follow the steps for [Planned switchover](../disaster-recovery-switchover/).
+
+{{< warning title="Important" >}}
+Do not attempt a switchover until you have repaired DR.
+{{< /warning >}}