doc: improve documentation (#875)

tair-opensource · Oct 24, 2024 · fcab819
1 parent 0dd0b41
commit fcab819

Showing 5 changed files with 99 additions and 10 deletions.
diff --git a/README.md b/README.md
@@ -16,19 +16,19 @@
 
 RedisShake is a powerful tool for Redis data transformation and migration, offering:
 
-1. 🔄 **Zero Downtime Migration**: Enables seamless data migration without data loss or service interruption, ensuring continuous operation during the transfer process.
+1. **Zero Downtime Migration**: Enables seamless data migration without data loss or service interruption, ensuring continuous operation during the transfer process.
 
-2. 🌈 **Redis Compatibility**: Supports Redis 2.8 to 7.2, across standalone, master-slave, sentinel, and cluster deployments.
+2. **Redis Compatibility**: Supports Redis 2.8 to 7.2, across standalone, master-slave, sentinel, and cluster deployments.
 
-3. ☁️ **Cloud Service Integration**: Seamlessly works with Redis-like databases from major cloud providers:
+3. **Cloud Service Integration**: Seamlessly works with Redis-like databases from major cloud providers:
    - Alibaba Cloud: [ApsaraDB for Redis](https://www.alibabacloud.com/product/apsaradb-for-redis), [Tair](https://www.alibabacloud.com/product/tair)
    - AWS: [ElastiCache](https://aws.amazon.com/elasticache/), [MemoryDB](https://aws.amazon.com/memorydb/)  
 
-4. 🧩 **Module Support**: Compatible with [TairString](https://github.com/tair-opensource/TairString), [TairZSet](https://github.com/tair-opensource/TairZset), and [TairHash](https://github.com/tair-opensource/TairHash).
+4. **Module Support**: Compatible with [TairString](https://github.com/tair-opensource/TairString), [TairZSet](https://github.com/tair-opensource/TairZset), and [TairHash](https://github.com/tair-opensource/TairHash).
 
-5. 📤 **Flexible Data Source**: Supports [PSync](https://tair-opensource.github.io/RedisShake/zh/reader/sync_reader.html), [RDB](https://tair-opensource.github.io/RedisShake/zh/reader/rdb_reader.html), and [Scan](https://tair-opensource.github.io/RedisShake/zh/reader/scan_reader.html) data fetch methods.
+5. **Flexible Data Source**: Supports [PSync](https://tair-opensource.github.io/RedisShake/zh/reader/sync_reader.html), [RDB](https://tair-opensource.github.io/RedisShake/zh/reader/rdb_reader.html), and [Scan](https://tair-opensource.github.io/RedisShake/zh/reader/scan_reader.html) data fetch methods.
 
-6. 🔧 **Advanced Data Processing**: Enables custom [script-based data transformation](https://tair-opensource.github.io/RedisShake/zh/filter/function.html) and easy-to-use [data filter rules](https://tair-opensource.github.io/RedisShake/zh/filter/filter.html).
+6. **Advanced Data Processing**: Enables custom [script-based data transformation](https://tair-opensource.github.io/RedisShake/zh/filter/function.html) and easy-to-use [data filter rules](https://tair-opensource.github.io/RedisShake/zh/filter/filter.html).
 
 ## How to Get RedisShake
 

diff --git a/docs/.vitepress/en.ts b/docs/.vitepress/en.ts
@@ -31,6 +31,7 @@ function sidebar(): DefaultTheme.SidebarItem[] {
                 { text: 'Getting Started', link: '/en/guide/getting-started' },
                 { text: 'Configuration', link: '/en/guide/config' },
                 { text: 'Migration Mode Selection', link: '/en/guide/mode' },
+                { text: 'Architecture and Performance', link: '/en/guide/architecture' },
             ]
         },
         {

diff --git a/docs/src/en/guide/architecture.md b/docs/src/en/guide/architecture.md
@@ -0,0 +1,88 @@
+---
+outline: deep
+---
+
+# Architecture and Performance Description
+
+## Architecture Diagram
+
+When both the source and destination are clusters, the synchronization architecture diagram is as follows:
+
+![Cluster Synchronization Architecture Diagram](/architecture-c2c.svg)
+
+When the source and destination are standalone nodes, the synchronization architecture diagram is as follows:
+
+![Standalone Node Synchronization Architecture Diagram](/architecture-s2s.svg)
+
+## Architecture Description
+
+As the architecture diagrams show, data synchronized from the source to the destination passes through three components: Cluster Reader, Main, and Cluster Writer.
+
+### Cluster Reader
+
+The Cluster Reader is the cluster reading class. It creates one Standalone Reader per source shard. Each Standalone Reader starts a goroutine that reads from its own source shard, so all shards are read in parallel, and stores the data in its corresponding channel (Reader Channel) for the next stage to process.
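The fan-out described above can be sketched in Go roughly as follows. This is a minimal illustration, not RedisShake's actual code: the `Entry` type, the `startReaders` function, and the in-memory "shards" are all hypothetical stand-ins for real network readers.

```go
package main

import (
	"fmt"
	"sync"
)

// Entry is a hypothetical stand-in for one unit of data read from a shard.
type Entry struct {
	Shard int
	Cmd   string
}

// startReaders launches one goroutine per source shard. Each goroutine
// writes to its own channel and closes it once its shard is exhausted.
func startReaders(shards [][]string) []chan Entry {
	chans := make([]chan Entry, len(shards))
	for i, cmds := range shards {
		ch := make(chan Entry, 16) // buffered "Reader Channel" (size is an assumption)
		chans[i] = ch
		go func(id int, cmds []string) {
			defer close(ch)
			for _, c := range cmds {
				ch <- Entry{Shard: id, Cmd: c}
			}
		}(i, cmds)
	}
	return chans
}

func main() {
	// Two fake shards, each holding a few commands.
	shards := [][]string{{"SET a 1", "SET b 2"}, {"SET c 3"}}

	var wg sync.WaitGroup
	for _, ch := range startReaders(shards) {
		wg.Add(1)
		go func(ch <-chan Entry) {
			defer wg.Done()
			for e := range ch {
				fmt.Printf("shard %d: %s\n", e.Shard, e.Cmd)
			}
		}(ch)
	}
	wg.Wait()
}
```

Giving each reader its own channel means a slow shard only delays its own consumer and does not block the others.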
+
+### Main
+
+Main is the main function. It starts multiple goroutines based on the number of Reader Channels. These goroutines run the Parse, Filter, and Function operations on the channel data in parallel, then call the Cluster Writer's Write method to distribute the data to the write end.
+
+- Parse: Data packet parsing
+- Filter: Filtering operation
+- Function: Executes Lua functions
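The three stages above can be sketched as a small Go pipeline. Everything here is an assumption made for illustration: `parse`, `keep`, `transform`, and `process` are hypothetical names, the filter rule (drop keys starting with `tmp:`) is invented, and the Function stage is a plain Go function standing in for a user-supplied Lua script.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// parse splits a raw command line into arguments (stand-in for packet parsing).
func parse(raw string) []string {
	return strings.Fields(raw)
}

// keep is a hypothetical filter rule: drop commands whose key starts with "tmp:".
func keep(argv []string) bool {
	return len(argv) > 1 && !strings.HasPrefix(argv[1], "tmp:")
}

// transform stands in for the Function stage; it prefixes the key.
func transform(argv []string) []string {
	out := append([]string(nil), argv...)
	out[1] = "migrated:" + out[1]
	return out
}

// process drains one reader channel through Parse, Filter, and Function,
// then hands each surviving command to write.
func process(in <-chan string, write func([]string)) {
	for raw := range in {
		argv := parse(raw)
		if !keep(argv) {
			continue
		}
		write(transform(argv))
	}
}

func main() {
	readers := []chan string{make(chan string, 2), make(chan string, 2)}
	readers[0] <- "SET user:1 alice"
	readers[0] <- "SET tmp:9 x"
	readers[1] <- "SET user:2 bob"
	for _, ch := range readers {
		close(ch)
	}

	var mu sync.Mutex
	var written [][]string
	write := func(argv []string) {
		mu.Lock()
		defer mu.Unlock()
		written = append(written, argv)
	}

	// One processing goroutine per reader channel.
	var wg sync.WaitGroup
	for _, ch := range readers {
		wg.Add(1)
		go func(ch <-chan string) {
			defer wg.Done()
			process(ch, write)
		}(ch)
	}
	wg.Wait()
	fmt.Println(len(written)) // prints 2: the tmp: key was filtered out
}
```

Because several goroutines call `write` concurrently, the sketch guards the shared slice with a mutex. A real writer would need to be safe for concurrent callers in the same way.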
+
+### Cluster Writer
+
+The Cluster Writer is the cluster writing class. It creates one Standalone Writer per destination shard. The Cluster Writer's Write method routes each piece of data, according to the slot it belongs to, to the channel (Writer Channel) of the corresponding Standalone Writer, which then writes the data to the destination.
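As an illustration of slot-based routing, the sketch below computes a key's slot the way the Redis Cluster specification defines it (CRC16 of the key, or of its hash tag, modulo 16384) and maps the slot to a writer index. The `writerFor` function and its even split of slots across writers are assumptions made for the example. A real cluster assigns arbitrary slot ranges to nodes, and this is not taken from RedisShake's source.

```go
package main

import (
	"fmt"
	"strings"
)

// crc16 implements CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0.
func crc16(data []byte) uint16 {
	var crc uint16
	for _, b := range data {
		crc ^= uint16(b) << 8
		for i := 0; i < 8; i++ {
			if crc&0x8000 != 0 {
				crc = crc<<1 ^ 0x1021
			} else {
				crc <<= 1
			}
		}
	}
	return crc
}

// keySlot returns the Redis Cluster slot for a key. If the key contains a
// non-empty hash tag such as "{user}", only the tag content is hashed.
func keySlot(key string) int {
	if s := strings.IndexByte(key, '{'); s >= 0 {
		if e := strings.IndexByte(key[s+1:], '}'); e > 0 {
			key = key[s+1 : s+1+e]
		}
	}
	return int(crc16([]byte(key)) % 16384)
}

// writerFor maps a slot to a writer index, ASSUMING the 16384 slots are
// split evenly and contiguously across n writers (hypothetical layout).
func writerFor(slot, n int) int {
	return slot * n / 16384
}

func main() {
	for _, k := range []string{"foo", "{foo}.bar", "hello"} {
		slot := keySlot(k)
		fmt.Printf("%-10s slot=%5d writer=%d\n", k, slot, writerFor(slot, 12))
	}
}
```

Keys that share a hash tag, such as `foo` and `{foo}.bar`, land in the same slot and therefore go to the same writer.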
+
+## Performance and Resource Usage
+
+### Test Environment
+
+- Server: ecs.i4g.8xlarge, 32 cores, disk read speed 2.4 GB/s, write speed 1.5 GB/s
+- Source and destination Redis clusters: 1 GB, 12 shards
+
+### Test Tool
+
+- redis-benchmark: Redis load-testing tool, used to generate continuous write traffic against the source
+
+Test cases were designed for both RedisShake modes (sync and scan) and for both the full synchronization and incremental synchronization phases. For the full synchronization phase, data is written to the source before RedisShake synchronization starts. For the incremental synchronization phase, RedisShake synchronization is started first, and redis-benchmark is then used to generate continuous write traffic.
+
+In sync mode, the full synchronization phase transfers an RDB file, while the incremental synchronization phase transfers an AOF data stream. In scan mode, the full synchronization phase uses SCAN to traverse the source database, while the incremental synchronization phase enables KSN (keyspace notifications) to synchronize changed keys.
+
+For the incremental synchronization phase, redis-benchmark is run as follows. It generates about 1500k write requests per second, which fully occupies the first 16 CPU cores of the ECS server.
+
+```bash
+taskset -c 0-15 redis-benchmark \
+  --threads 16 -r 10000000 -n 1000000000 -t set \
+  -h host -a 'username:password' \
+  --cluster -c 256 -d 8 -P 2
+```
+
+Test results can be found at [RedisShake Cloud Performance Test Results](https://github.com/OxalisCu/RedisShake/tree/benchmark-backup-cloud/demo).
+
+### Performance Data
+
+Synchronization rates were compared for two ways of synchronizing a source cluster to a destination cluster:
+
+- 12c-12c: A single RedisShake instance synchronizes the whole cluster in Cluster mode
+- 12(s-s): One RedisShake instance per shard, each synchronizing in Standalone mode
+
+The measured synchronization rates are shown below, where "bench" is the write traffic rate generated by redis-benchmark. In scan mode, count is set to 10.
+
+|                 | bench | 12c-12c | 12(s-s)         | 12c-12c/12(s-s) |
+| --------------- | ----- | ------- | --------------- | --------------- |
+| **sync + aof**  | 1599k | 1520k   | 12*(130k)=1560k | 0.97            |
+| **sync + rdb**  |       | 1498k   | 12*(220k)=2640k | 0.57            |
+| **scan + ksn**  | 1084k | 1081k   | 12*(95k)=1140k  | 0.95            |
+| **scan + scan** |       | 665k    | 12*(58k)=696k   | 0.95            |
+
+### Resource Consumption
+
+CPU usage and disk read/write rates were monitored with htop, and network send/receive rates were monitored with iftop. The results are as follows:
+
+|                 | CPU                                   | Network                                   | Disk        |
+| --------------- | ------------------------------------- | ----------------------------------------- | ----------- |
+| **sync + aof**  | 16 cores at 70%-90%, total 1276.9%    | Send 1340 Mb/s, Receive 998 Mb/s          | 155.91 MB/s |
+| **sync + rdb**  | 32 cores at 50%-60%, total 1605.0%    | Send 435 Mb/s, Receive 82.1 Mb/s          | 113.53 KB/s |
+| **scan + ksn**  | 16 cores at 90%-100%, total 1911.4%   | Send 2100 Mb/s, Receive 1330 Mb/s         | 172.07 KB/s |
+| **scan + scan** | 32 cores at 40%-60%, total 1297.2%    | Send 1130 Mb/s, Receive 533 Mb/s          | 155.78 KB/s |
diff --git a/docs/src/en/reader/sync_reader.md b/docs/src/en/reader/sync_reader.md
@@ -6,8 +6,8 @@ When the source database is compatible with the PSync protocol, `sync_reader` is
 
 * Redis
 * Tair
-* ElastiCache (partially compatible)
-* MemoryDB (partially compatible)
+* Valkey
+* ElastiCache (requires aws_psync configuration)
 
 Advantages: Best data consistency, minimal impact on the source database, and allows for seamless switching.
 

diff --git a/docs/src/zh/reader/sync_reader.md b/docs/src/zh/reader/sync_reader.md
@@ -10,8 +10,8 @@ outline: deep
 
 * Redis
 * Tair
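-* ElastiCache (partially compatible)
-* MemoryDB (partially compatible)
+* Valkey
+* ElastiCache (requires aws_psync configuration)
 
 Advantages: best data consistency, minimal impact on the source database, and switchover without downtime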