text/0069-api-v2.md
Lines changed: 29 additions & 1 deletion
@@ -236,7 +236,30 @@ More about `GC` of deleted/expired entries:
236
236
237
237
### Backup and Restore
238
238
239
-
*To be supplemented in another PR*
239
+
Because the storage format changes significantly, users can enable or disable API v2 smoothly only if the existing TiKV cluster is empty or stores only `TiDB` data. In other scenarios, a tool is needed to migrate the data. This tool is called [TiKV-BR].
240
+
241
+
[TiKV-BR] is forked from [TiDB BR] and needs the following improvements:
242
+
- Storage data conversion from `API V1` (with or without [TTL]) to `API V2`.
243
+
- Backup returns a `backup-ts` when run against an `API V2` TiKV cluster. [RawKV CDC] can use `backup-ts` as the `start-ts` of replication tasks.
244
+
245
+
#### API Version Conversion
246
+
247
+
To support API version conversion, [TiKV-BR] introduces a `dst-api-version` parameter and passes it to TiKV stores in [`BackupRequest`].
248
+
During backup, each TiKV store scans all raw key-value entries, converts them from the current `api-version` to `dst-api-version` if the two differ, and writes the converted data to [SST] files. Because conversion happens at backup time, restoration needs no rewriting or conversion, which speeds up the restore.
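
The control flow described above can be sketched as follows. This is only an illustration: `backup_entries`, `CONVERTERS`, and the `v2:` key prefix are hypothetical placeholders, and the real `API V1` and `API V2` encodings are not reproduced here.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple

KV = Tuple[bytes, bytes]


def _v1_to_v2_placeholder(kv: KV) -> KV:
    # Hypothetical stand-in for the real V1 -> V2 encoding, which is NOT shown here.
    key, value = kv
    return b"v2:" + key, value


# Hypothetical registry of converters, keyed by (current, destination) API version.
CONVERTERS: Dict[Tuple[str, str], Callable[[KV], KV]] = {
    ("v1", "v2"): _v1_to_v2_placeholder,
}


def backup_entries(
    entries: Iterable[KV], api_version: str, dst_api_version: str
) -> Iterator[KV]:
    """Yield the entries that would be written to SST files during backup."""
    if api_version == dst_api_version:
        # Same version on both sides: entries pass through unchanged.
        yield from entries
        return
    convert = CONVERTERS.get((api_version, dst_api_version))
    if convert is None:
        raise ValueError(
            f"unsupported conversion: {api_version} -> {dst_api_version}"
        )
    for kv in entries:
        # Conversion happens once, at backup time, so restore can ingest as-is.
        yield convert(kv)
```

Doing the conversion in the scan loop means the SST files already hold data in the destination format, which is why restore needs no further rewriting.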
249
+
250
+
#### Backup Ts
251
+
252
+
`backup-ts` takes effect only in an `API V2` cluster. It is defined as a timestamp such that all data written before it has been backed up.
253
+
Because RawKV writes with pre-fetched `TSO`s, a write may carry a timestamp older than the latest `TSO` from PD, so the latest `TSO` alone does not satisfy the `backup-ts` requirement.
254
+
255
+
The process to get `backup-ts` is as follows:
256
+
1. Get the current `TSO` in [TiKV-BR] at the beginning of the backup.
257
+
2. Flush the cached `TSO` in every TiKV store before the backup scan starts, so that all subsequent writes have larger timestamps.
258
+
3. Subtract `safe-interval` from the current `TSO` obtained in step 1. The result is `backup-ts`.
259
+
260
+
The third step is needed because there may be **inflight** RawKV entries with timestamps earlier than the current `TSO`. `safe-interval` is a duration long enough for such **inflight** writes to finish. It is an empirical value that defaults to 1 minute, which is expected to be safe even in a high-pressure system.
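
A minimal sketch of the three steps, assuming PD's usual `TSO` layout (physical milliseconds in the high bits, followed by an 18-bit logical counter). The names `get_backup_ts`, `get_tso`, and `flush_store_tso_fns` are hypothetical stand-ins for the PD call and the per-store flush requests, not actual TiKV-BR APIs.

```python
from typing import Callable, Iterable

# Assumption: TSO = (physical milliseconds << 18) | logical counter.
LOGICAL_BITS = 18


def compose_ts(physical_ms: int, logical: int) -> int:
    """Build a timestamp from its physical and logical parts."""
    return (physical_ms << LOGICAL_BITS) | logical


def physical_ms(ts: int) -> int:
    """Extract the physical (millisecond) part of a timestamp."""
    return ts >> LOGICAL_BITS


def get_backup_ts(
    get_tso: Callable[[], int],
    flush_store_tso_fns: Iterable[Callable[[], None]],
    safe_interval_ms: int = 60_000,  # default safe-interval: 1 minute
) -> int:
    # Step 1: get the current TSO at the beginning of the backup.
    current_tso = get_tso()
    # Step 2: flush the cached TSO in every store before scanning starts.
    for flush in flush_store_tso_fns:
        flush()
    # Step 3: move the physical part back by safe-interval.
    backup_ts = current_tso - (safe_interval_ms << LOGICAL_BITS)
    if backup_ts < 0:
        raise ValueError("safe-interval is larger than the current TSO")
    return backup_ts
```

Shifting `safe-interval` by the logical bits moves only the physical part, so the logical counter of the original `TSO` is preserved. Whether the real implementation keeps or clears it is not specified here.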
261
+
262
+
In addition, this process can be optimized once `safe-ts` is implemented for RawKV in the stale read scenario. `safe-ts` is a timestamp such that all RawKV writes before it can be seen **safely**.
240
263
241
264
### Change Data Capture
242
265
@@ -269,4 +292,9 @@ Upgrade to the latest TiKV Go Client and use `V1` mode.