RFC: Online SST Recovery #92

Open
wants to merge 3 commits into
base: master
46 changes: 46 additions & 0 deletions text/0089-quota-limiter.md
@@ -0,0 +1,46 @@
# Quota limiter

- RFC PR: [https://github.com/tikv/rfcs/pull/89](https://github.com/tikv/rfcs/pull/89)
- Tracking Issue: [https://github.com/tikv/tikv/issues/12131](https://github.com/tikv/tikv/issues/12131)

## Summary

Add a global forefront throttle that limits read and write requests so that QPS stays stable.

## Motivation

On machines where physical resources are constrained, the performance of TiKV may become unstable because front-end and back-end processing interfere with each other. For example, background sampling can take up a lot of CPU time, which leaves less CPU for processing other requests. In addition, under sustained high load, TiKV accumulates more and more background tasks, such as compaction jobs. When resources are limited and users are not sensitive to performance, it is better for TiKV to work stably under high pressure.

## Detailed design

### Limiting method

This feature adds multiple global limiters to TiKV, each tracking the rate of a different type of processing: forefront CPU time, forefront request rate, read/write bandwidth, and write KV rate.

When a limiter reaches its quota, the request is forced to block for a period of time to compensate. The length of the block is derived from the result returned by a speed limiter that uses the token bucket algorithm.
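The blocking behaviour described above can be sketched as a token bucket whose balance is allowed to go negative, with the deficit converted into a wait time. This is only an illustration under assumptions, not TiKV's implementation: `TokenBucket`, `consume`, and the explicit `now` argument (seconds since an arbitrary origin, used to keep the example deterministic) are hypothetical names.

```rust
use std::time::Duration;

/// Hypothetical token bucket: `rate` tokens are refilled per second, up to `capacity`.
struct TokenBucket {
    rate: f64,
    capacity: f64,
    tokens: f64,
    last_refill: f64,
}

impl TokenBucket {
    fn new(rate: f64, capacity: f64) -> Self {
        TokenBucket { rate, capacity, tokens: capacity, last_refill: 0.0 }
    }

    /// Charges `amount` tokens at time `now` and returns how long the request
    /// should block so that the long-run rate stays within the quota.
    fn consume(&mut self, amount: f64, now: f64) -> Duration {
        // Refill for the time elapsed since the last call, capped at capacity.
        let elapsed = (now - self.last_refill).max(0.0);
        self.tokens = (self.tokens + elapsed * self.rate).min(self.capacity);
        self.last_refill = self.last_refill.max(now);

        // The balance may go negative; the deficit is repaid by waiting.
        self.tokens -= amount;
        if self.tokens >= 0.0 {
            Duration::ZERO
        } else {
            Duration::from_secs_f64(-self.tokens / self.rate)
        }
    }
}
```

With a rate of 100 tokens per second, a request that overdraws the bucket by 50 tokens would be asked to wait half a second, and the deficit is cleared by the refill that accrues during that time.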

Several new configuration items will be added. Users need to set the limit for each metric separately; the quota for CPU time is only an approximate value.

* `quota.forefront-cpu-time`: `usize`
* `quota.forefront-req-rate`: `usize`
* `quota.write-kvs`: `usize`
* `quota.write-bandwidth`: `ReadableSize`
* `quota.read-bandwidth`: `ReadableSize`
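One way to picture these settings is a plain struct with one field per option. The sketch below is illustrative only: `QuotaConfig`, the `ReadableSize` stand-in, the helper `limited_metrics`, and the treatment of `0` as "no limit" are assumptions of this example, not TiKV's definitions or defaults. The RFC text does not state units, so none are implied.

```rust
/// Stand-in for a human-readable size type; holds a byte count (assumption).
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct ReadableSize(u64);

impl ReadableSize {
    fn kb(n: u64) -> Self { ReadableSize(n * 1024) }
    fn mb(n: u64) -> Self { ReadableSize(n * 1024 * 1024) }
}

/// One field per option listed above; each metric is limited independently.
#[derive(Debug, Default)]
struct QuotaConfig {
    forefront_cpu_time: usize,
    forefront_req_rate: usize,
    write_kvs: usize,
    write_bandwidth: ReadableSize,
    read_bandwidth: ReadableSize,
}

impl QuotaConfig {
    /// Names of the metrics that have a limit set.
    /// Treating 0 as "not limited" is an assumption of this sketch.
    fn limited_metrics(&self) -> Vec<&'static str> {
        let mut out = Vec::new();
        if self.forefront_cpu_time > 0 { out.push("forefront-cpu-time"); }
        if self.forefront_req_rate > 0 { out.push("forefront-req-rate"); }
        if self.write_kvs > 0 { out.push("write-kvs"); }
        if self.write_bandwidth.0 > 0 { out.push("write-bandwidth"); }
        if self.read_bandwidth.0 > 0 { out.push("read-bandwidth"); }
        out
    }
}
```

For example, a configuration that sets only `write_kvs` and `read_bandwidth` leaves the other three metrics unlimited in this sketch.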

### Limiting position

Timing and limiting are applied at the code locations that most significantly affect the metrics above.

**scheduler write:** All txn write requests will be handled in the `process_write` function.

**tidb query executor:** Coprocessor DAG processing will be handled in `BatchExecutorsRunner`.

**txn get:** Processing of single-value gets will be handled in `storage::get`.

**txn batch get:** Processing of multi-value gets will be handled in `storage::batch_get`.

**analyze request:** Coprocessor analyze processing will be handled in `collect_columns_stats`.
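The timing step at each of these call sites could look roughly like the following: run the handler, measure how long it took, and record what it touched so that the result can be charged against the quotas. `Usage`, `timed`, and `get_with_quota` are hypothetical names, the `HashMap` stands in for the storage engine, and using wall-clock time as an approximation of CPU time is an assumption of this sketch. The real functions listed above are not reproduced here.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// What one request consumed (hypothetical record type).
#[derive(Debug, Default, PartialEq)]
struct Usage {
    cpu_time: Duration,
    read_bytes: usize,
    write_bytes: usize,
    write_keys: usize,
}

/// Runs `handler`, measuring elapsed wall-clock time as an approximation of
/// CPU time, and returns the handler's result together with the recorded usage.
fn timed<T>(handler: impl FnOnce(&mut Usage) -> T) -> (T, Usage) {
    let mut usage = Usage::default();
    let start = Instant::now();
    let result = handler(&mut usage);
    usage.cpu_time = start.elapsed();
    (result, usage)
}

/// Example call site: a point get that records the bytes it read.
fn get_with_quota(
    store: &HashMap<Vec<u8>, Vec<u8>>,
    key: &[u8],
) -> (Option<Vec<u8>>, Usage) {
    timed(|usage| {
        let value = store.get(key).cloned();
        usage.read_bytes += key.len() + value.as_ref().map_or(0, |v| v.len());
        value
    })
}
```

The returned `Usage` is what a limiter would consume after the handler finishes; any resulting delay would then be applied before the response is returned.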

## Drawbacks

Setting a quota that is too small may cause significant performance degradation, so configuring the quotas requires an experienced user.
22 changes: 22 additions & 0 deletions text/0092-sst-recovery.md
@@ -0,0 +1,22 @@
# Online SST Recovery

- RFC PR: [https://github.com/tikv/rfcs/pull/92](https://github.com/tikv/rfcs/pull/92)
- Tracking Issue: [https://github.com/tikv/tikv/issues/10578](https://github.com/tikv/tikv/issues/10578)

## Summary

When some SST files are corrupted or inaccessible and the errors are background errors, TiKV should not panic immediately. Damaged SSTs should be deleted automatically.

## Motivation

When SST files are corrupted or inaccessible, TiKV panics and cannot start normally, which blocks the user's business. In such situations, we need to intervene manually to help users recover.

Failures like this are often caused by IO errors or OS errors. If the damaged store cannot provide service, the leaders on that store cannot handle read and write requests, which blocks the user's business.

## Detailed design

RocksDB provides a hook for background errors, and TiKV's current approach is to panic directly in that hook. In this design, we use the hook to run an SST recovery worker when an SST file is reported as corrupted or inaccessible.
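The dispatch inside the hook might be sketched as follows. This is not the RocksDB listener API: `BackgroundError`, `RecoveryTask`, `HookAction`, and `on_background_error` are hypothetical names, and a standard channel stands in for whatever mechanism feeds the recovery worker.

```rust
use std::sync::mpsc::{channel, Sender};

/// Simplified view of a background error (hypothetical; not the RocksDB API).
#[derive(Debug, Clone, PartialEq)]
enum BackgroundError {
    Corruption { sst_path: String },
    Inaccessible { sst_path: String },
    Other(String),
}

/// Work item handed to the SST recovery worker.
#[derive(Debug, PartialEq)]
struct RecoveryTask {
    sst_path: String,
}

/// What the hook decides to do with an error.
#[derive(Debug, PartialEq)]
enum HookAction {
    ScheduledRecovery,
    Panic,
}

/// Hook body: SST corruption and inaccessibility go to the recovery worker;
/// every other error keeps the current behaviour (panic).
fn on_background_error(err: BackgroundError, worker: &Sender<RecoveryTask>) -> HookAction {
    match err {
        BackgroundError::Corruption { sst_path }
        | BackgroundError::Inaccessible { sst_path } => {
            // If the worker is gone, nobody can recover the file, so fall back to panicking.
            match worker.send(RecoveryTask { sst_path }) {
                Ok(()) => HookAction::ScheduledRecovery,
                Err(_) => HookAction::Panic,
            }
        }
        BackgroundError::Other(_) => HookAction::Panic,
    }
}
```

Returning an action instead of panicking inside the function keeps the decision testable; the caller would perform the actual panic.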

When data in a peer's data range is damaged, the damage is reported to PD through the heartbeat, and PD adds a `remove-peer` operator to remove the damaged peer. While the damaged peer still exists on the current store, the corrupted SST files remain. The KV storage engine can still accept new writes normally, but it returns an error when the corrupted data range is read.
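The read and write behaviour while a damaged peer is still present can be illustrated with a small range check. The types and method names below (`KeyRange`, `DamagedRanges`, `check_read`, `check_write`) are hypothetical, and the half-open `[start, end)` convention is an assumption of this sketch rather than something stated in the RFC.

```rust
/// Half-open key range [start, end) covered by a damaged SST (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct KeyRange {
    start: Vec<u8>,
    end: Vec<u8>,
}

/// Tracks the key ranges whose SST files are currently known to be damaged.
#[derive(Default)]
struct DamagedRanges {
    ranges: Vec<KeyRange>,
}

impl DamagedRanges {
    fn mark_damaged(&mut self, start: &[u8], end: &[u8]) {
        self.ranges.push(KeyRange { start: start.to_vec(), end: end.to_vec() });
    }

    /// Called once the damaged peer has been removed and the range is healthy again.
    fn clear(&mut self, start: &[u8], end: &[u8]) {
        self.ranges
            .retain(|r| !(r.start.as_slice() == start && r.end.as_slice() == end));
    }

    /// Reads of a key inside any damaged range are rejected.
    fn check_read(&self, key: &[u8]) -> Result<(), String> {
        for r in &self.ranges {
            if r.start.as_slice() <= key && key < r.end.as_slice() {
                return Err(format!("key {:?} is in a corrupted data range", key));
            }
        }
        Ok(())
    }

    /// Writes are not blocked by damaged ranges.
    fn check_write(&self, _key: &[u8]) -> Result<(), String> {
        Ok(())
    }
}
```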

If the peer containing the corrupted data range has not been removed from the current store after the maximum recovery time window, TiKV will panic.
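The time-window rule can be sketched as a deadline check that the recovery worker runs periodically for each damaged file. `DamagedFile`, `Decision`, and `check_deadline` are hypothetical names, and the RFC does not state the length of the window, so it is left as a parameter.

```rust
use std::time::{Duration, Instant};

/// Outcome of a periodic check on a damaged file (hypothetical type).
#[derive(Debug, PartialEq)]
enum Decision {
    KeepWaiting,
    Panic,
}

/// A damaged SST file whose peer has not yet been removed.
struct DamagedFile {
    detected_at: Instant,
}

/// Returns `Panic` once the file has been outstanding for longer than the
/// maximum recovery time window, otherwise `KeepWaiting`.
fn check_deadline(file: &DamagedFile, now: Instant, max_recovery_window: Duration) -> Decision {
    // saturating_duration_since yields zero if `now` is earlier than `detected_at`.
    if now.saturating_duration_since(file.detected_at) > max_recovery_window {
        Decision::Panic
    } else {
        Decision::KeepWaiting
    }
}
```

Passing `now` explicitly keeps the check deterministic; a worker would call it with `Instant::now()` on each tick and stop checking a file once its peer has been removed.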