Commit d182ff2

storcon: respect tenant scheduling policy in drain/fill (#9657)
## Problem

Pinning a tenant by setting the Pause scheduling policy doesn't work, because the drain/fill code moves the tenant around during deploys.

Closes: #9612

## Summary of changes

- In drain, only move a tenant if it is in Active or Essential mode.
- In fill, only move a tenant if it is in Active mode.

The asymmetry is a bit annoying, but it faithfully respects the purposes of the modes: Essential is meant to endeavor to keep the tenant available, which means it needs to be drained but doesn't need to be migrated during fills.
1 parent 4dfa0c2 commit d182ff2
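
The drain/fill asymmetry described in the commit message can be summarised as two predicates. This is a minimal sketch, not code from the commit: the enum is a simplified stand-in that only borrows the four variant names seen in the diff, and `movable_during_drain` and `movable_during_fill` are hypothetical helper names.

```rust
/// Simplified stand-in for `ShardSchedulingPolicy`. Only the variant names
/// are taken from the diff; the real type lives in `pageserver_api`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ShardSchedulingPolicy {
    Active,
    Essential,
    Pause,
    Stop,
}

/// Hypothetical helper: may a node drain move a shard with this policy?
/// Draining is needed to keep the tenant available, so Essential qualifies.
pub fn movable_during_drain(policy: ShardSchedulingPolicy) -> bool {
    match policy {
        ShardSchedulingPolicy::Active | ShardSchedulingPolicy::Essential => true,
        ShardSchedulingPolicy::Pause | ShardSchedulingPolicy::Stop => false,
    }
}

/// Hypothetical helper: may a node fill move a shard with this policy?
/// Filling is an optimisation rather than an availability measure, so only
/// Active shards qualify.
pub fn movable_during_fill(policy: ShardSchedulingPolicy) -> bool {
    matches!(policy, ShardSchedulingPolicy::Active)
}
```

Under these assumptions, a shard pinned with Pause (or Stop) is left alone by both operations, while an Essential shard is drained but never picked up by a fill.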

2 files changed: +25 -1 lines changed

storage_controller/src/drain_utils.rs

Lines changed: 15 additions & 1 deletion
@@ -3,7 +3,7 @@ use std::{
     sync::Arc,
 };
 
-use pageserver_api::controller_api::NodeSchedulingPolicy;
+use pageserver_api::controller_api::{NodeSchedulingPolicy, ShardSchedulingPolicy};
 use utils::{id::NodeId, shard::TenantShardId};
 
 use crate::{
@@ -98,6 +98,20 @@ impl TenantShardDrain {
             return None;
         }
 
+        // Only tenants with a normal (Active) scheduling policy are proactively moved
+        // around during a node drain. Shards which have been manually configured to a different
+        // policy are only rescheduled by manual intervention.
+        match tenant_shard.get_scheduling_policy() {
+            ShardSchedulingPolicy::Active | ShardSchedulingPolicy::Essential => {
+                // A migration during drain is classed as 'essential' because it is required to
+                // uphold our availability goals for the tenant: this shard is eligible for migration.
+            }
+            ShardSchedulingPolicy::Pause | ShardSchedulingPolicy::Stop => {
+                // If we have been asked to avoid rescheduling this shard, then do not migrate it during a drain
+                return None;
+            }
+        }
+
         match scheduler.node_preferred(tenant_shard.intent.get_secondary()) {
             Some(node) => Some(node),
             None => {

storage_controller/src/service.rs

Lines changed: 10 additions & 0 deletions
@@ -6721,6 +6721,16 @@ impl Service {
                 .tenants
                 .iter_mut()
                 .filter_map(|(tid, tenant_shard)| {
+                    if !matches!(
+                        tenant_shard.get_scheduling_policy(),
+                        ShardSchedulingPolicy::Active
+                    ) {
+                        // Only include tenants in fills if they have a normal (Active) scheduling policy. We
+                        // even exclude Essential, because moving to fill a node is not essential to keeping this
+                        // tenant available.
+                        return None;
+                    }
+
                     if tenant_shard.intent.get_secondary().contains(&node_id) {
                         if let Some(primary) = tenant_shard.intent.get_attached() {
                             return Some((*primary, *tid));
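
The hunk above is cut off by the diff context, so the following is only a rough sketch of how the early `return None` in a `filter_map` closure excludes non-Active shards from fill candidates. The `Policy` and `Shard` types, the integer IDs, the `fill_candidates` function and the trailing `None` fallthrough are all assumptions made for illustration; they are not the storage controller's real types or code.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified policy enum mirroring the variant names in the diff.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Policy {
    Active,
    Essential,
    Pause,
    Stop,
}

// Hypothetical, simplified shard state: node IDs are plain integers here.
struct Shard {
    policy: Policy,
    attached: Option<u64>,
    secondary: Vec<u64>,
}

/// Collects (current primary node, shard id) pairs for shards that could be
/// moved onto `node_id` during a fill.
fn fill_candidates(shards: &BTreeMap<u32, Shard>, node_id: u64) -> Vec<(u64, u32)> {
    shards
        .iter()
        .filter_map(|(tid, shard)| {
            // Skip every shard whose policy is not Active, Essential included.
            if !matches!(shard.policy, Policy::Active) {
                return None;
            }

            // Only shards that already have a secondary on the node being
            // filled, and that are currently attached somewhere, qualify.
            if shard.secondary.contains(&node_id) {
                if let Some(primary) = shard.attached {
                    return Some((primary, *tid));
                }
            }

            // Assumed fallthrough: anything else is not a candidate.
            None
        })
        .collect()
}
```

Placing the policy check first means a paused or stopped shard never reaches the placement logic, which is what lets an operator pin it.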
