add advanced configuration
awprice committed Mar 12, 2018
1 parent 1c46b2c commit 68cf7e9
Showing 3 changed files with 53 additions and 4 deletions.
9 changes: 5 additions & 4 deletions README.md
@@ -8,11 +8,11 @@ Kubernetes is a container orchestration framework that schedules Docker containe

**The key preliminary features are:**

- Cluster-level utilisation node scaling.
- Utilisation based node scaling
- Calculate requests and capacity to determine whether to scale up, down or to stay at the current scale
- Wait until non-daemonset pods on nodes have completed before terminating the node
- Designed to work on selected auto-scaling groups to allow the default
[Kubernetes Autoscaler](https://github.com/kubernetes/autoscaler) to continue to scale our service based workloads
[Kubernetes Autoscaler](https://github.com/kubernetes/autoscaler) to continue to scale service based workloads
- Automatically terminate oldest nodes first
- Support for different cloud providers - only AWS at the moment

@@ -36,7 +36,7 @@ make build
### Locally (out of cluster)

```bash
go run cmd/main.go --kubeconfig=~/.kube/config --nodegroups=nodegroups.yaml
go run cmd/main.go --kubeconfig=~/.kube/config --nodegroups=nodegroups_config.yaml
```

### Inside cluster
@@ -92,7 +92,8 @@ make test

#### Test a specific package

To test the controller package:
For example, to test the controller package:

```bash
go test ./pkg/controller
```
1 change: 1 addition & 0 deletions docs/configuration/README.md
@@ -2,3 +2,4 @@

- [Command line configuration](./command-line.md) - configuring command line flags/args
- [Node group configuration](./nodegroup.md) - configuring nodegroup.yml file
- [Advanced Configuration](./advanced-configuration.md) - Configuring thresholds and slack capacity
47 changes: 47 additions & 0 deletions docs/configuration/advanced-configuration.md
@@ -0,0 +1,47 @@
# Advanced Configuration

## Threshold Configuration

Scaling thresholds are configured using the following main nodegroup configuration options:

```yaml
taint_upper_capacity_threshhold_percent: 40
taint_lower_capacity_threshhold_percent: 10
scale_up_threshhold_percent: 70
```

Full configuration options can be found in [Node Group Configuration](./nodegroup.md).

Together, these three options tell Escalator when to scale the cluster up, when to scale it down, and when to do
nothing:

- `scale_up_threshhold_percent` defines the threshold for scaling up.
- `taint_upper_capacity_threshhold_percent` and `taint_lower_capacity_threshhold_percent` define the thresholds for
tainting, which leads to scaling down.
- If the cluster utilisation is at or above `taint_upper_capacity_threshhold_percent` but below
`scale_up_threshhold_percent`, Escalator will do nothing.

Given the above threshold values, Escalator will do the following:

- Utilisation: 110% = **scale up** (exceeded `scale_up_threshhold_percent`)
- Utilisation: 75% = **scale up** (exceeded `scale_up_threshhold_percent`)
- Utilisation: 70% = **scale up** (reached `scale_up_threshhold_percent`)
- Utilisation: 50% = **do nothing**
- Utilisation: 40% = **do nothing** (utilisation has to be below `taint_upper_capacity_threshhold_percent` to trigger
a scale down)
- Utilisation: 38% = **scale down slowly** (below `taint_upper_capacity_threshhold_percent`
but above `taint_lower_capacity_threshhold_percent` so only scale down slowly)
- Utilisation: 10% = **scale down slowly** (below `taint_upper_capacity_threshhold_percent`
and not below `taint_lower_capacity_threshhold_percent`, so only scale down slowly)
- Utilisation: 9% = **scale down quickly** (below `taint_lower_capacity_threshhold_percent`)
- Utilisation: 0% = **scale down quickly** (below `taint_lower_capacity_threshhold_percent`)

## Slack Capacity Configuration

Slack capacity is configured through the `scale_up_threshhold_percent` option. For example, if this option is set to
**70**, a scale up will occur whenever the node group utilisation reaches or exceeds **70%**. Escalator will try to
keep the node group utilisation below **70%**, and thus there will be a "slack capacity" of **30%**.

To completely eliminate slack capacity, set `scale_up_threshhold_percent` to **100**. Escalator will then only scale
up the node group when the utilisation reaches or exceeds **100%**.
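
The relationship between the threshold and the slack capacity is simple arithmetic, sketched below in Go. The helper
`slackPercent` is hypothetical and not part of Escalator; clamping to zero for thresholds above 100 is an assumption
of this sketch.

```go
package main

import "fmt"

// slackPercent returns the slack capacity implied by a scale up threshold:
// the share of node group capacity kept free before a scale up triggers.
// Thresholds above 100 leave no slack, so the result is clamped to 0
// (an assumption made for this sketch).
func slackPercent(scaleUpThresholdPercent int) int {
	slack := 100 - scaleUpThresholdPercent
	if slack < 0 {
		return 0
	}
	return slack
}

func main() {
	for _, threshold := range []int{70, 100} {
		fmt.Printf("scale_up_threshhold_percent=%d -> slack capacity %d%%\n",
			threshold, slackPercent(threshold))
	}
}
```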

It is recommended to keep some slack capacity so that, if there is a sudden spike of new pods, Escalator has time to
increase the node group size before pods become unschedulable.
