Merge pull request #1516 from ClickHouse/aashishkohli-patch-1
Updated scaling.md for revised scaling docs
justindeguzman authored Sep 18, 2023
2 parents 6c32178 + ff1ee91 commit 22df0fc
Showing 1 changed file with 32 additions and 18 deletions.
50 changes: 32 additions & 18 deletions docs/en/cloud/manage/scaling.md
@@ -5,35 +5,49 @@ slug: /en/manage/scaling
---

# Automatic Scaling

ClickHouse Cloud provides autoscaling for your services. The scaling of ClickHouse Cloud Production services can be adjusted by organization members with the **Admin** role on the service **Settings** page.
Scaling is the ability to adjust available resources to meet client demands. Services can be scaled manually, either by calling an API programmatically or by changing settings in the UI. Alternatively, services can be **autoscaled** to meet application demands, which is how ClickHouse Cloud scales services.

:::note
Autoscaling only applies to Production services. Development services do not support autoscaling. You may upgrade your service from Development to Production to enable autoscaling.
Scaling is only applicable to Production tier services. Development tier services do not scale. You can **upgrade** a service from Development tier to Production in order to scale it.
:::

<img alt="Scaling settings page" style={{width: '450px', marginLeft: 0}} src={require('./images/AutoScaling.png').default} />

## Adjusting total memory for your services (vertical scaling)
## How autoscaling works in ClickHouse Cloud
ClickHouse Cloud scales services based on CPU and memory usage. We constantly monitor the historical usage of a service over a lookback window. If the usage rises above or falls below certain thresholds, we scale the service to match the demand. The **larger** of the CPU or memory recommendation is picked, and the CPU and memory allocated to the service are scaled in lockstep.
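
The "larger of the two recommendations, scaled in lockstep" rule can be illustrated with a minimal sketch. This is not ClickHouse Cloud's actual implementation: the function name `pick_size` and the memory-to-CPU ratio `GIB_PER_CORE` are assumptions made up for the example, since the documentation does not state the ratio or the thresholds.

```python
# Hypothetical ratio of memory to CPU, for illustration only.
# The real ratio used by ClickHouse Cloud is not stated in this document.
GIB_PER_CORE = 4.0


def pick_size(cpu_needed_cores, memory_needed_gib, gib_per_core=GIB_PER_CORE):
    """Return (cpu_cores, memory_gib) sized by the larger recommendation."""
    # Express the CPU recommendation in memory terms so both are comparable.
    memory_for_cpu = cpu_needed_cores * gib_per_core
    # The larger of the two recommendations wins.
    memory_gib = max(memory_needed_gib, memory_for_cpu)
    # CPU follows memory at the fixed ratio, so both scale in lockstep.
    return memory_gib / gib_per_core, memory_gib


# A memory-bound workload: memory drives the size, CPU is raised to match.
print(pick_size(cpu_needed_cores=2, memory_needed_gib=24))
# A CPU-bound workload: CPU drives the size, memory is raised to match.
print(pick_size(cpu_needed_cores=8, memory_needed_gib=16))
```

In both cases the service ends up with more of one resource than the workload strictly asked for, which is the consequence of scaling CPU and memory together.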

Depending on your queries and use case, your services may require more or less memory.
### Vertical and Horizontal Scaling
By default, ClickHouse Cloud Production services operate with 3 replicas across 3 different availability zones. Production services can be scaled either vertically (by switching to larger replicas) or horizontally (by adding replicas of the same size). Vertical scaling typically helps with queries that need a large amount of memory for long-running inserts or reads, and horizontal scaling can help with parallelization to support concurrent queries.

In the settings page, you can set the minimum and maximum **Total memory**. The compute allocated to your service scales linearly with its allocated memory.
In the current implementation, vertical autoscaling works well with slow incremental growth in memory and CPU needs, and we are working on improving it to better handle workload bursts. Also, autoscaling currently only scales a service vertically. In order to horizontally scale your service, please contact [email protected].

Each replica in your service will be allocated the same memory and CPU resources.
### Configuring vertical autoscaling
The scaling of ClickHouse Cloud Production services can be adjusted by organization members with the **Admin** role. To configure vertical autoscaling, go to the **Settings** tab on your service details page and adjust the minimum and maximum memory, along with CPU settings, as shown below.

:::tip A tip before setting total memory
Generally, the amount of **total memory** needed by your service cannot be determined until after a few days of monitoring your service with normal use. We recommend waiting a few days before setting the minimum and maximum memory settings, and adjust as needed based on how your queries are performing.
:::

## Adding more replicas (horizontal scaling)
<img alt="Scaling settings page" style={{width: '450px', marginLeft: 0}} src={require('./images/AutoScaling.png').default} />

By default, Production services operate with 3 replicas across 3 different availability zones. For applications that have higher concurrency or performance requirements, it is possible to horizontally scale your service by increasing the number of replicas for that service. If you would like to request more replicas for your service, please contact [email protected].
Set the **Maximum memory** for your replicas at a higher value than the **Minimum memory**. The service will then scale as needed within those bounds. These settings are also available during the initial service creation flow. Each replica in your service will be allocated the same memory and CPU resources.

## Automatic idling
You can also set these values to the same number, which pins the service to a specific configuration. Doing so immediately forces the service to scale to the size you picked. Note that this disables autoscaling on the cluster, so your service will not be protected against increases in CPU or memory usage beyond these settings.
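
The effect of the minimum and maximum memory bounds, including the pinned case, can be sketched as a simple clamp. This is an illustrative model only: the function `clamp_memory` and its parameters are hypothetical names, not part of any ClickHouse Cloud API.

```python
def clamp_memory(recommended_gib, min_gib, max_gib):
    """Keep a recommended memory size within the configured bounds."""
    if min_gib > max_gib:
        raise ValueError("minimum memory must not exceed maximum memory")
    # Never go below the minimum or above the maximum.
    return max(min_gib, min(recommended_gib, max_gib))


# With a range, the recommendation is honored while it stays in bounds.
print(clamp_memory(30, min_gib=24, max_gib=48))
# Demand beyond the maximum is capped at the maximum.
print(clamp_memory(100, min_gib=24, max_gib=48))
# With min == max the result is always the pinned size, whatever the demand.
print(clamp_memory(100, min_gib=48, max_gib=48))
print(clamp_memory(10, min_gib=48, max_gib=48))
```

The last two calls show why pinning disables autoscaling in practice: when both bounds are equal, no recommendation can change the outcome.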

In the settings page, you can choose whether or not to allow automatic idling of your service when it is inactive (i.e. when the service is not executing any user-submitted queries). Automatic idling reduces the cost for your service as you are not billed for compute resources when the service is paused.
## Automatic Idling
In the settings page, you can also choose whether or not to allow automatic idling of your service when it is inactive (i.e. when the service is not executing any user-submitted queries), as shown in the image above. Automatic idling reduces the cost of your service because you are not billed for compute resources while the service is paused.

:::danger When not to use automatic idling
Use automatic idling only if your use case can handle a delay before responding to queries, because when a service is paused, connections to the service will time out. Automatic idling is ideal for services that are used infrequently and where a delay can be tolerated. It is not recommended for services that power customer-facing features that are used frequently.
Use automatic idling only if your use case can handle a delay before responding to queries, because when a service is paused, connections to the service will time out. Automatic idling is ideal for services that are used infrequently and where a delay can be tolerated. It is not recommended for services that power customer-facing features that are used frequently.
:::

## Handling bursty workloads
If you expect an upcoming spike in your workload, you can use the
[ClickHouse Cloud API](/docs/en/cloud/manage/api/services-api-reference.md) to preemptively scale up your service to handle the spike and scale it down once the demand subsides. To check the current service size and the number of replicas, you can run the query below:

```sql
SELECT *
FROM clusterAllReplicas('default', view(
    SELECT
        hostname() AS server,
        getSetting('max_threads') AS cpu_cores,
        formatReadableSize(getSetting('max_memory_usage')) AS memory
    FROM system.one
))
ORDER BY server ASC
SETTINGS skip_unavailable_shards = 1
```
