
Add helm starter chart for developing helm chart #967

Draft: wants to merge 1 commit into base branch main

Conversation

@lianhao (Collaborator) commented Dec 3, 2024

Description

This Helm starter chart pack can be used to create Helm charts for new services.

Issues

n/a.

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Others (enhancement, documentation, validation, etc.)

Dependencies

List any newly introduced 3rd-party dependencies, if they exist.

Tests

Describe the tests that you ran to verify your changes.

@lianhao force-pushed the helm-chart-starter branch from a8ed171 to bc37842 (December 3, 2024 08:03)
@lianhao (Collaborator, Author) commented Dec 3, 2024

Let's hold this until we've got the helm charts directory layout figured out

Helm starter chart pack can be used to create helm chart for new
services.

Signed-off-by: Lianhao Lu <[email protected]>
@lianhao force-pushed the helm-chart-starter branch from bc37842 to fe34c8c (December 4, 2024 06:29)
@eero-t (Contributor) left a comment:

I think some instructions on using the template are also needed.

It could, for example, state which classes a service needs to use so that it has the v1/health_check endpoint for the probes, and that services always need to handle SIGTERM, to avoid lost requests on service scale-down.
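To illustrate what such instructions could cover, here is a minimal sketch of a service exposing the v1/health_check endpoint the chart's probes expect. It is only an assumption of what a conforming service might look like, using the Python standard library rather than OPEA's actual service classes; the handler and helper names are hypothetical.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical handler answering the v1/health_check probe path."""

    def do_GET(self):
        if self.path == "/v1/health_check":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep probe traffic out of the service logs

def start_server(port=0):
    """Start the health endpoint on a background thread; returns the server."""
    server = HTTPServer(("127.0.0.1", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With such an endpoint in place, the startup, readiness, and liveness probes in the template can all point at the same path and port.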

spec:
  containers:
    - name: curl
      image: python:3.10.14
Contributor:

Latest Python 3.10 bug/security-fix version is 3.10.16 (and current latest Python version is 3.13.1).

Comment on lines +83 to +86
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
Contributor:

Update the comment to something more relevant.

Comment on lines +95 to +114
livenessProbe:
  httpGet:
    path: v1/health_check
    port: <CHARTNAME>
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 24
readinessProbe:
  httpGet:
    path: v1/health_check
    port: <CHARTNAME>
  initialDelaySeconds: 5
  periodSeconds: 5
startupProbe:
  httpGet:
    path: v1/health_check
    port: <CHARTNAME>
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 120
Contributor:

IMHO it is better to comment out the liveness probe. That is useful only if the pod has a habit of deadlocking and restarting the pod will fix that issue (instead of the restart just making the service response worse).

# cpu: 100m
# memory: 128Mi

# This is to setup the liveness and readiness probes more information can be found here: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
Contributor:

Suggested change
# This is to setup the liveness and readiness probes more information can be found here: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
# This is to setup the startup, liveness and readiness probes more information can be found here: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

Or:

Suggested change
# This is to setup the liveness and readiness probes more information can be found here: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
# Set up pod health monitoring probes, see: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

Comment on lines +116 to +120
# This section is for setting up autoscaling more information can be found here: https://kubernetes.io/docs/concepts/workloads/autoscaling/
# Enabling HPA will:
# - Ignore above replicaCount, as it will be controlled by HPA
# - Add example HPA scaling rules with thresholds suitable for Xeon deployments
# - Require custom metrics ConfigMap available in the main application chart
Contributor:

Fix comment (similarly to the other PR).

Because HPA can sometimes change replica counts up and down rather frequently, dev documentation (or this comment) should emphasize that the service needs to handle SIGTERM and:

  • stop accepting new requests
  • handle all of its buffered requests
  • terminate after those have been processed

to avoid the k8s SIGKILL (sent to the pod after the specified grace period) causing request failures.

See: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination

# If set, it will overwrite serviceAccount.name.
# If set, and serviceAccount.create is false, it will assume this service account is already created by others.
sharedSAName: ""
# Install Prometheus serviceMonitor for service
Contributor:

Suggested change
# Install Prometheus serviceMonitor for service
# Install Prometheus serviceMonitor for service metrics

Comment on lines +36 to +47
- type: Object
  object:
    describedObject:
      apiVersion: v1
      # get metric for named object of given type (in same namespace)
      kind: Service
      name: {{ include "<CHARTNAME>.fullname" . }}
    target:
      type: AverageValue
      averageValue: 15
    metric:
      name: {{ include "<CHARTNAME>.metricPrefix" . }}_dummy_queue_size
Contributor:

Remove this part as this template does not set up any custom metrics for this application, and because there would be no point in doing that (custom metrics are by definition application specific).

Comment on lines +66 to +70
{{/*
Convert chart name to a string suitable as metric prefix
*/}}
{{- define "<CHARTNAME>.metricPrefix" -}}
{{- include "<CHARTNAME>.fullname" . | replace "-" "_" | regexFind "[a-zA-Z_:][a-zA-Z0-9_:]*" }}
Contributor:

Remove this part as this template does not set up any custom metrics for this application.

Comment on lines +25 to +29
## Values

| Key | Type | Default | Description |
| ------------------------------- | ------ | ------------------------------------ | ---------------------- |
| global.HUGGINGFACEHUB_API_TOKEN | string | `insert-your-huggingface-token-here` | Hugging Face API token |
Contributor:

I think this needs a link to shared documentation explaining all the common keys in the values file (proxies etc.). Such a doc could be split into sections based on how commonly the given keys need to be specified by the user.

Comment on lines +125 to +126
# targetCPUUtilizationPercentage: 80
# targetMemoryUtilizationPercentage: 80
Contributor:

Helm install fails if there are no targets. Because these do nothing unless autoscaling is enabled, I think they can always be enabled:

Suggested change
# targetCPUUtilizationPercentage: 80
# targetMemoryUtilizationPercentage: 80
targetCPUPercentage: 80
targetMemoryPercentage: 80
