|
| 1 | +# Get started with OpenTelemetry for Kubernetes Observability |
| 2 | + |
| 3 | +This guide describes how to: |
| 4 | + |
| 5 | +- Install the [OpenTelemetry Operator](https://github.com/open-telemetry/opentelemetry-operator/) using the [kube-stack Helm Chart](https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-kube-stack). |
| 6 | +- Use the EDOT Collector to send Kubernetes logs, metrics, and application traces to an Elasticsearch cluster. |
| 7 | +- Use the operator to [auto-instrument](https://opentelemetry.io/docs/kubernetes/operator/automatic/) applications in all supported languages. |
| 8 | + |
| 9 | +## Table of Contents |
| 10 | + |
| 11 | +- [Prerequisites](#prerequisites) |
| 12 | +- [Compatibility Matrix](#compatibility-matrix) |
| 13 | +- [Components description](#components-description) |
| 14 | +- [Deploying components using Kibana Onboarding UX](#deploying-components-using-kibana-onboarding-ux) |
| 15 | +- [Manual deployment of all components](#manual-deployment-of-all-components) |
| 16 | +- [Installation verification](#installation-verification) |
| 17 | +- [Instrumenting applications](#instrumenting-applications) |
| 18 | +- [Limitations](#limitations) |
| 19 | + |
| 20 | +## Prerequisites |
| 21 | + |
| 22 | +- Elastic Stack (self-managed or [Elastic Cloud](https://www.elastic.co/cloud)) version 8.16.0 or higher, or an [Elasticsearch serverless](https://www.elastic.co/docs/current/serverless/elasticsearch/get-started) project. |
| 23 | + |
| 24 | +- A Kubernetes version supported by the OpenTelemetry Operator (refer to the operator's [compatibility matrix](https://github.com/open-telemetry/opentelemetry-operator?#compatibility-matrix) for more details). |
| 25 | + |
| 26 | +## Compatibility Matrix |
| 27 | + |
| 28 | +The minimum supported version of the Elastic Stack for OpenTelemetry-based monitoring on Kubernetes is `8.16.0`. Different Elastic Stack releases support specific versions of the [kube-stack Helm Chart](https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-kube-stack). |
| 29 | + |
| 30 | +The following is the current list of supported versions: |
| 31 | + |
| 32 | +| Stack Version | Helm Chart Version | Values file | |
| 33 | +|---------------|--------------------|--------------------| |
| 34 | +| Serverless | 0.3.0 | values.yaml | |
| 35 | +| 8.16.0 | 0.3.0 | values.yaml | |
| 36 | + |
| 37 | +When [installing the release](#manual-deployment-of-all-components), ensure you use the right `--version` and `-f <values-file>` parameters. Values files are available in the [resources directory](/resources/kubernetes/operator/helm). |
| 38 | + |
| 39 | +## Components description |
| 40 | + |
| 41 | +### OpenTelemetry Operator |
| 42 | + |
| 43 | +The OpenTelemetry Operator is a [Kubernetes Operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) implementation designed to manage OpenTelemetry resources in a Kubernetes environment. It defines and oversees the following Custom Resource Definitions (CRDs): |
| 44 | + |
| 45 | +- [OpenTelemetry Collectors](https://github.com/open-telemetry/opentelemetry-collector): Agents responsible for receiving, processing and exporting telemetry data such as logs, metrics, and traces. |
| 46 | +- [Instrumentation](https://opentelemetry.io/docs/kubernetes/operator/automatic): Used for the automatic instrumentation of workloads by leveraging OpenTelemetry instrumentation libraries. |
| 47 | + |
| 48 | +All signals, including logs, metrics, and traces, are processed by the collectors and sent directly to Elasticsearch through the Elasticsearch exporter. A collector's processor pipeline replaces the traditional APM Server functionality for handling application traces. |
| 49 | + |
| 50 | +### Kube-stack Helm Chart |
| 51 | + |
| 52 | +The [kube-stack Helm Chart](https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-kube-stack) is used to manage the installation of the operator (including its CRDs) and to configure a suite of collectors, which instrument various Kubernetes components to enable comprehensive observability and monitoring. |
| 53 | + |
| 54 | +The chart is installed with a provided default `values.yaml` file that can be customized when needed. |
| 55 | + |
| 56 | +### DaemonSet collectors |
| 57 | + |
| 58 | +The OpenTelemetry components deployed within the DaemonSet collectors are responsible for observing specific signals from each node. To ensure complete data collection, these components must be deployed on every node in the cluster. Any node without a collector contributes no data, so the collected data will be incomplete. |
| 59 | + |
| 60 | +The DaemonSet collectors handle the following data: |
| 61 | + |
| 62 | +- Host Metrics: Collects host metrics (hostmetrics receiver) specific to each node. |
| 63 | +- Kubernetes Metrics: Captures metrics related to the Kubernetes infrastructure on each node. |
| 64 | +- Logs: Utilizes a filelog receiver to gather logs from all Pods running on the respective node. |
| 65 | +- OTLP Traces Receiver: Opens an HTTP and a gRPC port on the node to receive OTLP trace data. |
| 66 | + |
| 67 | +### Deployment collector |
| 68 | + |
| 69 | +The OpenTelemetry components deployed within a Deployment collector focus on gathering data at the cluster level rather than at individual nodes. Unlike DaemonSet collectors, which need to be deployed on every node, a Deployment collector operates as a standalone instance. |
| 70 | + |
| 71 | +The Deployment collector handles the following data: |
| 72 | + |
| 73 | +- Kubernetes Events: Monitors and collects events occurring across the entire Kubernetes cluster. |
| 74 | +- Cluster Metrics: Captures metrics that provide insights into the overall health and performance of the Kubernetes cluster. |
| 75 | + |
| 76 | +### Auto-instrumentation |
| 77 | + |
| 78 | +The Helm Chart is configured to enable zero-code instrumentation using the [Operator's Instrumentation resource](https://github.com/open-telemetry/opentelemetry-operator/?tab=readme-ov-file#opentelemetry-auto-instrumentation-injection) for the following programming languages: |
| 79 | + |
| 80 | +- Go |
| 81 | +- Java |
| 82 | +- Node.js |
| 83 | +- Python |
| 84 | +- .NET |
| 85 | + |
| 86 | +## Deploying components using Kibana Onboarding UX |
| 87 | + |
| 88 | +The preferred method for deploying all components is through the Kibana Onboarding UX. Follow these steps: |
| 89 | + |
| 90 | +1. In Kibana, navigate to **Observability** > **Add data**. |
| 91 | +2. Select **Kubernetes**, then choose **Kubernetes monitoring with EDOT Collector**. |
| 92 | +3. Follow the on-screen instructions to install the OpenTelemetry Operator using the Helm Chart and the provided `values.yaml`. |
| 93 | + |
| 94 | +Notes: |
| 95 | +- If the `elastic_endpoint` shown by the UI is not valid for your environment, replace it with the correct Elasticsearch endpoint. |
| 96 | +- The displayed `elastic_api_key` corresponds to an API key that is automatically generated when the onboarding process is initiated. |
| 97 | + |
| 98 | +## Manual deployment of all components |
| 99 | + |
| 100 | +### Elastic Stack preparations |
| 101 | + |
| 102 | +Before installing the operator, complete the following steps: |
| 103 | + |
| 104 | +1. Create an [API Key](https://www.elastic.co/guide/en/kibana/current/api-keys.html), and make note of its value. |
| 105 | +(TBD: details of API key permissions). |
| 106 | + |
| 107 | +2. Install the following integrations in Kibana: |
| 108 | + - `System` |
| 109 | + - `Kubernetes` |
| 110 | + - `Kubernetes OpenTelemetry Assets` |
| 111 | + |
| 112 | +Notes: |
| 113 | +- When using the [Kibana onboarding UX](#deploying-components-using-kibana-onboarding-ux), the previous actions are automatically handled by Kibana. |
| 114 | + |
| 115 | +### Operator Installation |
| 116 | + |
| 117 | +1. Create the `opentelemetry-operator-system` Kubernetes namespace: |
| 118 | +``` |
| 119 | +$ kubectl create namespace opentelemetry-operator-system |
| 120 | +``` |
| 121 | + |
| 122 | +2. Create a secret in Kubernetes with the following command. |
| 123 | + ``` |
| 124 | + kubectl create -n opentelemetry-operator-system secret generic elastic-secret-otel \ |
| 125 | + --from-literal=elastic_endpoint='YOUR_ELASTICSEARCH_ENDPOINT' \ |
| 126 | + --from-literal=elastic_api_key='YOUR_ELASTICSEARCH_API_KEY' |
| 127 | + ``` |
| 128 | + Replace the following placeholders: |
| 129 | + - `YOUR_ELASTICSEARCH_ENDPOINT`: your Elasticsearch endpoint, including the `https://` prefix (for example, `https://1234567.us-west2.gcp.elastic-cloud.com:443`). |
| 130 | + - `YOUR_ELASTICSEARCH_API_KEY`: your Elasticsearch API key. |
| 131 | + |
| 132 | +3. Execute the following commands to deploy the Helm Chart. |
| 133 | + |
| 134 | +``` |
| 135 | +$ helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts |
| 136 | +$ helm repo update |
| 137 | +$ helm upgrade --install --namespace opentelemetry-operator-system opentelemetry-kube-stack open-telemetry/opentelemetry-kube-stack --values ./resources/kubernetes/operator/helm/values.yaml --version 0.3.0 |
| 138 | +``` |
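
As an alternative to the `kubectl create secret` command in step 2, the same secret could be declared as a manifest and applied with `kubectl apply -f` before the chart is installed. The following is a minimal sketch, assuming the secret name, namespace, and key names used above; the placeholders still have to be replaced with your own values, and a file containing a real API key should not be committed to version control.

```yaml
# Sketch of a declarative equivalent of the secret created in step 2.
# The name, namespace, and keys mirror the kubectl command above.
apiVersion: v1
kind: Secret
metadata:
  name: elastic-secret-otel
  namespace: opentelemetry-operator-system
type: Opaque
# stringData accepts plain-text values; Kubernetes base64-encodes them on write.
stringData:
  elastic_endpoint: "YOUR_ELASTICSEARCH_ENDPOINT"  # include the https:// prefix
  elastic_api_key: "YOUR_ELASTICSEARCH_API_KEY"
```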
| 139 | + |
| 140 | +## Installation verification |
| 141 | + |
| 142 | +Regardless of the installation method followed, perform the following checks to verify that everything is running properly: |
| 143 | + |
| 144 | +1. **Check Pods Status** |
| 145 | + - Ensure the following components are running without errors: |
| 146 | + - **Operator Pod** |
| 147 | + - **DaemonSet Collector Pod** |
| 148 | + - **Deployment Collector Pod** |
| 149 | + |
| 150 | +2. **Validate Instrumentation Object** |
| 151 | + - Confirm that the **Instrumentation object** is deployed and configured with a valid **endpoint**. |
| 152 | + |
| 153 | +3. **Kibana Dashboard Check** |
| 154 | + - Verify that the **[OTEL][Metrics Kubernetes] Cluster Overview** dashboard in **Kibana** is displaying data correctly. |
| 155 | + |
| 156 | +4. **Log Data Availability in Kibana** |
| 157 | + - In **Discover** in Kibana, confirm that data is available under the `logs-*` data view. |
| 158 | + |
| 159 | +5. **Metrics Data Availability in Kibana** |
| 160 | + - In **Discover** in Kibana, confirm that data is available under the `metrics-*` data view. |
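
For step 2, it may help to know roughly what an Instrumentation resource looks like. The following is a minimal sketch, not the manifest the chart actually renders: the resource name and namespace are taken from the annotation used in the instrumentation section of this guide, while the endpoint value is a placeholder assumption that will differ in your cluster.

```yaml
# Illustrative shape of an Instrumentation resource (OpenTelemetry Operator CRD).
# Only the fields relevant to the verification step are shown.
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: elastic-instrumentation
  namespace: opentelemetry-operator-system
spec:
  exporter:
    # Placeholder (assumption): this should resolve to an OTLP endpoint
    # exposed by one of the collectors deployed by the chart.
    endpoint: "http://COLLECTOR_SERVICE_HOST:4318"
```

A missing or unreachable `spec.exporter.endpoint` is a likely reason for instrumented applications to produce no traces, so it is worth checking this field first.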
| 161 | + |
| 162 | +## Instrumenting Applications |
| 163 | + |
| 164 | +To enable auto-instrumentation, add the corresponding annotation to the pods of existing deployments (`spec.template.metadata.annotations`), or to the desired namespace (to auto-instrument all pods in the namespace): |
| 165 | + |
| 166 | +```yaml |
| 167 | +metadata: |
| 168 | +  annotations: |
| 169 | +    instrumentation.opentelemetry.io/inject-<LANGUAGE>: "opentelemetry-operator-system/elastic-instrumentation" |
| 170 | +``` |
| 171 | +
|
| 172 | +where `<LANGUAGE>` is one of: `go`, `java`, `nodejs`, `python`, `dotnet`. |
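
To show where the annotation goes in each case, here is a minimal sketch for a Java workload. The Deployment name, labels, image, and namespace name are hypothetical and only serve as illustration; the annotation key and value are the ones described above.

```yaml
# Option 1: annotate the pod template of a single Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-java-app                # hypothetical name
spec:
  selector:
    matchLabels:
      app: my-java-app
  template:
    metadata:
      labels:
        app: my-java-app
      annotations:
        # Must be set on the pod template, not on the Deployment's own metadata.
        instrumentation.opentelemetry.io/inject-java: "opentelemetry-operator-system/elastic-instrumentation"
    spec:
      containers:
        - name: app
          image: registry.example.com/my-java-app:1.0.0   # hypothetical image
---
# Option 2: annotate a namespace to auto-instrument all pods created in it.
apiVersion: v1
kind: Namespace
metadata:
  name: my-apps                    # hypothetical namespace
  annotations:
    instrumentation.opentelemetry.io/inject-java: "opentelemetry-operator-system/elastic-instrumentation"
```

Because the operator injects the instrumentation when a pod is created, pods that were already running need to be restarted before the annotation takes effect.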
| 173 | + |
| 174 | +For detailed instructions and examples on how to instrument applications in Kubernetes using the OpenTelemetry Operator, refer to this guide (TBD-add link and document). |
| 175 | + |
| 176 | +## Limitations |
| 177 | + |
| 178 | +### Cert manager |
| 179 | + |
| 180 | +In Kubernetes, for the API server to communicate with the webhook component (created by the Operator), the webhook requires a TLS certificate that the API server is configured to trust. The configuration provided above sets the Helm Chart to automatically generate the required TLS certificates with an expiration policy of 365 days. These certificates **won't be renewed** unless the Helm Chart's release is manually updated. For production environments, it is highly recommended to use a certificate manager like [cert-manager](https://cert-manager.io/docs/installation/). |
| 181 | + |
| 182 | +If `cert-manager` CRDs are already present in your Kubernetes environment, you can configure the Operator to use them with the following modifications in the values file: |
| 183 | + |
| 184 | + |
| 185 | +```diff |
| 186 | +opentelemetry-operator: |
| 187 | +  manager: |
| 188 | +    extraArgs: |
| 189 | +      - --enable-go-instrumentation |
| 190 | +  admissionWebhooks: |
| 191 | +    certManager: |
| 192 | +-     enabled: false |
| 193 | ++     enabled: true |
| 194 | +

| 195 | +-   autoGenerateCert: |
| 196 | +-     enabled: true |
| 197 | +-     recreate: true |
| 198 | +``` |