docs/configuration/configmap.md
+8 -8 (8 additions & 8 deletions)
@@ -6,11 +6,11 @@ When deploying AMD Device Metrics Exporter on Kubernetes, a `ConfigMap` is deplo
 - `ServerPort`: this field is ignored when Device Metrics Exporter is deployed by the [GPU Operator](https://instinct.docs.amd.com/projects/gpu-operator/en/latest/) to avoid conflicts with the service node port config.
 - `GPUConfig`:
-  - Fields: An array of strings specifying what metrics field to be exported.
-  - Labels: `SERIAL_NUMBER`, `GPU_ID`, `POD`, `NAMESPACE`, `CONTAINER`, `JOB_ID`, `JOB_USER`, `JOB_PARTITION`, `CARD_MODEL`, `HOSTNAME`, `GPU_PARTITION_ID`, `GPU_COMPUTE_PARTITION_TYPE`, and `GPU_MEMORY_PARTITION_TYPE` are always set and cannot be removed. Labels supported are available in the provided example `configmap.yml`.
-  - CustomLabels: A map of user-defined labels and their values. Users can set up to 10 custom labels. From the `GPUMetricLabel` list, only `CLUSTER_NAME` is allowed to be set in `CustomLabels`. Any other labels from this list cannot be set. Users can define other custom labels outside of this restriction. These labels will be exported with every metric, ensuring consistent metadata across all metrics.
-  - ExtraPodLabels: This defines a map that links Prometheus label names to Kubernetes pod labels. Each key is the Prometheus label that will be exposed in metrics, and the value is the pod label to pull the data from. This lets you expose pod metadata as Prometheus labels for easier filtering and querying.<br>(e.g. Considering an entry like `"WORKLOAD_ID" : "amd-workload-id"`, where `WORKLOAD_ID` is a label visible in metrics and its value is the pod label value of a pod label key set as `amd-workload-id`).
-  - ProfilerMetrics: A map of toggle to enable Profiler Metrics either for `all` nodes or a specific hostname with desired state. Key with specific hostname `$HOSTNAME` takes precedence over a `all` key.
+  - `Fields`: An array of strings specifying what metrics field to be exported.
+  - `Labels`: `SERIAL_NUMBER`, `GPU_ID`, `POD`, `NAMESPACE`, `CONTAINER`, `JOB_ID`, `JOB_USER`, `JOB_PARTITION`, `CARD_MODEL`, `HOSTNAME`, `GPU_PARTITION_ID`, `GPU_COMPUTE_PARTITION_TYPE`, and `GPU_MEMORY_PARTITION_TYPE` are always set and cannot be removed. Labels supported are available in the provided example `configmap.yml`.
+  - `CustomLabels`: A map of user-defined labels and their values. Users can set up to 10 custom labels. From the `GPUMetricLabel` list, only `CLUSTER_NAME` is allowed to be set in `CustomLabels`. Any other labels from this list cannot be set. Users can define other custom labels outside of this restriction. These labels will be exported with every metric, ensuring consistent metadata across all metrics.
+  - `ExtraPodLabels`: This defines a map that links Prometheus label names to Kubernetes pod labels. Each key is the Prometheus label that will be exposed in metrics, and the value is the pod label to pull the data from. This lets you expose pod metadata as Prometheus labels for easier filtering and querying.<br>(e.g. Considering an entry like `"WORKLOAD_ID" : "amd-workload-id"`, where `WORKLOAD_ID` is a label visible in metrics and its value is the pod label value of a pod label key set as `amd-workload-id`).
+  - `ProfilerMetrics`: A map of toggle to enable Profiler Metrics either for `all` nodes or a specific hostname with desired state. Key with specific hostname `$HOSTNAME` takes precedence over a `all` key. This only controls the Profiler Metrics which has prefix of GPU_PROF_ from the metrics list.
 - `CommonConfig`:
   - `MetricsFieldPrefix`: Add prefix string for all the fields exporter. [Prometheus Metric Label formatted](https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels) string prefix will be accepted, on any invalid prefix will default to empty prefix to allow exporting of the fields.
   - `HealthService` : Health Service configurations for the exporter.
@@ -20,12 +20,12 @@ When deploying AMD Device Metrics Exporter on Kubernetes, a `ConfigMap` is deplo
 To use a custom configuration when deploying the Metrics Exporter:
-1. Create a `ConfigMap` based on the provided example [configmap.yml](https://github.com/ROCm/device-metrics-exporter/blob/main/example/configmap.yaml)
+1. Create a `ConfigMap` based on the provided example [configmap.yml](../examples/configmap.yml) file.
 2. Change the `configMap` property in `values.yaml` to `configmap.yml`
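
To make the `GPUConfig` rules in the hunk above easier to check by hand, here is a minimal Python sketch of how `CustomLabels`, `ExtraPodLabels`, and `ProfilerMetrics` could be interpreted. It is not the exporter's code. The function names, the `RESERVED_LABELS` stand-in for the `GPUMetricLabel` list, the choice to count every key toward the limit of 10, the choice to skip pod labels that are missing, and the `default` used when no `ProfilerMetrics` key matches are all assumptions made for illustration.

```python
# Illustrative sketch only: names and defaults are assumptions,
# not taken from the exporter's source.

# Assumed stand-in for the GPUMetricLabel list: the mandatory labels named
# in the docs plus CLUSTER_NAME. The real list may contain more entries.
RESERVED_LABELS = {
    "SERIAL_NUMBER", "GPU_ID", "POD", "NAMESPACE", "CONTAINER", "JOB_ID",
    "JOB_USER", "JOB_PARTITION", "CARD_MODEL", "HOSTNAME", "GPU_PARTITION_ID",
    "GPU_COMPUTE_PARTITION_TYPE", "GPU_MEMORY_PARTITION_TYPE", "CLUSTER_NAME",
}
MAX_CUSTOM_LABELS = 10  # "Users can set up to 10 custom labels"


def validate_custom_labels(custom_labels):
    """Return a list of problems found in a CustomLabels map."""
    problems = []
    # Assumption: every key, including CLUSTER_NAME, counts toward the limit.
    if len(custom_labels) > MAX_CUSTOM_LABELS:
        problems.append(
            f"too many custom labels: {len(custom_labels)} > {MAX_CUSTOM_LABELS}"
        )
    for name in custom_labels:
        # Only CLUSTER_NAME from the reserved list may be set by the user.
        if name in RESERVED_LABELS and name != "CLUSTER_NAME":
            problems.append(f"reserved label cannot be customized: {name}")
    return problems


def resolve_extra_pod_labels(extra_pod_labels, pod_labels):
    """Map each Prometheus label name to the value of its source pod label.

    Assumption: entries whose pod label is absent are skipped.
    """
    return {
        prom_label: pod_labels[pod_key]
        for prom_label, pod_key in extra_pod_labels.items()
        if pod_key in pod_labels
    }


def profiler_enabled(profiler_metrics, hostname, default=False):
    """A hostname-specific key takes precedence over the `all` key."""
    if hostname in profiler_metrics:
        return profiler_metrics[hostname]
    return profiler_metrics.get("all", default)
```

For example, `resolve_extra_pod_labels({"WORKLOAD_ID": "amd-workload-id"}, pod_labels)` reads the pod's `amd-workload-id` label and exposes its value under `WORKLOAD_ID`, matching the entry described in the `ExtraPodLabels` text.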
docs/configuration/docker.md
+2 -2 (2 additions & 2 deletions)
@@ -2,7 +2,7 @@
 To use a custom configuration with the AMD Device Metrics Exporter container:
-1. Create a config file based on the provided example [config.json](https://raw.githubusercontent.com/ROCm/device-metrics-exporter/refs/heads/main/example/config.json)
+1. Create a config file based on the provided example [config.json](../../example/config.json)
 2. Save `config.json` in the `config/` folder
 3. Mount the `config/` folder when starting the container:
@@ -13,7 +13,7 @@ docker run -d \
   -p 5000:5000 \
   -v ./config:/etc/metrics \
   --name device-metrics-exporter \
-  rocm/device-metrics-exporter:v1.3.1
+  rocm/device-metrics-exporter:v1.4.0
 ```
 The exporter polls for configuration changes every minute, so updates take effect without container restarts.
docs/installation/deb-package.rst
+3 -4 (3 additions & 4 deletions)
@@ -53,13 +53,12 @@ Step 2: Install AMDGPU Driver
 .. note::
    For the most up-to-date information on installing dkms drivers please see the `ROCm Install Quick Start <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/quick-start.html>`_ page. The below instructions are the most current instructions as of ROCm 7.0.rc1.
-1. Download the driver from the Radeon repository (`repo.radeon.com <https://repo.radeon.com/amdgpu-install>`_) for your operating system. For example if you want to get the latest ROCm 7.0.rc1 drivers for Ubuntu 22.04 you would run the following command:
+1. Download the driver from the Radeon repository (`repo.radeon.com <https://repo.radeon.com/amdgpu-install>`_) for your operating system. For example if you want to get the latest ROCm 7.0.0 drivers for Ubuntu 22.04 you would run the following command:
Please note that the above url will be different depending on what version of the drivers you will be installing and type of Operating System you are using.