Commit 128de0d

Merge branch 'cosmostation:develop' into develop
2 parents 3d1351f + e78d89a commit 128de0d

51 files changed, +3238 -114 lines changed

docker/grafana/provisioning/dashboards/validator/babylon_validator_fp_dashboard.json

Lines changed: 31 additions & 10 deletions
Large diffs are not rendered by default.

helm/.helmignore

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
# img folder
img/
# Changelog
CHANGELOG.md

helm/Chart.lock

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
dependencies:
- name: postgresql
  repository: oci://registry-1.docker.io/bitnamicharts
  version: 16.3.5
- name: common
  repository: oci://registry-1.docker.io/bitnamicharts
  version: 2.29.0
digest: sha256:8c7c60055f2b8834c742f2b389606d430018a7fd08449ce2c8c31b31a94f0ca4
generated: "2025-01-10T15:05:20.436906+03:30"

helm/Chart.yaml

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
apiVersion: v2
name: cvms
description: The Cosmos Validator Monitoring Service (CVMS) is an integrated monitoring system for validators within the Cosmos app chain ecosystem

# A chart can be either an 'application' or a 'library' chart.
#
# Application charts are a collection of templates that can be packaged into versioned archives
# to be deployed.
#
# Library charts provide useful utilities or functions for the chart developer. They're included as
# a dependency of application charts to inject those utilities and functions into the rendering
# pipeline. Library charts do not define any templates and therefore cannot be deployed.
type: application

# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.0

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "1.0.0"

dependencies:
  - condition: postgresql.enabled
    name: postgresql
    repository: oci://registry-1.docker.io/bitnamicharts
    version: 16.x.x
  - name: common
    repository: oci://registry-1.docker.io/bitnamicharts
    tags:
      - bitnami-common
    version: 2.x.x

helm/README.md

Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
# CVMS Helm Chart

The Cosmos Validator Monitoring Service (CVMS) is an integrated monitoring system for validators within the Cosmos app chain ecosystem. This Helm chart installs CVMS on Kubernetes.

## Prerequisites

- Kubernetes 1.19+
- Helm 3.0+

## Installation

To install the chart with the release name `my-release`:

```bash
helm install my-release cvms/helm
```

## Uninstallation

To uninstall/delete the `my-release` deployment:

```bash
helm uninstall my-release
```

## Configuration

The following table lists the configurable parameters of the CVMS chart and their default values.

**Note**: You must provide `cvmsConfig.monikers` and `cvmsConfig.chains` to monitor Cosmos validators with CVMS. See the example at https://github.com/cosmostation/cvms/blob/release/docs/setup.md

| Parameter                | Description                           | Default                        |
|--------------------------|---------------------------------------|--------------------------------|
| `imagePullSecrets` | Image pull secrets | `[]` |
| `nameOverride` | Override the name of the chart | `""` |
| `fullnameOverride` | Override the full name of the chart | `""` |
| `namespaceOverride` | Override the namespace of the chart | `""` |
| `customChainsConfig.enabled` | Enable custom chains configuration | `false` |
| `customChainsConfig.name` | Name of the custom chains configmap | `custom-chains-cm` |
| `cvmsConfig.name` | Name of the CVMS configmap | `cvms-cm` |
| `cvmsConfig.monikers` | CVMS config for validator or network mode | `[]` |
| `cvmsConfig.chains` | CVMS config for monitoring chains | `[]` |
| `indexer.replicaCount` | Number of replicas for the indexer | `1` |
| `indexer.image.repository` | Image repository for the indexer | `cosmostation/cvms` |
| `indexer.image.pullPolicy` | Image pull policy for the indexer | `Always` |
| `indexer.image.tag` | Image tag for the indexer | `latest` |
| `indexer.command` | Command to run in the indexer container | `["/bin/cvms"]` |
| `indexer.args` | Arguments for the indexer container | `["start", "indexer", "--config=/var/lib/cvms/config.yaml", "--log-color-disable", "false", "--log-level", "4", "--port=9300"]` |
| `indexer.env` | Environment variables for the indexer | `{ DB_RETENTION_PERIOD: 1h }` |
| `indexer.revisionHistoryLimit` | Number of old ReplicaSets to retain | `2` |
| `indexer.podAnnotations` | Annotations for the indexer pods | `{}` |
| `indexer.podLabels` | Labels for the indexer pods | `{}` |
| `indexer.podSecurityContext.enabled` | Enable pod security context for the indexer | `false` |
| `indexer.securityContext.enabled` | Enable security context for the indexer | `true` |
| `indexer.service.type` | Service type for the indexer | `ClusterIP` |
| `indexer.service.port` | Service port for the indexer | `9300` |
| `indexer.resources.enabled` | Enable resource requests and limits for the indexer | `true` |
| `indexer.livenessProbe.enabled` | Enable liveness probe for the indexer | `true` |
| `indexer.readinessProbe.enabled` | Enable readiness probe for the indexer | `true` |
| `indexer.volumes` | Additional volumes for the indexer | See `values.yaml` |
| `indexer.volumeMounts` | Additional volume mounts for the indexer | See `values.yaml` |
| `indexer.nodeSelector` | Node selector for the indexer pods | `{}` |
| `indexer.tolerations` | Tolerations for the indexer pods | `[]` |
| `indexer.affinity` | Affinity rules for the indexer pods | `{}` |
| `indexer.metrics.type` | Metrics service type for the indexer | `ClusterIP` |
| `indexer.metrics.servicePort` | Metrics service port for the indexer | `9300` |
| `indexer.serviceMonitor.enabled` | Enable Prometheus ServiceMonitor for the indexer | `false` |
| `exporter.replicaCount` | Number of replicas for the exporter | `1` |
| `exporter.image.repository` | Image repository for the exporter | `cosmostation/cvms` |
| `exporter.image.pullPolicy` | Image pull policy for the exporter | `Always` |
| `exporter.image.tag` | Image tag for the exporter | `latest` |
| `exporter.command` | Command to run in the exporter container | `["/bin/cvms"]` |
| `exporter.args` | Arguments for the exporter container | `["start", "exporter", "--config=/var/lib/cvms/config.yaml", "--log-color-disable", "false", "--log-level", "4", "--port=9200"]` |
| `exporter.env` | Environment variables for the exporter | `{}` |
| `exporter.revisionHistoryLimit` | Number of old ReplicaSets to retain | `2` |
| `exporter.podAnnotations` | Annotations for the exporter pods | `{}` |
| `exporter.podLabels` | Labels for the exporter pods | `{}` |
| `exporter.podSecurityContext.enabled` | Enable pod security context for the exporter | `false` |
| `exporter.securityContext.enabled` | Enable security context for the exporter | `true` |
| `exporter.service.type` | Service type for the exporter | `ClusterIP` |
| `exporter.service.port` | Service port for the exporter | `9200` |
| `exporter.resources.enabled` | Enable resource requests and limits for the exporter | `true` |
| `exporter.livenessProbe.enabled` | Enable liveness probe for the exporter | `true` |
| `exporter.readinessProbe.enabled` | Enable readiness probe for the exporter | `true` |
| `exporter.volumes` | Additional volumes for the exporter | See `values.yaml` |
| `exporter.volumeMounts` | Additional volume mounts for the exporter | See `values.yaml` |
| `exporter.nodeSelector` | Node selector for the exporter pods | `{}` |
| `exporter.tolerations` | Tolerations for the exporter pods | `[]` |
| `exporter.affinity` | Affinity rules for the exporter pods | `{}` |
| `exporter.metrics.type` | Metrics service type for the exporter | `ClusterIP` |
| `exporter.metrics.servicePort` | Metrics service port for the exporter | `9200` |
| `exporter.serviceMonitor.enabled` | Enable Prometheus ServiceMonitor for the exporter | `false` |
| `postgresql.enabled` | Enable PostgreSQL | `true` |
| `postgresql.auth.username` | PostgreSQL username | `cvms` |
| `postgresql.auth.password` | PostgreSQL password | `mysecretpassword` |
| `postgresql.auth.database` | PostgreSQL database | `cvms` |
| `postgresql.architecture` | PostgreSQL architecture | `standalone` |
| `postgresql.primary.resourcesPreset` | PostgreSQL primary resources preset | `nano` |
| `postgresql.primary.resources` | PostgreSQL primary resources | `{}` |
| `postgresql.primary.persistence.storageClass` | PostgreSQL primary storage class | `""` |
| `postgresql.primary.persistence.size` | PostgreSQL primary storage size | `500Mi` |
| `postgresql.external.host` | External PostgreSQL host | `""` |
| `postgresql.external.port` | External PostgreSQL port | `5432` |
| `postgresql.external.user` | External PostgreSQL user | `cvms` |
| `postgresql.external.password` | External PostgreSQL password | `""` |
| `postgresql.external.database` | External PostgreSQL database | `cvms` |

Specify each parameter using the `--set key=value[,key=value]` argument to `helm install`. For example:

```bash
helm install my-release /path/to/cvms --set indexer.image.tag=1.0.0 --set exporter.image.tag=1.0.0
```

Alternatively, a YAML file that specifies the values for the parameters can be provided while installing the chart. For example:

```bash
helm install my-release /path/to/cvms -f values.yaml
```
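
As a rough sketch, a values file passed with `-f` might override a handful of the parameters from the table above. Every key below appears in that table. The tag `1.0.0` is reused from the `--set` example, while the password and storage size are placeholder values rather than recommendations, and the `cvmsConfig` lists are left empty because their entry format is defined in the linked setup guide, not here.

```yaml
# values.yaml (illustrative overrides only; unlisted parameters keep their chart defaults)
cvmsConfig:
  monikers: []            # required: fill in as described in the linked setup guide
  chains: []              # required: fill in as described in the linked setup guide
indexer:
  image:
    tag: "1.0.0"          # same tag as the --set example
  serviceMonitor:
    enabled: true         # assumes the Prometheus Operator CRDs are installed in the cluster
exporter:
  image:
    tag: "1.0.0"
  serviceMonitor:
    enabled: true
postgresql:
  auth:
    password: change-me   # placeholder; replaces the default `mysecretpassword`
  primary:
    persistence:
      size: 2Gi           # placeholder; the chart default is 500Mi
```

Keeping overrides in a file like this makes them easier to review and reuse across upgrades than a long chain of `--set` flags.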

### Detailed Configuration

The values below can be set in the `values.yaml` file. Each key is set per component, nested under `indexer` or `exporter` as in the parameter table (for example, `indexer.image.tag`):

- `image.repository`: The Docker image repository for the CVMS application.
- `image.tag`: The tag of the Docker image to use.
- `image.pullPolicy`: The Kubernetes image pull policy.
- `service.type`: The type of Kubernetes service to create (e.g., `ClusterIP`, `NodePort`, `LoadBalancer`).
- `service.port`: The port on which the service will be exposed.
- `resources`: Resource requests and limits for the CVMS pods.
- `nodeSelector`: Node labels for pod assignment.
- `tolerations`: Tolerations for pod assignment.
- `affinity`: Affinity rules for pod assignment.
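
To illustrate how these keys nest under a component such as `exporter`, the sketch below sets the exporter's service and scheduling values and points the chart at an external PostgreSQL instance. The key names come from the parameter table. The host name and password are hypothetical placeholders, and it is an assumption that setting `postgresql.enabled: false` is what makes the chart read the `postgresql.external.*` values; check the chart templates before relying on it.

```yaml
# Illustrative values override; host and password are hypothetical placeholders.
exporter:
  service:
    type: ClusterIP
    port: 9200
  nodeSelector:
    kubernetes.io/os: linux   # standard node label, used here only as an example
  tolerations: []
  affinity: {}
postgresql:
  enabled: false              # assumption: skips the bundled Bitnami PostgreSQL subchart
  external:
    host: postgres.example.internal
    port: 5432
    user: cvms
    password: change-me
    database: cvms
```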

helm/files/axelar-evm.yaml

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
groups:
  - name: AxelarEVMPackage
    rules:
      - alert: AxelarEVMChainMaintainerInactive
        expr: cvms_axelar_evm_maintainer_status == 0
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: 'Axelar EVM Maintainer for {{ $labels.evm_chain }} is inactive in {{ $labels.chain_id }}'

helm/files/balance.yaml

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
groups:
  - name: BalancePackage
    rules:
      - alert: AxelarEVMBroadcasterBalanceUnder10
        expr: cvms_balance_remaining_amount{balance_address='axelar146kdz9stlycvacm03hg0t5fxq6jszlc4gtxgpr'} < 10
        for: 5m
        labels:
          severity: info
        annotations:
          summary: 'The broadcaster ({{ $labels.balance_address }}) has less than 10 tokens remaining. Current balance: {{ $value }}'

      - alert: KavaOracleBroadcasterBalanceUnder10
        expr: cvms_balance_remaining_amount{balance_address='kava1ujfrlcd0ted58mzplnyxzklsw0sqevlgxndanp'} < 10
        for: 5m
        labels:
          severity: info
        annotations:
          summary: 'The broadcaster ({{ $labels.balance_address }}) has less than 10 tokens remaining. Current balance: {{ $value }}'

      - alert: InjectiveEventnonceBroadcasterBalanceUnder1
        expr: cvms_balance_remaining_amount{balance_address='inj1mtxhcchfyvvs6u4nmnylgkxvkrax7c2la69l8w'} < 1
        for: 5m
        labels:
          severity: info
        annotations:
          summary: 'The broadcaster ({{ $labels.balance_address }}) has less than 1 token remaining. Current balance: {{ $value }}'

      - alert: NibiruOracleBroadcasterBalanceUnder10
        expr: cvms_balance_remaining_amount{balance_address='nibi14zc23q3qcewscy7wnt3s95h32chytenqxe633l'} < 10
        for: 5m
        labels:
          severity: info
        annotations:
          summary: 'The broadcaster ({{ $labels.balance_address }}) has less than 10 tokens remaining. Current balance: {{ $value }}'

helm/files/block.yaml

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
groups:
  - name: BlockPackage
    rules:
      - alert: LastestBlockTimeDiffOver60s
        expr: (time() - cvms_block_timestamp) > 60
        for: 30s
        labels:
          severity: warning
        annotations:
          summary: 'Latest Block Timestamp is over 60 seconds from now'
          description: |
            The block timestamp for the {{ $labels.chain }} chain at {{ $labels.endpoint }} has exceeded 60 seconds. Please check the synchronization status of the node.

      - alert: LastestBlockTimeDiffOver300s
        expr: (time() - cvms_block_timestamp) > 300
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: 'The {{ $labels.chain }}-{{ $labels.chain_id }} latest block timestamp has exceeded 5 minutes'
          description: |
            The block timestamp at {{ $labels.endpoint }} has exceeded 5 minutes. Please check the node's synchronization status immediately.

helm/files/consensus-vote.yaml

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
groups:
  - name: VoteIndexerPackage
    rules:
      - alert: VoteIndexerSyncSlow
        expr: (time() - cvms_consensus_vote_latest_index_pointer_block_timestamp) > 300
        for: 30s
        labels:
          severity: warning
        annotations:
          summary: The Vote Indexer Package sync is slow to sync blocks for {{ $labels.chain_id }}

      - alert: IncreaseRecentConsensusVoteMiss
        expr: increase(cvms_consensus_vote_recent_miss_counter[1m]) > 15
        for: 30s
        labels:
          severity: critical
        annotations:
          summary: The validator is missing too many votes in the consensus for {{ $labels.chain_id }}

helm/files/custom_chains.yaml.example

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
---
mintstation-1:
  protocol_type: cosmos
  support_asset:
    denom: umint
    decimal: 6
  packages:
    - block
    - upgrade
    - uptime
    - voteindexer # for cometbft consensus vote
    - veindexer # for vote-extension
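
A chain definition like the one above is presumably supplied to the chart through the `customChainsConfig` parameters listed in the README table. A minimal sketch of the matching values override follows, assuming that enabling the flag makes the chart use the named ConfigMap; how that ConfigMap is populated from this example file is not visible in this part of the diff.

```yaml
# Illustrative values override; key names and the default name come from the README table.
customChainsConfig:
  enabled: true            # chart default is false
  name: custom-chains-cm   # chart default ConfigMap name
```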

helm/files/eventnonce.yaml

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
groups:
  - name: EventNoncePackage
    rules:
      - alert: CosmosEventNonceDiffOver0
        expr: (cvms_eventnonce_highest_nonce - on (chain_id) group_right cvms_eventnonce_nonce) > 0
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: 'Validator node event nonce is behind in {{ $labels.chain }}'
          description: '{{ $labels.chain }} has an event nonce that is {{ $value }} behind other validators.'

      - alert: CosmosEventNonceDiffOver1h
        expr: (cvms_eventnonce_highest_nonce - on (chain_id) group_right cvms_eventnonce_nonce) > 0
        for: 1h
        labels:
          severity: critical
        annotations:
          summary: 'Validator node event nonce is behind in {{ $labels.chain }} during 1h'
          description: |
            The event nonce for {{ $labels.chain }} is more than {{ $value }} behind other validators. Immediate action is required.

helm/files/extension-vote.yaml

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
groups:
  - name: VoteExtensionIndexerPackage
    rules:
      - alert: VoteExtensionIndexerSyncSlow
        expr: (time() - cvms_extension_vote_latest_index_pointer_block_timestamp) > 300
        for: 30s
        labels:
          severity: warning
        annotations:
          summary: The Vote Extension Indexer Package sync is slow to sync blocks for {{ $labels.chain_id }}

      - alert: IncreaseRecentExtensionVoteMiss
        expr: increase(cvms_extension_vote_recent_miss_counter[1m]) > 15
        for: 30s
        labels:
          severity: critical
        annotations:
          summary: The validator is missing too many votes in the extension voting for {{ $labels.chain_id }}

helm/files/oracle.yaml

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
groups:
  - name: OraclePackage
    rules:
      - alert: IncreasingMissCounterOver30%During1h
        expr: (delta(cvms_oracle_miss_counter[1h]) / on (chain_id) group_left delta(cvms_oracle_block_height[1h])) * on (chain_id) group_left cvms_oracle_vote_period > 0.30
        for: 30m
        labels:
          severity: critical
        annotations:
          summary: The Oracle Miss Counter for the network {{ $labels.chain }}-{{ $labels.chain_id }} is increasing by over 30% during the past hour

      - alert: OracleUptimeUnder50%
        expr: |
          (
            (cvms_oracle_vote_window - on (chain_id) group_right () cvms_oracle_miss_counter)
            / on (chain_id) group_left ()
            cvms_oracle_vote_window
          ) < 0.5
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: The Validator's Oracle vote rate is too low for the {{ $labels.chain }}-{{ $labels.chain_id }} network
          description: |
            Oracle vote rate has dropped below 50%, indicating severe issues with the validator's participation.
            Current vote rate: {{ $value | humanizePercentage }}