[Chargeback] Alerting rule #16228
Closed
Commits
50 commits
de717b8 WIP: early chargeback code for review (JohannesMahne)
14941da Working config integration - 0.0.2 (JohannesMahne)
5343625 Version 0.0.3: working from Stack monitoring data (JohannesMahne)
fb96f41 Fixed query for one visualisation (JohannesMahne)
c776186 Update instructions (JohannesMahne)
5fa139c Working with the correct alias (JohannesMahne)
4a643d7 Changes to transforms (JohannesMahne)
fac07bc Bug fix: Fix sorting on visualisation. (JohannesMahne)
19a902b Update setup instructions (JohannesMahne)
668a7cf 0.1.0: Adding ECU value (normalised cost). (JohannesMahne)
0482b9a Bug: Aligned fields returned to field names used in visualisation (JohannesMahne)
5b0607e Fixing bug: aligning esql returned field names with field names used … (JohannesMahne)
aace7b6 move to packages (sholzhauer-es)
aad3921 not starting transforms on integration installation (sholzhauer-es)
224b6ca Update version number (JohannesMahne)
9656a03 Made sure the colour palette is predictable by using the eui_amsterda… (JohannesMahne)
c271271 Update sequence and comments on pre-setup to promote ES integration (JohannesMahne)
5155220 Consistent naming of datastream. Add LIMIT 5000 to ESQL top query to … (JohannesMahne)
1ca257f Add correct code owner (JohannesMahne)
5b0e668 Delete wrong test files (JohannesMahne)
d3048ef Updated the directory structure to remove superfluous directory (JohannesMahne)
79508a5 Rem reference to sample logs and logos (JohannesMahne)
4e471f0 Switch off dynamic mappings for the results of the transforms - we kn… (JohannesMahne)
e5efd8f Removed agent folders in data stream, as it is not used. (JohannesMahne)
18b0b0c Updated the readme file to refer to integration, rather than module. … (JohannesMahne)
3101319 Re-add image (JohannesMahne)
4dfb302 Formatting (JohannesMahne)
2a6ecdb NOT WORKING: settings index.mode: lookup is not supported (JohannesMahne)
ae4263d Fixing the control error in the dashboard by adding a data view. (JohannesMahne)
55d66f2 Updated to push back usage data transform to ES Integration (JohannesMahne)
db0b2b7 Updated readme (JohannesMahne)
5770da3 Update transfrom version numbers (JohannesMahne)
b2ff2df Swap the use of deployment_id or deployment name to a concatenation o… (JohannesMahne)
c2dfd5c Make use of the new elastic-package version, which will create the lo… (JohannesMahne)
2f6c0e0 Update version number (JohannesMahne)
2765eaf Updated pre-setup, and version number (JohannesMahne)
3f5d617 Adding casting to double for division to avoid null instead of very s… (JohannesMahne)
5a442a8 Update version (JohannesMahne)
d70f74e Allowing for setting converion rate per time window (sholzhauer-es)
42d5b32 fixing pipeline versions (sholzhauer-es)
57f1008 adding pipeline stuff (sholzhauer-es)
e7a85df correcting version (sholzhauer-es)
a350469 [Chargeback] Dashboard control and Dataview (#16153) (sholzhauer-es)
8ba706f SKU based chargeback (#16182) (sholzhauer-es)
325bdaa Chargeback Integration: Extract deployment group from Billing tags (#… (JohannesMahne)
d694959 Fixing bug introduced in 0.2.4 (#16192) (sholzhauer-es)
6361fd2 Add observability alerts for chargeback integration (#16205) (JohannesMahne)
98bf328 Fix mustache template escaping in alert actions documentation (JohannesMahne)
a876616 Add alerting rule templates and enable auto-start for all transforms (JohannesMahne)
3fd3ce6 Fix: Revert transform frequencies back to 60m (JohannesMahne)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| dependencies: | ||
| ecs: | ||
| reference: [email protected] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,183 @@ | ||
| # Chargeback | ||
|
|
||
| _Technical preview: This integration is being developed by Elastic's Customer Engineering team. Please report any issues to the Elastician who shared this integration with you._ | ||
|
|
||
| The Chargeback integration provides FinOps visibility into Elastic usage across tenants. By integrating data from the [**Elasticsearch Service Billing**](https://www.elastic.co/docs/reference/integrations/ess_billing/) and [**Elasticsearch**](https://www.elastic.co/docs/reference/integrations/elasticsearch/) integrations, it enables the determination of value provided by each deployment, data stream, and tier across the organisation. This allows Centre of Excellence (CoE) teams to accurately allocate costs back to the appropriate tenant. | ||
|
|
||
| The integration creates several transforms that aggregate billing and usage data into lookup indices optimized for cost analysis and chargeback reporting. | ||
|
|
||
| ## What is FinOps? | ||
|
|
||
| FinOps is an operational framework and cultural practice aimed at maximizing the business value of cloud usage. It facilitates timely, data-driven decision-making and promotes financial accountability through collaboration among engineering, finance, and business teams. | ||
|
|
||
| ## Purpose | ||
|
|
||
| The Chargeback integration assists organisations in addressing a crucial question: | ||
|
|
||
| > **"How is my organisation consuming the Elastic solution, and to which tenants can I allocate these costs?"** | ||
|
|
||
| The integration provides a breakdown of Elastic Consumption Units (ECUs) per: | ||
|
|
||
| - Deployment | ||
| - Data tier | ||
| - Data stream | ||
| - Day | ||
|
|
||
| Currently, Chargeback calculations consider only Elasticsearch data nodes. Contributions from other assets, like Kibana or ML nodes, are assumed to be shared proportionally among tenants. To incorporate indexing, querying, and storage in a weighted manner, a blended value is created using the following default weights (modifiable): | ||
| - Indexing: `20` (applicable only to the hot tier) | ||
| - Querying: `20` | ||
| - Storage: `40` | ||
|
|
||
| This default weighting means storage contributes most to the blended cost calculation, with indexing considered only on the hot tier. Adjust these weights based on your organisation's needs and best judgment. | ||
|
|
||
| Chargeback costs are presented based on a configured rate and unit. These are used to display cost in your local currency, for instance `EUR`, with a rate of `0.85` per ECU. | ||
|
|
||
| ## Configuration | ||
|
|
||
| Configuration values are stored in the `chargeback_conf_lookup` index. The dashboard automatically applies the correct configuration based on the billing date falling within the `conf_start_date` and `conf_end_date` range. | ||
|
|
||
| ### Update the default configuration: | ||
|
|
||
| Using `_update/config` updates the document with ID `config`: | ||
|
|
||
| ``` | ||
| POST chargeback_conf_lookup/_update/config | ||
| { | ||
| "doc": { | ||
| "conf_ecu_rate": 0.85, | ||
| "conf_ecu_rate_unit": "EUR", | ||
| "conf_indexing_weight": 20, | ||
| "conf_query_weight": 20, | ||
| "conf_storage_weight": 40, | ||
| "conf_start_date": "2024-01-01T00:00:00.000Z", | ||
| "conf_end_date": "2024-12-31T23:59:59.999Z" | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| ### Add a new configuration period (for time-based rate changes): | ||
|
|
||
| Using `_doc` creates a new document with an auto-generated ID: | ||
|
|
||
| ``` | ||
| POST chargeback_conf_lookup/_doc | ||
| { | ||
| "conf_ecu_rate": 0.95, | ||
| "conf_ecu_rate_unit": "EUR", | ||
| "conf_indexing_weight": 20, | ||
| "conf_query_weight": 20, | ||
| "conf_storage_weight": 40, | ||
| "conf_start_date": "2025-01-01T00:00:00.000Z", | ||
| "conf_end_date": "2025-12-31T23:59:59.999Z" | ||
| } | ||
| ``` | ||
|
|
||
| This allows you to have different rates for different time periods (e.g., quarterly or annual rate changes). | ||
|
|
||
| **Configuration Options:** | ||
| - `conf_ecu_rate`: The monetary value per ECU (e.g., 0.85) | ||
| - `conf_ecu_rate_unit`: The currency code (e.g., "EUR", "USD", "GBP") | ||
| - `conf_indexing_weight`: Weight for indexing operations (default: 20, only applies to hot tier) | ||
| - `conf_query_weight`: Weight for query operations (default: 20) | ||
| - `conf_storage_weight`: Weight for storage (default: 40) | ||
| - `conf_start_date`: Start date/time for the configuration period (ISO 8601 format) | ||
| - `conf_end_date`: End date/time for the configuration period (ISO 8601 format) | ||
|
|
||
| ## Data and Transforms | ||
|
|
||
| The integration creates the following transforms to aggregate cost and usage data: | ||
|
|
||
| 1. **billing_cluster_cost** - Aggregates daily ECU usage per deployment from ESS Billing data, with support for deployment groups via `chargeback_group` tags | ||
| 2. **cluster_deployment_contribution** - Calculates per-deployment usage metrics (indexing time, query time, storage) from Elasticsearch monitoring data | ||
| 3. **cluster_datastream_contribution** - Aggregates usage per data stream for detailed cost attribution | ||
| 4. **cluster_tier_contribution** - Aggregates usage per data tier (hot, warm, cold, frozen) | ||
| 5. **cluster_tier_and_ds_contribution** - Combined view of usage by both tier and data stream | ||
|
|
||
| These transforms produce lookup indices that are queried by the dashboard using ES|QL LOOKUP JOINs to correlate billing costs with actual usage patterns. | ||
|
|
||
| ### Transform Auto-Start | ||
|
|
||
| All Chargeback transforms start automatically when the integration is installed. No manual intervention is required to start the transforms. | ||
|
|
||
| **Performance Note:** On clusters with months of historical monitoring data for multiple deployments, the initial transform execution may process a large volume of data. This can cause temporary performance impact during the first run. The transforms will then run incrementally on their configured schedules (15-60 minute intervals), processing only new data with minimal overhead. | ||
|
|
||
| You can verify the transforms are running by navigating to **Stack Management → Transforms** and filtering for `chargeback`. | ||
|
|
||
| ### Transform Health Monitoring | ||
|
|
||
| The integration includes a **Transform Health Monitoring** alert rule template that can be installed from the integration page. This rule monitors all Chargeback transforms and alerts when they encounter issues or failures, providing proactive notification of any problems with data processing. | ||
|
|
||
| ## Dashboard | ||
|
|
||
| Chargeback data can be viewed in the `[Chargeback] Cost and Consumption breakdown` dashboard, which provides: | ||
|
|
||
| - Cost breakdown by deployment, data tier, and data stream | ||
| - Time-series cost trends | ||
| - Deployment group filtering for team/project-based analysis | ||
| - Blended cost metrics combining indexing, querying, and storage usage | ||
| - ECU consumption vs. monetary cost comparison | ||
|
|
||
|  | ||
|
|
||
| ## Deployment Groups | ||
|
|
||
| The integration supports organizing deployments into logical groups using the `chargeback_group` tag on ESS Billing deployments. This enables cost allocation and filtering by teams, projects, or any organizational structure. | ||
|
|
||
| To assign a deployment to a chargeback group, add a tag to your deployment in the Elastic Cloud console in the format: | ||
| ``` | ||
| chargeback_group:<group-name> | ||
| ``` | ||
|
|
||
| For example: `chargeback_group:team-search` or `chargeback_group:project-analytics` | ||
|
|
||
| The `billing_cluster_cost` transform automatically extracts these tags from the `deployment_tags` field in ESS Billing data using runtime mappings. The dashboard includes a deployment group filter to view costs by specific groups, making it easy to track expenses per team or project. | ||
|
|
||
| **Note:** Each deployment should have only one `chargeback_group` tag. Having multiple tags can cause issues and lead to unpredictable cost allocation. | ||
|
|
||
| ## Observability Alerting | ||
|
|
||
| This integration includes 3 pre-configured alert rule templates that can be installed directly from the integration page in Kibana: | ||
|
|
||
| 1. **Transform Health Monitoring** - Monitors the health of all Chargeback transforms and alerts when they encounter issues or failures | ||
| 2. **New Chargeback Group Detected** - Notifies when a new `chargeback_group` tag is added to a deployment | ||
| 3. **Deployment with Chargeback Group Missing Usage Data** - Detects when a deployment has a chargeback group assigned but is not sending usage/consumption data | ||
|
|
||
| **Important:** For alert rules 2 and 3, ensure that the Chargeback transforms are running before setting them up. These alerting rules query the lookup indices created by the transforms (`billing_cluster_cost_lookup`, `cluster_deployment_contribution_lookup`, etc.). If the transforms are not started, the alerts will not function correctly. | ||
|
|
||
| ### Alert actions | ||
|
|
||
| **Configure an action** with the following message template appended to the default content (keep the new lines, as it helps with legibility): | ||
|
|
||
| ``` | ||
| Details: | ||
|
|
||
| {{`{{#context.hits}}`}} | ||
| • {{`{{_source}}`}} | ||
|
|
||
| {{`{{/context.hits}}`}} | ||
|
|
||
| Total: {{`{{context.hits.length}}`}} | ||
| ``` | ||
|
|
||
| ## Requirements | ||
|
|
||
| To use this integration, the following prerequisites must be met: | ||
|
|
||
| **Monitoring Cluster:** | ||
| - Must be on Elasticsearch version **9.2.0+** due to the use of smart [ES|QL LOOKUP JOIN](https://www.elastic.co/docs/reference/query-languages/esql/esql-lookup-join) (conditional joins) in transforms and dashboard queries | ||
| - This is where the Chargeback integration should be installed | ||
|
|
||
| **Required Integrations:** | ||
| - [**Elasticsearch Service Billing**](https://www.elastic.co/docs/reference/integrations/ess_billing/) integration (v1.4.1+) must be installed and collecting billing data from your Elastic Cloud organization | ||
| - [**Elasticsearch**](https://www.elastic.co/docs/reference/integrations/elasticsearch/) integration (v1.16.0+) must be installed and collecting [usage data](https://www.elastic.co/docs/reference/integrations/elasticsearch/#indices-and-data-streams-usage-analysis) from all deployments you want to include in chargeback calculations | ||
|
|
||
| **Required Transforms:** | ||
| - The transform `logs-elasticsearch.index_pivot-default-{VERSION}` (from the Elasticsearch integration) must be running to aggregate usage metrics per index | ||
|
|
||
| **Data Flow:** | ||
| 1. ESS Billing data is collected into `metrics-ess_billing.billing-*` | ||
| 2. Elasticsearch usage data is collected into `metrics-elasticsearch.stack_monitoring.*` (or `monitoring-indices` for Stack Monitoring) | ||
| 3. Chargeback transforms process and correlate this data | ||
| 4. Dashboard queries the resulting lookup indices using ES|QL | ||
|
|
||
| **Note:** This integration must be installed on a centralized monitoring cluster that has visibility to both billing and usage data from your deployments. | ||
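Reviewer note on the weighting described under Purpose: the README gives the default weights (indexing `20`, querying `20`, storage `40`) but not the formula that combines them. The Python sketch below assumes the blended value is a weighted average of a tenant's usage shares, with the indexing weight dropped outside the hot tier. `blended_share` and its arguments are hypothetical names, and the integration's actual ES|QL may compute this differently.

```python
from typing import Dict, Optional

# Default weights as documented in the README.
DEFAULT_WEIGHTS: Dict[str, float] = {"indexing": 20, "query": 20, "storage": 40}


def blended_share(
    shares: Dict[str, float],
    tier: str,
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted average of a tenant's usage shares (each between 0 and 1).

    ASSUMPTION: the blended value is a weighted average, and the indexing
    weight is ignored on every tier except "hot".
    """
    active = dict(weights if weights is not None else DEFAULT_WEIGHTS)
    if tier != "hot":
        active["indexing"] = 0  # indexing only applies to the hot tier
    total = sum(active.values())
    if total == 0:
        return 0.0
    return sum(active[k] * shares.get(k, 0.0) for k in active) / total


if __name__ == "__main__":
    tenant = {"indexing": 0.5, "query": 0.25, "storage": 0.25}
    print(blended_share(tenant, "hot"))
    print(blended_share(tenant, "warm"))
```

Under this assumption, storage carries half of the total weight on the hot tier (40 of 80) and two thirds on the other tiers (40 of 60), which matches the README's statement that storage contributes most to the blended cost.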
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,86 @@ | ||
| # newer versions go on top | ||
| - version: 0.2.8 | ||
| changes: | ||
| - description: "Add Kibana alerting rule templates for transform health monitoring, detecting new chargeback groups, and identifying deployments with missing usage data. Templates can be installed directly from the package. All transforms now auto-start on installation." | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/11111 | ||
| - version: 0.2.7 | ||
| changes: | ||
| - description: "Add observability alerting rule templates and documentation for monitoring new chargeback groups and missing usage data. Update Elasticsearch version requirement to 9.2.0+ for smart lookup join support." | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/16205 | ||
| - version: 0.2.6 | ||
| changes: | ||
| - description: "Fixing bug around sku based cost allocation" | ||
| type: bugfix | ||
| link: https://github.com/elastic/integrations/pull/16192 | ||
| - version: 0.2.5 | ||
| changes: | ||
| - description: "Add deployment_group field extracted from ESS Billing deployment tags using runtime mappings to enable tag-based cost allocation and filtering. Fix transforms to use correct field type for elasticsearch.cluster.name." | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/16185 | ||
| - version: 0.2.4 | ||
| changes: | ||
| - description: "Adding sku and cost_type to the billing_cluster_cost_lookup for future utilization" | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/16182 | ||
| - version: 0.2.3 | ||
| changes: | ||
| - description: "Adding deployment filter, dataview and moving config portion to bottom of dashboard for better usability." | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/16153 | ||
| - version: 0.2.2 | ||
| changes: | ||
| - description: "Allow setting the Conversion Rate per time window in the configuration lookup index and add collapsible sections in the dashboard for better usability." | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/14545 | ||
| - version: 0.2.1 | ||
| changes: | ||
| - description: "Fixing the issue of visualisation not displaying values due to integer value division in ESQL. Changed to use `double` values instead." | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/14545 | ||
| - version: 0.2.0 | ||
| changes: | ||
| - description: "Make use of the new elastic-package version, which will create the lookup index automatically when installing the package. " | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/14545 | ||
| - version: 0.1.7 | ||
| changes: | ||
| - description: "Swap the use of deployment_id or deployment name to a concatenation of both, to make it easier to identify the deployment in the dashboard." | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/14545 | ||
| - version: 0.1.6 | ||
| changes: | ||
| - description: "Remove the use of usage alias, and stick to using `monitoring-indices` as usage source. The ES Integration transform should be run regardless of whether the ES integration has been installed on an agent or not. This fix will increase performance when relying on Stack Monitoring data. Also, use `metrics-ess_billing.billing-*` to be able to use not only the default namespace." | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/14545 | ||
| - version: 0.1.5 | ||
| changes: | ||
| - description: "Fixing the control error in the dashboard by adding a data view." | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/14545 | ||
| - version: 0.1.4 | ||
| changes: | ||
| - description: "Consistent naming of `datastream`. Add `| LIMIT 5000` to ESQL top query to cater for large organisations." | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/14545 | ||
| - version: 0.1.3 | ||
| changes: | ||
| - description: "Made sure the colour palette is predictable by using the eui_amsterdam_color_blind palette. Add ECU rate to the dashboard." | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/14545 | ||
| - version: 0.1.2 | ||
| changes: | ||
| - description: "Added the necessary fields to the billing_cluster_cost_lookup in the Elasticsearch transform to allow for correlation with the ES integration." | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/14545 | ||
| - version: 0.1.1 | ||
| changes: | ||
| - description: "Fixed the dashboard chargeback timeframe calculation for cost and ECU utilisation" | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/14545 | ||
| - version: 0.1.0 | ||
| changes: | ||
| - description: "Initial release of the chargeback integration." | ||
| type: enhancement | ||
| link: https://github.com/elastic/integrations/pull/14545 |
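The `deployment_group` extraction introduced in 0.2.5 relies on tags of the form `chargeback_group:<group-name>`. As a rough illustration only (the transform itself uses runtime mappings, not Python), the sketch below assumes the tags arrive as a list of `key:value` strings, which is an assumption about the shape of `deployment_tags`. The function names are hypothetical. It also flags the multiple-tag case that the README warns about.

```python
from typing import Iterable, List, Optional

PREFIX = "chargeback_group:"


def chargeback_groups(tags: Iterable[str]) -> List[str]:
    """Return every non-empty group name found in the given tags."""
    groups = []
    for tag in tags:
        if tag.startswith(PREFIX):
            name = tag[len(PREFIX):].strip()
            if name:
                groups.append(name)
    return groups


def single_chargeback_group(tags: Iterable[str]) -> Optional[str]:
    """Return the one group for a deployment, or None if it has no group tag.

    Raises ValueError when more than one chargeback_group tag is present,
    since the README says this leads to unpredictable cost allocation.
    """
    groups = chargeback_groups(tags)
    if len(groups) > 1:
        raise ValueError(f"multiple chargeback_group tags: {groups}")
    return groups[0] if groups else None


if __name__ == "__main__":
    print(single_chargeback_group(["env:prod", "chargeback_group:team-search"]))
```

A check like this could run against deployment metadata before tags are relied on for allocation, so that a deployment carrying two group tags is reported rather than silently attributed to one of them.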
Corrected spelling of 'tie' to '59:59.999Z' to complete the timestamp format.
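A malformed timestamp such as `2024-12-31T23:tie` could be caught before the configuration document is indexed. The sketch below is a hypothetical pre-flight check, not part of the integration: it parses `conf_start_date` and `conf_end_date` as ISO 8601 and then picks the configuration whose range contains a billing date, mirroring the behaviour the README describes for the dashboard. The helper names are illustrative, and resolving overlapping periods by taking the first match is an assumption.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional


def parse_iso(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def validate_period(conf: Dict[str, Any]) -> None:
    """Raise ValueError if either date is malformed or the range is inverted."""
    start = parse_iso(conf["conf_start_date"])
    end = parse_iso(conf["conf_end_date"])
    if end < start:
        raise ValueError("conf_end_date is before conf_start_date")


def config_for(billing_date: str, confs: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the first config whose period contains billing_date (ASSUMPTION)."""
    when = parse_iso(billing_date)
    for conf in confs:
        if parse_iso(conf["conf_start_date"]) <= when <= parse_iso(conf["conf_end_date"]):
            return conf
    return None


if __name__ == "__main__":
    confs = [
        {"conf_ecu_rate": 0.85, "conf_start_date": "2024-01-01T00:00:00.000Z",
         "conf_end_date": "2024-12-31T23:59:59.999Z"},
        {"conf_ecu_rate": 0.95, "conf_start_date": "2025-01-01T00:00:00.000Z",
         "conf_end_date": "2025-12-31T23:59:59.999Z"},
    ]
    for conf in confs:
        validate_period(conf)
    print(config_for("2025-06-01T00:00:00.000Z", confs))
```

Running `validate_period` on a document with the original `"2024-12-31T23:tie"` value raises `ValueError`, which is the kind of failure this review comment addresses.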