[exporter/elasticsearch] add support for batcher config (#34238)
**Description:**
Add opt-in support for the experimental batch sender
(open-telemetry/opentelemetry-collector#8122).
When opting into this functionality the exporter's `Consume*` methods
will make synchronous bulk requests to Elasticsearch, without additional
batching/buffering in the exporter.
By default the exporter continues to use its own buffering, which
supports thresholds for time, number of documents, and size of encoded
documents in bytes. The batch sender does not currently support a
bytes-based threshold and is experimental, which is why we are not yet
making it the default for the Elasticsearch exporter.
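As a rough illustration of the opt-in described above, the following sketch enables the batch sender. The item and timeout values simply mirror the defaults listed in the README change in this PR, and the endpoint is a placeholder rather than a recommendation.

```yaml
exporters:
  elasticsearch:
    endpoint: "http://localhost:9200"  # placeholder endpoint
    batcher:
      # Explicitly setting `enabled` (true or false) turns off the exporter's
      # own buffering, so the `flush` config is ignored.
      enabled: true
      min_size_items: 5000   # flush immediately once this many items are buffered
      max_size_items: 10000  # upper bound on items per request
      flush_timeout: 30s     # flush after this long regardless of buffer size
```

Leaving `batcher` out entirely keeps the exporter's existing buffering, which is the only mode that currently supports a bytes-based threshold.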
This PR is based on
#32632,
but made to be non-breaking.
**Link to tracking Issue:**
#32377
**Testing:**
Added unit and integration tests.
Manually tested with Elasticsearch, using the following config to
highlight that client metadata can now flow through all the way:
```yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
        include_metadata: true
exporters:
  elasticsearch:
    endpoint: "http://localhost:9200"
    auth:
      authenticator: headers_setter
    batcher:
      enabled: false
extensions:
  headers_setter:
    headers:
      - action: insert
        key: Authorization
        from_context: authorization
service:
  extensions: [headers_setter]
  pipelines:
    traces:
      receivers: [otlp]
      processors: []
      exporters: [elasticsearch]
```
I have Elasticsearch running locally, with an "admin" user with the
password "changeme". Sending OTLP/HTTP to the collector with
`telemetrygen traces --otlp-http --otlp-insecure http://localhost:4318
--otlp-header "Authorization=\"Basic YWRtaW46Y2hhbmdlbWU=\""`, I observe
the following:
- Without the `batcher` config, the exporter fails to index data into
Elasticsearch due to an auth error. That's because the exporter is
buffering and dropping the context with client metadata, so there's no
Authorization header attached to the requests going out.
- With `batcher: {enabled: true}`, the behaviour is the same as above. Unlike the
[`batch`
processor](https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor/batchprocessor),
the batch sender does not maintain client metadata.
- With `batcher: {enabled: false}`, the exporter successfully indexes
data into Elasticsearch.
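Because the batch sender drops client metadata, one possible way to keep batching while still passing the Authorization header through is to batch upstream in the `batch` processor and keep the exporter synchronous. The sketch below is a variation on the config above and was not tested as part of this PR. It assumes the `batch` processor's `metadata_keys` setting is used to retain the `authorization` key.

```yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
        include_metadata: true
processors:
  batch:
    # Assumption: batches are kept separate per distinct value of these
    # metadata keys, and the keys are carried on the outgoing context.
    metadata_keys: [authorization]
exporters:
  elasticsearch:
    endpoint: "http://localhost:9200"
    auth:
      authenticator: headers_setter
    batcher:
      enabled: false  # synchronous bulk requests, no exporter-side buffering
extensions:
  headers_setter:
    headers:
      - action: insert
        key: Authorization
        from_context: authorization
service:
  extensions: [headers_setter]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [elasticsearch]
```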
**Documentation:**
Updated the README.
---------
Co-authored-by: Carson Ip <[email protected]>
exporter/elasticsearchexporter/README.md (30 additions & 0 deletions)
@@ -81,6 +81,33 @@ All other defaults are as defined by [confighttp].
 
 The Elasticsearch exporter supports the common [`sending_queue` settings][exporterhelper]. However, the sending queue is currently disabled by default.
 
+### Batching
+
+> [!WARNING]
+> The `batcher` config is experimental and may change without notice.
+
+The Elasticsearch exporter supports the [common `batcher` settings](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterbatcher/config.go).
+
+- `batcher`:
+  - `enabled` (default=unset): Enable batching of requests into a single bulk request.
+  - `min_size_items` (default=5000): Minimum number of log records / spans in the buffer to trigger a flush immediately.
+  - `max_size_items` (default=10000): Maximum number of log records / spans in a request.
+  - `flush_timeout` (default=30s): Maximum time of the oldest item spent inside the buffer, aka "max age of buffer". A flush will happen regardless of the size of content in buffer.
+
+By default, the exporter will perform its own buffering and batching, as configured through the
+`flush`config, and `batcher` will be unused. By setting `batcher::enabled` to either `true` or
+`false`, the exporter will not perform any of its own buffering or batching, and the `flush` config
+will be ignored. In a future release when the `batcher` config is stable, and has feature parity
+with the exporter's existing `flush` config, it will be enabled by default.
+
+Using the common `batcher` functionality provides several benefits over the default behavior:
+- Combined with a persistent queue, or no queue at all, `batcher` enables at least once delivery.
+  With the default behavior, the exporter will accept data and process it asynchronously,
+  which interacts poorly with queuing.
+- By ensuring the exporter makes requests to Elasticsearch synchronously,
+  client metadata can be passed through to Elasticsearch requests,
+  e.g. by using the [`headers_setter` extension](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/extension/headerssetterextension/README.md).
+
 ### Elasticsearch document routing
 
 Telemetry data will be written to signal specific data streams by default:
@@ -173,6 +200,9 @@ The behaviour of this bulk indexing can be configured with the following setting
 - `max_interval` (default=1m): Max waiting time if a HTTP request failed.
 - `retry_on_status` (default=[429, 500, 502, 503, 504]): Status codes that trigger request or document level retries. Request level retry and document level retry status codes are shared and cannot be configured separately. To avoid duplicates, it is recommended to set it to `[429]`. WARNING: The default will be changed to `[429]` in the future.
 
+> [!NOTE]
+> The `flush` config will be ignored when `batcher::enabled` config is explicitly set to `true` or `false`.
+
 ### Elasticsearch node discovery
 
 The Elasticsearch Exporter will regularly check Elasticsearch for available nodes.
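
The at-least-once delivery benefit noted in the Batching section depends on pairing `batcher` with a persistent queue (or no queue). A minimal sketch of the persistent-queue variant follows. It assumes the `file_storage` extension is used as the queue's storage backend. The directory path and endpoints are hypothetical placeholders.

```yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
extensions:
  file_storage:
    # Hypothetical path; the directory must exist and be writable.
    directory: /var/lib/otelcol/file_storage
exporters:
  elasticsearch:
    endpoint: "http://localhost:9200"
    sending_queue:
      enabled: true          # the sending queue is disabled by default
      storage: file_storage  # persist queued requests via the extension above
    batcher:
      enabled: true          # also disables the exporter's own `flush` buffering
service:
  extensions: [file_storage]
  pipelines:
    logs:
      receivers: [otlp]
      processors: []
      exporters: [elasticsearch]
```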