-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[exporter/elasticsearch] Use exporterhelper/batchsender for reliability #32632
Changes from 132 commits
3cd3903
4341a37
5c04204
39f0fa1
bc692f0
c5fd34c
7fdf150
97d5c19
5608755
c5cbe79
889fe7f
f71d3b2
709ca01
4474b3b
215295e
961ffab
683dca3
e46b165
ca810d4
364e27d
946bf2f
ea3edd3
16ece1c
9d83826
bf115c2
7258704
2958362
6db1c65
ae1c8cb
332ee78
7978e89
33c4bfe
6bcb999
8aa720b
160de51
686e8ae
10a2748
562ebaa
d48f75f
6ecce33
6178e2f
8afdc94
04495ff
33df16b
7c4b5b2
91ea93d
ec37579
f596602
32029c2
babf6ad
454540a
14e1804
8e15426
7dfea92
622d6e8
36bcd9c
18e9ce4
179430b
6fc2df9
da8a530
caa68cf
428f7d7
714e26d
cc83ff1
99f2b63
5f05116
b91a2ad
3a7c19c
ca69b64
e68b427
7312e36
5d428e1
d7fbaa8
27f055d
3a17fa5
a354670
f6910bb
9313acb
96fa3f8
4f591d0
cf6f599
01e1d2c
0e5c381
f6ad0e2
16b3a79
b98f7b2
d17554d
77e99c9
78a8800
0939b45
71d84f6
4bd68f5
6fb2fc3
4c2ba65
1b14f4a
ad23b96
8d4d4c1
97494ed
d9bd55f
0e75f8b
664849f
4006cd9
3828a87
dae3590
1d10bbf
ebd99e2
a64b62b
d5a0d16
0b4acf3
ae9440d
8591c12
031c3c8
8d698f2
eae71c0
0c39b73
a0c4c06
887b06d
20c5799
18ce06d
5652cbf
4acd29e
6140933
5b25bba
c580f37
ac900bd
fd56e5f
96ce9d6
7c24c27
c62b153
fb6fa02
b860dc0
9ee882e
e783105
8d61908
6a5a2e3
50e9353
e87e797
38c01eb
9113d1c
1ad29d4
862876e
7b289f1
d7881b3
be1a7f7
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
# Use this changelog template to create an entry for release notes. | ||
|
||
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' | ||
change_type: deprecation | ||
|
||
# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver) | ||
component: elasticsearchexporter | ||
|
||
# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). | ||
note: Improve reliability when used with persistent queue. Deprecate config options `flush.*`, use `batcher.*` instead. | ||
|
||
# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists. | ||
issues: [32377] | ||
|
||
# (Optional) One or more lines of additional information to render under the primary note. | ||
# These lines will be padded with 2 spaces and then inserted directly into the document. | ||
# Use pipe (|) for multiline entries. | ||
subtext: | | ||
Move buffering from bulk indexer to batch sender to improve reliability. | ||
With this change, there should be no event loss when used with persistent queue in the event of a collector crash. | ||
Introduce `batcher.*` to configure the batch sender which is now enabled by default. | ||
Option `flush.bytes` is deprecated. Use the new `batcher.min_size_items` option to control the minimum number of items (log records, spans) to trigger a flush. `batcher.min_size_items` will be set to the value of `flush.bytes` / 1000 if `flush.bytes` is non-zero. | ||
Option `flush.interval` is deprecated. Use the new `batcher.flush_timeout` option to control max age of buffer. `batcher.flush_timeout` will be set to the value of `flush.interval` if `flush.interval` is non-zero. | ||
Queue sender `sending_queue.enabled` defaults to `true`. | ||
|
||
# If your change doesn't affect end users or the exported elements of any package, | ||
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label. | ||
# Optional: The change log or logs in which this entry should be included. | ||
# e.g. '[user]' or '[user, api]' | ||
# Include 'user' if the change is relevant to end users. | ||
# Include 'api' if there is a change to a library API. | ||
# Default: '[user]' | ||
change_logs: [user] |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -78,7 +78,20 @@ All other defaults are as defined by [confighttp]. | |
|
||
### Queuing | ||
|
||
The Elasticsearch exporter supports the common [`sending_queue` settings][exporterhelper]. However, the sending queue is currently disabled by default. | ||
The Elasticsearch exporter supports the common [`sending_queue` settings][exporterhelper]. The sending queue is enabled by default. | ||
|
||
Default `num_consumers` is `100`. | ||
carsonip marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
When persistent queue is used, there should be no event loss even on collector crashes. | ||
carsonip marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
### Batching | ||
|
||
The Elasticsearch exporter supports the common `batcher` settings. | ||
carsonip marked this conversation as resolved.
Show resolved
Hide resolved
carsonip marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
- `enabled` (default=true): Enable batching of requests into a single bulk request. | ||
- `min_size_items` (default=5000): Minimum number of log records / spans in the buffer to trigger a flush immediately. | ||
- `max_size_items` (default=10000): Maximum number of log records / spans in a request. | ||
- `flush_timeout` (default=30s): Maximum time of the oldest item spent inside the buffer, aka "max age of buffer". A flush will happen regardless of the size of content in buffer. | ||
|
||
### Elasticsearch document routing | ||
|
||
|
@@ -160,10 +173,10 @@ This can be configured through the following settings: | |
The Elasticsearch exporter uses the [Elasticsearch Bulk API] for indexing documents. | ||
The behaviour of this bulk indexing can be configured with the following settings: | ||
|
||
- `num_workers` (default=runtime.NumCPU()): Number of workers publishing bulk requests concurrently. | ||
- `num_workers` (default=runtime.NumCPU()): Maximum number of concurrent bulk requests. | ||
- `flush`: Event bulk indexer buffer flush settings | ||
- `bytes` (default=5000000): Write buffer flush size limit. | ||
- `interval` (default=30s): Write buffer flush time limit. | ||
- `bytes` (DEPRECATED, use `batcher.min_size_items` instead): Write buffer flush size limit. | ||
- `interval` (DEPRECATED, use `batcher.flush_timeout` instead): Maximum time of the oldest item spent inside the buffer, aka "max age of buffer". A flush will happen regardless of the size of content in buffer. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given that the What we could do is mention here the advantages of using the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It controls the same behavior of batching, and flush settings are converted to batcher settings anyway. It is marked deprecated to make it obvious to users that using batcher settings is the recommended way. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I understand the benefits of using the batch sender (configured in the
This pull request in current state not only makes the experimental batch sender the default, but also leaves users with it as the only option, effectively removing the functionality that has existed before. If I understand correctly, disabling the batcher sender by setting I propose the following changes to this pull request:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That is incorrect. There is batching baked into bulk indexer implementation in the current main (see code). Batcher is added and enabled by default to replace the existing functionality of batching within bulk indexer without change in behavior.
As explained above, setting batcher::enabled to true is actually breaking, rather than the opposite.
I believe this is a separate topic. Marking it deprecated or not doesn't affect the actual behavior. Therefore, I would like to know if you are concerned about the fact that bytes based flushing will only be approximated after the PR, or the fact that new recommended config is experiemental.
Although batcher is new, I don't see why an upstream helper is necessarily less preferable to an existing batching feature within the exporter with possibly lower test coverage. |
||
- `retry`: Elasticsearch bulk request retry settings | ||
- `enabled` (default=true): Enable/Disable request retry on error. Failed requests are retried with exponential backoff. | ||
- `max_requests` (default=3): Number of HTTP request retries. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bike shed, but mostly interested if this is for parity with commit messages or a desire for how release notes/change log reads
I understand rationale of imperative form in commit messages (do this, vs this does this). That said, is it more natural to as a reader of release notes/change log to say the action taking place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind the bike shed. Not a native speaker so both read identical to me. I use imperative form in git commit messages and therefore the same here.
Would be nice if we can be consistent within this component and even across components. There is no guideline in CONTRIBUTING on what's preferred. Looking at https://github.com/open-telemetry/opentelemetry-collector-contrib/releases it is ~50/50 at the moment.
Would appreciate your thoughts on how to move forward in this PR and in future PRs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TL;DR; I suggest we follow Go on commit style, and happy to raise a PR to the CONTRIBUTING file if there are a couple 👍
Personally, I think release notes are different than commit messages, but I think it helps to consolidate the commit style since you noticed it is 50/50. Release notes, basically I would save to after we agree on commit style ;)
What reads imperative to me ("do this") sounds different if you prefix first. For example, the reason it doesn't sound bad in Go, is because they spell it out. They suggest to think of the prefix "This change modifies Go to _____.", and if that works out, use it as the commit first line (after the prefix of the package affected). You can imagine in this project, we could say "This change modifies the opentelemetry collector exporter/elasticsearch to use exporthelper/batchsender for reliability. Sounds fine to me!
Here's the doc on Go, and I suggest we use/blame it on them, basically derive and cite this is following go contribution style. https://go.dev/doc/contribute#commit_messages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL. Yes, I agree on the commit style. It is also close to my usual practice.