ESQL: Account for field readers in breaker #140666

nik9000 · 2026-01-14T15:24:29Z

This seeks to prevent out of memory errors by accounting for the memory usage of field readers. We don't have measurements for the actual memory usage, so instead we add an estimate to the breaker. An overestimate, hopefully.

This makes us circuit break when loading many many many fields on small heaps rather than crashing with an out of memory error.

This also allows readers to cache things a little more aggressively, so long as they are willing to circuit break when receiving a huge number of requests.


nik9000 · 2026-01-14T15:26:05Z

server/src/main/java/org/elasticsearch/index/mapper/BlockLoader.java

    }

-    interface Reader {
+    interface Reader extends Releasable {


Here's the main change in the PR. Everything else is in service of making this Releasable and passing in CircuitBreaker.
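A rough sketch of the idea, not the code in this PR: a reader reserves an estimate from a breaker when it is created and gives it back when it is released. `SketchBreaker` and `SketchReader` are hypothetical stand-ins invented for illustration, and `AutoCloseable` stands in for `Releasable`. Only the `addEstimateBytesAndMaybeBreak` method name, the "load blocks" label, and the 100kb figure come from the diffs in this PR.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a circuit breaker: tracks reserved bytes against a fixed limit. */
final class SketchBreaker {
    private final long limitBytes;
    private final AtomicLong usedBytes = new AtomicLong();

    SketchBreaker(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Reserve bytes, or undo the reservation and throw if it would exceed the limit. */
    void addEstimateBytesAndMaybeBreak(long bytes, String label) {
        long newUsed = usedBytes.addAndGet(bytes);
        if (newUsed > limitBytes) {
            usedBytes.addAndGet(-bytes);
            throw new IllegalStateException("[" + label + "] would use " + newUsed + " bytes, limit is " + limitBytes);
        }
    }

    /** Give back bytes that were reserved earlier. */
    void release(long bytes) {
        usedBytes.addAndGet(-bytes);
    }

    long used() {
        return usedBytes.get();
    }
}

/** Hypothetical reader that holds a breaker reservation for as long as it is open. */
final class SketchReader implements AutoCloseable {
    /** Assumed overestimate of the reader's heap usage; 100kb mirrors the value discussed in this PR. */
    static final long ESTIMATED_SIZE = 100 * 1024;

    private final SketchBreaker breaker;
    private boolean closed;

    SketchReader(SketchBreaker breaker) {
        // Reserve first, so a small heap trips the breaker instead of running out of memory.
        breaker.addEstimateBytesAndMaybeBreak(ESTIMATED_SIZE, "load blocks");
        this.breaker = breaker;
    }

    @Override
    public void close() {
        // Release exactly once, even if close() is called twice.
        if (closed == false) {
            closed = true;
            breaker.release(ESTIMATED_SIZE);
        }
    }
}
```

With a 250kb limit in this sketch, two readers fit and a third trips the breaker without changing the reserved total.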

nik9000 · 2026-01-14T15:26:49Z

server/src/main/java/org/elasticsearch/index/mapper/GeoPointFieldMapper.java

-     * This implies that we need to load the value from _source. This however is very slow, especially when synthetic source is enabled.
-     * We're better off reading from doc_values and converting to BytesRef to satisfy the checker. This is what this block loader is for.
-     */
-    static final class BytesRefFromLongsBlockLoader extends BlockDocValuesReader.DocValuesBlockLoader {


nik9000 · 2026-01-14T15:26:59Z

server/src/main/java/org/elasticsearch/index/mapper/RangeFieldMapper.java

            );
        }

-        public static class DateRangeDocValuesLoader extends BlockDocValuesReader.DocValuesBlockLoader {


nik9000 · 2026-01-14T15:28:29Z

server/src/main/java/org/elasticsearch/index/mapper/blockloader/DelegatingBlockLoader.java

+    }
+
+    @Override
+    public StoredFieldsSpec rowStrideStoredFieldSpec() {


Moved these methods so the inner classes are at the end of the file.

nik9000 · 2026-01-14T15:45:49Z

...g/elasticsearch/index/mapper/blockloader/docvalues/AbstractBytesRefsFromOrdsBlockLoader.java

+     * Circuit breaker space reserved for each reader. Measured in heap dumps
+     * around from 3.5kb to 65kb. This is an intentional overestimate.
+     */
+    public static final long ESTIMATED_SIZE = ByteSizeValue.ofKb(100).getBytes();


This 100kb is pretty controversial, I think, and will want some discussion. I'd far prefer to have a good estimate, but I don't see a way to get one right now. I've seen the readers for 1kb keyword fields be this large.

If SortedSetDocValues and friends implemented Accountable, we would have much better insight into the on-heap memory usage. However, I think this is also difficult because the memory usage might vary as SortedSetDocValues gets used.

Additionally, should we make this configurable via a query pragma or something else?

Accountable
Yeah. I'd be ok doing if (thingy instanceof Accountable) here. I'd prefer not, but sometimes we can't have all we want.

Additionally, should we make this configurable via a query pragma or something else?

Hmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm.
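For reference, the `instanceof Accountable` idea could look roughly like the sketch below. This is an illustration only, not code from this PR. `SketchAccountable` is a local stand-in that mirrors the single `ramBytesUsed()` method of Lucene's `Accountable`, and `ReaderSizeEstimator` and `FALLBACK_ESTIMATE` are hypothetical names.

```java
/** Local stand-in mirroring the ramBytesUsed() method of Lucene's Accountable. */
interface SketchAccountable {
    long ramBytesUsed();
}

/** Hypothetical helper: use a measured size when one is available, else a fixed overestimate. */
final class ReaderSizeEstimator {
    /** Assumed fallback; 100kb mirrors the estimate discussed in this thread. */
    static final long FALLBACK_ESTIMATE = 100 * 1024;

    private ReaderSizeEstimator() {}

    static long estimate(Object docValues) {
        if (docValues instanceof SketchAccountable) {
            // The doc values can report their own heap usage, so trust that number.
            return ((SketchAccountable) docValues).ramBytesUsed();
        }
        // No measurement available: fall back to the intentional overestimate.
        return FALLBACK_ESTIMATE;
    }
}
```

As noted above, a measured value may still drift as the doc values get used, so the number read at construction time is only a snapshot.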

elasticsearchmachine · 2026-01-14T15:53:00Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

elasticsearchmachine · 2026-01-14T15:53:01Z

Hi @nik9000, I've created a changelog YAML for you.


dnhatn

@nik9000 I left a comment. Thank you for taking care of this.

dnhatn · 2026-01-15T05:59:34Z

...n/java/org/elasticsearch/index/mapper/blockloader/docvalues/AbstractBooleansBlockLoader.java

    @Override
-    public AllReader reader(LeafReaderContext context) throws IOException {
+    public AllReader reader(CircuitBreaker breaker, LeafReaderContext context) throws IOException {
+        breaker.addEstimateBytesAndMaybeBreak(ESTIMATED_SIZE, "load blocks");


getSortedNumericDocValues can throw an IOException, so we might leak the memory requested here. I think we need a try/finally block. Also, should we acquire memory in the Reader's constructor and release it in close()? For example, in BooleansBlockDocValuesReader, we release memory - should we also acquire it in the constructor?

Good point Nhat. I think this applies to all getXXXDocValues(...) invocations, so try/finally blocks are also needed elsewhere.

Damn. I should write a test that hits that.

I had thought "this can throw if the index is tragically corrupted" so, like, if it threw we were toast anyway. But I'll fix it.
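A minimal sketch of the try/finally shape being asked for, under stated assumptions: `TinyBreaker`, `DocValuesSupplier`, `GuardedReader`, and `GuardedLoader` are hypothetical names, and the supplier stands in for a `getXXXDocValues(...)` call that can throw. This is not the fix that landed in the PR, only one way to avoid leaking the reservation.

```java
import java.io.IOException;

/** Hypothetical stand-in for a circuit breaker; not the Elasticsearch class. */
final class TinyBreaker {
    private final long limitBytes;
    private long usedBytes;

    TinyBreaker(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    void addEstimateBytesAndMaybeBreak(long bytes, String label) {
        if (usedBytes + bytes > limitBytes) {
            throw new IllegalStateException("[" + label + "] would exceed limit of " + limitBytes + " bytes");
        }
        usedBytes += bytes;
    }

    void release(long bytes) {
        usedBytes -= bytes;
    }

    long used() {
        return usedBytes;
    }
}

/** Hypothetical stand-in for a getXXXDocValues(...) call, which may throw IOException. */
interface DocValuesSupplier<T> {
    T get() throws IOException;
}

/** Hypothetical reader that owns the reservation once construction has succeeded. */
final class GuardedReader<T> implements AutoCloseable {
    private final TinyBreaker breaker;
    private final long reservedBytes;
    final T docValues;

    GuardedReader(TinyBreaker breaker, T docValues, long reservedBytes) {
        this.breaker = breaker;
        this.docValues = docValues;
        this.reservedBytes = reservedBytes;
    }

    @Override
    public void close() {
        breaker.release(reservedBytes);
    }
}

final class GuardedLoader {
    /** Assumed overestimate; mirrors the 100kb value discussed in this PR. */
    static final long ESTIMATED_SIZE = 100 * 1024;

    private GuardedLoader() {}

    static <T> GuardedReader<T> reader(TinyBreaker breaker, DocValuesSupplier<T> supplier) throws IOException {
        breaker.addEstimateBytesAndMaybeBreak(ESTIMATED_SIZE, "load blocks");
        boolean success = false;
        try {
            T docValues = supplier.get(); // may throw, e.g. on a corrupted index
            GuardedReader<T> reader = new GuardedReader<>(breaker, docValues, ESTIMATED_SIZE);
            success = true;
            return reader;
        } finally {
            if (success == false) {
                // Nothing took ownership of the reservation, so give it back here.
                breaker.release(ESTIMATED_SIZE);
            }
        }
    }
}
```

The alternative raised above, acquiring in the reader's constructor and releasing in close(), keeps both halves in one class but has the same requirement: any failure after the reservation must release it.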

martijnvg

Thanks Nik, I think Nhat's comment should be addressed, but otherwise LGTM.

martijnvg · 2026-01-15T10:04:35Z

...n/java/org/elasticsearch/index/mapper/blockloader/docvalues/AbstractBooleansBlockLoader.java

    @Override
-    public AllReader reader(LeafReaderContext context) throws IOException {
+    public AllReader reader(CircuitBreaker breaker, LeafReaderContext context) throws IOException {
+        breaker.addEstimateBytesAndMaybeBreak(ESTIMATED_SIZE, "load blocks");


Good point Nhat. I think this applies to all getXXXDocValues(...) invocations, so try/finally blocks are also needed elsewhere.

martijnvg · 2026-01-15T10:06:49Z

...g/elasticsearch/index/mapper/blockloader/docvalues/AbstractBytesRefsFromOrdsBlockLoader.java

+     * Circuit breaker space reserved for each reader. Measured in heap dumps
+     * around from 3.5kb to 65kb. This is an intentional overestimate.
+     */
+    public static final long ESTIMATED_SIZE = ByteSizeValue.ofKb(100).getBytes();


If SortedSetDocValues and friends implemented Accountable, we would have much better insight into the on-heap memory usage. However, I think this is also difficult because the memory usage might vary as SortedSetDocValues gets used.

Additionally, should we make this configurable via a query pragma or something else?

martijnvg · 2026-01-15T10:17:49Z

...ava/org/elasticsearch/index/mapper/blockloader/script/KeywordScriptBlockDocValuesReader.java

+     * shrug because we don't know what the script will do, and we don't know how many doc
+     * values it'll load. And, we're not sure much memory the script itself will actually use.
+     */
+    public static final long ESTIMATED_SIZE = ByteSizeValue.ofKb(300).getBytes();


Scripts have additional heap memory overhead... We can always adjust it down.


nik9000 added the :Analytics/ES|QL AKA ESQL label Jan 14, 2026

elasticsearchmachine added the v9.4.0 label Jan 14, 2026

nik9000 commented Jan 14, 2026


Fix

defe5b9

nik9000 commented Jan 14, 2026


nik9000 added the >bug label Jan 14, 2026

nik9000 requested review from dnhatn, fang-xing-esql and martijnvg January 14, 2026 15:52

nik9000 marked this pull request as ready for review January 14, 2026 15:52

Merge branch 'main' into esql_load_release

21558cd

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jan 14, 2026

Update docs/changelog/140666.yaml

4ecf33a

nik9000 added 2 commits January 14, 2026 11:01

Merge remote-tracking branch 'nik9000/esql_load_release' into esql_lo…

8f3158d

…ad_release

Merge branch 'main' into esql_load_release

daa73c3

dnhatn reviewed Jan 15, 2026


dnhatn self-requested a review January 15, 2026 06:08

martijnvg approved these changes Jan 15, 2026


nik9000 added 4 commits January 15, 2026 10:50

Change limit on this one

b585e7e

Merge remote-tracking branch 'nik9000/esql_load_release' into esql_lo…

efefad7

…ad_release

Merge branch 'main' into esql_load_release

da62c50

Merge branch 'main' into esql_load_release

eee68a1
