[Agent Builder] Initial API tests with mocked LLM #234985

sorenlouv · 2025-09-12T21:53:38Z

Closes https://github.com/elastic/search-team/issues/10970

This PR adds two API tests for the Agent Builder converse endpoint:

- Simple conversation test: verifies basic conversation functionality, including title generation and response handling.
- ES|QL query test: exercises a complex tool-calling flow, including ES|QL generation, searching Elasticsearch, and returning a structured tool result to the LLM.
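
As a rough illustration of the tool-calling flow the second test covers, the sketch below scripts an LLM that first requests a tool and then produces a final answer once the tool result has been fed back. All names here (`LlmTurn`, `scriptedLlm`, `runToolLoop`, the `execute_esql` tool) are hypothetical and chosen for the example; they are not the types or helpers used in this PR.

```typescript
// Hypothetical turn shapes for illustration; the PR's real types may differ.
type LlmTurn =
  | { kind: 'tool_call'; tool: string; args: Record<string, string> }
  | { kind: 'final'; content: string };

type ToolResult = { tool: string; result: unknown };

// A scripted LLM: every call returns the next predefined turn.
function scriptedLlm(turns: LlmTurn[]): (history: ToolResult[]) => LlmTurn {
  let index = 0;
  return () => {
    if (index >= turns.length) {
      throw new Error('LLM was called more often than scripted');
    }
    return turns[index++];
  };
}

// Minimal agent loop: call the LLM, run the requested tool, feed the result
// back, and stop once the LLM returns a final answer.
function runToolLoop(
  llm: (history: ToolResult[]) => LlmTurn,
  tools: Record<string, (args: Record<string, string>) => unknown>
): { answer: string; toolResults: ToolResult[] } {
  const toolResults: ToolResult[] = [];
  for (;;) {
    const turn = llm(toolResults);
    if (turn.kind === 'final') {
      return { answer: turn.content, toolResults };
    }
    const tool = tools[turn.tool];
    if (!tool) {
      throw new Error(`Unknown tool: ${turn.tool}`);
    }
    toolResults.push({ tool: turn.tool, result: tool(turn.args) });
  }
}

// Usage: one tool call followed by a final answer.
const llm = scriptedLlm([
  { kind: 'tool_call', tool: 'execute_esql', args: { query: 'FROM logs | LIMIT 1' } },
  { kind: 'final', content: 'Found 1 document.' },
]);

const outcome = runToolLoop(llm, {
  // Stand-in for a real search; it only echoes the query it was given.
  execute_esql: (args) => ({ query: args.query, rows: 1 }),
});
```

Because the turns are scripted, a test can assert both on the final answer and on the exact tool results that were passed back to the LLM.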

Instead of calling a real LLM, these tests call a simulated LLM (the "LLM Proxy"), a lightweight HTTP server that returns canned LLM responses. The proxy:

- Acts as a drop-in replacement for an actual LLM
- Intercepts requests to the LLM and returns predefined responses based on pattern matching
- Supports complex conversation flows, including tool calls
- Enables testing the full chain of interactions between Agent Builder and the LLM

This approach allows us to test the integration between Agent Builder and LLMs in a deterministic, cheap and fast way.
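
For readers unfamiliar with the pattern, here is a minimal sketch of what such a proxy could look like, assuming a chat-completion style JSON request and response. The names `Interceptor`, `pickReply` and `createLlmProxy`, the request shape, and the "generate a title" prompt text are assumptions made for this example and do not reflect the actual LLM Proxy implementation in the PR.

```typescript
import * as http from 'node:http';

// Hypothetical request shape, loosely modelled on chat-completion APIs.
interface ChatRequest {
  messages: Array<{ role: string; content: string }>;
}

// An interceptor pairs a matching rule with the canned reply to return.
interface Interceptor {
  name: string;
  when: (request: ChatRequest) => boolean;
  reply: string;
}

// Pure matching step: the first interceptor whose rule matches wins.
function pickReply(interceptors: Interceptor[], request: ChatRequest): string | undefined {
  return interceptors.find((interceptor) => interceptor.when(request))?.reply;
}

// Wrap the matcher in an HTTP server so the application under test can be
// pointed at it instead of a real LLM endpoint.
function createLlmProxy(interceptors: Interceptor[]): http.Server {
  return http.createServer((req, res) => {
    let raw = '';
    req.on('data', (chunk) => {
      raw += chunk;
    });
    req.on('end', () => {
      let reply: string | undefined;
      try {
        reply = pickReply(interceptors, JSON.parse(raw) as ChatRequest);
      } catch {
        reply = undefined; // malformed JSON body or unexpected request shape
      }
      const status = reply === undefined ? 404 : 200;
      const body =
        reply === undefined
          ? { error: 'No interceptor matched the request' }
          : { choices: [{ message: { role: 'assistant', content: reply } }] };
      res.writeHead(status, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(body));
    });
  });
}

// Example rules: answer title-generation prompts, otherwise fall back to a greeting.
const interceptors: Interceptor[] = [
  {
    name: 'title',
    when: (request) => request.messages.some((m) => /generate a title/i.test(m.content)),
    reply: 'Greeting',
  },
  { name: 'fallback', when: () => true, reply: 'Hello! How can I help?' },
];

// The server is created but not started; a test would call proxy.listen(0).
const proxy = createLlmProxy(interceptors);
```

In a test, one would start the server on an ephemeral port with `proxy.listen(0)`, configure the connector under test to use that address, and close the server afterwards. Keeping `pickReply` separate from the HTTP layer makes the matching rules easy to check on their own.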

x-pack/platform/packages/shared/onechat/onechat-common/tools/tool_result.ts

x-pack/solutions/chat/plugins/workchat-app/server/services/chat/generate_conversation_title.ts

x-pack/platform/test/onechat_api_integration/utils/llm_proxy/scenarios.ts

jbudz

.buildkite/ftr_platform_stateful_configs.yml

csr

x-pack/platform/test/tsconfig.json changes LGTM 👍

elasticmachine · 2025-09-16T14:21:16Z

💛 Build succeeded, but was flaky

Buildkite Build
Commit: 99bedeb

Failed CI Steps

FTR Configs #136

Test Failures

[job] [logs] FTR Configs #136 / discover/security/context_awareness cell renderer ES|QL mode should render alert workflow status badge

Metrics [docs]

✅ unchanged

History

💔 Build #339721 failed 5476ebf
💚 Build #339333 succeeded d1817a0
💔 Build #339160 failed 29f2cc8
💔 Build #339113 failed 02085e5
💔 Build #339067 failed a874030

sorenlouv force-pushed the one-chat-api-test branch 2 times, most recently from bd2855f to 5de7dc4 Compare September 12, 2025 22:04

sorenlouv added 5 commits September 13, 2025 15:02

[OneChat] API tests

99602e1

Initial simple API test

b4fec5e

Refactor and add synthtrace

680483b

WIP

ef6ff44

Tests passing

214bfa5

sorenlouv force-pushed the one-chat-api-test branch from 6ad3862 to 214bfa5 Compare September 13, 2025 13:11

kibanamachine and others added 4 commits September 13, 2025 15:03

[CI] Auto-commit changed files from 'ts-node .buildkite/pipeline-resource-definitions/scripts/fix-location-collection.ts'

0e9b3b3

[CI] Auto-commit changed files from 'node scripts/eslint_all_files --no-cache --fix'

b14b947

Merge branch 'main' of github.com:elastic/kibana into one-chat-api-test

f33af2e

Undo eslint changes

a874030

sorenlouv force-pushed the one-chat-api-test branch from ef74183 to a874030 Compare September 14, 2025 08:39

sorenlouv added 6 commits September 14, 2025 21:00

Merge branch 'main' of github.com:elastic/kibana into one-chat-api-test

d464d4b

Add void

02085e5

Add void to esql test

29f2cc8

Fix API tests and test for correct esql

8d47957

Add scenarios file

8bf4a13

Merge branch 'main' into one-chat-api-test

e670cf5

sorenlouv marked this pull request as ready for review September 15, 2025 11:20

sorenlouv requested review from a team as code owners September 15, 2025 11:20

sorenlouv added the release_note:skip and backport:skip labels Sep 15, 2025

Change QueryResult type

d41824f

sorenlouv changed the title ~~[OneChat] Initial API tests with mocked LLM~~ [Agent Builder] Initial API tests with mocked LLM Sep 15, 2025

Extract LLMSimulator

d1817a0

pgayvallet approved these changes Sep 15, 2025

jbudz approved these changes Sep 15, 2025

csr approved these changes Sep 16, 2025

joemcelroy approved these changes Sep 16, 2025

Address feedback and minor cleanup

5476ebf

sorenlouv enabled auto-merge (squash) September 16, 2025 09:49

sorenlouv added 3 commits September 16, 2025 13:10

Fix type

91474ef

Improve types

7df3deb

Merge branch 'main' into one-chat-api-test

99bedeb

sorenlouv merged commit eea4c86 into elastic:main Sep 16, 2025
13 checks passed

kibanamachine added the v9.2.0 label Sep 16, 2025

sorenlouv mentioned this pull request Sep 19, 2025

[Agent Builder] Minor improvement to API test with mocked LLM #235757

Merged

sorenlouv added a commit that referenced this pull request Sep 19, 2025

[Agent Builder] Minor improvement to API test with mocked LLM (#235757)

5aa4605

A spec was left out in #234985.

sorenlouv mentioned this pull request Sep 20, 2025

[Agent Builder] Remove hardcoded Line chart type #235906

Merged

sorenlouv added a commit that referenced this pull request Sep 22, 2025

[Agent Builder] Remove hardcoded Line chart type (#235906)

739ec59

Follow-up to #234985. The chart type was hardcoded to `Line`. This is removed; Lens now auto-detects the most likely chart type.

CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Sep 24, 2025

[Agent Builder] Minor improvement to API test with mocked LLM (elastic#235757)

ecbb9fa

A spec was left out in elastic#234985.

niros1 pushed a commit that referenced this pull request Sep 30, 2025

[Agent Builder] Minor improvement to API test with mocked LLM (#235757)

f805dc1

A spec was left out in #234985.

niros1 pushed a commit that referenced this pull request Sep 30, 2025

[Agent Builder] Remove hardcoded Line chart type (#235906)

acf2841

Follow-up to #234985. The chart type was hardcoded to `Line`. This is removed; Lens now auto-detects the most likely chart type.

rylnd pushed a commit to rylnd/kibana that referenced this pull request Oct 17, 2025

[Agent Builder] Minor improvement to API test with mocked LLM (elastic#235757)

2356850

A spec was left out in elastic#234985.
