Add HostedToolSearchTool with DeferredTools/NonDeferredTools for tool search / deferred loading support (#7377)
Conversation
…r tool search support Co-authored-by: stephentoub <[email protected]>
@copilot, I'm wondering if a different design would be more beneficial. What would you think about isolating it to just the `HostedToolSearchTool`, which would have two collections of either `AITool` instances or string names, one for Enable and one for Disable. By default, both of these collections would be null, in which case it would treat every tool as Enable and mark every tool as defer-loading. If Disable was non-null but Enable was null, then it would treat every tool as Enable except for those listed in Disable. Conversely, if Enable was non-null and Disable was null, it would only treat those tools in Enable as defer-loading, and if both were non-null, it would treat everything in Enable as defer-loading unless it was also listed in Disable. Thoughts? Is the dominant case for tool search with OpenAI and Anthropic one where most tools are in the same boat and only zero to a small number are special-cased the other way? Or is the right way to model this what we currently have?
Yes, I think the dominant case is indeed "most/all tools are searchable" — the whole point of tool search is to scale to many tools, so users will almost always want blanket defer-loading with at most a few exceptions. Your proposed design is better for that common case. Today, with 50 tools you'd wrap all 50 in `SearchableAIFunctionDeclaration`.

A few thoughts on the details:

- Enable/Disable semantics: the logic makes sense and is consistent with how …
- String names vs `AITool` references: string names (like …
- Namespace: the current design has per-tool …
- What we'd lose: the …

Overall I think this is a better design. Want me to implement it?
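The proposed semantics reduce to a small predicate. Here is a minimal C# sketch (`ShouldDeferLoad` is a hypothetical helper using this proposal's Enable/Disable naming; the design as later implemented renames the collections to `DeferredTools`/`NonDeferredTools`):

```csharp
// Sketch of the proposed semantics. Both collections null (the default)
// means every tool is treated as Enable, i.e. everything is defer-loaded.
static bool ShouldDeferLoad(string toolName, ICollection<string>? enable, ICollection<string>? disable)
{
    // A tool listed in Disable is never defer-loaded, even if also in Enable.
    if (disable?.Contains(toolName) == true)
    {
        return false;
    }

    // No Enable list: every remaining tool is defer-loaded.
    // Otherwise, only tools explicitly listed in Enable are.
    return enable is null || enable.Contains(toolName);
}
```

This covers all four null/non-null combinations described above, with Disable always taking precedence.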
@copilot, yes, please
…rredTools/NonDeferredTools

- Redesign `HostedToolSearchTool` with `DeferredTools`/`NonDeferredTools` properties
- Remove `SearchableAIFunctionDeclaration` (no longer needed)
- Revert `DelegatingAIFunctionDeclaration` to internal
- Update OpenAI provider: use `HostedToolSearchTool` enable/disable logic for defer_loading
- Add `ChatOptions` parameter to the `AsOpenAIResponseTool` extension method
- Use AOT-safe `ModelReaderWriter.Read` with `OpenAIContext.Default`
- Update API baselines and tests

Co-authored-by: stephentoub <[email protected]>
Implemented in f295d10.
@PederHP any thoughts?
…tor ToResponseTool signature

- Fix O(N²) by finding `HostedToolSearchTool` once before the tools loop instead of scanning the list for each tool
- Remove `HostedToolSearchTool` from the JSON baseline (experimental types don't need entries)
- Refactor `ToResponseTool(AITool, ...)` to take `HostedToolSearchTool?` directly instead of extracting it from `ChatOptions` each time
- Remove the `FindToolSearchTool` helper method (inlined into callers)

Co-authored-by: stephentoub <[email protected]>
Pull request overview
Adds a new experimental hosted tool marker (`HostedToolSearchTool`) to enable `tool_search`, and implements deferred loading (`defer_loading`) support in the OpenAI Responses provider by patching function tools based on the `DeferredTools`/`NonDeferredTools` configuration.
Changes:
- Introduces `HostedToolSearchTool` (experimental) and a new `DiagnosticIds.Experiments.AIToolSearch` constant.
- Updates OpenAI Responses tool conversion to emit `tool_search` and optionally patch function tools with `defer_loading: true`.
- Adds/extends unit tests covering tool_search conversion, caching, and deferred-loading selection logic.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| test/Libraries/Microsoft.Extensions.AI.OpenAI.Tests/OpenAIConversionTests.cs | Adds OpenAI conversion tests for tool_search and defer_loading behavior. |
| test/Libraries/Microsoft.Extensions.AI.Abstractions.Tests/Tools/HostedToolSearchToolTests.cs | Adds tests for HostedToolSearchTool construction and list properties. |
| src/Shared/DiagnosticIds/DiagnosticIds.cs | Adds DiagnosticIds.Experiments.AIToolSearch. |
| src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIResponsesChatClient.cs | Adds tool_search mapping and deferred-loading patching for function tools. |
| src/Libraries/Microsoft.Extensions.AI.OpenAI/MicrosoftExtensionsAIResponsesExtensions.cs | Updates AsOpenAIResponseTool to accept ChatOptions? for deferred-loading context. |
| src/Libraries/Microsoft.Extensions.AI.OpenAI/Microsoft.Extensions.AI.OpenAI.json | Updates API baseline for the modified extension method signature. |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Tools/HostedToolSearchTool.cs | Adds the new experimental HostedToolSearchTool abstraction with configuration lists. |
```csharp
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;
using Microsoft.Shared.DiagnosticIds;
```
`System.Diagnostics.CodeAnalysis` is imported but not used in this file. If analyzers like IDE0005 are enforced (often as warnings-as-errors), this will break the build; please remove the unused using.
@copilot remove the using if it's actually unnecessary
The `System.Diagnostics.CodeAnalysis` using is actually needed — it's required for the `[Experimental]` attribute on line 22. The automated reviewer was incorrect. `HostedImageGenerationTool` has the same pattern (experimental + same using).
```csharp
return OpenAIResponsesChatClient.ToResponseTool(
    tool,
    OpenAIResponsesChatClient.FindToolSearchTool(options?.Tools),
```
@copilot why was it valuable to change the signature of `ToResponseTool` like this? I'm not seeing the benefit over just having `ToResponseTool` call `FindToolSearchTool` if/when it needs to.
Oh, never mind, I see — because we need to call `ToResponseTool` for each `AIFunction`.
You're right — there was no benefit to exposing the `HostedToolSearchTool?` parameter on the internal `ToResponseTool`. Fixed in b53a6b9: added a simpler `ToResponseTool(AITool, ChatOptions?)` overload that calls `FindToolSearchTool` internally. The 3-param version and `FindToolSearchTool` are now both private. The internal loop still pre-finds the tool for O(N) efficiency.
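A sketch of the resulting shape (method names are from the discussion; signatures and bodies are approximations, not the committed code):

```csharp
// Simpler overload used by the public extension method: it resolves the
// HostedToolSearchTool from the options itself.
internal static ResponseTool? ToResponseTool(AITool tool, ChatOptions? options) =>
    ToResponseTool(tool, FindToolSearchTool(options?.Tools), options);

// The 3-parameter version stays private; the internal conversion loop calls it
// directly after pre-finding the HostedToolSearchTool once, keeping the loop O(N).
private static ResponseTool? ToResponseTool(
    AITool tool, HostedToolSearchTool? toolSearchTool, ChatOptions? options)
{
    // ... per-tool conversion ...
    return null;
}
```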
```csharp
var json = ModelReaderWriter.Write(result!, ModelReaderWriterOptions.Json).ToString();
Assert.DoesNotContain("defer_loading", json);
}
```
@copilot please add one or more OpenAI unit tests that confirm the right JSON is sent when using `HostedToolSearchTool` in various ways, and an integration test as well.
Added in ead292d. Six new `VerbatimHttpHandler` unit tests in `OpenAIResponseClientTests.cs` validate the exact JSON request body for various `HostedToolSearchTool` configurations:

- `ToolSearchTool_OnlyToolSearch_NonStreaming` — just `tool_search` in the tools array
- `ToolSearchTool_AllToolsDeferred_NonStreaming` — both lists null → all function tools get `defer_loading: true`
- `ToolSearchTool_SpecificDeferredTools_NonStreaming` — `DeferredTools = ["GetWeather"]` → only that tool deferred
- `ToolSearchTool_NonDeferredExclusion_NonStreaming` — `NonDeferredTools = ["ImportantTool"]` → all except that tool deferred
- `ToolSearchTool_BothLists_DisableTakesPrecedence_NonStreaming` — both lists set, NonDeferred wins
- `ToolSearchTool_NoFunctionTools_NonStreaming` — `tool_search` + web_search, no function tools

Plus an integration test, `UseToolSearch_WithDeferredFunctions`, in `OpenAIResponseClientIntegrationTests.cs`.
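As an illustration of what these tests assert, the request body for the all-tools-deferred case would look roughly like the following (an approximation — the field order, the function schema, and surrounding request fields are simplified here):

```json
{
  "tools": [
    { "type": "tool_search" },
    {
      "type": "function",
      "name": "GetWeather",
      "description": "Gets the current weather",
      "parameters": { "type": "object", "properties": {} },
      "defer_loading": true
    }
  ]
}
```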
…SearchTool private Co-authored-by: stephentoub <[email protected]>
…ation test Co-authored-by: stephentoub <[email protected]>
```diff
  /// </remarks>
- public static ResponseTool? AsOpenAIResponseTool(this AITool tool) =>
-     OpenAIResponsesChatClient.ToResponseTool(Throw.IfNull(tool));
+ public static ResponseTool? AsOpenAIResponseTool(this AITool tool, ChatOptions? options = null)
```
Suggested change:

```diff
- public static ResponseTool? AsOpenAIResponseTool(this AITool tool, ChatOptions? options = null)
+ public static ResponseTool? AsOpenAIResponseTool(this AITool tool, ChatOptions? options)
```
`options = null` may make it too easy to miss the defer_loading savings. The only alternative that comes to mind is to make it non-optional, but that will make code look ugly, e.g. `meaiTool.AsOpenAIResponseTool(null)`.

Since this is a subset of a subset of cases — i.e., MEAI-to-Responses conversion with `HostedToolSearchTool` — it may be fine as is.
```csharp
return functionTool;

case HostedToolSearchTool:
    return _toolSearchResponseTool ??= ModelReaderWriter.Read<ResponseTool>(BinaryData.FromString("""{"type": "tool_search"}"""), ModelReaderWriterOptions.Json, OpenAIContext.Default)!;
```
Is it OK to return the shared singleton? It may be corrupted by callers.
```csharp
null,
};

case HostedMcpServerTool mcpTool:
```
Mcp toolsets also interact with the tool-search tool.
https://developers.openai.com/api/docs/guides/tools-connectors-mcp#defer-loading-tools-in-an-mcp-server
https://platform.claude.com/docs/en/agents-and-tools/tool-use/tool-search-tool#mcp-integration
Anthropic also allows setting per-tool config for an MCP server (e.g., defer_loading overrides for individual tools). `HostedMcpServerTool` doesn't currently have a way to express per-tool config, so an Anthropic `IChatClient` implementation would likely need a convention like `ServerName/ToolName` in the `DeferredTools`/`NonDeferredTools` lists as a workaround.
Implements tool search and deferred loading support (issue #7371): a `HostedToolSearchTool` marker with `DeferredTools`/`NonDeferredTools` collections that control which tools get deferred loading. By default, all tools are deferred when a `HostedToolSearchTool` is present. OpenAI Responses API support is included; Anthropic follows separately.

New abstractions (`Microsoft.Extensions.AI.Abstractions`)

- `HostedToolSearchTool` — marker `AITool` (same pattern as `HostedWebSearchTool`/`HostedCodeInterpreterTool`); maps to the `tool_search` hosted tool. Includes `DeferredTools` and `NonDeferredTools` properties (`IList<string>?`) to control per-tool deferred loading:
  - Both `null` (default) → all tools get deferred loading
  - `DeferredTools` non-null, `NonDeferredTools` null → only the listed tools get deferred loading
  - `DeferredTools` null, `NonDeferredTools` non-null → all tools except the listed ones get deferred loading
  - Both non-null → tools in `DeferredTools` get deferred loading unless also listed in `NonDeferredTools`
- `DiagnosticIds.Experiments.AIToolSearch` — new constant (maps to the existing `MEAI001`); `HostedToolSearchTool` is `[Experimental(AIToolSearch)]`.

OpenAI provider (`Microsoft.Extensions.AI.OpenAI`)

- `HostedToolSearchTool` → a `ResponseTool` deserialized from `{"type":"tool_search"}` via AOT-safe `ModelReaderWriter.Read` with `OpenAIContext.Default`, cached in a static field
- Finds the `HostedToolSearchTool` in `ChatOptions.Tools` and patches `defer_loading: true` onto matching `FunctionTool` instances based on the `DeferredTools`/`NonDeferredTools` logic
- The `AsOpenAIResponseTool` extension now accepts an optional `ChatOptions?` parameter for defer_loading context

Tests

- Unit tests (`OpenAIResponseClientTests.cs`): six `VerbatimHttpHandler`-based tests validating the exact JSON request body sent for various `HostedToolSearchTool` configurations: tool_search only, all tools deferred, specific deferred tools, non-deferred exclusion, both lists with precedence, and mixed with other hosted tools (web search)
- Conversion tests (`OpenAIConversionTests.cs`): tests for the `AsOpenAIResponseTool` extension covering tool_search conversion, caching, and defer_loading with all combinations of `DeferredTools`/`NonDeferredTools`
- Abstraction tests (`HostedToolSearchToolTests.cs`): tests for `HostedToolSearchTool` properties and defaults
- Integration test (`OpenAIResponseClientIntegrationTests.cs`): `UseToolSearch_WithDeferredFunctions`, exercising `HostedToolSearchTool` with real function tools against the OpenAI API

Usage
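A typical usage, based on the behavior described above (a sketch — `GetCurrentTime` and `GetWeather` are illustrative functions, not part of the PR):

```csharp
using Microsoft.Extensions.AI;

ChatOptions options = new()
{
    Tools =
    [
        // The marker enables tool_search. With DeferredTools/NonDeferredTools left
        // null, every function tool is sent with defer_loading: true; listing a
        // name in NonDeferredTools keeps that tool's full schema upfront.
        new HostedToolSearchTool { NonDeferredTools = ["GetCurrentTime"] },
        AIFunctionFactory.Create(GetCurrentTime), // sent in full (excluded from deferral)
        AIFunctionFactory.Create(GetWeather),     // deferred: name/description only upfront
    ],
};
```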
Original prompt
Problem
Implement tool search and deferred loading support as described in #7371. Both OpenAI and Anthropic now support tool search, where tool definitions can be sent with deferred loading (only the name/description sent upfront, the full schema deferred) and a special `tool_search` hosted tool is included that the model can invoke to search for and load full tool definitions on demand.

Design
Follow Option A from the issue discussion — a `HostedToolSearchTool` marker tool + a `SearchableAIFunctionDeclaration` decorator, consistent with existing patterns (`HostedWebSearchTool`, `ApprovalRequiredAIFunction`, etc.).

Requirements
1. New types in `Microsoft.Extensions.AI.Abstractions`

- `HostedToolSearchTool` (in `src/Libraries/Microsoft.Extensions.AI.Abstractions/Tools/HostedToolSearchTool.cs`)
  - `AITool` subclass, following the exact same pattern as `HostedWebSearchTool` and `HostedCodeInterpreterTool`; `Name` returns `"tool_search"`.
  - … (`IReadOnlyDictionary<string, object?>? additionalProperties`).
  - `[Experimental(DiagnosticIds.Experiments.AIToolSearch, UrlFormat = DiagnosticIds.UrlFormat)]`.
- `SearchableAIFunctionDeclaration` (in `src/Libraries/Microsoft.Extensions.AI.Abstractions/Functions/SearchableAIFunctionDeclaration.cs`)
  - Derives from `DelegatingAIFunctionDeclaration` (which is currently `internal`). Important: `DelegatingAIFunctionDeclaration` is the declaration-only delegating base (not `DelegatingAIFunction`, which requires `AIFunction`). This is because `SearchableAIFunctionDeclaration` should work with `AIFunctionDeclaration` instances that may not have `InvokeAsync`.
  - Constructor parameters: `AIFunctionDeclaration innerFunction` and `string? namespaceName = null`.
  - `Namespace` property (`string?`) for grouping related tools.
  - `sealed`.
  - `[Experimental(DiagnosticIds.Experiments.AIToolSearch, UrlFormat = DiagnosticIds.UrlFormat)]`.
  - `public static IList<AITool> CreateToolSet(IEnumerable<AIFunctionDeclaration> functions, string? namespaceName = null, IReadOnlyDictionary<string, object?>? toolSearchProperties = null)`, which wraps all functions as `SearchableAIFunctionDeclaration` and prepends a `HostedToolSearchTool`, returning a complete tool list ready for `ChatOptions.Tools`.

2. DiagnosticIds update
In `src/Shared/DiagnosticIds/DiagnosticIds.cs`, add a new constant in the `Experiments` class. Place it alongside the other AI experiment constants (near `AIWebSearch`, `AICodeInterpreter`, etc.).

3. OpenAI provider implementation
In `src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIResponsesChatClient.cs`, in the `ToResponseTool(AITool tool, ChatOptions? options = null)` method:

Add handling for `HostedToolSearchTool` — this maps to the OpenAI `tool_search` response tool. Since the underlying OpenAI .NET SDK likely doesn't have a `ToolSearchTool` class yet, you need to manually construct a `ResponseTool` from JSON. Cache the deserialized `ResponseTool` instance in a static field so it's only created once. Use the `ModelReaderWriter` pattern or direct JSON deserialization to create a `ResponseTool` from the JSON `{"type": "tool_search"}`. Add a `private static ResponseTool? s_toolSearchResponseTool;` field to cache it.

For `SearchableAIFunctionDeclaration`: when an `AIFunctionDeclaration` is detected as having a `SearchableAIFunctionDeclaration` via `GetService<SearchableAIFunctionDeclaration>()`, the generated `FunctionTool` should have `defer_loading` set to `true` and optionally include the `namespace` metadata. Since the OpenAI SDK's `FunctionTool` class may not have these properties yet, use the `Patch` property to set them on the JSON. The check should happen in the existing `case AIFunctionDeclaration aiFunction:` branch — after calling `ToResponseTool(aiFunction, options)`, check whether the original `tool` (or `aiFunction`) has `GetService<SearchableAIFunctionDeclaration>()` and, if so, patch the resulting `FunctionTool` with `defer_loading` and `namespace`. This is done in the `ToResponseTool(AITool, ChatOptions?)` method so it doesn't infect the general `ToResponseTool(AIFunctionDeclaration, ChatOptions?)` helper. Specifically, the `case AIFunctionDeclaration aiFunction:` case should become:
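A sketch consistent with the prose above — the `Patch.Set` calls and `GetService` usage mirror the instructions rather than verified SDK signatures, and this was ultimately superseded by the `DeferredTools`/`NonDeferredTools` design:

```csharp
// Cached tool_search ResponseTool, constructed from raw JSON because the SDK
// has no dedicated ToolSearchTool type.
private static ResponseTool? s_toolSearchResponseTool;

// Inside ToResponseTool(AITool, ChatOptions?):
case HostedToolSearchTool:
    return s_toolSearchResponseTool ??= ModelReaderWriter.Read<ResponseTool>(
        BinaryData.FromString("""{"type": "tool_search"}"""))!;

case AIFunctionDeclaration aiFunction:
    ResponseTool? responseTool = ToResponseTool(aiFunction, options);
    if (responseTool is not null &&
        aiFunction.GetService<SearchableAIFunctionDeclaration>() is { } searchable)
    {
        // FunctionTool has no defer_loading/namespace properties yet, so patch
        // them onto the serialized JSON, as described above.
        responseTool.Patch.Set("$.defer_loading"u8, true);
        if (searchable.Namespace is { } ns)
        {
            responseTool.Patch.Set("$.namespace"u8, ns);
        }
    }
    return responseTool;
```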