
Explore adding support for tool calling in HuggingFaceAPIChatGenerator when streaming #9369

@sjrl

Description

Describe the Feature
It would be great to add support for tool calling when running HuggingFaceAPIChatGenerator in streaming mode.

As shown in this snippet:

text = choice.delta.content or ""
generated_text += text

we only process the generated text and store it solely as text content:

message = ChatMessage.from_assistant(text=generated_text, meta=meta)

whereas we should also populate the tool_calls parameter of ChatMessage when a tool call is present.

The underlying HuggingFace streaming chunk dataclass does contain tool call information:

@dataclass_with_extra
class ChatCompletionStreamOutputDelta(BaseInferenceType):
    role: str
    content: Optional[str] = None
    tool_call_id: Optional[str] = None
    tool_calls: Optional[List[ChatCompletionStreamOutputDeltaToolCall]] = None

Additional context
It looks like _run_streaming would need to be updated to process tool calling streaming chunks.
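Such an update would need to merge partial tool-call deltas across chunks, since servers typically stream a tool call's id and function name once and then emit its JSON arguments in fragments. Below is a minimal, self-contained sketch of that accumulation logic; `FunctionDelta`, `ToolCallDelta`, and `accumulate_tool_calls` are hypothetical stand-ins for illustration, not the actual huggingface_hub types or Haystack internals:

```python
# Sketch: merging streamed tool-call deltas into complete tool calls.
# FunctionDelta / ToolCallDelta mimic the shape of OpenAI-style streaming
# deltas (an index, an optional id, and incremental argument fragments).
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FunctionDelta:
    name: Optional[str] = None
    arguments: str = ""


@dataclass
class ToolCallDelta:
    index: int
    id: Optional[str] = None
    function: Optional[FunctionDelta] = None


def accumulate_tool_calls(deltas: List[ToolCallDelta]) -> List[dict]:
    """Merge streamed tool-call deltas into complete tool calls, keyed by index."""
    calls: Dict[int, dict] = {}
    for delta in deltas:
        call = calls.setdefault(delta.index, {"id": None, "name": None, "arguments": ""})
        if delta.id:
            call["id"] = delta.id
        if delta.function:
            if delta.function.name:
                call["name"] = delta.function.name
            # Argument JSON arrives in fragments; concatenate them in order.
            call["arguments"] += delta.function.arguments
    return [calls[i] for i in sorted(calls)]


deltas = [
    ToolCallDelta(index=0, id="call_1", function=FunctionDelta(name="weather", arguments='{"ci')),
    ToolCallDelta(index=0, function=FunctionDelta(arguments='ty": "Paris"}')),
]
print(accumulate_tool_calls(deltas))
# → [{'id': 'call_1', 'name': 'weather', 'arguments': '{"city": "Paris"}'}]
```

The accumulated name/arguments pairs could then be converted into ToolCall objects and passed to ChatMessage.from_assistant via its tool_calls parameter instead of being dropped.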

To Reproduce

from haystack.tools import Tool
from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat.hugging_face_api import HuggingFaceAPIChatGenerator
from haystack.components.generators.utils import print_streaming_chunk
from haystack.utils import HFGenerationAPIType

def get_weather(city: str) -> str:
    """Get weather information for a city."""
    return f"The weather in {city} is Sunny and 22 C"

tool_parameters = {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]}
tool = Tool(
    name="weather",
    description="useful to determine the weather in a given location",
    parameters=tool_parameters,
    function=get_weather,
)

chat_messages = [ChatMessage.from_user("What's the weather like in Paris?")]
generator = HuggingFaceAPIChatGenerator(
    api_type=HFGenerationAPIType.SERVERLESS_INFERENCE_API,
    api_params={"model": "NousResearch/Hermes-3-Llama-3.1-8B"},
    generation_kwargs={"temperature": 0.5},
    streaming_callback=print_streaming_chunk,
)
results = generator.run(chat_messages, tools=[tool])

Labels: P3 (Low priority, leave it in the backlog), type:feature (New feature or request)