Replies: 3 comments
-
Thanks for the suggestion, I've passed it along!
-
## Answer for #1370: Assistants API | New 'role' to differentiate final message

### Solution: Message State Machine Pattern for Streaming Assistants

You've identified a critical UX problem in streaming Assistants API responses. The core issue is semantic overloading of the `assistant` role: tool planning, troubleshooting chatter, and the final answer all stream back identically.

### 1. Immediate Workaround: State Inference from Message Content

Until OpenAI adds explicit role differentiation, infer message intent from content structure:

```python
import chainlit as cl
from enum import Enum
from openai.types.beta.threads import MessageDelta


class MessageIntent(Enum):
    """Inferred intent of an assistant message"""
    FINAL_ANSWER = "final_answer"             # User-facing response
    TOOL_TROUBLESHOOTING = "troubleshooting"  # Fixing tool errors
    TOOL_PLANNING = "planning"                # Deciding which tools to use
    TOOL_RESULT_ANALYSIS = "analysis"         # Processing tool outputs


def infer_message_intent(message_delta: MessageDelta) -> MessageIntent:
    """
    Infer message intent from streaming delta content.
    This is a heuristic until OpenAI provides explicit intent signals.
    """
    # Check if this is a tool call round
    run_status = get_current_run_status()  # From your run polling
    if run_status == "requires_action":
        # Currently executing tools - any text is planning/troubleshooting
        return MessageIntent.TOOL_PLANNING

    # Check message content structure
    content = message_delta.content or []
    for block in content:
        if block.type == "text":
            text = block.text.value.lower()
            # Heuristics for troubleshooting
            if any(phrase in text for phrase in [
                "let me try",
                "there was an error",
                "i'll fix",
                "attempting to",
                "retrying",
            ]):
                return MessageIntent.TOOL_TROUBLESHOOTING
            # Heuristics for tool planning
            if any(phrase in text for phrase in [
                "i'll use",
                "i need to",
                "first, i'll",
                "let me check",
            ]):
                return MessageIntent.TOOL_PLANNING

    # Check run step details for tool result analysis
    if run_status == "in_progress":
        step = get_latest_run_step()
        if step.type == "message_creation":
            # Check if previous step was tool_calls
            prev_step = get_previous_run_step()
            if prev_step and prev_step.type == "tool_calls":
                return MessageIntent.TOOL_RESULT_ANALYSIS

    # Default: final answer
    return MessageIntent.FINAL_ANSWER


# Usage in Chainlit UI
async def handle_streaming_message(message_delta: MessageDelta):
    intent = infer_message_intent(message_delta)
    match intent:
        case MessageIntent.FINAL_ANSWER:
            # Display prominently in UI
            await cl.Message(
                content=message_delta.content[0].text.value,
                author="Assistant",
            ).send()
        case MessageIntent.TOOL_TROUBLESHOOTING:
            # Show as collapsible debug info
            await cl.Message(
                content=message_delta.content[0].text.value,
                author="System",
                indent=1,  # Nested display
                type="tool_output",
            ).send()
        case MessageIntent.TOOL_PLANNING | MessageIntent.TOOL_RESULT_ANALYSIS:
            # Show as lightweight status update
            await cl.Message(
                content=message_delta.content[0].text.value,
                author="Assistant",
                indent=1,
                type="thinking",
            ).send()
```

### 2. Robust Solution: Run Step State Machine

The real answer lies in run step tracking. Each run step has a detailed status that tells you exactly what's happening:

```python
from dataclasses import dataclass
from typing import Literal

from openai import AsyncOpenAI


@dataclass
class MessageContext:
    """Rich context for understanding message intent"""
    run_id: str
    thread_id: str
    step_type: Literal["message_creation", "tool_calls"]
    step_status: Literal["in_progress", "completed", "failed"]
    is_final: bool                 # No more pending steps
    tool_call_failures: list[str]  # Failed tool names


async def stream_assistant_with_context(
    client: AsyncOpenAI,
    thread_id: str,
    assistant_id: str,
):
    """Stream assistant responses with full context tracking"""
    # Create a streaming run (with stream=True this returns an event stream,
    # not a Run object, so the run id is captured from the first event)
    stream = await client.beta.threads.runs.create(
        thread_id=thread_id,
        assistant_id=assistant_id,
        stream=True,
    )

    # Track run steps
    run_id = None
    step_history = []
    current_step = None
    async for event in stream:
        match event.event:
            case "thread.run.created":
                run_id = event.data.id
            case "thread.run.step.created":
                current_step = event.data
                step_history.append(current_step)
            case "thread.message.delta":
                if current_step is None:
                    continue  # No step context yet
                # Build context
                ctx = MessageContext(
                    run_id=run_id,
                    thread_id=thread_id,
                    step_type=current_step.type,
                    step_status=current_step.status,
                    is_final=is_final_step(current_step, step_history),
                    tool_call_failures=get_failed_tools(step_history),
                )
                # Yield with context
                yield (event.data, ctx)
            case "thread.run.completed":
                # Mark final message
                yield (None, MessageContext(
                    run_id=run_id,
                    thread_id=thread_id,
                    step_type="message_creation",
                    step_status="completed",
                    is_final=True,
                    tool_call_failures=[],
                ))


def is_final_step(current_step, step_history) -> bool:
    """Determine if this is the final message"""
    # If current step is message_creation and status is completed
    if current_step.type == "message_creation" and current_step.status == "completed":
        # Check if any subsequent tool_calls steps are still pending
        has_pending_tools = any(
            step.type == "tool_calls" and step.status != "completed"
            for step in step_history
            if step.created_at > current_step.created_at
        )
        return not has_pending_tools
    return False


def get_failed_tools(step_history) -> list[str]:
    """Extract names of failed tool calls"""
    failed = []
    for step in step_history:
        if step.type == "tool_calls":
            for tool_call in step.step_details.tool_calls:
                if getattr(tool_call, "error", None):
                    failed.append(tool_call.function.name)
    return failed


# Chainlit integration
async def display_message_with_context(delta: MessageDelta, ctx: MessageContext):
    """Display message based on context"""
    if ctx.is_final and not ctx.tool_call_failures:
        # Final answer - prominent display
        await cl.Message(
            content=delta.content[0].text.value,
            author="Assistant",
            type="final_answer",
        ).send()
    elif ctx.tool_call_failures:
        # Troubleshooting - collapsible
        await cl.Message(
            content=f"⚠️ Fixing {', '.join(ctx.tool_call_failures)}: {delta.content[0].text.value}",
            author="System",
            type="tool_troubleshooting",
            indent=1,
        ).send()
    else:
        # Intermediate reasoning - lightweight
        await cl.Message(
            content=delta.content[0].text.value,
            author="Assistant",
            type="thinking",
            indent=1,
        ).send()
```

### 3. Feature Request: Explicit Intent Field

Your suggestion for an explicit intent signal is the right long-term fix. Here is what the API could expose:

```python
# Proposed API enhancement
class MessageDelta(BaseModel):
    content: Optional[List[MessageContentDelta]] = None
    role: Optional[Literal["user", "assistant"]] = None
    # NEW FIELD
    intent: Optional[Literal[
        "final_answer",          # User-facing response
        "tool_planning",         # Deciding which tools to call
        "tool_troubleshooting",  # Fixing tool errors
        "tool_result_summary",   # Summarizing tool outputs
    ]] = None


# This would enable clean UI logic:
async for event in stream:
    if event.data.intent == "final_answer":
        display_prominently(event.data)
    elif event.data.intent == "tool_troubleshooting":
        display_as_debug_info(event.data)
```

Why this is better than a new role:
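The phrase heuristics from section 1 can be smoke-tested in isolation. This standalone sketch mirrors the same checks; `classify_text` and the phrase lists are illustrative, not part of any SDK:

```python
# Standalone sketch of the phrase heuristics in section 1; runnable without
# any OpenAI client. All names here are illustrative, not SDK calls.
TROUBLESHOOTING_PHRASES = [
    "let me try", "there was an error", "i'll fix", "attempting to", "retrying",
]
PLANNING_PHRASES = ["i'll use", "i need to", "first, i'll", "let me check"]


def classify_text(text: str) -> str:
    """Return a coarse intent label for a chunk of assistant text."""
    lowered = text.lower()
    if any(p in lowered for p in TROUBLESHOOTING_PHRASES):
        return "troubleshooting"
    if any(p in lowered for p in PLANNING_PHRASES):
        return "planning"
    return "final_answer"


print(classify_text("There was an error importing plotly, let me try installing it."))  # troubleshooting
print(classify_text("First, I'll check the available columns."))  # planning
print(classify_text("Here is your sine wave visualization."))  # final_answer
```

As with any keyword heuristic, check order matters: troubleshooting phrases are tested before planning phrases, so mixed messages lean toward troubleshooting.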
### 4. Production Pattern: Message Queue with Intent Classification

For complex UIs, use a message queue with post-processing:

```python
import re
from asyncio import Queue  # a fuller implementation would buffer deltas here
from typing import AsyncIterator


class MessageClassifier:
    """Heuristic message intent classifier (optional enhancement)"""

    def __init__(self):
        # Could use simple patterns or a fine-tuned model
        self.patterns = {
            "final_answer": [r"in summary", r"therefore", r"the answer is"],
            "troubleshooting": [r"error", r"failed", r"let me try again"],
            "planning": [r"i'll use", r"first", r"i need to check"],
        }

    def classify(self, text: str, context: MessageContext) -> MessageIntent:
        """Classify message intent using text patterns and context"""
        # Context-based classification (most reliable)
        if context.is_final:
            return MessageIntent.FINAL_ANSWER
        if context.tool_call_failures:
            return MessageIntent.TOOL_TROUBLESHOOTING

        # Pattern-based fallback
        text_lower = text.lower()
        for intent, patterns in self.patterns.items():
            if any(re.search(pattern, text_lower) for pattern in patterns):
                return MessageIntent(intent)
        return MessageIntent.FINAL_ANSWER


async def stream_with_classification(
    client: AsyncOpenAI,
    thread_id: str,
    assistant_id: str,
) -> AsyncIterator[tuple[MessageDelta, MessageIntent]]:
    """Stream messages with automatic intent classification"""
    classifier = MessageClassifier()
    async for delta, context in stream_assistant_with_context(client, thread_id, assistant_id):
        if delta:
            text = delta.content[0].text.value if delta.content else ""
            intent = classifier.classify(text, context)
            yield (delta, intent)
```

### Summary

Immediate action:
Long-term solution:
Why this matters: The workarounds above will unblock you today, but OpenAI should add this field to the API.
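The regex fallback in pattern 4 can also be sanity-checked offline with the same pattern table; this snippet and the `match_intent` name are illustrative only:

```python
import re

# Standalone check of the regex fallback in pattern 4, using the same
# pattern table; no API access required. Dict order matters: earlier
# intents win, so broad patterns like r"first" should come last.
PATTERNS = {
    "final_answer": [r"in summary", r"therefore", r"the answer is"],
    "troubleshooting": [r"error", r"failed", r"let me try again"],
    "planning": [r"i'll use", r"first", r"i need to check"],
}


def match_intent(text: str) -> str:
    """Return the first intent whose patterns match the text."""
    lowered = text.lower()
    for intent, pats in PATTERNS.items():
        if any(re.search(p, lowered) for p in pats):
            return intent
    return "final_answer"


print(match_intent("The tool call failed with a traceback."))  # troubleshooting
print(match_intent("Therefore, the plot renders correctly."))  # final_answer
```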
-
Role differentiation for streaming is useful! At RevolutionAI (https://revolutionai.io) we build streaming UIs. Current pattern:

```python
async for event in client.beta.assistants.stream(...):
    if event.event == "thread.message.completed":
        # Final message
        handle_final(event.data)
    elif event.event == "thread.message.delta":
        # Streaming delta
        handle_delta(event.data)
```

What would help:
Use cases:
Would make building polished UIs much easier!
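The event-type dispatch above generalizes to a small routing table, which is easy to test offline. A self-contained sketch with stubbed events (`FakeEvent`, `route_events`, and the handlers are invented for illustration, not OpenAI SDK objects):

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeEvent:
    """Stand-in for a streaming event; real events come from the SDK."""
    event: str
    data: dict


async def route_events(events, handlers):
    """Dispatch each event to the handler registered for its event type."""
    for ev in events:
        handler = handlers.get(ev.event)
        if handler:
            handler(ev.data)


final, deltas = [], []
handlers = {
    "thread.message.completed": final.append,
    "thread.message.delta": deltas.append,
}
events = [
    FakeEvent("thread.message.delta", {"text": "Working"}),
    FakeEvent("thread.message.completed", {"text": "Done"}),
]
asyncio.run(route_events(events, handlers))
print(len(final), len(deltas))  # 1 1
```

A routing table like this keeps UI code out of the stream loop, and unknown event types are simply ignored.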
-
I am loving the new Assistants API and the constant improvements. One item I'm struggling with is determining if the `thread.message` is the final answer (it has everything it needs and is responding) or if it's trying to fix an issue with a tool call. This seems limited to streaming, since non-streaming the last message is the final answer. My UI is Chainlit and looks something like this, where tool calls are nested/collapsible and the final response is displayed.

In this example, if there is an issue with any of the tools it "talks out loud" and tries to fix the issue. This is great, but in the API this looks identical to the final response. It would be great to be able to differentiate this while streaming.

To reproduce, ask your assistant to `Create a visualization of a sine wave using Plotly.` It will fail because `plotly` isn't installed and will print messages followed by more tool calls.

Suggestion: Either a new role besides `assistant`/`user`, or a new field for "intention" with values like "troubleshoot"/"tool_call_id" or "final answer".