feat: @budget decorator for Flows — cost, token & request limits with HITL approval #4837

alex-clawd wants to merge 7 commits into crewAIInc:main
Conversation
Add a @cost_governor decorator for Flow methods that enables budget and
token limit enforcement. Key features:
- Budget limits in USD with per-model pricing (GPT-4o, Claude, Gemini, etc.)
- Token limits for hard caps on total token usage
- Three on_exceed modes:
  - 'pause': Uses existing HITL infrastructure to ask for human approval
  - 'stop': Raises BudgetExceededError immediately
  - 'warn': Logs a warning and continues
- Cumulative cost tracking across flow execution via flow.cost_summary
- Custom cost_map support for non-standard model pricing
- Works with both sync and async flow methods
New exports from crewai.flow:
- cost_governor: The decorator
- CostGovernorConfig: Configuration dataclass
- CostTracker: Internal tracking class
- BudgetExceededError: Exception for stop/denied scenarios
Example usage:

    @start()
    @cost_governor(budget_limit=5.00, on_exceed='pause')
    def expensive_task(self):
        return crew.kickoff()
Co-Authored-By: Claude Opus 4.5 <[email protected]>
Fix prefix matching and token-limit approval:
- Fix prefix matching to prefer the longest match (e.g., gpt-4o-mini over gpt-4o)
- Use word-boundary matching for denial detection to avoid false positives
- Add approved_tokens tracking for token-limit continuation
- Add effective_token_limit property to track total allowed tokens
- Update cost_summary to include the new token-tracking fields
- Add tests for prefix matching and token-limit approval

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…nsive token tracking

BREAKING CHANGE: Renamed @cost_governor to @budget decorator

Renamed:
- cost_governor.py → budget.py
- @cost_governor() → @budget()
- CostGovernorConfig → BudgetConfig
- CostTracker → BudgetTracker
- cost_summary → budget_summary
- _cost_tracker → _budget_tracker
- budget_limit → max_cost
- token_limit → max_tokens

New features:
- max_requests: Limit total LLM requests (tracked via the event bus)
- cost_per_prompt_token / cost_per_completion_token: Custom flat per-token pricing
- cost_map now supports per-model override pricing
- Priority: flat pricing > cost_map > DEFAULT_MODEL_COSTS
- Request counting via an LLMCallStartedEvent listener
- Enhanced HITL approval message shows which limit was hit
- Extracts usage from LiteAgentOutput.usage_metrics

New BudgetTracker fields:
- total_requests: LLM request count
- approved_requests: Additional approved requests
- is_request_limit_exceeded property
- effective_request_limit property

Tests:
- 63 comprehensive tests covering all new functionality
- Tests for request limits, custom pricing, combined limits
- Async flow method support verified

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…pshots

- Listen to LLMCallCompletedEvent to capture tokens from ALL LLM calls (raw LLM.call(), Agent.kickoff(), Crew.kickoff())
- Use pre/post snapshots of BaseLLM._token_usage for per-call deltas
- Avoid double-counting: skip result extraction when events captured tokens
- Wait for async event-handler completion before checking limits
- Verified with real OpenAI API calls across all three scenarios

63 unit tests + 67 flow tests passing.
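The pre/post snapshot technique amounts to copying the shared token counters before a call and diffing afterwards. A minimal sketch, assuming the counters are a plain dict of ints (the helper names are illustrative; the PR snapshots `BaseLLM._token_usage`):

```python
def snapshot(usage: dict[str, int]) -> dict[str, int]:
    """Copy the shared counters just before an LLM call starts."""
    return dict(usage)

def call_delta(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Per-call usage = post-call counters minus the pre-call snapshot."""
    return {k: after.get(k, 0) - before.get(k, 0) for k in after}

# Simulated flow: one LLM call mutates the shared counters in place.
usage = {"prompt_tokens": 100, "completion_tokens": 40}
pre = snapshot(usage)
usage["prompt_tokens"] += 250      # the call's own prompt tokens
usage["completion_tokens"] += 80   # the call's own completion tokens
delta = call_delta(pre, usage)
```

Because only the delta is attributed to the call, tokens accumulated by earlier calls on the same LLM instance are not double-counted.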
…ion snapshots, sleep removal

- Issue 3: Guard format strings for effective_budget/token_limit/request_limit that could be None with an "N/A" fallback
- Issue 4: Only increase approved_tokens/approved_requests when their respective limits are exceeded (previously approved_budget was always increased)
- Issue 5: Check approval patterns (yes, approve, continue, go ahead, proceed, ok, okay) BEFORE denial patterns to avoid false positives like "no problem"; also check for structured HITL emit responses (approved/denied) first
- Issue 6: Make the budget regex require a $ prefix and the token regex require an explicit "tokens" suffix or k/K suffix, so the two don't cross-parse the same numbers
- Issue 7: Add a docstring note about the concurrent-flow limitation of the global event bus
- Issue 8: Replace the hardcoded 100ms sleep with a polling loop (50ms max, 5ms intervals) that checks whether tokens arrived via events, skipping immediately if no LLM requests were made
- Issue 9: Move the _llm_snapshots dict inside the wrapper functions so each method invocation gets its own dict, avoiding cross-instance interference

Co-Authored-By: Claude Opus 4.5 <[email protected]>
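The Issue 8 polling loop is straightforward to sketch: bail out immediately when no requests were made, otherwise poll a readiness predicate at short intervals up to a deadline. The function name and return convention below are assumptions for illustration, not the PR's actual helper:

```python
import time

def wait_for_event_tokens(tokens_arrived, requests_made: int,
                          timeout: float = 0.05, interval: float = 0.005) -> bool:
    """Poll until async event handlers have delivered token counts.

    tokens_arrived: zero-arg callable returning True once events landed.
    Returns True if tokens arrived, False on timeout or when there is
    nothing to wait for (no LLM requests were made by the method).
    """
    if requests_made == 0:
        return False  # skip immediately: no events are coming
    deadline = time.monotonic() + timeout
    while True:
        if tokens_arrived():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Unlike a fixed 100ms sleep, this returns as soon as the tokens show up and pays nothing at all when the method made no LLM calls.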
… approval

Co-Authored-By: Claude Opus 4.5 <[email protected]>
    limits_str = ", ".join(limits_hit)

    # Build message for human
    budget_str = f"${tracker.max_cost:.2f}" if tracker.max_cost else "unlimited"
HITL approval message shows wrong budget limit
Medium Severity
The HITL approval message displays tracker.max_cost (the original budget) instead of tracker.effective_budget (which includes previously approved amounts). After a first approval (e.g., max_cost=$5, approved_budget=$5, effective_budget=$10), if the budget is exceeded again at $12, the message says "cost ($12.00 >= $5.00)" and "$5.00 budget" — implying the limit was $5, when the actual enforced limit was $10. This gives the human reviewer incorrect information to base their approval decision on.
Additional Locations (1)
    if re.search(approval_pattern, feedback_lower):
        is_approved = True
    elif re.search(denial_pattern, feedback_lower):
        is_denied = True
Negated approval phrases falsely match as approved
Medium Severity
The feedback text parsing checks approval patterns before denial patterns, so negated phrases like "not ok", "not okay", or "no, ok fine" incorrectly match the approval regex (\bok\b / \bokay\b) and are treated as approval. Since the approval pattern is checked first and short-circuits, the denial keyword is never evaluated. This could cause unintended budget continuation with real cost implications.
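One way to resolve the tension between this finding and the earlier "no problem" false-denial fix is to test explicitly for negated approvals before anything else. The patterns and function below are an illustrative sketch, not the PR's code:

```python
import re

# Negations like "not ok" / "not approved" must win over the bare approval words.
NEGATED_APPROVAL = re.compile(r"\bnot?\s+(ok(ay)?|approved?|fine|good)\b")
APPROVAL = re.compile(r"\b(yes|approved?|continue|go ahead|proceed|ok(ay)?)\b")
DENIAL = re.compile(r"\b(no|deny|denied|reject(ed)?|stop|cancel)\b")

def classify_feedback(text: str) -> str:
    """Order matters: negated approval -> denial, then approval (so that
    'no problem, go ahead' still approves), then plain denial words."""
    lower = text.lower()
    if NEGATED_APPROVAL.search(lower):
        return "denied"
    if APPROVAL.search(lower):
        return "approved"
    if DENIAL.search(lower):
        return "denied"
    return "unclear"
```

This keeps "no problem, go ahead" as an approval while "not ok" and "not approved" correctly deny.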
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 3 total unresolved issues (including 2 from previous reviews).
    def _create_completion_tracker(
        tracker: BudgetTracker,
        cfg: BudgetConfig,
        llm_snapshots: dict[int, dict[str, int]],
There was a problem hiding this comment.
Unused cfg parameter in _create_completion_tracker
Low Severity
The cfg: BudgetConfig parameter of _create_completion_tracker is never referenced in the function body or its inner handler closure. It's passed at both call sites (lines 883 and 905) but serves no purpose, adding unnecessary noise to the interface.


Summary
This PR introduces a native Budget decorator for CrewAI Flows, enabling cost, token, and request limit enforcement with human-in-the-loop (HITL) approval when limits are exceeded.
Key Features
- @budget decorator for Flow methods with configurable limits:
  - max_cost: Maximum cost in USD
  - max_tokens: Maximum total tokens
  - max_requests: Maximum LLM request count (NEW)
  - on_exceed: Action when limits are exceeded ('pause', 'stop', 'warn')
- Custom pricing support (NEW):
  - cost_per_prompt_token / cost_per_completion_token: Flat per-token pricing
  - cost_map: Per-model pricing overrides
- Comprehensive token tracking:
  - LLMCallStartedEvent listener to count LLM requests
  - Extracts usage from CrewOutput, LiteAgentOutput, and any object with a token_usage or usage_metrics attribute
- Three enforcement modes:
  - 'pause' (default): Uses existing HITL infrastructure to request human approval to continue
  - 'stop': Raises BudgetExceededError immediately
  - 'warn': Logs a warning and continues execution
- Better HITL approval UX
API
Usage Example
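The usage example did not survive the page export. A minimal sketch consistent with the renamed API described in this PR's summary (the import path, decorator signature, and `crew` object are assumptions, not verified against the branch):

```python
from crewai.flow import Flow, start, budget, BudgetExceededError

class ResearchFlow(Flow):
    @start()
    @budget(max_cost=5.00, max_tokens=200_000, on_exceed="pause")
    def expensive_task(self):
        # When a limit is hit, the flow pauses for HITL approval;
        # a denial raises BudgetExceededError. `crew` is illustrative.
        return crew.kickoff()
```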
Breaking Changes
- @cost_governor → @budget
- CostGovernorConfig → BudgetConfig
- CostTracker → BudgetTracker
- cost_summary → budget_summary
- budget_limit → max_cost, token_limit → max_tokens

Backwards-compatible aliases are provided: cost_governor, CostGovernorConfig, CostTracker

New Exports from crewai.flow
- budget: The decorator
- BudgetConfig: Configuration dataclass
- BudgetTracker: Internal tracking class (for advanced use)
- BudgetExceededError: Exception raised when limits are exceeded and denied/stopped

Test Plan
🤖 Generated with Claude Code
Note
Medium Risk
Adds new opt-in runtime governance around LLM usage by hooking into the global event bus and Flow method wrapping, which could affect execution timing and accounting (especially with concurrent flows) if misconfigured.
Overview
Introduces a new @budget decorator for Flow methods that tracks token usage, estimated cost, and LLM request count, then enforces configurable limits via warn, stop (BudgetExceededError), or pause (HITL approval) behavior.

Budget tracking is integrated into Flow via a per-instance tracker and a new budget_summary property, and Flow method wrappers now preserve __budget_config__ metadata. The implementation adds event-bus listeners (LLMCallStartedEvent/LLMCallCompletedEvent) to count requests and capture per-call token deltas, with support for custom pricing (cost_map or per-token rates) and comprehensive new tests covering sync/async paths and approval/denial flows.

Written by Cursor Bugbot for commit 3ac17a4. This will update automatically on new commits.