
dev -> main (Jul 18th) #1532


Merged

merged 23 commits from dev into main on Jul 19, 2025

Conversation

@Ahmad-mtos (Contributor) commented Jul 18, 2025

PR Type

Bug fix, Enhancement, Documentation


Description

  • Fix cartesian product bug in get_task query

  • Enhance LLM client with custom API key support

  • Remove deprecated continue_ field from message models

  • Add automated model documentation sync workflow


Diagram Walkthrough

flowchart LR
  A["Task Query Bug"] --> B["Fixed Cartesian Product"]
  C["LLM Client"] --> D["Custom API Key Support"]
  E["Message Models"] --> F["Remove continue_ Field"]
  G["Model Config"] --> H["Auto-sync Documentation"]
  B --> I["Better Tool Filtering"]
  D --> J["User Secret Integration"]
  F --> K["Cleaner API Schema"]
  H --> L["Updated Model Tables"]


Ahmad-mtos and others added 22 commits (July 9, 2025 08:57):

  • fix(agents-api): refactor get_task query to fix cartesian product bug
    Fixed cartesian product bug in get_task query that caused duplicate tools.
    Co-authored-by: Ahmad Haidar <[email protected]>
  • fix(agents-api): properly handle custom-api-key and secrets api key
    - Added custom API key support feature for LLM client
    - Added model name conversion utility for LiteLLM integration
    - Re-enabled model validation in session chat rendering
    Co-authored-by: Ahmad Haidar <[email protected]>
  • …pt for updating model data from configuration
  • docs: better transparent supported models page
  • fix(agents-api): remove ``continue_`` field + filter out ``name`` field for non openai models
Comment on lines +14 to +51
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: |
          pip install pyyaml

      - name: Run sync script
        run: |
          python src/scripts/sync_model_docs.py

      - name: Check for changes
        id: check_changes
        run: |
          if git diff --quiet documentation/integrations/supported-models.mdx; then
            echo "changed=false" >> $GITHUB_OUTPUT
          else
            echo "changed=true" >> $GITHUB_OUTPUT
          fi

      - name: Commit and push changes
        if: steps.check_changes.outputs.changed == 'true'
        run: |
          git config --global user.name 'github-actions[bot]'
          git config --global user.email 'github-actions[bot]@users.noreply.github.com'
          git add documentation/integrations/supported-models.mdx
          git commit -m "chore: sync model documentation from litellm-config.yaml [skip ci]"
          git push

Check warning

Code scanning / CodeQL

Workflow does not contain permissions (Medium)

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix


To fix the problem, you should add an explicit permissions block to the workflow file to restrict the GITHUB_TOKEN scope to the minimum needed. Since this workflow checks out code and pushes changes (committing to the repository), it requires contents: write permission. The contents: write scope is the least privilege needed for pushing to the repository, while other permissions (such as pull-requests: write) are not required here. The best place to add this is at the top level of the workflow file, so it applies to all jobs (unless overridden).

You should add the following block at the top level of the workflow file (the patch below places it at the very top, before the name: field):

permissions:
  contents: write

No other code or imports are needed.


Suggested changeset 1: .github/workflows/sync-model-docs.yml

Autofix patch

Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/.github/workflows/sync-model-docs.yml b/.github/workflows/sync-model-docs.yml
--- a/.github/workflows/sync-model-docs.yml
+++ b/.github/workflows/sync-model-docs.yml
@@ -1,3 +1,5 @@
+permissions:
+  contents: write
 name: Sync Model Documentation
 
 on:
EOF
claude bot commented Jul 18, 2025

Claude finished @Ahmad-mtos's task.


PR Review Analysis

I'm reviewing this dev -> main PR with 22 commits and 25 changed files. Here's my review plan:

Review Checklist:

  • Security Analysis - Address GitHub Advanced Security workflow permissions concern
  • Workflow Changes - Review new GitHub Actions workflow
  • Core Code Changes - Examine agents-api and integrations-service modifications
  • Documentation Updates - Review documentation changes for accuracy
  • Test Coverage - Analyze new tests and coverage
  • Configuration Changes - Review schema and config updates
  • Performance Considerations - Assess performance implications
  • Best Practices - Check adherence to coding standards

Security Analysis

🔴 Critical Security Issues

  1. GitHub Actions Workflow Permissions (Already flagged by security scanner)

    • File: .github/workflows/sync-model-docs.yml
    • Issue: Workflow lacks explicit permissions block
    • Risk: Defaults to broad permissions which could be exploited
    • Fix: Add explicit permissions block:
    permissions:
      contents: write
      pull-requests: write
  2. External URL Fetch Without Validation

    • File: src/scripts/sync_model_docs.py:359
    • Issue: Script fetches data from external GitHub URL without proper validation
    • Risk: Potential for supply chain attacks if GitHub repository is compromised
    • Fix: Add URL validation, certificate verification, and content validation
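As a concrete illustration of that fix, a minimal hardened fetch might look like the sketch below. The allow-listed host and the expected payload shape are assumptions for illustration, not taken from the script itself:

from urllib.parse import urlparse

import requests

ALLOWED_HOSTS = {"raw.githubusercontent.com"}

def fetch_pricing_data(url: str) -> dict:
    # Refuse anything outside the expected host before touching the network
    if urlparse(url).hostname not in ALLOWED_HOSTS:
        raise ValueError(f"Refusing to fetch from untrusted host: {url}")
    # verify=True (TLS certificate verification) is the requests default
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    data = response.json()
    # Basic content validation: the pricing data is expected to be a JSON object
    if not isinstance(data, dict):
        raise ValueError("Unexpected payload shape from pricing data URL")
    return data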

🟡 Medium Security Issues

  1. Error Handling Exposes Implementation Details
    • File: src/agents-api/agents_api/clients/litellm.py:128-129
    • Issue: Exception details printed to console may expose sensitive information
    • Fix: Use proper logging with sanitization instead of print statements
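A minimal sketch of that fix, assuming the printed exception comes from the completion call in clients/litellm.py (the wrapper name safe_acompletion is hypothetical):

import logging

logger = logging.getLogger(__name__)

async def safe_acompletion(**settings):
    try:
        return await acompletion(**settings)
    except Exception:
        # logger.exception records the traceback in the log stream instead of
        # echoing raw exception details (which may include key material) to stdout
        logger.exception("LLM completion failed (model=%s)", settings.get("model"))
        raise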

Code Quality & Best Practices

✅ Positive Aspects

  1. Comprehensive Test Coverage: Good test additions in test_litellm_utils.py and test_task_queries.py
  2. Proper Error Handling: Usage of contextlib.suppress for graceful secret handling
  3. Type Safety: Consistent use of beartype decorators and type hints
  4. Documentation: Well-documented functions with docstrings

🔴 Issues Found

  1. Hardcoded Path: src/agents-api/agents_api/common/utils/llm_providers.py:5

    • Issue: /app/litellm-config.yaml is hardcoded
    • Impact: Makes code less flexible and harder to test
    • Fix: Use an environment variable or configurable path (see the sketch after this list)
  2. Missing Error Handling: src/scripts/sync_model_docs.py:361-372

    • Issue: Network errors may not be handled gracefully
    • Impact: Script could fail silently or with unclear error messages
  3. SQL Query Complexity: src/agents-api/agents_api/queries/tasks/get_task.py:12-53

    • Issue: Complex SQL with multiple subqueries
    • Impact: Performance implications and maintenance difficulty
    • Consider: Breaking into smaller, more focused queries
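For issue 1 above, a minimal sketch of a configurable path; the LITELLM_CONFIG_PATH variable name is an assumption, not an existing setting:

import os
from pathlib import Path

# Falls back to the current hardcoded location when the variable is unset
CONFIG_PATH = Path(os.environ.get("LITELLM_CONFIG_PATH", "/app/litellm-config.yaml"))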

Performance Considerations

🟡 Potential Performance Issues

  1. Blocking Network Calls: src/scripts/sync_model_docs.py:363

    • Issue: Synchronous HTTP request in async context
    • Impact: Could block event loop
    • Fix: Use aiohttp for async requests (see the sketch after this list)
  2. Memory Usage: Model data processing loads entire config and pricing data into memory

    • Impact: Could be problematic with large datasets
    • Consider: Streaming or chunked processing for large files
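An illustrative async variant of the fetch for issue 1, using aiohttp as suggested (the function name and payload shape are assumptions):

import aiohttp

async def fetch_json(url: str) -> dict:
    timeout = aiohttp.ClientTimeout(total=30)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        async with session.get(url) as response:
            response.raise_for_status()
            # Avoids blocking the event loop the way a synchronous
            # requests.get() call would
            return await response.json()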

Test Coverage Assessment

✅ Good Test Coverage

  1. LiteLLM Utils: Comprehensive tests for model name resolution and API key handling
  2. Task Queries: Good coverage of database query functionality
  3. Mock Usage: Proper use of mocks for external dependencies

🔴 Missing Test Coverage

  1. Error Scenarios: Limited testing of failure cases (network errors, invalid configs)
  2. Integration Tests: Missing tests for end-to-end workflow functionality
  3. Security Tests: No tests for input validation or security scenarios

Recommendations

High Priority

  1. Fix workflow permissions - Add explicit permissions block to GitHub Actions workflow
  2. Secure external data fetching - Add validation and error handling for GitHub API calls
  3. Remove hardcoded paths - Make configuration paths configurable

Medium Priority

  1. Improve error handling - Replace print statements with proper logging
  2. Add integration tests - Test complete workflows end-to-end
  3. Performance optimization - Consider async patterns for I/O operations

Low Priority

  1. Code documentation - Add more inline comments for complex logic
  2. Refactor complex queries - Break down large SQL queries into smaller parts

Overall Assessment

This PR introduces significant functionality improvements including:

  • Enhanced LiteLLM integration with custom API key support
  • Automated documentation sync from configuration
  • Better model resolution and provider mapping
  • Comprehensive test coverage for new features

However, it also introduces security vulnerabilities that should be addressed before merging to main. The code quality is generally good with proper type safety and error handling patterns, but needs security hardening.

Recommendation: Address security issues before merging. Consider implementing security fixes as a follow-up PR if time is critical.


github-actions bot (Contributor) commented Jul 18, 2025

Claude finished @Ahmad-mtos's task.


Changelog Bot Analysis ✅

Analyzed the dev → main PR and successfully updated changelog with user-visible changes.

Todo List:

  • Read diff.patch to understand all changes
  • Analyze changes to identify user-visible features/fixes
  • Create/update changelog file in changelog/main/
  • Update Templates/index-template/home-summary.html if needed
  • Stage, commit, and push changes

Changes Made:

  • ✅ Created changelog/stable (main)/2025-07-18.md with proper YAML front-matter
  • ✅ Updated changelog/Templates/index-template/home-summary.html with July 18, 2025 release highlights
  • ✅ Committed and pushed changes to the dev branch

User-Visible Features Added:

  • Enhanced LLM client to support custom API keys from user secrets
  • Added model name conversion utility for LiteLLM integration
  • Added automated workflow to sync model documentation from configuration
  • Enhanced supported models page with transparent feature comparison tables
  • Added comprehensive model capability tracking (context window, max output, tool calling, vision, audio, caching, cost tiers)

Bug Fixes:

  • Fixed cartesian product bug in get_task query that caused duplicate tools
  • Re-enabled model validation in session chat rendering

Commit Details:

  • Hash: 6a449e2d
  • Message: "changelog: add July 18, 2025 release notes"
  • Files: 2 changed, 27 insertions, 9 deletions

github-actions bot (Contributor) commented Jul 18, 2025

Claude finished @Ahmad-mtos's task.


Analysis Complete

Todo List:

  • Read diff.patch file
  • Analyze changes for user-visible functionality
  • Determine if changes qualify as UPDATE or SKIP
  • Provide final response

Analysis Summary

Based on my analysis of the diff.patch file, I identified several key changes that qualify as user-visible functionality additions and behavior changes:

  1. New JSON Response Format Support: Addition of response_format parameter with support for json_object and json_schema types
  2. Streaming Support: Addition of stream: true parameter for enabling streaming responses (a combined request sketch follows this list)
  3. API Field Removal: The continue_ field has been removed from multiple API models (Chat.py, Tasks.py, entries models)
  4. Enhanced Documentation: New documentation sections explaining the new features
  5. Model Support Updates: Updated supported models documentation with feature comparison tables
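A hypothetical request body combining the first two items; every field name other than response_format and stream is an assumption for illustration:

payload = {
    "messages": [{"role": "user", "content": "Summarize this as JSON."}],
    "response_format": {"type": "json_object"},  # or {"type": "json_schema", ...}
    "stream": True,  # enables streaming responses
}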

Since the changes include both new user-visible functionality and changes to existing behavior, the response is:

UPDATE



PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Logic Complexity

The custom API key handling logic in both acompletion and aembedding functions has complex conditional flows that determine model name transformation and base URL selection. The logic should be thoroughly tested to ensure all combinations of custom_api_key, secret retrieval, and model name handling work correctly.

# Check if user has a custom API key in secrets
api_key_env_var_name = get_api_key_env_var_name(model)
secret = None

if api_user is not None and api_key_env_var_name:
    developer_id: UUID = UUID(api_user)

    with contextlib.suppress(SecretNotFoundError):
        secret = await get_secret_by_name(
            developer_id=developer_id,
            name=api_key_env_var_name,
            decrypt=True,
        )

# Determine if we have a custom API key (either provided or from secrets)
has_custom_key = custom_api_key or (secret and secret.value)

if has_custom_key:
    # User has their own API key - convert model name and use direct provider
    if not custom_api_key and secret:
        custom_api_key = secret.value
    model = get_litellm_model_name(model)
else:
    # No custom key - use our proxy with openai/ prefix
    model = f"openai/{model}"
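To make the combinations concrete, here is a minimal, self-contained sketch of the routing decision with a parametrized test. resolve_model is a hypothetical extraction of the logic above, not an existing helper, and it omits the get_litellm_model_name mapping for brevity:

import pytest

def resolve_model(model: str, custom_api_key: str | None, secret_value: str | None) -> str:
    # Mirrors the branch above: any user-supplied key (explicit or from
    # secrets) routes directly to the provider; otherwise use the proxy prefix
    has_custom_key = custom_api_key or secret_value
    return model if has_custom_key else f"openai/{model}"

@pytest.mark.parametrize(
    ("custom_api_key", "secret_value", "expected"),
    [
        ("sk-user", None, "gpt-4o"),    # explicit key wins
        (None, "sk-stored", "gpt-4o"),  # key recovered from user secrets
        (None, None, "openai/gpt-4o"),  # no key: proxy routing
    ],
)
def test_model_routing(custom_api_key, secret_value, expected):
    assert resolve_model("gpt-4o", custom_api_key, secret_value) == expected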
Query Performance

The subquery approach to fix the cartesian product issue uses a timestamp-based filter (updated_at >= t.updated_at) which may have performance implications on large datasets. The query should be tested with realistic data volumes to ensure acceptable performance.

COALESCE(
    (
        SELECT jsonb_agg(to_jsonb(tl))
        FROM tools tl
        WHERE tl.developer_id = t.developer_id
        AND tl.task_id = t.task_id
        AND tl.updated_at >= t.updated_at
    ),
    '[]'::jsonb
) as tools
Message Filtering

The message filtering logic removes the 'name' field from messages when using non-OpenAI models, but this filtering is applied broadly without considering if the target model actually supports or requires the name field. This could potentially break functionality for models that do support named messages.

if get_api_key_env_var_name(agent_model) != "OPENAI_API_KEY":
    prompt = [{k: v for k, v in message.items() if k != "name"} for message in prompt]
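A worked example of that filter; for a non-OpenAI model every message loses its name key, regardless of whether the target provider could have used it:

prompt = [
    {"role": "user", "name": "alice", "content": "Hi"},
    {"role": "assistant", "content": "Hello"},
]
filtered = [{k: v for k, v in m.items() if k != "name"} for m in prompt]
# filtered == [{"role": "user", "content": "Hi"},
#              {"role": "assistant", "content": "Hello"}]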


PR Code Suggestions ✨

Explore these optional code suggestions:

Category: General
Suggestion: Use epsilon for floating-point zero comparison

Using exact floating-point equality comparison (== 0) can be unreliable due to
floating-point precision issues. Consider using a small epsilon for the zero
comparison to handle potential floating-point rounding errors.

src/scripts/sync_model_docs.py [23-34]

 def classify_cost_tier(input_cost_per_token: float) -> str:
     """Classify model cost tier based on input token cost."""
-    if input_cost_per_token == 0:
+    if input_cost_per_token < 1e-10:  # Effectively zero with epsilon
         return "Free"
     elif input_cost_per_token < 0.0000005:  # $0.50 per 1M tokens
         return "Budget"
     elif input_cost_per_token < 0.000002:   # $2.00 per 1M tokens
         return "Standard"
     elif input_cost_per_token < 0.00001:    # $10.00 per 1M tokens
         return "Premium"
     else:
         return "Enterprise"
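For reference, sample inputs and the tiers the patched function would return (values chosen to sit inside each band):

classify_cost_tier(0.0)     # "Free" (below the 1e-10 epsilon)
classify_cost_tier(4e-7)    # "Budget" ($0.40 per 1M tokens)
classify_cost_tier(1.5e-6)  # "Standard"
classify_cost_tier(5e-6)    # "Premium"
classify_cost_tier(2e-5)    # "Enterprise"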
Suggestion importance [1-10]: 5 (Impact: Low)

Why: The suggestion correctly identifies a potential floating-point precision issue with == 0 and proposes a robust alternative, improving the reliability of the cost classification logic.

@ellipsis-dev ellipsis-dev bot (Contributor) left a comment


Important

Looks good to me! 👍

Reviewed everything up to 29c18f3 in 2 minutes and 35 seconds.
  • Reviewed 1839 lines of code in 25 files
  • Skipped 0 files when reviewing.
  • Skipped posting 12 draft comments. View those below.
1. .github/workflows/sync-model-docs.yml:50
  • Draft comment:
    Minor: The file lacks a trailing newline. Consider adding one for POSIX compliance.
  • Reason this comment was not posted:
    Confidence changes required: 30% <= threshold 50%. None
2. documentation/advanced/chat.mdx:55
  • Draft comment:
    Ensure removal of the 'continue_' field is intentional and propagated throughout the API docs.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
3. src/agents-api/agents_api/activities/task_steps/prompt_step.py:49
  • Draft comment:
    Stripping the 'name' key for non-OpenAI models seems deliberate. Confirm that removing 'name' does not break required identification for other providers.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%. This comment appears to be asking the author to verify/confirm something, which violates our rules. It's asking "confirm that removing name doesn't break..." which is exactly the kind of speculative "are you sure?" comment we want to avoid. The code change appears deliberate and targeted at non-OpenAI models specifically. Maybe this is actually pointing out a real compatibility issue with other LLM providers that could cause bugs? If there was a real compatibility issue, the comment should state it directly with evidence. Asking for verification without specific evidence of a problem is not helpful. Delete this comment as it's asking for unnecessary verification without providing evidence of an actual problem.
4. src/agents-api/agents_api/clients/litellm.py:59
  • Draft comment:
    The usage of contextlib.suppress requires an import. Ensure 'import contextlib' is added if not already present.
  • Reason this comment was not posted:
    Comment was on unchanged code.
5. src/agents-api/agents_api/common/utils/llm_providers.py:34
  • Draft comment:
    get_litellm_model_name calls get_config() but there is no import for get_config. Ensure it's imported or defined.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
6. src/agents-api/agents_api/queries/tasks/get_task.py:28
  • Draft comment:
    The SQL subquery filters tools with 'tl.updated_at >= t.updated_at'. Double-check that this condition correctly filters outdated tool entries.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. src/agents-api/agents_api/routers/sessions/auto_tools/render.py:73
  • Draft comment:
    Re-enabling validate_model() may affect legacy sessions. Confirm that model validation errors are handled gracefully.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the PR author to confirm that model validation errors are handled gracefully, which falls under asking the author to ensure behavior is intended or tested. This violates the rules.
8. src/agents-api/tests/test_task_queries.py:400
  • Draft comment:
    Using asyncio.sleep(0.1) to separate timestamps can be flaky. Consider mocking time to control updated_at values in tests.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
9. src/scripts/sync_model_docs.py:230
  • Draft comment:
    update_documentation() relies on simple string searches for '## Available Models'. This approach is fragile if the doc structure changes; consider more robust parsing.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%. While the comment raises a valid point about potential fragility, the actual implementation shows careful consideration of edge cases with multiple fallback mechanisms. The code is not as fragile as the comment suggests. More robust parsing (like using a markdown parser) would add complexity without clear benefits given the simple document structure being handled. The comment identifies a real architectural concern about maintainability. A change in document structure could break the parsing. The current implementation already handles various edge cases and has fallbacks. The added complexity of a full markdown parser may not be worth it for this specific use case. The comment should be deleted as it raises concerns about fragility that are already adequately addressed in the implementation through multiple fallback mechanisms.
10. src/scripts/sync_model_docs.py:300
  • Draft comment:
    Project root detection logic in main() may be fragile in non-standard setups. Verify it works in all expected environments.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the PR author to verify that the project root detection logic works in all expected environments. This falls under the category of asking the author to ensure the behavior is intended or to double-check things, which is against the rules.
11. documentation/advanced/chat.mdx:258
  • Draft comment:
    The code fence for the Python example includes an extra label ('Python'). Consider using just 'python' unless the extra label is intentional.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%. This is a very minor stylistic suggestion. The casing doesn't affect functionality. The document uses 'Python' consistently throughout, suggesting it's an intentional style choice. This kind of nitpicky comment about documentation formatting doesn't add value. Maybe there's a style guide that mandates lowercase language names in code fences that I'm not aware of? Even if there was such a style guide, this kind of minor formatting detail in documentation isn't worth a PR comment. The current casing is consistent and readable. Delete this comment as it's an unimportant stylistic nitpick about documentation formatting that doesn't affect functionality.
12. documentation/advanced/chat.mdx:275
  • Draft comment:
    The code fence for the JavaScript example includes an extra label ('Node.js'). Consider using just 'javascript' unless the extra label is intentional.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%. The Node.js label appears to be intentional as part of a pattern in the documentation. The Python examples also have additional labels. The labels help distinguish between different runtime environments. Removing it would actually make the documentation less clear. This is also a UI/documentation style choice rather than a code logic issue. The comment might have a point if there was inconsistent labeling across the codebase, but we can only see this file. The labeling appears consistent within this file and serves a clear purpose in distinguishing runtime environments. The comment should be deleted as it suggests removing an intentional and useful documentation label, and it's also about documentation style rather than code logic.

Workflow ID: wflow_zTDrbzD57l35ugxz


@Ahmad-mtos merged commit d7262c6 into main on Jul 19, 2025. 11 checks passed.