Support python expressions in prompt settings #1422

Status: Open. Wants to merge 1 commit into base: dev

Conversation

@creatorrr (Contributor) commented May 20, 2025

User description

Summary

  • evaluate prompt step settings using base_evaluate

Testing

  • poetry run poe check (fails: Failed to fetch https://pypi.org/simple/poethepoet/)

PR Type

Enhancement


Description

  • Evaluate Python expressions in prompt step settings using base_evaluate

  • Ensure settings are processed before LLM invocation


Changes walkthrough 📝

Relevant files
Enhancement
prompt_step.py (agents-api/agents_api/activities/task_steps/prompt_step.py, +3/-0)
Evaluate and process prompt step settings before LLM call

  • Added evaluation of Python expressions in passed_settings via base_evaluate
  • Ensured settings are processed asynchronously before LLM call
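
The ordering described above (evaluate the settings, then call the model) can be sketched roughly as follows. This is an illustration only: `toy_evaluate` is a hypothetical stand-in for `base_evaluate`, whose real signature and expression syntax are not shown in this PR, and the leading `"$ "` marker is an assumption made for the example.

```python
import asyncio
from typing import Any


# Hypothetical stand-in for base_evaluate; the real function lives in agents-api
# and is not shown in this PR.
async def toy_evaluate(value: Any, context: dict[str, Any]) -> Any:
    """Recursively evaluate strings that start with "$ " (an assumed marker)."""
    if isinstance(value, dict):
        return {k: await toy_evaluate(v, context) for k, v in value.items()}
    if isinstance(value, list):
        return [await toy_evaluate(v, context) for v in value]
    if isinstance(value, str) and value.startswith("$ "):
        # eval with emptied builtins is NOT a sandbox; it only keeps the toy short.
        return eval(value[2:], {"__builtins__": {}}, dict(context))
    return value  # plain values pass through unchanged


async def prepare_settings(
    passed_settings: dict[str, Any], context: dict[str, Any]
) -> dict[str, Any]:
    # Evaluate expressions first; the model call would use the returned dict.
    return await toy_evaluate(passed_settings, context)


settings = {"temperature": "$ base_temp / 2", "max_tokens": 256, "stop": ["$ stop_word"]}
context = {"base_temp": 1.0, "stop_word": "END"}
evaluated = asyncio.run(prepare_settings(settings, context))
```

Literal values such as `max_tokens` are returned as-is, so settings without expressions would behave as they did before the change.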

  • Important

    Evaluate Python expressions in prompt_step settings using base_evaluate in prompt_step.py.

    • Behavior:
      • In prompt_step() in prompt_step.py, evaluate Python expressions in passed_settings using base_evaluate before calling the language model.
    • Testing:
      • poetry run poe check fails due to a dependency fetch issue.

    This description was created by Ellipsis for ba318c3.

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
    🧪 No relevant tests
    🔒 Security concerns

    Python expression evaluation:
    The PR introduces evaluation of Python expressions from settings, which could potentially lead to code injection vulnerabilities if user-provided input flows into these settings without proper validation. The security implications depend on how base_evaluate is implemented and whether it has proper sandboxing. Review how settings are sourced and ensure that untrusted input cannot be evaluated as Python code.
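
A static pre-check is one way to act on this concern. The sketch below is an assumption, not the project's validator: `find_dunder_access` is a hypothetical helper that parses an expression with Python's `ast` module and reports double-underscore names or attributes, which are a common route out of restricted evaluators. It does not make evaluation safe on its own.

```python
import ast


def find_dunder_access(expression: str) -> list[str]:
    """Return dunder names/attributes used in a Python expression (hypothetical helper)."""
    tree = ast.parse(expression, mode="eval")  # raises SyntaxError on malformed input
    hits: list[str] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Attribute):
            name = node.attr
        elif isinstance(node, ast.Name):
            name = node.id
        else:
            continue
        if name.startswith("__") and name.endswith("__"):
            hits.append(name)
    return hits


suspicious = find_dunder_access("().__class__.__bases__")
harmless = find_dunder_access("temperature * 2")
```

A caller could refuse to evaluate any setting for which the returned list is non-empty.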

    ⚡ Recommended focus areas for review

    Error Handling

    The implementation evaluates Python expressions in settings but doesn't include error handling for malformed expressions. Consider how the system will behave if evaluation fails.

    passed_settings = await base_evaluate(passed_settings, context)
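
One possible shape for that error handling is sketched here. The names are hypothetical: `SettingsEvaluationError` and `evaluate_settings_safely` are not part of the PR, and the evaluator is passed in as a parameter because the behavior of `base_evaluate` on failure is not shown in the diff.

```python
import asyncio
from typing import Any, Awaitable, Callable


class SettingsEvaluationError(ValueError):
    """Hypothetical error type for failed settings evaluation; not part of the PR."""


Evaluator = Callable[[Any, dict[str, Any]], Awaitable[Any]]


async def evaluate_settings_safely(
    evaluate: Evaluator, passed_settings: Any, context: dict[str, Any]
) -> Any:
    try:
        return await evaluate(passed_settings, context)
    except Exception as exc:
        # Re-raise with a message that points at the settings, keeping the cause chained.
        raise SettingsEvaluationError(
            f"could not evaluate prompt settings: {exc}"
        ) from exc


async def failing_evaluator(value: Any, context: dict[str, Any]) -> Any:
    # Simulates an evaluator that rejects a malformed expression.
    raise ValueError("invalid expression")


async def demo() -> str:
    try:
        await evaluate_settings_safely(failing_evaluator, {"temperature": "1 +"}, {})
    except SettingsEvaluationError as err:
        return str(err)
    return "no error"


message = asyncio.run(demo())
```

Chaining with `from exc` keeps the original traceback available while giving the caller a single exception type to catch.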
    Function Dependency

    The code uses base_evaluate function but there's no visible import for this function in the diff. Verify that this function is properly imported.

    passed_settings = await base_evaluate(passed_settings, context)

    PR Code Suggestions ✨

    No code suggestions found for the PR.

    @ellipsis-dev ellipsis-dev bot left a comment

    Important

    Looks good to me! 👍

    Reviewed everything up to ba318c3 in 1 minute and 18 seconds.
    • Reviewed 14 lines of code in 1 file
    • Skipped 0 files when reviewing.
    • Skipped posting 3 draft comments, listed below.
    1. agents-api/agents_api/activities/task_steps/prompt_step.py:110
    • Draft comment:
      Ensure base_evaluate safely handles arbitrary code execution when evaluating settings.
    • Reason this comment was not posted:
      Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% While security around code evaluation is important, this comment violates our rules by starting with "Ensure that..." and asking the author to verify something. It's also speculative - we don't have evidence that base_evaluate is unsafe. Without seeing the implementation of base_evaluate, we can't make strong claims about its safety. The security concern could be valid and important. Arbitrary code execution is a serious security risk. However, the comment is phrased as a verification request rather than pointing out a specific issue. We don't have evidence that base_evaluate is actually unsafe. The comment should be deleted because it asks for verification rather than pointing out a specific issue, and we don't have strong evidence that there's actually a security problem.
    2. agents-api/agents_api/activities/task_steps/prompt_step.py:110
    • Draft comment:
      Consider adding error handling for base_evaluate failures during settings evaluation.
    • Reason this comment was not posted:
      Confidence changes required: 50% <= threshold 50% None
    3. agents-api/agents_api/activities/task_steps/prompt_step.py:110
    • Draft comment:
      Typographical suggestion: In the comment on line 110, consider capitalizing 'python' to 'Python' for consistency.
    • Reason this comment was not posted:
      Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative and suggests a typographical change that doesn't impact the functionality or logic of the code. It doesn't align with the rules for useful comments, which should focus on code logic, potential issues, or improvements.

    Workflow ID: wflow_I4zvZozKwEjAy0fa

    CI Feedback 🧐

    A test triggered by this PR failed. Here is an AI-generated analysis of the failure:

    Action: Test

    Failed stage: Run tests [❌]

    Failed test name: test_mmr:24 utility: test to apply_mmr_to_docs

    Failure summary:

    The action failed because the test test_mmr:24 utility: test to apply_mmr_to_docs failed. The test
    exercises the apply_mmr_to_docs function. The log reports the failure at tests/test_mmr.py:53, a few
    lines above the limit=10 check at lines 57-58 that expects 5 documents. The reported difference is a
    UUID mismatch rather than a count mismatch. The specific difference was between:

  • Expected: UUID('550e8400-e29b-41d4-a716-446655441122')
  • Actual: UUID('550e8400-e29b-41d4-a716-446655440000')
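
For context on why a different document ID can appear in the result, here is a minimal sketch of maximal marginal relevance (MMR) selection. It is a generic textbook formulation, not the project's `apply_mmr_to_docs`: the function name `mmr_select`, the `lambda_mult` parameter, and the sample vectors are all assumptions made for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query: list[float],
    candidates: dict[str, list[float]],
    limit: int,
    lambda_mult: float = 0.5,
) -> list[str]:
    """Greedy MMR: trade off relevance to the query against similarity to picks so far."""
    remaining = dict(candidates)
    selected: list[str] = []
    while remaining and len(selected) < limit:

        def score(doc_id: str) -> float:
            relevance = cosine(query, remaining[doc_id])
            redundancy = max(
                (cosine(remaining[doc_id], candidates[s]) for s in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


docs = {"a": [1.0, 0.9], "b": [1.0, 0.8], "c": [0.2, 1.0]}
query = [1.0, 1.0]
top_two = mmr_select(query, docs, limit=2)
everything = mmr_select(query, docs, limit=10)
```

In this toy data, `b` is nearly a duplicate of `a`, so the diversity term pushes `c` ahead of it. Small changes in embeddings or in the trade-off weight can therefore change which IDs are returned and in what order.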

  • Relevant error logs:
    1:  ##[group]Operating System
    2:  Ubuntu
    ...
    
    956:  prune-cache: true
    957:  ignore-nothing-to-cache: false
    958:  ##[endgroup]
    959:  Downloading uv from "https://github.com/astral-sh/uv/releases/download/0.7.6/uv-x86_64-unknown-linux-gnu.tar.gz" ...
    960:  [command]/usr/bin/tar xz --warning=no-unknown-keyword --overwrite -C /home/runner/work/_temp/836748bf-e3b9-4929-9afa-41eb0650e354 -f /home/runner/work/_temp/1fd5a7a8-5037-451e-8079-df2ba5e28083
    961:  Added /opt/hostedtoolcache/uv/0.7.6/x86_64 to the path
    962:  Added /home/runner/.local/bin to the path
    963:  Set UV_CACHE_DIR to /home/runner/work/_temp/setup-uv-cache
    964:  Successfully installed uv version 0.7.6
    965:  Searching files using cache dependency glob: **/uv.lock
    966:  /home/runner/work/julep/julep/agents-api/uv.lock
    967:  /home/runner/work/julep/julep/cli/uv.lock
    968:  /home/runner/work/julep/julep/integrations-service/uv.lock
    969:  Found 3 files to hash.
    970:  Trying to restore uv cache from GitHub Actions cache with key: setup-uv-1-x86_64-unknown-linux-gnu-0.7.6-d92603d25acef1c08e643c37cc2475e5e190deb9690356b084828d60043a591f
    971:  ##[warning]Failed to restore: Cache service responded with 422
    972:  No GitHub Actions cache found for key: setup-uv-1-x86_64-unknown-linux-gnu-0.7.6-d92603d25acef1c08e643c37cc2475e5e190deb9690356b084828d60043a591f
    ...
    
    1479:  sql                                               
    1480:  PASS  test_agent_queries:126 query: update agent sql                         4%
    1481:  PASS  test_agent_queries:153 query: update agent with project sql            4%
    1482:  PASS  test_agent_queries:177 query: update agent, project does not exist     4%
    1483:  PASS  test_agent_queries:201 query: patch agent sql                          5%
    1484:  PASS  test_agent_queries:225 query: patch agent with project sql             5%
    1485:  PASS  test_agent_queries:260 query: patch agent, project does not exist      5%
    1486:  PASS  test_agent_queries:283 query: get agent not exists sql                 5%
    1487:  PASS  test_agent_queries:294 query: get agent exists sql                     6%
    1488:  PASS  test_agent_queries:315 query: list agents sql                          6%
    1489:  PASS  test_agent_queries:326 query: list agents with project filter sql      6%
    1490:  PASS  test_agent_queries:352 query: list agents sql, invalid sort            6%
    1491:  direction                                         
    1492:  PASS  test_agent_queries:368 query: delete agent sql                         6%
    1493:  INFO:httpx:HTTP Request: POST http://testserver/agents "HTTP/1.1 403 Forbidden"
    1494:  PASS  test_agent_routes:9 route: unauthorized should fail                    7%
    1495:  INFO:httpx:HTTP Request: POST http://testserver/agents "HTTP/1.1 201 Created"
    ...
    
    1587:  PASS  test_docs_queries:445 query: delete user doc                          20%
    1588:  PASS  test_docs_queries:482 query: delete agent doc                         20%
    1589:  PASS  test_docs_queries:519 query: search docs by text                      20%
    1590:  PASS  test_docs_queries:556 query: search docs by text with technical       21%
    1591:  terms and phrases                                  
    1592:  PASS  test_docs_queries:619 query: search docs by embedding                 21%
    1593:  PASS  test_docs_queries:647 query: search docs by hybrid                    21%
    1594:  INFO:httpx:HTTP Request: POST http://testserver/users/0682c7f1-6248-7771-8000-3b91f1ed0ad9/docs "HTTP/1.1 201 Created"
    1595:  PASS  test_docs_routes:15 route: create user doc                            21%
    1596:  INFO:httpx:HTTP Request: POST http://testserver/agents/0682c7f1-6a52-7048-8000-0dfea73c29e6/docs "HTTP/1.1 201 Created"
    1597:  PASS  test_docs_routes:32 route: create agent doc                           22%
    1598:  INFO:httpx:HTTP Request: POST http://testserver/agents/0682c7f1-723e-7311-8000-49fabb511359/docs "HTTP/1.1 201 Created"
    1599:  INFO:httpx:HTTP Request: POST http://testserver/agents/0682c7f1-723e-7311-8000-49fabb511359/docs "HTTP/1.1 409 Conflict"
    1600:  INFO:httpx:HTTP Request: POST http://testserver/users/0682c7f1-7539-7850-8000-8c175c6c256d/docs "HTTP/1.1 201 Created"
    1601:  PASS  test_docs_routes:49 route: create agent doc with duplicate title      22%
    1602:  should fail                                          
    1603:  INFO:httpx:HTTP Request: POST http://testserver/agents/0682c7f1-7d50-7caa-8000-7cab36d66d26/docs "HTTP/1.1 201 Created"
    ...
    
    1682:  PASS  test_execution_queries:136 query: list executions, invalid offset     29%
    1683:  PASS  test_execution_queries:157 query: list executions, invalid sort by    29%
    1684:  PASS  test_execution_queries:178 query: list executions, invalid sort       29%
    1685:  direction                                     
    1686:  PASS  test_execution_queries:199 query: count executions                    30%
    1687:  PASS  test_execution_queries:217 query: create execution transition         30%
    1688:  PASS  test_execution_queries:238 query: create execution transition -       30%
    1689:  validate transition targets                   
    1690:  PASS  test_execution_queries:283 query: create execution transition with    30%
    1691:  execution update                              
    1692:  PASS  test_execution_queries:310 query: get execution with transitions      31%
    1693:  count                                         
    1694:  PASS  test_execution_queries:325 query: list executions with                31%
    1695:  latest_executions view                        
    1696:  PASS  test_execution_queries:348 query: execution with finish transition    31%
    1697:  PASS  test_execution_queries:382 query: execution with error transition     31%
    1698:  SKIP  test_execution_workflow… workflow: evaluate step   needs to be fixed  31%
    1699:  single                                          
    1700:  SKIP  test_execution_workflow… workflow: evaluate step   needs to be fixed  32%
    1701:  multiple                                        
    1702:  SKIP  test_execution_workflo… workflow: variable access  needs to be fixed  32%
    1703:  in expressions                                   
    1704:  SKIP  test_execution_workflo… workflow: yield step       needs to be fixed  32%
    1705:  SKIP  test_execution_workflo… workflow: sleep step       needs to be fixed  32%
    1706:  SKIP  test_execution_workflo… workflow: return step      needs to be fixed  33%
    1707:  direct                                           
    1708:  SKIP  test_execution_workflo… workflow: return step      needs to be fixed  33%
    1709:  nested                                           
    1710:  SKIP  test_execution_workflo… workflow: log step         needs to be fixed  33%
    1711:  SKIP  test_execution_workflo… workflow: log step         needs to be fixed  33%
    1712:  expression fail                                  
    1713:  SKIP  test_execution_workf… workflow: system call   workflow: thread race   34%
    ...
    
    1848:  INFO:httpx:HTTP Request: PUT http://testserver/test-paid-methods/put "HTTP/1.1 200 OK"
    1849:  PASS  test_middleware:176 middleware: paid tag bypasses cost limit check    50%
    1850:  INFO:httpx:HTTP Request: DELETE http://testserver/test-paid-methods/delete "HTTP/1.1 200 OK"
    1851:  INFO:httpx:HTTP Request: GET http://testserver/test-get-with-cost-limit "HTTP/1.1 200 OK"
    1852:  PASS  test_middleware:238 middleware: GET request with cost limit exceeded  50%
    1853:  passes through                                       
    1854:  INFO:httpx:HTTP Request: POST http://testserver/test-none-cost "HTTP/1.1 403 Forbidden"
    1855:  PASS  test_middleware:270 middleware: cost is None treats as exceeded       50%
    1856:  limit                                                
    1857:  INFO:httpx:HTTP Request: POST http://testserver/test-null-tags "HTTP/1.1 403 Forbidden"
    1858:  PASS  test_middleware:305 middleware: null tags field handled properly      50%
    1859:  INFO:httpx:HTTP Request: GET http://testserver/test-no-developer-id "HTTP/1.1 200 OK"
    1860:  PASS  test_middleware:340 middleware: no developer_id header passes         51%
    1861:  through                                              
    1862:  INFO:httpx:HTTP Request: GET http://testserver/test-user-not-found "HTTP/1.1 403 Forbidden"
    1863:  INFO:httpx:HTTP Request: GET http://testserver/test-404-error "HTTP/1.1 403 Forbidden"
    1864:  PASS  test_middleware:357 middleware: forbidden, if user is not found       51%
    1865:  INFO:httpx:HTTP Request: GET http://testserver/test-500-error "HTTP/1.1 500 Internal Server Error"
    1866:  PASS  test_middleware:397 middleware: hand over all the http errors except  51%
    1867:  of 404                                               
    ...
    
    1870:  INFO:httpx:HTTP Request: GET http://testserver/test-valid-user "HTTP/1.1 200 OK"
    1871:  PASS  test_middleware:442 middleware: valid user passes through             51%
    1872:  INFO:httpx:HTTP Request: POST http://testserver/sessions "HTTP/1.1 403 Forbidden"
    1873:  PASS  test_middleware:472 middleware: can't create session when cost limit  52%
    1874:  is reached                                           
    1875:  INFO:httpx:HTTP Request: POST http://testserver/sessions "HTTP/1.1 201 Created"
    1876:  INFO:httpx:HTTP Request: DELETE http://testserver/sessions/0682c7f4-20b3-7087-8000-70640e952408 "HTTP/1.1 403 Forbidden"
    1877:  INFO:httpx:HTTP Request: GET http://testserver/sessions/0682c7f4-20b3-7087-8000-70640e952408 "HTTP/1.1 200 OK"
    1878:  PASS  test_middleware:515 middleware: can't delete session when cost limit  52%
    1879:  is reached                                           
    1880:  FAIL  test_mmr:24 utility: test to apply_mmr_to_docs                        52%
    1881:  PASS  test_mmr:61 utility: test mmr with different mmr_strength values      52%
    1882:  PASS  test_mmr:101 utility: test mmr with empty docs list                   53%
    1883:  PASS  test_model_validation:10 validate_model: succeeds when model is       53%
    1884:  available in model list                         
    1885:  PASS  test_model_validation:19 validate_model: fails when model is          53%
    1886:  unavailable in model list                       
    1887:  PASS  test_model_validation:31 validate_model: fails when model is None     53%
    1888:  PASS  test_nlp_utilities:6 utility: clean_keyword                           54%
    ...
    
    1904:  PASS  test_prepare_for_step:202 utility: get_workflow_name - raises         56%
    1905:  PASS  test_prepare_for_step:240 utility: get_inputs - 2 parallel            57%
    1906:  subworkflows                                   
    1907:  PASS  test_query_utils:5 utility: sanitize_string - strings                 57%
    1908:  PASS  test_query_utils:15 utility: sanitize_string - nested data            57%
    1909:  structures                                           
    1910:  PASS  test_query_utils:41 utility: sanitize_string - non-string types       57%
    1911:  PASS  test_secrets_queries:17 query: create secret                          57%
    1912:  PASS  test_secrets_queries:44 query: list secrets                           58%
    1913:  PASS  test_secrets_queries:90 query: list secrets (decrypt=False)           58%
    1914:  PASS  test_secrets_queries:135 query: get secret by name                    58%
    1915:  PASS  test_secrets_queries:163 query: get secret by name (decrypt=False)    58%
    1916:  PASS  test_secrets_queries:191 query: update secret                         59%
    1917:  PASS  test_secrets_queries:246 query: delete secret                         59%
    1918:  INFO:httpx:HTTP Request: GET http://testserver/secrets "HTTP/1.1 403 Forbidden"
    1919:  PASS  test_secrets_routes:10 route: unauthorized secrets route should fail  59%
    1920:  INFO:httpx:HTTP Request: POST http://testserver/secrets "HTTP/1.1 201 Created"
    1921:  PASS  test_secrets_routes:27 route: create secret                           59%
    1922:  INFO:httpx:HTTP Request: POST http://testserver/secrets "HTTP/1.1 201 Created"
    1923:  INFO:httpx:HTTP Request: GET http://testserver/secrets "HTTP/1.1 200 OK"
    1924:  PASS  test_secrets_routes:52 route: list secrets                            60%
    1925:  INFO:httpx:HTTP Request: POST http://testserver/secrets "HTTP/1.1 201 Created"
    1926:  INFO:httpx:HTTP Request: PUT http://testserver/secrets/0682c7f4-6d5c-7d42-8000-617fe39cc2f7 "HTTP/1.1 200 OK"
    1927:  PASS  test_secrets_routes:86 route: update secret                           60%
    1928:  INFO:httpx:HTTP Request: POST http://testserver/secrets "HTTP/1.1 201 Created"
    1929:  INFO:httpx:HTTP Request: DELETE http://testserver/secrets/0682c7f4-6d7f-797b-8000-bf0032623c69 "HTTP/1.1 202 Accepted"
    1930:  INFO:httpx:HTTP Request: GET http://testserver/secrets "HTTP/1.1 200 OK"
    1931:  PASS  test_secrets_routes:131 route: delete secret                          60%
    1932:  INFO:httpx:HTTP Request: POST http://testserver/secrets "HTTP/1.1 201 Created"
    1933:  INFO:httpx:HTTP Request: POST http://testserver/secrets "HTTP/1.1 409 Conflict"
    1934:  PASS  test_secrets_routes:172 route: create duplicate secret name fails     60%
    1935:  SKIP  test_secrets_usage:27 render:                    Skipping secrets     60%
    ...
    
    1942:  SKIP  test_secrets_usage:2… tasks:                     Skipping secrets     61%
    1943:  list_secrets_query in         usage tests          
    1944:  StepContext.tools                                  
    1945:  method                                             
    1946:  PASS  test_session_queries:37 query: create session sql                     61%
    1947:  PASS  test_session_queries:60 query: create or update session sql           61%
    1948:  PASS  test_session_queries:84 query: get session exists                     62%
    1949:  PASS  test_session_queries:100 query: get session does not exist            62%
    1950:  PASS  test_session_queries:114 query: list sessions                         62%
    1951:  PASS  test_session_queries:131 query: list sessions with filters            62%
    1952:  PASS  test_session_queries:150 query: count sessions                        63%
    1953:  PASS  test_session_queries:164 query: update session sql                    63%
    1954:  PASS  test_session_queries:199 query: patch session sql                     63%
    1955:  PASS  test_session_queries:226 query: delete session sql                    63%
    1956:  INFO:httpx:HTTP Request: GET http://testserver/sessions "HTTP/1.1 403 Forbidden"
    1957:  PASS  test_session_routes:7 route: unauthorized should fail                 63%
    1958:  INFO:httpx:HTTP Request: POST http://testserver/sessions "HTTP/1.1 201 Created"
    ...
    
    2052:  PASS  test_task_queries:70 query: get task sql - exists                     73%
    2053:  PASS  test_task_queries:89 query: get task sql - not exists                 73%
    2054:  PASS  test_task_queries:107 query: delete task sql - exists                 73%
    2055:  PASS  test_task_queries:143 query: delete task sql - not exists             74%
    2056:  PASS  test_task_queries:162 query: list tasks sql - with filters            74%
    2057:  PASS  test_task_queries:183 query: list tasks sql - no filters              74%
    2058:  PASS  test_task_queries:201 query: list tasks sql, invalid limit            74%
    2059:  PASS  test_task_queries:227 query: list tasks sql, invalid offset           74%
    2060:  PASS  test_task_queries:243 query: list tasks sql, invalid sort by          75%
    2061:  PASS  test_task_queries:259 query: list tasks sql, invalid sort direction   75%
    2062:  PASS  test_task_queries:275 query: update task sql - exists                 75%
    2063:  PASS  test_task_queries:311 query: update task sql - not exists             75%
    2064:  PASS  test_task_queries:338 query: patch task sql - exists                  76%
    2065:  PASS  test_task_queries:386 query: patch task sql - not exists              76%
    2066:  INFO:httpx:HTTP Request: POST http://testserver/agents/0682c7f6-0228-79bb-8000-9761e1a39e04/tasks "HTTP/1.1 403 Forbidden"
    2067:  PASS  test_task_routes:27 route: unauthorized should fail                   76%
    2068:  INFO:httpx:HTTP Request: POST http://testserver/agents/0682c7f6-081d-74e6-8000-66a607d77565/tasks "HTTP/1.1 201 Created"
    ...
    
    2078:  INFO:httpx:HTTP Request: GET http://testserver/tasks/0682c7f6-1cc3-7770-8000-46c55e55cdbd "HTTP/1.1 200 OK"
    2079:  PASS  test_task_routes:124 route: get task exists                           77%
    2080:  INFO:httpx:HTTP Request: GET http://testserver/executions/0682c7f6-28b8-72a9-8000-f394875c7d49/transitions "HTTP/1.1 200 OK"
    2081:  PASS  test_task_routes:134 route: list all execution transition             78%
    2082:  INFO:httpx:HTTP Request: GET http://testserver/executions/0682c7f6-352b-7ce2-8000-5821f01e7523/transitions/0682c7f6-3859-7694-8000-974f21b20921 "HTTP/1.1 200 OK"
    2083:  PASS  test_task_routes:149 route: list a single execution transition        78%
    2084:  INFO:httpx:HTTP Request: GET http://testserver/tasks/0682c7f2-340a-7b76-8000-2ee1c21d1add/executions "HTTP/1.1 200 OK"
    2085:  PASS  test_task_routes:190 route: list task executions                      78%
    2086:  INFO:httpx:HTTP Request: GET http://testserver/agents/0682c7f6-3e8c-726f-8000-4794707406f0/tasks "HTTP/1.1 200 OK"
    2087:  INFO:httpx:HTTP Request: POST http://testserver/agents/0682c7f6-3e8c-726f-8000-4794707406f0/tasks "HTTP/1.1 201 Created"
    2088:  INFO:httpx:HTTP Request: GET http://testserver/agents/0682c7f6-3e8c-726f-8000-4794707406f0/tasks "HTTP/1.1 200 OK"
    2089:  PASS  test_task_routes:205 route: list tasks                                78%
    2090:  SKIP  test_task_routes:248 route: update execution   Temporal connection    79%
    2091:  issue              
    2092:  PASS  test_task_validation:7 task_validation: Python expression validator   79%
    2093:  detects syntax errors                             
    2094:  PASS  test_task_validation:16 task_validation: Python expression validator  79%
    2095:  detects undefined names                          
    2096:  PASS  test_task_validation:25 task_validation: Python expression validator  79%
    2097:  allows steps variable access                     
    2098:  PASS  test_task_validation:33 task_validation: Python expression validator  80%
    2099:  detects unsafe operations                        
    2100:  PASS  test_task_validation:42 task_validation: Python expression validator  80%
    2101:  detects unsafe dunder attributes                 
    2102:  PASS  test_task_validation:63 task_validation: Python expression validator  80%
    2103:  detects potential runtime errors                 
    2104:  PASS  test_task_validation:72 task_validation: Python expression            80%
    ...
    
    2183:  PASS  test_user_queries:156 query: update user with project sql             91%
    2184:  PASS  test_user_queries:193 query: update user, project does not exist      91%
    2185:  PASS  test_user_queries:214 query: get user not exists sql                  91%
    2186:  PASS  test_user_queries:230 query: get user exists sql                      91%
    2187:  PASS  test_user_queries:245 query: list users sql                           92%
    2188:  PASS  test_user_queries:260 query: list users with project filter sql       92%
    2189:  PASS  test_user_queries:287 query: list users sql, invalid limit            92%
    2190:  PASS  test_user_queries:313 query: list users sql, invalid offset           92%
    2191:  PASS  test_user_queries:328 query: list users sql, invalid sort by          93%
    2192:  PASS  test_user_queries:344 query: list users sql, invalid sort direction   93%
    2193:  PASS  test_user_queries:361 query: patch user sql                           93%
    2194:  PASS  test_user_queries:381 query: patch user with project sql              93%
    2195:  PASS  test_user_queries:419 query: patch user, project does not exist       94%
    2196:  PASS  test_user_queries:441 query: delete user sql                          94%
    2197:  INFO:httpx:HTTP Request: POST http://testserver/users "HTTP/1.1 403 Forbidden"
    2198:  PASS  test_user_routes:9 route: unauthorized should fail                    94%
    2199:  INFO:httpx:HTTP Request: POST http://testserver/users "HTTP/1.1 201 Created"
    ...
    
    2215:  INFO:httpx:HTTP Request: GET http://testserver/users/0682c7f7-4f49-79d7-8000-4e373ce3b1ee "HTTP/1.1 200 OK"
    2216:  PASS  test_user_routes:142 route: update user with project                  96%
    2217:  INFO:httpx:HTTP Request: PATCH http://testserver/users/0682c7f7-5549-7bcc-8000-0bd2291e5f10 "HTTP/1.1 200 OK"
    2218:  INFO:httpx:HTTP Request: GET http://testserver/users/0682c7f7-5549-7bcc-8000-0bd2291e5f10 "HTTP/1.1 200 OK"
    2219:  PASS  test_user_routes:174 query: patch user                                96%
    2220:  INFO:httpx:HTTP Request: PATCH http://testserver/users/0682c7f7-58b2-7f7d-8000-f6271c878153 "HTTP/1.1 200 OK"
    2221:  INFO:httpx:HTTP Request: GET http://testserver/users/0682c7f7-58b2-7f7d-8000-f6271c878153 "HTTP/1.1 200 OK"
    2222:  PASS  test_user_routes:205 query: patch user with project                   96%
    2223:  INFO:httpx:HTTP Request: GET http://testserver/users "HTTP/1.1 200 OK"
    2224:  PASS  test_user_routes:238 query: list users                                96%
    2225:  INFO:httpx:HTTP Request: POST http://testserver/users "HTTP/1.1 201 Created"
    2226:  INFO:httpx:HTTP Request: GET http://testserver/users?project=serious_riemann_oy7 "HTTP/1.1 200 OK"
    2227:  PASS  test_user_routes:253 query: list users with project filter            97%
    2228:  INFO:httpx:HTTP Request: GET http://testserver/users?metadata_filter=%7B%27test%27%3A+%27test%27%7D "HTTP/1.1 200 OK"
    2229:  PASS  test_user_routes:286 query: list users with right metadata filter     97%
    2230:  PASS  test_validation_errors:9 format_location: formats error location      97%
    2231:  paths correctly                                 
    2232:  PASS  test_validation_errors:31 get_error_suggestions: generates helpful    97%
    2233:  suggestions for missing fields                 
    2234:  PASS  test_validation_errors:42 get_error_suggestions: generates helpful    97%
    2235:  suggestions for type errors                    
    2236:  PASS  test_validation_errors:64 get_error_suggestions: generates helpful    98%
    2237:  suggestions for string length errors           
    2238:  PASS  test_validation_errors:85 get_error_suggestions: generates helpful    98%
    2239:  suggestions for number range errors            
    2240:  WARNING:agents_api.web:Validation error: [{'type': 'dict_type', 'msg': 'Input should be a valid dictionary', 'loc': 'metadata', 'received': 'not-an-object'}, {'type': 'missing', 'msg': 'Field required', 'loc': 'name', 'fix': 'Add this required field to your request', 'example': '{ "field_name": "value" }', 'received': "{'about': 'Test agent description', 'model': 'invalid-model-id', 'metadata': 'not-an-object'}"}]
    2241:  INFO:httpx:HTTP Request: POST http://testserver/agents "HTTP/1.1 422 Unprocessable Entity"
    2242:  PASS  test_validation_errors:107 validation_error_handler: returns          98%
    2243:  formatted error response for validation       
    2244:  errors                                        
    2245:  PASS  test_validation_errors:148 validation_error_suggestions: function     98%
    2246:  generates helpful suggestions for all         
    2247:  error types                                   
    2248:  PASS  test_workflow_helpers:25 execute_map_reduce_step_parallel:            99%
    ...
    
    2255:  INFO:httpx:HTTP Request: POST http://testserver/tasks/0682c7f7-6882-71e3-8000-6d74053783da/executions "HTTP/1.1 201 Created"
    2256:  PASS  test_workflow_routes:10 workflow route: evaluate step single          99%
    2257:  INFO:httpx:HTTP Request: POST http://testserver/agents/0682c7f7-70aa-7b48-8000-e6b9d97262fe/tasks "HTTP/1.1 201 Created"
    2258:  INFO:httpx:HTTP Request: POST http://testserver/tasks/0682c7f7-72b2-72c5-8000-f44a3f4116cb/executions "HTTP/1.1 201 Created"
    2259:  PASS  test_workflow_routes:41 workflow route: evaluate step single with    100%
    2260:  yaml                                             
    2261:  INFO:httpx:HTTP Request: POST http://testserver/agents/0682c7f7-78e2-71f2-8000-9798e1ab7dc7/tasks "HTTP/1.1 201 Created"
    2262:  INFO:httpx:HTTP Request: POST http://testserver/tasks/0682c7f7-7ae7-74c7-8000-5b91526f4a50/executions "HTTP/1.1 201 Created"
    2263:  PASS  test_workflow_routes:83 workflow route: evaluate step single with    100%
    2264:  yaml - nested                                    
    2265:  INFO:httpx:HTTP Request: POST http://testserver/agents/0682c7f7-813e-7363-8000-d8693179f6d9/tasks/0682c7f7-815d-79bf-8000-885818cd9752 "HTTP/1.1 201 Created"
    2266:  INFO:httpx:HTTP Request: POST http://testserver/tasks/0682c7f7-815d-79bf-8000-885818cd9752/executions "HTTP/1.1 201 Created"
    2267:  PASS  test_workflow_routes:128 workflow route: create or update: evaluate  100%
    2268:  step single with yaml                           
    2269:  ────────────────────── utility: test to apply_mmr_to_docs ──────────────────────
    2270:  Failed at tests/test_mmr.py:53                                                
    2271:  24 @test("utility: test to apply_mmr_to_docs")                            
    ...
    
    2302:  55                                                                        
    2303:  56     # Test with limit greater than available docs                      
    2304:  57     result = apply_mmr_to_docs(docs, query_embedding, limit=10, mmr_str
    2305:  58     assert len(result) == 5  # Only 5 docs have embeddings             
    2306:  59                                                                        
    2307:  ╭─ Difference (LHS vs RHS) ──────────────────────────────────────────────────╮
    2308:  │                                                                            │
    2309:  │ UUID('550e8400-e29b-41d4-a716-446655441122')                               │
    2310:  │ UUID('550e8400-e29b-41d4-a716-446655440000')                               │
    2311:  │                                                                            │
    2312:  ╰────────────────────────────────────────────────────────────────────────────╯
    2313:  ────────────────────────────────────────────────────────────────────────────────
    2314:  ╭───────────── Results ─────────────╮
    2315:  │  435  Tests Encountered           │
    2316:  │  402  Passes             (92.4%)  │
    2317:  │    1  Failures           (0.2%)   │
    2318:  │   32  Skips              (7.4%)   │
    2319:  ╰───────────────────────────────────╯
    2320:  ─────────────────────────── FAILED in 220.70 seconds ───────────────────────────
    2321:  ##[error]Process completed with exit code 1.
    2322:  Post job cleanup.
    
