Workflow runs · openai/evals

Showing runs from all workflows

10 workflow runs

Updating readme to link to OpenAI hosted evals experience (#1572)

Run unit tests #1788: Commit cdb8ce9 pushed by kwhinnery-openai

December 18, 2024 22:09

3m 41s · main

Updating readme to link to OpenAI hosted evals experience

Run unit tests #1787: Pull request #1572 opened by dmitry-openai

December 18, 2024 21:57

3m 49s · dmitry/readme-update

20240930 steven exception handling usage tokens (#1560)

Run unit tests #1783: Commit a32c982 pushed by sjadler2004

September 30, 2024 21:30

3m 43s · main

20240930 steven exception handling usage tokens

Run unit tests #1782: Pull request #1560 opened by sjadler2004

September 30, 2024 21:15

3m 42s · 20240930-steven-exception-handling-usage-tokens

Fix the is_chat_model function to work with gpt-4o

Run unit tests #1775: Pull request #1550 opened by LoryPack

August 22, 2024 15:16

3m 36s · LoryPack:add-4o

Remove global OpenAI client initialization

Run unit tests #1764: Pull request #1539 opened by michaelAlvarino

July 21, 2024 17:04

3m 40s · michaelAlvarino:main

Remove global OpenAI client initialization

Run new evals #2276: Pull request #1539 opened by michaelAlvarino

July 21, 2024 17:04

2m 13s · michaelAlvarino:main

[eval] Add IMO problems with exact answers (#1528)

Run unit tests #1763: Commit 234bcde pushed by kliu128

July 13, 2024 19:52

3m 49s · main

Added Quran Eval & Simple Fact Model-Graded Definition

Run new evals #2271: Pull request #1511 synchronize by sakher

June 20, 2024 14:13

2m 22s · sakher:quran-eval

Added Quran Eval & Simple Fact Model-Graded Definition

Run unit tests #1754: Pull request #1511 synchronize by sakher

June 20, 2024 14:13

3m 43s · sakher:quran-eval
