Actions: openai/evals

Showing runs from all workflows
19 workflow runs

Fix typos
Run unit tests #1800: Pull request #1585 opened by GameRoMan
May 30, 2025 12:50 Action required GameRoMan:fix-typos
Fix typos
Run new evals #2290: Pull request #1585 opened by GameRoMan
May 30, 2025 12:50 Action required GameRoMan:fix-typos
fix(bug): Code injection evalcommand
Run unit tests #1799: Pull request #1584 opened by odaysec
May 17, 2025 02:44 Action required odaysec:patch-1
Updating readme to link to OpenAI hosted evals experience (#1572)
Run unit tests #1788: Commit cdb8ce9 pushed by kwhinnery-openai
December 18, 2024 22:09 3m 41s main
Updating readme to link to OpenAI hosted evals experience
Run unit tests #1787: Pull request #1572 opened by dmitry-openai
December 18, 2024 21:57 3m 49s dmitry/readme-update
20240930 steven exception handling usage tokens (#1560)
Run unit tests #1783: Commit a32c982 pushed by sjadler2004
September 30, 2024 21:30 3m 43s main
Fix the is_chat_model function to work with gpt-4o
Run unit tests #1775: Pull request #1550 opened by LoryPack
August 22, 2024 15:16 3m 36s LoryPack:add-4o
Remove global OpenAI client initialization
Run unit tests #1764: Pull request #1539 opened by michaelAlvarino
July 21, 2024 17:04 3m 40s michaelAlvarino:main
Remove global OpenAI client initialization
Run new evals #2276: Pull request #1539 opened by michaelAlvarino
July 21, 2024 17:04 2m 13s michaelAlvarino:main
[eval] Add IMO problems with exact answers (#1528)
Run unit tests #1763: Commit 234bcde pushed by kliu128
July 13, 2024 19:52 3m 49s main
Added Quran Eval & Simple Fact Model-Graded Definition
Run new evals #2271: Pull request #1511 synchronize by sakher
June 20, 2024 14:13 2m 22s sakher:quran-eval
Added Quran Eval & Simple Fact Model-Graded Definition
Run unit tests #1754: Pull request #1511 synchronize by sakher
June 20, 2024 14:13 3m 43s sakher:quran-eval
Fix problematic sample in Schelling Point
Run unit tests #1752: Pull request #1534 opened by JunShern
May 22, 2024 23:04 8m 5s jun/schellingpoint-fix
Fix problematic sample in Schelling Point
Run new evals #2270: Pull request #1534 opened by JunShern
May 22, 2024 23:04 4m 38s jun/schellingpoint-fix
eval pattern-concat-logic
Run unit tests #1735: Pull request #1508 synchronize by natanaelwf
May 9, 2024 13:18 3m 55s natanaelwf:pattern-concat-logic
eval pattern-concat-logic
Run new evals #2258: Pull request #1508 synchronize by natanaelwf
May 9, 2024 13:18 2m 25s natanaelwf:pattern-concat-logic