Pinned
-
mmlu (Public, forked from hendrycks/test)
Measuring Massive Multitask Language Understanding | ICLR 2021
-
SWE-agent (Public, forked from princeton-nlp/SWE-agent)
SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It solves 12.29% of bugs in the SWE-bench evaluation set and takes just 1.5 minutes to run.
Python
-
ServiceNow/TapeAgents (Public)
TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle.