SII - Generative Artificial Intelligence Research Lab (GAIR)

56 repositories

AgencyBench
Public
Python • MIT License • 1 • 13 • 1 • 0 • Updated Jan 15, 2026
SII-CLI
Public
0 • 30 • 0 • 0 • Updated Jan 13, 2026
ResearcherBench
Public
ResearcherBench: Evaluating Deep AI Research Systems on the Frontiers of Scientific Inquiry
Python • 5 • 41 • 2 • 1 • Updated Jan 5, 2026
LiveTalk
Public
Python • Other • 17 • 221 • 7 • 0 • Updated Jan 2, 2026
ASI-Arch
Public
AlphaGo Moment for Model Architecture Discovery.
Python • Apache License 2.0 • 216 • 1.1k • 9 • 0 • Updated Dec 3, 2025
MegaScience
Public
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning
science llama reasoning post-training llm qwen scientific-reasoning
Python • Apache License 2.0 • 6 • 110 • 0 • 0 • Updated Nov 23, 2025
InnovatorBench
Public
A benchmark for LLMs on complicated long-horizon tasks that last for days.
Jupyter Notebook • Apache License 2.0 • 0 • 12 • 0 • 0 • Updated Nov 12, 2025
SR-Scientist
Public
SR-Scientist: Scientific Equation Discovery With Agentic AI
Python • 0 • 29 • 0 • 0 • Updated Nov 7, 2025
Context-Engineering-2.0
Public
21 • 261 • 0 • 0 • Updated Nov 6, 2025
DeepResearcher
Public
Scaling Deep Research via Reinforcement Learning in Real-world Environments.
Python • Apache License 2.0 • 46 • 686 • 9 • 0 • Updated Oct 15, 2025
LIMI
Public
LIMI: Less is More for Agency
Python • 7 • 158 • 6 • 0 • Updated Oct 14, 2025
ReasonEval
Public
[AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy
Python • 4 • 76 • 2 • 0 • Updated Oct 9, 2025
DatasetResearch
Public
DatasetResearch: Benchmarking Agent Systems for Demand-Driven Dataset Discovery
Python • Apache License 2.0 • 0 • 20 • 0 • 0 • Updated Sep 24, 2025
WindowsAgentArena-V2
Public
Python • MIT License • 0 • 3 • 0 • 0 • Updated Sep 9, 2025
PC-Agent-E
Public
Efficient Agent Training for Computer Use
Python • MIT License • 8 • 135 • 0 • 0 • Updated Sep 5, 2025
LIMO
Public
[COLM 2025] LIMO: Less is More for Reasoning
Python • 52 • 1.1k • 6 • 0 • Updated Jul 30, 2025
ASI4AI
Public
JavaScript • 1 • 7 • 0 • 0 • Updated Jul 23, 2025
lm-open-science-evaluation
Public
Reproducible and flexible LLM evaluations for scientific reasoning.
science evaluation reasoning llm scientific-reasoning
Python • Apache License 2.0 • 0 • 26 • 0 • 0 • Updated Jul 23, 2025
OctoThinker
Public
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
rl llama reasoning post-training pre-training llm qwen verl mid-training
Jupyter Notebook • Apache License 2.0 • 14 • 182 • 5 • 0 • Updated Jul 23, 2025
ProX
Public
[ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
llama data-quality mistral pre-training continual neural-symbolic data-centric-ai llm continual-pre-training
Python • Apache License 2.0 • 18 • 264 • 2 • 0 • Updated Jul 8, 2025
anole
Public
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
Python • 50 • 818 • 35 • 1 • Updated Jun 16, 2025
thinking-with-generated-images
Public
Doodling our way to AGI ✏️ 🖼️ 🧠
Python • 4 • 120 • 1 • 0 • Updated May 29, 2025
LIMOPro
Public
Python • Apache License 2.0 • 0 • 13 • 1 • 0 • Updated May 27, 2025
DynToM
Public
Python • 0 • 10 • 0 • 0 • Updated May 26, 2025
ToRL
Public
Python • 17 • 327 • 23 • 0 • Updated May 24, 2025
PC-Agent
Public
PC Agent: While You Sleep, AI Works - A Cognitive Journey into Digital World
Python • MIT License • 29 • 308 • 2 • 1 • Updated May 21, 2025
cognition-engineering
Public
Generative AI Act II: Test Time Scaling Drives Cognition Engineering
Python • 9 • 209 • 1 • 0 • Updated Apr 22, 2025
MAYE
Public
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme
Python • 8 • 146 • 3 • 0 • Updated Apr 9, 2025
simple_tts
Public
Python • 0 • 1 • 0 • 0 • Updated Apr 5, 2025
MathPile
Public
[NeurIPS D&B 2024] Generative AI for Math: MathPile
math corpus language-model pre-training large-language-models
Python • Apache License 2.0 • 22 • 419 • 0 • 0 • Updated Apr 4, 2025