Taishi-N324

Highlights

  • Pro

Organizations

@rioyokotalab @LAION-AI @TheDuckAI @llm-jp @SakanaAI @tokyotech-llm @swallow-llm

🪐 The Sebulba architecture to scale reinforcement learning on Cloud TPUs in JAX

Python 57 5 Updated Oct 23, 2023
Python 96 9 Updated Mar 4, 2025

Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"

Python 418 30 Updated Mar 1, 2025

A project to improve skills of large language models

Python 255 50 Updated Mar 8, 2025
Python 18 1 Updated Feb 2, 2025

Letting Claude Code develop his own MCP tools :)

TypeScript 52 9 Updated Mar 8, 2025

Humanity's Last Exam

Python 541 26 Updated Feb 26, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 608 40 Updated Mar 9, 2025

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

Python 1,262 186 Updated Mar 4, 2025

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 875 53 Updated Mar 4, 2025

Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.

TypeScript 33,692 3,339 Updated Mar 9, 2025

Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers

Python 11 1 Updated Mar 1, 2025

Solve puzzles. Learn CUDA.

Jupyter Notebook 10,649 822 Updated Sep 1, 2024

Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

Jupyter Notebook 313 21 Updated Dec 15, 2024
Python 1,293 89 Updated Jan 23, 2025

Computer gaming agents that run on your PC and laptops.

Python 412 49 Updated Mar 8, 2025

Official PyTorch implementation for "Large Language Diffusion Models"

Python 1,088 67 Updated Mar 8, 2025

Pokemon terminal themes.

Python 4,566 243 Updated Jul 18, 2024

Tailor-made Pokémon themes for your Hyper terminal

JavaScript 1,233 70 Updated Aug 30, 2024

Tabular In-Context Learning

Jupyter Notebook 53 11 Updated Mar 6, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 1,239 68 Updated Mar 7, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 4,854 474 Updated Mar 8, 2025

Playing Pokemon Red with Reinforcement Learning

Jupyter Notebook 7,214 681 Updated Mar 6, 2025

Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym

Jupyter Notebook 387 24 Updated Feb 26, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,219 784 Updated Mar 1, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,092 611 Updated Mar 6, 2025

Everything you want to know about Google Cloud TPU

Python 517 30 Updated Jul 16, 2024

A fast MoE impl for PyTorch

Python 1,658 192 Updated Feb 10, 2025

[ICLR2025 Spotlight🔥] Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Python 532 40 Updated Feb 11, 2025