Awesome Python

Hand-picked awesome Python libraries and frameworks, organised by category 🐍

Interactive version: www.awesomepython.org

Updated 21 Mar 2025

Newly Created Repositories

Awesome Python is regularly updated, and this category lists the most recently created GitHub repositories drawn from all the other categories here.

mannaandpoem/OpenManus ⭐ 36,746
Open source version of Manus, the general AI agent
🔗 openmanus.github.io
camel-ai/owl ⭐ 12,653
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
anthropics/claude-code ⭐ 6,496
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows
🔗 docs.anthropic.com/s/claude-code
zilliztech/deep-searcher ⭐ 4,476
DeepSearcher combines reasoning LLMs and VectorDBs to perform search, evaluation, and reasoning based on private data, providing highly accurate answers and comprehensive reports
deepseek-ai/smallpond ⭐ 4,269
A lightweight data processing framework built on DuckDB and 3FS.
deepseek-ai/DualPipe ⭐ 2,604
DualPipe is an innovative bidirectional pipeline parallelism algorithm introduced in the DeepSeek-V3 Technical Report.
openmanus/OpenManus-RL ⭐ 1,572
OpenManus-RL is an open-source initiative collaboratively led by Ulab-UIUC and MetaGPT. This project is an extended version of the original OpenManus initiative.
hiyouga/EasyR1 ⭐ 1,528
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
🔗 verl.readthedocs.io/en/latest/index.html
deepseek-ai/EPLB ⭐ 1,076
Expert Parallelism Load Balancer across GPUs
emissary-tech/legit-rag ⭐ 240
A modular Retrieval-Augmented Generation (RAG) system built with FastAPI, Qdrant, and OpenAI.

Agentic AI

Agentic AI libraries, frameworks and tools: AI agents, workflows, autonomous decision-making, goal-oriented tasks, and API integrations.

langchain-ai/langchain ⭐ 103,604
🦜🔗 Build context-aware reasoning applications
🔗 python.langchain.com
langgenius/dify ⭐ 83,239
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
🔗 dify.ai
geekan/MetaGPT ⭐ 52,783
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
🔗 deepwisdom.ai
logspace-ai/langflow ⭐ 51,893
Langflow is a powerful tool for building and deploying AI-powered agents and workflows.
🔗 www.langflow.org
browser-use/browser-use ⭐ 45,589
Browser use is the easiest way to connect your AI agents with the browser.
🔗 browser-use.com
microsoft/autogen ⭐ 41,666
AutoGen is a framework for creating multi-agent AI applications that can act autonomously or work alongside humans.
🔗 microsoft.github.io/autogen
run-llama/llama_index ⭐ 40,104
LlamaIndex is the leading framework for building LLM-powered agents over your data.
🔗 docs.llamaindex.ai
mannaandpoem/OpenManus ⭐ 36,746
Open source version of Manus, the general AI agent
🔗 openmanus.github.io
crewaiinc/crewAI ⭐ 28,808
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
🔗 crewai.com
openbmb/ChatDev ⭐ 26,479
ChatDev stands as a virtual software company that operates through various intelligent agents holding different roles, including Chief Executive Officer, Chief Product Officer, etc.
🔗 arxiv.org/abs/2307.07924
mem0ai/mem0 ⭐ 26,371
Enhances AI assistants and agents with an intelligent memory layer, enabling personalized AI interactions
🔗 mem0.ai
composiohq/composio ⭐ 24,364
Composio equips your AI agents & LLMs with 100+ high-quality integrations via function calling
🔗 docs.composio.dev
stanford-oval/storm ⭐ 23,459
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
🔗 storm.genie.stanford.edu
yoheinakajima/babyagi ⭐ 21,211
GPT-4 powered task-driven autonomous agent
🔗 babyagi.org
agno-agi/agno ⭐ 21,143
Build Multimodal AI Agents with memory, knowledge and tools. Simple, fast and model-agnostic.
🔗 docs.agno.com
microsoft/OmniParser ⭐ 20,658
OmniParser is a comprehensive method for parsing user interface screenshots into structured and easy-to-understand elements
assafelovic/gpt-researcher ⭐ 20,306
LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations.
🔗 gptr.dev
openai/swarm ⭐ 19,336
A framework exploring ergonomic, lightweight multi-agent orchestration.
unity-technologies/ml-agents ⭐ 17,820
The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
🔗 unity.com/products/machine-learning-agents
letta-ai/letta ⭐ 15,381
Letta (formerly MemGPT) is a framework for creating LLM services with memory.
🔗 docs.letta.com
huggingface/smolagents ⭐ 15,114
🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.
🔗 huggingface.co/docs/smolagents
dzhng/deep-research ⭐ 14,673
An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models.
camel-ai/owl ⭐ 12,653
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
smol-ai/developer ⭐ 11,901
the first library to let you embed a developer agent in your own app!
🔗 twitter.com/smolmodels
camel-ai/camel ⭐ 10,585
🐫 CAMEL: Finding the Scaling Law of Agents. The first and the best multi-agent framework. https://www.camel-ai.org
🔗 docs.camel-ai.org
langchain-ai/langgraph ⭐ 10,298
LangGraph is a library for building stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain.
🔗 langchain-ai.github.io/langgraph
sakanaai/AI-Scientist ⭐ 9,366
The AI Scientist, the first comprehensive system for fully automatic scientific discovery, enabling Foundation Models such as Large Language Models (LLMs) to perform research independently.
nirdiamant/GenAI_Agents ⭐ 9,117
Tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive AI systems.
meta-llama/llama-stack ⭐ 7,485
Llama Stack standardizes the building blocks needed to bring genai applications to market. These blocks cover model training and fine-tuning, evaluation, and running AI agents in production
pydantic/pydantic-ai ⭐ 7,275
PydanticAI is a Python Agent Framework designed to make it less painful to build production grade applications with Generative AI.
🔗 ai.pydantic.dev
upsonic/Upsonic ⭐ 7,036
Upsonic is a reliability-focused framework designed for real-world applications. It enables trusted agent workflows in your organization through advanced reliability features, including verification layers, triangular architecture, validator agents, and output evaluation systems.
🔗 upsonic.ai
mnotgod96/AppAgent ⭐ 5,618
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
🔗 appagent-official.github.io
prefecthq/marvin ⭐ 5,562
✨ AI agents that spark joy
🔗 askmarvin.ai
kyegomez/swarms ⭐ 4,725
The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework. Website: https://swarms.ai
🔗 docs.swarms.world
zilliztech/deep-searcher ⭐ 4,476
DeepSearcher combines reasoning LLMs and VectorDBs to perform search, evaluation, and reasoning based on private data, providing highly accurate answers and comprehensive reports
landing-ai/vision-agent ⭐ 4,339
VisionAgent is a library that helps you utilize agent frameworks to generate code to solve your vision task
meta-llama/llama-stack-apps ⭐ 4,173
Agentic components of the Llama Stack APIs
crewaiinc/crewAI-examples ⭐ 3,900
A collection of examples that show how to use CrewAI framework to automate workflows.
x-plug/MobileAgent ⭐ 3,843
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
🔗 arxiv.org/abs/2406.01014
pyspur-dev/pyspur ⭐ 3,596
A visual playground for agentic workflows: Iterate over your agents 10x faster
🔗 pyspur.dev
langroid/langroid ⭐ 3,159
Harness LLMs with Multi-Agent Programming
🔗 langroid.github.io/langroid
brainblend-ai/atomic-agents ⭐ 3,097
Atomic Agents provides a set of tools and agents that can be combined to create powerful applications. It is built on top of Instructor and leverages the power of Pydantic for data and schema validation and serialization.
facebookresearch/Pearl ⭐ 2,785
A Production-ready Reinforcement Learning AI Agent Library brought by the Applied Reinforcement Learning team at Meta.
joshuac215/agent-service-toolkit ⭐ 2,471
A full toolkit for running an AI agent service built with LangGraph, FastAPI and Streamlit.
🔗 agent-service-toolkit.streamlit.app
om-ai-lab/OmAgent ⭐ 2,305
OmAgent is a Python library for building multimodal language agents with ease. We try to keep the library simple, without the heavy overhead of other agent frameworks.
🔗 om-agent.com
langchain-ai/open_deep_research ⭐ 2,268
Open Deep Research is an open source assistant that automates research and produces customizable reports on any topic
griptape-ai/griptape ⭐ 2,229
Modular Python framework for AI agents and workflows with chain-of-thought reasoning, tools, and memory.
🔗 www.griptape.ai
ag2ai/ag2 ⭐ 2,074
AG2 (formerly AutoGen) is an open-source programming framework for building AI agents and facilitating cooperation among multiple agents to solve tasks.
🔗 ag2.ai
i-am-bee/beeai-framework ⭐ 2,072
Build production-ready AI agents in both Python and TypeScript
🔗 i-am-bee.github.io/beeai-framework
run-llama/llama_deploy ⭐ 1,980
Async-first framework for deploying, scaling, and productionizing agentic multi-service systems based on workflows from llama_index.
🔗 docs.llamaindex.ai/en/stable/module_guides/llama_deploy
emcie-co/parlant ⭐ 1,850
Control GenAI interactions with power, precision, and consistency using LLM-native Conversation Design paradigms
🔗 www.parlant.io
langchain-ai/executive-ai-assistant ⭐ 1,641
Executive AI Assistant (EAIA) is an AI agent that attempts to do the job of an Executive Assistant (EA).
btahir/open-deep-research ⭐ 1,635
Open source alternative to Gemini Deep Research. Generate reports with AI based on search results.
🔗 opendeepresearch.vercel.app
openmanus/OpenManus-RL ⭐ 1,572
OpenManus-RL is an open-source initiative collaboratively led by Ulab-UIUC and MetaGPT. This project is an extended version of the original OpenManus initiative.
openautocoder/Agentless ⭐ 1,571
Agentless🐱: an agentless approach to automatically solve software development problems
link-agi/AutoAgents ⭐ 1,328
[IJCAI 2024] Generate different roles for GPTs to form a collaborative entity for complex tasks.
🔗 huggingface.co/spaces/linksoul/autoagents
agentera/Agently ⭐ 1,264
Agently is a development framework that helps developers build AI-agent-native applications really fast.
🔗 agently.tech
prefecthq/ControlFlow ⭐ 1,220
ControlFlow provides a structured, developer-focused framework for defining workflows and delegating work to LLMs, without sacrificing control or transparency
🔗 controlflow.ai
shengranhu/ADAS ⭐ 1,219
Automated Design of Agentic Systems using Meta Agent Search to show agents can invent novel and powerful agent designs
🔗 www.shengranhu.com/adas
msoedov/agentic_security ⭐ 1,172
An open-source vulnerability scanner for Agent Workflows and LLMs. Protecting AI systems from jailbreaks, fuzzing, and multimodal attacks.
🔗 agentic-security.vercel.app
plurai-ai/intellagent ⭐ 972
Simulate interactions, analyze performance, and gain actionable insights for conversational agents. Test, evaluate, and optimize your agent to ensure reliable real-world deployment.
🔗 intellagent-doc.plurai.ai
szczyglis-dev/py-gpt ⭐ 925
Desktop AI Assistant powered by o1, o3, GPT-4, GPT-4 Vision, Gemini, Claude, Llama 3, DeepSeek, Bielik, DALL-E, chat, vision, voice control, image generation and analysis, agents, command execution, file upload/download, speech synthesis and recognition, access to Web, memory, presets, assistants, plugins, and more...
🔗 pygpt.net
thytu/Agentarium ⭐ 908
Framework for managing and orchestrating AI agents with ease. Agentarium provides a flexible and intuitive way to create, manage, and coordinate interactions between multiple AI agents in various environments.
victordibia/autogen-ui ⭐ 876
Web UI for AutoGen (A Framework for Multi-Agent LLM Applications)
thudm/CogAgent ⭐ 834
An open-sourced end-to-end VLM-based GUI Agent
google-deepmind/concordia ⭐ 809
Concordia is a library to facilitate construction and use of generative agent-based models to simulate interactions of agents in grounded physical, social, or digital space.
deedy/mac_computer_use ⭐ 767
A fork of Anthropic Computer Use that you can run on Mac computers to give Claude and other AI models autonomous access to your computer.
🔗 x.com/deedydas/status/1849481225041559910
strnad/CrewAI-Studio ⭐ 703
agentic, gui, automation
salesforceairesearch/AgentLite ⭐ 582
AgentLite is a research-oriented library designed for building and advancing LLM-based task-oriented agent systems. It simplifies the implementation of new agent/multi-agent architectures, enabling easy orchestration of multiple agents through a manager agent.
quantalogic/quantalogic ⭐ 378
QuantaLogic is a ReAct (Reasoning & Action) framework for building advanced AI agents. The CLI version includes coding capabilities comparable to Aider.
prithivirajdamodaran/Route0x ⭐ 101
A production-grade query routing solution, leveraging LLMs while optimizing for cost per query

Code Quality

Code quality tooling: linters, formatters, pre-commit hooks, unused code removal.

psf/black ⭐ 39,933
The uncompromising Python code formatter
🔗 black.readthedocs.io/en/stable
astral-sh/ruff ⭐ 36,874
An extremely fast Python linter and code formatter, written in Rust.
🔗 docs.astral.sh/ruff
google/yapf ⭐ 13,862
A formatter for Python files
pre-commit/pre-commit ⭐ 13,518
A framework for managing and maintaining multi-language pre-commit hooks.
🔗 pre-commit.com
sqlfluff/sqlfluff ⭐ 8,673
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
🔗 www.sqlfluff.com
pycqa/isort ⭐ 6,621
A Python utility / library to sort imports.
🔗 pycqa.github.io/isort
davidhalter/jedi ⭐ 5,912
Awesome autocompletion, static analysis and refactoring library for python
🔗 jedi.readthedocs.io
pycqa/pylint ⭐ 5,423
It's not just a linter that annoys you!
🔗 pylint.readthedocs.io/en/latest
jendrikseipp/vulture ⭐ 3,737
Find dead Python code
asottile/pyupgrade ⭐ 3,725
A tool (and pre-commit hook) to automatically upgrade syntax for newer versions of the language.
pycqa/flake8 ⭐ 3,554
flake8 is a python tool that glues together pycodestyle, pyflakes, mccabe, and third-party plugins to check the style and quality of some python code.
🔗 flake8.pycqa.org
wemake-services/wemake-python-styleguide ⭐ 2,675
The strictest and most opinionated python linter ever!
🔗 wemake-python-styleguide.rtfd.io
python-lsp/python-lsp-server ⭐ 2,111
Fork of the python-language-server project, maintained by the Spyder IDE team and the community
codespell-project/codespell ⭐ 2,053
check code for common misspellings
sourcery-ai/sourcery ⭐ 1,622
Instant AI code reviews
🔗 sourcery.ai
tconbeer/sqlfmt ⭐ 441
sqlfmt formats your dbt SQL files so you don't have to
🔗 sqlfmt.com

Crypto and Blockchain

Cryptocurrency and blockchain libraries: trading bots, API integration, Ethereum virtual machine, solidity.

freqtrade/freqtrade ⭐ 37,321
Free, open source crypto trading bot
🔗 www.freqtrade.io
ccxt/ccxt ⭐ 35,163
A JavaScript / TypeScript / Python / C# / PHP / Go cryptocurrency trading API with support for more than 100 bitcoin/altcoin exchanges
🔗 docs.ccxt.com
crytic/slither ⭐ 5,560
Static Analyzer for Solidity and Vyper
🔗 blog.trailofbits.com/2018/10/19/slither-a-solidity-static-analysis-framework
ethereum/web3.py ⭐ 5,189
A python interface for interacting with the Ethereum blockchain and ecosystem.
🔗 web3py.readthedocs.io
ethereum/consensus-specs ⭐ 3,684
Ethereum Proof-of-Stake Consensus Specifications
cyberpunkmetalhead/Binance-volatility-trading-bot ⭐ 3,450
This is a fully functioning Binance trading bot that measures the volatility of every coin on Binance and places trades with the highest gaining coins. If you like this project, consider donating through the Brave browser to allow me to continuously improve the script.
bmoscon/cryptofeed ⭐ 2,374
Cryptocurrency Exchange Websocket Data Feed Handler
ethereum/py-evm ⭐ 2,316
A Python implementation of the Ethereum Virtual Machine
🔗 py-evm.readthedocs.io/en/latest
binance/binance-public-data ⭐ 1,765
Details on how to get Binance public data
ofek/bit ⭐ 1,285
Bitcoin made easy.
🔗 ofek.dev/bit
man-c/pycoingecko ⭐ 1,071
Python wrapper for the CoinGecko API
palkeo/panoramix ⭐ 842
Ethereum decompiler
coinbase/agentkit ⭐ 596
AgentKit is Coinbase Developer Platform's framework for easily enabling AI agents to take actions onchain. It is designed to be framework-agnostic, so you can use it with any AI framework, and wallet-agnostic
🔗 docs.cdp.coinbase.com/agentkit/docs/welcome
dylanhogg/awesome-crypto ⭐ 74
A list of awesome crypto and blockchain projects
🔗 www.awesomecrypto.xyz

Data

General data libraries: data processing, serialisation, formats, databases, SQL, connectors, web crawlers, data generation/augmentation/checks.

scrapy/scrapy ⭐ 54,567
Scrapy, a fast high-level web crawling & scraping framework for Python.
🔗 scrapy.org
apache/spark ⭐ 40,753
Apache Spark - A unified analytics engine for large-scale data processing
🔗 spark.apache.org
microsoft/markitdown ⭐ 40,234
A utility for converting files to Markdown, supports: PDF, PPT, Word, Excel, Images etc
mindsdb/mindsdb ⭐ 27,339
AI's query engine - Platform for building AI that can learn and answer questions over large scale federated data.
🔗 mindsdb.com
getredash/redash ⭐ 27,090
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
🔗 redash.io
jaidedai/EasyOCR ⭐ 25,947
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
🔗 www.jaided.ai
ds4sd/docling ⭐ 24,248
Docling parses documents and exports them to the desired format with ease and speed.
🔗 docling-project.github.io/docling
pathwaycom/pathway ⭐ 23,703
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
🔗 pathway.com
qdrant/qdrant ⭐ 22,555
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
🔗 qdrant.tech
humansignal/label-studio ⭐ 21,245
Label Studio is an open source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats.
🔗 labelstud.io
chroma-core/chroma ⭐ 18,681
the AI-native open-source embedding database
🔗 www.trychroma.com
joke2k/faker ⭐ 18,106
Faker is a Python package that generates fake data for you.
🔗 faker.readthedocs.io
avaiga/taipy ⭐ 17,891
Turns Data and AI algorithms into production-ready web applications in no time.
🔗 www.taipy.io
airbytehq/airbyte ⭐ 17,558
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
🔗 airbyte.com
binux/pyspider ⭐ 16,553
A Powerful Spider (Web Crawler) System in Python.
🔗 docs.pyspider.org
twintproject/twint ⭐ 15,995
An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
tiangolo/sqlmodel ⭐ 15,427
SQL databases in Python, designed for simplicity, compatibility, and robustness.
🔗 sqlmodel.tiangolo.com
apache/arrow ⭐ 15,115
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
🔗 arrow.apache.org
redis/redis-py ⭐ 12,928
Redis Python client
weaviate/weaviate ⭐ 12,793
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.
🔗 weaviate.io/developers/weaviate
coleifer/peewee ⭐ 11,433
a small, expressive orm -- supports postgresql, mysql, sqlite and cockroachdb
🔗 docs.peewee-orm.com
s0md3v/Photon ⭐ 11,404
Incredibly fast crawler designed for OSINT.
sqlalchemy/sqlalchemy ⭐ 10,168
The Database Toolkit for Python
🔗 www.sqlalchemy.org
simonw/datasette ⭐ 9,879
An open source multi-tool for exploring and publishing data
🔗 datasette.io
bigscience-workshop/petals ⭐ 9,504
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
🔗 petals.dev
voxel51/fiftyone ⭐ 9,286
Refine high-quality datasets and visual AI models
🔗 fiftyone.ai
yzhao062/pyod ⭐ 8,934
A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
🔗 pyod.readthedocs.io
gristlabs/grist-core ⭐ 8,107
Grist is the evolution of spreadsheets.
🔗 www.getgrist.com
cyclotruc/gitingest ⭐ 7,490
Turn any Git repository into a prompt-friendly text ingest for LLMs.
🔗 gitingest.com
tobymao/sqlglot ⭐ 7,324
Python SQL Parser and Transpiler
🔗 sqlglot.com
alirezamika/autoscraper ⭐ 6,684
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
kaggle/kaggle-api ⭐ 6,503
Official Kaggle API
madmaze/pytesseract ⭐ 6,049
A Python wrapper for Google Tesseract
lancedb/lancedb ⭐ 5,887
Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
🔗 lancedb.github.io/lancedb
vi3k6i5/flashtext ⭐ 5,638
Extract Keywords from sentence or Replace keywords in sentences.
ibis-project/ibis ⭐ 5,598
Ibis is a Python library that provides a lightweight, universal interface for data wrangling. It helps Python users explore and transform data of any size, stored anywhere.
🔗 ibis-project.org
airbnb/knowledge-repo ⭐ 5,509
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
superduperdb/superduper ⭐ 5,012
Superduper: Build end-to-end AI applications and agent workflows on your existing data infrastructure and preferred tools - without migrating your data.
🔗 superduper.io
facebookresearch/AugLy ⭐ 4,991
A data augmentations library for audio, image, text, and video.
🔗 ai.facebook.com/blog/augly-a-new-data-augmentation-library-to-help-build-more-robust-ai-models
jazzband/tablib ⭐ 4,674
Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
🔗 tablib.readthedocs.io
amundsen-io/amundsen ⭐ 4,530
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
🔗 www.amundsen.io/amundsen
lk-geimfari/mimesis ⭐ 4,505
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
🔗 mimesis.name
giskard-ai/giskard ⭐ 4,372
🐢 Open-Source Evaluation & Testing for AI & LLM systems
🔗 docs.giskard.ai
mongodb/mongo-python-driver ⭐ 4,195
PyMongo - the Official MongoDB Python driver
🔗 www.mongodb.com/docs/languages/python/pymongo-driver/current
adbar/trafilatura ⭐ 4,036
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
🔗 trafilatura.readthedocs.io
rom1504/img2dataset ⭐ 3,951
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
andialbrecht/sqlparse ⭐ 3,833
A non-validating SQL parser module for Python
deepchecks/deepchecks ⭐ 3,738
Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
🔗 docs.deepchecks.com/stable
jmcnamara/XlsxWriter ⭐ 3,734
A Python module for creating Excel XLSX files.
🔗 xlsxwriter.readthedocs.io
rapidai/RapidOCR ⭐ 3,710
📄 Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.
🔗 rapidai.github.io/rapidocrdocs
praw-dev/praw ⭐ 3,619
PRAW, an acronym for "Python Reddit API Wrapper", is a python package that allows for simple access to Reddit's API.
🔗 praw.readthedocs.io
run-llama/llama-hub ⭐ 3,471
A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain
🔗 llamahub.ai
dlt-hub/dlt ⭐ 3,339
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
🔗 dlthub.com/docs
pyeve/cerberus ⭐ 3,198
Lightweight, extensible data validation library for Python
🔗 python-cerberus.org
sqlalchemy/alembic ⭐ 3,147
A database migrations tool for SQLAlchemy.
zoomeranalytics/xlwings ⭐ 3,101
xlwings is a Python library that makes it easy to call Python from Excel and vice versa. It works with Excel on Windows and macOS as well as with Google Sheets and Excel on the web.
🔗 www.xlwings.org
docarray/docarray ⭐ 3,025
Represent, send, store and search multimodal data
🔗 docs.docarray.org
pallets/itsdangerous ⭐ 2,985
Safely pass trusted data to untrusted environments and back.
🔗 itsdangerous.palletsprojects.com
datafold/data-diff ⭐ 2,964
Compare tables within or across databases
🔗 docs.datafold.com
goldsmith/Wikipedia ⭐ 2,935
A Pythonic wrapper for the Wikipedia API
🔗 wikipedia.readthedocs.org
mlabonne/llm-datasets ⭐ 2,841
Curated list of datasets and tools for post-training.
🔗 mlabonne.github.io/blog
awslabs/amazon-redshift-utils ⭐ 2,795
Amazon Redshift Utils contains utilities, scripts and views that are useful in a Redshift environment
kayak/pypika ⭐ 2,634
PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.
🔗 pypika.readthedocs.io/en/latest
sdv-dev/SDV ⭐ 2,539
Synthetic data generation for tabular data
🔗 docs.sdv.dev/sdv
pynamodb/PynamoDB ⭐ 2,484
A pythonic interface to Amazon's DynamoDB
🔗 pynamodb.readthedocs.io
samuelcolvin/arq ⭐ 2,384
Fast job queuing and RPC in python with asyncio and redis.
🔗 arq-docs.helpmanual.io
uqfoundation/dill ⭐ 2,322
serialize all of Python
🔗 dill.rtfd.io
huggingface/datatrove ⭐ 2,302
DataTrove is a library to process, filter and deduplicate text data at a very large scale. It provides a set of prebuilt commonly used processing blocks with a framework to easily add custom functionality
pikepdf/pikepdf ⭐ 2,291
A Python library for reading and writing PDF, powered by QPDF
🔗 pikepdf.readthedocs.io
emirozer/fake2db ⭐ 2,288
Generate fake but valid data-filled databases for test purposes using the most popular patterns (AFAIK). Current support is sqlite, mysql, postgresql, mongodb, redis, couchdb.
graphistry/pygraphistry ⭐ 2,224
PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer
accenture/AmpliGraph ⭐ 2,198
Python library for Representation Learning on Knowledge Graphs https://docs.ampligraph.org
sfu-db/connector-x ⭐ 2,170
Fastest library to load data from DB to DataFrames in Rust and Python
🔗 sfu-db.github.io/connector-x
aminalaee/sqladmin ⭐ 2,077
SQLAlchemy Admin for FastAPI and Starlette
🔗 aminalaee.dev/sqladmin
milvus-io/bootcamp ⭐ 2,048
Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.
🔗 milvus.io
agronholm/sqlacodegen ⭐ 2,022
Automatic model code generator for SQLAlchemy
uber/petastorm ⭐ 1,823
The Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.
aio-libs/aiomysql ⭐ 1,796
aiomysql is a library for accessing a MySQL database from asyncio
🔗 aiomysql.rtfd.io
simonw/sqlite-utils ⭐ 1,782
Python CLI utility and library for manipulating SQLite databases
🔗 sqlite-utils.datasette.io
simple-salesforce/simple-salesforce ⭐ 1,748
A very simple Salesforce.com REST API client for Python
collerek/ormar ⭐ 1,716
python async orm with fastapi in mind and pydantic validation
🔗 collerek.github.io/ormar
zarr-developers/zarr-python ⭐ 1,641
An implementation of chunked, compressed, N-dimensional arrays for Python.
🔗 zarr.readthedocs.io
scholarly-python-package/scholarly ⭐ 1,559
Retrieve author and publication information from Google Scholar in a friendly, Pythonic way without having to worry about CAPTCHAs!
🔗 scholarly.readthedocs.io
eleutherai/the-pile ⭐ 1,546
The Pile is a large, diverse, open source language modelling data set that consists of many smaller datasets combined together.
ydataai/ydata-synthetic ⭐ 1,514
Synthetic data generators for tabular and time-series data
🔗 docs.sdk.ydata.ai
mchong6/JoJoGAN ⭐ 1,424
Official PyTorch repo for JoJoGAN: One Shot Face Stylization
sdispater/orator ⭐ 1,420
The Orator ORM provides a simple yet beautiful ActiveRecord implementation.
🔗 orator-orm.com
google/tensorstore ⭐ 1,389
Library for reading and writing large multi-dimensional arrays.
🔗 google.github.io/tensorstore
quixio/quix-streams ⭐ 1,328
Python Streaming DataFrames for Kafka
🔗 docs.quix.io
d-star-ai/dsRAG ⭐ 1,259
A retrieval engine for unstructured data. It is especially good at handling challenging queries over dense text, like financial reports, legal documents, and academic papers.
aio-libs/aiocache ⭐ 1,246
Asyncio cache manager for redis, memcached and memory
🔗 aiocache.readthedocs.io
eliasdabbas/advertools ⭐ 1,200
advertools - online marketing productivity and analysis tools
🔗 advertools.readthedocs.io
pytorch/data ⭐ 1,174
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.
brettkromkamp/contextualise ⭐ 1,070
Contextualise is an effective tool particularly suited for organising information-heavy projects and activities consisting of unstructured and widely diverse data and information resources
🔗 contextualise.dev
uber/fiber ⭐ 1,043
Distributed Computing for AI Made Simple
🔗 uber.github.io/fiber
intake/intake ⭐ 1,032
Intake is a lightweight package for finding, investigating, loading and disseminating data.
🔗 intake.readthedocs.io
duckdb/dbt-duckdb ⭐ 1,019
dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)
igorbenav/fastcrud ⭐ 973
FastCRUD is a Python package for FastAPI, offering robust async CRUD operations and flexible endpoint creation utilities.
goccy/bigquery-emulator ⭐ 911
BigQuery emulator provides a way to launch a BigQuery server on your local machine for testing and development.
scikit-hep/awkward ⭐ 869
Manipulate JSON-like data with NumPy-like idioms.
🔗 awkward-array.org
macbre/sql-metadata ⭐ 843
Uses tokenized query returned by python-sqlparse and generates query metadata
🔗 pypi.python.org/pypi/sql-metadata
koaning/human-learn ⭐ 806
Natural Intelligence is still a pretty good idea.
🔗 koaning.github.io/human-learn
googleapis/python-bigquery ⭐ 758
Python Client for Google BigQuery
hyperqueryhq/whale ⭐ 726
🐳 The stupidly simple CLI workspace for your data warehouse.
🔗 rsyi.gitbook.io/whale
weaviate/recipes ⭐ 721
This repository shares end-to-end notebooks on how to use various Weaviate features and integrations!
kagisearch/vectordb ⭐ 700
A minimal Python package for storing and retrieving text using chunking, embeddings, and vector search.
🔗 vectordb.com
dgarnitz/vectorflow ⭐ 689
VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice.
🔗 www.getvectorflow.com
unstructured-io/unstructured-api ⭐ 683
API for Open-Source Pre-Processing Tools for Unstructured Data
apache/iceberg-python ⭐ 640
PyIceberg is a Python library for programmatic access to Iceberg table metadata as well as to table data in Iceberg format.
🔗 py.iceberg.apache.org
jina-ai/vectordb ⭐ 597
A Python vector database you just need - no more, no less.
koaning/bulk ⭐ 568
Bulk is a quick UI developer tool to apply some bulk labels.
ibm/data-prep-kit ⭐ 535
Data Prep Kit is a community project to democratize and accelerate unstructured data preparation for LLM app developers
🔗 data-prep-kit.github.io/data-prep-kit
koaning/doubtlab ⭐ 509
Doubt your data, find bad labels.
🔗 koaning.github.io/doubtlab
titan-systems/titan ⭐ 463
Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage RBAC, users, roles, and data access. Declarative Python Resource API.
stackloklabs/promptwright ⭐ 392
Promptwright is a Python library designed for generating large synthetic datasets using LLMs

Debugging

Debugging and tracing tools.

cool-rr/PySnooper ⭐ 16,445
Never use print for debugging again
gruns/icecream ⭐ 9,594
🍦 Never use print() to debug again.
shobrook/rebound ⭐ 4,126
Get instant Stack Overflow results whenever an exception is thrown
inducer/pudb ⭐ 3,048
Full-screen console debugger for Python
🔗 documen.tician.de/pudb
gotcha/ipdb ⭐ 1,900
Integration of IPython pdb
alexmojaki/heartrate ⭐ 1,814
Simple real time visualisation of the execution of a Python program.
alexmojaki/birdseye ⭐ 1,673
Graphical Python debugger which lets you easily view the values of all evaluated expressions
🔗 birdseye.readthedocs.io
pdbpp/pdbpp ⭐ 1,338
pdb++, a drop-in replacement for pdb (the Python debugger)
alexmojaki/snoop ⭐ 1,333
A powerful set of Python debugging tools, based on PySnooper
samuelcolvin/python-devtools ⭐ 1,010
Dev tools for python
🔗 python-devtools.helpmanual.io

Diffusion Text to Image

Text-to-image diffusion model libraries, tools and apps for generating images from natural language.

automatic1111/stable-diffusion-webui ⭐ 149,582
Stable Diffusion web UI
comfyanonymous/ComfyUI ⭐ 71,392
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
🔗 www.comfy.org
compvis/stable-diffusion ⭐ 69,994
A latent text-to-image diffusion model
🔗 ommer-lab.com/research/latent-diffusion-models
stability-ai/stablediffusion ⭐ 40,513
High-Resolution Image Synthesis with Latent Diffusion Models
lllyasviel/ControlNet ⭐ 31,762
Let us control diffusion models!
huggingface/diffusers ⭐ 28,101
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
🔗 huggingface.co/docs/diffusers
invoke-ai/InvokeAI ⭐ 24,654
Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, and serves as the foundation for multiple commercial products.
🔗 invoke-ai.github.io/invokeai
openbmb/MiniCPM-o ⭐ 18,991
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
apple/ml-stable-diffusion ⭐ 17,213
Stable Diffusion with Core ML on Apple Silicon
borisdayma/dalle-mini ⭐ 14,796
DALL·E Mini - Generate images from a text prompt
🔗 www.craiyon.com
divamgupta/diffusionbee-stable-diffusion-ui ⭐ 13,099
Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.
🔗 diffusionbee.com
compvis/latent-diffusion ⭐ 12,528
High-Resolution Image Synthesis with Latent Diffusion Models
instantid/InstantID ⭐ 11,488
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
🔗 instantid.github.io
lucidrains/DALLE2-pytorch ⭐ 11,235
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
facebookresearch/dinov2 ⭐ 10,015
PyTorch code and models for the DINOv2 self-supervised learning method.
ashawkey/stable-dreamfusion ⭐ 8,540
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
carson-katri/dream-textures ⭐ 7,955
Stable Diffusion built-in to Blender
xavierxiao/Dreambooth-Stable-Diffusion ⭐ 7,669
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
idea-research/GroundingDINO ⭐ 7,654
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
🔗 arxiv.org/abs/2303.05499
opengvlab/InternVL ⭐ 7,270
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
🔗 internvl.readthedocs.io/en/latest
timothybrooks/instruct-pix2pix ⭐ 6,566
PyTorch implementation of InstructPix2Pix, an instruction-based image editing model, based on the original CompVis/stable_diffusion repo.
openai/consistency_models ⭐ 6,279
Official repo for consistency models.
salesforce/BLIP ⭐ 5,108
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
nateraw/stable-diffusion-videos ⭐ 4,545
Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
lkwq007/stablediffusion-infinity ⭐ 3,871
Outpainting with Stable Diffusion on an infinite canvas
jina-ai/discoart ⭐ 3,843
🪩 Create Disco Diffusion artworks in one line
mlc-ai/web-stable-diffusion ⭐ 3,646
Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.
🔗 mlc.ai/web-stable-diffusion
openai/glide-text2im ⭐ 3,595
GLIDE: a diffusion-based text-conditional image synthesis model
openai/improved-diffusion ⭐ 3,471
Release for Improved Denoising Diffusion Probabilistic Models
saharmor/dalle-playground ⭐ 2,766
A playground to generate images from any text prompt using Stable Diffusion (past: using DALL-E Mini)
google-research/big_vision ⭐ 2,726
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
stability-ai/stability-sdk ⭐ 2,436
SDK for interacting with stability.ai APIs (e.g. stable diffusion inference)
🔗 platform.stability.ai
thudm/CogVLM2 ⭐ 2,310
GPT4V-level open-source multi-modal model based on Llama3-8B
open-compass/VLMEvalKit ⭐ 2,024
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
🔗 huggingface.co/spaces/opencompass/open_vlm_leaderboard
coyote-a/ultimate-upscale-for-automatic1111 ⭐ 1,713
Ultimate SD Upscale extension for AUTOMATIC1111 Stable Diffusion web UI
divamgupta/stable-diffusion-tensorflow ⭐ 1,599
Stable Diffusion in TensorFlow / Keras
nvlabs/prismer ⭐ 1,310
The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
🔗 shikun.io/projects/prismer
chenyangqiqi/FateZero ⭐ 1,138
[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
🔗 fate-zero-edit.github.io
thereforegames/unprompted ⭐ 794
Templating language written for Stable Diffusion workflows. Available as an extension for the Automatic1111 WebUI.
tanelp/tiny-diffusion ⭐ 714
A minimal PyTorch implementation of probabilistic diffusion models for 2D datasets.
sharonzhou/long_stable_diffusion ⭐ 687
Long-form text-to-images generation, using a pipeline of deep generative models (GPT-3 and Stable Diffusion)
laion-ai/dalle2-laion ⭐ 501
Pretrained Dalle2 from laion

Finance

Financial and quantitative libraries: investment research tools, market data, algorithmic trading, backtesting, financial derivatives.

openbb-finance/OpenBB ⭐ 37,429
Investment Research for Everyone, Everywhere.
🔗 openbb.co
quantopian/zipline ⭐ 18,259
Zipline, a Pythonic Algorithmic Trading Library
🔗 www.zipline.io
virattt/ai-hedge-fund ⭐ 17,976
AI-powered hedge fund. The goal of this project is to explore the use of AI to make trading decisions.
microsoft/qlib ⭐ 17,013
Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas to implementing productions. Qlib supports diverse machine learning modeling paradigms, including supervised learning, ...
🔗 qlib.readthedocs.io/en/latest
mementum/backtrader ⭐ 16,426
Python Backtesting library for trading strategies
🔗 www.backtrader.com
ranaroussi/yfinance ⭐ 16,267
Download market data from Yahoo! Finance's API
🔗 yfinance-python.org
ai4finance-foundation/FinGPT ⭐ 15,390
FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
🔗 ai4finance.org
ai4finance-foundation/FinRL ⭐ 11,172
FinRL: Financial Reinforcement Learning. 🔥
🔗 ai4finance.org
quantconnect/Lean ⭐ 10,832
Lean Algorithmic Trading Engine by QuantConnect (Python, C#)
🔗 lean.io
ta-lib/ta-lib-python ⭐ 10,372
Python wrapper for TA-Lib (http://ta-lib.org/).
🔗 ta-lib.github.io/ta-lib-python
goldmansachs/gs-quant ⭐ 8,568
Python toolkit for quantitative finance
🔗 developer.gs.com/discover/products/gs-quant
kernc/backtesting.py ⭐ 6,123
🔎 📈 🐍 💰 Backtest trading strategies in Python.
🔗 kernc.github.io/backtesting.py
twopirllc/pandas-ta ⭐ 5,902
Technical Analysis Indicators - Pandas TA is an easy to use Python 3 Pandas Extension with 150+ Indicators
🔗 twopirllc.github.io/pandas-ta
quantopian/pyfolio ⭐ 5,867
Portfolio and risk analytics in Python
🔗 quantopian.github.io/pyfolio
ranaroussi/quantstats ⭐ 5,451
Portfolio analytics for quants, written in Python
polakowo/vectorbt ⭐ 4,920
Find your trading edge, using the fastest engine for backtesting, algorithmic trading, and research.
🔗 vectorbt.dev
google/tf-quant-finance ⭐ 4,756
High-performance TensorFlow library for quantitative finance.
borisbanushev/stockpredictionai ⭐ 4,586
In this notebook I will create a complete process for predicting stock price movements. Follow along and we will achieve some pretty good results. For that purpose we will use a Generative Adversarial Network (GAN) with LSTM, a type of Recurrent Neural Network, as generator, and a Convolutional Neural Networ...
gbeced/pyalgotrade ⭐ 4,514
Python Algorithmic Trading Library
🔗 gbeced.github.io/pyalgotrade
matplotlib/mplfinance ⭐ 3,930
Financial Markets Data Visualization using Matplotlib
🔗 pypi.org/project/mplfinance
quantopian/alphalens ⭐ 3,585
Performance analysis of predictive (alpha) stock factors
🔗 quantopian.github.io/alphalens
cuemacro/finmarketpy ⭐ 3,540
Python library for backtesting trading strategies & analyzing financial markets (formerly pythalesians)
🔗 www.cuemacro.com
zvtvz/zvt ⭐ 3,482
modular quant framework.
🔗 zvt.readthedocs.io/en/latest
robcarver17/pysystemtrade ⭐ 2,815
Systematic Trading in python
quantopian/research_public ⭐ 2,534
Quantitative research and educational materials
🔗 www.quantopian.com/lectures
pmorissette/bt ⭐ 2,445
bt - flexible backtesting for Python
🔗 pmorissette.github.io/bt
domokane/FinancePy ⭐ 2,304
A Python Finance Library that focuses on the pricing and risk-management of Financial Derivatives, including fixed-income, equity, FX and credit derivatives.
blankly-finance/blankly ⭐ 2,253
🚀 💸 Easily build, backtest and deploy your algo in just a few lines of code. Trade stocks, cryptos, and forex across exchanges w/ one package.
🔗 package.blankly.finance
pmorissette/ffn ⭐ 2,180
ffn - a financial function library for Python
🔗 pmorissette.github.io/ffn
cuemacro/findatapy ⭐ 1,785
Python library to download market data via Bloomberg, Eikon, Quandl, Yahoo etc.
quantopian/empyrical ⭐ 1,353
Common financial risk and performance metrics. Used by zipline and pyfolio.
🔗 quantopian.github.io/empyrical
idanya/algo-trader ⭐ 814
Trading bot with support for realtime trading, backtesting, custom strategies and much more.
gbeced/basana ⭐ 685
A Python async and event driven framework for algorithmic trading, with a focus on crypto currencies.
chancefocus/PIXIU ⭐ 661
This repository introduces PIXIU, an open-source resource featuring the first financial large language models (LLMs), instruction tuning data, and evaluation benchmarks to holistically assess financial LLMs. Our goal is to continually push forward the open-source development of financial artificial intelligence (AI).
nasdaq/data-link-python ⭐ 512
A Python library for Nasdaq Data Link's RESTful API

Game Development

Game development tools, engines and libraries.

kitao/pyxel ⭐ 16,060
A retro game engine for Python
microsoft/TRELLIS ⭐ 8,454
A large 3D asset generation model. It takes in text or image prompts and generates high-quality 3D assets in various formats, such as Radiance Fields, 3D Gaussians, and meshes.
🔗 trellis3d.github.io
pygame/pygame ⭐ 7,878
🐍🎮 pygame (the library) is a Free and Open Source python programming language library for making multimedia applications like games built on top of the excellent SDL library. C, Python, Native, OpenGL.
🔗 www.pygame.org
panda3d/panda3d ⭐ 4,685
Powerful, mature open-source cross-platform game engine for Python and C++, developed by Disney and CMU
🔗 www.panda3d.org
niklasf/python-chess ⭐ 2,535
python-chess is a chess library for Python, with move generation, move validation, and support for common formats
🔗 python-chess.readthedocs.io/en/latest
pokepetter/ursina ⭐ 2,315
A game engine powered by python and panda3d.
🔗 pokepetter.github.io/ursina
pyglet/pyglet ⭐ 1,996
pyglet is a cross-platform windowing and multimedia library for Python, for developing games and other visually rich applications.
🔗 pyglet.org
pythonarcade/arcade ⭐ 1,764
Easy to use Python library for creating 2D arcade games.
🔗 arcade.academy

GIS

Geospatial libraries: raster and vector data formats, interactive mapping and visualisation, computing frameworks for processing images, projections.

domlysz/BlenderGIS ⭐ 8,084
Blender addons to make the bridge between Blender and geographic data
python-visualization/folium ⭐ 7,060
Python Data. Leaflet.js Maps.
🔗 python-visualization.github.io/folium
osgeo/gdal ⭐ 5,170
GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
🔗 gdal.org
gboeing/osmnx ⭐ 5,064
Download, model, analyze, and visualize street networks and other geospatial features from OpenStreetMap.
🔗 osmnx.readthedocs.io
geopandas/geopandas ⭐ 4,666
Python tools for geographic data
🔗 geopandas.org
shapely/shapely ⭐ 4,047
Manipulation and analysis of geometric objects
🔗 shapely.readthedocs.io/en/stable
giswqs/geemap ⭐ 3,593
A Python package for interactive geospatial analysis and visualization with Google Earth Engine.
🔗 geemap.org
holoviz/datashader ⭐ 3,382
Quickly and accurately render even the largest data.
🔗 datashader.org
opengeos/leafmap ⭐ 3,304
A Python package for interactive mapping and geospatial analysis with minimal coding in a Jupyter environment
🔗 leafmap.org
microsoft/torchgeo ⭐ 3,250
TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
🔗 www.osgeo.org/projects/torchgeo
opengeos/segment-geospatial ⭐ 3,212
A Python package for segmenting geospatial data with the Segment Anything Model (SAM)
🔗 samgeo.gishub.org
google/earthengine-api ⭐ 2,789
Python and JavaScript bindings for calling the Earth Engine API.
rasterio/rasterio ⭐ 2,327
Rasterio reads and writes geospatial raster datasets
🔗 rasterio.readthedocs.io
mcordts/cityscapesScripts ⭐ 2,225
README and scripts for the Cityscapes Dataset
azavea/raster-vision ⭐ 2,124
An open source library and framework for deep learning on satellite and aerial imagery.
🔗 docs.rastervision.io
apache/sedona ⭐ 2,010
A cluster computing framework for processing large-scale geospatial data
🔗 sedona.apache.org
plant99/felicette ⭐ 1,821
Satellite imagery for dummies.
gboeing/osmnx-examples ⭐ 1,641
Gallery of OSMnx tutorials, usage examples, and feature demonstrations.
🔗 osmnx.readthedocs.io
microsoft/GlobalMLBuildingFootprints ⭐ 1,515
Worldwide building footprints derived from satellite imagery
jupyter-widgets/ipyleaflet ⭐ 1,509
A Jupyter - Leaflet.js bridge
🔗 ipyleaflet.readthedocs.io
pysal/pysal ⭐ 1,374
PySAL: Python Spatial Analysis Library Meta-Package
🔗 pysal.org/pysal
anitagraser/movingpandas ⭐ 1,281
Movement trajectory classes and functions built on top of GeoPandas
🔗 movingpandas.org
residentmario/geoplot ⭐ 1,169
High-level geospatial data visualization library for Python.
🔗 residentmario.github.io/geoplot/index.html
sentinel-hub/eo-learn ⭐ 1,161
Earth observation processing framework for machine learning in Python
🔗 eo-learn.readthedocs.io/en/latest
opengeos/streamlit-geospatial ⭐ 919
A multi-page streamlit app for geospatial
🔗 huggingface.co/spaces/giswqs/streamlit
osgeo/grass ⭐ 899
GRASS - free and open-source geospatial processing engine
🔗 grass.osgeo.org
uber/h3-py ⭐ 884
Python bindings for H3, a hierarchical hexagonal geospatial indexing system
🔗 uber.github.io/h3-py
makepath/xarray-spatial ⭐ 870
Raster-based Spatial Analytics for Python
🔗 xarray-spatial.readthedocs.io
developmentseed/titiler ⭐ 849
Build your own Raster dynamic map tile services
🔗 developmentseed.org/titiler

Graph

Graphs and network libraries: network analysis, graph machine learning, visualisation.

networkx/networkx ⭐ 15,501
Network Analysis in Python
🔗 networkx.org
stellargraph/stellargraph ⭐ 2,986
StellarGraph - Machine Learning on Graphs
🔗 stellargraph.readthedocs.io
westhealth/pyvis ⭐ 1,071
Python package for creating and visualizing interactive network graphs.
🔗 pyvis.readthedocs.io/en/latest
microsoft/graspologic ⭐ 879
graspologic is a package for graph statistical algorithms
🔗 graspologic-org.github.io/graspologic
rampasek/GraphGPS ⭐ 723
Recipe for a General, Powerful, Scalable Graph Transformer
dylanhogg/llmgraph ⭐ 397
Create knowledge graphs with LLMs

GUI

Graphical user interface libraries and toolkits.

hoffstadt/DearPyGui ⭐ 13,891
Dear PyGui: A fast and powerful Graphical User Interface Toolkit for Python with minimal dependencies
🔗 dearpygui.readthedocs.io/en/latest
pysimplegui/PySimpleGUI ⭐ 13,561
Python GUIs for Humans! PySimpleGUI is the top-rated Python application development environment. Launched in 2018 and actively developed, maintained, and supported in 2024. Transforms tkinter, Qt, WxPython, and Remi into a simple, intuitive, and fun experience for both hobbyists and expert users.
🔗 www.pysimplegui.com
parthjadhav/Tkinter-Designer ⭐ 9,628
An easy and fast way to create a Python GUI 🐍
samuelcolvin/FastUI ⭐ 8,760
FastUI is a new way to build web application user interfaces defined by declarative Python code.
🔗 fastui-demo.onrender.com
r0x0r/pywebview ⭐ 5,030
Build GUI for your Python program with JavaScript, HTML, and CSS
🔗 pywebview.flowrl.com
beeware/toga ⭐ 4,556
A Python native, OS native GUI toolkit.
🔗 toga.readthedocs.io/en/latest
dddomodossola/remi ⭐ 3,581
Python REMote Interface library. Platform independent. In about 100 Kbytes, perfect for your diet.
wxwidgets/Phoenix ⭐ 2,404
wxPython's Project Phoenix. A new implementation of wxPython, better, stronger, faster than he was before.
🔗 wxpython.org

Jupyter

Jupyter and JupyterLab and Notebook tools, libraries and plugins.

jupyterlab/jupyterlab ⭐ 14,441
JupyterLab computational environment.
🔗 jupyterlab.readthedocs.io
jupyter/notebook ⭐ 12,113
Jupyter Interactive Notebook
🔗 jupyter-notebook.readthedocs.io
marimo-team/marimo ⭐ 11,352
A reactive Python notebook: run a cell or interact with a UI element, and marimo automatically runs dependent cells, keeping code and outputs consistent. marimo notebooks are stored as pure Python, executable as scripts, and deployable as apps.
🔗 marimo.io
garrettj403/SciencePlots ⭐ 7,608
Matplotlib styles for scientific plotting
mwouts/jupytext ⭐ 6,777
Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
🔗 jupytext.readthedocs.io
nteract/papermill ⭐ 6,112
📚 Parameterize, execute, and analyze notebooks
🔗 papermill.readthedocs.io/en/latest
connorferster/handcalcs ⭐ 5,727
Python library for converting Python calculations into rendered latex.
voila-dashboards/voila ⭐ 5,617
Voilà turns Jupyter notebooks into standalone web applications
🔗 voila.readthedocs.io
jupyterlite/jupyterlite ⭐ 4,053
Wasm powered Jupyter running in the browser 💡
🔗 jupyterlite.rtfd.io/en/stable/try/lab
executablebooks/jupyter-book ⭐ 3,999
Create beautiful, publication-quality books and documents from computational content.
🔗 next.jupyterbook.org
jupyterlab/jupyterlab-desktop ⭐ 3,923
JupyterLab desktop application, based on Electron.
jupyterlab/jupyter-ai ⭐ 3,493
A generative AI extension for JupyterLab
🔗 jupyter-ai.readthedocs.io
jupyter-widgets/ipywidgets ⭐ 3,206
Interactive Widgets for the Jupyter Notebook
🔗 ipywidgets.readthedocs.io
quantopian/qgrid ⭐ 3,068
An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks
jupyter/nbdime ⭐ 2,716
Tools for diffing and merging of Jupyter notebooks.
🔗 nbdime.readthedocs.io
mito-ds/mito ⭐ 2,356
Jupyter extensions that help you write code faster: Context aware AI Chat, Autocomplete, and Spreadsheet
🔗 trymito.io
jupyter/nbviewer ⭐ 2,236
nbconvert as a web service: Render Jupyter Notebooks as static web pages
🔗 nbviewer.jupyter.org
maartenbreddels/ipyvolume ⭐ 1,956
3d plotting for Python in the Jupyter notebook based on IPython widgets using WebGL
jupyter-lsp/jupyterlab-lsp ⭐ 1,856
Coding assistance for JupyterLab (code navigation + hover suggestions + linters + autocompletion + rename) using Language Server Protocol
🔗 jupyterlab-lsp.readthedocs.io
jupyter/nbconvert ⭐ 1,801
Jupyter Notebook Conversion
🔗 nbconvert.readthedocs.io
koaning/drawdata ⭐ 1,282
Draw datasets from within Python notebooks.
8080labs/pyforest ⭐ 1,110
With pyforest you can use all your favorite Python libraries without importing them before. If you use a package that is not imported yet, pyforest imports the package for you and adds the code to the first Jupyter cell.
🔗 8080labs.com
nbqa-dev/nbQA ⭐ 1,094
Run ruff, isort, pyupgrade, mypy, pylint, flake8, and more on Jupyter Notebooks
🔗 nbqa.readthedocs.io/en/latest/index.html
vizzuhq/ipyvizzu ⭐ 961
Build animated charts in Jupyter Notebook and similar environments with a simple Python syntax.
🔗 ipyvizzu.vizzuhq.com
aws/graph-notebook ⭐ 759
Library extending Jupyter notebooks to integrate with Apache TinkerPop, openCypher, and RDF SPARQL.
🔗 github.com/aws/graph-notebook
linealabs/lineapy ⭐ 666
Move fast from data science prototype to pipeline. Capture, analyze, and transform messy notebooks into data pipelines with just two lines of code.
🔗 lineapy.org
xiaohk/stickyland ⭐ 565
Break the linear presentation of Jupyter Notebooks with sticky cells!
🔗 xiaohk.github.io/stickyland
infuseai/colab-xterm ⭐ 437
Open a terminal in colab, including the free tier.

LLMs and ChatGPT

Large language model and GPT libraries and frameworks: auto-gpt, agents, QnA, chain-of-thought workflows, API integrations. Also see the Natural Language Processing category for crossover.

significant-gravitas/AutoGPT ⭐ 173,516
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
🔗 agpt.co
deepseek-ai/DeepSeek-V3 ⭐ 92,448
A strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token.
open-webui/open-webui ⭐ 84,123
Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with a built-in inference engine for RAG
🔗 openwebui.com
ggerganov/llama.cpp ⭐ 76,764
LLM inference in C/C++
nomic-ai/gpt4all ⭐ 72,850
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
🔗 nomic.ai/gpt4all
xtekky/gpt4free ⭐ 63,835
The official gpt4free repository | a collection of powerful language models | o3 and deepseek r1, gpt-4.5
🔗 t.me/g4f_channel
killianlucas/open-interpreter ⭐ 58,806
A natural language interface for computers
🔗 openinterpreter.com
facebookresearch/llama ⭐ 57,879
Inference code for Llama models
imartinez/private-gpt ⭐ 55,462
Interact with your documents using the power of GPT, 100% privately, no data leaks
🔗 privategpt.dev
gpt-engineer-org/gpt-engineer ⭐ 53,409
CLI platform to experiment with codegen. Precursor to: https://lovable.dev
xai-org/grok-1 ⭐ 50,246
This repository contains JAX example code for loading and running the Grok-1 open-weights model.
infiniflow/ragflow ⭐ 45,392
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
🔗 ragflow.io
hiyouga/LLaMA-Factory ⭐ 44,593
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
🔗 huggingface.co/papers/2403.13372
oobabooga/text-generation-webui ⭐ 42,908
A Gradio web UI for Large Language Models with support for multiple inference backends.
🔗 oobabooga.gumroad.com/l/deep_reason
vllm-project/vllm ⭐ 41,818
A high-throughput and memory-efficient inference and serving engine for LLMs
🔗 docs.vllm.ai
thudm/ChatGLM-6B ⭐ 41,019
ChatGLM-6B: An Open Bilingual Dialogue Language Model
hpcaitech/ColossalAI ⭐ 40,617
Making large AI models cheaper, faster and more accessible
🔗 www.colossalai.org
karpathy/nanoGPT ⭐ 40,137
The simplest, fastest repository for training/finetuning medium-sized GPTs.
lm-sys/FastChat ⭐ 38,133
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
quivrhq/quivr ⭐ 37,549
Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: PGVector, Faiss. Any Files. Any way you want.
🔗 core.quivr.com
laion-ai/Open-Assistant ⭐ 37,255
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
🔗 open-assistant.io
unslothai/unsloth ⭐ 35,076
Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
🔗 unsloth.ai
moymix/TaskMatrix ⭐ 34,515
Connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting.
unclecode/crawl4ai ⭐ 33,599
AI-ready web crawling tailored for LLMs, AI agents, and data pipelines. Open source, flexible, and built for real-time performance, Crawl4AI empowers developers with unmatched speed, precision, and deployment ease.
🔗 crawl4ai.com
pythagora-io/gpt-pilot ⭐ 32,498
The first real AI developer
danielmiessler/fabric ⭐ 30,053
fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
🔗 danielmiessler.com/p/fabric-origin-story
tatsu-lab/stanford_alpaca ⭐ 29,878
Code and documentation to train Stanford's Alpaca models, and generate the data.
🔗 crfm.stanford.edu/2023/03/13/alpaca.html
meta-llama/llama3 ⭐ 28,535
The official Meta Llama 3 GitHub site
exo-explore/exo ⭐ 26,735
Run your own AI cluster at home. Unify your existing devices into one powerful GPU: iPhone, iPad, Android, Mac, NVIDIA, Raspberry Pi etc
khoj-ai/khoj ⭐ 26,692
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI
🔗 khoj.dev
embedchain/mem0 ⭐ 26,370
The Memory layer for AI Agents
🔗 mem0.ai
karpathy/llm.c ⭐ 26,041
LLM training in simple, pure C/CUDA. There is no need for 245MB of PyTorch or 107MB of CPython
vision-cair/MiniGPT-4 ⭐ 25,607
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
🔗 minigpt-4.github.io
microsoft/JARVIS ⭐ 24,025
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
microsoft/graphrag ⭐ 23,579
A modular graph-based Retrieval-Augmented Generation (RAG) system
🔗 microsoft.github.io/graphrag
microsoft/semantic-kernel ⭐ 23,567
An SDK that integrates LLMs like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C#, Python, and Java
🔗 aka.ms/semantic-kernel
openai/gpt-2 ⭐ 23,200
Code for the paper "Language Models are Unsupervised Multitask Learners"
🔗 openai.com/blog/better-language-models
huggingface/open-r1 ⭐ 22,942
The goal of this repo is to build the missing pieces of the R1 pipeline such that everybody can reproduce and build on top of it
pathwaycom/llm-app ⭐ 22,942
Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳Docker-friendly.⚡Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
🔗 pathway.com/developers/templates
stanfordnlp/dspy ⭐ 22,486
DSPy: The framework for programming—not prompting—language models
🔗 dspy.ai
haotian-liu/LLaVA ⭐ 21,864
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
🔗 llava.hliu.cc
cinnamon/kotaemon ⭐ 21,708
An open-source RAG UI for chatting with your documents. Built with both end users and developers in mind
🔗 cinnamon.github.io/kotaemon
karpathy/minGPT ⭐ 21,585
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
openai/chatgpt-retrieval-plugin ⭐ 21,145
The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.
mlc-ai/mlc-llm ⭐ 20,207
Universal LLM Deployment Engine with ML Compilation
🔗 llm.mlc.ai
guidance-ai/guidance ⭐ 19,900
A guidance language for controlling large language models.
deepset-ai/haystack ⭐ 19,841
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversatio...
🔗 haystack.deepset.ai
modelcontextprotocol/servers ⭐ 19,769
A collection of reference implementations for the Model Context Protocol (MCP), as well as references to community built servers
🔗 modelcontextprotocol.io
rasahq/rasa ⭐ 19,670
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
🔗 rasa.com/docs/rasa
berriai/litellm ⭐ 19,266
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
🔗 docs.litellm.ai/docs
stitionai/devika ⭐ 19,069
Devika is an advanced AI software engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective.
tloen/alpaca-lora ⭐ 18,839
Instruct-tune LLaMA on consumer hardware
karpathy/llama2.c ⭐ 18,184
Inference Llama 2 in one file of pure C
huggingface/peft ⭐ 17,797
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
🔗 huggingface.co/docs/peft
qwenlm/Qwen ⭐ 17,476
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
facebookresearch/llama-cookbook ⭐ 16,474
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using the Llama model family and how to use the models on various provider services
🔗 www.llama.com
dao-ailab/flash-attention ⭐ 16,371
Fast and memory-efficient exact attention
facebookresearch/codellama ⭐ 16,242
Inference code for CodeLlama models
transformeroptimus/SuperAGI ⭐ 16,078
<⚡️> SuperAGI - A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.
🔗 superagi.com
idea-research/Grounded-Segment-Anything ⭐ 15,938
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
🔗 arxiv.org/abs/2401.14159
thudm/ChatGLM2-6B ⭐ 15,753
ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型
openai/evals ⭐ 15,738
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
mayooear/ai-pdf-chatbot-langchain ⭐ 15,215
LangChain & LangGraph AI PDF chatbot agent
🔗 www.youtube.com/watch?v=ih9pbgvvoo4
mlc-ai/web-llm ⭐ 15,017
High-performance In-browser LLM Inference Engine
🔗 webllm.mlc.ai
fauxpilot/fauxpilot ⭐ 14,686
FauxPilot - an open-source alternative to GitHub Copilot server
vanna-ai/vanna ⭐ 14,277
🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.
🔗 vanna.ai/docs
nirdiamant/RAG_Techniques ⭐ 13,448
The most comprehensive and dynamic collection of Retrieval-Augmented Generation (RAG) tutorials available today. This repository serves as a hub for cutting-edge techniques aimed at enhancing the accuracy, efficiency, and contextual richness of RAG systems.
blinkdl/RWKV-LM ⭐ 13,374
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and f...
microsoft/BitNet ⭐ 12,812
Official inference framework for 1-bit LLMs (e.g., BitNet b1.58). It offers a suite of optimized kernels that support fast and lossless inference of 1.58-bit models
skyvern-ai/skyvern ⭐ 12,669
Skyvern automates browser-based workflows using LLMs and computer vision. It provides a simple API endpoint to fully automate manual workflows, replacing brittle or unreliable automation solutions.
🔗 www.skyvern.com
lvwerra/trl ⭐ 12,589
Train transformer language models with reinforcement learning.
🔗 hf.co/docs/trl
paddlepaddle/PaddleNLP ⭐ 12,433
👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
🔗 paddlenlp.readthedocs.io
sgl-project/sglang ⭐ 12,074
SGLang is a fast serving framework for large language models and vision language models.
🔗 docs.sglang.ai
openlmlab/MOSS ⭐ 12,033
An open-source tool-augmented conversational language model from Fudan University
🔗 txsun1997.github.io/blogs/moss.html
shishirpatil/gorilla ⭐ 11,903
Enables LLMs to use tools by invoking APIs. Given a query, Gorilla comes up with the semantically and syntactically correct API.
🔗 gorilla.cs.berkeley.edu
lightning-ai/litgpt ⭐ 11,818
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
🔗 lightning.ai
nvidia/Megatron-LM ⭐ 11,806
Ongoing research training transformer models at scale
🔗 docs.nvidia.com/megatron-core/developer-guide/latest/user-guide/index.html#quick-start
h2oai/h2ogpt ⭐ 11,726
Private chat with local GPT with documents, images, video, etc. 100% private, Apache 2.0. Supports Ollama, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
🔗 h2o.ai
andrewyng/aisuite ⭐ 11,658
Simple, unified interface to multiple Generative AI providers. aisuite makes it easy for developers to use multiple LLMs through a standardized interface.
microsoft/LoRA ⭐ 11,525
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
🔗 arxiv.org/abs/2106.09685
llmware-ai/llmware ⭐ 11,272
Unified framework for building enterprise RAG pipelines with small, specialized models
🔗 llmware-ai.github.io/llmware
jiayi-pan/TinyZero ⭐ 11,221
TinyZero is a reproduction of DeepSeek R1 Zero in countdown and multiplication tasks.
anthropics/anthropic-cookbook ⭐ 11,169
Provides code and guides designed to help developers build with Claude, offering copy-able code snippets that you can easily integrate into your own projects.
dottxt-ai/outlines ⭐ 11,081
Structured Text Generation from LLMs
🔗 dottxt-ai.github.io/outlines
google-research/vision_transformer ⭐ 11,052
Vision Transformer and MLP-Mixer Architectures
databrickslabs/dolly ⭐ 10,817
Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
🔗 www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html
swivid/F5-TTS ⭐ 10,461
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
🔗 arxiv.org/abs/2410.06885
artidoro/qlora ⭐ 10,315
QLoRA: Efficient Finetuning of Quantized LLMs
🔗 arxiv.org/abs/2305.14314
microsoft/promptflow ⭐ 10,106
Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
🔗 microsoft.github.io/promptflow
mistralai/mistral-inference ⭐ 10,103
Official inference library for Mistral models
🔗 mistral.ai
instructor-ai/instructor ⭐ 9,818
Instructor is a Python library that makes it a breeze to work with structured outputs from large language models (LLMs). Built on top of Pydantic, it provides a simple, transparent, and user-friendly API to manage validation, retries, and streaming responses.
🔗 python.useinstructor.com
karpathy/minbpe ⭐ 9,491
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
mshumer/gpt-prompt-engineer ⭐ 9,491
Simply input a description of your task and some test cases, and the system will generate, test, and rank a multitude of prompts to find the ones that perform the best.
blinkdl/ChatRWKV ⭐ 9,464
ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
chainlit/chainlit ⭐ 8,902
Build Conversational AI in minutes ⚡️
🔗 docs.chainlit.io
axolotl-ai-cloud/axolotl ⭐ 8,881
Go ahead and axolotl questions
🔗 axolotl-ai-cloud.github.io/axolotl
abetlen/llama-cpp-python ⭐ 8,829
Simple Python bindings for @ggerganov's llama.cpp library.
🔗 llama-cpp-python.readthedocs.io
apple/ml-ferret ⭐ 8,597
Ferret: Refer and Ground Anything Anywhere at Any Granularity
explodinggradients/ragas ⭐ 8,500
Supercharge Your LLM Application Evaluations 🚀
🔗 docs.ragas.io
thudm/CodeGeeX ⭐ 8,432
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
🔗 codegeex.cn
optimalscale/LMFlow ⭐ 8,375
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
🔗 optimalscale.github.io/lmflow
jzhang38/TinyLlama ⭐ 8,313
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
eleutherai/lm-evaluation-harness ⭐ 8,287
A framework for few-shot evaluation of language models.
🔗 www.eleuther.ai
eleutherai/gpt-neo ⭐ 8,287
An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
🔗 www.eleuther.ai
anthropics/anthropic-quickstarts ⭐ 8,271
A collection of projects designed to help developers quickly get started with building applications using the Anthropic API. Each quickstart provides a foundation that you can easily build upon and customize for your specific needs.
vaibhavs10/insanely-fast-whisper ⭐ 8,197
An opinionated CLI to transcribe Audio files w/ Whisper on-device! Powered by 🤗 Transformers, Optimum & flash-attn
sjtu-ipads/PowerInfer ⭐ 8,153
High-speed Large Language Model Serving for Local Deployment
lianjiatech/BELLE ⭐ 8,081
BELLE: Be Everyone's Large Language model Engine（开源中文对话大模型）
plachtaa/VALL-E-X ⭐ 7,828
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/
01-ai/Yi ⭐ 7,826
The Yi series models are the next generation of open-source large language models trained from scratch by 01.AI.
🔗 01.ai
e2b-dev/E2B ⭐ 7,762
E2B is an open-source infrastructure that allows you to run AI-generated code in secure isolated sandboxes in the cloud
🔗 e2b.dev/docs
thudm/GLM-130B ⭐ 7,680
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
vikhyat/moondream ⭐ 7,642
A tiny open-source computer-vision language model designed to run efficiently on edge devices
🔗 moondream.ai
skypilot-org/skypilot ⭐ 7,536
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 15+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
🔗 docs.skypilot.co
sweepai/sweep ⭐ 7,529
Sweep: AI coding assistant for JetBrains
🔗 sweep.dev
zilliztech/GPTCache ⭐ 7,462
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
🔗 gptcache.readthedocs.io
openlm-research/open_llama ⭐ 7,461
OpenLLaMA: An Open Reproduction of LLaMA
bigcode-project/starcoder ⭐ 7,394
Home of StarCoder: fine-tuning & inference!
eleutherai/gpt-neox ⭐ 7,131
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
🔗 www.eleuther.ai
bhaskatripathi/pdfGPT ⭐ 7,083
PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your PDF files into a chatbot!
🔗 huggingface.co/spaces/bhaskartripathi/pdfchatter
future-house/paper-qa ⭐ 7,082
High-accuracy retrieval augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature
🔗 futurehouse.gitbook.io/futurehouse-cookbook
canner/WrenAI ⭐ 7,068
Open-source GenBI AI Agent that empowers data-driven teams to chat with their data to generate Text-to-SQL, charts, spreadsheets, reports, and BI.
🔗 getwren.ai/oss
apple/corenet ⭐ 6,999
CoreNet is a deep neural network toolkit that allows researchers and engineers to train standard and novel small and large-scale models for a variety of tasks, including foundation models (e.g., CLIP and LLM), object classification, object detection, and semantic segmentation.
weaviate/Verba ⭐ 6,946
Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
mit-han-lab/streaming-llm ⭐ 6,830
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
🔗 arxiv.org/abs/2309.17453
internlm/InternLM ⭐ 6,820
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
🔗 internlm.intern-ai.org.cn
langchain-ai/opengpts ⭐ 6,598
An open source effort to create a similar experience to OpenAI's GPTs and Assistants API.
run-llama/rags ⭐ 6,429
RAGs is a Streamlit app that lets you create a RAG pipeline from a data source using natural language.
modelscope/ms-swift ⭐ 6,346
Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2.5, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL2, Phi3.5-Vision, GOT-OCR2, ...).
🔗 swift.readthedocs.io/zh-cn/latest
nat/openplayground ⭐ 6,340
An LLM playground you can run on your laptop
lightning-ai/lit-llama ⭐ 6,042
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
minedojo/Voyager ⭐ 5,978
An Open-Ended Embodied Agent with Large Language Models
🔗 voyager.minedojo.org
pytorch-labs/gpt-fast ⭐ 5,889
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
promptfoo/promptfoo ⭐ 5,880
Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
🔗 promptfoo.dev
langchain-ai/chat-langchain ⭐ 5,771
Locally hosted chatbot specifically focused on question answering over the LangChain documentation
🔗 chat.langchain.com
lyogavin/airllm ⭐ 5,741
AirLLM optimizes inference memory usage, allowing 70B large language models to run inference on a single 4GB GPU card without quantization, distillation and pruning. And you can run 405B Llama3.1 on 8GB vram now.
qwenlm/Qwen-VL ⭐ 5,649
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
microsoft/promptbase ⭐ 5,573
promptbase is an evolving collection of resources, best practices, and example scripts for eliciting the best performance from foundation models.
cg123/mergekit ⭐ 5,437
Tools for merging pretrained large language models.
arcee-ai/mergekit ⭐ 5,437
Tools for merging pretrained large language models.
allenai/OLMo ⭐ 5,394
OLMo is a repository for training and using AI2's state-of-the-art open language models. It is designed by scientists, for scientists.
🔗 allenai.org/olmo
dsdanielpark/Bard-API ⭐ 5,283
The unofficial Python package that returns the response of Google Bard through a cookie value.
🔗 pypi.org/project/bardapi
pipecat-ai/pipecat ⭐ 5,221
Open Source framework for voice and multimodal conversational AI
volcengine/verl ⭐ 5,007
veRL is a flexible, efficient and production-ready RL training library for large language models (LLMs).
🔗 verl.readthedocs.io/en/latest/index.html
open-compass/opencompass ⭐ 4,954
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc) over 100+ datasets.
🔗 opencompass.org.cn
microsoft/LLMLingua ⭐ 4,946
[EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
🔗 llmlingua.com
openbmb/ToolBench ⭐ 4,932
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
🔗 openbmb.github.io/toolbench
togethercomputer/RedPajama-Data ⭐ 4,678
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
linkedin/Liger-Kernel ⭐ 4,672
Efficient Triton Kernels for LLM Training
🔗 arxiv.org/pdf/2410.10989
1rgs/jsonformer ⭐ 4,646
A Bulletproof Way to Generate Structured JSON from Language Models
guardrails-ai/guardrails ⭐ 4,644
Open-source Python package for specifying structure and type, validating and correcting the outputs of large language models (LLMs)
🔗 www.guardrailsai.com/docs
nvidia/NeMo-Guardrails ⭐ 4,528
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
kyegomez/tree-of-thoughts ⭐ 4,463
Plug in and Play Implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models that Elevates Model Reasoning by at least 70%
🔗 discord.gg/qutxnk2nmf
katanaml/sparrow ⭐ 4,416
Sparrow is a solution for efficient data extraction and processing from various documents and images like invoices and receipts
🔗 sparrow.katanaml.io
microsoft/BioGPT ⭐ 4,382
Implementation of BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
modelcontextprotocol/python-sdk ⭐ 4,345
The Model Context Protocol allows applications to provide context for LLMs in a standardized way, separating the concerns of providing context from the actual LLM interaction.
🔗 modelcontextprotocol.io
yizhongw/self-instruct ⭐ 4,311
Aligning pretrained language models with instruction data generated by themselves.
instruction-tuning-with-gpt-4/GPT-4-LLM ⭐ 4,281
Instruction Tuning with GPT-4
🔗 instruction-tuning-with-gpt-4.github.io
h2oai/h2o-llmstudio ⭐ 4,224
H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/
🔗 h2o.ai
ragapp/ragapp ⭐ 4,153
The easiest way to use Agentic RAG in any enterprise
mshumer/gpt-llm-trainer ⭐ 4,097
Input a description of your task, and the system will generate a dataset, parse it, and fine-tune a LLaMA 2 model for you
turboderp/exllamav2 ⭐ 4,053
A fast inference library for running LLMs locally on modern consumer-class GPUs
agiresearch/AIOS ⭐ 3,938
AIOS, a Large Language Model (LLM) Agent operating system, embeds large language model into Operating Systems (OS) as the brain of the OS, enabling an operating system "with soul" -- an important step towards AGI.
🔗 aios.foundation
truefoundry/cognita ⭐ 3,929
RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
🔗 cognita.truefoundry.com
microsoft/LMOps ⭐ 3,888
General technology for enabling AI capabilities w/ LLMs and MLLMs
🔗 aka.ms/generalai
eth-sri/lmql ⭐ 3,863
A language for constraint-guided and efficient LLM programming.
🔗 lmql.ai
ravenscroftj/turbopilot ⭐ 3,822
Turbopilot is an open source large-language-model based code completion engine that runs locally on CPU
llm-attacks/llm-attacks ⭐ 3,774
This is the official repository for "Universal and Transferable Adversarial Attacks on Aligned Language Models"
🔗 llm-attacks.org
lm-sys/RouteLLM ⭐ 3,728
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality
marker-inc-korea/AutoRAG ⭐ 3,687
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
🔗 auto-rag.com
mmabrouk/llm-workflow-engine ⭐ 3,681
Power CLI and Workflow manager for LLMs (core package)
defog-ai/sqlcoder ⭐ 3,663
SoTA LLM for converting natural language questions to SQL queries
nirdiamant/Prompt_Engineering ⭐ 3,515
A comprehensive collection of tutorials and implementations for Prompt Engineering techniques, ranging from fundamental concepts to advanced strategies.
minimaxir/simpleaichat ⭐ 3,503
Python package for easily interfacing with chat apps, with robust features and minimal code complexity.
next-gpt/NExT-GPT ⭐ 3,464
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
🔗 next-gpt.github.io
iryna-kondr/scikit-llm ⭐ 3,432
Seamlessly integrate LLMs into scikit-learn.
🔗 beastbyte.ai
minimaxir/gpt-2-simple ⭐ 3,400
Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts
jaymody/picoGPT ⭐ 3,328
An unnecessarily tiny implementation of GPT-2 in NumPy.
bclavie/RAGatouille ⭐ 3,320
Bridging the gap between state-of-the-art research and alchemical RAG pipeline practices.
huggingface/text-embeddings-inference ⭐ 3,319
A blazing fast inference solution for text embeddings models
🔗 huggingface.co/docs/text-embeddings-inference/quick_tour
deep-diver/LLM-As-Chatbot ⭐ 3,308
LLM as a Chatbot Service
deep-agent/R1-V ⭐ 3,294
We are building a general framework for Reinforcement Learning with Verifiable Rewards (RLVR) in VLM. RLVR outperforms chain-of-thought supervised fine-tuning (CoT-SFT) in both effectiveness and out-of-distribution (OOD) robustness for vision language models.
luodian/Otter ⭐ 3,241
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
🔗 otter-ntu.github.io
kiln-ai/Kiln ⭐ 3,204
The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.
🔗 getkiln.ai
vllm-project/aibrix ⭐ 3,192
AIBrix delivers a cloud-native solution optimized for deploying, managing, and scaling large language model (LLM) inference, tailored specifically to enterprise needs.
novasky-ai/SkyThought ⭐ 3,136
Sky-T1: Train your own O1 preview model within $450
🔗 novasky-ai.github.io
microsoft/torchscale ⭐ 3,063
Foundation Architecture for (M)LLMs
🔗 aka.ms/generalai
verazuo/jailbreak_llms ⭐ 3,013
Official repo for the ACM CCS 2024 paper "Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts
🔗 jailbreak-llms.xinyueshen.me
cohere-ai/cohere-toolkit ⭐ 3,000
Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.
baichuan-inc/Baichuan-13B ⭐ 2,979
A 13B large language model developed by Baichuan Intelligent Technology
🔗 huggingface.co/baichuan-inc/baichuan-13b-chat
li-plus/chatglm.cpp ⭐ 2,972
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
lightning-ai/LitServe ⭐ 2,969
Lightning-fast serving engine for any AI model of any size. Flexible. Easy. Enterprise-scale.
🔗 lightning.ai/litserve
meta-llama/PurpleLlama ⭐ 2,966
Set of tools to assess and improve LLM security. An umbrella project to bring together tools and evals to help the community build responsibly with open genai models.
freedomintelligence/LLMZoo ⭐ 2,937
⚡LLM Zoo is a project that provides data, models, and evaluation benchmark for large language models.⚡
mistralai/mistral-finetune ⭐ 2,887
A light-weight codebase that enables memory-efficient and performant finetuning of Mistral's models. It is based on LoRA.
sylphai-inc/AdalFlow ⭐ 2,883
Unified auto-differentiative framework for both zero-shot prompt optimization and few-shot optimization. It advances existing auto-optimization research, including Text-Grad and DSPy
🔗 adalflow.sylph.ai
mit-han-lab/llm-awq ⭐ 2,848
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
hegelai/prompttools ⭐ 2,815
Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).
🔗 prompttools.readthedocs.io
juncongmoo/pyllama ⭐ 2,803
LLaMA: Open and Efficient Foundation Language Models
alpha-vllm/LLaMA2-Accessory ⭐ 2,763
An Open-source Toolkit for LLM Development
🔗 llama2-accessory.readthedocs.io
paperswithcode/galai ⭐ 2,709
Model API for GALACTICA
cheshire-cat-ai/core ⭐ 2,679
AI agent microservice
🔗 cheshirecat.ai
predibase/lorax ⭐ 2,628
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
🔗 loraexchange.ai
noahshinn/reflexion ⭐ 2,625
[NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning
deepseek-ai/DualPipe ⭐ 2,604
DualPipe is an innovative bidirectional pipeline parallelism algorithm introduced in the DeepSeek-V3 Technical Report.
pytorch/executorch ⭐ 2,603
An end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of PyTorch models to edge devices.
🔗 pytorch.org/executorch
janhq/cortex.cpp ⭐ 2,559
Cortex is a Local AI API Platform that is used to run and customize LLMs.
🔗 cortex.so
argilla-io/distilabel ⭐ 2,559
Distilabel is the framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
🔗 distilabel.argilla.io
databricks/dbrx ⭐ 2,544
Code examples and resources for DBRX, a large language model developed by Databricks
🔗 www.databricks.com
roboflow/maestro ⭐ 2,501
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
🔗 maestro.roboflow.com
ofa-sys/OFA ⭐ 2,478
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
young-geng/EasyLM ⭐ 2,460
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.
openai/simple-evals ⭐ 2,450
Lightweight library for evaluating language models
flashinfer-ai/flashinfer ⭐ 2,404
FlashInfer is a library and kernel generator for Large Language Models that provides high-performance implementation of LLM GPU kernels such as FlashAttention, SparseAttention, PageAttention, Sampling
🔗 flashinfer.ai
truera/trulens ⭐ 2,381
Evaluation and Tracking for LLM Experiments
🔗 www.trulens.org
intel/neural-compressor ⭐ 2,355
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
🔗 intel.github.io/neural-compressor
civitai/sd_civitai_extension ⭐ 2,352
All of the Civitai models inside Automatic 1111 Stable Diffusion Web UI
spcl/graph-of-thoughts ⭐ 2,319
Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"
🔗 arxiv.org/pdf/2308.09687.pdf
uptrain-ai/uptrain ⭐ 2,247
An open-source unified platform to evaluate and improve Generative AI applications. Provide grades for 20+ preconfigured evaluations (covering language, code, embedding use cases)
🔗 uptrain.ai
agenta-ai/agenta ⭐ 2,224
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM Observability all in one place.
🔗 www.agenta.ai
azure-samples/graphrag-accelerator ⭐ 2,223
One-click deploy of a Knowledge Graph powered RAG (GraphRAG) in Azure
🔗 github.com/microsoft/graphrag
evolvinglmms-lab/lmms-eval ⭐ 2,216
Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
🔗 lmms-lab.framer.ai
openai/finetune-transformer-lm ⭐ 2,195
Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
🔗 s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
tairov/llama2.mojo ⭐ 2,112
Inference Llama 2 in one file of pure 🔥
🔗 www.modular.com/blog/community-spotlight-how-i-built-llama2-by-aydyn-tairov
ist-daslab/gptq ⭐ 2,061
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
🔗 arxiv.org/abs/2210.17323
openai/image-gpt ⭐ 2,055
Archived. Code and models from the paper "Generative Pretraining from Pixels"
facebookresearch/large_concept_model ⭐ 2,031
Large Concept Models: Language modeling in a sentence representation space
huggingface/smollm ⭐ 2,020
Everything about the SmolLM2 and SmolVLM family of models
🔗 huggingface.co/huggingfacetb
microsoft/Megatron-DeepSpeed ⭐ 2,018
Ongoing research training transformer language models at scale, including: BERT & GPT-2
casper-hansen/AutoAWQ ⭐ 2,014
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
🔗 casper-hansen.github.io/autoawq
lucidrains/toolformer-pytorch ⭐ 2,013
Implementation of Toolformer, Language Models That Can Use Tools, by MetaAI
akariasai/self-rag ⭐ 2,005
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
🔗 selfrag.github.io
epfllm/meditron ⭐ 1,989
Meditron is a suite of open-source medical Large Language Models (LLMs).
🔗 huggingface.co/epfl-llm
neulab/prompt2model ⭐ 1,983
A system that takes a natural language task description and trains a small special-purpose model that is conducive to deployment.
ruc-nlpir/FlashRAG ⭐ 1,978
FlashRAG is a Python toolkit for the reproduction and development of RAG research. Our toolkit includes 36 pre-processed benchmark RAG datasets and 15 state-of-the-art RAG algorithms.
🔗 arxiv.org/abs/2405.13576
openai/gpt-2-output-dataset ⭐ 1,967
Dataset of GPT-2 outputs for research in detection, biases, and more
facebookresearch/chameleon ⭐ 1,949
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
🔗 arxiv.org/abs/2405.09818
minimaxir/aitextgen ⭐ 1,841
A robust Python tool for text-based AI training and generation using GPT-2.
🔗 docs.aitextgen.io
openai/gpt-discord-bot ⭐ 1,804
Example Discord bot written in Python that uses the completions API to have conversations with the text-davinci-003 model, and the moderations API to filter the messages.
ray-project/llm-applications ⭐ 1,778
A comprehensive guide to building RAG-based LLM applications for production.
noamgat/lm-format-enforcer ⭐ 1,736
Enforce the output format (JSON Schema, Regex etc) of a language model
huggingface/nanotron ⭐ 1,691
Minimalistic large language model 3D-parallelism training
ai-hypercomputer/maxtext ⭐ 1,655
MaxText is a high performance, highly scalable, open-source LLM written in pure Python/Jax and targeting Google Cloud TPUs and GPUs for training and inference.
qwenlm/Qwen-Audio ⭐ 1,633
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
agentops-ai/tokencost ⭐ 1,605
Easy token price estimates for 400+ LLMs. TokenOps.
🔗 agentops.ai
illuin-tech/colpali ⭐ 1,604
Code used for training the vision retrievers in the ColPali: Efficient Document Retrieval with Vision Language Models paper
🔗 huggingface.co/vidore
jina-ai/thinkgpt ⭐ 1,569
Agent techniques to augment your LLM and push it beyond its limits
meetkai/functionary ⭐ 1,532
Chat language model that can use tools and interpret the results
hiyouga/EasyR1 ⭐ 1,528
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
🔗 verl.readthedocs.io/en/latest/index.html
protectai/llm-guard ⭐ 1,509
Sanitization, detection of harmful language, prevention of data leakage, and resistance against prompt injection attacks for LLMs
🔗 llm-guard.com
run-llama/llama-lab ⭐ 1,456
Llama Lab is a repo dedicated to building cutting-edge projects using LlamaIndex
farizrahman4u/loopgpt ⭐ 1,450
Re-implementation of Auto-GPT as a python package, written with modularity and extensibility in mind.
cstankonrad/long_llama ⭐ 1,450
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
topoteretes/cognee ⭐ 1,446
Reliable LLM Memory for AI Applications and AI Agents
🔗 www.cognee.ai
chatarena/chatarena ⭐ 1,434
ChatArena (or Chat Arena) is a set of multi-agent language game environments for LLMs. The goal is to develop communication and collaboration capabilities of AIs.
bigscience-workshop/Megatron-DeepSpeed ⭐ 1,376
Ongoing research training transformer language models at scale, including: BERT & GPT-2
explosion/spacy-transformers ⭐ 1,369
🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
🔗 spacy.io/usage/embeddings-transformers
karpathy/nano-llama31 ⭐ 1,338
This repo is to Llama 3.1 what nanoGPT is to GPT-2, i.e. it is a minimal, dependency-free implementation of the Llama 3.1 architecture
answerdotai/rerankers ⭐ 1,337
Welcome to rerankers! Our goal is to provide users with a simple API to use any reranking models.
huggingface/lighteval ⭐ 1,301
LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.
facebookresearch/MobileLLM ⭐ 1,265
Training code of MobileLLM introduced in our work: "MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases"
ray-project/ray-llm ⭐ 1,262
RayLLM - LLMs on Ray (Archived). Read README for more info.
🔗 docs.ray.io/en/latest
mlfoundations/dclm ⭐ 1,262
DataComp for Language Models
googleapis/python-genai ⭐ 1,228
Google Gen AI Python SDK provides an interface for developers to integrate Google's generative models into their Python applications.
🔗 googleapis.github.io/python-genai
keirp/automatic_prompt_engineer ⭐ 1,226
Large Language Models Are Human-Level Prompt Engineers
srush/MiniChain ⭐ 1,225
A tiny library for coding with large language models.
🔗 srush-minichain.hf.space
hao-ai-lab/LookaheadDecoding ⭐ 1,216
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
🔗 arxiv.org/abs/2402.02057
explosion/spacy-llm ⭐ 1,213
🦙 Integrating LLMs into structured NLP pipelines
🔗 spacy.io/usage/large-language-models
protectai/rebuff ⭐ 1,208
Rebuff is designed to protect AI applications from prompt injection (PI) attacks through a multi-layered defense
🔗 playground.rebuff.ai
ibm/Dromedary ⭐ 1,142
Dromedary: towards helpful, ethical and reliable LLMs.
lupantech/chameleon-llm ⭐ 1,120
Code for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models".
🔗 chameleon-llm.github.io
vllm-project/llm-compressor ⭐ 1,101
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
minishlab/model2vec ⭐ 1,098
Model2Vec is a technique to turn any sentence transformer into a really small static model, reducing model size by 15x and making the models up to 500x faster, with a small drop in performance
🔗 minishlab.github.io
nirdiamant/Controllable-RAG-Agent ⭐ 1,093
An advanced Retrieval-Augmented Generation (RAG) solution designed to tackle complex questions that simple semantic similarity-based retrieval cannot solve
huggingface/evaluation-guidebook ⭐ 1,078
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
deepseek-ai/EPLB ⭐ 1,076
Expert Parallelism Load Balancer across GPUs
rlancemartin/auto-evaluator ⭐ 1,071
Evaluation tool for LLM QA chains
🔗 autoevaluator.langchain.com
cerebras/modelzoo ⭐ 1,030
Examples of common deep learning models that can be trained on Cerebras hardware
ctlllll/LLM-ToolMaker ⭐ 1,027
Large Language Models as Tool Makers
microsoft/Llama-2-Onnx ⭐ 1,025
A Microsoft optimized version of the Llama 2 model, available from Meta
nomic-ai/pygpt4all ⭐ 1,020
Officially supported Python bindings for llama.cpp + gpt4all
🔗 nomic-ai.github.io/pygpt4all
pinecone-io/canopy ⭐ 1,007
Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone
🔗 www.pinecone.io
ajndkr/lanarky ⭐ 987
The web framework for building LLM microservices
🔗 lanarky.ajndkr.com
datadreamer-dev/DataDreamer ⭐ 983
DataDreamer is a powerful open-source Python library for prompting, synthetic data generation, and training workflows. It is designed to be simple, extremely efficient, and research-grade.
🔗 datadreamer.dev
likejazz/llama3.np ⭐ 974
llama3.np is a pure NumPy implementation for Llama 3 model.
huggingface/picotron ⭐ 937
Minimalist & most-hackable repository for pre-training Llama-like models with 4D Parallelism (Data, Tensor, Pipeline, Context parallel)
huggingface/optimum-nvidia ⭐ 937
Optimum-NVIDIA delivers the best inference performance on the NVIDIA platform through Hugging Face. Run LLaMA 2 at 1,200 tokens/second (up to 28x faster than the framework)
soulter/hugging-chat-api ⭐ 914
HuggingChat Python API🤗
prometheus-eval/prometheus-eval ⭐ 882
Evaluate your LLM's response with Prometheus and GPT4 💯
muennighoff/sgpt ⭐ 864
SGPT: GPT Sentence Embeddings for Semantic Search
🔗 arxiv.org/abs/2202.08904
langchain-ai/langsmith-cookbook ⭐ 856
LangSmith is a platform for building production-grade LLM applications.
🔗 langsmith-cookbook.vercel.app
wandb/weave ⭐ 846
Weave is a toolkit for developing AI-powered applications, built by Weights & Biases.
🔗 wandb.me/weave
nousresearch/Hermes-Function-Calling ⭐ 832
Code for the Hermes Pro Large Language Model to perform function calling based on the provided schema. It allows users to query the model and retrieve information related to stock prices, company fundamentals, financial statements
mlc-ai/xgrammar ⭐ 801
XGrammar is an open-source library for efficient, flexible, and portable structured generation. It supports general context-free grammar to enable a broad range of structures while bringing careful system optimizations to enable fast executions.
🔗 xgrammar.mlc.ai
oliveirabruno01/babyagi-asi ⭐ 793
BabyAGI: an Autonomous and Self-Improving agent, or BASI
junruxiong/IncarnaMind ⭐ 793
Connect and chat with your multiple documents (pdf and txt) through GPT 3.5, GPT-4 Turbo, Claude and Local Open-Source LLMs
🔗 www.incarnamind.com
opengvlab/OmniQuant ⭐ 784
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
utkusen/promptmap ⭐ 757
Vulnerability scanning tool that automatically tests prompt injection attacks on your LLM applications. It analyzes your LLM system prompts, runs them, and sends attack prompts to them.
opengenerativeai/GenossGPT ⭐ 754
One API for all LLMs either Private or Public (Anthropic, Llama V2, GPT 3.5/4, Vertex, GPT4ALL, HuggingFace ...) 🌈🐂 Replace OpenAI GPT with any LLMs in your app with one line.
🔗 genoss.ai
salesforce/xgen ⭐ 717
Salesforce open-source LLMs with 8k sequence length.
developersdigest/llm-api-engine ⭐ 708
Build and deploy AI-powered APIs in seconds. This project allows you to create custom APIs that extract structured data from websites using natural language descriptions, powered by LLMs and web scraping technology.
🔗 www.youtube.com/watch?v=8kuek1bo4mm
tag-research/TAG-Bench ⭐ 701
Table-Augmented Generation (TAG) is a unified and general-purpose paradigm for answering natural language questions over databases
🔗 arxiv.org/pdf/2408.14717
squeezeailab/SqueezeLLM ⭐ 680
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
🔗 arxiv.org/abs/2306.07629
sumandora/remove-refusals-with-transformers ⭐ 676
A proof-of-concept implementation to remove refusals from an LLM model without using TransformerLens
lupantech/ScienceQA ⭐ 644
Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".
tsinghuadatabasegroup/DB-GPT ⭐ 615
LLM As Database Administrator
🔗 dbgpt.dbmind.cn
microsoft/VPTQ ⭐ 612
Extreme Low-bit Vector Post-Training Quantization for Large Language Models
modal-labs/llm-finetuning ⭐ 572
Guide for fine-tuning Llama/Mistral/CodeLlama models and more
zhudotexe/kani ⭐ 570
kani (カニ) is a highly hackable microframework for chat-based language models with tool use/function calling. (NLP-OSS @ EMNLP 2023)
🔗 kani.readthedocs.io
magnivorg/prompt-layer-library ⭐ 566
🍰 PromptLayer - Maintain a log of your prompts and OpenAI API requests. Track, debug, and replay old completions.
🔗 www.promptlayer.com
centerforaisafety/hle ⭐ 559
Humanity's Last Exam (HLE) is a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage
🔗 lastexam.ai
hazyresearch/ama_prompting ⭐ 546
Ask Me Anything language model prompting
judahpaul16/gpt-home ⭐ 543
ChatGPT at home! Basically a better Google Nest Hub or Amazon Alexa home assistant. Built on the Raspberry Pi using the OpenAI API.
🔗 hub.docker.com/r/judahpaul/gpt-home
declare-lab/instruct-eval ⭐ 543
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
🔗 declare-lab.github.io/instruct-eval
vahe1994/SpQR ⭐ 541
Quantization algorithm and the model evaluation code for SpQR method for LLM compression
eugeneyan/obsidian-copilot ⭐ 527
🤖 A prototype assistant for writing and thinking
🔗 eugeneyan.com/writing/obsidian-copilot
continuum-llms/chatgpt-memory ⭐ 525
Allows scaling the ChatGPT API to multiple simultaneous sessions, with infinite contextual and adaptive memory powered by GPT and a Redis datastore.
hazyresearch/H3 ⭐ 516
Language Modeling with the H3 State Space Model
huggingface/text-clustering ⭐ 515
Easily embed, cluster and semantically label text datasets
likenneth/honest_llama ⭐ 513
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
kbressem/medAlpaca ⭐ 513
LLM finetuned for medical question answering
predibase/llm_distillation_playbook ⭐ 505
Best practices for distilling large language models.
deepseek-ai/DeepSeek-Prover-V1.5 ⭐ 478
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
stanford-oval/suql ⭐ 256
SUQL: Conversational Search over Structured and Unstructured Data with LLMs
🔗 arxiv.org/abs/2311.09818
emissary-tech/legit-rag ⭐ 240
A modular Retrieval-Augmented Generation (RAG) system built with FastAPI, Qdrant, and OpenAI.
dottxt-ai/outlines-core ⭐ 193
Core functionality for structured generation, formerly implemented in Outlines, with a focus on performance and portability.
🔗 docs.rs/outlines-core
quotient-ai/judges ⭐ 155
judges is a small library to use and create LLM-as-a-Judge evaluators. The purpose of judges is to have a curated set of LLM evaluators in a low-friction format across a variety of use cases
codelion/adaptive-classifier ⭐ 133
A flexible, adaptive classification system that allows for dynamic addition of new classes and continuous learning from examples. Built on top of transformers from HuggingFace, this library provides an easy-to-use interface for creating and updating text classifiers.
jina-ai/llm-query-expansion ⭐ 36
Query Expansion for Better Query Embedding using LLMs

Math and Science

Mathematical, numerical and scientific libraries.

numpy/numpy ⭐ 29,074
The fundamental package for scientific computing with Python.
🔗 numpy.org
camdavidsonpilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers ⭐ 27,227
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
🔗 camdavidsonpilon.github.io/probabilistic-programming-and-bayesian-methods-for-hackers
taichi-dev/taichi ⭐ 26,888
Productive, portable, and performant GPU programming in Python: Taichi Lang is an open-source, imperative, parallel programming language for high-performance numerical computation.
🔗 taichi-lang.org
experience-monks/math-as-code ⭐ 15,300
This is a reference to ease developers into mathematical notation by showing comparisons with Python code
scipy/scipy ⭐ 13,457
SciPy library main repository
🔗 scipy.org
sympy/sympy ⭐ 13,420
A computer algebra system written in pure Python
🔗 sympy.org
google/or-tools ⭐ 11,707
Google Optimization Tools (a.k.a., OR-Tools) is an open-source, fast and portable software suite for solving combinatorial optimization problems.
🔗 developers.google.com/optimization
z3prover/z3 ⭐ 10,759
Z3 is a theorem prover from Microsoft Research with a Python language binding.
cupy/cupy ⭐ 9,995
NumPy & SciPy for GPU
🔗 cupy.dev
google-deepmind/alphageometry ⭐ 4,406
Solving Olympiad Geometry without Human Demonstrations
pim-book/programmers-introduction-to-mathematics ⭐ 3,569
Code for A Programmer's Introduction to Mathematics
🔗 pimbook.org
mikedh/trimesh ⭐ 3,172
Python library for loading and using triangular meshes.
🔗 trimesh.org
talalalrawajfeh/mathematics-roadmap ⭐ 2,885
A Comprehensive Roadmap to Mathematics
pyro-ppl/numpyro ⭐ 2,409
Probabilistic programming with NumPy powered by JAX for autograd and JIT compilation to GPU/TPU/CPU.
🔗 num.pyro.ai
mckinsey/causalnex ⭐ 2,288
A Python library that helps data scientists to infer causation rather than observing correlation.
🔗 causalnex.readthedocs.io
pyomo/pyomo ⭐ 2,131
An object-oriented algebraic modeling language in Python for structured optimization problems.
🔗 www.pyomo.org
facebookresearch/theseus ⭐ 1,853
A library for differentiable nonlinear optimization
arviz-devs/arviz ⭐ 1,663
Exploratory analysis of Bayesian models with Python
🔗 python.arviz.org
google-research/torchsde ⭐ 1,620
Differentiable SDE solvers with GPU support and efficient sensitivity analysis.
dynamicslab/pysindy ⭐ 1,543
A package for the sparse identification of nonlinear dynamical systems from data
🔗 pysindy.readthedocs.io/en/latest
geomstats/geomstats ⭐ 1,319
Computations and statistics on manifolds with geometric structures.
🔗 geomstats.ai
cma-es/pycma ⭐ 1,156
pycma is a Python implementation of CMA-ES and a few related numerical optimization tools.
pymc-labs/CausalPy ⭐ 958
A Python package for causal inference in quasi-experimental settings
🔗 causalpy.readthedocs.io
sj001/AI-Feynman ⭐ 672
Implementation of AI Feynman: a Physics-Inspired Method for Symbolic Regression
willianfuks/tfcausalimpact ⭐ 632
Python Causal Impact Implementation Based on Google's R Package. Built using TensorFlow Probability.
lean-dojo/LeanDojo ⭐ 628
Tool for data extraction and interacting with Lean programmatically.
🔗 leandojo.org
brandondube/prysm ⭐ 286
Prysm is an open-source library for physical and first-order modeling of optical systems and analysis of related data: numerical and physical optics, integrated modeling, phase retrieval, segmented systems, polynomials and fitting, sequential raytracing.
🔗 prysm.readthedocs.io/en/stable
lean-dojo/ReProver ⭐ 259
Retrieval-Augmented Theorem Provers for Lean
🔗 leandojo.org
albahnsen/pycircular ⭐ 104
pycircular is a Python module for circular data analysis
gbillotey/Fractalshades ⭐ 28
Arbitrary-precision fractal explorer - Python package

Machine Learning - General

General and classical machine learning libraries. See below for other sections covering specialised ML areas.

openai/openai-cookbook ⭐ 62,365
Examples and guides for using the OpenAI API
🔗 cookbook.openai.com
scikit-learn/scikit-learn ⭐ 61,446
scikit-learn: machine learning in Python
🔗 scikit-learn.org
suno-ai/bark ⭐ 37,240
🔊 Text-Prompted Generative Audio Model
tencentarc/GFPGAN ⭐ 36,445
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
google-research/google-research ⭐ 35,130
This repository contains code released by Google Research
🔗 research.google
facebookresearch/faiss ⭐ 33,718
A library for efficient similarity search and clustering of dense vectors.
🔗 faiss.ai
google/jax ⭐ 31,673
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
🔗 jax.readthedocs.io
open-mmlab/mmdetection ⭐ 30,546
OpenMMLab Detection Toolbox and Benchmark
🔗 mmdetection.readthedocs.io
lutzroeder/netron ⭐ 29,665
Visualizer for neural network, deep learning and machine learning models
🔗 netron.app
google/mediapipe ⭐ 29,006
Cross-platform, customizable ML solutions for live and streaming media.
🔗 ai.google.dev/edge/mediapipe
ageron/handson-ml2 ⭐ 28,559
A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
dmlc/xgboost ⭐ 26,709
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
🔗 xgboost.readthedocs.io/en/stable
roboflow/supervision ⭐ 26,219
We write your reusable computer vision tools. 💜
🔗 supervision.roboflow.com
facebookresearch/fastText ⭐ 26,118
A library for efficient learning of word representations and sentence classification.
🔗 fasttext.cc
modular/max ⭐ 23,793
The Modular Accelerated Xecution (MAX) platform is an integrated suite of AI libraries, tools, and technologies that unifies commonly fragmented AI deployment workflows
🔗 docs.modular.com/max
harisiqbal88/PlotNeuralNet ⭐ 22,945
LaTeX code for making neural network diagrams
jina-ai/serve ⭐ 21,434
☁️ Build multimodal AI applications with cloud-native stack
🔗 jina.ai/serve
ml-explore/mlx ⭐ 19,653
MLX is an array framework for machine learning on Apple silicon, brought to you by Apple machine learning research.
🔗 ml-explore.github.io/mlx
onnx/onnx ⭐ 18,641
Open standard for machine learning interoperability
🔗 onnx.ai
microsoft/LightGBM ⭐ 17,045
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
🔗 lightgbm.readthedocs.io/en/latest
microsoft/onnxruntime ⭐ 16,014
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
🔗 onnxruntime.ai
ddbourgin/numpy-ml ⭐ 15,969
Machine learning, in numpy
🔗 numpy-ml.readthedocs.io
tensorflow/tensor2tensor ⭐ 15,955
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
aleju/imgaug ⭐ 14,531
Image augmentation for machine learning experiments.
🔗 imgaug.readthedocs.io
microsoft/nni ⭐ 14,148
An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
🔗 nni.readthedocs.io
neonbjb/tortoise-tts ⭐ 13,842
A multi-voice TTS system trained with an emphasis on quality
jindongwang/transferlearning ⭐ 13,771
Transfer learning / domain adaptation / domain generalization / multi-task learning etc. Papers, code, datasets, applications, tutorials.
🔗 transferlearning.xyz
deepmind/deepmind-research ⭐ 13,615
This repository contains implementations and illustrative code to accompany DeepMind publications
spotify/annoy ⭐ 13,591
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
deepmind/alphafold ⭐ 13,331
Implementation of the inference pipeline of AlphaFold v2
facebookresearch/AnimatedDrawings ⭐ 12,341
Code to accompany "A Method for Animating Children's Drawings of the Human Figure"
ggerganov/ggml ⭐ 12,129
Tensor library for machine learning
optuna/optuna ⭐ 11,571
A hyperparameter optimization framework
🔗 optuna.org
google-gemini/cookbook ⭐ 11,029
A collection of guides and examples for the Gemini API, including quickstart tutorials for writing prompts.
🔗 ai.google.dev/gemini-api/docs
thudm/CogVideo ⭐ 10,985
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
statsmodels/statsmodels ⭐ 10,511
Statsmodels: statistical modeling and econometrics in Python
🔗 www.statsmodels.org/devel
cleanlab/cleanlab ⭐ 10,237
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
🔗 cleanlab.ai
twitter/the-algorithm-ml ⭐ 10,219
Source code for Twitter's Recommendation Algorithm
🔗 blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm
epistasislab/tpot ⭐ 9,863
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
🔗 epistasislab.github.io/tpot
megvii-basedetection/YOLOX ⭐ 9,712
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
wandb/wandb ⭐ 9,637
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
🔗 wandb.ai
pycaret/pycaret ⭐ 9,216
An open-source, low-code machine learning library in Python
🔗 www.pycaret.org
facebookresearch/xformers ⭐ 9,187
Hackable and optimized Transformers building blocks, supporting a composable construction.
🔗 facebookresearch.github.io/xformers
pymc-devs/pymc ⭐ 8,921
Bayesian Modeling and Probabilistic Programming in Python
🔗 docs.pymc.io
open-mmlab/mmsegmentation ⭐ 8,712
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
🔗 mmsegmentation.readthedocs.io/en/main
uberi/speech_recognition ⭐ 8,643
Speech recognition module for Python, supporting several engines and APIs, online and offline.
🔗 pypi.python.org/pypi/speechrecognition
awslabs/autogluon ⭐ 8,557
Fast and Accurate ML in 3 Lines of Code
🔗 auto.gluon.ai
huggingface/accelerate ⭐ 8,495
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
🔗 huggingface.co/docs/accelerate
catboost/catboost ⭐ 8,297
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
🔗 catboost.ai
automl/auto-sklearn ⭐ 7,772
Automated Machine Learning with scikit-learn
🔗 automl.github.io/auto-sklearn
lmcinnes/umap ⭐ 7,662
Uniform Manifold Approximation and Projection
featurelabs/featuretools ⭐ 7,384
An open source Python library for automated feature engineering
🔗 www.featuretools.com
hyperopt/hyperopt ⭐ 7,364
Distributed Asynchronous Hyperparameter Optimization in Python
🔗 hyperopt.github.io/hyperopt
py-why/dowhy ⭐ 7,357
DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
🔗 www.pywhy.org/dowhy
hips/autograd ⭐ 7,193
Efficiently computes derivatives of NumPy code.
ml-explore/mlx-examples ⭐ 7,146
Examples in the MLX framework
open-mmlab/mmagic ⭐ 7,100
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image generation, image/video restoration/enhancement, etc.
🔗 mmagic.readthedocs.io/en/latest
scikit-learn-contrib/imbalanced-learn ⭐ 6,941
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
🔗 imbalanced-learn.org
probml/pyprobml ⭐ 6,689
Python code for "Probabilistic Machine learning" book by Kevin Murphy
yangchris11/samurai ⭐ 6,614
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
🔗 yangchris11.github.io/samurai
nicolashug/Surprise ⭐ 6,561
A Python scikit for building and analyzing recommender systems
🔗 surpriselib.com
google/automl ⭐ 6,323
Google Brain AutoML
cleverhans-lab/cleverhans ⭐ 6,270
An adversarial example library for constructing attacks, building defenses, and benchmarking both
project-monai/MONAI ⭐ 6,218
AI Toolkit for Healthcare Imaging
🔗 monai.io
kevinmusgrave/pytorch-metric-learning ⭐ 6,105
The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
🔗 kevinmusgrave.github.io/pytorch-metric-learning
open-mmlab/mmcv ⭐ 6,054
OpenMMLab Computer Vision Foundation
🔗 mmcv.readthedocs.io/en/latest
google-deepmind/graphcast ⭐ 5,926
GraphCast: Learning skillful medium-range global weather forecasting
uber/causalml ⭐ 5,281
Uplift modeling and causal inference with machine learning algorithms
online-ml/river ⭐ 5,239
🌊 Online machine learning in Python
🔗 riverml.xyz
mdbloice/Augmentor ⭐ 5,101
Image augmentation library in Python for machine learning.
🔗 augmentor.readthedocs.io/en/stable
rasbt/mlxtend ⭐ 4,983
A library of extension and helper modules for Python's data analysis and machine learning libraries.
🔗 rasbt.github.io/mlxtend
marqo-ai/marqo ⭐ 4,801
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
🔗 www.marqo.ai
skvark/opencv-python ⭐ 4,743
Automated CI toolchain to produce precompiled opencv-python, opencv-python-headless, opencv-contrib-python and opencv-contrib-python-headless packages.
🔗 pypi.org/project/opencv-python
apple/coremltools ⭐ 4,622
Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
🔗 coremltools.readme.io
nmslib/hnswlib ⭐ 4,591
Header-only C++/python library for fast approximate nearest neighbors
🔗 github.com/nmslib/hnswlib
sanchit-gandhi/whisper-jax ⭐ 4,570
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
lucidrains/deep-daze ⭐ 4,365
Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun
districtdatalabs/yellowbrick ⭐ 4,323
Visual analysis and diagnostic tools to facilitate machine learning model selection.
🔗 www.scikit-yb.org
nv-tlabs/GET3D ⭐ 4,320
Generative Model of High Quality 3D Textured Shapes Learned from Images
huggingface/autotrain-advanced ⭐ 4,312
AutoTrain Advanced: faster and easier training and deployments of state-of-the-art machine learning models
🔗 huggingface.co/autotrain
microsoft/FLAML ⭐ 4,075
A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
🔗 microsoft.github.io/flaml
cmusphinx/pocketsphinx ⭐ 4,061
A small speech recognizer
ourownstory/neural_prophet ⭐ 4,022
NeuralProphet: A simple forecasting package
🔗 neuralprophet.com
py-why/EconML ⭐ 4,012
ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to brin...
🔗 www.microsoft.com/en-us/research/project/alice
huggingface/notebooks ⭐ 3,950
Notebooks using the Hugging Face libraries 🤗
huggingface/speech-to-speech ⭐ 3,888
Speech To Speech: an effort for an open-sourced and modular GPT4-o
zjunlp/DeepKE ⭐ 3,794
[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
🔗 deepke.zjukg.cn
rucaibox/RecBole ⭐ 3,639
A unified, comprehensive and efficient recommendation library
🔗 recbole.io
yoheinakajima/instagraph ⭐ 3,512
Converts text input or URL into knowledge graph and displays
lightly-ai/lightly ⭐ 3,314
A Python library for self-supervised learning on images.
🔗 docs.lightly.ai/self-supervised-learning
facebookresearch/vissl ⭐ 3,274
VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.
🔗 vissl.ai
pytorch/glow ⭐ 3,271
Compiler for Neural Network hardware accelerators
lucidrains/musiclm-pytorch ⭐ 3,234
Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in PyTorch
hrnet/HRNet-Semantic-Segmentation ⭐ 3,213
The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet. https://arxiv.org/abs/1908.07919
huggingface/safetensors ⭐ 3,169
Implements a new simple format for storing tensors safely (as opposed to pickle) and that is still fast (zero-copy).
🔗 huggingface.co/docs/safetensors
mljar/mljar-supervised ⭐ 3,126
Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
🔗 mljar.com
shankarpandala/lazypredict ⭐ 3,112
Lazy Predict helps build a lot of basic models without much code and helps understand which models work better without any parameter tuning
priorlabs/TabPFN ⭐ 2,969
The TabPFN is a neural network that learned to do tabular data prediction. This is the original CUDA-supporting PyTorch implementation.
🔗 priorlabs.ai
scikit-learn-contrib/hdbscan ⭐ 2,875
A high performance implementation of HDBSCAN clustering.
🔗 hdbscan.readthedocs.io/en/latest
huggingface/optimum ⭐ 2,814
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools
🔗 huggingface.co/docs/optimum/main
google-research/t5x ⭐ 2,767
T5X is a modular, composable, research-friendly framework for high-performance, configurable, self-service training, evaluation, and inference of sequence models (starting with language) at many scales.
scikit-optimize/scikit-optimize ⭐ 2,763
Sequential model-based optimization with a scipy.optimize interface
🔗 scikit-optimize.github.io
apple/ml-ane-transformers ⭐ 2,601
Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE)
freedmand/semantra ⭐ 2,585
Semantra is a multipurpose tool for semantically searching documents. Query by meaning rather than just by matching text.
rom1504/clip-retrieval ⭐ 2,514
Easily compute clip embeddings and build a clip retrieval system with them
🔗 rom1504.github.io/clip-retrieval
neuraloperator/neuraloperator ⭐ 2,478
Comprehensive library for learning neural operators in PyTorch. It is the official implementation for Fourier Neural Operators and Tensorized Neural Operators.
🔗 neuraloperator.github.io/dev/index.html
eric-mitchell/direct-preference-optimization ⭐ 2,451
Reference implementation for DPO (Direct Preference Optimization)
huggingface/huggingface_hub ⭐ 2,436
The official Python client for the Hugging Face Hub.
🔗 huggingface.co/docs/huggingface_hub
scikit-learn-contrib/category_encoders ⭐ 2,431
A library of sklearn compatible categorical variable encoders
🔗 contrib.scikit-learn.org/category_encoders
facebookresearch/flow_matching ⭐ 2,181
Flow Matching (FM) is a recent framework for generative modeling that has achieved state-of-the-art performance across various domains, including image, video, audio, speech, and biological structures
🔗 facebookresearch.github.io/flow_matching
huggingface/evaluate ⭐ 2,152
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
🔗 huggingface.co/docs/evaluate
aws/sagemaker-python-sdk ⭐ 2,147
A library for training and deploying machine learning models on Amazon SageMaker
🔗 sagemaker.readthedocs.io
qdrant/fastembed ⭐ 1,864
Fast, accurate, lightweight Python library for generating state-of-the-art embeddings
🔗 qdrant.github.io/fastembed
contextlab/hypertools ⭐ 1,834
A Python toolbox for gaining geometric insights into high-dimensional data
🔗 hypertools.readthedocs.io/en/latest
linkedin/greykite ⭐ 1,829
A flexible, intuitive and fast forecasting library
microsoft/Olive ⭐ 1,823
Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.
🔗 microsoft.github.io/olive
rentruewang/koila ⭐ 1,822
Prevent PyTorch's CUDA error: out of memory in just 1 line of code.
🔗 koila.rentruewang.com
bmabey/pyLDAvis ⭐ 1,822
Python library for interactive topic model visualization. Port of the R LDAvis package.
castorini/pyserini ⭐ 1,769
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
🔗 pyserini.io
scikit-learn-contrib/lightning ⭐ 1,741
Large-scale linear classification, regression and ranking in Python
🔗 contrib.scikit-learn.org/lightning
tensorflow/addons ⭐ 1,695
Useful extra functionality for TensorFlow 2.x maintained by SIG-addons
microsoft/i-Code ⭐ 1,692
The ambition of the i-Code project is to build integrative and composable multimodal AI. The "i" stands for integrative multimodal learning.
stanfordmlgroup/ngboost ⭐ 1,688
Natural Gradient Boosting for Probabilistic Prediction
laekov/fastmoe ⭐ 1,674
A fast Mixture-of-Experts (MoE) implementation for PyTorch
🔗 fastmoe.ai
visual-layer/fastdup ⭐ 1,660
fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing data operation costs, all with unmatched scalability.
kubeflow/katib ⭐ 1,546
Automated Machine Learning on Kubernetes
🔗 www.kubeflow.org/docs/components/katib
google/vizier ⭐ 1,539
Python-based research interface for blackbox and hyperparameter optimization, based on the internal Google Vizier Service.
🔗 oss-vizier.readthedocs.io
jina-ai/finetuner ⭐ 1,491
🎯 Task-oriented embedding tuning for BERT, CLIP, etc.
🔗 finetuner.jina.ai
csinva/imodels ⭐ 1,437
Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
🔗 csinva.io/imodels
microsoft/Semi-supervised-learning ⭐ 1,437
A Unified Semi-Supervised Learning Codebase (NeurIPS'22)
🔗 usb.readthedocs.io
patchy631/machine-learning ⭐ 1,410
Machine Learning Tutorials Repository
spotify/voyager ⭐ 1,405
🛰️ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.
🔗 spotify.github.io/voyager
borealisai/advertorch ⭐ 1,326
A Toolbox for Adversarial Robustness Research
koaning/scikit-lego ⭐ 1,312
Extra blocks for scikit-learn pipelines.
🔗 koaning.github.io/scikit-lego
lightning-ai/lightning-thunder ⭐ 1,306
Thunder is a source-to-source compiler for PyTorch. It makes PyTorch programs faster by combining and using different hardware executors at once
awslabs/dgl-ke ⭐ 1,296
High performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings.
🔗 dglke.dgl.ai/doc
pytorch/FBGEMM ⭐ 1,275
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
nvidia/cuda-python ⭐ 1,124
CUDA Python: Performance meets Productivity
🔗 nvidia.github.io/cuda-python
davidmrau/mixture-of-experts ⭐ 1,071
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
opentensor/bittensor ⭐ 1,063
Internet-scale Neural Networks
🔗 www.bittensor.com
google-research/deeplab2 ⭐ 1,015
DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a unified and state-of-the-art TensorFlow codebase for dense pixel labeling tasks.
oml-team/open-metric-learning ⭐ 915
OML is a PyTorch-based framework to train and validate the models producing high-quality embeddings.
🔗 open-metric-learning.readthedocs.io/en/latest/index.html
huggingface/optimum-quanto ⭐ 900
A PyTorch quantization backend for Optimum
hazyresearch/safari ⭐ 877
Convolutions for Sequence Modeling
criteo/autofaiss ⭐ 841
Automatically create Faiss kNN indices with optimal similarity search parameters.
🔗 criteo.github.io/autofaiss
replicate/replicate-python ⭐ 810
Python client for Replicate
🔗 replicate.com
pymc-labs/pymc-marketing ⭐ 799
Bayesian marketing toolbox in PyMC. Media Mix (MMM), customer lifetime value (CLV), buy-till-you-die (BTYD) models and more.
🔗 www.pymc-marketing.io
awslabs/python-deequ ⭐ 753
Python API for Deequ, a library built on Spark for defining "unit tests for data", which measure data quality in large datasets
googleapis/python-aiplatform ⭐ 705
A Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning.
facebookresearch/balance ⭐ 696
The balance python package offers a simple workflow and methods for dealing with biased data samples when looking to infer from them to some target population of interest.
🔗 import-balance.org
nicolas-hbt/pygraft ⭐ 681
Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips
🔗 pygraft.readthedocs.io/en/latest
nomic-ai/contrastors ⭐ 663
Contrastive learning toolkit that enables researchers and engineers to train and evaluate contrastive models efficiently.
qdrant/quaterion ⭐ 656
Blazing fast framework for fine-tuning similarity learning models
🔗 quaterion.qdrant.tech
huggingface/exporters ⭐ 649
Export Hugging Face models to Core ML and TensorFlow Lite
intel/intel-npu-acceleration-library ⭐ 643
The Intel NPU Acceleration Library is a Python library designed to boost the efficiency of your applications by leveraging the power of the Intel Neural Processing Unit (NPU) to perform high-speed computations on compatible hardware.
hpcaitech/EnergonAI ⭐ 628
Large-scale model inference.
minishlab/semhash ⭐ 577
SemHash is a lightweight and flexible tool for deduplicating datasets using semantic similarity. It combines fast embedding generation from Model2Vec with efficient ANN-based similarity search through Vicinity
🔗 minishlab.github.io
intellabs/bayesian-torch ⭐ 567
A library for Bayesian neural network layers and uncertainty estimation in Deep Learning extending the core of PyTorch
microsoft/Focal-Transformer ⭐ 551
[NeurIPS 2021 Spotlight] Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"
linkedin/FastTreeSHAP ⭐ 535
Fast SHAP value computation for interpreting tree-based models
mrdbourke/m1-machine-learning-test ⭐ 533
Code for testing various M1 Chip benchmarks with TensorFlow.
nevronai/MetisFL ⭐ 525
The first open Federated Learning framework implemented in C++ and Python.
🔗 metisfl.org
deepgraphlearning/ULTRA ⭐ 518
A foundation model for knowledge graph reasoning
dylanhogg/gptauthor ⭐ 73
GPTAuthor is an AI tool for writing long form, multi-chapter stories given a story prompt.

Machine Learning - Deep Learning

Machine learning libraries that cross over with deep learning in some way.

tensorflow/tensorflow ⭐ 188,704
An Open Source Machine Learning Framework for Everyone
🔗 tensorflow.org
pytorch/pytorch ⭐ 87,982
Tensors and Dynamic neural networks in Python with strong GPU acceleration
🔗 pytorch.org
openai/whisper ⭐ 78,307
Robust Speech Recognition via Large-Scale Weak Supervision
keras-team/keras ⭐ 62,704
Deep Learning for humans
🔗 keras.io
deepfakes/faceswap ⭐ 53,479
Deepfakes Software For All
🔗 www.faceswap.dev
facebookresearch/segment-anything ⭐ 49,324
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
microsoft/DeepSpeed ⭐ 37,490
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
🔗 www.deepspeed.ai
rwightman/pytorch-image-models ⭐ 33,526
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
🔗 huggingface.co/docs/timm
facebookresearch/detectron2 ⭐ 31,485
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
🔗 detectron2.readthedocs.io/en/latest
xinntao/Real-ESRGAN ⭐ 30,086
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
lightning-ai/pytorch-lightning ⭐ 29,143
The deep learning framework to pretrain, finetune and deploy AI models. PyTorch Lightning is just organized PyTorch - Lightning disentangles PyTorch code to decouple the science from the engineering.
🔗 lightning.ai/pytorch-lightning
google-research/tuning_playbook ⭐ 28,150
A playbook for systematically maximizing the performance of deep learning models.
openai/CLIP ⭐ 27,942
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
facebookresearch/Detectron ⭐ 26,319
FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
matterport/Mask_RCNN ⭐ 25,005
Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
paddlepaddle/Paddle ⭐ 22,581
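PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)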
🔗 www.paddlepaddle.org
lucidrains/vit-pytorch ⭐ 22,088
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
pyg-team/pytorch_geometric ⭐ 22,040
Graph Neural Network Library for PyTorch
🔗 pyg.org
apache/mxnet ⭐ 20,790
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
🔗 mxnet.apache.org
sanster/IOPaint ⭐ 20,666
Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
🔗 www.iopaint.com
danielgatis/rembg ⭐ 18,326
Rembg is a tool to remove image backgrounds
rasbt/deeplearning-models ⭐ 16,930
A collection of various deep learning architectures, models, and tips
albumentations-team/albumentations ⭐ 14,704
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
🔗 albumentations.ai
microsoft/Swin-Transformer ⭐ 14,470
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
🔗 arxiv.org/abs/2103.14030
facebookresearch/detr ⭐ 14,123
End-to-End Object Detection with Transformers
nvidia/DeepLearningExamples ⭐ 14,036
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
dmlc/dgl ⭐ 13,765
Python package built to ease deep learning on graphs, on top of existing DL frameworks.
🔗 dgl.ai
mlfoundations/open_clip ⭐ 11,256
Open source implementation of OpenAI's CLIP (Contrastive Language-Image Pre-training).
kornia/kornia ⭐ 10,316
🐍 Geometric Computer Vision Library for Spatial AI
🔗 kornia.readthedocs.io
modelscope/facechain ⭐ 9,328
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
tencent/HunyuanVideo ⭐ 9,293
HunyuanVideo: A Systematic Framework For Large Video Generation Model
🔗 aivideo.hunyuan.tencent.com
keras-team/autokeras ⭐ 9,207
AutoML library for deep learning
🔗 autokeras.com
facebookresearch/pytorch3d ⭐ 9,118
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
🔗 pytorch3d.org
arogozhnikov/einops ⭐ 8,797
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
🔗 einops.rocks
bytedance/monolith ⭐ 8,703
A deep learning framework for large scale recommendation modeling with collisionless embedding and real time training captures.
pyro-ppl/pyro ⭐ 8,688
Deep universal probabilistic programming with Python and PyTorch
🔗 pyro.ai
nvidia/apex ⭐ 8,586
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
facebookresearch/ImageBind ⭐ 8,549
ImageBind: One Embedding Space to Bind Them All
lucidrains/imagen-pytorch ⭐ 8,216
Implementation of Imagen, Google's Text-to-Image Neural Network, in PyTorch
google/trax ⭐ 8,175
Trax — Deep Learning with Clear Code and Speed
xpixelgroup/BasicSR ⭐ 7,262
Open Source Image and Video Restoration Toolbox for Super-resolution, Denoising, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. It also supports StyleGAN2 and DFDNet.
🔗 basicsr.readthedocs.io/en/latest
google/flax ⭐ 6,414
Flax is a neural network library for JAX that is designed for flexibility.
🔗 flax.readthedocs.io
skorch-dev/skorch ⭐ 5,983
A scikit-learn compatible neural network library that wraps PyTorch
facebookresearch/mmf ⭐ 5,551
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
🔗 mmf.sh
mosaicml/composer ⭐ 5,310
Supercharge Your Model Training
🔗 docs.mosaicml.com
deci-ai/super-gradients ⭐ 4,714
Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS.
🔗 www.supergradients.com
nvidiagameworks/kaolin ⭐ 4,668
A PyTorch Library for Accelerating 3D Deep Learning Research
facebookincubator/AITemplate ⭐ 4,620
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
pytorch/ignite ⭐ 4,592
High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.
🔗 pytorch-ignite.ai
cvg/LightGlue ⭐ 3,663
LightGlue: Local Feature Matching at Light Speed (ICCV 2023)
williamyang1991/VToonify ⭐ 3,572
[SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer
google-research/scenic ⭐ 3,472
Scenic: A Jax Library for Computer Vision Research and Beyond
facebookresearch/PyTorch-BigGraph ⭐ 3,402
Generate embeddings from large-scale graph-structured data.
🔗 torchbiggraph.readthedocs.io
pytorch/botorch ⭐ 3,196
Bayesian optimization in PyTorch
🔗 botorch.org
alpa-projects/alpa ⭐ 3,114
Training and serving large-scale neural networks with auto parallelization.
🔗 alpa.ai
deepmind/dm-haiku ⭐ 2,990
JAX-based neural network library
🔗 dm-haiku.readthedocs.io
explosion/thinc ⭐ 2,837
🔮 A refreshing functional take on deep learning, compatible with your favorite libraries
🔗 thinc.ai
nerdyrodent/VQGAN-CLIP ⭐ 2,642
Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.
modelscope/ClearerVoice-Studio ⭐ 2,422
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
danielegrattarola/spektral ⭐ 2,375
Graph Neural Networks with Keras and TensorFlow 2.
🔗 graphneural.network
google-research/electra ⭐ 2,350
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
fepegar/torchio ⭐ 2,145
Medical imaging processing for deep learning.
🔗 torchio.org
neuralmagic/sparseml ⭐ 2,118
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
pytorch/torchrec ⭐ 2,056
PyTorch domain library for recommendation systems
🔗 pytorch.org/torchrec
tensorly/tensorly ⭐ 1,603
TensorLy: Tensor Learning in Python.
🔗 tensorly.org
tensorflow/mesh ⭐ 1,602
Mesh TensorFlow: Model Parallelism Made Easier
calculatedcontent/WeightWatcher ⭐ 1,557
The WeightWatcher tool for predicting the accuracy of Deep Neural Networks
vt-vl-lab/FGVC ⭐ 1,557
[ECCV 2020] Flow-edge Guided Video Completion
jeshraghian/snntorch ⭐ 1,493
Deep and online learning with spiking neural networks in Python
🔗 snntorch.readthedocs.io/en/latest
hysts/pytorch_image_classification ⭐ 1,388
PyTorch implementation of image classification models for CIFAR-10/CIFAR-100/MNIST/FashionMNIST/Kuzushiji-MNIST/ImageNet
xl0/lovely-tensors ⭐ 1,199
Tensors, for human consumption
🔗 xl0.github.io/lovely-tensors
deepmind/android_env ⭐ 1,053
RL research on Android devices.
keras-team/keras-cv ⭐ 1,027
Industry-strength Computer Vision workflows with Keras
tensorflow/similarity ⭐ 1,019
TensorFlow Similarity is a Python package focused on making similarity learning quick and easy.
kakaobrain/rq-vae-transformer ⭐ 841
The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)
deepmind/chex ⭐ 832
Chex is a library of utilities for helping to write reliable JAX code
🔗 chex.readthedocs.io
mlfoundations/datacomp ⭐ 686
DataComp: In search of the next generation of multimodal datasets
🔗 datacomp.ai
whitead/dmol-book ⭐ 637
Deep learning for molecules and materials book
🔗 dmol.pub
allenai/reward-bench ⭐ 522
RewardBench is a benchmark designed to evaluate the capabilities and safety of reward models (including those trained with Direct Preference Optimization, DPO)
🔗 huggingface.co/spaces/allenai/reward-bench

Machine Learning - Interpretability

Machine learning interpretability libraries. Covers explainability, prediction explanations, dashboards, and understanding how knowledge develops during training.

slundberg/shap ⭐ 23,555
A game theoretic approach to explain the output of any machine learning model.
🔗 shap.readthedocs.io
marcotcr/lime ⭐ 11,800
Lime: Explaining the predictions of any machine learning classifier
interpretml/interpret ⭐ 6,427
Fit interpretable models. Explain blackbox machine learning.
🔗 interpret.ml/docs
pytorch/captum ⭐ 5,128
Model interpretability and understanding for PyTorch
🔗 captum.ai
arize-ai/phoenix ⭐ 5,074
AI Observability & Evaluation
🔗 docs.arize.com/phoenix
tensorflow/lucid ⭐ 4,686
A collection of infrastructure and tools for research in neural network interpretability.
pair-code/lit ⭐ 3,528
The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.
🔗 pair-code.github.io/lit
maif/shapash ⭐ 2,849
🔅 Shapash: User-friendly Explainability and Interpretability to Develop Reliable and Transparent Machine Learning Models
🔗 maif.github.io/shapash
teamhg-memex/eli5 ⭐ 2,768
A library for debugging/inspecting machine learning classifiers and explaining their predictions
🔗 eli5.readthedocs.io
seldonio/alibi ⭐ 2,472
Algorithms for explaining machine learning models
🔗 docs.seldon.io/projects/alibi/en/stable
eleutherai/pythia ⭐ 2,416
Interpretability analysis and scaling laws to understand how knowledge develops and evolves during training in autoregressive transformers
oegedijk/explainerdashboard ⭐ 2,360
Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models.
🔗 explainerdashboard.readthedocs.io
jalammar/ecco ⭐ 2,022
Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, BERT, RoBERTA, T5, and T0).
🔗 ecco.readthedocs.io
transformerlensorg/TransformerLens ⭐ 1,965
A library for mechanistic interpretability of GPT-style language models
🔗 transformerlensorg.github.io/transformerlens
google-deepmind/penzai ⭐ 1,743
A JAX library for writing models as legible, functional pytree data structures, along with tools for visualizing, modifying, and analyzing them. Penzai focuses on making it easy to do stuff with models after they have been trained
🔗 penzai.readthedocs.io
trusted-ai/AIX360 ⭐ 1,665
Interpretability and explainability of data and machine learning models
🔗 aix360.res.ibm.com
stanfordnlp/pyreft ⭐ 1,444
Stanford NLP Python library for Representation Finetuning (ReFT)
🔗 arxiv.org/abs/2404.03592
cdpierse/transformers-interpret ⭐ 1,328
Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.
selfexplainml/PiML-Toolbox ⭐ 1,243
PiML (Python Interpretable Machine Learning) toolbox for model development & diagnostics
🔗 selfexplainml.github.io/piml-toolbox
ethicalml/xai ⭐ 1,158
XAI is a Machine Learning library that is designed with AI explainability at its core. XAI contains various tools that enable analysis and evaluation of data and models
🔗 ethical.institute/principles.html#commitment-3
salesforce/OmniXAI ⭐ 906
OmniXAI: A Library for eXplainable AI
andyzoujm/representation-engineering ⭐ 807
Representation Engineering: A Top-Down Approach to AI Transparency
🔗 www.ai-transparency.org
stanfordnlp/pyvene ⭐ 718
Library for intervening on the internal states of PyTorch models. Interventions are an important operation in many areas of AI, including model editing, steering, robustness, and interpretability.
🔗 pyvene.ai
jbloomaus/SAELens ⭐ 660
Training Sparse Autoencoders on LLMs. Analyse sparse autoencoders and neural network internals.
🔗 jbloomaus.github.io/saelens
labmlai/inspectus ⭐ 646
Inspectus provides visualization tools for attention mechanisms in deep learning models. It provides a set of comprehensive views, making it easier to understand how these models work.
ndif-team/nnsight ⭐ 518
The nnsight package enables interpreting and manipulating the internals of deep learned models.
🔗 nnsight.net

Machine Learning - Ops

MLOps tools, frameworks and libraries: intersection of machine learning, data engineering and DevOps; deployment, health, diagnostics and governance of ML models.

apache/airflow ⭐ 39,201
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
🔗 airflow.apache.org
ray-project/ray ⭐ 36,061
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
🔗 ray.io
mlflow/mlflow ⭐ 19,802
Open source platform for the machine learning lifecycle
🔗 mlflow.org
prefecthq/prefect ⭐ 18,646
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
🔗 prefect.io
spotify/luigi ⭐ 18,162
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
kestra-io/kestra ⭐ 16,497
⚡ Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
🔗 kestra.io
horovod/horovod ⭐ 14,423
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
🔗 horovod.ai
iterative/dvc ⭐ 14,268
🦉 Data Versioning and ML Experiments
🔗 dvc.org
dagster-io/dagster ⭐ 12,736
An orchestration platform for the development, production, and observation of data assets.
🔗 dagster.io
ludwig-ai/ludwig ⭐ 11,376
Low-code framework for building custom LLMs, neural networks, and other AI models
🔗 ludwig.ai
bentoml/OpenLLM ⭐ 10,981
Run any open-source LLM, such as DeepSeek and Llama, as an OpenAI-compatible API endpoint in the cloud.
🔗 bentoml.com
dbt-labs/dbt-core ⭐ 10,538
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
🔗 getdbt.com
great-expectations/great_expectations ⭐ 10,262
Always know what to expect from your data.
🔗 docs.greatexpectations.io
kedro-org/kedro ⭐ 10,222
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
🔗 kedro.org
huggingface/text-generation-inference ⭐ 9,895
A Rust, Python and gRPC server for text generation inference. Used in production at HuggingFace to power Hugging Chat, the Inference API and Inference Endpoint.
🔗 hf.co/docs/text-generation-inference
langfuse/langfuse ⭐ 9,479
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
🔗 langfuse.com/docs
netflix/metaflow ⭐ 8,640
Build, Deploy and Manage AI/ML Systems
🔗 metaflow.org
activeloopai/deeplake ⭐ 8,476
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
🔗 activeloop.ai
mage-ai/mage-ai ⭐ 8,196
🧙 Build, run, and manage data pipelines for integrating and transforming data.
🔗 www.mage.ai
bentoml/BentoML ⭐ 7,481
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
🔗 bentoml.com
flyteorg/flyte ⭐ 6,095
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
🔗 flyte.org
allegroai/clearml ⭐ 5,881
ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
🔗 clear.ml/docs
internlm/lmdeploy ⭐ 5,871
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
🔗 lmdeploy.readthedocs.io/en/latest
feast-dev/feast ⭐ 5,869
The Open Source Feature Store for AI/ML
🔗 feast.dev
evidentlyai/evidently ⭐ 5,838
Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.
🔗 discord.gg/xzjkranp8b
adap/flower ⭐ 5,567
Flower: A Friendly Federated AI Framework
🔗 flower.ai
aimhubio/aim ⭐ 5,398
Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
🔗 aimstack.io
zenml-io/zenml ⭐ 4,471
ZenML 🙏: The bridge between ML and Ops. https://zenml.io.
🔗 zenml.io
internlm/xtuner ⭐ 4,390
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
🔗 xtuner.readthedocs.io/zh-cn/latest
orchest/orchest ⭐ 4,114
Build data pipelines, the easy way 🛠️
🔗 orchest.readthedocs.io/en/stable
kubeflow/pipelines ⭐ 3,780
Machine Learning Pipelines for Kubeflow
🔗 www.kubeflow.org/docs/components/pipelines
polyaxon/polyaxon ⭐ 3,613
MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle
🔗 polyaxon.com
ploomber/ploomber ⭐ 3,557
The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
🔗 docs.ploomber.io
towhee-io/towhee ⭐ 3,333
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
🔗 towhee.io
determined-ai/determined ⭐ 3,119
Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
🔗 determined.ai
leptonai/leptonai ⭐ 2,695
A Pythonic framework to simplify AI service building
🔗 lepton.ai
azure/PyRIT ⭐ 2,292
The Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and ML engineers to red team foundation models and their applications.
🔗 azure.github.io/pyrit
labmlai/labml ⭐ 2,123
🔎 Monitor deep learning model training and hardware usage from your mobile phone 📱
🔗 labml.ai
dagworks-inc/hamilton ⭐ 2,053
Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
🔗 hamilton.dagworks.io/en/latest
meltano/meltano ⭐ 1,996
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
🔗 meltano.com
dstackai/dstack ⭐ 1,738
dstack is a lightweight, open-source alternative to Kubernetes & Slurm, simplifying AI container orchestration with multi-cloud & on-prem support. It natively supports NVIDIA, AMD, TPU, and Intel accelerators.
🔗 dstack.ai/docs
dagworks-inc/burr ⭐ 1,537
Build applications that make decisions (chatbots, agents, simulations, etc...). Monitor, trace, persist, and execute on your own infrastructure.
🔗 burr.dagworks.io
hi-primus/optimus ⭐ 1,497
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
🔗 hi-optimus.com
kubeflow/examples ⭐ 1,428
A repository to host extended examples and tutorials
substratusai/kubeai ⭐ 842
AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.
🔗 www.kubeai.org
vllm-project/production-stack ⭐ 813
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
🔗 docs.vllm.ai/projects/production-stack

Machine Learning - Reinforcement

Machine learning libraries and toolkits that cross over with reinforcement learning in some way: agent reinforcement learning, agent environments, RLHF.

openai/gym ⭐ 35,603
A toolkit for developing and comparing reinforcement learning algorithms.
🔗 www.gymlibrary.dev
openai/baselines ⭐ 16,132
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
google/dopamine ⭐ 10,667
Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
🔗 github.com/google/dopamine
farama-foundation/Gymnasium ⭐ 8,614
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
🔗 gymnasium.farama.org
thu-ml/tianshou ⭐ 8,298
An elegant PyTorch deep reinforcement learning library.
🔗 tianshou.org
deepmind/pysc2 ⭐ 8,097
StarCraft II Learning Environment
lucidrains/PaLM-rlhf-pytorch ⭐ 7,765
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
tensorlayer/TensorLayer ⭐ 7,347
Deep Learning and Reinforcement Learning Library for Scientists and Engineers
🔗 tensorlayerx.com
keras-rl/keras-rl ⭐ 5,541
Deep Reinforcement Learning for Keras.
🔗 keras-rl.readthedocs.io
deepmind/dm_control ⭐ 3,974
Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
ai4finance-foundation/ElegantRL ⭐ 3,922
Massively Parallel Deep Reinforcement Learning. 🔥
🔗 ai4finance.org
deepmind/acme ⭐ 3,617
A library of reinforcement learning components and agents
facebookresearch/ReAgent ⭐ 3,599
A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)
🔗 reagent.ai
opendilab/DI-engine ⭐ 3,309
DI-engine is a generalized decision intelligence engine for PyTorch and JAX. It provides python-first and asynchronous-native task and middleware abstractions
🔗 di-engine-docs.readthedocs.io
eureka-research/Eureka ⭐ 2,914
Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models" (ICLR 2024)
🔗 eureka-research.github.io
pettingzoo-team/PettingZoo ⭐ 2,820
An API standard for multi-agent reinforcement learning environments, with popular reference environments and related utilities
🔗 pettingzoo.farama.org
pytorch/rl ⭐ 2,613
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
🔗 pytorch.org/rl
kzl/decision-transformer ⭐ 2,512
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
anthropics/hh-rlhf ⭐ 1,703
Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
🔗 arxiv.org/abs/2204.05862
arise-initiative/robosuite ⭐ 1,541
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
🔗 robosuite.ai
humancompatibleai/imitation ⭐ 1,436
Clean PyTorch implementations of imitation and reward learning algorithms
🔗 imitation.readthedocs.io
denys88/rl_games ⭐ 1,042
RL Games: High performance RL library
google-deepmind/meltingpot ⭐ 658
A suite of test scenarios for multi-agent reinforcement learning.

Natural Language Processing

Natural language processing libraries and toolkits: text processing, topic modelling, tokenisers, chatbots. Also see the LLMs and ChatGPT category for crossover.

huggingface/transformers ⭐ 141,415
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
🔗 huggingface.co/transformers
myshell-ai/OpenVoice ⭐ 31,372
Instant voice cloning by MIT and MyShell. Audio foundation model.
🔗 research.myshell.ai/open-voice
pytorch/fairseq ⭐ 31,170
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
explosion/spaCy ⭐ 31,166
💫 Industrial-strength Natural Language Processing (NLP) in Python
🔗 spacy.io
vikparuchuri/marker ⭐ 22,988
Marker converts PDF, EPUB, and MOBI to markdown. It's 10x faster than nougat, more accurate on most documents, and has low hallucination risk.
🔗 www.datalab.to
microsoft/unilm ⭐ 20,925
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
🔗 aka.ms/generalai
huggingface/datasets ⭐ 19,819
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
🔗 huggingface.co/docs/datasets
vikparuchuri/surya ⭐ 16,859
OCR, layout analysis, reading order, table recognition in 90+ languages
🔗 www.datalab.to
ukplab/sentence-transformers ⭐ 16,245
State-of-the-Art Text Embeddings
🔗 www.sbert.net
rare-technologies/gensim ⭐ 15,910
Topic Modelling for Humans
🔗 radimrehurek.com/gensim
m-bain/whisperX ⭐ 14,509
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
gunthercox/ChatterBot ⭐ 14,252
ChatterBot is a machine learning, conversational dialog engine for creating chat bots
🔗 docs.chatterbot.us
flairnlp/flair ⭐ 14,109
A very simple framework for state-of-the-art Natural Language Processing (NLP)
🔗 flairnlp.github.io/flair
nltk/nltk ⭐ 13,913
NLTK Source
🔗 www.nltk.org
openai/tiktoken ⭐ 13,851
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
nvidia/NeMo ⭐ 13,359
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
🔗 docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
jina-ai/clip-as-service ⭐ 12,607
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
🔗 clip-as-service.jina.ai
allenai/allennlp ⭐ 11,825
An open-source NLP research library, built on PyTorch.
🔗 www.allennlp.org
facebookresearch/seamless_communication ⭐ 11,423
Foundational Models for State-of-the-Art Speech and Text Translation
google/sentencepiece ⭐ 10,705
Unsupervised text tokenizer for Neural Network-based text generation.
neuml/txtai ⭐ 10,578
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
🔗 neuml.github.io/txtai
facebookresearch/ParlAI ⭐ 10,514
A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
🔗 parl.ai
doccano/doccano ⭐ 9,855
Open source annotation tool for machine learning practitioners.
speechbrain/speechbrain ⭐ 9,521
A PyTorch-based Speech Toolkit
🔗 speechbrain.github.io
facebookresearch/nougat ⭐ 9,331
Implementation of Nougat: Neural Optical Understanding for Academic Documents
🔗 facebookresearch.github.io/nougat
sloria/TextBlob ⭐ 9,285
Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
🔗 textblob.readthedocs.io
togethercomputer/OpenChatKit ⭐ 9,018
OpenChatKit provides a powerful, open-source base to create both specialized and general purpose chatbots
espnet/espnet ⭐ 8,877
End-to-End Speech Processing Toolkit
🔗 espnet.github.io/espnet
clips/pattern ⭐ 8,785
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
🔗 github.com/clips/pattern/wiki
deeppavlov/DeepPavlov ⭐ 6,829
An open source library for deep learning end-to-end dialog systems and chatbots.
🔗 deeppavlov.ai
facebookresearch/metaseq ⭐ 6,519
A codebase for working with Open Pre-trained Transformers, originally forked from fairseq.
maartengr/BERTopic ⭐ 6,507
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
🔗 maartengr.github.io/bertopic
kingoflolz/mesh-transformer-jax ⭐ 6,321
Model parallel transformers in JAX and Haiku
quivrhq/MegaParse ⭐ 5,875
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
🔗 megaparse.com
aiwaves-cn/agents ⭐ 5,523
An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents
layout-parser/layout-parser ⭐ 5,133
A Unified Toolkit for Deep Learning Based Document Image Analysis
🔗 layout-parser.github.io
salesforce/CodeGen ⭐ 5,042
CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
minimaxir/textgenrnn ⭐ 4,939
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
makcedward/nlpaug ⭐ 4,523
Data augmentation for NLP
🔗 makcedward.github.io
facebookresearch/DrQA ⭐ 4,483
Reading Wikipedia to Answer Open-Domain Questions
argilla-io/argilla ⭐ 4,381
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
🔗 docs.argilla.io
thilinarajapakse/simpletransformers ⭐ 4,162
Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
🔗 simpletransformers.ai
maartengr/KeyBERT ⭐ 3,777
A minimal and easy-to-use keyword extraction technique that leverages BERT embeddings to create keywords and keyphrases that are most similar to a document.
🔗 maartengr.github.io/keybert
life4/textdistance ⭐ 3,457
📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
promptslab/Promptify ⭐ 3,435
Prompt Engineering | Prompt Versioning | Use GPT or other prompt based models to get structured output. Join our discord for Prompt-Engineering, LLMs and other latest research
🔗 discord.gg/m88xfymbk6
jsvine/markovify ⭐ 3,329
A simple, extensible Markov chain generator.
bytedance/lightseq ⭐ 3,259
LightSeq: A High Performance Library for Sequence Processing and Generation
errbotio/errbot ⭐ 3,170
Errbot is a chatbot, a daemon that connects to your favorite chat service and brings your tools and some fun into the conversation.
🔗 errbot.io
neuralmagic/deepsparse ⭐ 3,117
Sparsity-aware deep learning inference runtime for CPUs
🔗 neuralmagic.com/deepsparse
huawei-noah/Pretrained-Language-Model ⭐ 3,073
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
ddangelov/Top2Vec ⭐ 3,015
Top2Vec learns jointly embedded topic, document and word vectors.
salesforce/CodeT5 ⭐ 2,930
Home of CodeT5: Open Code LLMs for Code Understanding and Generation
🔗 arxiv.org/abs/2305.07922
jbesomi/texthero ⭐ 2,902
Text preprocessing, representation and visualization from zero to hero.
🔗 texthero.org
huggingface/neuralcoref ⭐ 2,866
✨ Fast Coreference Resolution in spaCy with Neural Networks
🔗 huggingface.co/coref
bhavnicksm/chonkie ⭐ 2,806
🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
🔗 docs.chonkie.ai
bigscience-workshop/promptsource ⭐ 2,799
Toolkit for creating, sharing and using natural language prompts.
nvidia/nv-ingest ⭐ 2,603
NVIDIA-Ingest is a scalable, performance-oriented document content and metadata extraction microservice.
huggingface/setfit ⭐ 2,414
SetFit is an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers.
🔗 hf.co/docs/setfit
jamesturk/jellyfish ⭐ 2,111
🪼 a python library for doing approximate and phonetic matching of strings.
🔗 jamesturk.github.io/jellyfish
alibaba/EasyNLP ⭐ 2,103
EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit
thudm/P-tuning-v2 ⭐ 2,020
An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks
featureform/featureform ⭐ 1,865
The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
🔗 www.featureform.com
urchade/GLiNER ⭐ 1,864
Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024
🔗 arxiv.org/abs/2311.08526
marella/ctransformers ⭐ 1,852
Python bindings for the Transformer models implemented in C/C++ using GGML library.
deepset-ai/FARM ⭐ 1,748
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
🔗 farm.deepset.ai
explosion/spacy-models ⭐ 1,711
💫 Models for the spaCy Natural Language Processing (NLP) library
🔗 spacy.io
franck-dernoncourt/NeuroNER ⭐ 1,707
Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.
🔗 neuroner.com
google-research/language ⭐ 1,658
Shared repository for open-sourced projects from the Google AI Language team.
🔗 ai.google/research/teams/language
plasticityai/magnitude ⭐ 1,643
A fast, efficient universal vector embedding utility package.
arxiv-vanity/arxiv-vanity ⭐ 1,617
Renders papers from arXiv as responsive web pages so you don't have to squint at a PDF.
🔗 www.arxiv-vanity.com
nomic-ai/nomic ⭐ 1,568
Interact, analyze and structure massive text, image, embedding, audio and video datasets
🔗 atlas.nomic.ai
chrismattmann/tika-python ⭐ 1,561
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
intellabs/fastRAG ⭐ 1,485
Efficient Retrieval Augmentation and Generation Framework
dmmiller612/bert-extractive-summarizer ⭐ 1,420
Easy to use extractive text summarization with BERT
gunthercox/chatterbot-corpus ⭐ 1,389
A multilingual dialog corpus
🔗 corpus.chatterbot.us
jonasgeiping/cramming ⭐ 1,323
Cramming the training of a (BERT-type) language model into limited compute.
answerdotai/ModernBERT ⭐ 1,279
Bringing BERT into modernity via both architecture changes and scaling
🔗 arxiv.org/abs/2412.13663
pemistahl/lingua-py ⭐ 1,276
The most accurate natural language detection library for Python, suitable for short text and mixed-language text
openai/grade-school-math ⭐ 1,225
GSM8K, a dataset of 8.5K high quality linguistically diverse grade school math word problems
abertsch72/unlimiformer ⭐ 1,059
Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
unitaryai/detoxify ⭐ 1,021
Toxic Comment Classification with Pytorch Lightning and Transformers
🔗 www.unitary.ai
norskregnesentral/skweak ⭐ 923
skweak: A software toolkit for weak supervision applied to NLP tasks
keras-team/keras-hub ⭐ 859
Pretrained model hub for Keras 3.
🔗 keras.io/keras_hub
explosion/spacy-streamlit ⭐ 828
👑 spaCy building blocks and visualizers for Streamlit apps
🔗 share.streamlit.io/ines/spacy-streamlit-demo/master/app.py
paddlepaddle/RocketQA ⭐ 776
🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.
maartengr/PolyFuzz ⭐ 752
Performs fuzzy string matching, string grouping, and contains extensive evaluation functions. PolyFuzz is meant to bring fuzzy string matching techniques together within a single framework.
🔗 maartengr.github.io/polyfuzz
webis-de/small-text ⭐ 608
Small-Text provides state-of-the-art Active Learning for Text Classification. Several pre-implemented Query Strategies, Initialization Strategies, and Stopping Criteria are provided, which can be easily mixed and matched to build active learning experiments or applications.
🔗 small-text.readthedocs.io
babelscape/rebel ⭐ 518
REBEL is a seq2seq model that simplifies Relation Extraction (EMNLP 2021).
google-research/byt5 ⭐ 501
ByT5 is a tokenizer-free extension of the mT5 model.

Packaging

Python packaging, dependency management and bundling.

astral-sh/uv ⭐ 44,762
An extremely fast Python package installer and resolver, written in Rust. Designed as a drop-in replacement for pip and pip-compile.
🔗 docs.astral.sh/uv
pyenv/pyenv ⭐ 41,260
pyenv lets you easily switch between multiple versions of Python.
python-poetry/poetry ⭐ 32,828
Python packaging and dependency management made easy
🔗 python-poetry.org
pypa/pipenv ⭐ 25,004
A virtualenv management tool that supports a multitude of systems and nicely bridges the gaps between pip, python and virtualenv.
🔗 pipenv.pypa.io
mitsuhiko/rye ⭐ 14,048
A Hassle-Free Python Experience
🔗 rye.astral.sh
pyinstaller/pyinstaller ⭐ 12,219
Freeze (package) Python programs into stand-alone executables
🔗 www.pyinstaller.org
pypa/pipx ⭐ 11,271
Install and Run Python Applications in Isolated Environments
🔗 pipx.pypa.io
pdm-project/pdm ⭐ 8,164
A modern Python package and dependency manager supporting the latest PEP standards
🔗 pdm-project.org
jazzband/pip-tools ⭐ 7,862
A set of tools to keep your pinned Python dependencies fresh (pip-compile + pip-sync)
🔗 pip-tools.rtfd.io
conda-forge/miniforge ⭐ 7,274
A conda-forge distribution.
🔗 conda-forge.org/download
mamba-org/mamba ⭐ 7,209
The Fast Cross-Platform Package Manager: mamba is a reimplementation of the conda package manager in C++
🔗 mamba.readthedocs.io
conda/conda ⭐ 6,738
A system-level, binary package and environment manager running on all major operating systems and platforms.
🔗 docs.conda.io/projects/conda
pypa/hatch ⭐ 6,423
Modern, extensible Python project management
🔗 hatch.pypa.io/latest
indygreg/PyOxidizer ⭐ 5,722
A modern Python application packaging and distribution tool
pypa/virtualenv ⭐ 4,883
A tool to create isolated Python environments. Since Python 3.3, a subset of it has been integrated into the standard lib venv module.
🔗 virtualenv.pypa.io
spack/spack ⭐ 4,589
A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
🔗 spack.io
prefix-dev/pixi ⭐ 4,031
pixi is a cross-platform, multi-language package manager and workflow tool built on the foundation of the conda ecosystem.
🔗 pixi.sh
pantsbuild/pex ⭐ 3,805
A tool for generating .pex (Python EXecutable) files, lock files and venvs.
🔗 docs.pex-tool.org
beeware/briefcase ⭐ 2,819
Tools to support converting a Python project into a standalone native application.
🔗 briefcase.readthedocs.io
pypa/flit ⭐ 2,197
Simplified packaging of Python modules
🔗 flit.pypa.io
linkedin/shiv ⭐ 1,812
shiv is a command line utility for building fully self contained Python zipapps as outlined in PEP 441, but with all their dependencies included.
marcelotduarte/cx_Freeze ⭐ 1,426
cx_Freeze creates standalone executables from Python scripts with the same performance. It is cross-platform and should work on any platform that Python itself works on.
🔗 marcelotduarte.github.io/cx_freeze
ofek/pyapp ⭐ 1,352
Runtime installer for Python applications
🔗 ofek.dev/pyapp
pypa/gh-action-pypi-publish ⭐ 1,004
The blessed GitHub Action, for publishing your 📦 distribution files to PyPI, the tokenless way: https://github.com/marketplace/actions/pypi-publish
🔗 packaging.python.org/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows
py2exe/py2exe ⭐ 918
Create standalone Windows programs from Python code
🔗 www.py2exe.org
prefix-dev/rip ⭐ 658
RIP is a library that allows the resolving and installing of Python PyPI packages from Rust into a virtual environment. It's based on our experience with building Rattler and aims to provide the same experience but for PyPI instead of Conda.
🔗 prefix.dev
snok/install-poetry ⭐ 610
GitHub Action for installing and configuring Poetry
python-poetry/install.python-poetry.org ⭐ 225
The official Poetry installation script
🔗 install.python-poetry.org

Pandas

Pandas and dataframe libraries: data analysis, statistical reporting, pandas GUIs, pandas performance optimisations.

pandas-dev/pandas ⭐ 44,879
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
🔗 pandas.pydata.org
pola-rs/polars ⭐ 32,428
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
🔗 docs.pola.rs
duckdb/duckdb ⭐ 27,694
DuckDB is an analytical in-process SQL database management system
🔗 www.duckdb.org
gventuri/pandas-ai ⭐ 18,148
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
🔗 getpanda.ai
kanaries/pygwalker ⭐ 14,429
PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis
🔗 kanaries.net/pygwalker
ydataai/ydata-profiling ⭐ 12,797
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
🔗 docs.sdk.ydata.ai
rapidsai/cudf ⭐ 8,767
cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data
🔗 docs.rapids.ai/api/cudf/stable
deepseek-ai/smallpond ⭐ 4,269
A lightweight data processing framework built on DuckDB and 3FS.
aws/aws-sdk-pandas ⭐ 3,989
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
🔗 aws-sdk-pandas.readthedocs.io
nalepae/pandarallel ⭐ 3,730
A simple and efficient tool to parallelize Pandas operations on all available CPUs
🔗 nalepae.github.io/pandarallel
unionai-oss/pandera ⭐ 3,686
A light-weight, flexible, and expressive statistical data testing library
🔗 www.union.ai/pandera
adamerose/PandasGUI ⭐ 3,222
A GUI for Pandas DataFrames
blaze/blaze ⭐ 3,196
NumPy and Pandas interface to Big Data
🔗 blaze.pydata.org
pydata/pandas-datareader ⭐ 3,013
Extract data from a wide range of Internet sources into a pandas DataFrame.
🔗 pydata.github.io/pandas-datareader/stable/index.html
scikit-learn-contrib/sklearn-pandas ⭐ 2,824
Pandas integration with sklearn
delta-io/delta-rs ⭐ 2,621
A native Rust library for Delta Lake, with bindings into Python
🔗 delta-io.github.io/delta-rs
eventual-inc/Daft ⭐ 2,616
Distributed data engine for Python/SQL designed for the cloud, powered by Rust
🔗 getdaft.io
jmcarpenter2/swifter ⭐ 2,587
A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner
fugue-project/fugue ⭐ 2,053
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
🔗 fugue-tutorials.readthedocs.io
pyjanitor-devs/pyjanitor ⭐ 1,402
Clean APIs for data cleaning. Python implementation of R package Janitor
🔗 pyjanitor-devs.github.io/pyjanitor
holoviz/hvplot ⭐ 1,178
A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
🔗 hvplot.holoviz.org
machow/siuba ⭐ 1,169
Python library for using dplyr like syntax with pandas and SQL
🔗 siuba.org
renumics/spotlight ⭐ 1,155
Interactively explore unstructured datasets from your dataframe.
🔗 renumics.com
tkrabel/bamboolib ⭐ 945
bamboolib - a GUI for pandas DataFrames
🔗 bamboolib.com
mwouts/itables ⭐ 842
This package changes how Pandas and Polars DataFrames are rendered in Jupyter Notebooks. With itables you can display your tables as interactive DataTables that you can sort, paginate, scroll or filter.
🔗 mwouts.github.io/itables

Performance

Performance, parallelisation and low level libraries.

celery/celery ⭐ 25,832
Distributed Task Queue (development branch)
🔗 docs.celeryq.dev
google/flatbuffers ⭐ 23,923
FlatBuffers: Memory Efficient Serialization Library
🔗 flatbuffers.dev
pybind/pybind11 ⭐ 16,325
Seamless operability between C++11 and Python
🔗 pybind11.readthedocs.io
exaloop/codon ⭐ 15,498
A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support
🔗 docs.exaloop.io/codon
dask/dask ⭐ 13,033
Parallel computing with task scheduling
🔗 dask.org
numba/numba ⭐ 10,290
NumPy aware dynamic Python compiler using LLVM
🔗 numba.pydata.org
modin-project/modin ⭐ 10,060
Modin: Scale your Pandas workflows by changing a single line of code
🔗 modin.readthedocs.io
nebuly-ai/optimate ⭐ 8,372
A collection of libraries to optimise AI model performance
🔗 www.nebuly.com
vaexio/vaex ⭐ 8,356
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
🔗 vaex.io
mher/flower ⭐ 6,648
Real-time monitor and web admin for Celery distributed task queue
🔗 flower.readthedocs.io
python-trio/trio ⭐ 6,391
Trio – a friendly Python library for async concurrency and I/O
🔗 trio.readthedocs.io
ultrajson/ultrajson ⭐ 4,385
Ultra fast JSON decoder and encoder written in C with Python bindings
🔗 pypi.org/project/ujson
tlkh/asitop ⭐ 3,921
Perf monitoring CLI tool for Apple Silicon
🔗 tlkh.github.io/asitop
airtai/faststream ⭐ 3,590
FastStream is a powerful and easy-to-use Python framework for building asynchronous services interacting with event streams such as Apache Kafka, RabbitMQ, NATS and Redis.
🔗 faststream.airt.ai/latest
facebookincubator/cinder ⭐ 3,584
Cinder is Meta's internal performance-oriented production version of CPython.
🔗 trycinder.com
ipython/ipyparallel ⭐ 2,606
IPython Parallel: Interactive Parallel Computing in Python
🔗 ipyparallel.readthedocs.io
intel/intel-extension-for-transformers ⭐ 2,161
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
h5py/h5py ⭐ 2,118
HDF5 for Python -- The h5py package is a Pythonic interface to the HDF5 binary data format.
🔗 www.h5py.org
agronholm/anyio ⭐ 1,942
High level asynchronous concurrency and networking framework that works on top of either trio or asyncio
tiangolo/asyncer ⭐ 1,837
Asyncer, async and await, focused on developer experience.
🔗 asyncer.tiangolo.com
intel/intel-extension-for-pytorch ⭐ 1,785
A Python package that extends the official PyTorch to easily improve performance on Intel platforms
faster-cpython/ideas ⭐ 1,708
Discussion and work tracker for Faster CPython project.
dask/distributed ⭐ 1,611
A distributed task scheduler for Dask
🔗 distributed.dask.org
nschloe/perfplot ⭐ 1,369
📈 Performance analysis for Python snippets
intel/scikit-learn-intelex ⭐ 1,257
Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application
🔗 uxlfoundation.github.io/scikit-learn-intelex
markshannon/faster-cpython ⭐ 946
How to make CPython faster.
zerointensity/pointers.py ⭐ 922
Bringing the hell of pointers to Python.
🔗 pointers.zintensity.dev
brandtbucher/specialist ⭐ 649
Visualize CPython's specializing, adaptive interpreter. 🔥

Profiling

Memory and CPU/GPU profiling tools and libraries.

bloomberg/memray ⭐ 13,776
Memray is a memory profiler for Python
🔗 bloomberg.github.io/memray
benfred/py-spy ⭐ 13,394
Sampling profiler for Python programs
plasma-umass/scalene ⭐ 12,525
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
joerick/pyinstrument ⭐ 6,920
🚴 Call stack profiler for Python. Shows you why your code is slow!
🔗 pyinstrument.readthedocs.io
gaogaotiantian/viztracer ⭐ 6,221
A debugging and profiling tool that can trace and visualize python code execution
🔗 viztracer.readthedocs.io
pythonprofilers/memory_profiler ⭐ 4,454
Monitor Memory usage of Python code
🔗 pypi.python.org/pypi/memory_profiler
pyutils/line_profiler ⭐ 2,892
Line-by-line profiling for Python
reloadware/reloadium ⭐ 2,871
Hot Reloading and Profiling for Python
jiffyclub/snakeviz ⭐ 2,406
An in-browser Python profile viewer
🔗 jiffyclub.github.io/snakeviz
p403n1x87/austin ⭐ 2,019
Python frame stack sampler for CPython
🔗 pypi.org/project/austin-dist
pythonspeed/filprofiler ⭐ 863
A Python memory profiler for data processing and scientific computing applications
🔗 pythonspeed.com/products/filmemoryprofiler

Security

Security related libraries: vulnerability discovery, SQL injection, environment auditing.

swisskyrepo/PayloadsAllTheThings ⭐ 63,955
A list of useful payloads and bypasses for Web Application Security and Pentest/CTF
🔗 swisskyrepo.github.io/payloadsallthethings
sqlmapproject/sqlmap ⭐ 33,605
Automatic SQL injection and database takeover tool
🔗 sqlmap.org
certbot/certbot ⭐ 32,021
Certbot is EFF's tool to obtain certs from Let's Encrypt and (optionally) auto-enable HTTPS on your server. It can also act as a client for any other CA that uses the ACME protocol.
aquasecurity/trivy ⭐ 25,031
Find vulnerabilities, misconfigurations, secrets, SBOM in containers, Kubernetes, code repositories, clouds and more
🔗 trivy.dev
bridgecrewio/checkov ⭐ 7,427
Checkov is a static code analysis tool for infrastructure as code (IaC) and also a software composition analysis (SCA) tool for images and open source packages.
🔗 www.checkov.io
nccgroup/ScoutSuite ⭐ 7,013
Multi-Cloud Security Auditing Tool
pycqa/bandit ⭐ 6,828
Bandit is a tool designed to find common security issues in Python code.
🔗 bandit.readthedocs.io
stamparm/maltrail ⭐ 6,809
Malicious traffic detection system
rhinosecuritylabs/pacu ⭐ 4,589
The AWS exploitation framework, designed for testing the security of Amazon Web Services environments.
🔗 rhinosecuritylabs.com/aws/pacu-open-source-aws-exploitation-framework
microsoft/presidio ⭐ 4,271
Context aware, pluggable and customizable PII de-identification service for text and images
🔗 microsoft.github.io/presidio
dashingsoft/pyarmor ⭐ 4,070
A tool used to obfuscate Python scripts, bind obfuscated scripts to a fixed machine, or expire obfuscated scripts.
🔗 pyarmor.dashingsoft.com
mozilla/bleach ⭐ 2,680
Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes
🔗 bleach.readthedocs.io/en/latest
pyupio/safety ⭐ 1,812
Safety checks Python dependencies for known security vulnerabilities and suggests the proper remediations for vulnerabilities detected.
🔗 safetycli.com/product/safety-cli
trailofbits/pip-audit ⭐ 1,017
Audits Python environments, requirements files and dependency trees for known security vulnerabilities, and can automatically fix them
🔗 pypi.org/project/pip-audit
fadi002/de4py ⭐ 886
Toolkit for Python reverse engineering
🔗 de4py.rf.gd
thecyb3ralpha/BobTheSmuggler ⭐ 530
A tool that leverages HTML Smuggling Attack and allows you to create HTML files with embedded 7z/zip archives.

Simulation

Simulation libraries: robotics, economic, agent-based, traffic, physics, astronomy, chemistry, quantum simulation. Also see the Maths and Science category for crossover.

atsushisakai/PythonRobotics ⭐ 24,519
Python sample codes and textbook for robotics algorithms.
🔗 atsushisakai.github.io/pythonrobotics
genesis-embodied-ai/Genesis ⭐ 24,419
Genesis is a physics platform, and generative data engine, designed for general purpose Robotics/Embodied AI/Physical AI applications
bulletphysics/bullet3 ⭐ 13,137
Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.
🔗 bulletphysics.org
isl-org/Open3D ⭐ 12,064
Open3D: A Modern Library for 3D Data Processing
🔗 www.open3d.org
dlr-rm/stable-baselines3 ⭐ 10,128
Stable Baselines3 (SB3) is a set of reliable implementations of reinforcement learning algorithms in PyTorch
🔗 stable-baselines3.readthedocs.io
nvidia/Cosmos ⭐ 7,698
NVIDIA Cosmos is a developer-first world foundation model platform designed to help Physical AI developers build their Physical AI systems better and faster.
qiskit/qiskit ⭐ 5,812
Qiskit is an open-source SDK for working with quantum computers at the level of extended quantum circuits, operators, and primitives.
🔗 www.ibm.com/quantum/qiskit
nvidia/warp ⭐ 4,626
A Python framework for high performance GPU simulation and graphics
🔗 nvidia.github.io/warp
astropy/astropy ⭐ 4,608
Astronomy and astrophysics core library
🔗 www.astropy.org
quantumlib/Cirq ⭐ 4,480
An open-source Python framework for creating, editing, and invoking Noisy Intermediate-Scale Quantum (NISQ) circuits.
🔗 quantumai.google/cirq
chakazul/Lenia ⭐ 3,594
Lenia is a 2D cellular automaton with continuous space, time and states. It produces a huge variety of interesting mathematical life forms
🔗 chakazul.github.io/lenia/javascript/lenia.html
nvidia-omniverse/IsaacLab ⭐ 3,059
Unified framework for robot learning built on NVIDIA Isaac Sim
🔗 isaac-sim.github.io/isaaclab
openai/mujoco-py ⭐ 2,945
MuJoCo is a physics engine for detailed, efficient rigid body simulations with contacts. mujoco-py allows using MuJoCo from Python 3.
projectmesa/mesa ⭐ 2,838
Mesa is an open-source Python library for agent-based modeling, ideal for simulating complex systems and exploring emergent behaviors.
🔗 mesa.readthedocs.io
rdkit/rdkit ⭐ 2,830
The official sources for the RDKit library
google/brax ⭐ 2,576
Massively parallel rigidbody physics simulation on accelerator hardware.
taichi-dev/difftaichi ⭐ 2,556
10 differentiable physical simulators built with Taichi differentiable programming (DiffTaichi, ICLR 2020)
nvidia-omniverse/IsaacGymEnvs ⭐ 2,301
Example RL environments for the NVIDIA Isaac Gym high performance environments
dlr-rm/rl-baselines3-zoo ⭐ 2,299
A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
🔗 rl-baselines3-zoo.readthedocs.io
facebookresearch/habitat-lab ⭐ 2,202
A modular high-level library to train embodied AI agents across a variety of tasks and environments.
🔗 aihabitat.org
quantecon/QuantEcon.py ⭐ 2,076
A community based Python library for quantitative economics
🔗 quantecon.org/quantecon-py
microsoft/PromptCraft-Robotics ⭐ 1,978
Community for applying LLMs to robotics and a robot simulator with ChatGPT integration
🔗 aka.ms/chatgpt-robotics
eloialonso/diamond ⭐ 1,759
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model
🔗 diamond-wm.github.io
deepmodeling/deepmd-kit ⭐ 1,602
A deep learning package for many-body potential energy representation and molecular dynamics
🔗 docs.deepmodeling.com/projects/deepmd
bowang-lab/scGPT ⭐ 1,159
scGPT: Towards Building a Foundation Model for Single-Cell Multi-omics Using Generative AI
🔗 scgpt.readthedocs.io/en/latest
sail-sg/envpool ⭐ 1,133
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
🔗 envpool.readthedocs.io
a-r-j/graphein ⭐ 1,073
Protein Graph Library
🔗 graphein.ai
viblo/pymunk ⭐ 963
Pymunk is an easy-to-use pythonic 2D physics library that can be used whenever you need 2D rigid body physics from Python
🔗 www.pymunk.org
google-deepmind/materials_discovery ⭐ 962
Graph Networks for Materials Science (GNoME) is a project centered around scaling machine learning methods to tackle materials science.
nvidia-omniverse/OmniIsaacGymEnvs ⭐ 944
Reinforcement Learning Environments for Omniverse Isaac Gym
altera-al/project-sid ⭐ 943
This repository contains our technical report: "Project Sid: Many-agent simulations toward AI civilization"
google/evojax ⭐ 889
EvoJAX is a scalable, general purpose, hardware-accelerated neuroevolution toolkit built on the JAX library
facebookresearch/fairo ⭐ 865
A modular embodied agent architecture and platform for building embodied agents
eureka-research/DrEureka ⭐ 847
Official Repository for "DrEureka: Language Model Guided Sim-To-Real Transfer" (RSS 2024)
🔗 eureka-research.github.io/dr-eureka
polymathicai/the_well ⭐ 777
15TB of Physics Simulations: collection of machine learning datasets containing numerical simulations of a wide variety of spatiotemporal physical systems.
🔗 polymathic-ai.org/the_well
ur-whitelab/chemcrow-public ⭐ 716
Chemcrow
ur-whitelab/chemcrow-runs ⭐ 81
ur-whitelab/chemcrow-runs

Study

Miscellaneous study resources: algorithms, general resources, system design, code repos for textbooks, best practices, tutorials.

thealgorithms/Python ⭐ 198,477
All Algorithms implemented in Python
🔗 thealgorithms.github.io/python
microsoft/generative-ai-for-beginners ⭐ 75,237
Learn the fundamentals of building Generative AI applications with our 21-lesson comprehensive course by Microsoft Cloud Advocates.
🔗 microsoft.github.io/generative-ai-for-beginners
labmlai/annotated_deep_learning_paper_implementations ⭐ 59,263
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
🔗 nn.labml.ai
mlabonne/llm-course ⭐ 48,162
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
🔗 mlabonne.github.io/blog
jakevdp/PythonDataScienceHandbook ⭐ 44,106
Python Data Science Handbook: full text in Jupyter Notebooks
🔗 jakevdp.github.io/pythondatasciencehandbook
rasbt/LLMs-from-scratch ⭐ 42,104
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
🔗 amzn.to/4fqvn0d
realpython/python-guide ⭐ 28,763
Python best practices guidebook, written for humans.
🔗 docs.python-guide.org
d2l-ai/d2l-en ⭐ 25,284
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
🔗 d2l.ai
christoschristofidis/awesome-deep-learning ⭐ 24,973
A curated list of awesome Deep Learning tutorials, projects and communities.
wesm/pydata-book ⭐ 22,866
Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media
hannibal046/Awesome-LLM ⭐ 22,140
Awesome-LLM: a curated list of Large Language Models
microsoft/recommenders ⭐ 19,933
Best Practices on Recommendation Systems
🔗 recommenders-team.github.io/recommenders/intro.html
fchollet/deep-learning-with-python-notebooks ⭐ 19,037
Jupyter notebooks for the code samples of the book "Deep Learning with Python"
huggingface/agents-course ⭐ 14,738
This repository contains the Hugging Face Agents Course.
graykode/nlp-tutorial ⭐ 14,501
Natural Language Processing Tutorial for Deep Learning Researchers
🔗 www.reddit.com/r/machinelearning/comments/amfinl/project_nlptutoral_repository_who_is_studying
naklecha/llama3-from-scratch ⭐ 14,271
llama3 implementation one matrix multiplication at a time
shangtongzhang/reinforcement-learning-an-introduction ⭐ 13,920
Python Implementation of Reinforcement Learning: An Introduction
karpathy/nn-zero-to-hero ⭐ 13,426
Neural Networks: Zero to Hero
mrdbourke/pytorch-deep-learning ⭐ 13,022
Materials for the Learn PyTorch for Deep Learning: Zero to Mastery course.
🔗 learnpytorch.io
eugeneyan/open-llms ⭐ 11,803
📋 A list of open LLMs available for commercial use.
karpathy/micrograd ⭐ 11,425
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
rucaibox/LLMSurvey ⭐ 11,214
The official GitHub page for the survey paper "A Survey of Large Language Models".
🔗 arxiv.org/abs/2303.18223
zhanymkanov/fastapi-best-practices ⭐ 10,810
FastAPI Best Practices and Conventions we used at our startup
srush/GPU-Puzzles ⭐ 10,731
Teaching beginner GPU programming in a completely interactive fashion
openai/spinningup ⭐ 10,680
An educational resource to help anyone learn deep reinforcement learning.
🔗 spinningup.openai.com
nielsrogge/Transformers-Tutorials ⭐ 10,178
This repository contains demos I made with the Transformers library by HuggingFace.
mooler0410/LLMsPracticalGuide ⭐ 9,773
A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers)
🔗 arxiv.org/abs/2304.13712v2
roboflow/notebooks ⭐ 7,395
This repository offers a comprehensive collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like YOLO11, RT-DETR, SAM 2, Florence-2, PaliGemma 2, and Qwen2.5VL.
🔗 roboflow.com/models
firmai/industry-machine-learning ⭐ 7,320
A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai)
🔗 www.sov.ai
udlbook/udlbook ⭐ 7,247
Understanding Deep Learning - Simon J.D. Prince
gkamradt/langchain-tutorials ⭐ 6,999
Overview and tutorial of the LangChain Library
neetcode-gh/leetcode ⭐ 5,920
Leetcode solutions for NeetCode.io
alirezadir/Machine-Learning-Interviews ⭐ 5,761
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
handsonllm/Hands-On-Large-Language-Models ⭐ 5,676
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
🔗 www.llm-book.com
huggingface/smol-course ⭐ 5,595
A practical course on aligning language models for your specific use case. It's a handy way to get started with aligning language models, because everything runs on most local machines.
mrdbourke/tensorflow-deep-learning ⭐ 5,478
All course materials for the Zero to Mastery Deep Learning with TensorFlow course.
🔗 dbourke.link/ztmtfcourse
udacity/deep-learning-v2-pytorch ⭐ 5,374
Projects and exercises for the latest Deep Learning ND program https://www.udacity.com/course/deep-learning-nanodegree--nd101
timofurrer/awesome-asyncio ⭐ 4,741
A curated list of awesome Python asyncio frameworks, libraries, software and resources
zotroneneis/machine_learning_basics ⭐ 4,357
Plain Python implementations of basic machine learning algorithms
promptslab/Awesome-Prompt-Engineering ⭐ 4,256
This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.
🔗 discord.gg/m88xfymbk6
huggingface/deep-rl-class ⭐ 4,169
This repo contains the Hugging Face Deep Reinforcement Learning Course.
rasbt/machine-learning-book ⭐ 4,044
Code Repository for Machine Learning with PyTorch and Scikit-Learn
🔗 sebastianraschka.com/books/#machine-learning-with-pytorch-and-scikit-learn
huggingface/diffusion-models-class ⭐ 3,920
Materials for the Hugging Face Diffusion Models Course
amanchadha/coursera-deep-learning-specialization ⭐ 3,532
Notes, programming assignments and quizzes from all courses within the Coursera Deep Learning specialization offered by deeplearning.ai: (i) Neural Networks and Deep Learning; (ii) Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization; (iii) Structuring Machine Learning Projects; (iv...
cosmicpython/book ⭐ 3,500
A Book about Pythonic Application Architecture Patterns for Managing Complexity. Cosmos is the Opposite of Chaos you see. O'R. wouldn't actually let us call it "Cosmic Python" tho.
🔗 www.cosmicpython.com
fluentpython/example-code-2e ⭐ 3,479
Example code for Fluent Python, 2nd edition (O'Reilly 2022)
🔗 amzn.to/3j48u2j
mrdbourke/zero-to-mastery-ml ⭐ 3,148
All course materials for the Zero to Mastery Machine Learning and Data Science course.
🔗 dbourke.link/ztmmlcourse
chiphuyen/aie-book ⭐ 2,961
Code for AI Engineering: Building Applications with Foundation Models (Chip Huyen 2025)
krzjoa/awesome-python-data-science ⭐ 2,732
Probably the best curated list of data science software in Python.
🔗 krzjoa.github.io/awesome-python-data-science
gerdm/prml ⭐ 2,281
Repository of notes, code and notebooks in Python for the book Pattern Recognition and Machine Learning by Christopher Bishop
cgpotts/cs224u ⭐ 2,139
Code for CS224u: Natural Language Understanding
cerlymarco/MEDIUM_NoteBook ⭐ 2,107
Repository containing notebooks of my posts on Medium
trananhkma/fucking-awesome-python ⭐ 1,994
awesome-python with ⭐ and 🍴
huggingface/cookbook ⭐ 1,920
Community-driven practical examples of building AI applications and solving various tasks with AI using open-source tools and models.
🔗 huggingface.co/learn/cookbook
chandlerbang/awesome-self-supervised-gnn ⭐ 1,650
Papers about pretraining and self-supervised learning on Graph Neural Networks (GNN).
atcold/NYU-DLSP21 ⭐ 1,597
NYU Deep Learning Spring 2021
🔗 atcold.github.io/nyu-dlsp21
patrickloeber/MLfromscratch ⭐ 1,436
Machine Learning algorithm implementations from scratch.
aburkov/theLMbook ⭐ 1,292
Code for Hundred-Page Language Models Book by Andriy Burkov
🔗 www.thelmbook.com
davidadsp/Generative_Deep_Learning_2nd_Edition ⭐ 1,227
The official code repository for the second edition of the O'Reilly book Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play.
🔗 www.oreilly.com/library/view/generative-deep-learning/9781098134174
rasbt/LLM-workshop-2024 ⭐ 892
A 4-hour coding workshop to understand how LLMs are implemented and used
jackhidary/quantumcomputingbook ⭐ 835
Companion site for the textbook Quantum Computing: An Applied Approach
bayesianmodelingandcomputationinpython/BookCode_Edition1 ⭐ 515
Bayesian Modeling and Computation in Python: open-access version of the text and the code examples in the book
🔗 www.bayesiancomputationbook.com
dylanhogg/awesome-python ⭐ 364
🐍 Hand-picked awesome Python libraries and frameworks, organised by category
🔗 www.awesomepython.org

Template

Template tools and libraries: cookiecutter repos, generators, quick-starts.

tiangolo/full-stack-fastapi-template ⭐ 31,177
Full stack, modern web application template. Using FastAPI, React, SQLModel, PostgreSQL, Docker, GitHub Actions, automatic HTTPS and more.
cookiecutter/cookiecutter ⭐ 23,219
A cross-platform command-line utility that creates projects from cookiecutters (project templates), e.g. Python package projects, C projects.
🔗 pypi.org/project/cookiecutter
drivendata/cookiecutter-data-science ⭐ 8,650
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
🔗 cookiecutter-data-science.drivendata.org
buuntu/fastapi-react ⭐ 2,327
🚀 Cookiecutter Template for FastAPI + React Projects. Using PostgreSQL, SQLAlchemy, and Docker
pyscaffold/pyscaffold ⭐ 2,170
🛠 Python project template generator with batteries included
🔗 pyscaffold.org
cjolowicz/cookiecutter-hypermodern-python ⭐ 1,846
Cookiecutter template for a Python package based on the Hypermodern Python article series.
🔗 cookiecutter-hypermodern-python.readthedocs.io
fmind/mlops-python-package ⭐ 1,186
Best practices designed to support your MLOps initiatives. You can use this package as part of your MLOps toolkit or platform, e.g. Model Registry, Experiment Tracking, Realtime Inference
🔗 fmind.github.io/mlops-python-package
tezromach/python-package-template ⭐ 1,088
🚀 Your next Python package needs a bleeding-edge project structure.
martinheinz/python-project-blueprint ⭐ 964
Blueprint/Boilerplate For Python Projects
callmesora/llmops-python-package ⭐ 861
Best practices designed to support your LLMOps initiatives. You can use this package as part of your LLMOps toolkit or platform e.g. Model Registry, Experiment Tracking, Realtime Inference
fpgmaas/cookiecutter-uv ⭐ 734
A modern cookiecutter template for Python projects that use uv for dependency management
🔗 fpgmaas.github.io/cookiecutter-uv

Terminal

Terminal and console tools and libraries: CLI tools, terminal based formatters, progress bars.

willmcgugan/rich ⭐ 51,259
Rich is a Python library for rich text and beautiful formatting in the terminal.
🔗 rich.readthedocs.io/en/latest
tqdm/tqdm ⭐ 29,473
⚡ A Fast, Extensible Progress Bar for Python and CLI
🔗 tqdm.github.io
aider-ai/aider ⭐ 29,403
Aider lets you pair program with LLMs, to edit code in your local git repository
🔗 aider.chat
willmcgugan/textual ⭐ 27,825
The lean application framework for Python. Build sophisticated user interfaces with a simple Python API. Run your apps in the terminal and a web browser.
🔗 textual.textualize.io
google/python-fire ⭐ 27,471
Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.
tiangolo/typer ⭐ 16,625
Typer, build great CLIs. Easy to code. Based on Python type hints.
🔗 typer.tiangolo.com
pallets/click ⭐ 16,152
Python composable command line interface toolkit
🔗 click.palletsprojects.com
prompt-toolkit/python-prompt-toolkit ⭐ 9,592
Library for building powerful interactive command line applications in Python
🔗 python-prompt-toolkit.readthedocs.io
saulpw/visidata ⭐ 8,103
A terminal spreadsheet multitool for discovering and arranging data
🔗 visidata.org
simonw/llm ⭐ 6,609
A CLI utility and Python library for interacting with Large Language Models, both via remote APIs and models that can be installed and run on your own machine.
🔗 llm.datasette.io
anthropics/claude-code ⭐ 6,496
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows
🔗 docs.anthropic.com/s/claude-code
xxh/xxh ⭐ 5,539
🚀 Bring your favorite shell wherever you go through SSH. Xonsh shell, fish, zsh, osquery and so on.
tconbeer/harlequin ⭐ 4,329
The SQL IDE for Your Terminal.
🔗 harlequin.sh
manrajgrover/halo ⭐ 2,923
💫 Beautiful spinners for terminal, IPython and Jupyter
urwid/urwid ⭐ 2,870
Console user interface library for Python (official repo)
🔗 urwid.org
textualize/trogon ⭐ 2,588
Easily turn your Click CLI into a powerful terminal application
darrenburns/elia ⭐ 2,069
A snappy, keyboard-centric terminal user interface for interacting with large language models. Chat with ChatGPT, Claude, Llama 3, Phi 3, Mistral, Gemma and more.
tmbo/questionary ⭐ 1,680
Python library to build pretty command line user prompts ✨ Easy to use multi-select lists, confirmations, free text prompts ...
jazzband/prettytable ⭐ 1,473
Display tabular data in a visually appealing ASCII table format
🔗 pypi.org/project/prettytable
shobrook/wut ⭐ 1,280
Just type wut and an LLM will help you understand whatever's in your terminal. You'll be surprised how useful this can be.
1j01/textual-paint ⭐ 999
🎨 MS Paint in your terminal.
🔗 pypi.org/project/textual-paint

Testing

Testing libraries: unit testing, load testing, acceptance testing, code coverage, browser automation, plugins.

mitmproxy/mitmproxy ⭐ 38,370
An interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.
🔗 mitmproxy.org
locustio/locust ⭐ 25,800
Write scalable load tests in plain Python 🚗💨
🔗 locust.cloud
microsoft/playwright-python ⭐ 12,608
Python version of the Playwright testing and automation library.
🔗 playwright.dev/python
pytest-dev/pytest ⭐ 12,541
The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
🔗 pytest.org
robotframework/robotframework ⭐ 10,425
Generic automation framework for acceptance testing and RPA
🔗 robotframework.org
seleniumbase/SeleniumBase ⭐ 9,568
Python APIs for web automation, testing, and bypassing bot-detection.
🔗 seleniumbase.io
getmoto/moto ⭐ 7,812
A library that allows you to easily mock out tests based on AWS infrastructure.
🔗 docs.getmoto.org/en/latest
hypothesisworks/hypothesis ⭐ 7,752
Hypothesis is a powerful, flexible, and easy to use library for property-based testing.
🔗 hypothesis.works
newsapps/beeswithmachineguns ⭐ 6,463
A utility for arming (creating) many bees (micro EC2 instances) to attack (load test) targets (web applications).
🔗 apps.chicagotribune.com
confident-ai/deepeval ⭐ 5,608
The LLM Evaluation Framework
🔗 docs.confident-ai.com
codium-ai/qodo-cover ⭐ 4,896
Qodo-Cover: An AI-Powered Tool for Automated Test Generation and Code Coverage Enhancement! 💻🤖🧪🐞
🔗 qodo.ai
spulec/freezegun ⭐ 4,295
Let your Python tests travel through time
getsentry/responses ⭐ 4,223
A utility for mocking out the Python Requests library.
tox-dev/tox ⭐ 3,771
Command line driven CI frontend and development task automation tool.
🔗 tox.wiki
behave/behave ⭐ 3,275
BDD, Python style.
🔗 behave.readthedocs.io/en/latest
nedbat/coveragepy ⭐ 3,104
The code coverage tool for Python
🔗 coverage.readthedocs.io
kevin1024/vcrpy ⭐ 2,764
Automatically mock your HTTP interactions to simplify and speed up testing
cobrateam/splinter ⭐ 2,742
splinter - python test framework for web applications
🔗 splinter.readthedocs.org/en/stable/index.html
pytest-dev/pytest-testinfra ⭐ 2,400
With Testinfra you can write unit tests in Python to test the actual state of your servers configured by management tools like Salt, Ansible, Puppet, Chef and so on.
🔗 testinfra.readthedocs.io
pytest-dev/pytest-mock ⭐ 1,909
Thin wrapper around the mock package for easier use with pytest
🔗 pytest-mock.readthedocs.io/en/latest
pytest-dev/pytest-cov ⭐ 1,832
Coverage plugin for pytest.
pytest-dev/pytest-xdist ⭐ 1,556
pytest plugin for distributed testing and loop-on-failures testing modes.
🔗 pytest-xdist.readthedocs.io
pytest-dev/pytest-asyncio ⭐ 1,479
Asyncio support for pytest
🔗 pytest-asyncio.readthedocs.io
taverntesting/tavern ⭐ 1,061
A command-line tool, Python library and Pytest plugin for automated testing of RESTful APIs, with a simple, concise and flexible YAML-based syntax
🔗 taverntesting.github.io
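
As a point of reference for the unit-testing and mocking tools listed above, here is a minimal sketch that uses only the standard library (`unittest` and `unittest.mock`) rather than any of the listed projects. The function `fetch_status`, the test class, and the helper `run_tests` are hypothetical names invented for illustration, and the mocked `client` stands in for whatever HTTP client a real project would use.

```python
import unittest
from unittest import mock


def fetch_status(client, url):
    """Hypothetical function under test: 'up' when the client reports HTTP 200."""
    response = client.get(url)
    return "up" if response.status_code == 200 else "down"


class FetchStatusTest(unittest.TestCase):
    def test_reports_up_on_200(self):
        # The mock replaces a real HTTP client, so no network access is needed.
        client = mock.Mock()
        client.get.return_value.status_code = 200
        self.assertEqual(fetch_status(client, "https://example.com"), "up")
        client.get.assert_called_once_with("https://example.com")

    def test_reports_down_otherwise(self):
        client = mock.Mock()
        client.get.return_value.status_code = 503
        self.assertEqual(fetch_status(client, "https://example.com"), "down")


def run_tests():
    """Run the test case programmatically and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FetchStatusTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` executes both tests in-process. A runner such as pytest would normally discover a `unittest.TestCase` class like this one without the helper.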

Machine Learning - Time Series

Machine learning and classical time series libraries: forecasting, seasonality, anomaly detection, econometrics.

facebook/prophet ⭐ 18,986
Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
🔗 facebook.github.io/prophet
blue-yonder/tsfresh ⭐ 8,653
Automatic extraction of relevant features from time series
🔗 tsfresh.readthedocs.io
unit8co/darts ⭐ 8,431
A Python library for user-friendly forecasting and anomaly detection on time series.
🔗 unit8co.github.io/darts
sktime/sktime ⭐ 8,261
A unified framework for machine learning with time series
🔗 www.sktime.net
facebookresearch/Kats ⭐ 5,863
Kats is a kit to analyze time series data: a lightweight, easy-to-use, generalizable, and extendable framework to perform time series analysis, from understanding the key statistics and characteristics, to detecting change points and anomalies, to forecasting future trends.
awslabs/gluonts ⭐ 4,810
Probabilistic time series modeling in Python
🔗 ts.gluon.ai
google-research/timesfm ⭐ 4,474
TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
🔗 research.google/blog/a-decoder-only-foundation-model-for-time-series-forecasting
salesforce/Merlion ⭐ 4,216
Merlion: A Machine Learning Framework for Time Series Intelligence
nixtla/statsforecast ⭐ 4,186
Lightning ⚡️ fast forecasting with statistical and econometric models.
🔗 nixtlaverse.nixtla.io/statsforecast
tdameritrade/stumpy ⭐ 3,820
STUMPY is a powerful and scalable Python library for modern time series analysis
🔗 stumpy.readthedocs.io/en/latest
amazon-science/chronos-forecasting ⭐ 3,058
Chronos: Pretrained Models for Probabilistic Time Series Forecasting
🔗 arxiv.org/abs/2403.07815
aistream-peelout/flow-forecast ⭐ 2,148
Deep learning PyTorch library for time series forecasting, classification, and anomaly detection (originally for flood forecasting).
🔗 flow-forecast.atlassian.net/wiki/spaces/ff/overview
rjt1990/pyflux ⭐ 2,120
Open source time series library for Python
uber/orbit ⭐ 1,958
A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.
🔗 orbit-ml.readthedocs.io/en/stable
alkaline-ml/pmdarima ⭐ 1,627
A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
🔗 www.alkaline-ml.com/pmdarima
time-series-foundation-models/lag-llama ⭐ 1,390
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting
bashtage/arch ⭐ 1,380
ARCH models in Python
🔗 bashtage.github.io/arch
winedarksea/AutoTS ⭐ 1,223
Automated Time Series Forecasting
autoviml/Auto_TS ⭐ 748
Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost Models on Time Series data sets with a Single Line of Code. Created by Ram Seshadri. Collaborators welcome.
google/temporian ⭐ 689
Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖
🔗 temporian.readthedocs.io

Typing

Typing libraries: static and run-time type checking, annotations.

python/mypy ⭐ 19,076
Optional static typing for Python
🔗 www.mypy-lang.org
microsoft/pyright ⭐ 13,998
Static Type Checker for Python
facebook/pyre-check ⭐ 6,955
Performant type-checking for Python.
🔗 pyre-check.org
python-attrs/attrs ⭐ 5,434
Python Classes Without Boilerplate
🔗 www.attrs.org
instagram/MonkeyType ⭐ 4,866
A Python library that generates static type annotations by collecting runtime types
google/pytype ⭐ 4,854
A static type analyzer for Python code
🔗 google.github.io/pytype
python/typeshed ⭐ 4,540
Collection of library stubs for Python, with static types
koxudaxi/datamodel-code-generator ⭐ 3,021
Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.
🔗 koxudaxi.github.io/datamodel-code-generator
mtshiba/pylyzer ⭐ 2,713
A fast, feature-rich static code analyzer & language server for Python
🔗 mtshiba.github.io/pylyzer
microsoft/pylance-release ⭐ 1,754
Fast, feature-rich language support for Python. Documentation and issues for Pylance.
agronholm/typeguard ⭐ 1,622
Run-time type checker for Python
patrick-kidger/torchtyping ⭐ 1,420
Type annotations and dynamic checking for a tensor's shape, dtype, names, etc.
robertcraigie/pyright-python ⭐ 207
Python command line wrapper for pyright, a static type checker
🔗 pypi.org/project/pyright
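
The category description distinguishes static from run-time type checking. A static checker reads annotations without running the code, while a run-time checker inspects actual values at call time. The following is a minimal standard-library sketch of the run-time side. It is not how typeguard or any other listed project is implemented; `check_types` and `repeat` are hypothetical names, and the decorator deliberately handles only plain classes such as `str` and `int`.

```python
import inspect
from functools import wraps
from typing import get_type_hints


def check_types(func):
    """Hypothetical decorator: verify simple annotations when the function is called."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only ordinary classes (metaclass `type`) are checked here;
            # generics such as list[int] and unannotated parameters are skipped.
            if type(expected) is type and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@check_types
def repeat(text: str, times: int) -> str:
    return text * times
```

With this sketch, `repeat("ab", 2)` returns normally, while `repeat("ab", "2")` raises `TypeError` at call time. A static checker would instead be expected to flag the second call before the program runs.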

Utility

General utility libraries: miscellaneous tools, linters, code formatters, version management, package tools, documentation tools.

yt-dlp/yt-dlp ⭐ 104,461
A feature-rich command-line audio/video downloader
🔗 discord.gg/h5mncfw63r
home-assistant/core ⭐ 77,157
🏡 Open source home automation that puts local control and privacy first.
🔗 www.home-assistant.io
abi/screenshot-to-code ⭐ 69,099
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
🔗 screenshottocode.com
python/cpython ⭐ 65,822
The Python programming language
🔗 www.python.org
localstack/localstack ⭐ 58,067
💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline
🔗 localstack.cloud
faif/python-patterns ⭐ 41,081
A collection of design patterns/idioms in Python
mingrammer/diagrams ⭐ 40,424
🎨 Diagram as Code for prototyping cloud system architectures
🔗 diagrams.mingrammer.com
ggerganov/whisper.cpp ⭐ 38,579
Port of OpenAI's Whisper model in C/C++
aider-ai/aider ⭐ 29,403
Aider is a command line tool that lets you pair program with LLMs, to edit code stored in your local git repository
🔗 aider.chat
openai/openai-python ⭐ 25,909
The official Python library for the OpenAI API
🔗 pypi.org/project/openai
keon/algorithms ⭐ 24,399
Minimal examples of data structures and algorithms in Python
norvig/pytudes ⭐ 23,326
Python programs, usually short, of considerable difficulty, to perfect particular skills.
pydantic/pydantic ⭐ 22,861
Data validation using Python type hints
🔗 docs.pydantic.dev
squidfunk/mkdocs-material ⭐ 22,514
Documentation that simply works
🔗 squidfunk.github.io/mkdocs-material
facebookresearch/audiocraft ⭐ 21,668
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
blakeblackshear/frigate ⭐ 21,603
NVR with realtime local object detection for IP cameras
🔗 frigate.video
delgan/loguru ⭐ 21,095
Python logging made (stupidly) simple
chriskiehl/Gooey ⭐ 21,027
Turn (almost) any Python command line program into a full GUI application with one line
mkdocs/mkdocs ⭐ 20,094
Project documentation with Markdown.
🔗 www.mkdocs.org
micropython/micropython ⭐ 20,035
MicroPython - a lean and efficient Python implementation for microcontrollers and constrained systems
🔗 micropython.org
rustpython/RustPython ⭐ 19,778
A Python Interpreter written in Rust
🔗 rustpython.github.io
higherorderco/Bend ⭐ 18,511
A massively parallel, high-level programming language
🔗 higherorderco.com
kivy/kivy ⭐ 18,111
Open source UI framework written in Python, running on Windows, Linux, macOS, Android and iOS
🔗 kivy.org
ipython/ipython ⭐ 16,426
Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
🔗 ipython.readthedocs.org
alievk/avatarify-python ⭐ 16,409
Avatars for Zoom, Skype and other video-conferencing apps.
openai/triton ⭐ 14,906
Development repository for the Triton language and compiler
🔗 triton-lang.org
google/brotli ⭐ 13,911
Brotli is a generic-purpose lossless compression algorithm that compresses data using a combination of a modern variant of the LZ77 algorithm, Huffman coding and 2nd order context modeling
pyo3/pyo3 ⭐ 13,244
Rust bindings for the Python interpreter
🔗 pyo3.rs
zulko/moviepy ⭐ 13,157
Video editing with Python
🔗 zulko.github.io/moviepy
caronc/apprise ⭐ 12,885
Apprise - Push Notifications that work with just about every platform!
🔗 hub.docker.com/r/caronc/apprise
pyodide/pyodide ⭐ 12,851
Pyodide is a Python distribution for the browser and Node.js based on WebAssembly
🔗 pyodide.org/en/stable
nuitka/Nuitka ⭐ 12,797
Nuitka is a Python compiler written in Python. It's fully compatible with Python 2.6, 2.7, 3.4-3.13. You feed it your Python app, it does a lot of clever things, and spits out an executable or extension module.
🔗 nuitka.net
pytube/pytube ⭐ 12,653
A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.
🔗 pytube.io
python-pillow/Pillow ⭐ 12,628
The Python Imaging Library adds image processing capabilities to Python (Pillow is the friendly PIL fork)
🔗 python-pillow.github.io
dbader/schedule ⭐ 12,010
Python job scheduling for humans.
🔗 schedule.readthedocs.io
ninja-build/ninja ⭐ 11,724
Ninja is a small build system with a focus on speed.
🔗 ninja-build.org
secdev/scapy ⭐ 11,169
Scapy: the Python-based interactive packet manipulation program & library.
🔗 scapy.net
asweigart/pyautogui ⭐ 11,102
A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.
magicstack/uvloop ⭐ 10,748
Ultra fast asyncio event loop.
pallets/jinja ⭐ 10,674
A very fast and expressive template engine.
🔗 jinja.palletsprojects.com
aristocratos/bpytop ⭐ 10,545
Linux/OSX/FreeBSD resource monitor
cython/cython ⭐ 9,847
The most widely used Python to C compiler
🔗 cython.org
aws/serverless-application-model ⭐ 9,430
The AWS Serverless Application Model (AWS SAM) transform is an AWS CloudFormation macro that transforms SAM templates into CloudFormation templates.
🔗 aws.amazon.com/serverless/sam
paramiko/paramiko ⭐ 9,288
The leading native Python SSHv2 protocol library.
🔗 paramiko.org
boto/boto3 ⭐ 9,240
AWS SDK for Python
🔗 aws.amazon.com/sdk-for-python
facebookresearch/hydra ⭐ 9,128
Hydra is a framework for elegantly configuring complex applications
🔗 hydra.cc
py-pdf/pypdf ⭐ 8,846
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
🔗 pypdf.readthedocs.io/en/latest
arrow-py/arrow ⭐ 8,819
🏹 Better dates & times for Python
🔗 arrow.readthedocs.io
xonsh/xonsh ⭐ 8,669
🐚 Python-powered shell. Full-featured and cross-platform.
🔗 xon.sh
eternnoir/pyTelegramBotAPI ⭐ 8,315
Python Telegram Bot API.
jasonppy/VoiceCraft ⭐ 8,189
Zero-Shot Speech Editing and Text-to-Speech in the Wild
kellyjonbrazil/jc ⭐ 8,093
CLI tool and Python library that converts the output of popular command-line tools, file-types, and common strings to JSON, YAML, or Dictionaries. This allows piping of output to tools like jq and simplifying automation scripts.
googleapis/google-api-python-client ⭐ 8,067
🐍 The official Python client library for Google's discovery based APIs.
🔗 googleapis.github.io/google-api-python-client/docs
theskumar/python-dotenv ⭐ 7,965
Reads key-value pairs from a .env file and can set them as environment variables. It helps in developing applications following the 12-factor principles.
🔗 saurabh-kumar.com/python-dotenv
icloud-photos-downloader/icloud_photos_downloader ⭐ 7,827
A command-line tool to download photos from iCloud
googlecloudplatform/python-docs-samples ⭐ 7,616
Code samples used on cloud.google.com
google/latexify_py ⭐ 7,446
A library to generate LaTeX expressions from Python code.
pygithub/PyGithub ⭐ 7,247
Typed interactions with the GitHub API v3
🔗 pygithub.readthedocs.io
jd/tenacity ⭐ 7,201
Retrying library for Python
🔗 tenacity.readthedocs.io
marshmallow-code/marshmallow ⭐ 7,113
A lightweight library for converting complex objects to and from simple Python datatypes.
🔗 marshmallow.readthedocs.io
bndr/pipreqs ⭐ 7,082
pipreqs - Generate pip requirements.txt file based on imports of any project. Looking for maintainers to move this project forward.
pyca/cryptography ⭐ 6,942
cryptography is a package designed to expose cryptographic primitives and recipes to Python developers.
🔗 cryptography.io
sphinx-doc/sphinx ⭐ 6,919
The Sphinx documentation generator
🔗 www.sphinx-doc.org
hugapi/hug ⭐ 6,875
Embrace the APIs of the future. Hug aims to make developing APIs as simple as possible, but no simpler.
timdettmers/bitsandbytes ⭐ 6,815
Accessible large language models via k-bit quantization for PyTorch.
🔗 huggingface.co/docs/bitsandbytes/main/en/index
gorakhargosh/watchdog ⭐ 6,797
Python library and shell utilities to monitor filesystem events.
🔗 packages.python.org/watchdog
ijl/orjson ⭐ 6,691
Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy
openai/point-e ⭐ 6,664
Point cloud diffusion for 3D model synthesis
agronholm/apscheduler ⭐ 6,578
Task scheduling library for Python
sdispater/pendulum ⭐ 6,383
Python datetimes made easy
🔗 pendulum.eustace.io
pdfminer/pdfminer.six ⭐ 6,289
Community maintained fork of pdfminer - we fathom PDF
🔗 pdfminersix.readthedocs.io
scikit-image/scikit-image ⭐ 6,200
Image processing in Python
🔗 scikit-image.org
wireservice/csvkit ⭐ 6,109
A suite of utilities for converting to and working with CSV, the king of tabular file formats.
🔗 csvkit.readthedocs.io
pytransitions/transitions ⭐ 5,956
A lightweight, object-oriented finite state machine implementation in Python with many extensions
rsalmei/alive-progress ⭐ 5,734
A new kind of Progress Bar, with real-time throughput, ETA, and very cool animations!
comet-ml/opik ⭐ 5,606
Opik is an open-source platform for evaluating, testing and monitoring LLM applications.
🔗 www.comet.com/docs/opik
traceloop/openllmetry ⭐ 5,537
Open-source observability for your LLM application, based on OpenTelemetry
🔗 www.traceloop.com/openllmetry
spotify/pedalboard ⭐ 5,421
🎛 🔊 A Python library for audio.
🔗 spotify.github.io/pedalboard
buildbot/buildbot ⭐ 5,320
Python-based continuous integration testing framework; your pull requests are more than welcome!
🔗 www.buildbot.net
prompt-toolkit/ptpython ⭐ 5,283
A better Python REPL
pywinauto/pywinauto ⭐ 5,222
Windows GUI Automation with Python (based on text properties)
🔗 pywinauto.github.io
tebelorg/RPA-Python ⭐ 5,123
Python package for doing RPA
pycqa/pycodestyle ⭐ 5,079
Simple Python style checker in one Python file
🔗 pycodestyle.pycqa.org
pythonnet/pythonnet ⭐ 4,962
Python for .NET is a package that gives Python programmers nearly seamless integration with the .NET Common Language Runtime (CLR) and provides a powerful application scripting tool for .NET developers.
🔗 pythonnet.github.io
jorgebastida/awslogs ⭐ 4,896
AWS CloudWatch logs for Humans™
pytoolz/toolz ⭐ 4,814
A functional standard library for Python.
🔗 toolz.readthedocs.org
hhatto/autopep8 ⭐ 4,599
A tool that automatically formats Python code to conform to the PEP 8 style guide.
🔗 pypi.org/project/autopep8
bogdanp/dramatiq ⭐ 4,532
A fast and reliable background task processing library for Python 3.
🔗 dramatiq.io
ashleve/lightning-hydra-template ⭐ 4,512
PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡
pyinvoke/invoke ⭐ 4,508
Pythonic task management & command execution.
🔗 pyinvoke.org
pyo3/maturin ⭐ 4,346
Build and publish crates with pyo3, cffi and uniffi bindings as well as rust binaries as python packages
🔗 maturin.rs
blealtan/efficient-kan ⭐ 4,251
An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).
adafruit/circuitpython ⭐ 4,227
CircuitPython - a Python implementation for teaching coding with microcontrollers
🔗 circuitpython.org
ets-labs/python-dependency-injector ⭐ 4,218
Dependency injection framework for Python
🔗 python-dependency-injector.ets-labs.org
evhub/coconut ⭐ 4,152
Coconut (coconut-lang.org) is a variant of Python that adds on top of Python syntax new features for simple, elegant, Pythonic functional programming.
🔗 coconut-lang.org
pyinfra-dev/pyinfra ⭐ 4,131
pyinfra turns Python code into shell commands and runs them on your servers. Execute ad-hoc commands and write declarative operations. Target SSH servers, local machine and Docker containers. Fast and scales from one server to thousands.
🔗 pyinfra.com
miguelgrinberg/python-socketio ⭐ 4,119
Python Socket.IO server and client
joblib/joblib ⭐ 4,012
Computing with Python functions.
🔗 joblib.readthedocs.org
python-markdown/markdown ⭐ 3,929
A Python implementation of John Gruber’s Markdown with Extension support.
🔗 python-markdown.github.io
rspeer/python-ftfy ⭐ 3,878
Fixes mojibake and other glitches in Unicode text, after the fact.
🔗 ftfy.readthedocs.org
hynek/structlog ⭐ 3,832
Simple, powerful, and fast logging for Python.
🔗 www.structlog.org
more-itertools/more-itertools ⭐ 3,823
More routines for operating on iterables, beyond itertools
🔗 more-itertools.rtfd.io
zeromq/pyzmq ⭐ 3,822
PyZMQ: Python bindings for zeromq
🔗 zguide.zeromq.org/py:all
spotify/basic-pitch ⭐ 3,761
A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
🔗 basicpitch.io
pydata/xarray ⭐ 3,743
N-D labeled arrays and datasets in Python
🔗 xarray.dev
pypi/warehouse ⭐ 3,686
The Python Package Index
🔗 pypi.org
tartley/colorama ⭐ 3,633
Simple cross-platform colored terminal text in Python
osohq/oso ⭐ 3,476
Deprecated: See README
jorisschellekens/borb ⭐ 3,453
borb is a library for reading, creating and manipulating PDF files in Python.
🔗 borbpdf.com
suor/funcy ⭐ 3,409
Fancy and practical functional tools
pyserial/pyserial ⭐ 3,343
Python serial port access library
camelot-dev/camelot ⭐ 3,206
A Python library to extract tabular data from PDFs
🔗 camelot-py.readthedocs.io
libaudioflux/audioFlux ⭐ 3,016
A library for audio and music analysis, feature extraction.
🔗 audioflux.top
tinche/aiofiles ⭐ 2,979
Library for handling local disk files in asyncio applications.
legrandin/pycryptodome ⭐ 2,973
A self-contained cryptographic library for Python
🔗 www.pycryptodome.org
tox-dev/pipdeptree ⭐ 2,866
A command line utility to display dependency tree of the installed Python packages
🔗 pypi.python.org/pypi/pipdeptree
pydantic/logfire ⭐ 2,855
Uncomplicated Observability for Python and beyond! 🪵🔥
🔗 logfire.pydantic.dev/docs
lxml/lxml ⭐ 2,774
The lxml XML toolkit for Python
🔗 lxml.de
whylabs/whylogs ⭐ 2,695
An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈
🔗 whylogs.readthedocs.io
liiight/notifiers ⭐ 2,694
The easy way to send notifications
🔗 notifiers.readthedocs.io
cdgriffith/Box ⭐ 2,692
Python dictionaries with advanced dot notation access
🔗 github.com/cdgriffith/box/wiki
jcrist/msgspec ⭐ 2,672
A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML
🔗 jcristharif.com/msgspec
pexpect/pexpect ⭐ 2,670
A Python module for controlling interactive programs in a pseudo-terminal
🔗 pexpect.readthedocs.io
yaml/pyyaml ⭐ 2,655
Canonical source repository for PyYAML
litl/backoff ⭐ 2,642
Python library providing function decorators for configurable backoff and retry
scrapinghub/dateparser ⭐ 2,627
python parser for human readable dates
pypa/setuptools ⭐ 2,608
Official project repository for the Setuptools build system
🔗 pypi.org/project/setuptools
hgrecco/pint ⭐ 2,517
Operate and manipulate physical quantities in Python
🔗 pint.readthedocs.org
pyston/pyston ⭐ 2,503
(No longer maintained) A faster and highly-compatible implementation of the Python programming language.
🔗 www.pyston.org
grantjenks/python-diskcache ⭐ 2,498
Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python.
🔗 www.grantjenks.com/docs/diskcache
dosisod/refurb ⭐ 2,494
A tool for refurbishing and modernizing Python codebases
nschloe/tikzplotlib ⭐ 2,472
📊 Save matplotlib figures as TikZ/PGFplots for smooth integration into LaTeX.
rhettbull/osxphotos ⭐ 2,450
Python app to work with pictures and associated metadata from Apple Photos on macOS. Also includes a package to provide programmatic access to the Photos library, pictures, and metadata.
tkem/cachetools ⭐ 2,446
Various memoizing collections and decorators, including variants of the Python Standard Library's @lru_cache function decorator
dateutil/dateutil ⭐ 2,431
Useful extensions to the standard Python datetime features
pndurette/gTTS ⭐ 2,421
Python library and CLI tool to interface with Google Translate's text-to-speech API
🔗 gtts.readthedocs.org
kiminewt/pyshark ⭐ 2,343
Python wrapper for tshark, allowing python packet parsing using wireshark dissectors
abseil/abseil-py ⭐ 2,335
A collection of Python library code for building Python applications. The code is collected from Google's own Python code base, and has been extensively tested and used in production.
pyparsing/pyparsing ⭐ 2,286
Python library for creating PEG parsers
astanin/python-tabulate ⭐ 2,283
Pretty-print tabular data in Python, a library and a command-line utility. Repository migrated from bitbucket.org/astanin/python-tabulate.
🔗 pypi.org/project/tabulate
nateshmbhat/pyttsx3 ⭐ 2,266
Offline Text To Speech synthesis for python
ianmiell/shutit ⭐ 2,146
Automation framework for programmers
🔗 ianmiell.github.io/shutit
seperman/deepdiff ⭐ 2,131
DeepDiff: Deep Difference and search of any Python object/data. DeepHash: Hash of any object based on its contents. Delta: Use deltas to reconstruct objects by adding deltas together.
🔗 zepworks.com
grahamdumpleton/wrapt ⭐ 2,108
A Python module for decorators, wrappers and monkey patching.
google/gin-config ⭐ 2,088
Gin provides a lightweight configuration framework for Python
omry/omegaconf ⭐ 2,084
Flexible Python configuration system. The last one you will ever need.
mitmproxy/pdoc ⭐ 2,054
API Documentation for Python Projects
🔗 pdoc.dev
pyfilesystem/pyfilesystem2 ⭐ 2,027
Python's Filesystem abstraction layer
🔗 www.pyfilesystem.org
python-rope/rope ⭐ 2,026
a python refactoring library
numba/llvmlite ⭐ 2,012
A lightweight LLVM python binding for writing JIT compilers
🔗 llvmlite.pydata.org
julienpalard/Pipe ⭐ 2,010
A Python library to use infix notation in Python
landscapeio/prospector ⭐ 1,987
Inspects Python source files and provides information about type and location of classes, methods etc
hbldh/bleak ⭐ 1,977
A cross platform Bluetooth Low Energy Client for Python using asyncio
carpedm20/emoji ⭐ 1,946
emoji terminal output for Python
samuelcolvin/watchfiles ⭐ 1,939
Simple, modern and fast file watching and code reload in Python.
🔗 watchfiles.helpmanual.io
pygments/pygments ⭐ 1,933
Pygments is a generic syntax highlighter written in Python
🔗 pygments.org
open-telemetry/opentelemetry-python ⭐ 1,927
OpenTelemetry Python API and SDK
🔗 opentelemetry.io
pydoit/doit ⭐ 1,926
CLI task management & automation tool
🔗 pydoit.org
p0dalirius/Coercer ⭐ 1,920
A python script to automatically coerce a Windows server to authenticate on an arbitrary machine through 12 methods.
🔗 podalirius.net
chaostoolkit/chaostoolkit ⭐ 1,914
Chaos Engineering Toolkit & Orchestration for Developers
🔗 chaostoolkit.org
home-assistant/supervisor ⭐ 1,882
🏡 Home Assistant Supervisor
🔗 home-assistant.io/hassio
konradhalas/dacite ⭐ 1,839
Simple creation of data classes from dictionaries.
mkdocstrings/mkdocstrings ⭐ 1,827
📘 Automatic documentation from sources, for MkDocs.
🔗 mkdocstrings.github.io
joowani/binarytree ⭐ 1,816
Python Library for Studying Binary Trees
🔗 binarytree.readthedocs.io
rubik/radon ⭐ 1,787
Various code metrics for Python code
🔗 radon.readthedocs.org
anthropics/anthropic-sdk-python ⭐ 1,783
SDK providing access to Anthropic's safety-first language model APIs
kalliope-project/kalliope ⭐ 1,727
Kalliope is a framework that will help you to create your own personal assistant.
🔗 kalliope-project.github.io
quodlibet/mutagen ⭐ 1,658
Python module for handling audio metadata
🔗 mutagen.readthedocs.io
instagram/LibCST ⭐ 1,632
A concrete syntax tree parser and serializer library for Python that preserves many aspects of Python's abstract syntax tree
🔗 libcst.readthedocs.io
facebookincubator/Bowler ⭐ 1,589
Safe code refactoring for modern Python.
🔗 pybowler.io
imageio/imageio ⭐ 1,570
Python library for reading and writing image data
🔗 imageio.readthedocs.io
fabiocaccamo/python-benedict ⭐ 1,545
📘 dict subclass with keylist/keypath support, built-in I/O operations (base64, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml), s3 support and many utilities.
lcompilers/lpython ⭐ 1,525
Python compiler
🔗 lpython.org
nficano/python-lambda ⭐ 1,503
A toolkit for developing and deploying serverless Python code in AWS Lambda.
aws-samples/aws-glue-samples ⭐ 1,467
AWS Glue code samples
lidatong/dataclasses-json ⭐ 1,413
Easily serialize Data Classes to and from JSON
aio-libs/yarl ⭐ 1,384
Yet another URL library
🔗 yarl.aio-libs.org
brandon-rhodes/python-patterns ⭐ 1,382
Source code behind the python-patterns.guide site by Brandon Rhodes
ossf/criticality_score ⭐ 1,365
Gives criticality score for an open source project
oracle/graalpython ⭐ 1,329
GraalPy – A high-performance embeddable Python 3 runtime for Java
🔗 www.graalvm.org/python
pypy/pypy ⭐ 1,288
PyPy is a very fast and compliant implementation of the Python language.
🔗 pypy.org
ariebovenberg/whenever ⭐ 1,224
⏰ Modern datetime library for Python
🔗 whenever.rtfd.io
pyfpdf/fpdf2 ⭐ 1,212
Simple PDF generation for Python
🔗 py-pdf.github.io/fpdf2
pyo3/rust-numpy ⭐ 1,204
PyO3-based Rust bindings of the NumPy C-API
pdoc3/pdoc ⭐ 1,153
🐍 ➡️ 📜 Auto-generate API documentation for Python projects
🔗 pdoc3.github.io/pdoc
milvus-io/pymilvus ⭐ 1,126
Python SDK for Milvus.
fsspec/filesystem_spec ⭐ 1,123
A specification that python filesystems should adhere to.
c4urself/bump2version ⭐ 1,077
Version-bump your software with a single command
🔗 pypi.python.org/pypi/bump2version
daveebbelaar/python-whatsapp-bot ⭐ 1,069
This guide will walk you through the process of creating a WhatsApp bot using the Meta (formerly Facebook) Cloud API with pure Python, and Flask
🔗 www.datalumina.com
juanbindez/pytubefix ⭐ 1,054
Python3 library for downloading YouTube Videos.
🔗 pytubefix.readthedocs.io
metachris/logzero ⭐ 1,024
Robust and effective logging for Python 2 and 3.
🔗 logzero.readthedocs.io
extensityai/symbolicai ⭐ 1,018
Compositional Differentiable Programming Library - divide-and-conquer approach to break down a complex problem into smaller, more manageable problems.
fastai/fastcore ⭐ 1,007
Python supercharged for the fastai library
🔗 fastcore.fast.ai
lastmile-ai/aiconfig ⭐ 997
AIConfig saves prompts, models and model parameters as source control friendly configs. This allows you to iterate on prompts and model parameters separately from your application code.
🔗 aiconfig.lastmileai.dev
barracuda-fsh/pyobd ⭐ 940
An OBD-II compliant car diagnostic tool
qdrant/qdrant-client ⭐ 902
Python client for Qdrant vector search engine
🔗 qdrant.tech
samuelcolvin/dirty-equals ⭐ 850
Doing dirty (but extremely useful) things with equals.
🔗 dirty-equals.helpmanual.io
tox-dev/filelock ⭐ 833
A platform independent file lock in Python, which provides a simple way of inter-process communication
🔗 py-filelock.readthedocs.io
modal-labs/modal-examples ⭐ 809
Examples of programs built using Modal
🔗 modal.com/docs
open-telemetry/opentelemetry-python-contrib ⭐ 798
OpenTelemetry instrumentation for Python modules
🔗 opentelemetry.io
pypa/build ⭐ 773
A simple, correct Python build frontend
🔗 build.pypa.io
chrishayuk/mcp-cli ⭐ 758
A protocol-level CLI designed to interact with a Model Context Protocol server. The client allows users to send commands, query data, and interact with various resources provided by the server.
gefyrahq/gefyra ⭐ 720
Blazingly-fast 🚀, rock-solid, local application development ➡️ with Kubernetes.
🔗 gefyra.dev
platformdirs/platformdirs ⭐ 682
A small Python module for determining appropriate platform-specific dirs, e.g. a "user data dir".
🔗 platformdirs.readthedocs.io
instagram/Fixit ⭐ 674
Advanced Python linting framework with auto-fixes and hierarchical configuration that makes it easy to write custom in-repo lint rules.
🔗 fixit.rtfd.io/en/latest
argoproj-labs/hera ⭐ 669
Hera makes Python code easy to orchestrate on Argo Workflows through native Python integrations. It lets you construct and submit your Workflows entirely in Python. ⭐️ Remember to star!
🔗 hera.rtfd.io
google/pyglove ⭐ 651
Manipulating Python Programs
fastai/ghapi ⭐ 645
A delightful and complete interface to GitHub's amazing API
🔗 ghapi.fast.ai
nv7-github/googlesearch ⭐ 629
A Python library for scraping the Google search engine.
🔗 pypi.org/project/googlesearch-python
methexis-inc/terminal-copilot ⭐ 573
A smart terminal assistant that helps you find the right command.
tavily-ai/tavily-python ⭐ 551
The Tavily Python wrapper allows for easy interaction with the Tavily API, offering the full range of our search and extract functionalities directly from your Python programs.
🔗 docs.tavily.com
pypdfium2-team/pypdfium2 ⭐ 547
Python bindings to PDFium
🔗 pypdfium2.readthedocs.io
tonybaloney/CSnakes ⭐ 542
CSnakes is a .NET Source Generator and Runtime that you can use to embed Python code and libraries into your .NET Solution without the need for REST, HTTP, or Microservices.
🔗 tonybaloney.github.io/csnakes
salesforce/logai ⭐ 525
LogAI - An open-source library for log analytics and intelligence
steamship-core/steamship-langchain ⭐ 512
steamship-langchain
secretiveshell/MCP-Bridge ⭐ 463
A middleware that provides an OpenAI-compatible endpoint that can call MCP tools
neuml/annotateai ⭐ 304
Automatically annotates papers using Large Language Models (LLMs)

Visualisation

Visualisation tools and libraries. Application frameworks, 2D/3D plotting, dashboards, WebGL.

apache/superset ⭐ 65,005
Apache Superset is a Data Visualization and Data Exploration Platform
🔗 superset.apache.org
streamlit/streamlit ⭐ 38,226
Streamlit — A faster way to build and share data apps.
🔗 streamlit.io
gradio-app/gradio ⭐ 36,899
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
🔗 www.gradio.app
danny-avila/LibreChat ⭐ 23,324
LibreChat is a free, open source AI chat platform. This Web UI offers vast customization, supporting numerous AI providers, services, and integrations.
🔗 librechat.ai
plotly/dash ⭐ 22,151
Data Apps & Dashboards for Python. No JavaScript Required.
🔗 plotly.com/dash
matplotlib/matplotlib ⭐ 20,892
matplotlib: plotting with Python
🔗 matplotlib.org/stable
bokeh/bokeh ⭐ 19,693
Interactive Data Visualization in the browser, from Python
🔗 bokeh.org
plotly/plotly.py ⭐ 16,878
The interactive graphing library for Python ✨
🔗 plotly.com/python
mwaskom/seaborn ⭐ 12,944
Statistical data visualization in Python
🔗 seaborn.pydata.org
visgl/deck.gl ⭐ 12,562
WebGL2 powered visualization framework
🔗 deck.gl
