    Repositories list

    • lmms-eval

      Public
      Accelerating the development of large multimodal models (LMMs) with a one-click evaluation module - lmms-eval.
      Python
      Other
      2932.6k2293 Updated Jun 4, 2025
    • MGPO

      Public
      High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning
      01700 Updated May 28, 2025
    • Aero-1

      Public
      Python
      Apache License 2.0
      67030 Updated May 4, 2025
    • VideoMMMU

      Public
      Python
      Other
      14600 Updated Apr 8, 2025
    • Python
      Apache License 2.0
      710420 Updated Apr 8, 2025
    • EgoLife

      Public
      [CVPR 2025] EgoLife: Towards Egocentric Life Assistant
      Python
      Other
      1728740 Updated Mar 19, 2025
    • LongVA

      Public
      Long Context Transfer from Language to Vision
      Python
      Other
      19378270 Updated Mar 18, 2025
    • .github

      Public
      0100 Updated Mar 7, 2025
    • A fork to add multimodal model training to open-r1
      Python
      Apache License 2.0
      621.3k231 Updated Feb 8, 2025
    • Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.
      Python
      Other
      613430 Updated Jan 24, 2025
    • my-python-template

      Public template
      My template repo for setting up a new python repo
      Python
      1000 Updated Dec 11, 2024
    • demos

      Public
      Python
      0000 Updated Sep 18, 2024
    • sglang

      Public
      SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
      Python
      Apache License 2.0
      1.9k400 Updated Sep 18, 2024
    • Otter

      Public
      🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
      Python
      MIT License
      2123.3k611 Updated Mar 5, 2024
    • Relate Anything Model is capable of taking an image as input and utilizing SAM to identify the corresponding mask within the image.
      Python
      Apache License 2.0
      2245760 Updated Jul 4, 2023