    Repositories list

    • HTML
      44700 Updated Jul 29, 2025
    • CMM

      Public
      ✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
      Python
      24610 Updated Jul 11, 2025
    • [CVPR 2025] The code for "VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM"
      Jupyter Notebook
      1424880 Updated Jun 20, 2025
    • [ACL 2025] Analyzing LLMs' Multilingual Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations
      Jupyter Notebook
      11310 Updated Jun 6, 2025
    • [ACL 2025] FineReason: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving
      Python
      01010 Updated Jun 4, 2025
    • Frontier Multimodal Foundation Models for Image and Video Understanding
      Jupyter Notebook
      68915631 Updated May 19, 2025
    • SeaBench

      Public
      Python
      0300 Updated May 16, 2025
    • SeaExam

      Public
      SeaExam: Benchmarking LLMs for Southeast Asian languages with Human Exam Questions
      Python
      0600 Updated May 16, 2025
    • [NAACL 2025] Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models
      Python
      0100 Updated Apr 21, 2025
    • [ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"
      Python
      1616750 Updated Mar 17, 2025
    • MMR1

      Public
      MMR1: Advancing the Frontiers of Multimodal Reasoning
      5000 Updated Mar 12, 2025
    • LongPO

      Public
      [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
      Python
      43800 Updated Feb 27, 2025
    • VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
      Python
      801.2k830 Updated Jan 23, 2025
    • Inf-CLIP

      Public
      [CVPR 2025 Highlight] The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss". A highly memory-efficient CLIP training scheme.
      Python
      1126320 Updated Jan 16, 2025
    • CoI-Agent

      Public
      Official code for paper: Chain of Ideas: Revolutionizing Research via Novel Idea Development with LLM Agents
      Python
      27466100 Updated Jan 15, 2025
    • LLM-R2

      Public
      Python
      1247100 Updated Nov 26, 2024
    • [NeurIPS 2024] How do Large Language Models Handle Multilingualism?
      Python
      53780 Updated Nov 8, 2024
    • DiGIT

      Public
      [NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
      Python
      16920 Updated Oct 31, 2024
    • [EMNLP 2024] Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths
      Python
      11000 Updated Oct 30, 2024
    • VCD

      Public
      [CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
      Python
      17301170 Updated Oct 7, 2024
    • Jupyter Notebook
      14010 Updated Oct 7, 2024
    • WebDesignAgent: Towards Effortless Website Creation
      Python
      2925640 Updated Sep 20, 2024
    • [ACL 2024] Exploring the Potential of Large Language Models in Computational Argumentation
      Python
      11610 Updated Aug 21, 2024
    • [ACL 2024 Demo] SeaLLMs - Large Language Models for Southeast Asia
      JavaScript
      14170100 Updated Jul 30, 2024
    • [EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
      Python
      2783k636 Updated Jun 4, 2024
    • [ICLR 2024] Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources
      Python
      96730 Updated Jun 3, 2024
    • A chatbot UI for RAG, multimodal, and text completion (supports Transformers, llama.cpp, MLX, vLLM)
      Python
      41900 Updated Apr 18, 2024
    • AdamergeX

      Public
      Python
      01110 Updated Apr 2, 2024
    • Use gpt-3.5-turbo to evaluate abstractive summaries
      Python
      1700 Updated Mar 20, 2024
    • Code and data for "A Hierarchical Encoding-Decoding Scheme for Abstractive Multi-document Summarization"
      Python
      1920 Updated Mar 20, 2024