Language Technology Lab at Alibaba DAMO Academy

53 repositories

SeaLLMs-Audio
Public
HTML
•4•47•0•0•Updated Jul 29, 2025
CMM
Public
✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
Python
•2•46•1•0•Updated Jul 11, 2025
VideoRefer
Public
[CVPR 2025] The code for "VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM"
video-understanding mllm pixel-understanding sam2
Jupyter Notebook
•14•248•8•0•Updated Jun 20, 2025
LLM-Multilingual-Knowledge-Boundaries
Public
[ACL 2025] Analyzing LLMs' Multilingual Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations
multilingual evaluation datasets hallucination llm
Jupyter Notebook
•
MIT License
•1•13•1•0•Updated Jun 6, 2025
FineReason
Public
[ACL 2025] FineReason: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving
reasoning logic-puzzles llms
Python
•
MIT License
•0•10•1•0•Updated Jun 4, 2025
VideoLLaMA3
Public
Frontier Multimodal Foundation Models for Image and Video Understanding
Jupyter Notebook
•
Apache License 2.0
•68•915•63•1•Updated May 19, 2025
SeaBench
Public
Python
•0•3•0•0•Updated May 16, 2025
SeaExam
Public
SeaExam: Benchmarking LLMs for Southeast Asia languages with Human Exam Questions
Python
•0•6•0•0•Updated May 16, 2025
translation-all-you-need
Public
[NAACL 2025] Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models
Python
•
MIT License
•0•1•0•0•Updated Apr 21, 2025
multimodal_textbook
Public
[ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"
Python
•
Apache License 2.0
•16•167•5•0•Updated Mar 17, 2025
MMR1
Public
MMR1: Advancing the Frontiers of Multimodal Reasoning
Apache License 2.0
•5•0•0•0•Updated Mar 12, 2025
LongPO
Public
[ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
Python
•4•38•0•0•Updated Feb 27, 2025
VideoLLaMA2
Public
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
Python
•
Apache License 2.0
•80•1.2k•83•0•Updated Jan 23, 2025
Inf-CLIP
Public
[CVPR 2025 Highlight] The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss". A highly memory-efficient CLIP training scheme.
memory-efficient clip contrastive-learning flash-attention ring-attention infinite-batch-size
Python
•
Apache License 2.0
•11•263•2•0•Updated Jan 16, 2025
CoI-Agent
Public
Official code for paper: Chain of Ideas: Revolutionizing Research via Novel Idea Development with LLM Agents
Python
•
Apache License 2.0
•27•466•10•0•Updated Jan 15, 2025
LLM-R2
Public
Python
•12•47•10•0•Updated Nov 26, 2024
multilingual_analysis
Public
[NeurIPS 2024] How do Large Language Models Handle Multilingualism?
Python
•5•37•8•0•Updated Nov 8, 2024
DiGIT
Public
[NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
transformer image-generation gpt language-model autoregressive fairseq neurips
Python
•
MIT License
•1•69•2•0•Updated Oct 31, 2024
reasoning-paths-optimization
Public
[EMNLP 2024] Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths
reasoning llm
Python
•
Apache License 2.0
•1•10•0•0•Updated Oct 30, 2024
VCD
Public
[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
Python
•
Apache License 2.0
•17•301•17•0•Updated Oct 7, 2024
Auto-Arena-LLMs
Public
Jupyter Notebook
•
Apache License 2.0
•1•40•1•0•Updated Oct 7, 2024
WebDesignAgent
Public
WebDesignAgent: Towards Effortless Website Creation
Python
•
Apache License 2.0
•29•256•4•0•Updated Sep 20, 2024
LLM-argumentation
Public
[ACL 2024] Exploring the Potential of Large Language Models in Computational Argumentation
argument-mining argument-generation computational-argumentation llm
Python
•1•16•1•0•Updated Aug 21, 2024
DAMO-SeaLLMs
Public
[ACL 2024 Demo] SeaLLMs - Large Language Models for Southeast Asia
JavaScript
•14•170•10•0•Updated Jul 30, 2024
Video-LLaMA
Public
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
llama large-language-models video-language-pretraining vision-language-pretraining cross-modal-pretraining blip2 minigpt4 multi-modal-chatgpt
Python
•
BSD 3-Clause "New" or "Revised" License
•278•3k•63•6•Updated Jun 4, 2024
chain-of-knowledge
Public
[ICLR 2024] Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources
Python
•
MIT License
•9•67•3•0•Updated Jun 3, 2024
Multipurpose-Chatbot
Public
A chatbot UI for RAG, multimodal, and text completion (supports Transformers, llama.cpp, MLX, vLLM)
chatbot-application gradio-interface llm-inference gradio-python-llm
Python
•
Apache License 2.0
•4•19•0•0•Updated Apr 18, 2024
AdamergeX
Public
Python
•0•11•1•0•Updated Apr 2, 2024
LLM_summeval
Public
Use gpt-3.5-turbo to evaluate abstractive summaries
Python
•1•7•0•0•Updated Mar 20, 2024
HierEncDec
Public
Code and data for "A Hierarchical Encoding-Decoding Scheme for Abstractive Multi-document Summarization"
Python
•1•9•2•0•Updated Mar 20, 2024