Repositories list
69 repositories
- NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks
- [Arxiv 2024] Official Implementation of the paper: "Towards Robust Instruction Tuning on Multimodal Large Language Models"
- TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching
- This repository contains the official implementation of Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision.
- Emma-X (Public): Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning
- MM-InstructEval (Public): This repository contains code to evaluate various multimodal large language models using different instructions across multiple multimodal content comprehension tasks.
- This repository is maintained to release dataset and models for multimodal puzzle reasoning.
- darwin (Public)
- safety-arithmetic (Public)
- tango (Public): A family of diffusion models for text-to-audio generation.
- PROVE (Public)
- ferret (Public): Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique
- Sealing (Public): [NAACL 2024] Official Implementation of paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"
- DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
- HyperTTS (Public)
- Restore safety in fine-tuned language models through task arithmetic
- adapter-mix (Public)
- [EMNLP 2022] This repository contains the official implementation of the paper "MM-Align: Learning Optimal Transport-based Alignment Dynamics for Fast and Accurate Inference on Missing Modality Sequences"
- MELD (Public): MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation
- instruct-eval (Public)
- conv-emotion (Public): This repo contains implementation of different architectures for emotion recognition in conversations.
- red-instruct (Public)
- A comprehensive reading list for Emotion Recognition in Conversations
- mustango (Public)