    Repositories list

    • MIR-SafetyBench: Evaluating Multi-image Reasoning Safety of Multimodal Large Language Models
      0010 Updated Jan 19, 2026
    • JPS

      Public
      [MM'25] JPS: Jailbreak Multimodal Large Language Models with Collaborative Visual Perturbation and Textual Steering
      Python
      31520 Updated Dec 23, 2025
    • IF-CRITIC

      Public
      IF-CRITIC: Towards a Fine-Grained LLM Critic for Instruction-Following Evaluation
      Shell
      0610 Updated Nov 14, 2025
    • Glyph

      Public
      Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression"
      Python
      5055360 Updated Nov 4, 2025
    • CROPI

      Public
      Official Repository for the paper "Data-Efficient RLVR via Off-Policy Influence Guidance"
      01400 Updated Nov 1, 2025
    • SocialEval

      Public
      [ACL'25] SocialEval: Evaluating Social Intelligence of Large Language Models
      Python
      0900 Updated Oct 20, 2025
    • CogFlow

      Public
      Think Socially via Cognitive Reasoning
      Python
      1600 Updated Oct 2, 2025
    • CharacterGLM-6B

      Public
      [EMNLP'24] CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models
      Python
      3749040 Updated Oct 2, 2025
    • Crisp

      Public
      [EMNLP'25] Crisp: Cognitive Restructuring of Negative Thoughts through Multi-turn Supportive Dialogues
      Python
      01110 Updated Sep 2, 2025
    • AISafetyLab: A comprehensive framework covering safety attack, defense, evaluation and paper list.
      Python
      1422600 Updated Aug 29, 2025
    • Python
      0400 Updated Aug 15, 2025
    • Python
      48910 Updated Aug 11, 2025
    • [AAAI'25] CharacterBench: Benchmarking Character Customization of Large Language Models
      Python
      11900 Updated Aug 1, 2025
    • ShieldVLM

      Public
      Python
      0600 Updated Jul 31, 2025
    • Official GitHub repo for SafetyBench, a comprehensive benchmark to evaluate LLMs' safety. [ACL 2024]
      Python
      1226800 Updated Jul 28, 2025
    • VPO

      Public
      Python
      12220 Updated Jul 20, 2025
    • [ACL 2025] LongSafety: Evaluating Long-Context Safety of Large Language Models
      Python
      01500 Updated Jun 18, 2025
    • SPaR

      Public
      Python
      34600 Updated Jun 11, 2025
    • [ACL 2025] Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints
      Python
      11700 Updated May 23, 2025
    • HPSS

      Public
      HPSS: Heuristic Prompting Strategy Search for LLM Evaluators (ACL 2025 Findings)
      Python
      1300 Updated May 23, 2025
    • Python
      62910 Updated May 22, 2025
    • BARREL

      Public
      Python
      11600 Updated May 21, 2025
    • MAPS

      Public
      Official Implementation of ICLR25 paper "MAPS: Advancing Multi-modal Reasoning in Expert-level Physical Science"
      Python
      2800 Updated Mar 12, 2025
    • Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)
      Python
      1210050 Updated Feb 20, 2025
    • MiniPLM

      Public
      [ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models
      Python
      97140 Updated Nov 23, 2024
    • Python
      11710 Updated Nov 7, 2024
    • OpenMEVA

      Public
      Benchmark for evaluating open-ended generation
      Python
      75031 Updated Nov 6, 2024
    • CodePlan

      Public
      21610 Updated Oct 16, 2024
    • ShieldLM

      Public
      ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings]
      Python
      1022110 Updated Sep 29, 2024
    • PICL

      Public
      Code for ACL2023 paper: Pre-Training to Learn in Context
      Python
      410611 Updated Jul 26, 2024