Skip to content

Prompt injection vulnerability in SALMONN #114

@VulnerabilityReport

Description

@VulnerabilityReport

Team Name: A University Research Team

Date: 26/08/2025

Target Model: SALMONN (Whisper-2+BEATs + Vicuna-7B)

Vulnerability Description:

In typical application scenarios of speech-based large language models, user inputs generally consist of two parts:

  1. Explicit instruction prompts (e.g., “Please summarize this meeting recording”), and

  2. User-provided auxiliary data (e.g., the meeting audio or its transcript).

Within this interaction paradigm, we discovered that an adversary can poison the auxiliary data (by carefully crafting the speech content or transcript) to conduct a prompt injection attack, thereby overriding the user’s original intent instructions.

Experimental Evaluation and Results:

Based on the above threat model, we propose a new white-box injection attack method, which successfully hijacks user control over the model by embedding malicious instructions into the auxiliary data.

  • We designed multiple injection strategies and conducted approximately 500 repeated experiments on a single model under controlled conditions;

  • The attacks demonstrated stable effectiveness across diverse prompt structures and context settings;

  • The average success rate exceeded 99%, achieving efficient overriding of the user’s original instructions.

Potential Exploitation Scenarios:

This type of injection attack could lead to the following risks:

  1. Denial of Service (DoS): An adversary could induce the model to ignore or refuse legitimate user requests;

  2. Goal Hijacking: An adversary could manipulate the model to execute operations entirely misaligned with the user’s intent, such as leaking information or generating misleading outputs.

Disclosure Statement:

This vulnerability report is submitted in good faith under responsible disclosure principles. The research and experiments were conducted in a controlled environment without unauthorized access to production systems or real user data. We share these findings solely for the purpose of improving system security and are willing to cooperate with the vendor/organization to develop mitigation measures.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions