Add audio NIM model and audio probes #1163

erickgalinkin · 2025-04-16T21:12:51Z

Adds support for audio probes using [the datasets I have somewhere] and multimodal NIM.

garak/generators/nim.py

jmartin-tech · 2025-05-05T17:22:53Z

garak/probes/audio.py

+        audio_achilles_data_dir = (
+            _config.transient.cache_dir / "data" / "audio_achilles"
+        )


Prefer loading via the data path pattern:

Suggested change

-        audio_achilles_data_dir = (
-            _config.transient.cache_dir / "data" / "audio_achilles"
-        )
+        from garak.data import path as data_path
+
+        audio_achilles_data_dir = (
+            data_path / "audio_achilles"
+        )

This moves the expected location to $XDG_DATA_HOME/garak/data/audio_achilles.

This can serve as a good example of how a user can bring their own data files. The data_path should be treated as read-only by garak and can be provided by the project installation or overridden/extended by the user. Files under _config.transient.cache_dir should not be user-provided, as garak is expected to manage the files in that path.
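To illustrate the override behaviour described above, here is a minimal, self-contained sketch of an ordered lookup in which a user-supplied directory takes precedence over a packaged one. It does not import garak and is not garak's actual implementation of `data_path`; the names `resolve_data_path`, `default_user_data_dir`, and `demo` are hypothetical, and the fallback to `~/.local/share` when `XDG_DATA_HOME` is unset is the XDG default, assumed here for illustration.

```python
import os
import tempfile
from pathlib import Path


def resolve_data_path(relative, search_dirs):
    """Return the first existing match for `relative`, checking dirs in order.

    Hypothetical helper: earlier entries in `search_dirs` win, so a user
    directory listed first overrides a packaged directory listed later.
    The function only reads; it never creates or modifies anything.
    """
    for base in search_dirs:
        candidate = Path(base) / relative
        if candidate.exists():
            return candidate
    raise FileNotFoundError(f"{relative} not found in any of: {list(search_dirs)}")


def default_user_data_dir():
    """Assumed user data location: $XDG_DATA_HOME/garak/data."""
    xdg = os.environ.get("XDG_DATA_HOME") or str(Path.home() / ".local" / "share")
    return Path(xdg) / "garak" / "data"


def demo():
    """Show packaged data being used first, then overridden by user data."""
    with tempfile.TemporaryDirectory() as tmp:
        user_dir = Path(tmp) / "user"
        packaged_dir = Path(tmp) / "packaged"
        search_dirs = [user_dir, packaged_dir]

        # Only the packaged copy exists at first.
        (packaged_dir / "audio_achilles").mkdir(parents=True)
        before = resolve_data_path("audio_achilles", search_dirs)

        # The user then provides their own copy, which takes precedence.
        (user_dir / "audio_achilles").mkdir(parents=True)
        after = resolve_data_path("audio_achilles", search_dirs)

        return before.parent.name, after.parent.name


if __name__ == "__main__":
    print(demo())
```

Under this sketch a user who drops files into their own data directory gets them picked up without touching the cache, which the tool remains free to manage on its own.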

jmartin-tech · 2025-05-05T17:34:01Z

garak/probes/audio.py

+            def write_audio_to_file(audio_data, file_path, sampling_rate):
+                """Writes audio data to a file.
+
+                Args:
+                    audio_data: A 1D numpy array containing the audio data.
+                    file_path: The path to the output audio file.
+                    sampling_rate: The sampling rate of the audio data.
+                """
+                sf.write(file_path, audio_data, sampling_rate)
+
+            import soundfile as sf
+            from datasets import load_dataset
+
+            os.makedirs(audio_achilles_data_dir)
+            dataset = load_dataset("garak-llm/audio_achilles_heel")
+            for item in dataset["train"]:
+                audio_data = item["audio"]["array"]
+                sampling_rate = item["audio"]["sampling_rate"]
+                file_path = str(audio_achilles_data_dir / item["audio"]["path"])
+                write_audio_to_file(audio_data, file_path, sampling_rate)


This explains the reasoning for searching the cache_dir path; I assume this is meant to mirror the visual_jailbreak probe. In the visual_jailbreak probe, however, the cache_dir usage is due to the direct download of data into a cached location. Is this really analogous to that?

In theory, the structure of the tests sent by this probe could allow user-provided audio files in data. However, connecting the content of such a file to an expected mitigation seems to be missing a link in the chain that would allow inspection of the data the detector result is based on.
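For reference, the "download into a managed cache" pattern discussed here can be sketched as an idempotent populate step that creates the directory if needed and skips files already present. This is only an illustration under stated assumptions: `populate_cache`, `write_item`, and `demo_populate` are hypothetical names, and plain bytes stand in for the audio arrays that the probe in the diff writes with `sf.write`.

```python
import tempfile
from pathlib import Path


def populate_cache(cache_dir, items, write_item):
    """Write each (name, payload) pair into cache_dir unless already present.

    Hypothetical helper. `write_item(path, payload)` does the actual write;
    in the probe under review this role is played by the soundfile call.
    Returns the list of file names written during this call.
    """
    cache_dir = Path(cache_dir)
    # exist_ok=True makes repeated runs safe when the directory already exists.
    cache_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, payload in items:
        target = cache_dir / name
        if target.exists():
            continue  # already cached, nothing to do
        write_item(target, payload)
        written.append(name)
    return written


def demo_populate():
    """Run the populate step twice; the second run should write nothing."""
    with tempfile.TemporaryDirectory() as tmp:
        cache_dir = Path(tmp) / "data" / "audio_achilles"
        items = [("a.wav", b"\x00\x01"), ("b.wav", b"\x02\x03")]

        def write_bytes(path, payload):
            path.write_bytes(payload)

        first = populate_cache(cache_dir, items, write_bytes)
        second = populate_cache(cache_dir, items, write_bytes)
        return first, second


if __name__ == "__main__":
    print(demo_populate())
```

Keeping the write step behind a callable leaves the cache logic independent of where the data comes from, which is the part of the question that remains open in this thread.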

jmartin-tech reviewed Apr 16, 2025

garak/generators/nim.py (outdated, resolved)

erickgalinkin force-pushed the audio-probes branch from 65c2dae to 633969e on April 24, 2025 19:43

erickgalinkin added 3 commits April 24, 2025 15:52

Add NVMultimodal generator for supporting text+image+audio

28a1596

Fix generator and test for NVMultimodal.

2250a95

AudioAchillesHeel probe and docs

3a5282a

erickgalinkin force-pushed the audio-probes branch from 633969e to 3a5282a on April 24, 2025 19:52

erickgalinkin added 4 commits April 24, 2025 15:55

Fix Tier.

3078dfa

Fix lang. Modify generator to do things more clearly. Update requirements for `audio` group.

dc23c12

Refactor NVMultimodal to build on base Generator class instead of RestGenerator

af84d45

Enforce upload limit. Fix file path in audio probe.

9210e9a

erickgalinkin marked this pull request as ready for review April 29, 2025 14:02

erickgalinkin requested review from leondz and jmartin-tech April 29, 2025 14:02

Handle read timeout exception

32ba6c7

jmartin-tech reviewed May 5, 2025
