HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

This repository contains the code for reproducing HarmAug, introduced in the paper:

HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

Seanie Lee*, Haebin Seong*, Dong Bok Lee, Minki Kang, Xiaoyin Chen, Dominik Wagner, Yoshua Bengio, Juho Lee, Sung Ju Hwang (*: Equal contribution)

[arXiv link]
[Model link]
[Dataset link]

Reproduction Steps

First, we recommend creating a conda environment with Python 3.10.

conda create -n harmaug python=3.10
conda activate harmaug

After that, install the requirements.

pip install -r requirements.txt

Then, download the necessary files from Google Drive and move them into the appropriate folders.

mv [email protected] ./data
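
Before launching the run, it can help to confirm that the setup steps above completed. The following is a minimal sketch of such a check, not part of the repository: the function names `dir_has_files` and `preflight` are hypothetical, and the only assumptions about the layout are the `./data` folder and `script/kd.sh` mentioned in these steps. It does not know which specific files the distillation script expects, so it only verifies that `data` is non-empty.

```shell
# Illustrative pre-flight check (hypothetical helper, not shipped with the repo).

# Succeeds when the given directory exists and contains at least one entry.
dir_has_files() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Prints one line per missing prerequisite under the given repository root
# (default: current directory) and returns non-zero if anything is missing.
preflight() {
    root="${1:-.}"
    rc=0
    if ! dir_has_files "$root/data"; then
        echo "missing: no files found in $root/data"
        rc=1
    fi
    if [ ! -f "$root/script/kd.sh" ]; then
        echo "missing: $root/script/kd.sh"
        rc=1
    fi
    return "$rc"
}
```

With these functions defined in the current shell, `preflight . && bash script/kd.sh` would start the run only when both checks pass.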

Finally, you can start the knowledge distillation process.

bash script/kd.sh

Reference

To cite our paper, please use the following BibTeX entry:

@article{lee2024harmaug,
  title={{HarmAug}: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models},
  author={Lee, Seanie and Seong, Haebin and Lee, Dong Bok and Kang, Minki and Chen, Xiaoyin and Wagner, Dominik and Bengio, Yoshua and Lee, Juho and Hwang, Sung Ju},
  journal={arXiv preprint arXiv:2410.01524},
  year={2024}
}
