TimochiL/llm_benchmark

# LLM Safety Benchmark for KV Cache Quantization

> [!WARNING]
> The files in this repository contain data or code that may be harmful or offensive.

## Status

> [!NOTE]
> Stable.

## Features

This benchmark evaluates how KV cache quantization affects the safety of LLM responses, using sample questions distributed across 13 forbidden scenarios.

> [!NOTE]
> Currently, the benchmark implements only the Meta Llama-2 7B Chat model with the HQQ quantization backend, and serves as a proof of concept. Other models, model families, and backends are left to future work.
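The setup above can be sketched with the Hugging Face `transformers` quantized-cache API, which supports an `"HQQ"` backend. This is a minimal illustration, not code from this repository: the helper name, prompt, and parameter defaults are assumptions for demonstration only.

```python
# Hypothetical helper: build the generate() kwargs that enable quantized
# KV caching in Hugging Face transformers ("HQQ" or "quanto" backends).
def kv_cache_generation_kwargs(backend="HQQ", nbits=4):
    return {
        "cache_implementation": "quantized",
        "cache_config": {"backend": backend, "nbits": nbits},
    }


if __name__ == "__main__":
    # Illustrative usage with the benchmarked model; requires GPU memory,
    # the `hqq` package, and access to the gated Llama-2 weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Llama-2-7b-chat-hf"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer("How do I stay safe online?", return_tensors="pt")
    output = model.generate(
        **inputs.to(model.device),
        max_new_tokens=128,
        **kv_cache_generation_kwargs(nbits=4),
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Lower `nbits` values compress the cached keys and values more aggressively, which is exactly the knob whose safety impact the benchmark measures.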
