skey
is a Python package for state-of-the-art automatic musical key detection from audio recordings, based on the S-KEY model proposed by Yuexuan Kong et al. The package provides an efficient pipeline for loading audio and inferring musical key using a trained deep learning model ChromaNet. It will be made into a PyPI package soon.
- 🎼 End-to-end musical key detection from raw audio
- 🧠 A open-sourced pretrained model
- ⚙️ Simple CLI and Python API
- 💽 Support for
.wav
,.mp3
, etc. - 🔌 CPU and GPU support
poetry install
poetry skey path/to/audio --device cpu
This will run key detection on the specified audio file or directory using the default model and settings. The prediction will be printed if the path is an audio file, will be saved into a .csv file if the path is a directory.
To specify additional options, use the following arguments:
poetry skey path/to/audio --checkpoint path/to/model.pt --ext mp3 --device cpu
--checkpoint
: Path to a custom model checkpoint (.pt
). If not provided, the default model is used.--ext
: Audio file extension (default:wav
), ifpath/to/audio
is a directory. Else infers the extension from the file. Supports all formats readable by torchaudio.--device
: Device to run on (default:cpu
, e.g.,cuda
,mps
).
Arguments:
Argument | Description |
---|---|
path/to/audio |
Path to directory with audio files or a single audio file |
--checkpoint |
Path to model checkpoint (.pt ). Loads default if not provided. |
--ext |
Audio file extension (default: wav , supports all formats that can be read by torchaudio) if path/to/audio is a directory |
--device |
Device to run on (default: cpu , e.g., cuda , mps ) |
from skey import detect_key
detect_key(
audio_dir="path/to/audio",
extension="mp3",
device="cpu"
)
Parameters:
audio_dir
(str): Path to the audio file or directory containing audio filesckpt_path
(str or None, optional): Path to the model checkpoint file. IfNone
, the default model is used.extension
(str, optional): File extension (default:"wav"
)device
(str, optional): Device to run on (default:cpu
)
skey
├── Dockerfile
├── LICENSE
├── Makefile
├── poetry.lock
├── pyproject.toml
├── README.md
├── skey
│ ├── __init__.py
│ ├── chromanet.py
│ ├── cli.py
│ ├── convnext.py
│ ├── hcqt.py
│ ├── key_detection.py
│ ├── models
│ │ └── skey.pt
├── tests
│ ├── nocturne_n02_in_e-flat_major.mp3
│ └── test_detect_key.py
└── training_utils
├── config
│ └── skey.gin
├── skey_loss.py
└── skey.py
training_utils/
directory is not used in the skey
package for inference. However, it is essential if you plan to retrain the model. It contains:
- full model definition
- loss functions
- Configuration file (
skey.gin
)
To retrain, you will need to plug in your own dataloader and training loop using this codebase as a foundation.
If you use this work in your research, please cite:
@INPROCEEDINGS{kongskey2025,
author={Kong, Yuexuan and Meseguer-Brocal, Gabriel and Lostanlen, Vincent and Lagrange, Mathieu and Hennequin, Romain},
booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title={S-KEY: Self-supervised Learning of Major and Minor Keys from Audio},
year={2025},
pages={1-5},
doi={10.1109/ICASSP49660.2025.10890222}}
The code of SKEY is MIT-licensed.