A voice and speech (Spanish) corpus of patients who underwent upper airway surgery in pre- and post-operative states #27

@aguerrerolopez

Hi again!

I'd also recommend adding this dataset:
https://zenodo.org/records/11654546

The dataset comprises 3,800 speech audio files covering three types of upper respiratory tract surgery plus one control set, with an average of 35.51 ± 5.91 audio recordings per patient. It gives the scientific community a valuable resource for systematically investigating the objective effects of upper respiratory tract surgery on voice and speech.

The dataset is a complete corpus with data from 107 Castilian Spanish speakers. It contains voice and speech recordings from both control speakers and patients who underwent upper airway surgery, in pre- and post-operative stages. The surgeries covered are tonsillectomy, functional endoscopic sinus surgery, and septoplasty, all performed by the same surgeon.
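
As a quick sanity check, the figures quoted above agree with each other: 107 speakers times an average of 35.51 recordings gives roughly 3,800 files. A minimal sketch of that arithmetic in Python follows. The variable names are mine and do not come from the dataset or the repo, and it assumes the reported mean is taken over all 107 speakers (patients and controls), which the description does not state explicitly.

```python
# Figures quoted in the dataset description above.
n_speakers = 107
mean_recordings = 35.51      # reported mean recordings per patient
total_files_reported = 3800

# Assumption: the mean is computed over all 107 speakers.
expected_total = n_speakers * mean_recordings

# The mean is rounded to two decimals, so allow for that rounding error.
tolerance = n_speakers * 0.005
consistent = abs(expected_total - total_files_reported) <= tolerance

print(f"expected total: {expected_total:.2f}, reported: {total_files_reported}")
print("consistent:", consistent)
```

If the mean were computed over patients only, the control recordings would need to be counted separately before comparing totals.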

The dataset is described in this paper:
https://www.nature.com/articles/s41597-024-03540-5

There is also a GitHub repo with code to preprocess the data and run some machine learning experiments:
https://github.com/BYO-UPM/CUCO_Database
