- install requirements
- run the main command like so:
```
python -m src.sentiment_benchmarking \
    --dataset-name chcaa/fiction4sentiment \
    --model-names cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual \
    --model-names cardiffnlp/xlm-roberta-base-sentiment-multilingual \
    --model-names MiMe-MeMo/MeMo-BERT-SA \
    --model-names alexandrainst/da-sentiment-base \
    --model-names vesteinn/danish_sentiment \
    --translate
```

Option to add, to run only a limited number of rows for testing, e.g.: `--n-rows 10`

Option to add, for Google-translating the texts: `--translate` (NB: will be slow going)

I.e.: the script, `model_names` (can be multiple), `n-rows` (optional) and `translate` (optional)
- results (CSVs with the scored data & the Spearman results) are saved in the `results` folder
- takes (transformers-compatible) finetuned models and uses them to SA-score the sentences of a HF dataset (default: the Fiction4 dataset); the dataset format should be `["text", "label"]`, where `"label"` is the human gold standard
- cleans up text (whitespace removal mainly)
- takes the categorical scoring (positive, [neutral,] negative) and turns it continuous, using the model's assigned confidence score (see the `conv_scores` function in `utils.py`), & saves the results
- computes the Spearman correlation of the chosen models (plus the precomputed* vader and roberta_xlm_base baselines) with the human gold standard ("label"), both on the raw gold standard and [forthcoming] on a detrended version (see `utils.py`); a rough sketch of these two steps is shown below
*Note that the precomputed vader & roberta_xlm_base were applied to Google-translated sentences. For details, see our paper here -- these are the baselines to beat.
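The conversion and correlation steps might look roughly like the following. This is a minimal sketch, not the repository's code: the real mapping lives in `conv_scores` in `utils.py`, and the `train` split name, the lowercase label strings, the signed-confidence mapping and the neutral-maps-to-zero choice are all assumptions here.

```python
# Illustrative sketch only: the repo's actual conversion lives in conv_scores in utils.py.
# Assumed here (not verified): a "train" split, lowercase pipeline labels containing
# "pos"/"neg", neutral mapped to 0.0, and signed confidence as the continuous score.
from datasets import load_dataset
from scipy.stats import spearmanr
from transformers import pipeline

ds = load_dataset("chcaa/fiction4sentiment", split="train")   # columns: "text", "label"
texts = [" ".join(t.split()) for t in ds["text"]]             # whitespace cleanup
gold = ds["label"]                                            # human gold standard

clf = pipeline("sentiment-analysis",
               model="cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual")

def to_continuous(pred):
    """Turn a categorical prediction + confidence into a signed continuous score."""
    label, conf = pred["label"].lower(), pred["score"]
    if "pos" in label:
        return conf
    if "neg" in label:
        return -conf
    return 0.0  # neutral

scores = [to_continuous(p) for p in clf(texts, truncation=True)]

rho, pval = spearmanr(scores, gold)
print(f"Spearman rho vs. human gold standard: {rho:.3f} (p={pval:.3g})")
```

Spearman only compares rank orderings, so the arbitrary scale of the signed-confidence scores does not affect the correlation with the human ratings.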
- Compare performance on the continuous-scale converted scores to binary classification performance (using, e.g., the MeMo dataset: https://huggingface.co/datasets/MiMe-MeMo/MeMo-Dataset-SA)
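For that comparison, one hedged option is to binarize the continuous scores at a threshold and score them against the MeMo labels. The sketch below assumes a `train` split with `text` and 0/1 `label` columns and a 0.0 threshold; check the dataset card for the actual schema and adjust the label handling to the model's output strings.

```python
# Sketch of the binary comparison -- not verified against the dataset card.
# Assumed: MeMo-Dataset-SA has a "train" split with "text" and 0/1 "label" columns,
# the model's label strings contain "pos"/"neg", and 0.0 is a sensible threshold.
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import pipeline

memo = load_dataset("MiMe-MeMo/MeMo-Dataset-SA", split="train")
texts = [" ".join(t.split()) for t in memo["text"]]
gold = memo["label"]

clf = pipeline("sentiment-analysis", model="MiMe-MeMo/MeMo-BERT-SA")

def signed_score(pred):
    """+confidence for positive, -confidence for negative, 0.0 for neutral."""
    label, conf = pred["label"].lower(), pred["score"]
    return conf if "pos" in label else (-conf if "neg" in label else 0.0)

continuous = [signed_score(p) for p in clf(texts, truncation=True)]
binary = [1 if s > 0 else 0 for s in continuous]

print("accuracy:", accuracy_score(gold, binary))
print("macro F1:", f1_score(gold, binary, average="macro"))
```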