Repository for the paper Measuring Idiomaticity in Text Embedding Models with epsilon-compositionality, EACL 2026.

Measuring Idiomaticity in Text Embedding Models with 𝜀-compositionality

This is the repository for our EACL 2026 paper:

Sondre Wold*, Étienne Simon*, Erik Velldal, Lilja Øvrelid. 2026. Measuring Idiomaticity in Text Embedding Models with 𝜀-compositionality

The data is included as a git submodule. From there, the Python module epsilon_compositionality runs all the necessary code and writes a results directory containing the LaTeX code we included in the article. To reproduce everything, run the following:

git clone --recurse-submodules https://github.com/ltgoslo/epsilon-compositionality
python -m epsilon_compositionality
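
If the repository was cloned without --recurse-submodules, the data submodule will be missing. The following is a minimal sketch, not part of this repository, of how one might detect that before running anything. It relies on the fact that `git submodule status` prefixes uninitialized submodules with `-`; the function names are hypothetical.

```python
import subprocess


def uninitialized_submodules(status_output: str) -> list[str]:
    """Return paths of submodules that `git submodule status` marks with '-'."""
    paths = []
    for line in status_output.splitlines():
        # A leading '-' means the submodule has not been initialized.
        if line.startswith("-"):
            parts = line[1:].split(maxsplit=1)  # [commit hash, path]
            if len(parts) == 2:
                paths.append(parts[1].strip())
    return paths


def check_checkout(repo_dir: str = ".") -> list[str]:
    """Run `git submodule status` in repo_dir and list uninitialized submodules."""
    result = subprocess.run(
        ["git", "submodule", "status"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return uninitialized_submodules(result.stdout)
```

If `check_checkout()` returns a non-empty list, running `git submodule update --init --recursive` inside the clone should fetch the missing data.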

For a more decomposed (pun intended) approach, python -m epsilon_compositionality is equivalent to running the following three stages in order:

python -m epsilon_compositionality.build_dataset
python -m epsilon_compositionality.extract_similarities
python -m epsilon_compositionality.compute_statistics
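
The three stages above can also be driven from a small script, for instance to stop at the first failure or to rerun only the later stages. This is a hedged sketch rather than code from the repository: `run_stages`, `stage_command`, and the `start_at` parameter are hypothetical names. Skipping earlier stages assumes their outputs are already on disk from a previous run, which this README does not state explicitly.

```python
import subprocess
import sys

# Stage modules, in the order listed above.
STAGES = [
    "epsilon_compositionality.build_dataset",
    "epsilon_compositionality.extract_similarities",
    "epsilon_compositionality.compute_statistics",
]


def stage_command(module: str) -> list[str]:
    """Build the `python -m <module>` command using the current interpreter."""
    return [sys.executable, "-m", module]


def run_stages(stages=STAGES, runner=subprocess.run, start_at=None):
    """Run each stage in order; check=True makes a failing stage raise.

    start_at (assumption): name of the first stage to run, skipping earlier ones.
    Returns the list of stage modules that were run.
    """
    stages = list(stages)
    if start_at is not None:
        stages = stages[stages.index(start_at):]
    for module in stages:
        runner(stage_command(module), check=True)
    return stages
```

Calling `run_stages()` from the repository root runs all three stages; `run_stages(start_at="epsilon_compositionality.compute_statistics")` would rerun only the last one.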
