DeepNSM

DeepNSM is a large language model fine-tuned to generate Natural Semantic Metalanguage (NSM) explications of word meanings. This repository includes the code, models, and experiments for the paper "Towards Universal Semantics With Large Language Models."

The models and dataset used in the paper are available on Hugging Face:

https://huggingface.co/baartmar/DeepNSM-1B
https://huggingface.co/baartmar/DeepNSM-8B
https://huggingface.co/baartmar/nsm_dataset
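
As a rough sketch of how the uploaded checkpoints could be loaded outside this repository, the snippet below uses the Auto classes from the Hugging Face transformers library. It assumes the checkpoints are standard causal-LM repositories that AutoModelForCausalLM can load, which this README does not state. The helper name load_deepnsm is hypothetical, and the prompt format the models expect is not covered here; test_deepnsm.py is the repository's own entry point.

```python
# Hub identifiers taken from the links above.
MODEL_IDS = {
    "1B": "baartmar/DeepNSM-1B",
    "8B": "baartmar/DeepNSM-8B",
}
DATASET_ID = "baartmar/nsm_dataset"


def load_deepnsm(size="1B"):
    """Hypothetical helper: load a DeepNSM checkpoint and its tokenizer.

    Assumption: the checkpoint is a standard causal LM on the Hub.
    """
    # Resolve the identifier first so an unknown size fails fast.
    model_id = MODEL_IDS[size]

    # Imported lazily so this file can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model
```

Calling load_deepnsm("1B") downloads the weights on first use, so it needs network access and enough disk space for the checkpoint.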

For more details, please see our preprint on arXiv:

@misc{baartmans2025universalsemanticslargelanguage,
      title={Towards Universal Semantics With Large Language Models}, 
      author={Raymond Baartmans and Matthew Raffel and Rahul Vikram and Aiden Deringer and Lizhong Chen},
      year={2025},
      eprint={2505.11764},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.11764}, 
}

Requirements

Python 3.12.5. The code in this repository is tested on Python 3.12.5. To use a newer version of Python, you may need to relax some of the version constraints in requirements.txt.
NVIDIA GPU with CUDA support. A card with more than 16 GB of VRAM is likely required to run the full experiments.
Disk space. You will probably need at least 8 GB of free disk space to install the packages and download model weights; running the full experiments likely needs much more.
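
The requirements above can be checked before installing anything. The following is a minimal sketch, not part of this repository: the function names are hypothetical, the thresholds are copied from the list above, and the GPU check is skipped when torch is not installed yet.

```python
import shutil
import sys

TESTED_PYTHON = (3, 12, 5)  # version the README says the code is tested on
MIN_FREE_GB = 8             # lower bound on free disk space from the README


def check_python(version=None):
    """Return True when the interpreter matches the tested version."""
    version = tuple(sys.version_info[:3]) if version is None else tuple(version)
    return version == TESTED_PYTHON


def free_disk_gb(path="."):
    """Free disk space at `path`, in GiB."""
    return shutil.disk_usage(path).free / 1024**3


def gpu_vram_gb():
    """Total memory of GPU 0 in GiB, or None if torch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


if __name__ == "__main__":
    print("Python matches tested version:", check_python())
    print(f"Free disk: {free_disk_gb():.1f} GiB (want at least {MIN_FREE_GB})")
    vram = gpu_vram_gb()
    print("GPU VRAM:", "not detected" if vram is None else f"{vram:.1f} GiB")
```

A mismatch in the Python check is not necessarily fatal; it only means you are outside the tested configuration.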

First-Time Setup

Create a virtual environment (recommended) and install dependencies

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Create a .env file and fill it in with the correct values and API keys. To run the

cp .env.example .env

(Optional) If you run an NVIDIA 50-series GPU or newer, you will likely need to install PyTorch 2.7 in place of the version specified in requirements.txt in order to run the LLMs properly.

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
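
To judge whether the reinstall above is needed, one option is to compare the GPU's compute capability with the architectures the installed PyTorch build was compiled for. The sketch below is an assumption-laden illustration: the helper name is hypothetical, the check ignores PTX forward compatibility, and the capability (12, 0) used in the usage note is an illustrative value rather than something stated in this README.

```python
def build_supports_gpu(capability, arch_list):
    """True if a PyTorch build lists native kernels for a compute capability.

    capability: (major, minor), e.g. from torch.cuda.get_device_capability()
    arch_list:  e.g. from torch.cuda.get_arch_list(), like ["sm_80", "sm_90"]
    """
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


if __name__ == "__main__":
    # Imported here so the helper above can be used without torch installed.
    import torch

    if torch.cuda.is_available():
        cap = torch.cuda.get_device_capability()
        archs = torch.cuda.get_arch_list()
        print("GPU capability:", cap, "| build architectures:", archs)
        print("Native support:", build_supports_gpu(cap, archs))
    else:
        print("No CUDA device detected.")
```

For example, build_supports_gpu((12, 0), ["sm_80", "sm_90"]) returns False, which would suggest installing a newer build.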

Start-Up

Activate the virtual environment you created and load the .env file

source .venv/bin/activate
source .env
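
Note that source .env makes variables visible to Python only if the shell exports them (for example, lines written as export KEY=value). The format of .env.example is not shown here, so as a hedged alternative, the sketch below reads simple KEY=value lines directly into os.environ. Both function names are hypothetical and not part of this repository, and the parser handles only plain assignments, comments, and one level of quoting.

```python
import os


def parse_env_lines(lines):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Accept an optional leading "export ".
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment
        value = value.strip()
        # Drop one pair of matching surrounding quotes.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_env_file(path=".env", override=False):
    """Load variables from `path` into os.environ and return them."""
    with open(path, encoding="utf-8") as handle:
        values = parse_env_lines(handle)
    for key, value in values.items():
        if override or key not in os.environ:
            os.environ[key] = value
    return values
```

Calling load_env_file() at the top of a script leaves variables already set in the shell untouched unless override=True is passed.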

Try out DeepNSM.

Colab Demo: https://colab.research.google.com/drive/1kWesMSQOgKOsXxONvZyinpdgh86gDBcy?usp=drive_link

To run the DeepNSM models on your machine, first follow the setup guide for this repository. To try the 8B variants, you will also need an NVIDIA GPU capable of running inference on 8B-parameter LLMs. The script below also lets you try DeepNSM-1B and Llama-3.2-1B for generating NSM explications.

python test_deepnsm.py

Run Experimental Evaluation

Follow setup and startup instructions, then run the following script.

mkdir results
python nsm_evaluation.py --config_path eval_config.json

You can view the config JSON (eval_config.json) to see which models are used for testing and evaluation. The run takes some time and requires a moderately strong GPU if you are running DeepNSM or Llama models locally. The results are stored in the "results" folder.
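
For a quick look at the config before a long run, a small script can list its top-level entries. The schema of eval_config.json is not described in this README, so the sketch below makes no assumption about key names and only reports what it finds; the function names are hypothetical.

```python
import json


def describe(value):
    """Short description of a JSON value."""
    if isinstance(value, dict):
        return f"object, {len(value)} key(s)"
    if isinstance(value, list):
        return f"list, {len(value)} item(s)"
    return repr(value)


def summarize_config(config):
    """Map each top-level key of a parsed JSON config to a description."""
    if not isinstance(config, dict):
        return {"<root>": describe(config)}
    return {key: describe(value) for key, value in config.items()}


def summarize_config_file(path="eval_config.json"):
    """Read a JSON file and summarize its top-level structure."""
    with open(path, encoding="utf-8") as handle:
        return summarize_config(json.load(handle))


if __name__ == "__main__":
    for key, description in summarize_config_file().items():
        print(f"{key}: {description}")
```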

Llama3 Fine-Tuning for NSM

Installation

Install the dependencies with pip install -r requirements.txt.

Example

Run the following command to fine-tune. Set SAVE_DIR to the directory where checkpoints should be written before running it.

python3 train.py \
	--model meta-llama/Llama-3.2-1B --training-set baartmar/nsm_dataset \
	--lora-alpha 16 --lora-dropout 0.1 --lora-r 64 --peft \
	--use-4bit --bnb-4bit-compute-dtype bfloat16 --bnb-4bit-quant-type nf4 --bnb \
	--bsz 64 --update-freq 1 --optim paged_adamw_32bit --lr 2e-4 --lr-scheduler inverse_sqrt \
	--warmup-ratio 0.03 --max-grad-norm 0.3 \
	--save-interval 1000 --eval-interval 1000 --log-interval 1000 \
	--max-seq-length 256 --save-strategy steps --num-train-epochs 1 \
	--output-dir ${SAVE_DIR} \
	--eval-strategy steps --train
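
To make the hyperparameters in the command above more concrete, the sketch below works through the arithmetic they imply. It rests on assumptions not confirmed by this README: that --update-freq is a gradient-accumulation factor, that the LoRA update is scaled by alpha / r as in the standard formulation, and that warmup steps are the ceiling of ratio times total steps. The dataset size in the usage note is a made-up figure for illustration only.

```python
import math


def lora_scale(alpha, r):
    """Scaling applied to the low-rank update in standard LoRA: alpha / r."""
    return alpha / r


def effective_batch_size(bsz, update_freq):
    """Examples per optimizer step, assuming update_freq is gradient accumulation."""
    return bsz * update_freq


def steps_per_epoch(num_examples, bsz, update_freq):
    """Optimizer steps needed to see every example once."""
    return math.ceil(num_examples / effective_batch_size(bsz, update_freq))


def warmup_steps(total_steps, warmup_ratio):
    """Warmup length, assuming it is the ceiling of ratio * total steps."""
    return math.ceil(total_steps * warmup_ratio)


if __name__ == "__main__":
    # Values from the command above.
    print("LoRA scale:", lora_scale(alpha=16, r=64))
    print("Effective batch size:", effective_batch_size(bsz=64, update_freq=1))
    # Hypothetical dataset size, for illustration only.
    steps = steps_per_epoch(num_examples=64_000, bsz=64, update_freq=1)
    print("Steps per epoch:", steps)
    print("Warmup steps:", warmup_steps(steps, warmup_ratio=0.03))
```

With --lora-alpha 16 and --lora-r 64 the scale is 0.25, so the adapter's contribution is damped relative to its raw low-rank product.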

Repository: OSU-STARLAB/DeepNSM
