CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning

Installation

Clone the repository:

git clone https://github.com/joshuaongg21/CoMAT.git

Navigate to the project directory:

cd CoMAT

Install the required packages:

pip install -r requirements.txt

Evaluation

To evaluate CoMAT and the baseline methods, run the following command:

python main.py --dataset [dataset] --method [method] --model [model] --dataconfig [dataconfig]

For example:

python main.py --dataset AQUA --method symbolicot --model gpt --dataconfig normal

Datasets

You can evaluate the following datasets:

MMLU-Redux
AQUA
GSM8K
Olympiad Bench
GaoKao Mathematics
MGSM

Methods

The evaluation can be performed using different methods:

noncot: Standard question-answering without reasoning steps.
cot: Chain-of-thought reasoning, in which the model works through intermediate steps before answering.
comat: Our CoMAT approach, which uses symbolic reasoning throughout the reasoning process.

Models

The following models are supported:

gpt: gpt-4o
gemini: gemini-1.5-pro
qwen2: 7b and 72b

Dataconfig (optional)

normal: Leaves the dataset unchanged (default).
swapping: Randomly swaps the answers and options.
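To make the swapping configuration concrete, here is a minimal sketch of what randomly reordering the options of a multiple-choice question could look like. It is not taken from this repository: the function name swap_options, its parameters, and the sample options are hypothetical and chosen only for illustration. The point it shows is that the index of the correct answer has to be remapped whenever the options are shuffled.

```python
import random


def swap_options(options, answer_index, seed=None):
    """Shuffle the options and return them with the new index of the correct answer.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    rng = random.Random(seed)
    order = list(range(len(options)))
    rng.shuffle(order)  # order[k] is the original index now shown at position k
    shuffled = [options[i] for i in order]
    new_answer_index = order.index(answer_index)
    return shuffled, new_answer_index


# Made-up example question with four options; the correct one is "18".
options = ["12", "15", "18", "21"]
shuffled, new_index = swap_options(options, answer_index=2, seed=0)
print(shuffled, new_index)
```

Passing a fixed seed keeps the shuffle reproducible, which helps when comparing methods on the same perturbed data.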

Alternatively, run the evaluation with the provided bash script:

bash evaluate.sh
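If you want to run several combinations without editing the script, the sketch below builds one main.py command per dataset, method, and model combination. It assumes only the four flags documented above. The helper build_commands is hypothetical, and identifiers other than those in the example command (for instance GSM8K as a --dataset value) are assumptions about what main.py accepts.

```python
import itertools
import shlex


def build_commands(datasets, methods, models, dataconfig="normal"):
    """Return one shell command string per (dataset, method, model) combination."""
    commands = []
    for dataset, method, model in itertools.product(datasets, methods, models):
        args = [
            "python", "main.py",
            "--dataset", dataset,
            "--method", method,
            "--model", model,
            "--dataconfig", dataconfig,
        ]
        commands.append(shlex.join(args))  # quotes arguments safely if needed
    return commands


# Assumed identifiers; check main.py for the exact accepted values.
commands = build_commands(["AQUA", "GSM8K"], ["cot", "comat"], ["gpt"])
for command in commands:
    print(command)
```

Each printed line can be pasted into a shell or passed to subprocess.run after splitting it with shlex.split.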

joshuaongg21/CoMAT
