Its intended use is within the context of Small Language Model (SLM) evaluation.
This task uses a subset of the [SciKnowEval](https://huggingface.co/datasets/hicai-zju/SciKnowEval) dataset. Specifically, it filters out non-MCQA samples and focuses on questions from levels L1, L2, and L3, which are designed to assess knowledge memory, comprehension, and reasoning, respectively, as described in the original [paper](https://arxiv.org/pdf/2406.09098v2).
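
For illustration, here is a minimal sketch of the filtering described above using the Hugging Face `datasets` library. The column names `type` and `level`, the split name, and the exact MCQA type labels are assumptions, not the task's actual implementation; check the dataset schema (and any required config name) before relying on this.

```python
from datasets import load_dataset

# Assumption: the dataset loads without a config and exposes a "test" split.
ds = load_dataset("hicai-zju/SciKnowEval", split="test")

def is_mcqa_l1_to_l3(example):
    # Keep only multiple-choice samples from levels L1-L3.
    # "type" and "level" are hypothetical column names used for illustration.
    return "mcq" in str(example["type"]).lower() and example["level"] in {"L1", "L2", "L3"}

subset = ds.filter(is_mcqa_l1_to_l3)
print(f"kept {len(subset)} of {len(ds)} samples")
```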
The full SciKnowEval dataset is a comprehensive benchmark for evaluating the scientific knowledge and reasoning capabilities of Large Language Models (LLMs). It spans four scientific domains: Physics, Chemistry, Biology, and Materials.