Feat: Add team Shaikespear submission from NeurIPS E2LM Competition #3437

younesbelkada · 2025-11-29T10:32:17Z

What does this PR do?

This PR adds a new evaluation benchmark that was submitted to the NeurIPS 2025 E2LM competition, where it placed 3rd on the general leaderboard.

The benchmark is intended for evaluating Small Language Models (SLMs) in early training stages. More details are provided in the competition proposal paper.

Example command to get started:

lm_eval --model hf \
    --model_args pretrained=EleutherAI/pythia-160m,revision=step100000,dtype="float" \
    --tasks sciknoweval_mcqa \
    --device cuda:0 \
    --batch_size 8
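
Since the benchmark targets early training stages, it can be useful to run the same command over several checkpoints of one model. The following is a minimal sketch, not part of this PR: the helper `build_lm_eval_command` and the `REVISIONS` list are hypothetical, and the checkpoint names are assumed to exist as revisions of `EleutherAI/pythia-160m`. It only builds and prints the commands, reusing the flags from the example above.

```python
import shlex


def build_lm_eval_command(
    revision,
    task="sciknoweval_mcqa",
    model="EleutherAI/pythia-160m",
    device="cuda:0",
    batch_size=8,
):
    """Build the lm_eval argument list for one checkpoint (hypothetical helper)."""
    model_args = f"pretrained={model},revision={revision},dtype=float"
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", model_args,
        "--tasks", task,
        "--device", device,
        "--batch_size", str(batch_size),
    ]


# Assumed checkpoint revisions; adjust to the ones you want to compare.
REVISIONS = ["step1000", "step10000", "step100000"]

if __name__ == "__main__":
    for revision in REVISIONS:
        # Print a copy-pasteable shell command for each checkpoint.
        print(shlex.join(build_lm_eval_command(revision)))
```

Each printed line can be pasted into a shell, or the argument list can be passed to `subprocess.run` to launch the evaluation directly.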

Original authors

@DaGrapix @EricSaikali

@baberabb

Co-authored-by: Anthony Kalaydjian <[email protected]>
Co-authored-by: EricSaikali <[email protected]>

Add Shaipeskear submission

e119859


younesbelkada requested a review from baberabb as a code owner November 29, 2025 10:32

pre-commit

0214e6c

younesbelkada changed the title ~~Feat: Add team Shaikespear submission~~ Feat: Add team Shaikespear submission from NeurIPS E2LM Competition Nov 29, 2025
