Hi, I am having trouble reproducing the experimental results on the OpenBookQA dataset. The output format is unexpected: for instance, I get responses like "1 is correct. 2 is incorrect. 3 is incorrect. 4 is incorrect.", whereas the expected format is "answer1". Could you provide the exact commands or instructions for both fine-tuning and evaluation so that the OpenBookQA results can be reproduced accurately?
I have replied to your email in case you haven't seen it.
You can refer to issue #38; we use commonsense_evaluate.py for evaluation. Also, please use a single GPU for training: multi-GPU training may not reproduce the results, and we are still trying to figure out why.
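For reference, here is a minimal single-GPU sketch of the fine-tuning and evaluation commands. The script names come from this thread, but the specific flags, model identifiers, and paths are assumptions; check `finetune.py --help` and `commonsense_evaluate.py --help` for the actual argument names before running.

```bash
# Restrict the run to one GPU, since multi-GPU training may not reproduce the results.
export CUDA_VISIBLE_DEVICES=0

# Fine-tuning sketch (flag names, model, and paths are assumptions; adjust to your setup).
python finetune.py \
  --base_model 'yahma/llama-7b-hf' \
  --data_path 'ft-training_set/commonsense_170k.json' \
  --output_dir './trained_models/llama-7b-lora' \
  --adapter_name lora

# Evaluation on OpenBookQA with commonsense_evaluate.py (mentioned above);
# the --model / --adapter / --dataset values here are likewise assumptions.
python commonsense_evaluate.py \
  --model LLaMA-7B \
  --adapter LoRA \
  --dataset openbookqa \
  --base_model 'yahma/llama-7b-hf' \
  --lora_weights './trained_models/llama-7b-lora'
```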
If you have further questions, please let us know!