Help reproducing TriviaQA results #33

Open
HYZ17 opened this issue Jan 14, 2024 · 4 comments

@HYZ17

HYZ17 commented Jan 14, 2024

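Hello, I am trying to reproduce the base model results (7B and 67B) on TriviaQA. Even with the prompt format from the tech report, my numbers are still about 7 points off. Could you provide the code for reproducing them? Thank you for your help.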

@luofuli
Contributor

luofuli commented Feb 4, 2024

@hwxu20 @DeepSeekPH

@hwxu20
Contributor

hwxu20 commented Feb 5, 2024

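For TriviaQA we evaluate on the web subset. In the actual evaluation, the few-shot examples for each sample are picked at random from the train split; the tech report only shows one such example.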
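The per-sample random selection described above might look roughly like the following. This is only a sketch: the `Q:`/`A:` template, the `question`/`answer` field names, the default `k=5`, and the helper name `build_few_shot_prompt` are all assumptions for illustration, not the format or shot count used in the actual evaluation.

```python
import random


def build_few_shot_prompt(train_examples, question, k=5, rng=None):
    """Build a prompt with k demonstrations drawn at random from the train split.

    Hypothetical helper: the template and field names are assumptions.
    """
    rng = rng or random.Random()
    # Draw a fresh set of demonstrations for this test sample (no repeats).
    shots = rng.sample(train_examples, k)
    parts = [f"Q: {ex['question']}\nA: {ex['answer']}" for ex in shots]
    # The test question goes last, with the answer left open for the model.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)
```

Passing a seeded `random.Random` makes the selection repeatable across runs, which helps when comparing numbers between two evaluation setups.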
Also, the 7-point gap may come from a difference in answer post-processing. Here is the post-processing script we use, for reference:

import re
import string


def normalize_answer(s):
    """Lower text and remove punctuation, articles and extra whitespace."""

    def remove_articles(text):
        # Replace standalone English articles with a space.
        return re.sub(r"\b(a|an|the)\b", " ", text)

    def white_space_fix(text):
        # Collapse runs of whitespace and trim both ends.
        return " ".join(text.split())

    def handle_punc(text):
        # Replace ASCII punctuation and a few quote-like characters with spaces.
        exclude = set(string.punctuation + "".join(["‘", "’", "´", "`"]))
        return "".join(ch if ch not in exclude else " " for ch in text)

    def lower(text):
        return text.lower()

    def replace_underscore(text):
        return text.replace("_", " ")

    # Order: underscores -> lowercase -> punctuation -> articles -> whitespace.
    return white_space_fix(remove_articles(handle_punc(lower(replace_underscore(s))))).strip()
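
As a rough illustration of how this normalization could feed into an exact-match score, here is a minimal sketch. It assumes a prediction counts as correct when it equals any gold alias after normalization; that matching rule and the helper names `exact_match` and `em_score` are assumptions, since the actual evaluation script is not shown in this thread. `normalize_answer` is restated in condensed form only so the block runs on its own.

```python
import re
import string

_PUNCT = set(string.punctuation + "‘’´`")


def normalize_answer(s):
    # Condensed restatement of the post-processing above (same order of steps).
    s = s.replace("_", " ").lower()
    s = "".join(" " if ch in _PUNCT else ch for ch in s)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def exact_match(prediction, gold_aliases):
    # Hypothetical rule: correct if the prediction matches any alias exactly.
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(alias) for alias in gold_aliases)


def em_score(predictions, references):
    # Percentage of predictions that match at least one of their gold aliases.
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

For example, `exact_match("The_Beatles", ["Beatles"])` is true under this sketch, because the underscore, case, and leading article are all normalized away before comparison.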

@RoacherM

Where can I find your evaluation script? I'd like to try reproducing the results.

@hwxu20
Contributor

hwxu20 commented Mar 21, 2024

The evaluation script has not been open-sourced yet.
