release kbl-v0.1 #2476

Closed
wants to merge 8 commits into from
36 changes: 36 additions & 0 deletions lm_eval/tasks/kbl/bar_exam/civil/_base_em_yaml
@@ -0,0 +1,36 @@
tag:
- kbl
- kbl_bar_exam_em
- kbl_bar_exam_em_civil
description: '당신은 사용자의 질문에 친절하고 논리적으로 답변해 주는 법률 전문가 챗봇 입니다.\n'
dataset_path: lbox/kbl
test_split: test
output_type: generate_until
doc_to_text: '### 질문: {{question}}

  다음 각 선택지를 읽고 A, B, C, D, E 중 하나를 선택하여 ''답변: A'' 와 같이 단답식으로 답해 주세요.

  A. {{A}}

  B. {{B}}

  C. {{C}}

  D. {{D}}

  E. {{E}}

  ### 답변: '
doc_to_target: gt
metric_list:
- metric: exact_match
  aggregation: mean
  higher_is_better: true
  ignore_case: true
  ignore_punctuation: true
filter_list:
- name: get-answer
  filter:
  - function: regex
    regex_pattern: ([A-E]).*
  - function: take_first
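The `filter_list` and `metric_list` entries above describe a two-step scoring path: pull the first option letter out of the generation, then compare it with `gt`. The following is a minimal Python sketch of that behaviour using only the standard library. It is not the harness's own code; the helper names `extract_answer` and `exact_match` and the `"[invalid]"` fallback label are assumptions made for illustration.

```python
import re
import string

# Same pattern as regex_pattern in the config: first capital letter A-E.
ANSWER_PATTERN = re.compile(r"([A-E]).*")


def extract_answer(generation: str, fallback: str = "[invalid]") -> str:
    """Return the first captured option letter, or a fallback if none is found.

    The fallback label is an assumption for this sketch.
    """
    match = ANSWER_PATTERN.search(generation)
    return match.group(1) if match else fallback


def exact_match(prediction: str, target: str) -> float:
    """Compare two strings after lowercasing and stripping ASCII punctuation,
    mirroring ignore_case / ignore_punctuation in spirit."""
    strip_punct = str.maketrans("", "", string.punctuation)

    def normalize(text: str) -> str:
        return text.translate(strip_punct).lower().strip()

    return float(normalize(prediction) == normalize(target))


print(extract_answer("답변: B"))    # B
print(extract_answer("Answer: C"))  # A  (the "A" in "Answer" matches first)
print(exact_match(extract_answer("답변: B"), "b"))  # 1.0
```

One thing the sketch makes visible: because the pattern is unanchored, any capital A-E that appears before the intended letter (for example an English preamble such as "Answer:") is what gets captured.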
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_civil_2012
dataset_name: bar_exam_civil_2012
include: _base_em_yaml
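Each per-year file carries only `task`, `dataset_name`, and `include`; everything else comes from `_base_em_yaml`. A rough Python sketch of that composition is below, assuming the including file's keys take precedence over the base file's keys. The function `resolve_include` and the trimmed-down dictionaries are hypothetical stand-ins for illustration, not the harness's loader.

```python
# Trimmed-down stand-ins for the parsed YAML files (illustrative only).
BASE_CONFIG = {
    "dataset_path": "lbox/kbl",
    "test_split": "test",
    "output_type": "generate_until",
    "doc_to_target": "gt",
}

YEAR_CONFIG = {
    "task": "kbl_bar_exam_em_civil_2012",
    "dataset_name": "bar_exam_civil_2012",
    "include": "_base_em_yaml",
}


def resolve_include(child: dict, base: dict) -> dict:
    """Start from the base config, then overlay the child's own keys.

    Assumption: the child wins on conflicts and `include` itself is dropped.
    """
    merged = dict(base)
    merged.update({key: value for key, value in child.items() if key != "include"})
    return merged


merged = resolve_include(YEAR_CONFIG, BASE_CONFIG)
print(merged["task"])          # kbl_bar_exam_em_civil_2012
print(merged["dataset_path"])  # lbox/kbl
```

Under this reading, two files that declare the same `task` value would resolve to configs with the same task name, so task names need to be unique across the per-year files.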
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_civil_2013
dataset_name: bar_exam_civil_2013
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_civil_2014
dataset_name: bar_exam_civil_2014
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_civil_2015
dataset_name: bar_exam_civil_2015
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_civil_2016
dataset_name: bar_exam_civil_2016
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_civil_2017
dataset_name: bar_exam_civil_2017
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_civil_2018
dataset_name: bar_exam_civil_2018
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_civil_2019
dataset_name: bar_exam_civil_2019
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_civil_2020
dataset_name: bar_exam_civil_2020
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_civil_2021
dataset_name: bar_exam_civil_2021
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_civil_2022
dataset_name: bar_exam_civil_2022
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_civil_2023
dataset_name: bar_exam_civil_2023
include: _base_em_yaml
@@ -0,0 +1,5 @@
task: kbl_bar_exam_em_civil_2023
dataset_kwargs:
  data_files:
    test: bar_exam/civil/civil2023.json
include: _base_em_yaml
36 changes: 36 additions & 0 deletions lm_eval/tasks/kbl/bar_exam/criminal/_base_em_yaml
@@ -0,0 +1,36 @@
tag:
- kbl
- kbl_bar_exam_em
- kbl_bar_exam_em_criminal
description: '당신은 사용자의 질문에 친절하고 논리적으로 답변해 주는 법률 전문가 챗봇 입니다.\n'
dataset_path: lbox/kbl
test_split: test
output_type: generate_until
doc_to_text: '### 질문: {{question}}

  다음 각 선택지를 읽고 A, B, C, D, E 중 하나를 선택하여 ''답변: A'' 와 같이 단답식으로 답해 주세요.

  A. {{A}}

  B. {{B}}

  C. {{C}}

  D. {{D}}

  E. {{E}}

  ### 답변: '
doc_to_target: gt
metric_list:
- metric: exact_match
  aggregation: mean
  higher_is_better: true
  ignore_case: true
  ignore_punctuation: true
filter_list:
- name: get-answer
  filter:
  - function: regex
    regex_pattern: ([A-E]).*
  - function: take_first
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_criminal_2012
dataset_name: bar_exam_criminal_2012
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_criminal_2013
dataset_name: bar_exam_criminal_2013
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_criminal_2014
dataset_name: bar_exam_criminal_2014
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_criminal_2015
dataset_name: bar_exam_criminal_2015
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_criminal_2016
dataset_name: bar_exam_criminal_2016
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_criminal_2017
dataset_name: bar_exam_criminal_2017
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_criminal_2018
dataset_name: bar_exam_criminal_2018
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_criminal_2019
dataset_name: bar_exam_criminal_2019
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_criminal_2020
dataset_name: bar_exam_criminal_2020
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_criminal_2021
dataset_name: bar_exam_criminal_2021
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_criminal_2022
dataset_name: bar_exam_criminal_2022
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_criminal_2023
dataset_name: bar_exam_criminal_2023
include: _base_em_yaml
@@ -0,0 +1,5 @@
task: kbl_bar_exam_em_criminal_2024
dataset_kwargs:
  data_files:
    test: bar_exam/criminal/criminal2024.json
include: _base_em_yaml
36 changes: 36 additions & 0 deletions lm_eval/tasks/kbl/bar_exam/public/_base_em_yaml
@@ -0,0 +1,36 @@
tag:
- kbl
- kbl_bar_exam_em
- kbl_bar_exam_em_public
description: '당신은 사용자의 질문에 친절하고 논리적으로 답변해 주는 법률 전문가 챗봇 입니다.\n'
dataset_path: lbox/kbl
test_split: test
output_type: generate_until
doc_to_text: '### 질문: {{question}}

  다음 각 선택지를 읽고 A, B, C, D, E 중 하나를 선택하여 ''답변: A'' 와 같이 단답식으로 답해 주세요.

  A. {{A}}

  B. {{B}}

  C. {{C}}

  D. {{D}}

  E. {{E}}

  ### 답변: '
doc_to_target: gt
metric_list:
- metric: exact_match
  aggregation: mean
  higher_is_better: true
  ignore_case: true
  ignore_punctuation: true
filter_list:
- name: get-answer
  filter:
  - function: regex
    regex_pattern: ([A-E]).*
  - function: take_first
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_public_2012
dataset_name: bar_exam_public_2012
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_public_2013
dataset_name: bar_exam_public_2013
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_public_2014
dataset_name: bar_exam_public_2014
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_public_2015
dataset_name: bar_exam_public_2015
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_public_2016
dataset_name: bar_exam_public_2016
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_public_2017
dataset_name: bar_exam_public_2017
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_public_2018
dataset_name: bar_exam_public_2018
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_public_2019
dataset_name: bar_exam_public_2019
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_public_2020
dataset_name: bar_exam_public_2020
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_public_2021
dataset_name: bar_exam_public_2021
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_public_2022
dataset_name: bar_exam_public_2022
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_public_2023
dataset_name: bar_exam_public_2023
include: _base_em_yaml
@@ -0,0 +1,5 @@
task: kbl_bar_exam_em_public_2024
dataset_kwargs:
  data_files:
    test: bar_exam/public/public2024.json
include: _base_em_yaml
34 changes: 34 additions & 0 deletions lm_eval/tasks/kbl/bar_exam/responsibility/_base_em_yaml
@@ -0,0 +1,34 @@
tag:
- kbl
- kbl_bar_exam_em
- kbl_bar_exam_em_responsibility
description: '당신은 사용자의 질문에 친절하고 논리적으로 답변해 주는 법률 전문가 챗봇 입니다.\n'
dataset_path: lbox/kbl
test_split: test
output_type: generate_until
doc_to_text: '### 질문: {{question}}

  다음 각 선택지를 읽고 A, B, C, D 중 하나를 선택하여 ''답변: A'' 와 같이 단답식으로 답해 주세요.

  A. {{A}}

  B. {{B}}

  C. {{C}}

  D. {{D}}

  ### 답변: '
doc_to_target: gt
metric_list:
- metric: exact_match
  aggregation: mean
  higher_is_better: true
  ignore_case: true
  ignore_punctuation: true
filter_list:
- name: get-answer
  filter:
  - function: regex
    regex_pattern: ([A-D]).*
  - function: take_first
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_responsibility_2010
dataset_name: bar_exam_responsibility_2010
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_responsibility_2011
dataset_name: bar_exam_responsibility_2011
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_responsibility_2012
dataset_name: bar_exam_responsibility_2012
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_responsibility_2013
dataset_name: bar_exam_responsibility_2013
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_responsibility_2014
dataset_name: bar_exam_responsibility_2014
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_responsibility_2015
dataset_name: bar_exam_responsibility_2015
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_responsibility_2016
dataset_name: bar_exam_responsibility_2016
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_responsibility_2017
dataset_name: bar_exam_responsibility_2017
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_responsibility_2018
dataset_name: bar_exam_responsibility_2018
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_responsibility_2019
dataset_name: bar_exam_responsibility_2019
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_responsibility_2020
dataset_name: bar_exam_responsibility_2020
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_responsibility_2021
dataset_name: bar_exam_responsibility_2021
include: _base_em_yaml
@@ -0,0 +1,3 @@
task: kbl_bar_exam_em_responsibility_2022
dataset_name: bar_exam_responsibility_2022
include: _base_em_yaml
@@ -0,0 +1,5 @@
task: kbl_bar_exam_em_responsibility_2023
dataset_kwargs:
  data_files:
    test: bar_exam/responsibility/responsibility2023.json
include: _base_em_yaml
20 changes: 20 additions & 0 deletions lm_eval/tasks/kbl/knowledge/_kbl_knowledge_yaml
@@ -0,0 +1,20 @@
tag:
- kbl
- kbl_knowledge_em
description: '당신은 사용자의 질문에 친절하고 논리적으로 답변해 주는 법률 전문가 챗봇 입니다.\n'
dataset_path: lbox/kbl
test_split: test
output_type: generate_until
doc_to_target: "{{label}}"
metric_list:
- metric: exact_match
  aggregation: mean
  higher_is_better: true
  ignore_case: true
  ignore_punctuation: true
filter_list:
- name: "get-answer"
  filter:
  - function: "regex"
    regex_pattern: "([A-E]).*"
  - function: "take_first"
@@ -0,0 +1,4 @@
task: kbl_knowledge_common_legal_mistake_qa_em
dataset_name: kbl_knowledge_common_legal_mistake_qa
doc_to_text: "### 질문: {{question}}\nA. {{A}}\nB. {{B}}\nC. {{C}}\n'A', 'B', 'C' 중 하나를 선택하여 '답변: A' 와 같이 단답식으로 답해 주세요."
include: _kbl_knowledge_yaml
@@ -0,0 +1,4 @@
task: kbl_common_legal_mistake_qa_reasoning_em
dataset_name: kbl_knowledge_common_legal_mistake_qa_reasoning
doc_to_text: "### 질문: {{question}}\nA. {{A}}\nB. {{B}}\nC. {{C}}\n'A', 'B', 'C' 중 하나를 선택하여 '답변: A' 와 같이 단답식으로 답해 주세요."
include: _kbl_knowledge_yaml
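The `doc_to_text` values are templates whose `{{question}}`, `{{A}}`, `{{B}}`, and `{{C}}` placeholders are filled from each dataset row. As a rough illustration of what a rendered prompt looks like, here is a standard-library Python sketch. It substitutes a small regex-based `render` helper for a real template engine, and both the helper and the sample row are hypothetical.

```python
import re

# Matches simple {{name}} placeholders only; this is a simplification,
# not a full template language.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, doc: dict) -> str:
    """Replace each {{field}} with the matching value from the document."""
    return PLACEHOLDER.sub(lambda m: str(doc[m.group(1)]), template)


# Shortened version of the prompt template (instruction sentence omitted).
TEMPLATE = "### 질문: {{question}}\nA. {{A}}\nB. {{B}}\nC. {{C}}\n"

# Hypothetical row; the real dataset fields may hold different content.
sample_doc = {"question": "Q?", "A": "yes", "B": "no", "C": "maybe"}

print(render(TEMPLATE, sample_doc))
```

A missing field raises `KeyError` in this sketch, which is a deliberate choice so that a template referencing a column the row lacks fails loudly.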