Commit 9c79834: add labr arabic task
issamYahiaoui committed Mar 26, 2024 (1 parent 1dc5cae)
Showing 3 changed files with 97 additions and 0 deletions.
50 additions & 0 deletions lm_eval/tasks/arabic_tasks/README.md
@@ -0,0 +1,50 @@
# ArSentimentAnalysisLabr

### Paper

Title: `LABR: A Large Scale Arabic Book Reviews Dataset`

Abstract: https://aclanthology.org/P13-2088.pdf

`This dataset contains over 63,000 book reviews in Arabic. It is the largest sentiment analysis dataset for Arabic to date. The book reviews were harvested from the website Goodreads during the month of March 2013. Each book review comes with the Goodreads review id, the user id, the book id, the rating (1 to 5) and the text of the review.`

Homepage: https://aclanthology.org/P13-2088.pdf


### Citation

```
@inproceedings{aly2013labr,
  title={{LABR}: A Large Scale {A}rabic Book Reviews Dataset},
  author={Aly, Mohamed and Atiya, Amir},
  booktitle={Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
  pages={494--498},
  year={2013}
}
```

### Groups and Tasks

#### Groups

* Not part of a group.

#### Tasks

* `ar_sentiment_analysis_labr`: Predict the star rating (1 to 5) of an Arabic book review from the LABR dataset.

### Checklist

For adding novel benchmarks/datasets to the library:
* [ ] Is the task an existing benchmark in the literature?
* [ ] Have you referenced the original paper that introduced the task?
* [ ] If yes, does the original paper provide a reference implementation? If so, have you checked against the reference implementation and documented how to run such a test?


If other tasks on this dataset are already supported:
* [ ] Is the "Main" variant of this task clearly denoted?
* [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates?
* [ ] Have you noted which, if any, published evaluation setups are matched by this variant?
16 additions & 0 deletions lm_eval/tasks/arabic_tasks/ar_sentiment_analysis_labr.yaml
@@ -0,0 +1,16 @@
task: ar_sentiment_analysis_labr
dataset_path: labr
output_type: multiple_choice
training_split: train
test_split: test
doc_to_text: !function preprocess_ar_sentiment_analysis_labr.doc_to_text
doc_to_target: !function preprocess_ar_sentiment_analysis_labr.doc_to_target
process_docs: !function preprocess_ar_sentiment_analysis_labr.process_docs
doc_to_choice: ["1", "2", "3", "4", "5"]
metric_list:
- metric: acc
aggregation: mean
higher_is_better: true
- metric: acc_norm
aggregation: mean
higher_is_better: true
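The config reports both `acc` and `acc_norm`. In the harness, `acc` picks the choice with the highest raw log-likelihood, while `acc_norm` normalizes each choice's log-likelihood by the byte length of the choice string. A minimal sketch of the difference (the log-likelihood values below are invented for illustration, not harness output):

```python
# Sketch: how `acc` vs `acc_norm` pick a choice from log-likelihood scores.
# The scores are hypothetical; the choices match the task config.

choices = ["1", "2", "3", "4", "5"]
# Hypothetical summed log-likelihoods for each continuation.
loglikelihoods = [-4.1, -3.9, -4.5, -3.8, -4.0]

# `acc`: argmax of the raw log-likelihood.
pred_acc = max(range(len(choices)), key=lambda i: loglikelihoods[i])

# `acc_norm`: argmax of the log-likelihood divided by the byte length
# of the choice string.
pred_norm = max(
    range(len(choices)),
    key=lambda i: loglikelihoods[i] / len(choices[i].encode("utf-8")),
)

print(choices[pred_acc], choices[pred_norm])  # prints "4 4"
```

Since every choice here is a single one-byte digit, length normalization is a no-op and the two metrics will always agree for this task.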
31 additions & 0 deletions lm_eval/tasks/arabic_tasks/preprocess_ar_sentiment_analysis_labr.py
@@ -0,0 +1,31 @@
import datasets


def process_docs(dataset: datasets.Dataset) -> datasets.Dataset:
    choices = ["1", "2", "3", "4", "5"]

    def _helper(doc):
        # Reshape a single document into the fields the task config expects.
        doc["query"] = doc["text"]  # The query prompt.
        doc["choices"] = choices
        doc["gold"] = doc["label"]  # Index of the gold answer in `choices`.
        return doc

    return dataset.map(_helper)  # Returns a datasets.Dataset object.

def doc_to_text(doc) -> str:
    return (
        "You are a highly intelligent Arabic speaker who analyzes the following text and answers with its sentiment rating.\nOnly write the answer down."
        + "\n\n**Text:** " + doc["query"] + "\n\n"
        + ",".join(doc["choices"])
        + "\n\n**Answer:**"
    )


def doc_to_target(doc) -> int:
    # For `multiple_choice` tasks with an explicit `doc_to_choice`, the target
    # is the index of the gold answer in `choices`, not the answer string.
    return doc["gold"]
