Commit 7d44a7a

Update README.md
1 parent c955e4d commit 7d44a7a

1 file changed: +36 -1 lines changed

README.md

+36-1
@@ -1 +1,36 @@
-# IndicCounterSpeech
+# Low-Resource Counterspeech Generation for Indic Languages: The Case of Bengali and Hindi
+[[Paper: Low-Resource Counterspeech Generation for Indic Languages: The Case of Bengali and Hindi]](https://aclanthology.org/2024.findings-eacl.111/)
+
+*Mithun Das, Saurabh Kumar Pandey, Shivansh Sethi, Punyajoy Saha, Animesh Mukherjee* \
+**Indian Institute of Technology Kharagpur** \
+[European Chapter of the Association for Computational Linguistics (EACL 2024)](https://2024.eacl.org/)
+
+## Abstract
+
+With the rise of online abuse, the NLP community has begun investigating the use of neural architectures to generate counterspeech that can “counter” the vicious tone of such abusive speech and dilute/ameliorate their rippling effect over the social network. However, most of the efforts so far have been primarily focused on English. To bridge the gap for low-resource languages such as Bengali and Hindi, we create a benchmark dataset of 5,062 abusive speech/counterspeech pairs, of which 2,460 pairs are in Bengali, and 2,602 pairs are in Hindi. We implement several baseline models considering various interlingual transfer mechanisms with different configurations to generate suitable counterspeech to set up an effective benchmark. We observe that the monolingual setup yields the best performance. Further, using synthetic transfer, language models can generate counterspeech to some extent; specifically, we notice that transferability is better when languages belong to the same language family.
+
+**[Note]** Code release is in progress. Stay tuned!!
+
+# Citation
+
+## If you find our work useful, please cite using:
+```
+@inproceedings{das-etal-2024-low,
+title = "Low-Resource Counterspeech Generation for {I}ndic Languages: The Case of {B}engali and {H}indi",
+author = "Das, Mithun and
+Pandey, Saurabh and
+Sethi, Shivansh and
+Saha, Punyajoy and
+Mukherjee, Animesh",
+editor = "Graham, Yvette and
+Purver, Matthew",
+booktitle = "Findings of the Association for Computational Linguistics: EACL 2024",
+month = mar,
+year = "2024",
+address = "St. Julian{'}s, Malta",
+publisher = "Association for Computational Linguistics",
+url = "https://aclanthology.org/2024.findings-eacl.111",
+pages = "1601--1614",
+abstract = "With the rise of online abuse, the NLP community has begun investigating the use of neural architectures to generate counterspeech that can {``}counter{''} the vicious tone of such abusive speech and dilute/ameliorate their rippling effect over the social network. However, most of the efforts so far have been primarily focused on English. To bridge the gap for low-resource languages such as Bengali and Hindi, we create a benchmark dataset of 5,062 abusive speech/counterspeech pairs, of which 2,460 pairs are in Bengali, and 2,602 pairs are in Hindi. We implement several baseline models considering various interlingual transfer mechanisms with different configurations to generate suitable counterspeech to set up an effective benchmark. We observe that the monolingual setup yields the best performance. Further, using synthetic transfer, language models can generate counterspeech to some extent; specifically, we notice that transferability is better when languages belong to the same language family.",
+}
+```
