Multilingual-Claim-Span-Identification

Social media posts make many claims, and these often contain misinformation or fake news. Identifying claims is therefore a crucial first step towards claim verification. Given the huge number of social media posts, this identification needs to be automated.

This competition deals with the task of 'Claim Span Identification': given a text, identify the parts (spans) of it that correspond to claims. This task is more challenging than the traditional binary classification of a text as claim or not-claim, and requires state-of-the-art methods in Pattern Recognition, Natural Language Processing and Machine Learning. See the Evaluation tab for details.

For this task, we will use a newly developed dataset containing about 8K posts in English and about 8K posts in Hindi with claim-spans marked by human annotators.
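As an illustration of how annotated claim spans can be turned into per-token labels and back, here is a minimal sketch. It assumes spans are given as (start, end) token indices with an exclusive end, and uses a BIO tagging scheme with a hypothetical `CLAIM` label; the dataset's actual annotation format and label names are not stated here and may differ.

```python
from typing import List, Optional, Tuple

# Assumed convention: (start, end) token indices, end exclusive.
Span = Tuple[int, int]


def spans_to_bio(num_tokens: int, spans: List[Span]) -> List[str]:
    """Encode claim spans as one BIO label per token."""
    labels = ["O"] * num_tokens
    for start, end in spans:
        if not 0 <= start < end <= num_tokens:
            raise ValueError(f"span {(start, end)} is out of range")
        labels[start] = "B-CLAIM"          # first token of the span
        for i in range(start + 1, end):
            labels[i] = "I-CLAIM"          # remaining tokens of the span
    return labels


def bio_to_spans(labels: List[str]) -> List[Span]:
    """Decode BIO labels back into (start, end) spans."""
    spans: List[Span] = []
    start: Optional[int] = None
    for i, label in enumerate(labels):
        if label == "B-CLAIM":
            if start is not None:          # close a span that ends here
                spans.append((start, i))
            start = i
        elif label == "I-CLAIM":
            if start is None:              # tolerate I- without a leading B-
                start = i
        else:
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:                  # span running to the last token
        spans.append((start, len(labels)))
    return spans


# Made-up example post; the claim covers tokens 3..9.
tokens = "I read that the new bridge cost 2 billion rupees , unbelievable !".split()
labels = spans_to_bio(len(tokens), [(3, 10)])
recovered = bio_to_spans(labels)
claim_text = " ".join(tokens[recovered[0][0]:recovered[0][1]])
```

Framing the data this way is what makes span identification a token classification problem rather than a single claim/not-claim decision per post.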

• Dataset Preparation: Utilized a dataset with 8K English and 8K Hindi posts annotated for claim spans. Prepared the dataset for a token classification task to be used by the model.
• Model Development: Fine-tuned BERT and Multilingual BERT models specifically for claim-span identification. Implemented advanced architectures such as BERT with a CNN head for enhanced performance.
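The idea of a CNN head on top of a BERT encoder can be sketched as follows. This is one plausible shape under stated assumptions, not the notebook's actual architecture: the class name `CnnTokenHead`, the filter count, the kernel size, and the three-label output (O, B-CLAIM, I-CLAIM) are all hypothetical choices. The head takes per-token hidden states of size 768, which is the hidden size of BERT-base and Multilingual BERT-base.

```python
import torch
from torch import nn


class CnnTokenHead(nn.Module):
    """Hypothetical token-classification head: Conv1d over the sequence, then a linear layer."""

    def __init__(self, hidden_size: int = 768, num_filters: int = 128,
                 kernel_size: int = 3, num_labels: int = 3, dropout: float = 0.1):
        super().__init__()
        # An odd kernel with padding = kernel_size // 2 keeps the sequence length unchanged.
        self.conv = nn.Conv1d(hidden_size, num_filters, kernel_size,
                              padding=kernel_size // 2)
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(num_filters, num_labels)

    def forward(self, hidden_states: torch.Tensor,
                attention_mask: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size); attention_mask: (batch, seq_len)
        # Zero out padding positions so they do not leak into neighbouring tokens.
        x = hidden_states * attention_mask.unsqueeze(-1).to(hidden_states.dtype)
        x = x.transpose(1, 2)              # Conv1d expects (batch, channels, seq_len)
        x = torch.relu(self.conv(x))
        x = x.transpose(1, 2)              # back to (batch, seq_len, num_filters)
        return self.classifier(self.dropout(x))  # (batch, seq_len, num_labels)


# Stand-in for encoder output: random hidden states for 2 sequences of 16 tokens.
head = CnnTokenHead()
hidden = torch.randn(2, 16, 768)
mask = torch.ones(2, 16, dtype=torch.long)
logits = head(hidden, mask)
```

In a full model, `hidden` would be the encoder's last hidden state (for example from Hugging Face `AutoModel` loaded with `bert-base-multilingual-cased`), and the logits would be trained with a token-level cross-entropy loss. The convolution lets each token's prediction depend on a small window of neighbouring tokens, which may help with locating span boundaries.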

Files: README.md, claim-span-identification-in-social-media-posts.ipynb

shu-shobhit/Multilingual-Claim-Span-Identification
