An Agentic Fact-Checking Framework for Urdu
with Evidence Boosting and Benchmarking
UrduFactCheck is an open-source fact-checking pipeline for the Urdu language. It is designed to be integrated into OpenFactCheck.
The first step is to clone the repository:
git clone https://github.com/mbzuai-nlp/UrduFactCheck.git
cd UrduFactCheck
Then install the required packages. OpenFactCheck will also be installed as a submodule:
pip install -r requirements.txt
To use UrduFactCheck, you first need to set up the config.json file for OpenFactCheck. You can use this as a template.
UrduFactCheck provides three types of retrievers:
urdufactcheck_retriever: This retriever retrieves the evidence directly in Urdu.
urdufactcheck_translator_retriever: This retriever first translates the query to English, then retrieves the evidence in English, and finally translates the evidence back to Urdu.
urdufactcheck_thresholded_translator_retriever: This retriever first retrieves the evidence in Urdu. If the evidence count is less than the threshold, it boosts the evidence in the same way as urdufactcheck_translator_retriever.
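The thresholded behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the UrduFactCheck implementation: the function thresholded_retrieve, the Retriever type, the stub retrievers, and the default threshold of 3 are all hypothetical, and the sketch assumes that "boosting" means merging the translated evidence into the Urdu evidence.

```python
from typing import Callable, List

# Hypothetical type: a retriever maps a claim to a list of evidence strings.
Retriever = Callable[[str], List[str]]


def thresholded_retrieve(
    claim: str,
    urdu_retriever: Retriever,
    translator_retriever: Retriever,
    threshold: int = 3,  # assumed default, not taken from UrduFactCheck
) -> List[str]:
    """Retrieve Urdu evidence first; boost via translation if too little is found."""
    evidence = list(urdu_retriever(claim))
    if len(evidence) >= threshold:
        return evidence
    # Too little direct evidence: merge in translated evidence, skipping duplicates.
    for item in translator_retriever(claim):
        if item not in evidence:
            evidence.append(item)
    return evidence


# Stub retrievers standing in for real search backends (illustrative only).
def few_urdu_results(claim: str) -> List[str]:
    return ["ثبوت 1"]


def translated_results(claim: str) -> List[str]:
    return ["ثبوت 1", "ثبوت 2", "ثبوت 3"]


boosted = thresholded_retrieve("دعویٰ", few_urdu_results, translated_results)
unboosted = thresholded_retrieve(
    "دعویٰ", few_urdu_results, translated_results, threshold=1
)
print(len(boosted), len(unboosted))
```

The design point is that the translation path, which is likely the more expensive one, only runs when direct Urdu retrieval returns too little evidence.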
These retrievers can be specified in the pipeline section of the config.json file. For example:
{
    "pipeline": [
        "urdufactcheck_claimprocessor",
        "urdufactcheck_thresholded_translator_retriever",
        "urdufactcheck_verifier"
    ]
}
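If you switch retrievers often, the pipeline section can be generated rather than edited by hand. The following is a minimal sketch: build_pipeline_config and RETRIEVERS are hypothetical helpers, not part of UrduFactCheck or OpenFactCheck, and a complete config.json may need further sections that are not shown here.

```python
import json

# Retriever names listed in this README.
RETRIEVERS = {
    "urdufactcheck_retriever",
    "urdufactcheck_translator_retriever",
    "urdufactcheck_thresholded_translator_retriever",
}


def build_pipeline_config(retriever: str) -> dict:
    """Return a config fragment whose pipeline uses the chosen retriever."""
    if retriever not in RETRIEVERS:
        raise ValueError(f"unknown retriever: {retriever}")
    return {
        "pipeline": [
            "urdufactcheck_claimprocessor",
            retriever,
            "urdufactcheck_verifier",
        ]
    }


config_fragment = build_pipeline_config("urdufactcheck_translator_retriever")
# Serialize the fragment; merge it into your own config.json as needed.
print(json.dumps(config_fragment, indent=2))
```

Validating the name up front catches a misspelled retriever before the pipeline is loaded.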
To run the pipeline, you can create a Python script as follows:
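from openfactcheck import OpenFactCheck, OpenFactCheckConfig

# Load the OpenFactCheck configuration that defines the UrduFactCheck pipeline
config = OpenFactCheckConfig(filename_or_path="config.json")

# Evaluate an Urdu response. The text below means: "Quaid-e-Azam Muhammad Ali
# Jinnah was the founder and first Governor-General of Pakistan."
response = OpenFactCheck(config).ResponseEvaluator.evaluate(
    response="قائداعظم محمد علی جناح پاکستان کے بانی اور پہلے گورنر جنرل تھے۔",
)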