This project builds a multi-headed model that detects six types of toxicity in online comments (a minimal model sketch follows the list):
- Toxic
- Severe Toxic
- Obscene
- Threat
- Insult
- Identity Hate
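
Because these labels can co-occur on the same comment, the task is multi-label classification: one DistilBERT encoder feeding six independent sigmoid heads. Below is a minimal sketch of what such a model might look like in PyTorch with Hugging Face Transformers; the class name, the `distilbert-base-uncased` checkpoint, the single shared linear head, and the snake_case label keys are assumptions, not necessarily the exact choices in the notebook.

```python
import torch
from torch import nn
from transformers import DistilBertModel

# Assumed label keys, mirroring the six categories listed above.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

class ToxicityClassifier(nn.Module):
    """DistilBERT encoder with one logit per toxicity label (multi-label)."""

    def __init__(self, model_name="distilbert-base-uncased"):
        super().__init__()
        self.encoder = DistilBertModel.from_pretrained(model_name)
        # One linear layer emitting six logits acts as six sigmoid heads.
        self.heads = nn.Linear(self.encoder.config.dim, len(LABELS))

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]   # [CLS]-position representation
        return self.heads(cls)              # raw logits; apply sigmoid per label

# Labels are not mutually exclusive, so training uses per-label binary
# cross-entropy (BCEWithLogitsLoss) rather than softmax cross-entropy.
loss_fn = nn.BCEWithLogitsLoss()
```
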
The approach fine-tunes DistilBERT on a dataset of comments from Wikipedia's talk page edits, then serves the trained model through a FastAPI application for easy interaction and testing.
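
A hedged sketch of what the FastAPI service could look like, reusing `ToxicityClassifier` and `LABELS` from the sketch above; the `/predict` route, request schema, and weights file name are illustrative assumptions, not the repo's actual API.

```python
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import DistilBertTokenizerFast

# ToxicityClassifier and LABELS come from the model sketch above.
tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = ToxicityClassifier()
# model.load_state_dict(torch.load("toxicity_weights.pt"))  # hypothetical weights file
model.eval()

app = FastAPI(title="Toxic Comment Classifier")

class CommentIn(BaseModel):
    text: str

@app.post("/predict")
def predict(comment: CommentIn):
    enc = tokenizer(comment.text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(enc["input_ids"], enc["attention_mask"])
    probs = torch.sigmoid(logits).squeeze(0).tolist()
    # Independent probability per head, since labels can co-occur.
    return {label: round(p, 4) for label, p in zip(LABELS, probs)}
```

With the file saved as, say, `app.py`, `uvicorn app:app --reload` starts the server, and a POST to `/predict` with a JSON body like `{"text": "..."}` returns one probability per label.
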
Run the Jupyter notebook to train the model, or contact me to get the pretrained model weights.