Classify Emails
Panagiotis Antoniadis edited this page Jun 23, 2019
One way to build separate language models for different categories of emails is to first classify the emails with a Greek topic classifier. Such a classifier was implemented in a 2018 Google Summer of Code project as part of integrating the Greek language into spaCy. Its output categories are Sports, Greece, Science, World News, Economy, Environment, Politics, Art, and Health.
The classification.py tool classifies fetched emails into these categories using the API of the classifier.
Usage:
    $ python classification.py -h
    usage: classification.py [-h] --input INPUT --output OUTPUT

    Classify emails in predefined categories. More info on the classifier here:
    https://github.com/eellak/nlpbuddy/wiki/Category-prediction

    optional arguments:
      -h, --help       show this help message and exit

    required arguments:
      --input INPUT    Input directory
      --output OUTPUT  Output directory
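The source of classification.py is not shown on this page, so the following is only a minimal sketch of how a tool with this command-line interface could be structured. The `classify_fn` callable is a hypothetical stand-in for the call to the classifier API. Writing one subdirectory per category and the `Unknown` fallback folder are likewise assumptions, not the tool's documented behavior.

```python
import argparse
import shutil
from pathlib import Path

# Categories listed on this page for the Greek topic classifier.
CATEGORIES = ["Sports", "Greece", "Science", "World News", "Economy",
              "Environment", "Politics", "Art", "Health"]


def build_parser():
    """Build a parser that mirrors the help output shown above."""
    parser = argparse.ArgumentParser(
        prog="classification.py",
        description="Classify emails in predefined categories.")
    required = parser.add_argument_group("required arguments")
    required.add_argument("--input", required=True, help="Input directory")
    required.add_argument("--output", required=True, help="Output directory")
    return parser


def classify_directory(input_dir, output_dir, classify_fn):
    """Copy each email file into output_dir/<category>/.

    classify_fn is a hypothetical callable (text -> category name) that
    stands in for the request to the classifier API.
    Returns a dict mapping category name to number of emails.
    """
    counts = {}
    for email_path in sorted(Path(input_dir).iterdir()):
        if not email_path.is_file():
            continue
        text = email_path.read_text(encoding="utf-8", errors="replace")
        category = classify_fn(text)
        if category not in CATEGORIES:
            # Assumed fallback for labels outside the known list.
            category = "Unknown"
        target_dir = Path(output_dir) / category
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy(email_path, target_dir / email_path.name)
        counts[category] = counts.get(category, 0) + 1
    return counts
```

A script entry point would parse the arguments with `build_parser().parse_args()` and pass `args.input`, `args.output`, and a function wrapping the real API request to `classify_directory`. Keeping the classifier behind a plain callable lets the file handling be tested without network access.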
The classification results were not good enough, because the classifier's categories are not representative of the emails people usually send. Clustering methods, described here, will therefore be used instead.