StergiosChatzikyriakidis/Greek_dialect_corpus

Greek_dialect_corpus

A collection of raw text from several varieties of Greek. It contains data from the following:

  • Cypriot Greek
  • Cretan Greek
  • Pontic Greek
  • Northern Greek
  • A portion of the Modern Greek Wikipedia

The repository contains data collected from the web and other textual sources (blogs, websites, and theatrical plays, among others). The folder SMG_CG contains Twitter data in Standard Modern Greek and Cypriot Greek, originally collected by Hanna Sababa for her project "A Classifier to Distinguish Between Cypriot Greek and Standard Modern Greek". We thank Mr Sfakianakis for providing his Cretan translations of a number of Ancient Greek tragedies and comedies. The folder all_dialects contains a zip file holding the collected data, minimally pre-processed and annotated for the respective dialect.
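As a rough starting point for exploring the zip file in all_dialects, the sketch below lists how many files sit under each top-level folder of an archive and reads one member as text. It uses only the Python standard library. The function names are hypothetical, and both the internal folder layout and the UTF-8 encoding are assumptions, since neither is documented here.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(zip_path):
    """Count the files in a zip archive, grouped by top-level folder.

    Files stored at the archive root are counted under ".".
    """
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip explicit directory entries
            parts = PurePosixPath(info.filename).parts
            top = parts[0] if len(parts) > 1 else "."
            counts[top] += 1
    return dict(counts)


def read_member(zip_path, member, encoding="utf-8"):
    """Return one archive member as text.

    UTF-8 is an assumption; undecodable bytes are replaced rather
    than raising, so a wrong guess shows up as replacement characters.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return archive.read(member).decode(encoding, errors="replace")
```

Call `summarize_archive` with the path to the archive inside all_dialects to see how the data is organised, then pass one of the listed member names to `read_member` to inspect the raw text.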
