Topic Modeling and Text Network Analysis for Indonesian Tweets on Cryptocurrencies

SokKanaTorajd/gemastik21

gemastikUnjani

This repository stores the research results for Gemastik 2021 in the Data Mining competition branch. We studied how cryptocurrency was discussed in Indonesia on Twitter. The goal is to determine which sub-topics appear in the collected tweet data. To do this, we use LDA (Latent Dirichlet Allocation) for topic modeling, then build a text network for each of the resulting sub-topics.
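The two steps described above can be sketched in plain Python. This is a minimal illustration, not the code used in this repository: it assumes a small collapsed Gibbs sampler stands in for LDA, that a text network is a word co-occurrence graph, and that each tweet is assigned to its dominant topic. The function names (`lda_gibbs`, `cooccurrence_edges`), the hyperparameters, and the toy tweets are all hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents (a sketch)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # tokens in topic k
    assignments = []
    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Weight of topic t is proportional to
                # (n_dt + alpha) * (n_tw + beta) / (n_t + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def cooccurrence_edges(docs):
    """Count how often two distinct words appear in the same document."""
    edges = Counter()
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            edges[(a, b)] += 1
    return edges


# Made-up, already tokenized tweets (not taken from the dataset).
tweets = [
    ["bitcoin", "harga", "naik"],
    ["harga", "bitcoin", "turun"],
    ["doge", "elon", "tweet"],
    ["elon", "doge", "moon"],
]
doc_topic, topic_word = lda_gibbs(tweets, num_topics=2)

# Group tweets by dominant topic, then build one network per sub-topic.
groups = {}
for doc, counts in zip(tweets, doc_topic):
    dominant = max(range(len(counts)), key=counts.__getitem__)
    groups.setdefault(dominant, []).append(doc)
networks = {k: cooccurrence_edges(g) for k, g in groups.items()}
```

Each entry in `networks` maps a word pair to the number of tweets in which both words occur, which can be loaded as a weighted edge list into a graph tool. For real data, a library implementation of LDA would normally replace the sampler shown here.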

Dataset

You can download the dataset here: https://www.kaggle.com/wijatama/indonesiancryptotweets The data were collected by web scraping (thanks to Hasan, our Mining Engineer). The tweets used range from January 1, 2021 to May 31, 2021.

Indonesian slang-words

We use the Indonesian slang-word lexicon provided by nasalsabila. You can visit her repo here: https://github.com/nasalsabila/kamus-alay
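As a rough illustration of how such a lexicon can be applied during preprocessing, the sketch below replaces slang tokens with their formal forms through a dictionary lookup. The three mapping entries, the `normalize` function, and the tokenization rule are assumptions made for this example; they are not taken from the kamus-alay files, whose layout is not described here.

```python
import re

# Hypothetical slang-to-formal entries, for illustration only.
SLANG_TO_FORMAL = {"yg": "yang", "gk": "tidak", "udh": "sudah"}


def normalize(text, lexicon):
    """Lowercase, split into alphanumeric tokens, and replace known slang words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [lexicon.get(token, token) for token in tokens]


print(normalize("Bitcoin yg udh naik", SLANG_TO_FORMAL))
# ['bitcoin', 'yang', 'sudah', 'naik']
```

Words that are missing from the lexicon pass through unchanged, so the step is safe to run on every tweet before topic modeling.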

Team Member

Guided by: Mr. Rifqi Ma'arif
