
1. Using Facebook's Prophet to predict Bitcoin values. 2. Using Reddit comments and scores to measure time series sentiment, as a supplementary feature in a multivariate LSTM RNN.


Mikhail-Naumov/Another_Bitcoin_Predictor


Time Series Analysis using LSTM RNNs and ARIMA.

Repository Contents

| FILENAME | DESCRIPTION |
| --- | --- |
| README | Repo Overview & Simple Time Series EDA |
| LSTM README | Explanation & Implementation of LSTM |
| Time Series README | Explanation & Implementation of ARIMA & Prophet |
| LSTM Notebook | Notebook with Process & Model |
| ARIMA & Prophet Notebook | Notebook with Process and Model |

README Contents

What is Time Series Analysis

Time series analysis is the process of interpreting trends from historical data. In this case we will use the historical trends of Bitcoin to predict its future value. Additionally, we will use the public perception of Bitcoin as a supplemental variable to facilitate that prediction. The rationale is that perception likely affects the trend, and perception has its own underlying trend.

  • note: Public perception of Bitcoin will be determined using sentiment analysis in the Bitcoin & Cryptocurrency subreddit of Reddit. It was chosen because it is a very large and active community whose main topic of conversation is Bitcoin.

Data Extraction

After pulling from the API at alphavantage.co and preliminary cleaning, we see the raw Bitcoin opening values.

raw_1

PRAW was used to facilitate Reddit comment aggregation.

Rolling Average

Because of day-to-day variation, rolling averages are used to look at the general trends by day, week, month & year.

raw_2

The weekly or monthly rolling average seems to be the cleanest and will likely be used over the daily values. There are not enough years of collected data for a yearly trend to offer immediate value.
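As a rough illustration of the smoothing described above, the sketch below computes a trailing rolling average over a plain list of daily values. The function name `rolling_mean` and the sample numbers are hypothetical and are not taken from the notebooks; a window of 7 would correspond to the weekly average and roughly 30 to the monthly one.

```python
def rolling_mean(values, window):
    """Trailing rolling average; positions without a full window are None."""
    if window < 1:
        raise ValueError("window must be at least 1")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just slid out of the window.
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


# Made-up daily values, only to show the shape of the result.
daily = [1, 2, 3, 4, 5]
print(rolling_mean(daily, window=3))
```

With pandas, `Series.rolling(window).mean()` produces the same kind of result, with NaN in place of None for the first incomplete windows.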

Time Adjusted

Because of the long dragging tail, we move the start of the observation window from the first day Bitcoin was launched to Oct 25, 2017, which gives a far cleaner picture of its patterns.

screen shot 2018-07-18 at 8 17 08 pm

after

This window was chosen for two reasons:

  • It avoids giving weight to the period during which Bitcoin was worthless. This changes the data, but I felt it was justified because the new window shows the period during which Bitcoin truly existed.

  • Bitcoin's very low early value was likely due in large part to its being unknown to the majority of the public. After that arbitrary point, it became better known and thus more active.

Again, let us look at the rolling trends:

screen shot 2018-07-18 at 8 19 42 pm

Here we can see a Goldilocks-like selection of rolling averages:

  • Daily : Lots of observations, not very smooth
  • Monthly : Very smooth, not many observations
  • Weekly : Good balance of smooth & number of observations

A weekly average therefore makes the most sense.

Autocorrelation

All time series have some level of autocorrelation. That is to say: today's value is dependent on yesterday's value. So we will look at how heavily these data correlate with themselves.

ac_1

ac_2

Each day is autocorrelated with up to 2 previous time steps.
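To make the idea of a series correlating with itself concrete, here is a minimal sketch of the sample autocorrelation at a given lag. The function name and the toy series are hypothetical; the plots above were presumably produced with a plotting library rather than code like this.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of a series with itself shifted by `lag` steps."""
    n = len(values)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(values) - 1")
    mean = sum(values) / n
    # Total variation around the mean (the lag-0 term).
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Co-variation between each value and the value `lag` steps later.
    numer = sum((values[i] - mean) * (values[i + lag] - mean)
                for i in range(n - lag))
    return numer / denom


# Toy series that flips sign every step: strongly negative at lag 1.
toy = [1, -1, 1, -1, 1, -1]
for k in range(3):
    print(k, round(autocorrelation(toy, k), 3))
```

Values near 1 or -1 at a lag mean that the value that many steps back carries information about today; values near 0 mean it does not.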

Seasonal Decomposition

season

Here we see:

  • A clearer trend over time, peaking in late December
  • Evidence that Bitcoin does not seem to be heavily affected by seasonality, only +/- 100
  • Residuals seem to make up the majority of the variability (>2500), which makes sense considering that Bitcoin's growth may be driven by strong social factors rather than seasonal or global ones
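For readers unfamiliar with the three panels, the sketch below shows a classical additive decomposition (value = trend + seasonal + residual) in plain Python. It is a simplified illustration under stated assumptions: the function name, the odd-period restriction, and the synthetic series are all hypothetical, and it is not the routine used to produce the plot above.

```python
def decompose_additive(values, period):
    """Split a series into trend, seasonal and residual parts (odd period only)."""
    n = len(values)
    if period % 2 == 0:
        raise ValueError("this sketch only handles odd periods")
    if n < 2 * period:
        raise ValueError("need at least two full periods of data")
    half = period // 2
    # Trend: centered moving average, undefined (None) near the edges.
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(values[i - half:i + half + 1]) / period
    # Seasonal: average detrended value at each position in the cycle.
    buckets = [[] for _ in range(period)]
    for i in range(n):
        if trend[i] is not None:
            buckets[i % period].append(values[i] - trend[i])
    phase_means = [sum(b) / len(b) for b in buckets]
    offset = sum(phase_means) / period  # center the seasonal part on zero
    seasonal = [phase_means[i % period] - offset for i in range(n)]
    # Residual: whatever trend and seasonality do not explain.
    residual = [None if trend[i] is None else values[i] - trend[i] - seasonal[i]
                for i in range(n)]
    return trend, seasonal, residual


# Synthetic series: a linear trend plus a repeating 7-day pattern.
pattern = [3, -1, 0, -2, 1, -3, 2]
series = [10 + 0.5 * i + pattern[i % 7] for i in range(35)]
trend, seasonal, residual = decompose_additive(series, period=7)
```

On the synthetic series the residual is essentially zero because the data contain nothing but trend and pattern. In the Bitcoin plot the situation is reversed: the seasonal band is narrow and the residual is large.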

Model Results

| FILENAME | DESCRIPTION |
| --- | --- |
| LSTM README | Explanation & Implementation of LSTM |
| Time Series README | Explanation & Implementation of ARIMA & Prophet |
| LSTM Notebook | Notebook with Process & Model |
| ARIMA & Prophet Notebook | Notebook with Process and Model |

The training data consisted of all the data between the starting point (Oct 25, 2017) and March 2018.

The testing/predictive data extended past March and into April 25, 2018.
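A chronological split like the one described above can be sketched as follows. The exact cutoff day is an assumption: the text only says the training data runs to March 2018, so this sketch treats April 1, 2018 as the first test day. The function name and the placeholder rows are hypothetical.

```python
from datetime import date, timedelta

START = date(2017, 10, 25)   # start of the observation window (from the text)
CUTOFF = date(2018, 4, 1)    # ASSUMED first test day; the text only says "March"
END = date(2018, 4, 25)      # last test day (from the text)


def split_by_date(rows, start=START, cutoff=CUTOFF, end=END):
    """Split (date, value) rows chronologically, never shuffling them."""
    train = [(d, v) for d, v in rows if start <= d < cutoff]
    test = [(d, v) for d, v in rows if cutoff <= d <= end]
    return train, test


# Placeholder rows: one per day with a dummy value, not real prices.
days = [date(2017, 10, 1) + timedelta(days=i) for i in range(240)]
rows = [(d, 0.0) for d in days]
train, test = split_by_date(rows)
print(len(train), len(test))
```

Splitting by date rather than at random keeps every test observation later than every training observation, which is what a forecasting setup needs.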

split

LSTM RNN

To maintain uniformity between the univariate and multivariate LSTMs, the underlying structure is the same:

  • LSTM layer with 128 memory units
  • Dense layer with 1 neuron
    • Batch : 16
    • Optimizer : adam
    • Activation : relu
    • Loss : mean absolute error
    • Epochs : 50
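The structure listed above might look roughly like this in Keras. This is a sketch, not the notebook's code: the list does not say where the relu activation is applied (here it is assumed to be on the LSTM layer), and `build_model`, `n_timesteps`, and `n_features` are hypothetical names. TensorFlow is imported inside the function so the settings can be read without it installed.

```python
# Hyperparameters as listed above.
CONFIG = {
    "lstm_units": 128,
    "dense_units": 1,
    "batch_size": 16,
    "optimizer": "adam",
    "activation": "relu",
    "loss": "mean_absolute_error",
    "epochs": 50,
}


def build_model(n_timesteps, n_features, config=CONFIG):
    """Build the shared LSTM architecture (requires TensorFlow)."""
    from tensorflow import keras  # lazy import: only needed when building

    model = keras.Sequential([
        keras.Input(shape=(n_timesteps, n_features)),
        # ASSUMPTION: relu is applied inside the LSTM layer.
        keras.layers.LSTM(config["lstm_units"], activation=config["activation"]),
        keras.layers.Dense(config["dense_units"]),
    ])
    model.compile(optimizer=config["optimizer"], loss=config["loss"])
    return model


# Hypothetical usage, with X_train shaped (samples, n_timesteps, n_features):
# model = build_model(n_timesteps=7, n_features=1)
# model.fit(X_train, y_train,
#           epochs=CONFIG["epochs"], batch_size=CONFIG["batch_size"])
```

Only `n_features` would change between the univariate case (1 feature) and the multivariate case (one per input column), which keeps the two models comparable.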

Univariate on testing data

  • Predicting daily opening bell variations, using trends in daily variations

Multivariate on the testing data

  • Predicting daily opening bell variations, using trends in:
    • Reddit Sentiment (both Positive & Negative, Scaled & Unscaled)
    • Daily opening, volume, cap ... variations
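Both variants need the series framed as supervised samples: a window of preceding days as input and the next day's value as the target. The sketch below shows one common way to do that. The function name `make_windows`, the window length, and the toy numbers are hypothetical, and the notebooks may frame the data differently.

```python
def make_windows(rows, n_steps, target_index=0):
    """Turn a daily series into (window of past days, next-day target) pairs.

    rows: one list of feature values per day, in chronological order.
    Returns X with shape (samples, n_steps, n_features) and y with one
    target per sample, both as nested Python lists.
    """
    X, y = [], []
    for i in range(len(rows) - n_steps):
        X.append([list(r) for r in rows[i:i + n_steps]])
        y.append(rows[i + n_steps][target_index])
    return X, y


# Univariate: a single feature per day (toy numbers, not real prices).
uni_rows = [[1], [2], [3], [4], [5]]
X_uni, y_uni = make_windows(uni_rows, n_steps=2)

# Multivariate: the target in column 0 plus one extra feature per day,
# for example a sentiment value (again toy numbers).
multi_rows = [[1, 0.1], [2, 0.2], [3, 0.3], [4, 0.4]]
X_multi, y_multi = make_windows(multi_rows, n_steps=2)
```

The only difference between the two cases is the number of values per day, so the same windowing code serves both models.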

Univariate Predictions:

screen shot 2018-07-24 at 3 13 53 pm

Multivariate Predictions:

multi_pred

Difference between Univariate & Multivariate:

compare_pred mae

Prophet

screen shot 2018-07-18 at 9 05 44 pm

part_prophet_1

part_prophet_2

part_prophet_decomp

Future Directions

  • Reddit comments as another predictor.
  • Non opening values as predictors
  • ARIMA Modeling

Project Concepts

Data munging, time series, LSTM neural networks, ARIMA, Prophet, cryptocurrency, magic
