
Can you predict the price of bitcoin using a linear regression model?

5 min read · May 21, 2024
This may very well be my last post on predicting bitcoin prices (for a while anyway) because I have covered several different models that can be used to make these predictions. I decided to make one more post and discuss what is perhaps one of the simplest machine learning models, linear regression.

Linear regression is a supervised machine learning algorithm that learns from labelled data by fitting the linear function that best matches the data points, which can then be used to make predictions on new data. It is transparent, easy to implement, and serves as a foundational concept for more complex algorithms.

The formula for linear regression can be seen below:-

Y = A + BX

Where:-

Y is the dependent variable, plotted along the Y axis

X is the independent variable, plotted along the X axis

A is the intercept

B is the slope
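
As a small worked example of these terms, the sketch below estimates A and B with the closed-form least-squares formulas. The data points are made up for illustration and are not bitcoin prices.

```python
import numpy as np

# Illustrative points that lie exactly on Y = 1 + 2X (made-up values).
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = 1.0 + 2.0 * X

# Closed-form least-squares estimates of the slope B and the intercept A.
B = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
A = Y.mean() - B * X.mean()

# Use the fitted line to predict Y for a new value of X.
y_new = A + B * 5.0
print(A, B, y_new)
```

Because the points sit exactly on a line, the fit recovers the intercept and slope that generated them.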

The system architecture of the model can be seen in the diagram below:-

I wrote the program in Python using Google Colab, a free Jupyter Notebook environment hosted by Google.

Once the notebook was created, I imported the libraries that I would need to run the program:-

  1. Pandas to create dataframes and process data,
  2. Numpy to create numpy arrays and perform numerical computations,
  3. Datetime to timestamp dates,
  4. Math to make mathematical computations,
  5. Sklearn to provide machine learning functionality,
  6. Matplotlib to visualise the data, and
  7. Seaborn to statistically visualise the data.
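
The imports for the libraries listed above might look like the following. The specific sklearn submodules are an assumption based on the steps described later in the post, and the aliases are the conventional ones rather than anything confirmed by the original notebook.

```python
import math
from datetime import datetime

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Assumed sklearn pieces: the model, the scaler and an error metric.
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler
```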

I obtained the bitcoin price data from CoinCodex, and it can be found here:- https://coincodex.com/crypto/bitcoin/historical-data/

I used pandas to read the csv file into the program and convert it to a dataframe, df:-
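
A minimal sketch of this step is shown below. To keep it runnable without the download, a few placeholder rows stand in for the file; the values are made up, and every column other than End and Close is an assumption about the file layout.

```python
import io

import pandas as pd

# Placeholder rows standing in for the downloaded file (made-up values).
sample_csv = io.StringIO(
    "Start,End,Open,High,Low,Close,Volume,Market Cap\n"
    "2024-05-19,2024-05-20,103.0,106.0,102.0,105.0,1000,2000\n"
    "2024-05-18,2024-05-19,101.0,104.0,100.0,103.0,1000,2000\n"
    "2024-05-17,2024-05-18,100.0,102.0,99.0,101.0,1000,2000\n"
)

# With the real download, pass the path of the csv file instead of sample_csv.
df = pd.read_csv(sample_csv)
print(df.head())
```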

I then used pandas to convert the End column to datetime values:-
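
One way to do this, assuming the End column arrives as date strings, is with pd.to_datetime. The small dataframe here is a made-up stand-in for df.

```python
import pandas as pd

# Made-up stand-in for df, with End stored as strings.
df = pd.DataFrame({
    "End": ["2024-05-20", "2024-05-19", "2024-05-18"],
    "Close": [105.0, 103.0, 101.0],
})

# Parse the strings into pandas timestamps.
df["End"] = pd.to_datetime(df["End"])
print(df.dtypes)
```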

I utilised feature selection and selected the columns, End and Close, to create a new dataframe that would be used to plot the data onto a graph:-

I used matplotlib to plot the time series data onto a graph:-
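
The selection and plotting steps could be sketched as follows. The dataframe name btc and the axis labels are hypothetical, and the rows are made up.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for df after the End column has been converted.
df = pd.DataFrame({
    "End": pd.to_datetime(["2024-05-20", "2024-05-19", "2024-05-18"]),
    "Open": [103.0, 101.0, 100.0],
    "Close": [105.0, 103.0, 101.0],
})

# Keep only the date and the closing price.
btc = df[["End", "Close"]].copy()

# Plot the closing price over time.
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(btc["End"], btc["Close"])
ax.set_xlabel("Date")
ax.set_ylabel("Close price")
ax.set_title("Bitcoin closing price")
```

In a notebook the figure renders inline; in a plain script, plt.show() would be needed to display it.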

I then created another dataframe, ml_btc, and set the index to be the End column:-

I then reversed the dataframe so that the earliest date comes first:-
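
A sketch of these two steps, assuming the source rows are ordered newest first, is below. The rows are made up, and reversing with iloc is one option among several (sorting the index would also work).

```python
import pandas as pd

# Made-up rows, newest first.
btc = pd.DataFrame({
    "End": pd.to_datetime(["2024-05-20", "2024-05-19", "2024-05-18"]),
    "Close": [105.0, 103.0, 101.0],
})

# Use the date as the index, then flip the row order so time runs forward.
ml_btc = btc.set_index("End")
ml_btc = ml_btc.iloc[::-1]
print(ml_btc)
```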

I defined the variable, train_len, which sets the length of the training set.

I then defined the length of training and validation sets as being 80% and 20% respectively:-
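
The split could look something like this. Only the name train_len and the 80/20 proportions come from the post; the other names and the data are illustrative.

```python
import numpy as np

# Ten made-up closing prices, oldest first, as a single column.
close = np.arange(100.0, 110.0).reshape(-1, 1)

# The first 80% of the rows form the training set, the rest the validation set.
train_len = int(len(close) * 0.8)
train = close[:train_len]
val = close[train_len:]
print(train_len, len(train), len(val))
```

The rows are split in order rather than shuffled, so the validation set always lies later in time than the training set.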

I used sklearn’s MinMaxScaler to scale the train and validation sets to values between 0 and 1:-
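
A minimal sketch of the scaling step follows. It fits the scaler on the training set only and reuses it for the validation set, which is a common way to avoid leaking validation information but may differ from what the original notebook did. The numbers are made up.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up training and validation prices, one column each.
train = np.array([[100.0], [102.0], [104.0], [108.0]])
val = np.array([[106.0], [110.0]])

# Learn the minimum and maximum from the training set, then apply to both.
scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(train)
val_scaled = scaler.transform(val)
print(train_scaled.ravel(), val_scaled.ravel())
```

With this approach, validation prices above the training maximum scale to values slightly above 1.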

I defined X_train and y_train by converting the training set into a supervised learning configuration:-

I defined X_val and y_val by converting the validation set into a supervised learning configuration:-
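
A common way to build a supervised learning configuration from a time series is a sliding window, where a fixed number of past values predicts the next one. The helper make_supervised and the window size of 3 are hypothetical; the post does not say which window length was used.

```python
import numpy as np


def make_supervised(series, window):
    """Return (X, y): each X row holds `window` consecutive values, y the value after them."""
    X, y = [], []
    for i in range(window, len(series)):
        X.append(series[i - window:i])
        y.append(series[i])
    return np.array(X), np.array(y)


# Made-up scaled series for the training and validation sets.
train_scaled = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5])
val_scaled = np.array([0.6, 0.7, 0.8, 0.9, 1.0])

X_train, y_train = make_supervised(train_scaled, window=3)
X_val, y_val = make_supervised(val_scaled, window=3)
print(X_train.shape, y_train.shape, X_val.shape, y_val.shape)
```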

Once the training and validation sets had been converted to supervised learning configurations, I converted the independent variables to dataframes and the dependent variables to series:-
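
The conversion might be as simple as the sketch below. The column names and the series name are hypothetical labels, and the arrays are made up.

```python
import numpy as np
import pandas as pd

# Made-up windowed inputs and targets.
X_train = np.array([[0.0, 0.1, 0.2], [0.1, 0.2, 0.3]])
y_train = np.array([0.3, 0.4])

# Independent variables become a dataframe, the dependent variable a series.
X_train = pd.DataFrame(X_train, columns=["lag_3", "lag_2", "lag_1"])
y_train = pd.Series(y_train, name="target")
print(X_train)
print(y_train)
```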

I used sklearn’s LinearRegression model and fitted it to the training set.

I then made predictions on the validation set:-
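
Fitting and predicting with sklearn could be sketched like this. The series is made up (it rises by a constant 0.1 per step), and the window size of 3 is an assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up scaled series, turned into windows of 3 past values each.
series = np.linspace(0.0, 0.9, 10)
window = 3
X = np.array([series[i - window:i] for i in range(window, len(series))])
y = series[window:]

# Earlier windows train the model, later windows validate it.
X_train, y_train = X[:5], y[:5]
X_val, y_val = X[5:], y[5:]

model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_val)
print(predictions)
```

Because the made-up series is perfectly linear, the predictions match the validation targets almost exactly; real price data will not behave this neatly.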

I reverse scaled the validation set to take it back to the pre-normalisation values:-

I also reverse scaled the predictions:-
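
Reverse scaling can be done with the scaler's inverse_transform method, as in the sketch below. The scaler is refitted here on made-up prices so the example runs on its own; in practice the scaler fitted earlier would be reused.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up training prices used to fit the scaler (min 100, max 108).
train = np.array([[100.0], [102.0], [104.0], [108.0]])
scaler = MinMaxScaler().fit(train)

# Made-up scaled validation targets and scaled model output.
y_val = np.array([0.75, 1.25])
predictions = np.array([0.70, 1.20])

# inverse_transform expects a 2-D array, so reshape to one column and flatten after.
y_val_actual = scaler.inverse_transform(y_val.reshape(-1, 1)).ravel()
predictions_actual = scaler.inverse_transform(predictions.reshape(-1, 1)).ravel()
print(y_val_actual, predictions_actual)
```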

I then calculated the error of the predictions, and I have to say that I was astonished to find that it was 1609, much lower than the error that I found in any of the deep learning models that I had previously developed:-
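
The post does not state which error metric produced this figure. As an assumption, the sketch below uses root mean squared error (RMSE), which is expressed in the same units as the price. The values are made up and unrelated to the result above.

```python
import math

import numpy as np
from sklearn.metrics import mean_squared_error

# Made-up actual and predicted prices in original units.
y_true = np.array([100.0, 110.0, 120.0, 130.0])
y_pred = np.array([106.0, 102.0, 120.0, 130.0])

# RMSE is the square root of the mean squared error.
mse = mean_squared_error(y_true, y_pred)
rmse = math.sqrt(mse)
print(rmse)
```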

Lastly, I used matplotlib to plot the predictions against the actual values:-
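
A plot of this kind might be produced as follows. The dates, values and labels are made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up validation dates with actual and predicted prices.
dates = pd.date_range("2024-05-01", periods=5, freq="D")
actual = [100.0, 102.0, 101.0, 104.0, 106.0]
predicted = [99.0, 101.5, 102.0, 103.0, 105.0]

# Draw both series on the same axes so they can be compared directly.
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(dates, actual, label="Actual")
ax.plot(dates, predicted, label="Predicted")
ax.set_xlabel("Date")
ax.set_ylabel("Close price")
ax.legend()
```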

I was very surprised to find that a simple linear regression model outperformed neural networks. For that reason, when working on predictive models, it is worth trying the data out on several different models before deciding which one to use. I do have to say that I personally would not have thought to make bitcoin predictions using a linear regression model, but this model turned out to be quite a satisfactory algorithm to use in this instance.

I have created a code review to accompany this blog post and it can be viewed here:- https://youtu.be/TgduQ2EoTa8


Written by Crystal X

I have over five decades of experience in the world of work, being in fast food, the military, business, non-profits, and the healthcare sector.
