Predicting the Price of Bitcoin Using Sentiment-Enriched Time Series Forecasting
Language of the title:
English
Original abstract:
Open Access Article
by Markus Frohmann 1,2, Manuel Karner 1, Said Khudoyan 1, Robert Wagner 1 and Markus Schedl 1,2,*
1 Multimedia Mining and Search Group, Institute of Computational Perception, Johannes Kepler University Linz (JKU), 4040 Linz, Austria
2 Human-Centered AI Group, AI Laboratory, Linz Institute of Technology (LIT), 4040 Linz, Austria
* Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2023, 7(3), 137; https://doi.org/10.3390/bdcc7030137
Received: 22 May 2023 / Revised: 21 July 2023 / Accepted: 28 July 2023 / Published: 31 July 2023
(This article belongs to the Topic Artificial Intelligence Applications in Financial Technology)
Abstract
Recently, various methods to predict the future price of financial assets have emerged. One promising approach is to combine the historic price with sentiment scores derived via sentiment analysis techniques. In this article, we focus on predicting the future price of Bitcoin, which is currently the most popular cryptocurrency. More precisely, we propose a hybrid approach, combining time series forecasting and sentiment prediction from microblogs, to predict the intraday price of Bitcoin. Moreover, in addition to standard sentiment analysis methods, we are the first to employ a fine-tuned BERT model for this task. We also introduce a novel weighting scheme in which the weight of the sentiment of each tweet depends on the number of its creator's followers. For evaluation, we consider periods with strongly varying ranges of Bitcoin prices. This enables us to assess the models w.r.t. robustness and generalization to varied market conditions. Our experiments demonstrate that BERT-based sentiment analysis and the proposed weighting scheme improve upon previous methods. Specifically, our hybrid models that use linear regression as the underlying forecasting algorithm perform best in terms of the mean absolute error (MAE of 2.67) and root mean squared error (RMSE of 3.28). However, more complicated models, particularly long short-term memory networks and temporal convolutional networks, tend to have generalization and overfitting issues, resulting in considerably higher MAE and RMSE scores.
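To make the two technical ideas in the abstract concrete, the following minimal Python sketch shows one way a follower-dependent sentiment weight and the two reported error metrics could be computed. The abstract does not state the exact weighting formula, so the `log(1 + followers)` weight used here is purely an illustrative assumption and should not be read as the authors' scheme. The function names `weighted_sentiment`, `mae`, and `rmse` are likewise hypothetical.

```python
import math


def weighted_sentiment(tweets):
    """Aggregate per-tweet sentiment scores into one score for a time window.

    `tweets` is a list of (sentiment, followers) pairs.
    ASSUMPTION: each tweet is weighted by log(1 + followers); the abstract
    only says that the weight depends on the follower count.
    """
    if not tweets:
        return 0.0
    weights = [math.log1p(followers) for _, followers in tweets]
    total = sum(weights)
    if total == 0:
        # Every author has zero followers: fall back to an unweighted mean.
        return sum(score for score, _ in tweets) / len(tweets)
    return sum(w * score for w, (score, _) in zip(weights, tweets)) / total


def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))
```

In a hybrid setup of the kind described, an aggregated score such as the one returned by `weighted_sentiment` would be supplied as an additional input feature alongside the historic price, and `mae` and `rmse` would be evaluated on held-out price predictions.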