Abstract: | This thesis examines whether sentiment variables, extracted from Reuters, The Guardian and CNBC and quantified with the NLP sentiment analysis tool VADER (Valence Aware Dictionary and sEntiment Reasoner), can enhance the forecasting performance of models that target the Realized Volatility of the S&P500 index and the VIX index. Using data from 20 March 2018 to 16 July 2020, recursive one-step-ahead forecasting is implemented, with 20 March 2018 to 31 December 2019 as the in-sample period and 2 January 2020 to 16 July 2020 retained for out-of-sample forecasting. A benchmarking approach is applied, in which a Random Walk model serves as the performance floor, while the Heterogeneous Autoregressive (HAR) model, well known in the literature for its robust forecasting capabilities, serves as the performance ceiling. The methodology entails estimating an AR(1) model and then augmenting it, along with the HAR, with the aforementioned sentiment variables. The performance of the resulting models is then assessed with a variety of evaluation functions, and the Diebold-Mariano test is used as a second phase of evaluation. The sentiment variables are found to improve forecasting performance on several occasions. Although all models prove inferior even to the naive Random Walk when forecasting the VIX, a HAR variant augmented with the sentiment variable extracted from Reuters ultimately exhibits the best forecasting performance when targeting Realized Volatility, surpassing even the vanilla HAR.
|