Thomas Bathgate MSc Finance
This dissertation explores the creation of a sentiment proxy from Twitter posts about the S&P500 over the period January 2020 to January 2021, to address a research gap between traditional sentiment proxy research and social media sentiment research.
Investor sentiment is elusive.

Wang et al. (2022)

Sentiment research has developed from the early discussions of the human aspects of mispricing, begun at the close of the millennium by a new breed of economists with psychological backgrounds. This new field of sentiment research often involves large-scale data collection and quantitative studies, frequently using direct surveys of opinion such as consumer confidence surveys. Sentiment measures can therefore be split into these ‘direct’ measures and ‘indirect’ measures, the latter proxied by technical and market variables.

Such sentiment research came to the fore in the 2000s, with key papers exploring whether direct surveys of consumer sentiment (referred to as ‘traditional sentiment research’ throughout) can be linked with returns in the stock market; effectively exploring sentiment as a new mover of markets. These papers often find that sentiment has statistically significant predictive power over returns in different contexts; however, most agree on one common point: that sentiment is difficult to quantify, as it has a number of potential sources and can contain noise from a number of different factors. This often leads to the creation of sentiment indices, where researchers do not rely on a single data collection method as discussed above, instead using a number of proxies in one study and extracting a common sentiment component from them.[1]

A new strand of sentiment research emerged towards the late 2010s, with the availability of big data and powerful computer processing allowing for the use of a novel data source for capturing direct sentiment: social media (hereafter referred to as ‘social media sentiment research’), especially discussion forums such as Yahoo Finance forums, and eventually mainstream platforms such as Twitter. This research tends to focus more on the predictive power of social media sentiment and does not tend to explore the noise concept common to the traditional literature[2], leading to the question of whether social media sentiment is a novel proxy.

In the work that follows, an attempt is made to bridge these research areas, combining the best practices of social media data collection and the theme of predictive sentiment with the constraint, drawn from traditional sentiment research, that sentiment is a noisy proxy. The selected research questions are as follows:

  • Is Twitter sentiment a new proxy, or does it simply encapsulate other indirect proxies for sentiment?
  • In the case of the S&P500 in the period 2020-2021, does Twitter Sentiment predict stock returns in the context of a Baker and Wurgler-style[3] portfolio approach?

Conclusion

It is clear from this study that the potential scale of Twitter sentiment research far exceeds the boundaries of a single dissertation or research paper. From the pioneering 2004 research of Antweiler and Frank[4], studies of internet sentiment have grown into highly technical projects requiring intensive computing power, machine learning algorithms, and datasets that can run to over a million social media messages. Along the way a few guiding principles have been followed in terms of aims and methodologies, maturing to the point of open-source lexicons and tailor-made social media language analysis algorithms. The literature review follows this maturation and ultimately shows that the social media sentiment literature now seems well-positioned, in late 2022, to begin to explore the various sources of data with a common framework.

The implications of the above findings are wide ranging, and return to the suggestion of Renault[5] that every study into the effect of sentiment on stock returns is inherently unique. Whilst this is true, there are still several methodological points necessary to create a level of comparability that will be vital as social media sentiment research continues to grow.

Primarily, this dissertation has shown that it is possible to conduct a study guided by a combination of best practices from the novel social media analysis and traditional sentiment research. The work of Renault[6] is used as a starting point for the collection of social media messages for sentiment analysis. Notably, this method is diverged from where appropriate, both to match the needs of this study and where impactful novel methods can be developed. This was the case when the VADER algorithm was extended with Renault’s finance lexicon to create a novel finance social media sentiment analysis algorithm.
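The idea of extending a general-purpose lexicon with a finance lexicon can be sketched as follows. This is a minimal illustration only: the lexicon entries and scores are invented placeholders (not VADER's or Renault's actual values), the function names are hypothetical, and VADER's rule-based heuristics (negation, intensifiers, punctuation and so on) are omitted. Only the final normalisation, x / sqrt(x^2 + 15), mirrors the form VADER uses for its compound score.

```python
import math
import re

# Toy general-purpose lexicon (hypothetical scores, not VADER's real entries).
GENERAL_LEXICON = {"good": 1.9, "bad": -2.5, "great": 3.1, "crash": -1.5}

# Toy finance lexicon (hypothetical scores, not Renault's real entries).
FINANCE_LEXICON = {"bullish": 2.5, "bearish": -2.5, "short": -1.0, "crash": -3.0}


def extend_lexicon(base, domain):
    """Return a new lexicon: domain entries are added to the base and
    override it where both define the same token."""
    merged = dict(base)
    merged.update(domain)
    return merged


def score(text, lexicon, alpha=15.0):
    """Sum token valences and squash the total into (-1, 1)."""
    tokens = re.findall(r"[a-z$']+", text.lower())
    total = sum(lexicon.get(tok, 0.0) for tok in tokens)
    return total / math.sqrt(total * total + alpha)


finance_lexicon = extend_lexicon(GENERAL_LEXICON, FINANCE_LEXICON)
tweet = "Feeling bearish on $SPX, crash incoming"

general_score = score(tweet, GENERAL_LEXICON)   # only "crash" is recognised
finance_score = score(tweet, finance_lexicon)   # "bearish" and the stronger "crash" count
print(general_score, finance_score)
```

In this toy case the extended lexicon reads the Tweet as more strongly negative, because the finance vocabulary is no longer ignored. With the real VADER implementation, the analogous step would presumably be to update the analyser's lexicon dictionary with the finance entries, although the exact procedure used in this study is not reproduced here.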

With this developed, the dissertation turns to the insight of some of the more traditional research papers to attempt to strip some of the macro/indirect sentiment noise from this proxy, following the methods of Brown and Cliff and Baker and Wurgler.[7] The importance of attempting to remove the noise from the Twitter sentiment proxy cannot be overstated, and this is done at two separate stages.

Firstly, the idiosyncratic firm risk information is stripped away using PCA before the creation of the portfolios. This is then continued in section 4, where the Twitter sentiment proxy is correlated with a number of traditional indirect sentiment proxies and is shown to be significantly correlated with two of them: the closed-end fund discount (CEFD) and the flow of funds to/from mutual funds and ETFs. PCA is again used to strip this information from the sentiment proxy.

Next, what is deemed a suitably clean sentiment proxy is used as a regressor for the returns of a number of portfolios, created on the basis of S&P500 firm metric deciles. In terms of comparison with the work of Baker and Wurgler[8], they find higher levels of predictive power with an index of traditional sentiment measures than this study finds. This study is nonetheless successful in finding evidence of the predictive power of Twitter sentiment, with a portfolio of (effectively) dividend payers vs non-payers having returns predictable using Twitter sentiment when controlling for the Fama-French idiosyncratic risk factors. Notably, although using the portfolio-based approach of Baker and Wurgler (2006), the findings instead align with Brown and Cliff[9], where sentiment has a negative regression coefficient.
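The form of such a predictive regression can be sketched as below: a portfolio return in month t is regressed on sentiment from month t-1 plus a control. This is an illustrative sketch only. All numbers are synthetic and constructed so that the true coefficients are known, a single hypothetical market factor stands in for the full set of Fama-French controls, and the small `ols` helper is written here for self-containment rather than taken from the study.

```python
def ols(y, X):
    """Ordinary least squares via the normal equations (X'X) b = X'y.
    X is a list of rows; an intercept column is added automatically."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for i in range(k):
            if i != c:
                f = A[i][c]
                A[i] = [vi - f * vc for vi, vc in zip(A[i], A[c])]
    return [A[i][k] for i in range(k)]


# Synthetic monthly series (placeholders, not the study's data).
sentiment = [0.2, -0.1, 0.4, 0.0, -0.3, 0.1, 0.3]
market = [1.0, -2.0, 0.5, 1.5, -1.0, 0.3, 0.8]

lagged_sentiment = sentiment[:-1]  # sentiment in month t-1
market_now = market[1:]            # control factor in month t

# Returns built with a known negative sentiment coefficient of -0.8.
returns = [0.5 - 0.8 * s + 1.2 * m for s, m in zip(lagged_sentiment, market_now)]

X = list(zip(lagged_sentiment, market_now))
intercept, beta_sentiment, beta_market = ols(returns, X)
print(intercept, beta_sentiment, beta_market)
```

A negative `beta_sentiment` corresponds to the Brown and Cliff-style result described above: higher pre-period sentiment is followed by lower returns once the control is accounted for.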

This is then explored in two contexts. The first is that of Brown and Cliff[10], who propose that high pre-period sentiment leads to a positive mispricing through excessive buying, followed by a sell-off in the observed period leading to negative returns. The second is the background of Twitter sentiment and the institutional attitude towards social media. Until very recently, in what is termed the ‘post-GameStop’ era, social media information and retail traders were shown to be viewed in a negative, mistrusting light by institutions, which termed such traders and information ‘dumb money’. Therefore, for the period observed, it is likely that if social media sentiment was taken as a source of information by institutional traders, it would be treated as uninformed opinion to trade against in order to take advantage of mispricing, rather than as a legitimate source of prevailing investor sentiment to trade with. Further, institutions may also recognise the findings of Brown and Cliff[11], and seek to trade the end-of-period mispricing in the opposite direction to profit, thereby strengthening the negative relationship.

The discovered research gap of combining the traditional sentiment research with social media sentiment is then further explored in a novel way, with time-series regressions used to explore how Twitter sentiment spreads over time. To maximise the availability of information, these time-series regressions investigate returns across portfolios. Firstly, the cross-portfolio effect of pre-period sentiment is investigated. Of primary interest is the choice of regression, with a fixed effects estimator, clustering errors by portfolio, being selected as optimal considering the outputs of a number of heteroskedasticity and regression suitability tests. The output of this first regression ultimately does not show a significant predictive power of pre-period sentiment; however, the large error term when clustering errors by portfolio prompts further research.
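The fixed effects estimator mentioned above can be illustrated with the standard within transformation: returns and sentiment are demeaned inside each portfolio, which removes any portfolio-specific constant, and a pooled slope is then estimated. The sketch below is a simplified illustration under stated assumptions: the data are synthetic, the portfolio names are hypothetical, there is a single regressor, and the clustered standard errors used in the study are not computed.

```python
from collections import defaultdict


def within_slope(panel):
    """Fixed-effects (within) slope for rows of (portfolio, x, y):
    demean x and y inside each portfolio, then pool the demeaned data."""
    groups = defaultdict(list)
    for portfolio, x, y in panel:
        groups[portfolio].append((x, y))
    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in obs)
        sxx += sum((x - mean_x) ** 2 for x, _ in obs)
    return sxy / sxx


# Synthetic pre-period sentiment shared by all portfolios (placeholder values).
sentiment_lag = [0.2, -0.1, 0.4, 0.0, -0.3]

# Two hypothetical portfolios with different average returns (fixed effects)
# but the same sentiment sensitivity of -0.5.
panel = (
    [("size", s, 2.0 - 0.5 * s) for s in sentiment_lag]
    + [("dividend", s, -1.0 - 0.5 * s) for s in sentiment_lag]
)

slope = within_slope(panel)
print(slope)
```

The portfolio-level intercepts (2.0 and -1.0 here) drop out in the demeaning step, so the estimated slope reflects only the within-portfolio relationship between sentiment and returns.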

These findings culminate in a large time-series regression into the long-term returns predictability of sentiment, with lags of 1 to 11 months. This follows the suggestion in the literature[12] that some sentiment measures can have predictive power over US market returns in the medium to long term. The literature is also followed in adding a control for pre-period sentiment to the regression, to attempt to separate the effects of lagged sentiment. While this particular regression becomes limited in the long term, as the number of data points drops towards the 11-month lag, significant results are found. A key finding is that Twitter sentiment does have a predictive effect on returns beyond the one-month lag. Notably, this effect is only seen to manifest from a 6-month lag onwards, even ignoring some of the longer-term results where multicollinearity begins to potentially detract from the findings.
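The loss of data points at longer lags follows directly from how lagged observations are paired. The short sketch below makes this concrete, assuming (as an illustration only) one observation per month from January 2020 to January 2021, i.e. 13 months; the series values are placeholders and the function name is hypothetical.

```python
def lagged_pairs(sentiment, returns, lag):
    """Pair the return in month t with sentiment from month t - lag."""
    return [(sentiment[t - lag], returns[t]) for t in range(lag, len(returns))]


months = 13  # assumption: monthly data, January 2020 to January 2021 inclusive

# Placeholder series; only their length matters for this illustration.
sentiment = [0.1 * ((-1) ** m) for m in range(months)]
returns = [0.01 * m for m in range(months)]

usable = {lag: len(lagged_pairs(sentiment, returns, lag)) for lag in range(1, 12)}
for lag, n in usable.items():
    print(f"lag {lag:2d} months -> {n:2d} usable observations per portfolio")
```

Under this assumption each extra month of lag removes one observation per portfolio, which is why pooling across portfolios helps and why the longest-lag estimates rest on very little data.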

Further, the detected predictive effect does not align with other sentiment research[13]: it does not show a decline in the magnitude of the effect of sentiment on returns over time, instead remaining consistent up to a 12 month period. To further this dissertation, Tweet collection should become a focus to allow for comparisons of sentiment up to the 36 month period of Wang et al.[14], to investigate whether the trend of consistent effect continues, or whether Twitter sentiment falls in line with the literature findings. Also notable is the direction of impact that Twitter sentiment is shown to have, with the monthly impact switching between a positive and negative effect on returns. This is explored in the context of the spread of information, and the bandwagon effects of interactions on social media.

Modelling and discussion of the spread of information becomes another (at the time of writing) underdiscussed avenue for future research when observing the Tweet dataset.

When creating a dataset of Tweets from the API, it is possible to specify whether to include retweets or not. As discussed, retweets are not included in this study, as they simply copy Tweets to the dataset and do not necessarily imply an opinion about the content of the original Tweet. However, it can still be seen that some Tweets in the dataset have partly or entirely the same content, posted at different times. What this shows is that some Tweets are directly reposted, rather than retweeted. The impact of reposts is taken into account, as they form part of the dataset of Tweets; however, in a further study this could be combined with the effect of retweets, with a weighting being applied to original Tweets, retweets, and reposts to capture how information spreads as part of Twitter sentiment. This could then fill a gap in the research, since few articles at the time of writing even note the existence of retweets, by discussing them in the context of trends associated with social media.
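One way such a weighting scheme might look is sketched below. It is a hypothetical illustration of the proposed future work, not a method used in this study: the weights, field names and example Tweets are invented, and reposts are detected only by exact match after simple normalisation, so partly identical Tweets would need a fuzzier comparison.

```python
import re

# Hypothetical weights; how to weight each type is an open research choice.
WEIGHTS = {"original": 1.0, "repost": 0.5, "retweet": 0.25}


def normalise(text):
    """Lower-case, drop URLs and collapse whitespace so identical posts match."""
    text = re.sub(r"https?://\S+", "", text.lower())
    return " ".join(text.split())


def label_tweets(tweets):
    """Label each tweet in time order. The earliest non-retweet with a given
    normalised text is 'original'; later non-retweet copies are 'repost'."""
    seen = set()
    labelled = []
    for tw in sorted(tweets, key=lambda t: t["time"]):
        if tw["is_retweet"]:
            label = "retweet"
        else:
            key = normalise(tw["text"])
            label = "repost" if key in seen else "original"
            seen.add(key)
        labelled.append((tw, label))
    return labelled


def weighted_sentiment(labelled):
    """Weighted mean of per-tweet sentiment scores."""
    num = sum(WEIGHTS[label] * tw["score"] for tw, label in labelled)
    den = sum(WEIGHTS[label] for _, label in labelled)
    return num / den


# Invented example data; 'score' stands for a per-tweet sentiment score.
tweets = [
    {"time": 1, "text": "S&P500 looking strong today https://example.com/chart",
     "is_retweet": False, "score": 0.6},
    {"time": 5, "text": "S&P500 looking strong today",
     "is_retweet": False, "score": 0.6},
    {"time": 3, "text": "RT @user: S&P500 looking strong today",
     "is_retweet": True, "score": 0.6},
    {"time": 4, "text": "Selling everything, S&P500 is overvalued",
     "is_retweet": False, "score": -0.4},
]

labelled = label_tweets(tweets)
labels = [label for _, label in labelled]
print(labels, weighted_sentiment(labelled))
```

Compared with a simple average, down-weighting reposts and retweets reduces the influence of a single opinion that is repeated many times, while still letting its spread contribute to the aggregate.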

Overall, sentiment research has come far; however, with the tools now available to researchers, it could be said that the social media sentiment field is only just reaching the point where large-scale studies producing results that are comparable across studies are a realistic option. Certainly, it appears that the future of social media sentiment research is rich.


References

[1] Brown, G. W. and Cliff, M. T., 2004. 'Investor sentiment and the near-term stock market', The Journal of Empirical Finance, Volume 11, pp. 1-27.
Baker, M. and Wurgler, J., 2006. 'Investor sentiment and the cross-section of stock returns', The Journal of Finance, 61(4), pp. 1645-1680.

[2] Renault, T., 2017. 'Intraday online investor sentiment and return patterns in the U.S. stock market', Journal of Banking and Finance, 84(1), pp. 25-40.

[3] Baker and Wurgler, 2006.

[4] Antweiler, W. and Frank, M. Z., 2004. 'Is all that talk just noise? The information content of internet stock message boards', The Journal of Finance, 59(3), pp. 1259-1294.

[5] Renault, T., 2020. 'Sentiment analysis and machine learning in finance: a comparison of methods and models on one million messages', Digital Finance, Volume 2, pp. 1-13.

[6] Renault, 2020.

[7] Brown and Cliff, 2004; Baker and Wurgler, 2006.

[8] Baker and Wurgler, 2006.

[9] Brown, G. W. and Cliff, M. T., 2005. 'Investor sentiment and asset valuation', The Journal of Business, 78(2), pp. 405-440.

[10] Brown and Cliff, 2005.

[11] Brown and Cliff, 2005.

[12] Wang, W., Su, C. and Duxbury, D., 2021. 'Investor sentiment and stock returns: Global evidence', The Journal of Empirical Finance, Volume 63, pp. 365-391.

[13] Wang et al., 2021.

[14] Wang et al., 2021.

16 November 2022