1 Introduction

Acuity Trading produces a variety of news-based sentiment indicators, conveying different emotions, for the assets of many markets, with the collaboration of a research team led by the author from the Polytechnical University of Catalonia.

This alternative data can be used in many ways in the financial business, and the purpose of this note is to give practitioners in the industry, and consumers of this sentiment data, some ideas on how to use these sentiment indicators in their investment decisions. We focus on ideas for the construction of algorithmic trading rules, portfolio selection, and sentiment factor models, which are useful in forecasting, estimation of the covariance of asset returns, and industry classification of assets. Hence, this is a survey paper of methods for exploiting news-based sentiment information on market assets, intended for hedge fund managers, traders and practitioners in the financial industry in general.

1.1 Sentiment Analysis in Finance

Several existing studies in behavioural finance have provided evidence that investors do react to news. Usually, they show a greater propensity for making an investment move based on bad news rather than on good news (e.g. as a general trait of human psychology [4, 15], or due to specific investors' trading attitudes [6]). Li [10] and Davis, Piger, and Sedor [5] analyse the tone of qualitative information using term-specific word counts from corporate annual reports and earnings press releases, respectively. Tetlock, Saar-Tsechansky and Macskassy [18] examine qualitative information in news stories at daily horizons, and find that the fraction of negative words in firm-specific news stories forecasts low firm earnings. Loughran and McDonald [12] worked out particular lists of words specific to finance, extracted from 10-K filings, and tested whether these lists actually gauge tone. The authors found significant relations between their lists of words and returns, trading volume, subsequent return volatility, and unexpected earnings. The important corollary of these works is that the selection of documents from which to build a basic lexicon has a major influence on the accuracy of the final forecasting model, as sentiment varies according to context, and lists of words extracted from popular newspapers or social networks convey emotions differently than words from financial texts. Being aware of this, the sentiment lexicons used in this study are built from financial documents provided by Dow Jones Newswires, in a way similar to [12].

Once a sound sentiment lexicon is built (and, as stated before, much of its soundness relies on the choice of appropriate news sources), we build sentiment indicators quantifying, usually on a daily basis, the mood of the public towards a financial entity. Ways of building sentiment indicators are well explained in [11]. Financial modelling based on these sentiment indicators is then done basically from two perspectives: either use the sentiment indicators as exogenous features in econometric or machine learning forecasting models, and test their relevance in forecasting price movements, returns or other statistics of the price; or use them as external advisors for ranking the subjects (target-entities) of the news (e.g. exchange market stocks) and create a portfolio. A few selected examples from the vast amount of published research on the subject of forecasting and portfolio management with sentiment data are [9, 12, 18, 19], and a further review of econometric models that include text as data can be found in [8].

1.2 The Sentiment Indicators

Acuity Trading tracks news for more than 90K companies worldwide, and produces news-based entity sentiment indicators for each one of these. The sentiment indicators are based on proprietary lexicons, from which Acuity is able to extract up to nine different emotions pertaining to a given entity. This article focuses on six of these sentiment types, which can be grouped into Bullish and Bearish emotions. In the Bullish emotions group we have indicators for (the terminology is from Acuity):

Positivity, Certainty, FinancialUp;

and in the Bearish emotions group we have

Negativity, Uncertainty, FinancialDown.

We can make the following aggregations of the different sentiment indicators exposed above to build general Bull/Bear signals:

  • \(BULL = 0.33\cdot (Positivity + Certainty + FinancialUp) \); that is, at each time step consider the arithmetic average of bullish emotion scores. Likewise, consider

  • \(BEAR = 0.33\cdot (Negativity + Uncertainty + FinancialDown)\);

  • \(BBr = 100\cdot BULL/(BULL + BEAR)\);

  • \(PNlog = 0.5\cdot \ln ((Positivity+1)/(Negativity+1))\).

The BBr is inspired by the well-known Bull-Bear ratio of Technical Analysis [1], which in the pre-internet era was concocted from opinion polls of market professionals. In the sentiment data it may well be that for particular stocks, and for particular timestamps, all bullish and bearish sentiment scores are 0, so that the ratio is undefined. In this case we interpolate the non-existent (or NA) BBr score from the nearest non-NA values to its left and right. The PNlog is of a similar nature to the BBr [3].
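The aggregations above can be sketched in a few lines of Python with pandas. This is only an illustration under stated assumptions: the column names mirror Acuity's terminology but the data layout is hypothetical, the toy scores are made up, and the use of linear interpolation for missing BBr values is one possible reading of the interpolation rule described in the text.

```python
import numpy as np
import pandas as pd


def aggregate_sentiment(df: pd.DataFrame) -> pd.DataFrame:
    """Build BULL, BEAR, BBr and PNlog from six daily emotion scores."""
    out = pd.DataFrame(index=df.index)
    out["BULL"] = 0.33 * (df["Positivity"] + df["Certainty"] + df["FinancialUp"])
    out["BEAR"] = 0.33 * (df["Negativity"] + df["Uncertainty"] + df["FinancialDown"])
    total = out["BULL"] + out["BEAR"]
    # BBr is undefined (0/0) when every score is zero: mark those days as NA first.
    bbr = 100.0 * out["BULL"] / total.where(total != 0)
    # Assumption: fill NA values by linear interpolation between neighbouring values.
    out["BBr"] = bbr.interpolate(method="linear", limit_direction="both")
    out["PNlog"] = 0.5 * np.log((df["Positivity"] + 1.0) / (df["Negativity"] + 1.0))
    return out


# Made-up scores for three days; the second day has no sentiment at all.
scores = pd.DataFrame(
    {
        "Positivity": [2.0, 0.0, 1.0],
        "Certainty": [1.0, 0.0, 0.0],
        "FinancialUp": [0.0, 0.0, 0.0],
        "Negativity": [1.0, 0.0, 0.0],
        "Uncertainty": [0.0, 0.0, 1.0],
        "FinancialDown": [0.0, 0.0, 0.0],
    },
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)
indicators = aggregate_sentiment(scores)
print(indicators.round(3))
```

On the day without any sentiment, the BBr is filled with the midpoint of its two neighbours, while PNlog is simply 0 because both Positivity and Negativity are 0.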

For readers who are new to sentiment analysis, and its particular application to Finance, good starting points are the book [11] and the survey papers [2, 3]. In particular, [3] gives details of the construction of the sentiment indicators presented above. In the following sections I shall describe different ideas for using Acuity’s entity sentiment indicators in your investment decisions.

2 Technical Trading with Sentiment

The general idea is to take your favorite trading rule from Technical Analysis (the book by Achelis [1] presents a large list of these trading rules) and, instead of using the price of the stock in the rule, substitute it with a sentiment indicator. To illustrate this idea consider the Dual Moving Average Crossover rule. This consists of computing two moving averages of the Closing price up to day t: a short-term one over s days, named MA(s), and a long-term one over m days, MA(m). The trading rule states to go long on the next day \(t+1\) if \(MA(s) > MA(m)\), or short otherwise. An example of parameter values is \(s = 12\), \(m = 50\), but of course these can be tuned from data.
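A minimal sketch of the Dual Moving Average Crossover rule applied to an arbitrary series (a price or a sentiment indicator) could look as follows. The function name and the choice of staying flat while the long moving average is not yet available are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd


def ma_crossover_position(series: pd.Series, s: int = 12, m: int = 50,
                          long_only: bool = False) -> pd.Series:
    """Position held on day t+1, decided from MA(s) and MA(m) computed up to day t."""
    ma_short = series.rolling(s).mean()
    ma_long = series.rolling(m).mean()
    # Long (+1) when MA(s) > MA(m); otherwise short (-1), or flat (0) if long-only.
    raw = np.where(ma_short > ma_long, 1.0, 0.0 if long_only else -1.0)
    # Assumption: no position during the warm-up period of the moving averages.
    raw = np.where(ma_short.isna() | ma_long.isna(), 0.0, raw)
    signal = pd.Series(raw, index=series.index)
    # Shift by one day so that the signal of day t is traded on day t+1.
    return signal.shift(1).fillna(0.0)


# Toy series that rises and then falls, with short windows so the rule switches.
toy = pd.Series([1, 2, 3, 4, 5, 4, 3, 2, 1, 0], dtype=float)
position = ma_crossover_position(toy, s=2, m=4)
print(position.tolist())
```

Feeding a sentiment series such as BBr instead of the closing price requires no change to the function, which is precisely the substitution described above.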

I applied this trading rule separately to each of the sentiment indicators Positivity, Negativity, BULL, BEAR, and BBr, in place of the price, for the JP Morgan Chase & Co. stock (JPM:NYSE) from January 2, 2018 to May 22, 2020, an epoch that reflects both bull and bear market conditions. Thus, I feed the sentiment time series to the technical indicator, take a position in the stock according to the signal and hold it until the next signal. My main measure of performance is the cumulative excess return given by the strategy with respect to buy-and-hold, but I will also consider the strategy's annualized return, annualized volatility, win-rate, maximum drawdown and Sharpe ratio (considering a risk-free interest rate of 1%). There are other important performance measures that one may consider, but the subset I propose gives a fair idea of the health of the strategy with respect to benefits and risk.
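The performance measures listed above admit several conventions. The sketch below uses common textbook definitions (compounded returns, 252 trading days per year, win-rate computed over days with a non-zero strategy return); these conventions and the function name are assumptions, not necessarily the ones used to produce the reported figures.

```python
import numpy as np


def performance(strategy_returns, benchmark_returns, risk_free=0.01, periods=252):
    """Summary statistics of a daily strategy against a buy-and-hold benchmark."""
    r = np.asarray(strategy_returns, dtype=float)
    b = np.asarray(benchmark_returns, dtype=float)
    wealth = np.cumprod(1.0 + r)
    # Cumulative excess return: final strategy wealth minus final buy-and-hold wealth.
    cum_excess = wealth[-1] - np.prod(1.0 + b)
    ann_return = wealth[-1] ** (periods / len(r)) - 1.0
    ann_vol = r.std(ddof=1) * np.sqrt(periods)
    # Drawdown is measured against the running maximum of wealth, starting at 1.
    path = np.concatenate(([1.0], wealth))
    max_dd = float(np.min(path / np.maximum.accumulate(path) - 1.0))
    active = r[r != 0]
    win_rate = float(np.mean(active > 0)) if active.size else float("nan")
    return {
        "cum_excess_return": float(cum_excess),
        "annualized_return": float(ann_return),
        "annualized_volatility": float(ann_vol),
        "win_rate": win_rate,
        "max_drawdown": max_dd,
        "sharpe": float((ann_return - risk_free) / ann_vol),
    }


# Made-up daily returns, only to exercise the function.
stats = performance([0.10, -0.05, 0.02, 0.0], [0.01, 0.01, 0.01, 0.01])
print(stats)
```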

I repeated the experiment with different values for s and m (in fact, \((s,m) \in \{5,10,15,20\} \times \{25,50,100\}\)), considered long-only and long-short trading, and applied a rolling window analysis with window sizes of 254 days (a year) and 127 days (6 months), both with 1-day increments. The results showed that 96 out of the 240 variants of the MA strategy yielded positive excess return. All results are plotted in Fig. 1, where diamonds of different sizes and various shades of blue encode the combinations of values for the pair (s, m); right boxes show long-only trading whilst left boxes show long-short trades; upper boxes show results of the rolling window analysis with window size 127 (6 months), whilst lower boxes contain results of the rolling analysis with window size 254 (a year). We readily observe that the best performing strategy (with respect to excess return) was based on the BULL sentiment indicator, with \(s=10\), \(m=25\), a window size of a year and allowing long and short positions. This strategy yielded 33.9% excess return and a Sharpe ratio of 1.74; its annualized volatility is 23.9% and its maximum drawdown is –15.9%. For a more conservative strategy with volatility of 14% and maximum drawdown of –9%, and a reasonable excess return of 10.6%, whilst offering a Sharpe ratio of 1.34, we have the BBr strategy with \(s=15\), \(m=25\), trading long only and a window size of a year. Table 1 exhibits a count of successful strategies per sentiment. We can see there that the BEAR and Negativity sentiments give the greatest number of successful variants of the MA strategy (i.e. with positive excess returns, ER).
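The figure of 240 variants is consistent with crossing the five sentiment indicators with the parameter grid, the two trading modes and the two rolling window sizes. The enumeration below is a sketch of that bookkeeping; the variable names are hypothetical.

```python
from itertools import product

short_windows = (5, 10, 15, 20)           # values of s
long_windows = (25, 50, 100)              # values of m
sentiments = ("Positivity", "Negativity", "BULL", "BEAR", "BBr")
modes = ("long-only", "long-short")
rolling_windows = (127, 254)              # 6 months and a year, in days

variants = list(product(sentiments, short_windows, long_windows, modes, rolling_windows))
print(len(variants))  # 5 * 4 * 3 * 2 * 2 = 240
```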

Fig. 1. All combinations of the MA trading strategy with the five sentiment indicators and their performance with respect to excess return.

Table 1. Count of successful strategies per sentiment.

3 Sentiment Driven Portfolio Selection

The next idea is to use the sentiment indicators to rank stocks and use this ranking in the popular heuristic of quintile portfolio weighting. Subsequently, a backtesting approach is implemented to compare this sentiment-based quintile portfolio selection with other popular portfolio selection and rebalancing strategies, across the trading performance measures already mentioned in Sect. 2.

The quintile portfolio selection strategy is a popular and simple strategy in financial investment. It consists of first sorting the stocks according to some characteristic (in our case, the sentiment scores), and then equally longing the top 20% (i.e., the top quintile) and possibly shorting the bottom 20% (i.e., the bottom quintile). In my experiments I will restrict trading to long positions only. Despite its simplicity, the quintile portfolio strategy has shown great advantage over more sophisticated portfolios in terms of stable performance and easy deployment. Moreover, a recent paper [20] gives a mathematical interpretation of quintile portfolios as solutions of robust portfolio designs, with respect to some uncertainty sets for the expected returns.
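As a sketch of the quintile weighting just described, the function below ranks stocks by a score and assigns equal weights to the top 20% (and, optionally, equal negative weights to the bottom 20%). The function name, the rounding of the quintile size, and the score values are hypothetical and serve only as an illustration.

```python
import pandas as pd


def quintile_weights(scores: pd.Series, long_short: bool = False) -> pd.Series:
    """Equal weights on the top quintile; optionally short the bottom quintile."""
    ranked = scores.dropna().sort_values(ascending=False)
    # Assumption: the quintile size is 20% of the universe, rounded, at least 1.
    k = max(1, int(round(0.2 * len(ranked))))
    weights = pd.Series(0.0, index=scores.index)
    weights.loc[ranked.index[:k]] = 1.0 / k
    if long_short:
        weights.loc[ranked.index[-k:]] = -1.0 / k
    return weights


# Made-up sentiment scores (e.g. BBr values) for ten tickers.
bbr_scores = pd.Series(
    {"AAPL": 62.0, "ABBV": 48.0, "AMZN": 71.0, "DB": 35.0, "DIS": 55.0,
     "FB": 58.0, "GOOG": 66.0, "HAL": 41.0, "JPM": 52.0, "KO": 50.0}
)
weights = quintile_weights(bbr_scores)
print(weights[weights > 0])
```

With ten stocks the top quintile holds two names, each with weight 0.5; the long-only case used in the experiments corresponds to the default `long_short=False`.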

In this study, I make use of the various functionalities of the R package portfolioBacktest [14], which automates the performance analysis of backtests on a list of portfolios over multiple datasets on a rolling-window basis. By performing a rolling-window analysis one can cover many of the performance weaknesses of a single backtest and obtain more realistic results.

3.1 The Experiments and Results

The dataset consists of a set of 16 stocks from different sectors including the technological, oil, pharmaceutical, banking and financial services, and entertainment. This includes the following companies (listed by their market ticker): AAPL, ABBV, AMZN, DB, DIS, FB, GOOG, GRFS, HAL, HSBC, JPM, KO, MCD, MSFT, PFE, XOM. Their price history is taken on a daily basis from January 1, 2015 to June 9, 2020.

Several types of portfolios are constructed on the basis of different approaches for weighting the different stocks in the portfolio. As benchmarks, I use both the Global Minimum Variance Portfolio (GMVP) and the classical Mean-Variance (MV) portfolio due to Markowitz [13], which is the tangency portfolio constructed from the “efficient frontier” of optimal portfolios offering the maximum possible expected return for a given level of risk. I also include a simple portfolio in which the same weight is assigned to each stock (the uniform or equal-weighted portfolio), as well as a quintile portfolio built simply on the basis of estimated expected returns. I use these portfolios as reference points for comparison with the sentiment-based portfolios, the Quintile-BBr and the Quintile-PNlog, which are constructed using the sentiment indicators BBr and PNlog, respectively, as the key input for selecting stocks in a quintile portfolio strategy. I apply a look-back rolling window of length 252 days, and optimize the portfolio every 20 days (i.e. perform a selection of stocks roughly every month according to the strategy). For comparison purposes among the different portfolio selection strategies I do not consider transaction costs. However, I have made a simulation of the quintile portfolio with BBr selection considering transaction costs. Results are summarized below. Table 2 exhibits the performance of the six different portfolio selection strategies under the different measures considered (where Sharpe ratio is abbreviated as Sharpe, Maximum Drawdown as Max-DD, Annualized return as A_return, and Annualized volatility as A_volat).
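The rebalancing scheme (a 252-day look-back window, re-optimized every 20 days) can be sketched independently of the R package used in the study. The code below is a simplified Python illustration on simulated returns: the function names are hypothetical, the GMVP uses the unconstrained closed form (weights proportional to the inverse covariance matrix times a vector of ones), and weights are held fixed between rebalancing dates without transaction costs.

```python
import numpy as np


def gmvp_weights(window):
    """Unconstrained global minimum variance weights from a window of returns."""
    sigma = np.cov(window, rowvar=False)
    raw = np.linalg.solve(sigma, np.ones(sigma.shape[0]))
    return raw / raw.sum()


def equal_weights(window):
    """Uniform portfolio: the same weight for every asset."""
    n = window.shape[1]
    return np.full(n, 1.0 / n)


def rolling_backtest(returns, weight_fn, lookback=252, rebalance_every=20):
    """Re-estimate weights every `rebalance_every` days from the last `lookback` days."""
    returns = np.asarray(returns, dtype=float)
    portfolio = []
    weights = None
    for t in range(lookback, returns.shape[0]):
        if (t - lookback) % rebalance_every == 0:
            weights = weight_fn(returns[t - lookback:t])
        portfolio.append(float(returns[t] @ weights))
    return np.array(portfolio)


# Simulated daily returns for four hypothetical assets.
rng = np.random.default_rng(0)
simulated = rng.normal(0.0005, 0.01, size=(400, 4))
gmvp_path = rolling_backtest(simulated, gmvp_weights)
equal_path = rolling_backtest(simulated, equal_weights)
print(len(gmvp_path), gmvp_path.std(), equal_path.std())
```

Any other weighting rule, such as a quintile selection driven by a sentiment score, can be plugged in as `weight_fn` provided it returns one weight per asset.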

Table 2. Performance of the six different portfolio selection strategies.

Performance can also be viewed in the plots below of cumulative returns and bar-plots of the drawdown and Sharpe ratio (Figs. 2 and 3). It can be observed that the quintile portfolio with BBr sentiment selection constructs relatively more successful portfolios in terms of Sharpe ratio and annual return. Moreover, all methods result in an approximately similar maximum drawdown. Additionally, it is remarkable that the uniform approach to assigning weights performs comparably to other more sophisticated methods such as the Markowitz portfolio and the GMVP. This is consistent with the literature on portfolio management and highlights the key flaw in general Markowitz mean-variance optimization: a large degree of instability in the covariance matrix makes the implementation of Markowitz not especially fruitful in practice (more on this in Sect. 4).

Fig. 2. Cumulative returns of the six portfolio strategies.

Fig. 3. Sharpe ratio and Maximum Drawdown of the six portfolio strategies.

Fig. 4. Performance of the Quintile-BBr strategy with transaction costs set at 15 bps (red) and without (black). (Color figure online)

Finally, I simulate the Quintile-BBr portfolio selection strategy with transaction costs set at 15 bps, and compare to the same strategy without transaction costs. It can be observed that both strategies performed quite similarly (Fig. 4).

Overall, this study indicates that incorporating sentiment analysis into portfolio selection has the potential to enhance risk-adjusted returns when compared with many of the standard portfolio choice frameworks. In particular, the Bull-Bear sentiment scoring used as the criterion for sorting in the quintile portfolio selection strategy performed substantially better than the reference portfolios, and the Quintile-PNlog portfolio performed slightly better than the best reference portfolio (the equal-weighted portfolio).

4 Sentiment Factor Model of Returns

In this section I go a step further and show how to leverage a macroeconomic factor model for stock returns with a market sentiment indicator. Factor models are used to make good estimates of the covariance of capital asset returns. Covariance matrices of asset returns are fundamental for choosing diversified portfolios and are key inputs to portfolio optimization routines, dating back to the now classical mean-variance model of Harry Markowitz [13].

The use of factor models to estimate large covariance matrices of asset returns dates back to William Sharpe [16]. The most well known factor models for capital assets are the capital asset pricing model, which uses excess market returns as the only factor (Sharpe [17]), and the Fama-French 3-factor model (Fama and French [7]).

Let us begin with a brief review of factor models (for full details see [21, Ch. 15]). Multifactor models for N asset returns and K factors have the general form

$$\begin{aligned} \textbf{R}_t =\boldsymbol{\alpha } + \textbf{B}\cdot \textbf{f}_t + \boldsymbol{\epsilon }_t, \qquad t= 1, \ldots , T \end{aligned} \tag{1}$$

where \(\textbf{R}_t = [R_{1t}, \ldots , R_{Nt}]'\) is the vector of log-returns of the N assets, \(\textbf{f}_t = [f_{1t}, \ldots , f_{Kt}]'\) is the vector of K factors, \(\boldsymbol{\epsilon }_t = [\epsilon _{1t}, \ldots , \epsilon _{Nt}]'\) is the vector of the N asset-specific factors, \(\boldsymbol{\alpha } = [\alpha _{1}, \ldots , \alpha _{N}]'\) is the vector of the N asset alphas (which in a macroeconomic model corresponds to the excess return or abnormal rate of return), and

$$\textbf{B} = \left[ \begin{array}{c}\boldsymbol{\beta }'_{1}\\ \vdots \\ \boldsymbol{\beta }'_{N}\end{array}\right] = \left[ \begin{array}{ccc}\beta _{11} & \cdots & \beta _{1K} \\ \vdots & \ddots & \vdots \\ \beta _{N1} & \cdots & \beta _{NK}\end{array}\right] $$

is the matrix of factor loadings (each \(\beta _{ik}\) being the factor beta for asset i on the k-th factor).

In the multifactor model it is assumed that the factor realizations are independent with unconditional moments, and that the asset specific error terms are uncorrelated with each of the common factors, and are serially uncorrelated and contemporaneously uncorrelated across assets:

$$\begin{aligned} cov(\epsilon _{it}, \epsilon _{js}) = \begin{cases} \sigma _i^2 & \text{if } i=j \text{ and } t=s, \\ 0 & \text{otherwise.} \end{cases} \end{aligned}$$

Under these assumptions the covariance matrix of asset returns has the form

$$\begin{aligned} cov(\textbf{R}_t) = \varSigma = \textbf{B}\, cov(\textbf{f}_t)\,\textbf{B}' + \textbf{D} \end{aligned} \tag{2}$$

where \(\textbf{D} = cov(\boldsymbol{\epsilon }_t) = E[\boldsymbol{\epsilon }_t\boldsymbol{\epsilon }'_t|\textbf{f}_t]\) is a diagonal matrix. From Eq. (2), the variance of each asset is given by

$$\begin{aligned} var(R_{it}) = \boldsymbol{\beta }_i'\, cov(\textbf{f}_t)\,\boldsymbol{\beta }_i + \sigma _i^2 \end{aligned} \tag{3}$$

and the pairwise covariance of two distinct assets \(i \ne j\) is fully determined by their factor loadings and the covariance of the market factors:

$$\begin{aligned} cov(R_{it},R_{jt}) = \boldsymbol{\beta }_i'\, cov(\textbf{f}_t)\,\boldsymbol{\beta }_j \end{aligned} \tag{4}$$
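To make Eqs. (1) to (4) concrete, the following sketch fits the factor model by ordinary least squares and assembles the covariance estimate of Eq. (2). The function name and the simulated data are hypothetical; estimating the model asset by asset with OLS is a standard choice, assumed here for illustration.

```python
import numpy as np


def factor_model_covariance(returns, factors):
    """OLS fit of R_t = alpha + B f_t + eps_t, then Sigma = B cov(f) B' + D."""
    R = np.asarray(returns, dtype=float)             # T x N asset returns
    F = np.asarray(factors, dtype=float)
    if F.ndim == 1:
        F = F[:, None]                               # T x K factor realizations
    T, K = F.shape
    X = np.column_stack([np.ones(T), F])             # intercept plus factors
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)     # (K+1) x N coefficients
    alpha, B = coef[0], coef[1:].T                   # alphas (N) and loadings (N x K)
    resid = R - X @ coef
    D = np.diag(resid.var(axis=0, ddof=K + 1))       # diagonal specific variances
    cov_f = np.atleast_2d(np.cov(F, rowvar=False))   # K x K factor covariance
    sigma = B @ cov_f @ B.T + D                      # Eq. (2)
    return alpha, B, resid, sigma


# Simulated example: three assets driven by two factors.
rng = np.random.default_rng(1)
f = rng.normal(0.0, 0.01, size=(500, 2))
B_true = np.array([[1.0, 0.2], [0.8, -0.1], [1.2, 0.5]])
eps = rng.normal(0.0, 0.002, size=(500, 3))
R = 0.0002 + f @ B_true.T + eps
alpha, B, resid, sigma = factor_model_covariance(R, f)
print(np.round(B, 2))
```

Because `D` is diagonal, the off-diagonal entries of `sigma` come only from the term `B @ cov_f @ B.T`, which is the content of Eq. (4).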

4.1 Sentiment Factor Models for the US Market

I shall consider the following five factor models for stocks of companies trading in the New York Stock Exchange:

  1. macroeconomic 1-factor model based on the SP500 returns (factor name: SP500)

  2. macroeconomic 1-factor model based on a Sentiment index (factor name: Sentiment)

  3. fundamental 3-factor Fama-French model (factors: SMB, HML, Mkt.RF)

  4. fundamental 4-factor Fama-French and Sentiment index model (factors: SMB, HML, Mkt.RF, Sentiment)

  5. macroeconomic 2-factor model based on SP500 and Sentiment index (factors: SP500, Sentiment).

The Fama-French factors are constructed using 6 value-weight portfolios formed on size and book-to-market. SMB (Small Minus Big market capitalization) is the average return on the three small portfolios minus the average return on the three big portfolios; HML (High Minus Low book-to-market ratio) is the average return on the two value portfolios minus the average return on the two growth portfolios; Mkt.RF is the excess return on the market, that is, the value-weight return of all CRSP firms incorporated in the US and listed on the NYSE, AMEX, or NASDAQ. These factors are compiled and kept up to date by Professor French on his web page at Dartmouth College. The Sentiment factor will be Acuity’s PNlog described above.

I consider the set of stocks from NYSE with the following tickers: AAPL, ABBV, AMZN, DB, DIS, FB, GOOG, HAL, HSBC, JPM, KO, MCD, MSFT, PFE, XOM, and sample their prices from 1-1-2015 to 31-12-2019, a bullish period for the American stock market. Let S be the set containing the log-returns of these stocks in the aforementioned period. I construct all five factors (Mkt.RF, SMB, HML, SP500, Sentiment) over the same period.

It is instructive to see first how the factors we are considering correlate with each other. Table 3 shows the correlation between these factors.

Table 3. Correlation matrix of factors

We can observe that none of the correlations are statistically significant (except of course that between Mkt.RF and SP500, which both quantify basically the same statistic: Mkt.RF is the joint excess return of the American markets while the other is the SP500 return). One can conclude from this correlation analysis that the Sentiment index does provide information on the stocks that is different from that of the market.

Next, I fit a 1-factor model based on Sentiment to the log-returns of portfolio S, and estimate the covariance matrix of the residuals of this factor model fit. I apply a hierarchical clustering algorithm using as similarity metric the correlation of these residuals. Figure 5 shows the covariance matrix of residuals and in rectangular boxes the clusters obtained by correlation on these residuals.

Fig. 5. Covariance of sentiment factor model and clustering.

We can see that the clustering performed on the residuals (or assets’ sentiment-specific factors) correctly identifies the sector of each stock: ABBV, PFE (pharmaceuticals); AAPL, FB, AMZN, GOOG, MSFT (technology); DB, HSBC, JPM (financials); HAL, XOM (oil); KO, MCD (consumption); DIS (entertainment).
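The clustering step can be sketched with SciPy's hierarchical clustering routines. Several choices below are assumptions, since the text does not specify them: the dissimilarity is taken as one minus the correlation of the residuals, the linkage is average linkage, and the number of clusters is supplied by the user. The residuals are simulated so that the example is self-contained.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_by_residual_correlation(residuals, n_clusters):
    """Group assets whose factor-model residuals are highly correlated."""
    corr = np.corrcoef(residuals, rowvar=False)
    # Assumption: dissimilarity = 1 - correlation, symmetrized with zero diagonal.
    dist = 1.0 - corr
    dist = (dist + dist.T) / 2.0
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    tree = linkage(condensed, method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Simulated residuals: assets 0-1 share one hidden shock, assets 2-3 another.
rng = np.random.default_rng(0)
shock_a = rng.normal(size=(300, 1))
shock_b = rng.normal(size=(300, 1))
noise = 0.3 * rng.normal(size=(300, 4))
simulated_residuals = np.hstack([shock_a, shock_a, shock_b, shock_b]) + noise
labels = cluster_by_residual_correlation(simulated_residuals, n_clusters=2)
print(labels)
```

In the simulated data the shared shocks play the role of sector effects left in the residuals after removing the common factor, which is why the clusters recover the two groups.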

4.2 Comparison of Returns Covariance Matrix Estimation via Different Factor Models

For further reference I will denote by SP500 the 1-factor model based on the SP500 returns; by Sentiment the 1-factor model based on Sentiment index (PNlog); by FF the 3-factor model due to Fama and French; by FFwSent the 4-factor Fama-French and Sentiment index model; and by SPwSent the 2-factor model based on SP500 and Sentiment index.

I fit each one of these factor models to the log-returns of our considered portfolio (the set S), and estimate for each the returns covariance matrix according to Eq. (2). I estimate the models during a training phase (first half of the period considered) and then compare how close the estimated covariance matrices are to the sample covariance matrix of the test phase (second half of the period considered), and do this for periods of different lengths to assess the impact of the length of the sample data on the estimations. The estimation error will be evaluated in terms of the squared Frobenius norm \(||\varSigma - \varSigma _{true}||_F^2\) as well as the PRIAL (PeRcentage Improvement in Average Loss):

$$PRIAL(\varSigma ) = 100 \times \frac{||\varSigma _{scm} - \varSigma _{true}||_F^2 - ||\varSigma - \varSigma _{true}||_F^2}{||\varSigma _{scm} - \varSigma _{true}||_F^2}$$

which goes to 0 when the estimation \(\varSigma \) tends to the sample covariance matrix \(\varSigma _{scm}\) and goes to 100 when the estimation \(\varSigma \) tends to the true covariance matrix \(\varSigma _{true}\) (the sample covariance matrix of the test phase). Since one cannot expect perfectly uncorrelated residuals across assets, nor with the factors, the PRIAL can be negative when the sample covariance is very close to the true covariance and the factor model estimation of the covariance is not as good. This can (and surely will) happen, for example, when taking large samples, which improves the asymptotic convergence of the sample covariance to the true covariance, but makes for a bad covariance matrix for portfolio management.
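Both error measures are straightforward to compute. The sketch below implements the squared Frobenius error and the PRIAL exactly as defined above; the function names and the toy matrices are hypothetical.

```python
import numpy as np


def frobenius_error(sigma_hat, sigma_true):
    """Squared Frobenius norm of the estimation error."""
    diff = np.asarray(sigma_hat, dtype=float) - np.asarray(sigma_true, dtype=float)
    return float(np.linalg.norm(diff, ord="fro") ** 2)


def prial(sigma_hat, sigma_scm, sigma_true):
    """PeRcentage Improvement in Average Loss relative to the sample covariance."""
    base = frobenius_error(sigma_scm, sigma_true)
    return 100.0 * (base - frobenius_error(sigma_hat, sigma_true)) / base


# Toy matrices: the estimate is further from the truth than the sample covariance.
sigma_true = np.eye(2)
sigma_scm = 1.1 * np.eye(2)
sigma_hat = 1.2 * np.eye(2)
print(prial(sigma_scm, sigma_scm, sigma_true))   # estimate equal to the SCM
print(prial(sigma_true, sigma_scm, sigma_true))  # estimate equal to the truth
print(prial(sigma_hat, sigma_scm, sigma_true))   # estimate worse than the SCM: negative
```

The last call illustrates the negative PRIAL discussed above: the toy estimate has four times the squared error of the sample covariance matrix.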

Tables 4 and 5 present the covariance estimation error and the PRIAL for each one of the five considered factor models on a selection of different periods varying their lengths and beginning date.

Table 4. Frobenius-norm error in covariance estimation by the different factor models in different sampling periods
Table 5. PRIAL in covariance estimation by the different factor models in different sampling periods

We can observe that in the period 2015-01-01/2017-12-31, the Sentiment-factor model by itself beat all other models in covariance estimation. In the periods where the SP500 and Sentiment factors separately have similar estimation accuracy (marked in bold), their joint model (the 2-factor model of SP500 and Sentiment) remarkably improves the error in the covariance estimation to the level of the Fama and French model. Considering a large sampling period (2015-01-01/2019-12-20) notably improves the accuracy of the sample covariance estimation (SCM), but deteriorates the estimation by all factor models, most notably that of the Sentiment factor model, as one may expect since old news is no news.

To conclude, as has been shown, financial news sentiment is largely uncorrelated with other well-known financial factors and, in consequence, it does give complementary information about the market. The assets’ sentiment-specific residuals from the Sentiment factor model of log-returns can help identify assets with similar risk, and the classification based on these residuals coincides with their sector classification. Using sentiment as a factor on its own can often give good estimations of the covariance matrix of assets’ returns, and in combination with the SP500 returns series it makes a 2-factor model that is comparably as good as the Fama-French 3-factor model.