1 Introduction

Financial instability is often preceded by booms in asset markets that eventually end in busts. Such boom-bust dynamics are observed not only in stock markets, but also in the real estate sector. The emergence of such dynamics is largely due to a substantial build-up of systemic risk, which triggers the occurrence of a full-fledged financial crisis. The relationship between the real estate sector and systemic risk is documented theoretically (Allen and Carletti 2013). Besides, the 2008 global financial crisis (GFC), which originated in the US sub-prime mortgage market, provides compelling empirical evidence of the real estate sector contributing to the build-up of systemic risk (Geanakoplos et al. 2012; Meng et al. 2014). For a large panel of countries over a long historical perspective, this relationship is corroborated by Brunnermeier et al. (2020). More recently, overheated real estate markets have been found to fuel systemic risk in the G7 countries (Chiang and Chen 2022; Pavlidis et al. 2021) and China (Cao 2021; Cincinelli et al. 2021, 2022; Wang et al. 2021).

Nonetheless, the interaction between the real estate sector and financial instability does not necessarily boil down to a bivariate relationship between real estate prices and systemic risk. We conjecture that economic agents’ sentiment may mediate this relationship. Namely, our hypothesis is that adverse expectations regarding the real estate market may not only dampen the prices, but may also directly increase systemic risk. Likewise, decreasing real estate prices and elevated systemic risk may further feed the adverse market expectations, thereby forming a vicious circle. Thus, a potentially overlooked role of sentiment can aggravate the diabolic loop between adverse real estate market conditions and overall financial instability.

In this study, we construct a sentiment-based indicator of real estate market stress (SB-IREMS) as a proxy of adverse expectations and then examine the trivariate relationship described above. As far as we know, this paper is the first to explore the linkage among a sentiment-based indicator of real estate market stress, real estate price dynamics and systemic risk. This analysis is carried out for several major economies with sizeable real estate markets: the USA, the UK, Canada, India, and Australia. In addition, we conduct it on the global level. The analysis encompasses the period between January 2004 and December 2021. Since the sentiment-based indicator of real estate market stress and systemic risk are monthly indicators, while real estate prices are available on a quarterly basis, we investigate the trivariate relationships using Granger causality tests for mixed frequency vector autoregressions (MF-VAR), as proposed by Ghysels et al. (2016). This is the most appropriate methodological approach in our setting, as temporal aggregation to a lower frequency, i.e. to quarters, may lead to spuriously hidden or generated causal relationships.

We derive the sentiment-based indicator of real estate market stress using a novel two-step procedure. First, we read the titles and abstracts of the articles and working papers dedicated to real estate economics and finance, which are in English and were posted on IDEAS RePEc, the largest public repository of economic research, during the period 2000–2021. On this basis we compile a dictionary of terms and expressions capturing the negative perception of real estate market developments. This dictionary consists of 55 items which refer to different dimensions of residential and commercial real estate. Then, we measure the Internet search intensity for these terms and expressions by obtaining the corresponding Google Trends series, which span from January 2004 to December 2021.
In order to derive the sentiment-based indicator of real estate market stress, we extract the first principal component from the series, using sparse principal component analysis (PCA). This technique is superior to the conventional PCA in terms of the interpretability of components, as it assigns zero loadings to less important input variables. Besides, the sparse PCA is particularly instrumental in a research setting with multiple input variables. In our study, we use nominal real estate price indices retrieved from the OECD statistical database for country-level analyses, while the residential property prices dataset of the Basel-based Bank for International Settlements (BIS) is used for the analysis on the global scale. Systemic risk is proxied with the conditional capital shortfall measure, SRISK, proposed by Brownlees and Engle (2017), which measures an aggregate shortage of financial institutions’ equity in the sample countries given a severe shock in the world stock market.

As regards the sentiment-based indicators of real estate market stress, we document substantial heterogeneity both in the underlying dictionary items with non-zero component loadings and in the overall dynamics during the observation period. The largest number of terms and expressions underpinning the indicators is 35 for global real estate market stress and 32 for the USA, whereas for the UK it totals only 10. The terms and expressions with the highest component loadings are not uniform either. Across the countries considered and on the global level, only “mortgage crisis” and “foreclosure” are consistently found among the top-5 dictionary items with the highest component loadings. For the USA and Canada, there is a major peak in the SB-IREMS dynamics during the outbreak of the GFC, while, for example, the Australian SB-IREMS exhibits only a minor increase during that period. Overall, the sentiment-based indicators of real estate market stress are not tightly correlated, with the highest correlation coefficient reaching 0.56 for the USA and Canada. Thus, there is limited evidence supporting commonality in the sentiment regarding real estate market stress across the major economies.

Against this backdrop, the patterns of sentiment interaction with real estate prices and systemic risk appear largely country-specific. In the USA, Australia and India, the sentiment-based indicators of real estate market stress Granger-cause real estate prices, while in the UK the relationship runs in the opposite direction. In addition, in the UK the sentiment-based index of real estate market stress unidirectionally leads systemic risk, while the latter dampens real estate prices, thereby triggering a vicious circle involving prices, sentiment and systemic risk. In the case of India, real estate market stress sentiment affects both prices and systemic risk. Since prices and systemic risk are also intertwined, stress sentiment reinforces their bidirectional linkage. As regards Canada, no relationship involving stress sentiment is found. On the global level, we only uncover a relationship between stress sentiment and real estate prices running in both directions. All in all, the findings confirm that the sentiment-based indicators of real estate market stress can mediate the interaction between real estate prices and financial instability. In the USA, the UK, Australia and India, there is every reason for financial regulators, banks and real estate companies to monitor stress sentiment regarding real estate markets, as it exerts a statistically significant causal effect on real estate prices and/or systemic risk. Thus, our findings suggest that policy makers should factor in sentiment regarding real estate market stress when elaborating macroprudential policies, as for a number of major economies it entails a decline in real estate prices and/or an increase in systemic risk. In this context, our findings are consistent with the studies that investigate the impact of sentiment on financial sector performance and urge that sentiment be taken into account when designing macroprudential measures, e.g. Anastasiou and Drakos (2021), Anastasiou et al. (2022).
Accordingly, actions by the regulators and market participants aimed at mitigating negative sentiment regarding real estate markets are also encouraged. These actions may involve, inter alia, fostering overall real estate market transparency, preempting an unbalanced financial dependence between major real estate companies and systemically important financial institutions, and properly communicating new regulatory measures in the field of real estate to the general public.

Overall, our research extends several strands of literature. First and foremost, it adds to the studies proposing sentiment-based indicators in real estate economics and finance involving the use of Google Trends (Dietzel et al. 2014; Hohenstatt et al. 2011; Hohenstatt and Käsbauer 2014; Wu and Deng 2015; Chauvet et al. 2016; Aroul et al. 2022) and textual analysis (Heinig and Nanda 2018; Ruscheinsky et al. 2018a; Beracha et al. 2019; Heinig et al. 2020). However, our research differs from the extant studies, as it proposes an innovative approach that synthesizes textual analysis and Internet search queries to identify the terms and expressions which underlie these sentiment-based indicators. Moreover, our sentiment-based indicators are based on a uniform methodology for all the sample countries and globally, which is an advantageous feature in comparison with the alternative sentiment measures provided by major international consultancies engaged in real estate research. These alternative measures are not necessarily comparable across countries and are often publicly unavailable as historical datasets. As an important by-product of the research, we create a dictionary capturing negative sentiment about real estate markets, which aligns our paper with the literature on specific economic/financial dictionaries, e.g. Loughran and McDonald (2011), Du et al. (2022). Thus, in a broader context, our study relates to the sentometric literature which aims at extracting sentiment from various unconventional data sources, e.g. texts, audio and video data (Gentzkow et al. 2019; Algaba et al. 2020). Second, our paper enriches a voluminous literature confirming the effects of real estate indicators on financial instability, e.g. Koetter and Poghosyan (2010), Anundsen et al. (2016), Jang et al. (2018), Deng et al. (2019), Duprey and Klaus (2022).
Third, we contribute to the burgeoning literature which examines the interaction between systemic risk and various sentiment-based indicators (Kanas and Zervopoulos 2021; Nyman et al. 2021; Mbarki et al. 2022).

The rest of the paper is structured as follows. Section 2 reviews the literature on sentiment in real estate economics and finance. Section 3 describes the data and introduces the methodology underlying the computation of sentiment indices of real estate market stress as well as the mixed frequency vector autoregression used to uncover the relationships among these indices, real estate prices and systemic risk. Section 4 presents and discusses the estimation results for our sample countries and on the global scale. Section 5 concludes.

2 Literature review on sentiment indicators in real estate

Sentiment in real estate is usually measured on the basis of two approaches: (i) using Google search data and (ii) by means of textual analysis. The sentiment data based on Google searches and their aggregation via Google Trends have been extensively used in many fields of economics and finance, primarily for nowcasting and/or forecasting purposes in different markets, e.g. Choi and Varian (2012), Preis et al. (2013). Real estate markets are no exception. For example, Dietzel et al. (2014) show that using Google search volume statistics for relevant terms helps improve the forecasts of commercial real estate prices in the USA. Hohenstatt et al. (2011) as well as Hohenstatt and Käsbauer (2014) find that Google search data are a reliable predictor of transaction volumes and house prices in the USA and the UK, respectively. Wu and Deng (2015) apply Google search data to construct intercity information flow indices for the Chinese housing market. In addition to nowcasting/forecasting, such data are exploited to build indicators of market pessimism. Chauvet et al. (2016) identify numerous terms related to mortgage default to create the Mortgage Default Risk Index (MDRI), which captures the borrowers’ perception of the corresponding risk and is negatively correlated with future housing prices in major US metropolitan areas. In a similar vein, Aroul et al. (2022) construct a more comprehensive sentiment indicator for the US housing market based on Google search volume. This FEAR Index builds on specific real estate and more general economic terms and, like the MDRI, exhibits a robust negative relationship with future housing returns.

In the realm of textual analysis, Beracha et al. (2019) find that sentiment embedded in the news abstracts of The Wall Street Journal is an important and robust predictor of US commercial real estate returns up to a year in advance. Ruscheinsky et al. (2018a) analyze newspaper headlines from The Financial Times, The Wall Street Journal, Forbes and Bloomberg, concluding that media-expressed sentiment leads US REIT market movements for the period 2005–2016. Ruscheinsky et al. (2018b) create a German real estate sentiment dictionary and investigate whether its entries are useful to assess the indicators of the German real estate market based on “hard” data. Hausler et al. (2018) find that news-based sentiment measures obtained with the aid of machine learning are useful to forecast the returns of US REITs and direct commercial real estate markets. Koelbl (2020) concludes that the management discussion and analysis (MD&A) sections of US REITs contain useful information about the future performance of the funds.

Heinig and Nanda (2018) as well as Heinig et al. (2020) examine a variety of sentiment indicators stemming from online search and textual information and corroborate the usefulness of both types of data. However, they also find that in less informationally efficient markets online searches lead to more tangible improvements in the forecasts of real estate returns.

3 Data and empirical design

3.1 Constructing the SB-IREMS

Our novel approach builds on the synthesis of textual analysis and Google search data. At the first stage, we seek to create a dictionary adequately capturing economic agents’ pessimism with respect to the real estate market. Given the specificity of real estate, we do not resort to any of the dictionaries previously elaborated to capture economic/financial sentiment, e.g. Loughran and McDonald (2011). Instead, we compile a dictionary encompassing terms and expressions with a negative connotation regarding real estate based on the articles and working papers posted on IDEAS RePEc, the largest public repository of economic research. In choosing this approach, we proceed from the premise that examining a large number of scientific publications dedicated to a particular theme over a long period makes it possible to accurately capture public sentiment about it, since researchers become more responsive to public concerns over time if a particular problem persists. The diversity of general interest and field journals as well as research institutions’ working paper series represented in the IDEAS RePEc repository provides additional rationale for our approach. For instance, apart from many general interest journals, this database not only gives access to the article titles and abstracts from leading field journals on real estate economics and finance, e.g. Real Estate Economics, Journal of Real Estate Finance and Economics, Journal of Housing Economics, etc., but also makes it possible to screen the titles and abstracts of working papers provided by numerous reputable research centers in this area, e.g. specialized units of the London School of Economics, Wharton School of Business, etc. Besides, the working papers prepared by the research staff of central banks and other financial regulators worldwide that have paid significant attention to real estate markets are available via IDEAS RePEc.

We begin by conducting searches in this repository, using the most general keywords “real estate”, “housing” and “mortgage” which appear both in the titles and abstracts. The searches are performed for the period 2000–2021. Based on this selection criterion, we create a core pool of the articles and working papers made up of approximately 60,000 publications. We next extend it by examining the titles and abstracts of the papers citing those from the initial pool. Then, we read the titles and abstracts of the papers to identify the terms and expressions which undoubtedly have a negative connotation, thereby capturing adverse developments both in the residential and commercial real estate markets. We eventually come up with a list containing 55 terms and expressions (Table 1).

Table 1 Dictionary of terms and expressions conveying negative sentiment about real estate

Then, using Google Trends, a web application by Google, we retrieve the indices measuring the Internet search intensity for these terms and expressions for the period January 2004–December 2021. In our study, we extract such measures for the countries with sizeable real estate markets where English is the only official or a co-official language, namely, the USA, the UK, Canada, India and Australia. In addition, we retrieve them on the global scale, as English is the primary language for Internet users and Google accounts for nearly 85% of such searches.

At the second stage of our exercise synthesizing textual analysis and Google search data, we transform these numerous indices of search intensity into a single sentiment-based indicator of real estate market stress, SB-IREMS, for each mentioned country and globally. To this end, the input indices are standardized so that each of them has a mean of zero and a standard deviation of one. Then, we apply sparse principal component analysis (PCA). Based on this dimensionality reduction technique, we obtain the first principal component, which represents the SB-IREMS for each country and worldwide. In comparison with the conventional PCA, this method has an attractive property: it secures the interpretability of the principal components by assigning zero component loadings to unimportant input variables.
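The second stage can be illustrated with a minimal sketch. It is not the estimator used in the paper, whose sparse PCA implementation is not specified here: the code swaps in a simple soft-thresholded power iteration, a common heuristic for obtaining a sparse leading loading vector. The function names, the `penalty` value and the synthetic input series are all assumptions made for illustration.

```python
import math

def standardize(series):
    """Rescale one series to zero mean and unit (population) standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return [(v - mean) / std for v in series]

def soft_threshold(value, penalty):
    """Shrink a value toward zero; anything with |value| <= penalty becomes zero."""
    return math.copysign(max(abs(value) - penalty, 0.0), value)

def sparse_first_component(columns, penalty=0.3, n_iter=200):
    """Sparse leading loadings via soft-thresholded power iteration (a heuristic).

    columns: list of standardized series, one list per search term.
    Returns (loadings, scores); loadings have unit Euclidean norm.
    """
    p, n = len(columns), len(columns[0])
    # Sample correlation matrix of the standardized inputs.
    corr = [[sum(a * b for a, b in zip(columns[i], columns[j])) / n
             for j in range(p)] for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p  # dense starting vector
    for _ in range(n_iter):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        w = [soft_threshold(x, penalty) for x in w]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("penalty too large: all loadings were shrunk to zero")
        v = [x / norm for x in w]
    # Component scores: the candidate indicator, one value per period.
    scores = [sum(v[i] * columns[i][t] for i in range(p)) for t in range(n)]
    return v, scores

# Synthetic stand-ins for search-intensity series (NOT real Google Trends data).
N = 240

def wave(cycles):
    return [math.sin(2 * math.pi * cycles * t / N) for t in range(N)]

common = wave(10)  # hypothetical common "stress" driver
raw = [[c + 0.5 * e for c, e in zip(common, wave(k))] for k in (20, 24, 30)]
raw += [wave(40), wave(48)]  # two series unrelated to the common driver
columns = [standardize(s) for s in raw]
loadings, scores = sparse_first_component(columns)
```

In this toy setting the two unrelated series receive exactly zero loadings, while the three series sharing the common driver carry equal weights. This mirrors the interpretability argument made above: only the retained terms define the indicator.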

Although the sentiment-based indicators we propose are not the only sentiment indicators which have been elaborated, they are based on the uniform methodology for all the sample countries and globally. This is an advantageous feature in comparison with the alternative sentiment measures typically provided by major international consultancies engaged in real estate research, e.g. Knight Frank. These competing measures are not necessarily comparable across countries and may be publicly unavailable as historical datasets.

After deriving the global and country-level SB-IREMS, we test how our sentiment-based indicators relate to the general economic policy uncertainty in the sample countries and globally. To this end, we employ the newspaper-based economic policy uncertainty (EPU) indices proposed by Baker et al. (2016) and run conventional and nonparametric Granger causality tests based on bi-variate vector autoregressions (VARs). In carrying out this empirical exercise, we aim to find out whether the SB-IREMS contain any useful information not captured by the EPU measures. Testing for potential linkages between the EPU indices and our sentiment-based indicators matters, as shifts in sentiment, investment sentiment in particular, are reported to serve as a channel through which economic policy uncertainty spills over to the broad economy, e.g. Zhang (2019). Therefore, in order to make sure that the SB-IREMS are not dominated by the EPU indices, we expect our indicators to follow one of three lead-lag patterns: (i) to lead the EPU indices, (ii) to be bi-directionally linked with them or (iii) to be fully independent of them.

3.2 Real estate prices and systemic risk data

In our study, we use nominal real estate price indices taken from the OECD statistical database for country-level analyses. In order to test for the relationship among the SB-IREMS, real estate prices and systemic risk on the global level, we use the residential property prices dataset of the Basel-based Bank for International Settlements (BIS). As for systemic risk, i.e. a threat which impairs the functioning of the entire financial system with adverse repercussions for real economic activity, we exploit the most widespread measure available internationally, the conditional capital shortfall (SRISK) introduced by Brownlees and Engle (2017). This measure estimates the aggregate shortage of equity on the national level in case of a severe world stock market shock and can be represented as follows:

$$\begin{aligned} SRISK_{it} & = kD_{it} - \left( {1 - k} \right)W_{it} \left( {1 - LRMES_{it} } \right)\\ & = W_{it} \left[ {kLVG_{it} + \left( {1 - k} \right)LRMES_{it} - 1} \right],\end{aligned} $$
(1)

where \(W_{it}\) is the market value of equity, \(D_{it}\) is the book value of debt, and k denotes the prudential capital adequacy ratio. \(LRMES_{it}\), the long-run marginal expected shortfall, measures the sensitivity of the financial institution’s equity value to a severe market decline, while \(LVG_{it}\) is the so-called quasi-leverage ratio, i.e. \((D_{it} + W_{it} )/W_{it}\). In line with Brownlees and Engle (2017), k is set to 8%, while the severe market decline implies a 40-percent semi-annual shrinkage in global stock market indices, e.g. the MSCI World Index. Based on Eq. (1), a bigger size of financial institutions, higher values of LRMES and higher leverage entail an increase in SRISK. The SRISK series are available for individual countries and on the global level via the Volatility Laboratory (V-Lab) at the New York University Stern School of Business. Thus, the SRISK measure integrates balance sheet and high frequency market data to capture a multi-faceted threat to financial stability. This feature makes SRISK more comprehensive compared to stock market volatility indices and alternative systemic risk measures, e.g. delta conditional value-at-risk (∆CoVaR) by Adrian and Brunnermeier (2016) or marginal expected shortfall (MES) by Acharya et al. (2017). The extant literature finds that SRISK also tends to be more informative than the above-mentioned measures, e.g. Dissem and Lobez (2020), Salim and Daly (2021).
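As a quick numerical check of Eq. (1), the following sketch evaluates both forms of the formula for a single hypothetical institution. The balance sheet figures and the LRMES value are invented for illustration and are not taken from V-Lab or from the paper; only k = 8% follows the text.

```python
def srisk_balance_sheet_form(k, debt, equity, lrmes):
    """First form of Eq. (1): k*D - (1 - k)*W*(1 - LRMES)."""
    return k * debt - (1 - k) * equity * (1 - lrmes)

def srisk_leverage_form(k, debt, equity, lrmes):
    """Second form of Eq. (1): W*[k*LVG + (1 - k)*LRMES - 1], with LVG = (D + W)/W."""
    lvg = (debt + equity) / equity
    return equity * (k * lvg + (1 - k) * lrmes - 1)

# Hypothetical institution: debt of 900, equity of 100, LRMES of 0.5.
K = 0.08
srisk_a = srisk_balance_sheet_form(K, debt=900.0, equity=100.0, lrmes=0.5)
srisk_b = srisk_leverage_form(K, debt=900.0, equity=100.0, lrmes=0.5)

# A higher LRMES, holding the balance sheet fixed, raises the capital shortfall.
srisk_stressed = srisk_balance_sheet_form(K, debt=900.0, equity=100.0, lrmes=0.6)
```

Both forms give a shortfall of 26 for these inputs (72 of required capital on debt minus 46 of equity remaining after the shock, net of the capital requirement on it), which illustrates that the two lines of Eq. (1) are algebraically identical.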

The descriptive statistics for the SB-IREMS, real estate prices and SRISK are reported in Table 5 of the Appendix.

3.3 Mixed frequency Granger causality

The SB-IREMS and SRISK are monthly measures, whereas real estate prices, both country-level and global ones, are available on a quarterly basis. Time series sampled at different frequencies are usually aggregated to the common lowest frequency to test for relationships among them by means of Granger causality tests, impulse response functions, etc. However, this approach often suffers from spuriously hidden or generated causal effects. In order to overcome this issue, this study builds on Granger (no) causality tests derived from the mixed frequency VAR models, as proposed by Ghysels et al. (2016). This approach is efficient for small differences in sampling frequencies, e.g. quarterly/yearly or monthly/quarterly mixtures, which is precisely our case.

The advantage of the Ghysels et al. (2016) specification compared to other approaches to handling mixed frequency data, e.g. state space or factor models, is that it does not depend on latent processes or hidden data. It is driven entirely by observed data and, thus, is free of the difficulties related to equation specifications, filtering, and other parameter-dependent exercises. Moreover, mixed frequency VARs have been shown to be more robust to misspecification than state-space models.

In our case, the mixed frequency VAR is presented in the following way:

$$ \begin{bmatrix} \text{SB-IREMS}_{1t} \\ \text{SB-IREMS}_{2t} \\ \text{SB-IREMS}_{3t} \\ SRISK_{1t} \\ SRISK_{2t} \\ SRISK_{3t} \\ RE\_INDEX_{t} \end{bmatrix} = \sum_{k = 1}^{4} \begin{bmatrix} a_{11,k} & \cdots & a_{17,k} \\ \vdots & \ddots & \vdots \\ a_{71,k} & \cdots & a_{77,k} \end{bmatrix} \begin{bmatrix} \text{SB-IREMS}_{1,t - k} \\ \text{SB-IREMS}_{2,t - k} \\ \text{SB-IREMS}_{3,t - k} \\ SRISK_{1,t - k} \\ SRISK_{2,t - k} \\ SRISK_{3,t - k} \\ RE\_INDEX_{t - k} \end{bmatrix} + \begin{bmatrix} \epsilon_{1t} \\ \epsilon_{2t} \\ \epsilon_{3t} \\ \epsilon_{4t} \\ \epsilon_{5t} \\ \epsilon_{6t} \\ \epsilon_{7t} \end{bmatrix}, $$
(2)

where \(\text{SB-IREMS}_{jt}\) and \(SRISK_{jt}\) denote the sentiment-based indicator of real estate market stress and the SRISK index, respectively, in the j-th month of quarter t, where \(j \in \left\{ {1,2,3} \right\}\). Thus, if t represents the first quarter of 2004, then \(\text{SB-IREMS}_{1t}\) denotes the value of the index in January 2004. \(RE\_INDEX_{t}\) refers to real estate prices in quarter t.
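The stacking behind Eq. (2) can be sketched as follows. This is an illustrative helper rather than the authors' code; the function name and the toy numbers are assumptions. It turns two monthly series and one quarterly series into one seven-element vector per quarter, in the same order as the left-hand side of Eq. (2).

```python
def stack_mixed_frequency(monthly_sentiment, monthly_srisk, quarterly_prices):
    """Build one 7-element vector per quarter:
    [sentiment m1, m2, m3, SRISK m1, m2, m3, quarterly price index]."""
    n_quarters = len(quarterly_prices)
    if len(monthly_sentiment) != 3 * n_quarters or len(monthly_srisk) != 3 * n_quarters:
        raise ValueError("each monthly series needs exactly 3 observations per quarter")
    stacked = []
    for q, price in enumerate(quarterly_prices):
        months = slice(3 * q, 3 * q + 3)  # the three months of quarter q
        stacked.append(
            list(monthly_sentiment[months]) + list(monthly_srisk[months]) + [price]
        )
    return stacked

# Toy data for two quarters (six months); values are placeholders, not real data.
sentiment = [1, 2, 3, 4, 5, 6]
srisk = [10, 20, 30, 40, 50, 60]
prices = [100, 101]
X = stack_mixed_frequency(sentiment, srisk, prices)
# X[0] holds January-March values followed by the first-quarter price index.
```

Once the data are arranged this way, each row can be treated as one observation of a seven-variable system sampled at the quarterly frequency.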

We can rewrite Eq. (2) in the matrix form:

$$ X_{t} = \mathop \sum \limits_{k = 1}^{4} A_{k} X_{t - k} + \epsilon_{t} , $$
(3)

where \(X_{t}\) stacks the seven variables on the left-hand side of Eq. (2), \(A_{k}\) is the 7 × 7 coefficient matrix at lag k, and \(\epsilon_{t}\) is the vector of error terms. In our case, the optimal lag order was set to 1, selected on the basis of the Bayesian information criterion (BIC) with the maximum lag length set to k = 4 months.
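For reference, a standard textbook form of this criterion for a VAR with K variables and lag order p is sketched below. The exact variant and sample-size convention used in the estimation are not stated in the text, so the expression should be read as an assumption; \(\hat{\Sigma}_{\epsilon}(p)\) denotes the estimated residual covariance matrix of the VAR(p), T the number of low-frequency observations, and K = 7 in the present setting.

$$ BIC(p) = \ln \left| \hat{\Sigma}_{\epsilon}(p) \right| + \frac{\ln T}{T}\, p K^{2}, \qquad \hat{p} = \mathop{\arg\min}\limits_{1 \le p \le 4} BIC(p). $$

The first term rewards fit, while the second penalizes the \(pK^{2}\) autoregressive coefficients. With 49 coefficients added per lag, the penalty grows quickly, which is in line with a parsimonious lag order being selected.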

When estimating a mixed frequency VAR, we stack all high frequency variables in a vector. This allows for a heterogeneous impact of high frequency variables on the low frequency ones within each low frequency time period. Equation (4), which derives the impact of the high frequency SB-IREMS and SRISK on real estate prices in our case, illustrates that the corresponding coefficients (\(a_{71} , \ldots ,a_{76}\)) can take on different values.

$$\begin{aligned} RE\_INDEX_{t} & = \sum_{k = 1}^{4} \left[ \sum_{j = 1}^{3} a_{7j,k} \, \text{SB-IREMS}_{j,t - k}\right. \\ &\quad \left. + \sum_{j = 1}^{3} a_{7(j+3),k} \, SRISK_{j,t - k} + a_{77,k} \, RE\_INDEX_{t - k} \right] + \epsilon_{7t}\end{aligned} $$
(4)

This representation is a major advantage of the mixed frequency specification over a conventional quarterly VAR, which imposes homogeneous effects of high frequency variables on the low frequency ones, ruling out seasonal effects and lagged information transmission within each quarter. In asymptotic terms, a mixed frequency VAR can be treated as a classical VAR with seven variables. This means that under certain assumptions ensuring VAR stationarity, OLS estimates of the coefficients are consistent and unbiased.
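The homogeneity restriction can be made explicit with a short derivation. Suppose, as an illustrative assumption, that the monthly series are aggregated to quarters by simple averaging and that a single lag is used. The symbols \(b_{s}\), \(b_{r}\), \(b_{p}\), \(u_{t}\), \(\bar{S}\) and \(\bar{R}\) are introduced here only for this illustration. The price equation of the quarterly VAR then reads

$$ RE\_INDEX_{t} = b_{s} \bar{S}_{t-1} + b_{r} \bar{R}_{t-1} + b_{p} RE\_INDEX_{t-1} + u_{t}, \qquad \bar{S}_{t-1} = \frac{1}{3}\sum_{j=1}^{3} \text{SB-IREMS}_{j,t-1}, \quad \bar{R}_{t-1} = \frac{1}{3}\sum_{j=1}^{3} SRISK_{j,t-1}. $$

Matching terms with Eq. (4) at k = 1 shows that this is the mixed frequency price equation under the restrictions

$$ a_{71,1} = a_{72,1} = a_{73,1} = \frac{b_{s}}{3}, \qquad a_{74,1} = a_{75,1} = a_{76,1} = \frac{b_{r}}{3}, \qquad a_{77,1} = b_{p}. $$

In other words, the aggregated model forces each month of the quarter to carry the same weight, whereas the mixed frequency VAR leaves the six coefficients free.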

After estimating mixed frequency VARs on the global level and for the USA, the UK, Australia, India and Canada, we perform Granger causality tests and impulse response analysis. In doing so, we adopt the Cholesky identification scheme with the following variable ordering: \(\text{SB-IREMS}_{1} \to \text{SB-IREMS}_{2} \to \text{SB-IREMS}_{3} \to RE\_INDEX \to SRISK_{1} \to SRISK_{2} \to SRISK_{3}\).

Although the use of the Cholesky identification often raises questions in a traditional VAR framework, Ghysels et al. (2016) show that in a mixed-frequency setting it appears more appropriate, yielding robust results. As suggested by Ghysels et al. (2016), we set the number of forecast horizons equal to 5 quarters to derive the causality test statistics and corresponding p-values. In our analysis, we consider causal relationships significant at least at the 5% level.

3.4 Global and country-level SB-IREMS

Based on the methodology presented in Sect. 3.1, we derive the country-level SB-IREMS and find that they are heterogeneous in terms of the dynamics during the observation period. The figures below indicate that the SB-IREMS for the USA and Canada exhibit a major surge associated with the outbreak of the global financial crisis (GFC). The same pattern applies to the global SB-IREMS. In the meantime, real estate market stress sentiment in other countries appears either insensitive to the GFC or demonstrates a more moderate response (Fig. 1).

Fig. 1 Dynamics of global and country-level SB-IREMS

Interestingly, we find notable spikes in this indicator for the USA, the UK, Australia and India before the GFC, namely in 2004. They are attributable to the regime shift from stable conditions to bubbly dynamics in the corresponding real estate markets. For instance, Tan and Cheong (2016) find that in the case of the USA such a shift, proxied by an abrupt rise in the share of subprime mortgages in the aggregate mortgage loan portfolio, occurred in the fourth quarter of 2003. The bubbly state of the US market lasted for nearly four years before the crisis eventually broke out. Against this backdrop, it is not accidental that US internet users first intensified the searches related to real estate market instability in early 2004, after which search intensity temporarily decreased and then resumed growth in the run-up to the GFC. A similar pattern of real estate market stress sentiment is observed for the UK, where the pre-GFC bubble formation took place during 1997–2004 (Agnello and Schuknecht 2011). In the case of Australia, the extant studies concur in identifying the starting point of the bubbly dynamics between late 2002 and early 2004 (Baur and Heaney 2017; Shi et al. 2016). However, unlike in the US and British markets, the Australian bubble did not burst, but dissipated as a result of preventive macroprudential measures. In a similar vein, signs of bubbly dynamics around 2004 are reported for India (Mahalik and Mallick 2011).

The correlations among the country-level SB-IREMS are not very high. The tightest correlation is found for the US and Canadian SB-IREMS, while, for instance, in the case of the USA and the UK this measure, albeit positive, is statistically insignificant (Table 2). Moreover, the first principal component extracted from all the country-level SB-IREMS accounts for only 42% of the variance of these indices.

Table 2 Correlation ratios among country-level SB-IREMS

Overall, there is limited commonality in real estate market stress sentiment across the major economies. This finding complements the conclusions drawn from the studies on international co-movement in real estate prices. Most of these studies acknowledge the synchronization of real estate prices across countries, but they tend to argue that its degree is not that pronounced (Miles 2017; Duca 2020; Hoesli 2020; Bhatt and Kishor 2022). In the case of price dynamics, this result stems from the presence of idiosyncratic demand- and supply-side factors, which often appear as potent as common factors, e.g. the global financial cycle, business cycle synchronization, etc. Similarly, in the case of real estate market stress sentiment, country-specific determinants play a major role. They are partly manifested through the distinctive terms and expressions which underlie the SB-IREMS in the sample countries. This suggests that Internet users perceive real estate market stress in these countries differently. For example, across all the countries only “mortgage crisis” and “foreclosure” are consistently found among the top-5 dictionary items with the highest component loadings. In the case of the global SB-IREMS, the vast majority of the dictionary items with the highest component loadings are associated with distress in mortgage lending. There are 8 such terms and expressions among the top-10 dictionary items. The same number is found for the USA and Australia. For the UK, these terms and expressions total 7 of the top-10 dictionary items. For Canada, the SB-IREMS is also driven by the items capturing mortgage market distress, but to a lesser extent: 6 items out of 10. In the case of India, the role of mortgage-related terms is the smallest among these countries: only 4 of the top-10 terms refer to this credit market segment.
The number of terms and expressions with statistically significant component loadings also differs notably across the sample countries. The largest number is 35 for the global SB-IREMS, followed by 32 for the USA, whereas for the UK it equals only 10. Detailed information about the country-specific terms and expressions is reported in the Appendix (Tables 6, 7, 8, 9, 10 and 11).Footnote 5

Then, we estimate bivariate VAR models and run conventional and nonparametric Granger causality tests for the SB-IREMS and the respective economic policy uncertainty (EPU) indices. The nonparametric tests follow Diks and Panchenko (2006) and are based on the residuals extracted from the VAR models. We document no causal relationship between the SB-IREMS and the EPU indices on the global level and for most of the sample countries. The only exception is India, for which the SB-IREMS leads the EPU index in the linear setting, but receives unidirectional feedback from the EPU index when we carry out the nonparametric Granger causality test (Table 3).
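The logic of the conventional (linear) test can be sketched as follows. The sketch compares a restricted regression of one series on its own lags with an unrestricted regression that adds the lags of the other series, and returns the resulting F-statistic. It is an illustration under stated assumptions rather than the estimation code behind Table 3: the function name, the lag length and the synthetic data are hypothetical, and the nonparametric Diks and Panchenko (2006) test is not implemented here.

```python
import numpy as np

def granger_f_stat(y, x, lags):
    """F-statistic for H0: lags of x do not help predict y given lags of y."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n_obs = len(y)
    target = y[lags:]
    # Column k holds the series lagged by k + 1 periods, aligned with target.
    y_lags = np.column_stack([y[lags - k - 1:n_obs - k - 1] for k in range(lags)])
    x_lags = np.column_stack([x[lags - k - 1:n_obs - k - 1] for k in range(lags)])
    const = np.ones((n_obs - lags, 1))
    restricted = np.hstack([const, y_lags])
    unrestricted = np.hstack([const, y_lags, x_lags])

    def rss(design):
        # Residual sum of squares from an OLS fit of target on design.
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ beta
        return float(resid @ resid)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    dof = (n_obs - lags) - unrestricted.shape[1]
    return ((rss_r - rss_u) / lags) / (rss_u / dof)

# Synthetic placeholder data (NOT the paper's data): x leads y by one month.
rng = np.random.default_rng(1)
n = 216
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * x[t - 1] + 0.1 * rng.standard_normal()

f_x_to_y = granger_f_stat(y, x, lags=2)  # large: x helps predict y
f_y_to_x = granger_f_stat(x, y, lags=2)  # small: y does not help predict x
print(f_x_to_y, f_y_to_x)
```

Under the null hypothesis the statistic is compared with an F distribution with `lags` and `dof` degrees of freedom; the p-value computation is omitted to keep the sketch dependent on NumPy only.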

Table 3 Results of conventional and nonparametric Granger (no) causality tests between the SB-IREMS and EPU indices

Overall, the causal analysis corroborates that our sentiment-based indicators of real estate market stress contain valuable information which does not boil down to the informational content embedded in the EPU indices. Thus, we can proceed by investigating how the SB-IREMS interact with real estate prices and systemic risk.

4 Interaction among SB-IREMS, SRISK and real estate prices

In light of the distinctive dynamics of the global and country-level SB-IREMS described above, it comes as no surprise that the patterns of the sentiment interaction with real estate prices and systemic risk appear largely country-specific. Table 4 reports the results of the Granger causality tests in the mixed-frequency framework. For brevity, the table only reports the p-values of the estimated test statistics.Footnote 6
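The idea behind the mixed-frequency framework can be illustrated with the data arrangement it relies on. Instead of averaging the monthly series to quarters, the three monthly observations of each quarter are stacked next to the quarterly observation, and a VAR is then fitted to the stacked quarterly vector. The sketch below shows only this stacking step under our reading of Ghysels et al. (2016); the function name and the placeholder data are hypothetical, and the estimation and testing steps are not reproduced.

```python
import numpy as np

def stack_mixed_frequency(monthly, quarterly):
    """Stack monthly and quarterly data into one quarterly-frequency array.

    monthly:   (3 * Q, K) array of K monthly series.
    quarterly: (Q,) or (Q, L) array of L quarterly series.
    Returns a (Q, 3 * K + L) array whose row q holds the three monthly
    observations of every monthly series in quarter q (month 1 first),
    followed by the quarterly observations.
    """
    monthly = np.asarray(monthly, dtype=float)
    quarterly = np.asarray(quarterly, dtype=float)
    quarterly = quarterly.reshape(len(quarterly), -1)
    n_quarters = quarterly.shape[0]
    if monthly.shape[0] != 3 * n_quarters:
        raise ValueError("need exactly three monthly rows per quarter")
    # Row q becomes [month-1 values, month-2 values, month-3 values].
    stacked_monthly = monthly.reshape(n_quarters, -1)
    return np.hstack([stacked_monthly, quarterly])

# Placeholder data (NOT the paper's data): two monthly series standing in for
# the SB-IREMS and SRISK, one quarterly series standing in for prices,
# over 72 quarters (216 months).
monthly = np.arange(216 * 2).reshape(216, 2)
quarterly = np.arange(72)
stacked = stack_mixed_frequency(monthly, quarterly)
print(stacked.shape)
print(stacked[0])
```

With two monthly series and one quarterly series, each stacked observation has seven entries, so no within-quarter information is discarded before the causality tests are run.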

Table 4 Results of Granger causality tests based on the MF-VAR model

Overall, this methodology reveals 13 uni- or bi-directional Granger-causal relationships. Our estimations indeed indicate that real estate market stress sentiment could be an overlooked variable, mediating the relationship between real estate prices and systemic risk. In the USA, the sentiment-based index of real estate market stress and systemic risk both Granger-cause real estate prices. In the UK, the relationship runs from real estate prices to the stress sentiment indicator. Besides, the UK sentiment-based index of real estate market stress unidirectionally leads systemic risk, while the latter appears to drive real estate prices, thereby forming a vicious circle. A distinctive loop pattern exists for Australia, where the SB-IREMS Granger-causes real estate prices. Its effect is reinforced by SRISK, which itself affects prices and fuels the SB-IREMS. In the case of India, the SB-IREMS influences both real estate prices and systemic risk, amplifying the linkage between them. In Canada, no relationship involving the SB-IREMS is found. On the global level, the relationship between real estate market stress sentiment and prices runs in both directions.Footnote 7

Our findings reveal that in most of the sample countries as well as on the global level there is every reason for financial regulators, banks and real estate companies to monitor stress sentiment regarding real estate markets, as it exerts a statistically significant causal effect on real estate prices and/or systemic risk. Therefore, policy makers should incorporate sentiment measures capturing the degree of stress in the real estate market in their macroprudential policies. Against this backdrop, actions by the regulators and market players aimed at mitigating negative sentiment regarding real estate markets are also encouraged. These actions should help curb the spread of real estate market rumours and forestall excessive interconnectedness between the major real estate market players and the national systemically important financial institutions, while promoting a sufficient level of overall market transparency. In the same vein, in order to anchor market sentiment, regulators are advised to better motivate new policy measures in the field of real estate and explain their goals in plain terms to the general public. Such actions are of paramount importance due to the cross-over effect of adverse sentiment across various asset classes, e.g. when negative sentiment about real estate markets affects sentiment in the stock market (Das et al. 2020). In addition, there is empirical evidence suggesting that adverse shifts in real estate market sentiment are often not preceded by any changes in aggregate fundamentals, e.g. Abildgren et al. (2018), Anastasiou et al. (2021).

Our findings emphasize a particular need for the suggested actions in the case of the UK, where a vicious circle embracing sentiment, real estate prices and systemic risk exists. This assertion meshes well with the study by Pavlidis et al. (2021), which emphasizes the role of real estate prices in the build-up of systemic risk in the UK. It also accords with the general warning that real estate market stress is conducive to systemic risk in advanced economies (Chiang and Chen 2022). Thus, the measures which seek to ameliorate negative sentiment in this market are also relevant, though to a differing extent, for the rest of the sample countries. In a broader context, our findings support the studies which investigate the impact of sentiment on financial sector performance and urge regulators to account for shifts in sentiment when designing and implementing regulatory measures, e.g. Anastasiou and Drakos (2021), Anastasiou et al. (2022).

5 Conclusion

This paper introduces a novel sentiment-based indicator capturing stress in the real estate market (SB-IREMS). The novelty of the indicator lies in the synthesis of textual analysis and Google Trends data. The new indicator is computed for countries with sizeable real estate markets (the USA, the UK, Canada, Australia and India) for the period between January 2004 and December 2021. In addition, we derive the global SB-IREMS. These indicators contain valuable information about real estate market stress in the sample countries and globally, which is not captured by general economic policy uncertainty measures for the respective countries. We show that the SB-IREMS is an important omitted variable mediating the relationship between real estate prices and the build-up of systemic risk, both at the country level and worldwide.

During the observation period the dynamics of the global and country-level SB-IREMS differ substantially. This mainly stems from the different number of terms and expressions underlying the indices as well as their distinctive informational content. The largest number of terms and expressions is 35 for the global SB-IREMS, followed by 32 for the USA, whereas for the UK it equals only 10. Although most of the terms and expressions refer to distress in mortgage lending in the case of the USA, the UK, Australia and globally, the SB-IREMS for Canada and India are less dependent on the terms and expressions gauging sentiment in this segment.

The patterns of sentiment interaction with real estate prices and systemic risk appear largely country-specific. In the case of the UK, there is evidence of a vicious circle embracing real estate market stress sentiment, prices and systemic risk: the UK SB-IREMS unidirectionally leads systemic risk, the latter impacts real estate prices, whereas the prices drive the SB-IREMS. In Australia and India, the SB-IREMS also mediates the relationship between real estate prices and systemic risk. Canada is the only country in our sample for which the SB-IREMS is unrelated to real estate prices and systemic risk. On the global scale, there is only a bi-directional linkage between the SB-IREMS and prices, while systemic risk remains unaffected. All in all, the empirical findings indicate that real estate market stress sentiment needs to be taken into account by regulators conducting macroprudential policy and by real estate market participants seeking to maintain their resilience in the face of increasing systemic risk.

A natural extension of our research would be to construct sentiment-based indicators of real estate market stress, based on the synthesis of textual analysis and Google Trends data, for countries with deep real estate markets where English is neither an official nor a co-official language.