1 Introduction

Repeated surveys about opinions, perceptions, or attitudes of the interviewees are regularly carried out by national statistical offices. Elementary data are usually not available because individuals are randomly selected each time, and only the aggregate frequency distributions of opinions are published. This is the case of the surveys concerning the qualitative assessment or anticipations on the price level that ISTAT carries out every month.

Measuring the public’s inflation expectations and perceptions of inflation is of great importance for monetary authorities because both expectations and perceptions are key determinants of actual inflation. For this reason, numerous studies have focused either on quantifying the observed opinion data in order to derive indices of perceived (or expected) inflation or on searching for explanatory models that describe the data in terms of economic variables [1, 13]. In this article, we discuss an innovative model for ordinal time series data that extends the well-established CUB model to allow for time-varying parameters. The paper is organized as follows. First, we briefly recall the main features of the CUB model. Then, we extend the formulation so that time-varying parameters are allowed. Finally, for illustrative purposes, the method is applied to the time series of consumers’ perceptions of inflation in Italy.

2 The Static CUB Model

A class of mixture distributions for ordinal data, denoted as the CUB model, has been widely investigated in the past decade and has proved useful in numerous empirical studies (see, among others, [9,10,11]). In particular, ratings are described by a random variable Y characterized by the following probability mass function:

$$\begin{aligned} p(y; \theta )=\pi \left( \begin{array}{c} m-1\\ y-1 \\ \end{array} \right) (1-\xi )^{y-1} \xi ^{m-y}+(1-\pi )\frac{1}{m}, \,\,\,\,\,\,\,\,\,\,\, y=1,2,...,m \end{aligned}$$
(1)

where \({\theta }=(\pi , \xi )'\), \(\xi \in [0,1]\), \(\pi \in (0,1]\) and \(m>3\). Hence, the parameter space is given by:

$$\begin{aligned} \varOmega (\theta )=\varOmega (\pi ,\xi )= \{(\pi ,\xi ): \quad 0< \pi \le 1, 0 \le \xi \le 1\}. \end{aligned}$$
(2)

The weight \(\pi \) is attached to the shifted Binomial component in (1), so \((1-\pi )\) is the contribution of the uniform distribution to the mixture and is interpreted as a measure of the rater’s indecision in using the available rating scale. This component has been denoted as uncertainty. The parameter \(\xi \), in turn, characterizes the shifted Binomial distribution and is related to the rater’s perception of the item content. For this reason, it has been denoted as feeling. Specifically, \((1-\xi )\) denotes the degree of liking/disliking expressed by raters about the item. Assuming that the question is expressed with positive wording and that the lowest score is attached to the worst judgement, when \((1-\xi )>0.5\) the skewness of the distribution is negative, so that a large portion of individuals attach a high rating to the item under evaluation. The opposite holds when \((1-\xi ) <0.5\).
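As an illustration of (1), the mixture can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function name `cub_pmf` and the parameter values are illustrative assumptions, not estimates from this paper.

```python
from math import comb

def cub_pmf(m, pi, xi):
    """Return [p(1), ..., p(m)] for the CUB mixture in Eq. (1)."""
    if not (0 < pi <= 1 and 0 <= xi <= 1):
        raise ValueError("need 0 < pi <= 1 and 0 <= xi <= 1")
    probs = []
    for y in range(1, m + 1):
        # Shifted Binomial component with 'success' probability (1 - xi).
        shifted_binomial = comb(m - 1, y - 1) * (1 - xi) ** (y - 1) * xi ** (m - y)
        # Mixture: weight pi on the shifted Binomial, (1 - pi) on the uniform.
        probs.append(pi * shifted_binomial + (1 - pi) / m)
    return probs

# Illustrative values only: (1 - xi) = 0.8 > 0.5, so high ratings dominate.
p = cub_pmf(m=5, pi=0.8, xi=0.2)
```

With \((1-\xi )=0.8\) the probability of the highest category exceeds that of the lowest one, in line with the negative skewness described above.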

Various developments have been discussed in the literature. In particular, the model has been extended to account for the presence of a ‘shelter category’, in which respondents take refuge when they are unwilling to elaborate an accurate judgement [8]. In this case, the random variable Y is described by a GeCUB model such that

$$\begin{aligned} p(y; \theta )= \delta D^{c}+(1-\delta )\left[ \pi \left( \begin{array}{c} m-1\\ y-1 \\ \end{array} \right) (1-\xi )^{y-1} \xi ^{m-y}+(1-\pi )\frac{1}{m}\right] \end{aligned}$$
(3)

where \(D^{c}\) is a degenerate distribution at the ‘shelter category’ c, and \({\theta }=(\delta , \pi , \xi )'\), with \(0 \le \delta \le 1\).
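The GeCUB distribution (3) adds a point mass at the shelter category to the CUB mixture. A minimal sketch follows, with the hypothetical function name `gecub_pmf` and illustrative parameter values chosen only for the example.

```python
from math import comb

def gecub_pmf(m, delta, pi, xi, shelter):
    """Return [p(1), ..., p(m)] for the GeCUB mixture in Eq. (3).

    `shelter` is the category c (1-based) carrying the degenerate mass.
    """
    probs = []
    for y in range(1, m + 1):
        binom = comb(m - 1, y - 1) * (1 - xi) ** (y - 1) * xi ** (m - y)
        cub = pi * binom + (1 - pi) / m          # CUB part, as in Eq. (1)
        degenerate = 1.0 if y == shelter else 0.0  # D^c
        probs.append(delta * degenerate + (1 - delta) * cub)
    return probs

# Illustrative values only (not estimates from the paper).
q = gecub_pmf(m=5, delta=0.1, pi=0.8, xi=0.2, shelter=2)
```

Setting `delta=0` recovers the CUB probabilities, while `delta=1` puts all the mass on the shelter category.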

In the following section, we will introduce a dynamic version of such a model that can be useful to describe the qualitative assessment of items in repeated surveys.

3 The Dynamic Model for Ordinal Time Series

Let \(\{Y_t,\,\, t=1,...,T\}\) be a collection of random variables describing ordinal data observed at different time points. We assume that at time t, the variable \(Y_t\) is characterized by the following GeCUB distribution:

$$\begin{aligned} P(Y_t=y|\textit{I}_{t-1})= \delta _t D_{t}+(1-\delta _t)\left[ \pi _{t} \left( \begin{array}{c} m-1\\ y-1 \\ \end{array} \right) (1-\xi _{t})^{y-1} \xi _{t}^{m-y}+(1-\pi _{t})\frac{1}{m}\right] \\ \nonumber y=1,2,...,m \end{aligned}$$

with

$$\begin{aligned} \pi _{t}= \frac{1}{1+e^{-\beta _0-\beta _1 z_{t-1}-\cdots -\beta _p z_{t-p} }}; \,\,\, \xi _{t}= \frac{1}{1+e^{-\gamma _0-\gamma _1 w_{t-1}-\cdots -\gamma _s w_{t-s} }}; \,\, \\ \delta _{t}= \frac{1}{1+e^{-\alpha _0-\alpha _1 v_{t-1}-\cdots -\alpha _k v_{t-k} }}; \end{aligned}$$
(4)

where \(z_t\), \(w_t\) and \(v_t\) are explanatory variables, and \(\textit{I}_{t-1}\) is the set of information concerning these variables up to time \((t-1)\). Moreover, \(D_{t}\) is a degenerate distribution such that \(D_{t}=1\) for the shelter category and \(D_{t}=0\) for the remaining categories. Finally, \(\beta =(\beta _0,\beta _{1},...,\beta _{p})'\), \(\gamma =(\gamma _0,\gamma _{1},...,\gamma _{s})'\) and \(\alpha =(\alpha _0,\alpha _{1},...,\alpha _{k})'\) are the parameter vectors. Without loss of generality, we concentrate on the case in which each GeCUB parameter is affected by one explanatory variable at various lags, but the model can be easily extended to include several explanatory variables. Moreover, note that when the shelter effect is not present, the model collapses to the CUB formulation with time-varying parameters.
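The logistic links in (4) map a linear combination of lagged explanatory variables into the unit interval. A minimal sketch is given below; the helper name `logistic_link` and the numerical values are assumptions made for illustration.

```python
from math import exp

def logistic_link(coeffs, lagged_values):
    """Evaluate 1 / (1 + exp(-(c0 + c1*x_{t-1} + ... + cp*x_{t-p}))).

    coeffs        = (c0, c1, ..., cp)
    lagged_values = (x_{t-1}, ..., x_{t-p})
    """
    if len(coeffs) != len(lagged_values) + 1:
        raise ValueError("one intercept plus one coefficient per lag is required")
    eta = coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], lagged_values))
    # Numerically stable evaluation of the logistic function.
    if eta >= 0:
        return 1.0 / (1.0 + exp(-eta))
    e = exp(eta)
    return e / (1.0 + e)

# Illustrative coefficients and lags (p = 2), not estimates from the paper.
pi_t = logistic_link((0.5, 1.0, -0.3), (0.2, 0.4))
```

The same helper can be reused for \(\xi _t\) and \(\delta _t\) by passing \(\gamma \) or \(\alpha \) with the corresponding lagged variables.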

Let us denote with \([f_{1t}, f_{2t},...,f_{mt}]\) the relative frequencies from a random sample of n observations drawn from \(Y_t\), \(t=1,2,...,T\). The estimation of the model (4) can be performed by minimizing the sum of the Pearson’s chi-square distances between the observed relative frequencies and the GeCUB probabilities:

$$\begin{aligned} G(\theta )= n \sum _{t=1}^T \sum _{y=1}^m [f_{yt}-{p}_{yt}]^2 / {p}_{yt} \end{aligned}$$
(5)

where the notation has been simplified denoting with \(\theta =(\alpha ', \beta ', \gamma ')'\) the vector of \(r=k+p+s+3\) parameters, and \({p}_{yt}={p}_{yt}(\theta )=P(Y_t=y|\textit{I}_{t-1})\). It is well known that the minimum chi-square method yields estimates that are asymptotically equivalent to maximum likelihood estimates. In particular, they are consistent and asymptotically efficient (see [3] p. 425–6; [7] and references therein). Then, the parameter estimators are asymptotically normal with mean \(\theta \) and asymptotic variance covariance matrix \(Q^{-1}\) with \(Q=\{q_{ih}\}\):

$$\begin{aligned} q_{ih}=-\sum _{t=1}^T \sum _{y=1}^m n_{yt}{p}_{yt}(\theta ) \frac{\partial ^2 \log {p}_{yt}(\theta )}{\partial \theta _i \partial \theta _h} . \end{aligned}$$
(6)

where \(n_{yt}=n f_{yt}\) are the absolute frequencies.
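The criterion (5) is straightforward to evaluate once the model probabilities are available. The sketch below assumes that the observed relative frequencies and the GeCUB probabilities are stored as lists of rows, one row per time point; the function name `pearson_objective` is hypothetical. In practice, \(G(\theta )\) would be passed to a numerical optimizer of one’s choice, which is not shown here.

```python
def pearson_objective(n, freqs, probs):
    """Return G = n * sum_t sum_y (f_yt - p_yt)^2 / p_yt, as in Eq. (5).

    freqs[t][y-1] : observed relative frequency of category y at time t
    probs[t][y-1] : model probability p_yt(theta), assumed strictly positive
    """
    total = 0.0
    for f_t, p_t in zip(freqs, probs):
        for f, p in zip(f_t, p_t):
            total += (f - p) ** 2 / p
    return n * total

# Toy example with T = 1 and m = 2, for illustration only.
G = pearson_objective(100, [[0.5, 0.5]], [[0.25, 0.75]])
```

In this toy case the two terms are \(0.0625/0.25\) and \(0.0625/0.75\), so \(G = 100 \cdot (1/3)\).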

Finally, the goodness of fit of the model is assessed by comparing \(\widetilde{G}_{mod}=n^{-1}G(\widehat{\theta })\) with the distance \(\widetilde{G}_U= m \sum _{t=1}^T \sum _{i=1}^m [f_{it}-m^{-1}]^2 \). The latter measures the discrepancy of the observed frequencies from the uniform probabilities, which reflect a situation of pure ignorance about the phenomenon under investigation.
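The benchmark distance \(\widetilde{G}_U\) is the chi-square distance of the observed frequencies from the uniform probabilities \(1/m\). A minimal sketch follows, with the hypothetical name `uniform_distance` and toy data.

```python
def uniform_distance(freqs):
    """Return m * sum_t sum_i (f_it - 1/m)^2 for rows of relative frequencies."""
    m = len(freqs[0])
    return m * sum((f - 1.0 / m) ** 2 for f_t in freqs for f in f_t)

# Toy example (T = 1, m = 5): all answers fall in the first category.
g_u = uniform_distance([[1.0, 0.0, 0.0, 0.0, 0.0]])
```

A fitted model is informative when its distance \(\widetilde{G}_{mod}\) is much smaller than this benchmark.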

Model (4) can be used for various purposes. First, the dynamic pattern of the estimated parameters helps to detect how the ordinal distributions change over time. In our opinion, this characterization is more informative than the study of the time series of a single summary statistic (for example, the mean) of the empirical distribution observed at time t. Second, the model is useful for predicting the probability distribution of \(Y_{T+k}\) using the past realizations (or predictions) of the explanatory variables. Finally, by analogy with the static model, the pattern of the estimated time-varying parameters can be exploited to compare the dynamics of various ordinal time series.

4 A Case Study: Consumer Inflation Perceptions

Consumers’ qualitative opinions about the development of inflation are regularly surveyed by ISTAT within the harmonized European programme of business and consumer surveys. Specifically, in Italy, a sample of about 2000 consumers is interviewed every month about their perceptions of past inflation and their expectations about the future. The first variable, \(Y_t\), originates from question Q5: ‘How do you think that consumer prices have developed over the last 12 months? They have: risen a lot; risen moderately; risen slightly; stayed about the same; fallen’. The second one, \(Z_t\), refers to question Q6: ‘By comparison with the past 12 months, how do you expect consumer prices will develop in the next 12 months? They will: increase more rapidly; increase at the same rate; increase at a slower rate; stay about the same; fall’. Only the frequency distribution of the opinion categories is published monthly. In this section, we analyze data ranging from 1994.01 to 2018.1. A preliminary study of this data set has been presented by [4]. Here, the observed categories have been recoded so that 1 is associated with the category ‘fallen/fall’ and 5 with the category ‘risen a lot/increase more rapidly’. This scale is reversed with respect to the one widely used in the economic literature.

Fig. 1 Examples of observed frequency distribution of \(Y_t\) for selected time points

The shape of the distribution of the ordinal variable \(Y_t\) at each time point may vary depending on the economic situation, as Fig. 1 shows. For this reason, the perceived change in inflation is usually evaluated by the balance statistic \(B(t) = -f_{1t} -0.5 f_{2t} + 0.5 f_{4t} +f_{5t}\). This measure is often compared graphically with the actual inflation rate. In this regard, it is worth recalling that the link between inflation perceptions and actual inflation had been quite strong before 2002, but this co-movement disappeared after the Euro cash changeover in 2002 in all EU countries [2]. In Italy, this gap was exceptionally large and persistent, and a similar divergent pattern also affected the balance statistics of perceived and expected inflation [5]. Only towards the end of 2007 did perceptions and expectations start to move together again, even though the gap began to narrow only after the 2008 global economic crisis (Fig. 2).
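The balance statistic is a fixed linear combination of the five relative frequencies, with the central category receiving zero weight. A minimal sketch, using the recoded scale of this section (1 = ‘fallen’, 5 = ‘risen a lot’) and the hypothetical function name `balance_statistic`:

```python
def balance_statistic(freqs):
    """Return B = -f1 - 0.5*f2 + 0.5*f4 + f5 for five relative frequencies."""
    f1, f2, _f3, f4, f5 = freqs  # the third category has weight zero
    return -f1 - 0.5 * f2 + 0.5 * f4 + f5

# Toy distribution, for illustration only.
b = balance_statistic([0.0, 0.0, 0.0, 0.0, 1.0])
```

The statistic ranges from \(-1\) (all respondents in category 1) to \(+1\) (all respondents in category 5).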

Fig. 2 Balance statistic of perceived (solid line) and expected (dashes) inflation

We have applied model (4) to describe the dynamics of the ordinal data originating from the question concerning the perception of past price development. A conceptual framework of the process generating consumers’ opinions about inflation has been illustrated by [12]. The socio-economic environment, the amplification due to the media, and personal characteristics (gender, personal income, level of education) are all important drivers. In addition, perceptions are strictly related to expectations. This holds not only from the present to the future: expectations about the price trend, formed at some previous time, may in some cases bias the perceptions of the current situation [6, 14]. Starting from these considerations, we have specified the dynamics of the GeCUB coefficients as follows:

$$\begin{aligned} P(Y_t=y|\textit{I}_{t-1})= \delta _t D_{t}+(1-\delta _t)\left[ \pi _{t} \left( \begin{array}{c} m-1\\ y-1 \\ \end{array} \right) (1-\xi _{t})^{y-1} \xi _{t}^{m-y}+(1-\pi _{t})\frac{1}{m}\right] , \\ \nonumber y=1,2,...,5. \end{aligned}$$
$$\begin{aligned} \xi _{t}= \frac{1}{1+e^{-\gamma _0-\gamma _1 \overline{y}_{t-1} }}; \,\,\pi _{t}= \frac{1}{1+e^{-\beta _0-\beta _1 \overline{z}_{t-1}}}; \,\,\, \delta _{t}= \frac{1}{1+e^{-\alpha _0-\alpha _1 v_{t-1}}}; \end{aligned}$$
(7)

where, for any t:

  • the parameter \(\xi _t\) depends on \(\overline{y}_{t-1}\), the mean of the perceptions of the past price trend (this is simply the mean of the observed ratings) at time \(t-1\);

  • the parameter \(\pi _t\) depends on \(\overline{z}_{t-1}\), the mean of the expectations about future price level at time \(t-1\);

  • \(D_{t}=1\) for the category: ‘stayed about the same’, and 0 otherwise. The corresponding coefficient \(\delta _t\) depends on \(v_{t-1}=\overline{y}_{t-1}-\overline{z}_{t-1}\), the gap between price trend perceptions and future trend expectations at time \((t-1)\). When this gap is small, the perception that prices stayed about the same becomes stronger.
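The specification (7) can be sketched as follows: the lagged mean ratings are computed from the published frequency distributions and then passed through the logistic links. All function names and coefficient values are illustrative assumptions; they are not the estimates reported in Table 1.

```python
from math import exp

def mean_rating(freqs):
    """Mean of the ratings 1..m given their relative frequencies."""
    return sum(y * f for y, f in enumerate(freqs, start=1))

def logistic(eta):
    """Numerically stable logistic function."""
    if eta >= 0:
        return 1.0 / (1.0 + exp(-eta))
    e = exp(eta)
    return e / (1.0 + e)

def gecub_parameters(alpha, beta, gamma, perceived_lag, expected_lag):
    """Return (delta_t, pi_t, xi_t) as in Eq. (7).

    perceived_lag : frequencies of Y at time t-1 (question Q5)
    expected_lag  : frequencies of Z at time t-1 (question Q6)
    alpha, beta, gamma : pairs (intercept, slope)
    """
    y_bar = mean_rating(perceived_lag)
    z_bar = mean_rating(expected_lag)
    v = y_bar - z_bar  # gap between perceptions and expectations
    xi_t = logistic(gamma[0] + gamma[1] * y_bar)
    pi_t = logistic(beta[0] + beta[1] * z_bar)
    delta_t = logistic(alpha[0] + alpha[1] * v)
    return delta_t, pi_t, xi_t

# Hypothetical coefficients and toy frequency distributions.
params = gecub_parameters((-2.0, -1.0), (1.0, 0.2), (0.5, -0.4),
                          [0.05, 0.15, 0.30, 0.35, 0.15],
                          [0.05, 0.25, 0.30, 0.30, 0.10])
```

The resulting triple can then be plugged into the GeCUB probabilities with the shelter at the category ‘stayed about the same’.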

Table 1 Estimation results (standard errors in parentheses)

Table 1 reports the estimated coefficients of the model with their standard errors in parentheses. Computations have been carried out using the programming system GAUSS (Aptech Systems, Inc.). The overall fit of the model is satisfactory, as shown by the remarkable reduction of the discrepancy between the observed and fitted distributions. The time plot of \((1-\hat{\xi }_t)\) helps to detect the main characteristics of the distributions of the ordinal variable \(Y_t\) (see Fig. 3, panel a). From 1994 to the beginning of 2014, \((1-\hat{\xi }_t)>0.5\). This implies that most of the estimated ordinal distributions are left skewed because consumers tend to state that prices have increased in the last twelve months. High values of \((1-\hat{\xi }_t)\) are reached after the Euro cash changeover. Other remarkable fluctuations can be recognized between 2010 and 2013, when various international and national political crises affected financial indicators (such as the increase of the spread between the 10-year BTP and the German Bund), feeding the uncertainty of consumers about the economy. Only at the beginning of 2014 did the time series drop below 0.5 and start to fluctuate around that value.

Fig. 3 Time-varying coefficients: a \((1-\hat{\xi }_t)\); b \(\hat{\delta }_t\) (solid line); \((1-\hat{\pi }_t)(1-\hat{\delta }_t)\) (short dashed)

The patterns of the weight of the shelter category, \(\hat{\delta }_t\), and of the weight of the Uniform distribution, \( (1-\hat{\pi }_t)(1-\hat{\delta }_t)\) (i.e. uncertainty), are illustrated in Fig. 3, panel b. Both components have a limited role in determining the mixture. However, after the Euro cash changeover, the two components move steadily in opposite directions: the role of ‘uncertainty’ increases, whereas the weight of the shelter (neutral) category decreases.

Fig. 4 Balance statistic from the empirical distributions (solid line) and the estimated model (dashed line)

The comparison of the observed balance statistic with that implied by the model confirms the quality of the results (Fig. 4). The two time series are very close over the whole time interval considered. In this regard, it is worth pointing out that the model is also able to reproduce the large increase that occurred in the time series with the Euro cash changeover.

5 Final Remarks

We have presented a parsimonious model for describing time series of ordinal data that exploits the features of the CUB model. The analysis of the pattern of the characterizing parameters helps to summarize the changes in the ordinal distributions over time. First, this synthesis is more informative than simple summary statistics, such as the average, for describing the dynamics of the phenomenon generating the ordinal data. Second, the model provides a useful tool for prediction and control, because the relationships that define the time-varying parameters are specified as functions of explanatory variables for which future scenarios may be elaborated.