Introduction

Mindfulness research has increased exponentially over the past two decades. Evidence for the benefits of both dispositional and cultivated mindfulness is substantial, growing, and has had a remarkable role in informing psychology (Rau and Williams 2016). Leading mindfulness researcher Jon Kabat-Zinn’s statement that mindfulness is “paying attention in a particular way: on purpose, in the present moment, nonjudgmentally” (1994, p. 4) is a widely-accepted definition of mindfulness. To understand mindfulness, researchers need to examine cognitive and psychological mechanisms involved in its function. These studies require assessment instruments which can validly and reliably measure mindfulness. Assessment of mindfulness has been largely achieved by self-report scales. The breadth and complexity of mindfulness has resulted in several measures being developed over the past 15 years. A review of mindfulness self-report measures notes that they are generally useful, and have been shown to predict downstream outcomes (Baer 2011). Significant progress in the psychometric measurement of mindfulness has been made, but while there has been considerable research evaluating the conceptual framework and psychometric properties of mindfulness measures (Rau and Williams 2016), there has been only limited research to establish the construct validity of measures or subscales (Siegling and Petrides 2016).

A fundamental limitation occurs if a measure lacks construct validity, that is, the degree to which a test can measure what it claims to measure (Cronbach and Meehl 1955). Questions have been raised about the construct validity of the Observing facet of the Five-Facet Mindfulness Questionnaire (FFMQ, Baer et al. 2006). This Observing facet will be referred to as FFMQ Observing. This facet shows unexpected relationships with psychological symptoms and other mindfulness facets, prompting discussion to ascertain the reason (e.g. Baer 2011). The FFMQ is one of the most widely used multidimensional mindfulness measures. It uniquely measures five different facets of mindfulness, Non-reactivity, Observing, Acting with Awareness, Describing, and Non-judging. Research has shown that these facets are relatively independent, reveal more than single scores, and their interactions are worth examination (Rau and Williams 2016), resulting in active investigation of the multidimensional nature of mindfulness (Desrosiers et al. 2013). The five facets are commonly taken to describe dispositional mindfulness (Hanley 2016) and the FFMQ is recommended as an instrument to measure it (e.g. Li et al. 2016). Thus, the FFMQ plays an important part in mindfulness research, and it would be valuable to ascertain the reasons for the anomalous function of FFMQ Observing. Further to that, it would be important to consider whether an improved group of items should represent FFMQ Observing. This could contribute to better assessment, research and interventions.

The Observing construct as defined in the development of the FFMQ refers to “observing/noticing/attending to sensations/perceptions/thoughts/feelings” (Baer et al. 2006, p. 34). Items state “I notice…,” “I pay attention to…,” or “I stay alert to….” This construct, henceforth the Observing Construct, fits into Kabat-Zinn’s conceptualization of mindfulness as “the process of observing body and mind intentionally,” moment-to-moment and nonjudgmentally (2013, p. 10). The intentional deployment of attention develops awareness, clarity, emotional balance and equanimity (Williams and Kabat-Zinn 2011). Thus, the Observing Construct defines behaviour which is a central (Bergomi et al. 2013) and essential (Lilja et al. 2013) aspect of mindfulness. Observing is therefore expected to contribute to wellbeing and correlate negatively with psychological symptoms such as anxiety, which is specifically linked to fear-based thought patterns.

FFMQ Observing shows unexpected relationships with other mindfulness facets and psychological symptoms (Bergomi et al. 2013). Studies show that FFMQ Observing often does not correlate with the other FFMQ facets. Starting with the study which developed the FFMQ (Baer et al. 2006), in which it did not correlate with FFMQ Non-judging (r = −.07), FFMQ Observing’s correlations with the other facets have been inconsistent. For instance, Siegling and Petrides (2016) found a negative correlation of FFMQ Observing with Non-judging (r = −.22, p < .01), and Petrocchi and Ottaviani (2016) reported that FFMQ Observing had no significant correlations with any of the other facets. Desrosiers et al. (2013) found higher correlations with Non-judging (r = .16, p < .01) and Acting with Awareness (r = .22, p < .01), while Brown et al. (2015) showed a large correlation with Non-reactivity (r = .49, p < .05). Other studies with inconsistent facet correlations are reported by Siegling and Petrides (2016). FFMQ Observing generally fails to show consistent convergent validity; that is, it is not reliably related to the other facets (Goldberg et al. 2016).

FFMQ Observing does not have the same negative relationship with psychological symptoms that is usually found with the other four facets (e.g. it was positively related to psychological symptoms; Baer et al. 2008). FFMQ Observing has been shown to have no relationship with depression in studies by Desrosiers et al. (2013) and Soysa and Wilcomb (2015), whereas the other facets correlated with depression negatively in the first of those studies. FFMQ Observing was found to be positively related to stress (β = .31, p < .05, Brown et al. 2015) after controlling for the other four FFMQ facets. In that particular study, the bivariate correlations showed that FFMQ Observing was negatively associated with all health outcomes. FFMQ Observing has also been positively related to anxious arousal in clinical populations (β = .31, p < .001, Curtiss and Klemanski 2014), dissociation, absentmindedness and thought suppression (Baer et al. 2006). Some researchers have excluded the poorly fitting FFMQ Observing from models (e.g. Baer et al. 2006), or have excluded all mention of this facet (Soysa and Wilcomb 2015). Siegling and Petrides (2016) referred to a problematic pattern of FFMQ Observing results and argued for exclusion of FFMQ Observing; Petrocchi and Ottaviani (2016) suggested that it does not appear to work adequately and should be re-evaluated, while Baer (2011) proposed that items may need to be modified or deleted. Inconsistencies are not limited to the FFMQ Observing items, but have also been shown in other Observing Construct items (Siegling and Petrides 2016). The Awareness subscale of the Philadelphia Mindfulness Scale (PHLMS, Cardaciotto et al. 2008), which comprises Observing Construct items, mirrors FFMQ Observing in showing inconsistent correlation with its Accept subscale, negative correlation with FFMQ Non-judging, and a poor relationship with overall mindfulness.

Mindfulness experience may affect how people interpret items. The most notable unexpected relationships found in FFMQ Observing are in studies of non-meditators. At least two of the FFMQ Observing items (FF1 and 3, see Supplementary material) were interpreted differently by meditators and non-meditators (Baer et al. 2011). Meditators’ FFMQ Observing scores consistently correlated with the other facets, resulting in five-facet models (e.g. Baer et al. 2008), whereas those for the general population have not, as above. In meditators, relationships between FFMQ Observing and psychological symptoms are more likely to be negative, for instance −.21 (p < .01) with worry, whereas non-meditators’ results showed no correlation (de Bruin et al. 2012). However, relationships between psychological symptoms and FFMQ Observing in meditators, although negative, are generally weaker than those seen with other FFMQ facets, such as Non-judging (e.g. de Bruin et al. 2012). Belzer et al. (2013) suggested that observing of present-moment experience (that is, the Observing Construct) may be limited or perceived in a different way in the mindfulness-naïve population. This suggestion was based on research showing differential interpretation of the Observing Construct according to meditation status, not only in FFMQ Observing but also in the short form of the Freiburg Mindfulness Inventory (Walach et al. 2006; Belzer et al. 2013). Nevertheless, as dispositional mindfulness exists in all of us to one degree or another (Kabat-Zinn 2003), mindful (i.e. compatible with mindfulness) observing should be measurable in non-meditating populations.

Several researchers have proposed explanations for the “counterintuitive” (Desrosiers et al. 2014, p. 35) findings in which FFMQ Observing does not relate to psychological symptoms or mindfulness measures as expected. Baer et al. (2004), noting the same findings with the similar items in the Kentucky Inventory of Mindfulness Skills (KIMS, Baer et al. 2004) proposed that for individuals with no meditation experience, attending to experiences might involve judging them. Baer et al. (2006) acknowledged the limitations of FFMQ Observing in not adequately measuring mindful noticing, and noted that the external stimuli and bodily sensations items covered within the construct differ from other facets, as well as results being affected by meditation experience. Baer et al. (2008) suggested that FFMQ Observing results could be due to maladaptive self-focused attention. Bergomi et al. (2013) proposed that without an effortless and non-judgemental attitude it is a limited way of observing, akin to exaggerated self-attention. Desrosiers et al. (2013) suggested that without the beneficial influence of “how” (as revealed by the other facets) one observes, FFMQ Observing might represent heightened attention to internal experience known as interoceptive awareness, which may exacerbate physiological symptoms of anxiety. Desrosiers et al. (2014) proposed that a tendency to observe present-moment experience risks activating worry and rumination for those without the ability to be non-reactive to their observations. Lilja et al. (2013) suggested that FFMQ Observing not working as expected may be due to self-critical ruminative self-focus. Baer (2011) proposed that the results might be due to judgemental and reactive noticing of experience. Raphiphatthana et al. (2016) suggested that the results might reflect hypervigilant responses, that is, enhanced state of senses and behaviours responding to perceived threat (Eysenck 2014). 
These explanations have implications for mindfulness-based interventions (MBIs). Desrosiers et al. (2013) argued that the correlation of FFMQ Observing with anxiety implies that the intervention components that represent FFMQ Observing (such as the body scan) may be less effective for anxious patients, because observation may exacerbate physiological symptoms of anxiety. It is therefore important that the correct explanation for the anomalous findings is found. As Grossman and Van Dam (2011) argued, anomalies in measures of mindfulness may distort the meaning of mindfulness, and by doing so, may adversely affect development of MBIs. FFMQ Observing, as a dimension of an overarching construct of mindfulness, should encompass varying experiences which may not be closely related to other facets, but it does need to represent a valid aspect of mindfulness.

Baer et al. (2006) suggested that poor fit and unexpected relationships of FFMQ Observing might be partly due to the preponderance of items relating to body-awareness and external perceptions. FFMQ Observing is heavily weighted to this content, compared to Observing items of other measures, as identified by Bergomi et al. (2013). In the PHLMS and the Cognitive and Affective Mindfulness Scale-Revised (CAMS-R, Feldman et al. 2007), observation of thoughts and feelings has equal or more items than observation of bodily experiences and external perceptions. Baer et al. (2006) hypothesised that increased emphasis on other content (including emotion) may improve fit. Observation of feelings was included in the definition of FFMQ Observing, as noted above, indicating that observing emotions was regarded as central to mindfulness. Emotions can be defined as periods of high mental activity co-occurring with feelings of pleasure or displeasure (Cabanac 2002). Evolutionary heritage has shaped human emotions, in that the rapid, preconscious route to the emotional brain which activates physiological responses to danger is separate from the conscious perception of emotions (LeDoux 1998). Thus, cognition of emotions follows after innate response (e.g. Medvedev et al. 2015). There is strong support for the role that awareness of emotions plays in psychological health. Recognizing feelings is the keystone of emotional intelligence (Goleman 1995). Mindfulness research consistently shows that paying mindful attention to emotions is positively correlated with emotional intelligence (Baer 2011). Emotion regulation refers to the process whereby individuals’ cognition influences their experience and management of emotions. Emotion regulation difficulties contribute to anxiety (Roemer et al. 2009) and to the relationship between low dispositional mindfulness and psychosocial distress (Pepping et al. 2014). Studies have shown that mindfulness facilitates adaptive emotion regulation (e.g. Monshat et al. 2013). Exposure to emotions (observing, experiencing and accepting them) is an adaptive skill (Levitt et al. 2004). Thus, mindful observing of emotions can be seen to be crucially connected to wellbeing. Observing of emotions has been included in a new multidimensional mindfulness measure, the Comprehensive Inventory of Mindfulness Experiences (CHIME, Bergomi et al. 2014).

This study explored psychometrically why FFMQ Observing is not demonstrating expected (i.e. associated with mindful observing) relationships with psychological symptoms and other mindfulness measures in samples other than meditators. It examined the construct validity and the reliability of FFMQ Observing together with Observing items from other mindfulness questionnaires. It explored whether mindful observing can be measured in a better way (i.e. not producing the reported inconsistencies). Methods and results are described below.

Method

Participants

The study recruited 219 participants. The participants were predominantly from the community (n = 164, 75%), with the remainder being university students (n = 55, 25%). The participants were predominantly female (n = 154, 70%). Participants’ ages ranged from 18 to 80 (M = 41.93, SD = 16.45). Twenty-six percent were aged 18–25, 18% were 26–40, 39% were 41–55, and 16% were aged 56 and over. Ethnicity was made up as follows: Caucasian (n = 155, 71%), Maori (n = 18, 8%), Asian (n = 18, 8%), Pacific Peoples (n = 17, 8%) and other (n = 11, 5%). The participants were predominantly non-meditators (n = 169, 77%). A chi-square test for goodness of fit (with α = 0.05) was used to assess whether non-meditators were more numerous than meditators in the sample. The chi-square test was statistically significant, χ²(1, N = 219) = 64.66, p < .001, indicating that non-meditation was reported with significantly greater frequency than meditation. Meditators had engaged in regular meditation for a mean duration of 6.35 years (SD = 6.65), and meditated on a mean of 13.75 days per month (SD = 8.75).
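The goodness-of-fit statistic reported above can be reproduced from the two group counts (169 non-meditators, 50 meditators) under the assumption of equal expected frequencies. The sketch below is illustrative only: the function name is hypothetical, and the p-value shortcut applies solely to the two-category (df = 1) case.

```python
import math

def chi_square_gof_two_groups(observed):
    """Chi-square goodness of fit for exactly two categories (df = 1),
    assuming equal expected frequencies in both categories."""
    if len(observed) != 2:
        raise ValueError("this sketch only handles two categories")
    expected = sum(observed) / 2
    stat = sum((o - expected) ** 2 / expected for o in observed)
    # For df = 1 the chi-square upper-tail probability equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Counts taken from the participant description: non-meditators, meditators.
stat, p_value = chi_square_gof_two_groups([169, 50])
print(f"chi-square(1, N = 219) = {stat:.2f}, p < .001: {p_value < 0.001}")
```

With these counts the statistic rounds to 64.66, matching the reported value.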

Procedure

The Auckland University of Technology Ethics Committee granted approval for the research. Participants were recruited at university campuses and by snowball recruitment via the lead researcher’s networks: an exercise group and friend network. Participants were asked to give 15–20 min of their time voluntarily to complete a paper questionnaire including: (a) a demographic form; (b) questions asking “Do you regularly engage in meditation/mindfulness practices?” and if so, how frequently and for how long has this practice continued; and (c) the self-report measures described below. Data-collection occurred from May until July 2016.

Measures

It was necessary for the purposes of this study to create a pool of questionnaire items to examine the construct represented by FFMQ Observing. Items identified as reflecting the operational definition of the Observing Construct were collated from various validated mindfulness measures. The existing measures below were identified by Bergomi et al. (2013) as having items which represented the Observing Construct. If items were identical in meaning, only one was chosen. The following items (see Supplementary material) remained. The eight items of the Observing facet of the FFMQ, which were producing the unexpected results on which this study is based, were included in the item pool to clarify what they measured. Two items from the KIMS were chosen, and four items from the long form of the Freiburg Mindfulness Inventory (FMI, Buchheld et al. 2001): they were all previously identified by Baer et al. (2006) as fitting the Observing Construct. The Awareness subscale of the PHLMS contributed nine items. This subscale was identified by Siegling and Petrides (2016) as examining the same construct as FFMQ Observing. Two items were chosen from the Awareness subscale of the CAMS-R.

In addition, two new items were created by the lead researcher in consultation with an experienced mindfulness teacher. These were added because it was hypothesised that the body observing items may not sufficiently discriminate between mindful observing and a possibly hypervigilant state. Together, 27 items comprised the item-pool. All scales in the questionnaire used a Likert scale response format. The FFMQ, the KIMS and the new questions use a 5-point scale (1 = Never or very rarely true, 5 = Very often or always true). The FMI uses a 4-point scale (1 = Rarely, 4 = Almost always). The PHLMS uses a 5-point scale (1 = Never, 5 = Very often). The CAMS-R uses a 4-point scale (1 = Rarely/Not at all, 4 = Almost always).

As the most-used mindfulness measure, the 15-item Mindfulness Attention Awareness Scale (MAAS, Brown and Ryan 2003, α = 0.87) was chosen to measure overall mindfulness. Worry was measured by the Penn State Worry Questionnaire (PSWQ, Meyer et al. 1990, α = 0.92) and the 10-item Perceived Stress Scale (PSS, Cohen et al. 1983, α = 0.90) was used to measure stress. Both are well-validated scales. Five items from the well-known Beck Anxiety Inventory (BAI, Beck et al. 1988, α = 0.82) were used to measure somatic anxiety (see Supplementary material). The 20-item Trait subscale of the well-validated State-Trait Anxiety Inventory (STAI, Spielberger et al. 1988, α = 0.93) was used to measure trait anxiety.

Data Analyses

Firstly, all measures were examined for missing data; less than 1% of the data were missing. Then, the internal consistency of all scales with more than three items was examined, followed by principal components analysis to investigate the underlying structure of these measures and to examine interrelationships among items, as shown in Table 1. Solutions were examined using Oblimin rotations to allow for correlation between the factors. Correlations between these individual measures were also computed.

Table 1 Descriptive statistics for the measures and results from principal components analysis (N = 219)

Next, all items measuring the Observing Construct (FFMQ Observing, FMI, PHLMS, KIMS, CAMS-R, and two new items) were combined into a single data set, and the overall internal consistency was examined. This study used item-level analysis, which is a valid method to identify and conceptualise important factors explaining variance in the data and to explore the relationships between items. To explore and clarify the relationships among the items, data were subjected to exploratory factor analysis. First, a principal components analysis of the item-pool was completed. The number of factors to rotate was based upon three criteria: the number of components with eigenvalues over one, the percentage of variance accounted for, and inspection of the scree plot. These criteria were used in all subsequent analyses. Second, a principal components analysis with Oblimin rotation of the Observing items was completed. This method of rotation was chosen as it is suitable when the goal is theoretical exploration of the underlying factor structure. Third, a principal components analysis with Varimax rotation of the items of the item-pool was completed. Varimax was chosen because the three components from the Oblimin rotation did not correlate highly with each other (that is, not above 0.3). Fourth, a three-factor principal components analysis without rotation of the items in the item pool was completed. Fifth, principal axis factoring with direct Oblimin rotation of the items in the item-pool was completed. The number of factors to rotate was based upon the same criteria as for the previous analyses. Sixth, a principal components analysis without rotation of 12 items (the four highest from each factor) was completed. Based on the results of the factor analyses, correlations of study variables were examined. A series of bivariate Pearson’s product-moment correlation coefficients (r) were calculated among study variables.
To investigate meditators’ and non-meditators’ responses, study variables were correlated separately for meditators and non-meditators. A Fisher's (1915) r-to-z transformation was performed to test for significant differences in correlation coefficients between meditators and non-meditators.
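The comparison of two independent correlations via Fisher’s r-to-z transformation can be sketched as follows. This is a minimal illustration, not the software used in the study: the function name is hypothetical and the correlation values in the example are invented for demonstration. Only the group sizes (50 meditators, 169 non-meditators) come from the sample description.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """Test whether two independent Pearson correlations differ.

    Each r is transformed with z = atanh(r); the difference is divided by
    its standard error sqrt(1/(n1 - 3) + 1/(n2 - 3)).
    Returns the z statistic and a two-tailed p-value.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_tailed

# Hypothetical correlations of one factor with one symptom measure.
z, p = fisher_z_difference(r1=-0.40, n1=50, r2=0.05, n2=169)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Because the standard error depends only on the two group sizes, the smaller meditator group dominates it; a given difference in correlations therefore needs to be fairly large to reach significance.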

Results

Data met normality assumptions, with skewness and kurtosis values within the acceptable range from −1 to +1, with the exception of the partial BAI, which was non-normally distributed, with skewness of 1.13 (SE = 0.05). Measures of central tendency and dispersion for the partial BAI (M = 2.72, SD = 3.73) indicated that a large proportion of the sample reported no somatic anxiety. All measures with more than three items were found to have good internal consistency, with Cronbach’s alpha ranging from .81 to .9 (see Table 1).

Kaiser-Meyer-Olkin (KMO) values and Bartlett’s test of sphericity were examined for all measures with three or more items, and found to be acceptable (KMO > 0.6), see Table 1. Results from principal components analyses indicated that most of the scales were unidimensional, with the exception of FFMQ Observing, as follows. Most items displayed communalities above .4, confirming that each item shared some common variance with other items. The exceptions were the PSWQ, for which 11 of the 16 communalities were above .4, and the MAAS and the PHLMS subscale, each of which had one item whose communality was above .3 rather than .4. Examination of the MAAS, PSS, PSWQ and the STAI was consistent with the literature supporting unidimensionality of these scales. Examination of the PHLMS and FMI items also indicated one dimension underlying each of the two groups of items.

Three-Factor Solution

Initially, the factorability of the 27 candidate items was examined based on the responses of the 219 participants. The sample size of 219 cases was considered to be sufficient for the EFA, providing a ratio of over eight cases per variable (MacCallum et al. 1999). The correlations among the 27 candidate items were examined. One of the new questions created for this study was “When I am startled, I notice with concern what is going on in my body.” It was designed, when reversed, to discriminate between mindful observing and hypervigilant observing of the body. However, in reverse form it had a low item-total correlation (r = −.411), so it was removed from subsequent analysis, given that a startle-reflex is adaptive for everyone, even meditators. The remaining 26 items showed high internal consistency, with an alpha coefficient of .91. Principal components analysis of these 26 items resulted in first component item-total loadings from .37 to .62. The sample was adequate (KMO = .87, p < 0.001). All items displayed communalities above .4, further confirming that each item shared some common variance with other items. Five components (with eigenvalues exceeding 1) were identified as underlying the construct. Initial eigenvalues indicated that the first three components explained 30, 12, and 7% of the variance, respectively. A three-component solution was preferred because (a) it explained 50% of the variance; (b) the eigenvalues on the scree plot “levelled off” after three factors; and (c) the fourth and subsequent factors had an insufficient number of primary loadings (the fourth had only three factor loadings over .35, and none over .50).
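In a principal components analysis of a correlation matrix, the eigenvalues sum to the number of items, so each component’s share of variance is its eigenvalue divided by the item count. The sketch below works backwards from the rounded percentages reported above (30, 12 and 7% across 26 items) to the eigenvalues they imply; the function names are hypothetical, and the resulting values are approximations rather than figures taken from the study’s output.

```python
def implied_eigenvalues(proportions, n_items):
    """Eigenvalues implied by variance proportions in a correlation-matrix PCA,
    where total variance equals the number of standardized items."""
    return [p * n_items for p in proportions]

def kaiser_count(eigenvalues):
    """Number of components retained under the eigenvalue-over-one criterion."""
    return sum(1 for ev in eigenvalues if ev > 1.0)

proportions = [0.30, 0.12, 0.07]      # rounded shares reported for components 1-3
eigenvalues = implied_eigenvalues(proportions, n_items=26)
cumulative = sum(proportions)

print([round(ev, 2) for ev in eigenvalues])   # approximate implied eigenvalues
print(kaiser_count(eigenvalues))              # all three exceed one
print(round(cumulative, 2))                   # sum of the rounded shares
```

The rounded shares sum to 49%, so the reported 50% for the three-component solution presumably reflects the unrounded values.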

Principal components analysis of the 26 items with Oblimin rotation resulted in three components which did not correlate above .3. After principal components analysis with Varimax rotation, items with poor statistical fit were examined. Two items were eliminated for the following reasons. The FFMQ item “When I am walking, I deliberately notice the sensations of my body moving” was eliminated because it had factor cross-loadings between 0.3 and 0.45 on two of the three components. It is a two-part question which loads on both components, so it can be argued that it is not a good candidate. The CAMS-R item “I try to notice my thoughts without judging them” was eliminated because it did not load above .4 on any component. It can be argued that because it includes a non-judgemental attitude as well as noticing, this item potentially taps two very different constructs, whereas good psychometric testing requires items that tap only one construct. This left 24 items in the item pool.
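The two elimination reasons described above can be expressed as a simple screening rule. The sketch below is an assumption-laden illustration: the cutoffs (.30 for a cross-loading, .40 for a primary loading) are read off the two cases described in the text rather than stated as a general rule by the authors, and the item names and loadings are invented.

```python
def screen_item(loadings, cross_cutoff=0.30, primary_cutoff=0.40):
    """Classify an item from its rotated component loadings.

    Assumed rule: drop an item with no loading above primary_cutoff, or with
    loadings of at least cross_cutoff on two or more components.
    """
    magnitudes = [abs(x) for x in loadings]
    if not any(m > primary_cutoff for m in magnitudes):
        return "drop: no primary loading"
    if sum(1 for m in magnitudes if m >= cross_cutoff) >= 2:
        return "drop: cross-loading"
    return "keep"

# Hypothetical loadings on three components, for illustration only.
example_items = {
    "clean_item": [0.72, 0.10, 0.05],
    "cross_item": [0.44, 0.38, 0.08],
    "weak_item": [0.35, 0.12, 0.20],
}
decisions = {name: screen_item(vals) for name, vals in example_items.items()}
for name, decision in decisions.items():
    print(name, "->", decision)
```

A mechanical rule of this kind only flags candidates; as in the text, the final decision also rests on reading the item wording.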

Principal axis factoring with direct Oblimin rotation produced a clear and stable three-factor solution (Table 2). A three-factor principal components analysis without rotation examined the items’ fit to a higher-order Observing factor. Item-totals ranged from 0.30 to 0.62. Loadings on the first principal component were from 0.36 to 0.66. The same item was lowest for both of these. Removing this item did not reduce the internal consistency of the Emotion Awareness dimension (on which it loaded primarily). This item states “It’s easy for me to keep track of my thoughts and feelings.” The item appears to be double-barrelled. It was decided to remove the item, after which the solution showed a good fit for a higher-order construct measuring Observing. All other items loaded from .44 to .66 on the first principal component, item-totals ranged from 0.42 to 0.63, and the items showed high internal consistency with Cronbach’s alpha of .91, thus meeting the criteria for measuring the same construct, and meaning a valid total score can be calculated. For the final stage, principal axis factoring with direct Oblimin rotation of the remaining 23 items was conducted. The factor loading matrix for the final solution (henceforth named Observing Scale) is presented in Table 2.

Table 2 Factor loadings and corrected item-total correlations based on an exploratory factor analysis with Oblimin rotation for the observing item pool (N = 219)

The three factors were given labels based on a conceptual fit which emphasized those items with the highest primary loadings on each dimension. Factor 1 was labelled Body Observing, as the items loading on this factor emphasize observing of bodily sensations. Factor 2 was labelled Emotion Awareness, as the items loading on this factor primarily concern emotions, and the wording of the items is mostly “I am aware of…” rather than the “I notice…” or “I pay attention to…” wording of the other factors. Factor 3 was labelled External Perception, as the items loading on this factor all reflect perception of external stimuli. The alphas were high: 0.86 for Body Observing (10 items), .84 for Emotion Awareness (6 items) and .86 for External Perception (7 items). No increases in alpha for any of the dimensions could have been achieved by eliminating more items.
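Cronbach’s alpha and the “alpha if item deleted” check referred to above can be sketched in a few lines. The function names and the response data below are hypothetical and serve only to show the calculation; they are not the study’s data.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; items is a list of per-item score lists,
    each holding one score per respondent in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

def alpha_if_item_deleted(items):
    """Alpha recomputed with each item left out in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]

# Hypothetical 5-point responses from six respondents to four items;
# the fourth item is deliberately unrelated to the other three.
items = [
    [1, 2, 3, 4, 5, 3],
    [2, 2, 3, 5, 5, 3],
    [1, 3, 3, 4, 4, 2],
    [5, 1, 4, 2, 3, 3],
]
full_alpha = cronbach_alpha(items)
deleted = alpha_if_item_deleted(items)
print(round(full_alpha, 2), [round(a, 2) for a in deleted])
```

In this toy example, leaving out the unrelated fourth item raises alpha, which is the pattern the authors report not finding for any item in the three dimensions.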

To create a shorter unidimensional measure of the Observing Scale, the four items from each dimension with the highest loadings on the first principal component were included in an unrotated principal components analysis, which resulted in a unidimensional 12-item scale. This Brief Observing Scale was found to be reliable (α = .82). Item loadings were from .434 to .689. The solution is presented in Table 3. In a separate analysis, principal components analysis with Oblimin rotation of FFMQ Observing indicated two factors underlying the eight items, the first (46% of variance) being noticing of external stimuli, and the second (13% of variance) being noticing of internal experiences, see Table 4. In this analysis, two items of the latter factor cross-loaded.

Table 3 Factor Loadings and corrected item-total correlations based on principal components analysis for a brief observing scale (N = 219)
Table 4 Factor loadings based on principal components analysis with oblimin rotation for the FFMQ observing items and corrected item-total correlations (N = 219)

Correlations

To assess the size and direction of linear relationships between FFMQ Observing, the Observing Scale, the three factors identified by the exploratory factor analysis, the Brief Observing Scale, mindfulness and psychological symptoms, a bivariate Pearson’s product-moment correlation coefficient (r) was calculated, results of which are presented in Table 5. Among the three factors themselves there were moderate to large correlations. The Observing Scale and the Brief Observing Scale both correlated strongly with each other and with the MAAS. They correlated significantly and negatively with trait anxiety, which had the highest correlation with the MAAS. Unlike the other factors and FFMQ Observing, the Emotion Awareness factor correlated significantly and negatively with all psychological symptoms—worry, stress, trait anxiety and somatic anxiety, and had the highest correlation with the MAAS.

Table 5 Correlations among FFMQ observing, the observing scale, the three observing factors, the brief observing scale, the MAAS and worry, stress, somatic anxiety and trait anxiety (N = 219)

Correlations of the three factors, the two FFMQ Observing factors and the Brief Observing Scale with psychological symptoms and the MAAS, computed separately for meditator and non-meditator groups, indicated differences between these groups (see Table 6). Body Observing correlated positively with worry and somatic anxiety for non-meditators, whereas for meditators these correlations were not significant. Emotion Awareness correlated negatively with psychological symptoms, more strongly for meditators (all symptoms) but also for non-meditators (three of four symptoms). External Perception correlated negatively with stress and somatic anxiety for meditators, but did not correlate with any psychological symptoms for non-meditators. Correlations with the MAAS were higher for Emotion Awareness than the other two factors in both meditators and non-meditators. For the FFMQ factors, correlations of the External and Internal factors with psychological symptoms were consistently negative and stronger for meditators, whereas for non-meditators the factors either did not correlate or correlated positively with symptoms. Correlations of the Brief Observing Scale with psychological symptoms were consistently negative and stronger for meditators, but were not significant for non-meditators.

Table 6 Correlations among the three observing factors, the two factors of the FFMQ (External and internal), the brief observing scale, the psychological symptoms and mindfulness for meditators (N = 50) and non-meditators (N = 169)

Fisher’s r-to-z Transformation

Statistically significant differences were found between meditators and non-meditators on all three factors for certain variables as follows. Overall, the correlations of the three Observing factors with psychological symptoms were consistently stronger for meditators than for non-meditators, and were consistently negative for meditators. On the Body Observing factor, the largest difference was for somatic anxiety (z = 2.9, p = .004), with differences also for stress (z = 2.3, p = .023), worry (z = 2.2, p = .028) and trait anxiety (z = 2.2, p = .026). On the Emotion Awareness factor, the difference for worry was z = 2.2, p = .026. On the External Perception factor, the difference for stress was z = 3.4, p = .001.

Discussion

This study examined the construct validity and reliability of FFMQ Observing together with Observing Construct items from other mindfulness questionnaires. The study aim was to explore why FFMQ Observing had unexpected relationships with psychological symptoms and measures of mindfulness in samples other than meditators, and whether alternative solutions could better reflect the Observing Construct. The study found the FFMQ Observing lacked items which asked about awareness of emotions, yet it was these items which produced the expected relationships with the psychological symptoms of worry, stress, somatic and trait anxiety and the strongest relationships with the MAAS, even in non-meditators. This suggests that lack of Emotion Awareness items may be a reason for the anomalous FFMQ Observing results.

An exploratory factor analysis of the Observing Construct item pool using principal axis factoring with Oblimin rotation produced a clear-cut solution identifying three well-defined, stable factors: Body Observing, Emotion Awareness and External Perception. The resulting Observing Scale and a 12-item brief version were both internally consistent, related well to each other and to the MAAS, and had negative relationships with one psychological symptom, trait anxiety. The Emotion Awareness factor had negative relationships with worry, stress, somatic anxiety and trait anxiety, whereas Body Observing and External Perception had no such relationships except in meditators. The FFMQ did not have items which loaded on Emotion Awareness. Although two FFMQ items include the word “emotions,” these were perceived as asking about bodily sensations, not emotions. Emotion Awareness had a stronger relationship with the MAAS than Body Observing and External Perception (see Table 5).

The labelling of the Emotion Awareness factor acknowledges that the items concern emotions but also ask about observing in a different way. The conceptualization and semantics of these items differ overall from the items in the other factors. These items (several of which read “I am aware of…”) all come from the PHLMS, in which mindful attention is defined as a state of awareness. It is a stance open to, accepting of, and continuously monitoring wide experience rather than directing one’s attention in narrow focus on one object. This awareness equates to adaptive experiential (mindful) self-focus (Watkins and Teasdale 2004), which is typically operationalised in terms of direct, intuitive experiencing. The other factors’ items tend to describe observing differently. They are mostly worded ‘I notice…’ or ‘I pay attention to…’. This latter terminology is cited by Bergomi et al. (2013) as representing an effort to pay attention that may contribute to FFMQ Observing results. Several of the Body Observing items are phrased in a way which measures analytical (ruminative) self-focus (Watkins and Teasdale 2004), which is operationalised in terms of thinking about the causes, meanings and consequences of one’s experiences. For example, both the FFMQ items involve causal relationships. Analytical self-focus has been linked to maladaptive psychological functioning (Watkins and Teasdale 2004), which may to some extent explain the anomalies found for the observing facet. High levels of observing may also in certain contexts be characterized by self-critical ruminative self-focus rather than mindful acceptance and non-judgement (Lilja et al. 2013). Unfortunately, this hypothesis cannot be directly tested in this study, which looked only at one facet and excluded double-barrelled items, and so could not analyse the role of acceptance and non-judgement. Use of the word awareness for the Emotion Awareness factor does potentially “clash” with one of the FFMQ’s other facets, labelled Acting with Awareness. However, that facet measures acting with awareness/automatic pilot/concentration/nondistraction (Baer et al. 2006), and thus a different concept from awareness of emotions.

FFMQ Observing was developed to include noticing of emotions, and is widely believed to do so (e.g. Taylor and Millear 2016). However, Baer et al. (2006) speculated that the poor fit of FFMQ Observing might result from its items’ emphasis on external stimuli and bodily sensations, rather than other content including emotion. Observing emotions is described as a core aspect of mindfulness (e.g. Kabat-Zinn 2013) and has been shown to be a key beneficial aspect of mindfulness (Baer 2011). Content regarding emotions is present in all mindfulness measures, and the CHIME and the PHLMS both include items which focus on observing of emotions. Taking time to observe our emotions enables us to learn to manage them (Levitt et al. 2004). Paying mindful attention to emotions builds emotional intelligence (Baer 2011). Beyond mindfulness research, the central importance of emotions to mental wellbeing is also acknowledged (Gross and Muñoz 1995). The inclusion of Emotion Awareness improved relationships with another measure of mindfulness, the MAAS. Emotion Awareness’s large association with the MAAS was notably stronger than those of the other factors. All the Emotion Awareness items came from the PHLMS Awareness subscale. Interestingly, this subscale (including all aspects of the Observing Construct) has been shown not to correlate with other mindfulness facets and measures (Siegling and Petrides 2016). Thus, including Emotion Awareness items alone may not resolve all inconsistencies. However, this study indicates that inclusion of Emotion Awareness results in improved relationships, which can support the centrally important Observing Construct remaining in the FFMQ and so yield a credible five-facet FFMQ. This finding challenges the use of four-facet models (e.g. Baer et al. 2006; Soysa and Wilcomb 2015) and the conjecture that a four-facet questionnaire might be a more suitable mindfulness measure (Siegling and Petrides 2016).

This study indicates that inclusion of Emotion Awareness merits consideration as an element in resolving the unexpected FFMQ Observing results alongside the most-cited explanation for these results available to date: differential interpretation by meditators and non-meditators. This common explanation is linked to the findings that FFMQ Observing ‘works’ for meditators but not others (Baer 2011; Baer et al. 2004, 2006, 2008, 2011; Bergomi et al. 2013). Inclusion of Emotion Awareness items resulted in expected relationships with psychological symptoms and the MAAS for both meditators and non-meditators, although those relationships were stronger for meditators than non-meditators (Table 6). Even with Emotion Awareness items, non-meditator responses could be argued to be problematic. In the Brief Observing Scale, non-meditators had very weak relationships with psychological symptoms, whereas meditators had stronger and negative relationships. Two new items included in the observing item pool did not add to the study.

This study confirmed that FFMQ Observing does not have strong relationships with other measures of mindfulness, and has inconsistent relationships (some positive) with psychological symptoms (see Tables 5 and 6). This aligns with previous findings (Bergomi et al. 2013).

Recommendations for Future Research

This study is preliminary. Future research on the Observing Construct may also investigate alternative classifications such as analytical, experiential and external forms of observing. Analytical observing would be expected to correlate positively with psychological symptoms. Therefore, future research on mindful observing may investigate experiential self-focus by including more items that refer to experiential, rather than analytical, forms of observing. Further research on other samples, including confirmatory factor analysis and Item Response Theory methods (e.g. Rasch analysis), is required to explore the proposition that there may be three factors underlying the Observing Construct. Predictive validity of the three factors should be examined while controlling for other FFMQ facets to rule out third-variable explanations of the effects of these Observing factors on psychosocial outcomes. Further research is required to determine which items best measure the Observing Construct and to examine whether a proposed Observing Construct correlates with the other facets of the FFMQ. To allow examination of how the Observing Construct is interpreted according to meditation status, two groups should be recruited: meditators and non-meditators. Additionally, the role that the Body Observing and External Perception items play in producing outcomes needs to be examined, as should the role that the wording of items plays. Future studies should gather data on depression, as research has shown inconsistent results between FFMQ Observing and this major psychological disorder. A gender-balanced sample would enable researchers to examine whether the factor structure is invariant across gender. Active selection of participants is needed to mitigate the fact that females are generally more likely to respond to questionnaires. Mixed-methods research, including qualitative evaluations, could progress understanding of the construct validity of the Observing Construct. It would be interesting in future studies to use the three factors comprising the Observing Construct in person-centred analysis to examine latent profiles of individuals on mindfulness facets using cluster analysis (Lilja et al. 2013) and latent profile analysis (Bravo et al. 2016).

Future research should consider whether the Observing Construct includes enough items about observation of thoughts, given its inclusion in the original conceptualization of the Observing Construct by Baer et al. (2006) and the central position that awareness of thoughts has in mindfulness (Kabat-Zinn 2013; Segal et al. 2002). Researchers should consider including items from the CHIME, which has been adjusted so that non-meditating populations can understand it and for which an English version is currently being validated.

Limitations

The marked imbalance in the distribution of meditation status (50 meditators, 169 non-meditators) limits the generalisability of these findings. Consistent reports of different interpretation of the Observing Construct by meditators and non-meditators underscore that this is a major issue for performance of all Observing Construct items. Another limitation is that the study relies on self-report of meditation status, and self-ratings of mindfulness may not equate to mindfulness itself (Grossman 2008). An additional limitation is that this study cannot examine FFMQ Observing’s relationships with the other FFMQ facets of mindfulness, as the full FFMQ was not used. A further limitation is that the order of scales was not alternated in half the questionnaires, which would have helped to avoid method effects. Additionally, it is questionable how legitimate the partial BAI is as a measure of somatic anxiety, given that it has not been validated. Finally, the item pool contained only one statement successfully addressing observation of thoughts, so there were too few items for this content to be a candidate for factor analysis.

Implications

Understanding the FFMQ Observing results has significant implications for improvement of multidimensional mindfulness measures. Better understanding of what explains unexpected results in this dynamic area of research contributes to improved assessment, research and interventions, as well as to the difficult task of refining the conceptualization of mindfulness (Baer 2011). The study’s contribution to the understanding of anomalies in the performance of FFMQ Observing reduces the risk of distortion of the meaning of mindfulness through definition by “faulty” measures, which could adversely affect development of interventions (Grossman 2011). For instance, the FFMQ Observing link with anxiety precipitated questioning of the body scan for use with anxious populations (Desrosiers et al. 2013), whereas research shows that anxious individuals benefit from becoming aware, through sensitive teaching, of their bodily sensations (Levitt et al. 2004). Another example of where a more accurate measurement of the observing facet may be useful is evaluation of the efficacy of MBIs. Several studies have found that increasing mindfulness raises scores on the observing facet (e.g. Garland et al. 2013). However, another study found no significant relationships between mindfulness practice and the total or subscale scores of the FFMQ (Manuel et al. 2017). It is possible that the subscale proposed by this study may be better suited for such assessments. Findings from this study could be used to test more accurately theories of mindfulness that propose models of how mindfulness works, particularly with regard to the skill of observing. Monitor and Acceptance Theory (Lindsay and Creswell 2017) proposes that mindful acceptance skills combined with enhanced awareness of one’s experiences result in the beneficial consequences of mindfulness. This may be a useful approach to ascertain whether an accepting attitude, as proposed by Lilja et al. (2013), plays a role in the way the Observing Construct functions. Other theories that could be tested are the Shapiro (2009) model of mindfulness, in which decentring plays a role, or the Buddhist Psychological Model (Grabovac et al. 2011), which might be more relevant.