Over the past two decades, there has been substantial and growing interest in the construct of mindfulness. Mindfulness is often defined as “paying attention in a particular way: on purpose, in the present moment, and nonjudgmentally” (e.g., Kabat-Zinn 1994, p. 4). Researchers seek to understand the benefits of mindfulness by using interventions that teach people practices that enhance their ability to be mindful. These interventions come in a variety of formats (e.g., 15-min guided meditation, 8-week mindfulness-based stress reduction course, 3-month meditation retreat) and, overall, appear to improve physical and psychological functioning (e.g., Chiesa and Serretti 2009; Goyal et al. 2014; Grossman et al. 2004; Hilton et al. 2017; Howarth et al. 2019). Another way researchers seek to understand the benefits of mindfulness is through self-report measures of trait mindfulness, or one’s dispositional, day-to-day level of nonjudgmental present-moment attention and awareness. Trait mindfulness is associated with positive physical and psychological health and well-being, such as improved sleep quality, eating habits, and physical activity (Murphy et al. 2012); higher positive affect, life satisfaction, emotional awareness, basic psychological needs satisfaction, and self-esteem (Brown and Ryan 2003); and greater self-compassion (Baer et al. 2012), as well as lower negative affect, depression, anxiety, stress, and neuroticism (Brown and Ryan 2003). Despite promising findings on self-reported mindfulness, continual testing and strengthening of the psychometric properties of various trait mindfulness measures, including the recently validated interpersonal mindfulness scale (Pratscher et al. 2019), will lead to more reliable and valid research.

More recently, there has been growing interest in the effects of mindfulness on interpersonal and social functioning. Whereas some studies have examined the impact of romantic couples participating in a mindfulness-based intervention (MBI) together (e.g., Carson et al. 2004), much of the research has focused on the association of dispositional mindfulness with social and relationship functioning (e.g., McGill and Adler-Baeder 2019). For example, self-reported mindfulness has been associated with lower levels of social anxiety (e.g., Dekeyser et al. 2008) and cardiac reactivity during stressful marital conflict (Kimmes et al. 2018), and higher levels of empathy (e.g., Dekeyser et al. 2008), social cognition (Campos et al. 2019), active listening (Jones et al. 2016), friendship quality (Pratscher et al. 2018), romantic relationship satisfaction (e.g., McGill et al. 2016), relationship stability (Khaddouma and Gordon 2018), and perceived partner responsiveness (e.g., Adair et al. 2018). These initial findings support the connection between mindfulness and socially oriented outcomes.

Measuring and understanding interpersonal mindfulness, or the extent to which one is mindful during interpersonal interactions, is important because the construct may help explain how the internal practice of meditation or the ability to be mindful in daily life can influence social interactions and relationships. One can be mindful in a variety of contexts (e.g., walking, eating, cleaning dishes), but the specific context of interacting with other people is an everyday occurrence where one can be mindful to varying degrees. For example, while interacting with another person, nonjudgmental and nonreactive attention and awareness can be more or less focused on internal experiences (bodily sensations, thoughts, reactions, mood, etc.) and/or external experiences (one’s own and another’s verbal and nonverbal communication) occurring in the moment. Being completely present and aware with another person, rather than stuck thinking in one’s head while another is talking, presumably influences the quality of the interaction and, ultimately, the quality of the relationship. One issue is that almost all items from the most commonly used dispositional mindfulness scales tap into intrapersonal experiences of being mindful (e.g., “I do jobs or tasks automatically, without being aware of what I’m doing”). It is likely that items that assess the qualities of being mindful within social contexts and during interpersonal interactions (e.g., “When a person is talking to me, I find myself thinking about other things, rather than giving them my full attention,” reverse-coded) are more relevant for understanding the associations between mindfulness and relational and interpersonal outcomes. Thus, the interpersonal mindfulness scale (IMS) was created to measure how frequently individuals are mindful during interpersonal interactions.

The 27-item IMS was developed across multiple independent samples through an iterative process of item generation and item deletion and has shown good psychometric properties (Pratscher et al. 2019). The IMS instructions ask participants to consider how frequently they have each experience using the following response options: 1 = Almost never, 2 = Infrequently, 3 = Sometimes, 4 = Frequently, 5 = Almost always. Validation using exploratory factor analysis (principal axis factoring with promax rotation) resulted in four subscales of the IMS that were labeled Presence, Awareness of Self and Others, Nonjudgmental Acceptance, and Nonreactivity. Presence refers to being in the present moment during interpersonal interactions, with a focus on paying attention while another is talking (i.e., mindful listening). Awareness of Self and Others includes noticing and observing one’s own internal experiences (e.g., moods, emotions, bodily sensations) and being aware of the moods, intentions, and nonverbal cues of others during an interaction. Nonjudgmental Acceptance includes items about listening without evaluative judgment and accepting experiences as they are. Nonreactivity refers to taking time and pausing before responding, rather than thoughtlessly reacting during an interpersonal interaction. Multigroup confirmatory factor analysis showed that these four subscales loaded onto an overarching latent factor of interpersonal mindfulness with the following standardized loadings: Presence = 0.39, Awareness of Self and Others = 0.79, Nonjudgmental Acceptance = 0.90, and Nonreactivity = 0.87. The subscales and total scale have demonstrated good internal consistency reliability, with Cronbach’s alpha ranging from 0.71 to 0.90, and 1-month test-retest reliability, with intraclass correlations ranging from 0.67 to 0.87. Furthermore, known-groups validity was supported by positive associations between length and frequency of meditation practice and scores on the IMS in a sample of experienced meditators. Because the IMS is a new measure and only one study of its psychometric properties has been published, further psychometric investigation using robust methodology such as Rasch analysis is needed. The Rasch methodology has been applied in many psychometric examinations across different areas and has proved suitable for investigating and enhancing the psychometric characteristics of assessment instruments (Norquist et al. 2004; Tennant and Conaghan 2007).

Ordinal scales have low accuracy and violate parametric statistical assumptions, which limits both the reliability and the validity of results obtained using such measures (Allen and Yen 1979; Stucki et al. 1996). There are two major problems affecting the accuracy of an ordinal scale such as the IMS. First, the differences in the latent trait (e.g., interpersonal mindfulness) are not the same between different pairs of response options (e.g., “Almost never” and “Infrequently” vs “Infrequently” and “Sometimes”). Second, different items contribute different amounts of information about the latent trait to the total score, which relates to item difficulty in Rasch analysis. These problems cannot be resolved using traditional psychometric methods such as Classical Test Theory (Allen and Yen 1979). Rasch analysis can overcome the limitations of ordinal scales by estimating interval-level scores from ordinal scale responses, accounting for sample abilities and the difficulty of scale items on the same metric in logit units (Bond and Fox 2003; Rasch 1961). When a scale fits the Rasch model, changes on a latent trait (e.g., interpersonal mindfulness) represented by ordinal scores will be precisely reflected by interval-level scores derived from the Rasch model estimates, which are comparable to any other interval measure such as temperature or speed. Therefore, Rasch analysis enhances the ordinal scale version because of its adherence to the fundamental principles of measurement (Thurstone 1931), which guide the transformation of ordinal scores into interval-level data (Rasch 1960; Bond and Fox 2007). The Rasch model requires that sample groups are invariant in responding to the items of a scale (Tennant and Conaghan 2007). To ensure that there is no item bias in the IMS, we tested differential item functioning (DIF) by gender and by whether or not a participant was in a romantic relationship because these personal factors may influence responding to the IMS items.

The Partial Credit Rasch model (Masters 1982) was developed for polytomous items (with three or more response options) and estimates thresholds for each response category (e.g., “Infrequently”), defined as the point on the latent trait at which the probability of choosing either of the two adjacent response categories is the same. The Partial Credit model allows the distances between thresholds to vary within an item and across items and is used if these distances are significantly different (Masters 1982; Tennant and Conaghan 2007). The Rating Scale Rasch model, developed earlier for polytomous items (Andrich 1978), is only used when the distances between thresholds are uniform across all scale items.
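The threshold definition above can be illustrated with a minimal Python sketch of the Partial Credit model's category probabilities for a single item. The function name and the threshold values are hypothetical and chosen only for illustration; they are not IMS estimates. Categories are indexed 0 to 4, corresponding to the response options 1 to 5.

```python
import math


def pcm_category_probabilities(theta, thresholds):
    """Return P(X = k | theta) for k = 0..m under the Partial Credit model.

    theta      -- person location on the latent trait (logits)
    thresholds -- the m threshold locations of one item (logits)
    """
    # Cumulative sums of (theta - delta_j); the empty sum for category 0 is 0.
    exponents = [0.0]
    for delta in thresholds:
        exponents.append(exponents[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(exponents)
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical thresholds for one five-category item (assumption, not IMS data).
example_thresholds = [-1.5, -0.4, 0.6, 1.8]

# A person located exactly at the third threshold is equally likely to choose
# either of the two categories that the threshold separates.
probs = pcm_category_probabilities(0.6, example_thresholds)
```

Under the Rating Scale model, the same function would apply, but the spacing between thresholds would be constrained to be identical for every item.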

Rasch model fit can be affected by residual correlations between items, which is referred to as local dependency (Medvedev et al. 2017a, 2018a). Local dependency is indicated when a residual correlation exceeds the mean of all residual correlations by more than 0.20 (Christensen et al. 2016). Local dependency can occur because a response to one item influences responses to one or more other items (local response dependency) or because of multidimensionality (local trait dependency). These explanations can be distinguished using the super-item approach. If combining locally dependent items into super-items by adding individual item scores achieves a good Rasch model fit and unidimensionality, it provides evidence for local response dependency because trait-dependent items and super-items cannot fit the unidimensional Rasch model (Medvedev et al. 2018b).
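As a rough illustration of the two steps described above, the following Python sketch flags item pairs whose residual correlation lies more than 0.20 above the mean residual correlation and then sums the flagged items into a super-item. Reading the cutoff as "mean plus 0.20" is an assumption about how the criterion is applied, and the function names, the residual correlation matrix, and the responses are all hypothetical.

```python
from itertools import combinations


def flag_local_dependency(residual_corr, margin=0.20):
    """Return index pairs whose residual correlation exceeds mean + margin."""
    n = len(residual_corr)
    pairs = list(combinations(range(n), 2))
    # Mean of the off-diagonal residual correlations.
    mean_r = sum(residual_corr[i][j] for i, j in pairs) / len(pairs)
    cutoff = mean_r + margin
    return [(i, j) for i, j in pairs if residual_corr[i][j] > cutoff]


def make_super_item(responses, item_indices):
    """Sum each respondent's scores on the given items into one super-item."""
    return [sum(row[i] for i in item_indices) for row in responses]


# Hypothetical residual correlations among four items (not IMS results).
example_corr = [
    [1.00, 0.35, -0.10, -0.15],
    [0.35, 1.00, -0.05, -0.20],
    [-0.10, -0.05, 1.00, -0.12],
    [-0.15, -0.20, -0.12, 1.00],
]
flagged = flag_local_dependency(example_corr)

# Hypothetical item responses for two respondents; items 0 and 1 are combined.
example_responses = [[4, 5, 2, 3], [2, 1, 3, 3]]
super_scores = make_super_item(example_responses, (0, 1))
```

The super-item is then analyzed as a single polytomous item with a wider score range, which is why it absorbs the dependency between its component items.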

The aim of the present study was to evaluate the psychometric properties of the IMS and its compliance with fundamental measurement principles using modern Rasch methodology that involves creating super-items. The recently recommended standard for publishing evaluations using Rasch analysis requires the generation of ordinal-to-interval transformation tables along with other criteria (Leung et al. 2017). Following this recommendation, the present study aimed to develop such conversion tables to enhance the precision of the IMS, provided that a satisfactory Rasch model fit was achieved.

Method

Participants

Participants were 751 undergraduate students recruited from a large midwestern university in the USA. These data come from a larger study examining interpersonal mindfulness, stress, and relationship functioning, and all participants included in this study had been in a romantic relationship for at least 3 months. Of the 751 participants who completed the study, 260 were both members of a couple (n = 130 couples). Participants who missed more than one of the three attention check items embedded throughout the survey (e.g., Choose Almost Always for this item; n = 167) were excluded from the analyses. The final sample included 584 participants (n = 113 couples). The average age of the participants was approximately 19 years (M = 18.69, SD = 1.03, range = 18 to 30). Participants were predominantly female (63.7%) and White/Caucasian (85.4%), followed by Black/African American (6.7%) and Asian/Asian American (2.6%).

Procedure

Introductory psychology students were recruited for an online 20-min “Couples’ Study” survey to complete in exchange for partial fulfillment of research requirements for their undergraduate psychology course. Participants were asked to provide the email address of their romantic partner and, upon completion of the survey, romantic partners were sent an email with a link to the study. Partners were told that having information from both members of a couple would help advance research on individual and couple well-being and were offered the chance to win one of four $25 Amazon gift cards for completing the survey. The research was approved by the local University institutional review board, and informed consent was obtained for all participants before beginning the study.

Measures

The IMS (Pratscher et al. 2019) is a 27-item self-report measure of interpersonal mindfulness. Previous research identified four factors of the IMS: Presence, e.g., “When I am conversing with another person, I am fully engaged in the conversation,” Awareness of Self and Others, e.g., “I am aware of others moods and tone of voice while I am listening to them,” Nonjudgmental Acceptance, e.g., “I listen carefully to another person, even when I disagree with them,” and Nonreactivity, e.g., “I take time to form my thoughts before speaking.” Participants reported on the frequency that their experience corresponds to each item on a 5-point Likert scale of 1 (Almost never) to 5 (Almost always). Items that were negatively worded (5, 10, 13, 17, and 21) were reverse-coded prior to data analysis.
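The scoring rule described above can be sketched in Python as follows. The item numbers of the reverse-coded items and the 1 to 5 response range come from the text; the function name and the dictionary input format are assumptions made for illustration.

```python
REVERSE_CODED_ITEMS = {5, 10, 13, 17, 21}  # negatively worded items per the text
N_ITEMS = 27


def score_ims(responses):
    """Return the ordinal IMS total score (27 to 135).

    responses -- dict mapping item number (1..27) to the raw response (1..5)
    """
    if set(responses) != set(range(1, N_ITEMS + 1)):
        raise ValueError("expected responses for items 1..27 with none missing")
    total = 0
    for item, raw in responses.items():
        if raw not in (1, 2, 3, 4, 5):
            raise ValueError(f"item {item}: response must be an integer from 1 to 5")
        # On a 1-5 scale, reverse-coding maps 1<->5 and 2<->4 and leaves 3 unchanged.
        total += (6 - raw) if item in REVERSE_CODED_ITEMS else raw
    return total


# A respondent who answers "Sometimes" (3) to every item scores 27 * 3 = 81.
midpoint_total = score_ims({item: 3 for item in range(1, N_ITEMS + 1)})
```

Requiring all 27 responses mirrors the fact that a total score is only comparable across respondents when no items are missing.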

Sociodemographic variables of age and sex, as well as relationship variables of number of months dating, were included to test for measurement invariance. Additional measures were collected as part of a larger study on interpersonal mindfulness and couple functioning but were not included in the current study.

Data Analyses

Descriptive and reliability analyses were conducted using IBM SPSS v.26, and Rasch analysis was conducted using RUMM2030 software (Andrich et al. 2009). Prior to the main analysis, a likelihood ratio test was applied to the initial IMS items to determine whether the Partial Credit Rasch model (Masters 1982) was appropriate for the current dataset. This study used methodological advances in Rasch analysis (Medvedev et al. 2017b; 2018b; Lundgren-Nilsson and Tennant 2011; Lundgren-Nilsson et al. 2013) and resolved the problem of local response dependency by combining the scores of dependent items into super-items. Rasch analysis was conducted iteratively (Medvedev et al. 2016a) until the overall and individual item fit met the expectations of the unidimensional Rasch model, which requires nonsignificant chi-square fit statistics for the item-trait interaction. An ideal fit to the Rasch model requires the overall person and item fit residuals to have a mean close to 0.00 and a standard deviation around 1.00, and individual item fit residuals to range between −2.50 and +2.50. Good targeting requires the sample mean to be within the range of ± 0.50 when the item mean is set to zero.

There should be no differential item functioning (DIF) by personal factors such as sex or romantic relationship in this study (i.e., measurement invariance across sample groups). DIF occurs when individuals with the same trait level, but from different categories (e.g., female vs male), respond differently to an item (Andrich and Hagquist 2013). DIF in Rasch analysis is tested by dividing the sample into class intervals corresponding to different levels of the latent trait and computing average scores for each class interval and for each sample sub-group. For each scale item, the average scores aggregated by class interval are then compared between groups (e.g., male vs female) using ANOVA. If a group factor has a significant effect for an item, the relevant item characteristic curve (ICC) showing mean scores for each class interval and group is visually examined. If consistent mean differences are found between groups across the observed class intervals, DIF is considered uniform. Inconsistent differences are considered nonuniform DIF (Andrich and Hagquist 2013).

Reliability of measures in Rasch analysis is estimated using the person separation index (PSI), which reflects how accurately persons are spread along the scale defined by its items (Fisher 1992). The PSI is the ratio of true variance (real differences between persons) to the total variance in the data, which includes true variance and error variance, computed using person estimates of the Rasch model. The PSI is numerically comparable to Cronbach’s alpha and estimates the ability of the measure to differentiate among individuals at different levels of the latent trait. The PSI is computed using a nonlinear transformation of the ordinal scores and can be computed with randomly missing data.
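The variance ratio described above can be sketched in Python under one common formulation, in which the true variance is approximated as the observed variance of the person estimates minus the mean squared standard error. This formulation, the function name, and the example values are assumptions for illustration; the exact computation in RUMM2030 may differ in detail.

```python
from statistics import mean, variance


def person_separation_index(estimates, standard_errors):
    """Approximate the PSI as (observed variance - error variance) / observed variance.

    estimates       -- Rasch person location estimates (logits)
    standard_errors -- standard error of each person estimate (logits)
    """
    observed_var = variance(estimates)                    # total variance of estimates
    error_var = mean([se ** 2 for se in standard_errors])  # mean squared standard error
    return (observed_var - error_var) / observed_var


# Hypothetical person estimates and standard errors (not study data).
example_estimates = [-1.0, 0.0, 1.0, 2.0]
example_errors = [0.5, 0.5, 0.5, 0.5]
psi = person_separation_index(example_estimates, example_errors)
```

Larger standard errors relative to the spread of the person estimates push the index toward zero, which is why combining items into fewer super-items can lower the PSI even when fit improves.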

We tested unidimensionality using Smith’s (2002) method, explained in detail elsewhere (Medvedev et al. 2016b). When satisfactory fit to the Rasch model was evident, the resultant distribution of person-item thresholds was examined to determine how well the item thresholds of the scale target sample abilities on the latent trait (e.g., interpersonal mindfulness). Lastly, ordinal-to-interval conversion algorithms were generated that permit users to convert ordinal scores into interval-level scores.
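The logic of the unidimensionality test can be sketched in Python as follows: each person receives two location estimates from two subsets of items, the estimates are compared with an independent t test, and the proportion of significant tests is reported. The sketch assumes the conventional two-sided critical value of 1.96 and the conventional reading that a proportion of significant tests above roughly 5% signals multidimensionality; neither value is stated in this section, and the function name and data are hypothetical.

```python
import math


def proportion_significant_t(est_a, se_a, est_b, se_b, critical=1.96):
    """Proportion of persons whose two subset-based estimates differ significantly.

    est_a, est_b -- person estimates (logits) from item subsets A and B
    se_a, se_b   -- matching standard errors of those estimates
    """
    flagged = 0
    for a, sa, b, sb in zip(est_a, se_a, est_b, se_b):
        # Independent t statistic for the difference between the two estimates.
        t = (a - b) / math.sqrt(sa ** 2 + sb ** 2)
        if abs(t) > critical:
            flagged += 1
    return flagged / len(est_a)


# Hypothetical estimates for four persons (not study data).
subset_a = [0.5, 1.2, -0.3, 2.0]
subset_b = [0.4, 1.0, -0.2, 0.2]
errors = [0.5, 0.5, 0.5, 0.5]
share = proportion_significant_t(subset_a, errors, subset_b, errors)
```

In practice the two item subsets are chosen from the items with the highest and lowest loadings on the first principal component of the residuals, so that the comparison targets the most likely secondary dimension.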

Results

The likelihood ratio test was significant (p = .001) and rejected the Rating Scale Rasch model (Andrich 1978), meaning that the unrestricted Partial Credit Rasch model (Masters 1982) was appropriate for the current data.

The summary of the overall Rasch model fit statistics presented in Table 1 shows that the initial analysis of the IMS (A1) demonstrated poor model fit, reflected by a significant interaction between individual items and the latent trait of interpersonal mindfulness (p < 0.01). This indicates a lack of consistency across item scores at different levels of interpersonal mindfulness, which is evident from the individual item fit statistics included in Table 2. Significant misfit to the Rasch model was found for items 5 “When a person is talking to me, I find myself thinking about other things, rather than giving them my full attention,” 13 “When interacting with someone I know, I am often on autopilot, not really paying attention to what is actually happening in the moment,” 21 “I give the appearance of listening to another person when I am not really listening,” 23 “When I am interacting with another person, I get a sense of how they are feeling,” and 25 “Rather than being distracted, it is easy for me to be in the present moment while I am interacting with another person,” as indicated by elevated fit residuals and/or chi-square values. There was no evidence of unidimensionality, with 8.4% of t tests significant for comparisons between person estimates computed for two item groups, one with the lowest and the other with the highest loadings on the first residual principal component after excluding the interpersonal mindfulness factor (see Table 1). However, the scale showed good discrimination between persons, with a person separation index (PSI) of 0.90.

Table 1 Summary of the overall fit statistics for the initial and the final Rasch analyses of the IMS (n = 584)
Table 2 Rasch model item statistics for the IMS items before and after creating 3 super-items including locations, fit residuals and chi-square (n = 584). *Significant misfit to the Rasch model (p < 0.001)

To maintain the conceptual integrity of the IMS, we considered removing items to improve model fit only as a last resort and instead used strategies that involved creating super-items, which permit reducing the measurement error introduced by individual items (Medvedev et al. 2017b; 2018b; Lundgren-Nilsson et al. 2013). Both the overall Rasch model fit and the dimensionality of the total scale may be affected by local dependency between individual items, which is typically found for items within each domain or subscale; such items can be combined into super-items to resolve this problem (Lundgren-Nilsson and Tennant 2011; Wainer and Kiely 1987). This pattern was reflected in the residual correlations found between subscale items of the IMS. For instance, residual correlations between items 5, 10, 13, 17, and 21 of the Presence subscale had magnitudes above the cutoff point of 0.20, indicating local dependency (Christensen et al. 2016).

Instead of deleting the items above the cutoff point, the items of each IMS subscale were combined into super-items, which resulted in substantial improvement of the overall model fit, reflected by a lower but still significant chi-square value for the item-trait interaction, and in unidimensionality (Table 1, A2). At this stage, local dependency was found between individual super-items, and after the locally dependent Presence and Nonjudgmental Acceptance super-items were combined into one super-item labeled Nonjudgmental presence, the best model fit was achieved (Table 1, A3 final). There was no local dependency between individual items, and the scale showed good reliability (PSI = 0.76), was strictly unidimensional, and was invariant across sex and romantic relationship factors.

Supplementary Figure S1 illustrates the person-item threshold distribution for the final analysis with the best Rasch model fit, where the item mean is set to zero and individual item thresholds are located under the horizontal axis. Sample threshold estimates are plotted above the horizontal axis, and the person mean is located above the item mean (zero) but within the acceptable range of ± 0.50, which reflects good targeting of the sample abilities by the item thresholds of the IMS with no ceiling or floor effects evident. The sample mean is positive, indicating overall higher levels of interpersonal mindfulness in the current sample.

Conversion from Ordinal to Interval Measures

The IMS demonstrated good fit to the unidimensional Rasch model, which permitted producing the ordinal-to-interval transformation table presented in Table 3, based on person estimates of the Rasch model. The IMS ordinal scores range from 27 to 135, and the corresponding interval-level scores computed from Rasch model person estimates in logit units (Logits) are presented on the right-hand side. For convenience, we have also rescaled these logit scores into the original scale range, which accurately preserves the parameters of the interval scale presented in logits and permits observing ordinal scale bias by finding the corresponding interval-level score (Scale) on the right-hand side. This Rasch transformation provides valid interval scores reflecting the overall interpersonal mindfulness construct by adjusting for the unique contribution of each individual component to the overall construct. The conversion table is user-friendly and can be applied as follows. All items are scored from 1 to 5 and numbered according to Pratscher et al. (2019). Negatively worded items 5, 10, 13, 17, and 21 have to be reverse-coded prior to calculating the total score. Sum the ordinal individual item scores and find the corresponding interval-level score in the column labeled “Scale” on the right-hand side of conversion Table 3. This transformation is not suitable for respondents with missing data. For example, an individual who has an ordinal score of 50 will have an interval-level score of 57.94 within the same scale range, and an ordinal score of 100 will correspond to an interval-level score of 78.23.
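The conversion procedure can be sketched in Python as a table lookup. The excerpt below contains only the two worked examples quoted in the text; the complete mapping must be taken from Table 3. The second function illustrates how logit estimates could be rescaled into the 27 to 135 range, assuming a linear rescaling that matches the endpoints of the two ranges; that assumption, the function names, and the logit bounds used in the example are hypothetical.

```python
ORDINAL_MIN, ORDINAL_MAX = 27, 135

# Only the two example pairs quoted in the text; not the full Table 3.
CONVERSION_EXCERPT = {50: 57.94, 100: 78.23}


def to_interval(ordinal_total, table=CONVERSION_EXCERPT):
    """Look up the interval-level "Scale" score for an ordinal IMS total."""
    if ordinal_total not in table:
        raise KeyError(f"no conversion entry for ordinal total {ordinal_total}")
    return table[ordinal_total]


def rescale_logit(logit, logit_min, logit_max, low=ORDINAL_MIN, high=ORDINAL_MAX):
    """Linearly map a logit estimate onto the original scale range (assumed method)."""
    return low + (logit - logit_min) * (high - low) / (logit_max - logit_min)


interval_50 = to_interval(50)
# Hypothetical logit bounds of -5 and +5: the midpoint maps to the scale midpoint.
rescaled_mid = rescale_logit(0.0, -5.0, 5.0)
```

A linear rescaling preserves the equal-interval property of the logit scores, so differences between rescaled scores remain proportional to differences in logits.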

Table 3 Converting from ordinal to interval-level scores for the total 27-item IMS

Discussion

The current study used Rasch analysis to examine the psychometric properties and compliance with fundamental measurement principles of the IMS, developed to assess mindfulness as it occurs during interpersonal interactions, using a sufficiently large sample of 584 participants. Our results demonstrated that the IMS meets the expectations of the unidimensional Rasch model with minor modifications that involved combining related items of the three domains (nonjudgmental presence, awareness of self and others, and nonreactivity) into 3 super-items using the methodology of Lundgren-Nilsson et al. (2013). The modified IMS demonstrated good reliability (PSI = 0.76) and unidimensionality and was invariant across sex and romantic relationship variables. This permitted the generation of ordinal-to-interval conversion tables based on person estimates of the Rasch model, which are included here. Our findings support the robust psychometric properties, reliability, and internal validity of the IMS. Transformation of the ordinal IMS responses into interval-level data using the Rasch conversion tables published here enhances the accuracy of measurement and the suitability of the data for parametric statistical tests without violating their fundamental assumptions.

In this study, we used an advanced approach from the modern Rasch methodology of Lundgren-Nilsson et al. (2013) that was successfully applied to the widely used Five Facet Mindfulness Questionnaire (FFMQ) (Medvedev et al. 2017b) and to the UK Functional Assessment Measure, which is the principal outcome measure in the UK (Medvedev et al. 2018b). The main feature of this method is the use of super-items that combine the scores of psychometrically related items, which reduces measurement error introduced by individual items that may relate to item wording, structure, or other elements of method effect. Essentially, creating super-items is a hypothesis-testing procedure to verify the reason why some items are related to each other, which is typically reflected by residual correlations between items exceeding the cutoff point of 0.20 (Christensen et al. 2016). Residual correlations between items often create spurious factors that violate the assumption of unidimensionality of the Rasch model (Medvedev et al. 2017a, 2018a). The super-item approach permits determining whether residual correlations between items reflect local response dependency, when a response to one item influences the response to another similar item, or local trait dependency, which suggests multidimensionality of the scale. If a scale with super-items demonstrates good fit to the Rasch model and unidimensionality, then the hypothesis of trait dependency or multidimensionality is rejected and the scale complies with the principles of fundamental measurement (Thurstone 1931). Our results show that the psychometrically enhanced IMS complies with fundamental measurement criteria.

In the initial stage of Rasch analysis, we identified four items showing significant misfit to the Rasch model. One of them is item 5 “When a person is talking to me, I find myself thinking about other things, rather than giving them my full attention,” which is conceptually very important for the construct of interpersonal mindfulness, and removing this item would be a disadvantage potentially affecting the validity of the construct (e.g., Baldini et al. 2014; Pratscher et al. 2019). Similarly, misfitting items 13 “When interacting with someone I know, I am often on autopilot, not really paying attention to what is actually happening in the moment” and 21 “I give the appearance of listening to another person when I am not really listening” are core items for the construct of interpersonal mindfulness about being present while listening, and we were reluctant to discard them (e.g., Brown et al. 2007; Jones et al. 2016; Karremans et al. 2017). Finally, item 23 “When I am interacting with another person, I get a sense of how they are feeling” measures emotional intelligence in the context of interpersonal mindfulness, which is again an important aspect of the construct. These items may show Rasch model misfit for different reasons, such as the local response dependency discussed above and other features of method effect (e.g., item wording or length). A simple way to achieve satisfactory Rasch model fit would be to delete these items from the scale. However, deleting conceptually important items may affect both the construct validity and the reliability of the scale, which was undesirable in this study. We resolved this issue by using the super-item approach, which resulted in a satisfactory fit of the IMS to the Rasch model and supported the relevance of these items to the construct of interpersonal mindfulness. By using the ordinal-to-interval conversion Table 3 presented here, the precision of the IMS can be enhanced to that of an interval-level scale. This means that transformed IMS scores can be used for parametric statistical tests and for valid comparison with other interval measures such as neurophysiological recordings and biomarkers, which is especially beneficial considering recent developments in the field (Singer and Engert 2018).

Limitations and Future Research

The current study demonstrates the adequacy of the IMS only with a nonclinical community and college sample. As such, future research should explore whether the IMS can be used with the same scoring structure in pre- and post-intervention assessments, as well as whether some items need to have their means adjusted to account for response shifts that result from the intervention (Krägeloh et al. 2017). Nevertheless, the IMS appears to be robust for use in nonclinical contexts, particularly with the empirically supported ordinal-to-interval conversion of the scale.