Introduction

Infertility is defined as the inability to conceive after at least 12 months of unprotected intercourse [1]. Due to factors such as unfavorable lifestyle and environmental pollution, the prevalence of infertility in China has been increasing [2, 3]. Over 20 years, the infertility rate among couples of childbearing age in China has risen from 2.5-3% to around 12-15%, and the number of patients has exceeded 50 million [3, 4]. Infertility and its treatment can cause psychological distress and add to the stressors in life [5, 6]. Furthermore, infertility affects a couple’s marital quality, sexual relationships, psychological wellbeing, and quality of life [7,8,9,10]. Infertility has become an important public health and social problem in China [2, 3, 11].

Health-related quality of life (HRQoL) is a multi-dimensional concept that covers an individual’s physical, psychological, and social functioning and material state, and that represents the patient’s overall perception of the impact of an illness and its treatment [12]. HRQoL can be assessed using generic or disease-specific instruments. Preference-based HRQoL instruments in particular have become an increasingly important outcome measure in cost-utility analysis (CUA), a particular form of economic evaluation [13]. Based on the literature review, most previous studies measured the HRQoL of infertility patients using generic or disease-specific instruments, such as the Medical Outcomes Study 36-Item Health Survey (SF-36) and Fertility Quality of Life (FertiQoL) [14, 15], while no studies used preference-based measures among infertility patients.

Accurately measuring health state utilities plays a key role in CUA to ensure optimal health resource allocation [13]. A systematic review (of studies published before July 2018) concluded that although the quality of life and wellbeing of people having or having had fertility problems were reported, “none of the studies reported outcomes relevant for cost-utility studies” [16]. Since then, one study in the Netherlands elicited health state utilities for infertility and subfertility using the time trade-off (TTO) method in a sample of the general public recruited from an online panel company [17]. To the best of our knowledge, this is the first study to evaluate health state utilities among Chinese infertility patients.

Subjective wellbeing (SWB) is a measure of the overall ‘wellness’ of an individual. It is a broad category of phenomena that includes people’s emotional responses, domain satisfaction (e.g., health, work, social relationships), and global judgements of life satisfaction [18, 19]. Infertility is not only a health problem: having fertility problems is negatively associated with quality of life and wellbeing [16], and infertile couples who fail to conceive face pressure from family members and the community [10, 20]. Among infertile individuals, women usually have poorer HRQoL scores than men [21]. In the Chinese cultural setting, infertile couples are under greater psychosocial pressure; in particular, women are more likely than men to be blamed for the inability to conceive [8, 11].

It has been proposed that HRQoL instruments fail to capture SWB losses in some diseases [22, 23], and that SWB should also be considered in health resource allocation [19]. Consequently, an increasing number of studies have investigated the relationship between health state utilities and SWB in patients with different diseases [24,25,26]. However, there is no evidence on the relationship between health state utilities and SWB in infertility patients.

This study aimed to investigate the health state utilities and SWB of infertility patients in China, and to evaluate the relationship between generic HRQoL and SWB instruments in infertility patients.

Methods

Participants and data collection

This study was conducted in the Hospital for Reproductive Medicine Affiliated to Shandong University between April 2019 and November 2019. The participants were diagnosed with infertility, either primary or secondary. Couples with primary infertility are those who have never been diagnosed with a clinical pregnancy, and couples with secondary infertility are those who are unable to establish a clinical pregnancy but have previously been diagnosed with one [1]. To ensure the accuracy of patients’ diagnosis information, clinical diagnoses were obtained from the hospital information system. Informed consent was obtained from all participants after a detailed explanation of the study. The survey was administered face-to-face by interviewers from Shandong University, who explained the purpose of the survey and how to fill in the questionnaire. Participants then completed the questionnaire on their smartphones. When participants did not understand the questionnaire, the interviewer provided an explanation. The exclusion criteria were as follows: (1) being younger than 18 years old at the time of the survey, (2) being unwilling to give informed consent, or (3) lacking a clear clinical diagnosis of infertility, or having other gynecological diseases such as premature ovarian failure or abnormal uterine bleeding.

Sample size calculation

The study was powered based on the uncertainty around the health state utility estimates, using Eq. (1) [27]:

$$n = \frac{\sigma^{2}}{(\omega /1.96)^{2}}. \tag{1}$$

According to the previous study [17], the standard deviation (σ) was assumed to be 0.25 in this study. The margin of error (ω) can be estimated as half the width of the 95% confidence interval (CI); the previous study estimated the utility of Dutch primary infertility patients at 0.792 (95% CI 0.771, 0.813) [17]. Using Eq. (1) with σ = 0.25 and ω = 0.02 gave a required sample size (n) of 600 infertility patients [27]. Furthermore, allowing for a loss rate of 20%, we aimed to recruit at least 720 participants.
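The arithmetic behind these figures can be reproduced with a short sketch. The function name `sample_size` is ours, not the paper's, and the 20% allowance is applied as a simple inflation of the target (600 × 1.2), which is the reading that matches the stated 720.

```python
import math


def sample_size(sigma: float, omega: float, z: float = 1.96) -> float:
    """Eq. (1): n = sigma^2 / (omega / z)^2."""
    return sigma ** 2 / (omega / z) ** 2


# Margin of error: half the width of the reported 95% CI (0.771, 0.813).
half_width = (0.813 - 0.771) / 2          # about 0.021, rounded to 0.02 in the text

n = sample_size(sigma=0.25, omega=0.02)   # (0.25 * 1.96 / 0.02)^2 = 24.5^2
n_target = round(n)                       # target number of respondents

# Assumption: the 20% loss allowance inflates the target by a factor of 1.2.
n_recruit = math.ceil(n_target * 120 / 100)

print(f"n = {n:.2f}, target = {n_target}, recruit at least {n_recruit}")
```

With the values in the text, Eq. (1) gives 600.25, which rounds to the stated 600 and, after inflation, to 720.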

Instruments

The research used two generic preference-based HRQoL measures and one SWB measure: the five-level EQ-5D (EQ-5D-5L) questionnaire, the Assessment of Quality of Life (AQoL)-8D, and the WHO-5 wellbeing index (WHO-5). The self-completed survey also collected the socio-demographic background of the respondents.

EQ-5D-5L

The EQ-5D-5L is an updated version of the most widely used three-level EQ-5D (EQ-5D-3L) instrument [28]. It has demonstrated reduced ceiling effects and improved sensitivity in comparison to the EQ-5D-3L [29, 30]. The EQ-5D-5L consists of five dimensions (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression) and a stand-alone Visual Analog Scale (VAS), with each dimension having five response levels: no problems, slight problems, moderate problems, severe problems, and unable to/extreme problems [28]. The VAS with anchor points 0 (‘worst imaginable health state’) and 100 (‘best imaginable health state’) was used. The Chinese version of the EQ-5D-5L descriptive system was adopted [31]. A previous study demonstrated the measurement equivalence of the English and Chinese versions of the EQ-5D-5L questionnaire [32], and the EQ-5D-5L has been widely used in both the general public and disease populations [33]. This study used the Chinese-specific EQ-5D-5L value set [34].

AQoL-8D

The Assessment of Quality of Life (AQoL)-8D is one of the most comprehensive preference-based HRQoL instruments, and it was developed to achieve increased sensitivity in psychosocial dimensions of health [35]. The AQoL-8D contains 35 items grouped into eight dimensions and defines 2.37 × 10^23 possible health states [35, 36]. Three of these dimensions (independent living, pain, senses) can be combined to create a physical super-dimension, and the other five dimensions (mental health, happiness, coping, relationships, and self-worth) can be combined to create a psychosocial super-dimension [35]. Given that it has more psychosocial dimensions, it could be a better measure of the HRQoL of infertility patients. The AQoL instruments have been used to measure HRQoL in the Chinese population [37, 38]. The Chinese version of the AQoL-8D was used; because no Chinese-specific value set is available, the original scoring algorithm incorporating Australian preference weights was used [39].

WHO-5

The WHO-5 is a 5-item measure designed to evaluate emotional and psychological wellbeing [40]. The degree to which these feelings were present in the last 14 days is scored on a 6-point Likert-type scale ranging from 0 (“at no time”) to 5 (“all of the time”). The raw score is calculated by summing the scores of the five items. The raw total score ranges from 0 to 25, with 0 representing the worst possible and 25 representing the best possible wellbeing; a total score below 13 indicates poor wellbeing and is an indication for testing for depression under ICD-10 [41]. The Chinese version of the WHO-5 was used in this study [41]. The WHO-5 has been applied in a wide range of study fields and is among the most widely used questionnaires assessing subjective psychological wellbeing [42].
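The scoring rule described above can be sketched as follows. The function names and the example responses are hypothetical; only the 0 to 5 item range, the summed raw score, and the cut-off of 13 come from the text.

```python
from typing import Sequence


def who5_raw_score(items: Sequence[int]) -> int:
    """Sum the five WHO-5 item scores (each 0 to 5) into a raw score of 0 to 25."""
    if len(items) != 5:
        raise ValueError("WHO-5 has exactly five items")
    if any(not 0 <= x <= 5 for x in items):
        raise ValueError("each WHO-5 item is scored from 0 to 5")
    return sum(items)


def poor_wellbeing(raw_score: int, cutoff: int = 13) -> bool:
    """A raw score below the cut-off (13 in the text) flags poor wellbeing."""
    return raw_score < cutoff


example = [3, 2, 3, 2, 2]            # hypothetical responses, not study data
score = who5_raw_score(example)
print(score, poor_wellbeing(score))
```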

Data analysis

Descriptive statistics were presented as mean (standard deviation, SD) or median (95% CI) for continuous variables and frequency (%) for categorical variables. Normality was assessed using the Shapiro-Wilk test. The nonparametric Kruskal-Wallis test was used to compare scores across diagnosis and socio-demographic sub-groups. Because the EQ-5D-5L utility score exhibited a ceiling effect, with a large proportion of respondents in full health (utility score of 1), we created a dummy variable indicating whether or not respondents scored full health and used a logit model to study the factors associated with EQ-5D-5L scores. Ordinary least squares (OLS) regression was used to assess the factors associated with AQoL-8D and WHO-5 scores.
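As a minimal illustration of the dichotomization step, the sketch below builds the full-health dummy and computes an unadjusted odds ratio from a 2 × 2 table. This is a simplification rather than the study's model: with a single binary predictor, the logit odds ratio equals the cross-product ratio, whereas the study's logit model also controlled for socio-demographic characteristics. All data and names here are hypothetical.

```python
def full_health_dummy(utilities):
    """Return 1 for respondents in full health (utility of 1), otherwise 0."""
    return [1 if u >= 1.0 else 0 for u in utilities]


def odds_ratio(outcome, group):
    """Cross-product odds ratio of outcome = 1 for group = 1 versus group = 0.

    Raises ZeroDivisionError if an off-diagonal cell of the 2 x 2 table is empty.
    """
    a = b = c = d = 0
    for y, g in zip(outcome, group):
        if g == 1 and y == 1:
            a += 1          # group 1, full health
        elif g == 1:
            b += 1          # group 1, not full health
        elif y == 1:
            c += 1          # group 0, full health
        else:
            d += 1          # group 0, not full health
    return (a * d) / (b * c)


# Hypothetical utilities and diagnosis indicator (1 = secondary infertility).
utilities = [1.0, 1.0, 1.0, 0.95, 1.0, 1.0, 0.90, 0.80]
secondary = [1, 1, 1, 1, 0, 0, 0, 0]

full_health = full_health_dummy(utilities)
unadjusted_or = odds_ratio(full_health, secondary)
print(full_health, unadjusted_or)
```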

This study compared the psychometric properties of the AQoL-8D and EQ-5D-5L scores in evaluating HRQoL among infertility patients. Floor or ceiling effects were considered to be present if more than 15% of the respondents achieved the lowest or highest possible score, respectively [43, 44]. The agreement between the two instruments was assessed using the Bland-Altman plot and the intraclass correlation coefficient (ICC), with an ICC > 0.7 indicating strong agreement [45]. The sensitivity of the instruments in distinguishing between diagnoses of infertility patients was studied using Cohen’s effect size, according to the following cut-offs: Cohen’s d < 0.2 = small; 0.2 < Cohen’s d < 0.5 = moderate; 0.5 ≤ Cohen’s d < 0.8 = strong; Cohen’s d ≥ 0.8 = large [46].
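The three quantities named in this paragraph can be sketched with the standard library alone. The text does not say which variant of Cohen’s d or of the limits of agreement was used, so the pooled-SD form of d and the usual mean ± 1.96 SD limits are assumptions here, and the utility values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def ceiling_share(scores, best):
    """Proportion of respondents at the highest possible score."""
    return sum(1 for s in scores if s >= best) / len(scores)


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation (assumed variant)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * stdev(group_a) ** 2
                  + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def bland_altman(x, y):
    """Mean difference and 95% limits of agreement (mean +/- 1.96 SD)."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    half_range = 1.96 * stdev(diffs)
    return bias, bias - half_range, bias + half_range


# Hypothetical paired utilities, not study data.
eq5d = [1.0, 1.0, 0.95, 0.90, 1.0, 0.85]
aqol = [0.90, 0.80, 0.75, 0.70, 0.85, 0.60]

share = ceiling_share(eq5d, best=1.0)
has_ceiling_effect = share > 0.15            # 15% threshold from the text
bias, lower, upper = bland_altman(eq5d, aqol)
print(share, has_ceiling_effect, round(bias, 3), round(lower, 3), round(upper, 3))
```

The width of the limits of agreement, `upper - lower`, corresponds to what the Results call the range of the 95% LOA, under the assumption stated above.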

To investigate the relationships between HRQoL and SWB measures, Spearman’s rank correlation coefficients were estimated. The strength of the correlation (r) was interpreted as follows: r > 0.7 indicates strong; 0.3 < r < 0.7 indicates moderate; r < 0.3 indicates weak [47]. Lastly, this study explored the complementary or substitute relationship between generic HRQoL and SWB instruments in infertility patients. Exploratory factor analysis (EFA) was conducted to examine the differences in the descriptive systems of the three instruments, using item-level responses to the HRQoL and SWB instruments. EFA was used to ascertain the number of unique underlying latent factors that were associated with the items covered by the three instruments [48]. Despite the different conceptual origins of the instruments, using EFA to explore empirically whether different instruments measure similar content is a commonly adopted strategy [49,50,51]. A significant Bartlett’s test of sphericity (p < 0.05) and a Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy of ≥ 0.50 were considered to indicate that EFA was appropriate [52]. Because all three instruments (AQoL-8D, EQ-5D-5L, and WHO-5) are scored on categorical scales, items were analyzed as ordinal information [53]. The EFA was estimated using the maximum likelihood method, the number of factors to be extracted was determined according to parallel analysis based on minimum rank factor analysis (PA-MRFA) [53], and promax rotation was used to obtain the rotated factor loadings. Pearson correlation coefficients were used to examine the extent of the relationship between factors.
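The Spearman coefficient and the interpretation bands at the start of this paragraph can be sketched in plain Python as Pearson’s correlation computed on average ranks. Two points are assumptions: the bands are applied to the absolute value of r, and a value of exactly 0.3 or 0.7, which the stated bands leave open, is placed in the moderate band.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson's r on the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def strength(r):
    """Interpretation bands from the text, applied to |r| (assumption)."""
    a = abs(r)
    if a > 0.7:
        return "strong"
    if a >= 0.3:                         # boundary handling is an assumption
        return "moderate"
    return "weak"


print(strength(0.625), strength(0.262))
```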

Except for EFA which was conducted using FACTOR 12.03.02 software for Windows [54], all other statistical analyses were conducted using STATA version 14.1.

Results

Participants’ socio-demographic characteristics

A total of 732 patients initially agreed to participate in this study. Among them, 49 patients had missing or incorrect medical record numbers, and 65 patients were non-infertility patients or had missing diagnoses in the hospital information system. Finally, we analyzed a valid sample of 618 infertility patients (84.4%). The average time to complete the questionnaire was 11.4 min. Table 1 presents the characteristics of the participants. About 83.2% of the participants were female. More than one half (53.9%) of the patients had primary infertility. The mean age of the participants was 31.6 years (SD: 4.8). More than one half (51.9%) of the participants had a university degree or above. About 68.5% of the participants were urban employees. Annual household income was more than 150,000 Chinese Yuan for 8.4% of the participants and less than 30,000 Chinese Yuan for 29.6%.

Table 1 Characteristics of participants (N = 618)

Participants’ HRQoL and SWB

The mean total scores for the EQ-5D-5L, AQoL-8D, and WHO-5 were 0.96 (95% CI 0.96, 0.96), 0.80 (95% CI 0.79, 0.81), and 16.92 (95% CI 16.52, 17.31), respectively. The distribution of scores for each of the three instruments is plotted in Fig. 1. Scores were left-skewed for all three instruments, and the Shapiro-Wilk test rejected the null hypothesis of a normal distribution. All three instruments found that patients with primary infertility had lower scores than those with secondary infertility; the differences were statistically significant for EQ-5D-5L (OR = 1.515), AQoL-8D (β = 0.028), and WHO-5 (β = 1.528) after controlling for socio-demographic characteristics. Furthermore, males tended to have higher HRQoL and SWB than females, but the difference was statistically significant only for the AQoL-8D. More details on sub-group comparisons and regression analysis can be found in Supplementary Tables 1 and 2.

Fig. 1 Distributions of EQ-5D-5L, EQ-VAS, AQoL-8D, and WHO-5

Psychometric properties of EQ-5D-5L and AQoL-8D

Between the two generic preference-based HRQoL instruments, EQ-5D-5L showed a higher ceiling effect, with 47.7% of participants reporting being in full health (i.e., utility = 1), whereas the ceiling effect of AQoL-8D was 2.4% (Table 2). The detailed frequency of responses to EQ-5D-5L dimensions is shown in Supplementary Table 3. Among the five dimensions, the proportion of participants reporting anxiety/depression problems was the highest (46.0%), followed by pain/discomfort (19.3%); for the remaining three dimensions, more than 98% of respondents reported no problems.

Table 2 Comparison of the EQ-5D-5L, EQ-VAS, AQoL-8D, and WHO-5

EQ-5D-5L and AQoL-8D showed poor absolute agreement in this study, with an ICC of 0.14 (95% CI -0.07, 0.34). Bland-Altman plots (Fig. 2) further showed that, between the two health state utility instruments, the range of the 95% limits of agreement (LOA) was 0.48. The mean difference between the EQ-5D-5L and AQoL-8D was 0.16 (95% CI 0.15, 0.17).

Fig. 2 Bland-Altman plots of comparison among EQ-5D-5L and AQoL-8D utilities

AQoL-8D indicated a moderate effect size between diagnosis groups and between genders of infertility patients (Cohen’s d = 0.32 and 0.30, respectively), whereas EQ-5D-5L indicated a small effect size (Cohen’s d = 0.22 and 0.18, respectively). Furthermore, AQoL-8D indicated a large effect size (Cohen’s d = 1.44) between poor wellbeing (WHO-5 scores < 13) and high wellbeing (WHO-5 scores ≥ 13), which was higher than that of EQ-5D-5L (Cohen’s d = 0.67). These results indicated that the AQoL-8D is more sensitive than the EQ-5D-5L in distinguishing HRQoL across different characteristics of infertility patients.

Relationships between HRQoL and SWB measures

Table 3 reports Spearman’s correlation coefficients between the WHO-5 and two HRQoL instruments. The AQoL-8D (r = 0.625) was more strongly correlated with WHO-5 than the EQ-5D-5L (r = 0.262). Among five EQ-5D-5L dimensions, pain/discomfort (r = -0.165) and anxiety/depression (r = -0.301) were significantly correlated with WHO-5 (both p < 0.01). For two super dimensions in AQoL-8D, the super psychosocial dimension (r = 0.630) was more strongly correlated with WHO-5 than the super physical dimension (r = 0.416).

Table 3 Spearman’s correlation coefficients between the WHO-5 and two HRQoL instruments

The KMO was 0.940 for the pooled AQoL-8D, EQ-5D-5L, and WHO-5 items, and the Bartlett’s test of sphericity statistic was 6921.2 (p ≤ 0.001), suggesting that the data were appropriate for EFA [52]. The EFA based on the WHO-5, EQ-5D-5L, and AQoL-8D items is presented in Table 4. Three factors were extracted based on the parallel analysis, and their correlations ranged from 0.416 (between factors 2 and 3) to 0.643 (between factors 1 and 2) (Supplementary Table 4). The degree of overlap was large when comparing the EQ-5D-5L with the AQoL-8D: the five dimensions of the EQ-5D-5L shared two common factors with the AQoL-8D (factor 1 and factor 3; Table 4). Based on item loadings, these two common factors can be described as reflecting aspects of the psychosocial dimension (factor 1) and the physical dimension (factor 3). All five WHO-5 items loaded on factor 2, a factor that was not shared by any EQ-5D-5L or AQoL-8D items (Table 4). The EFA result indicated that the two HRQoL instruments (EQ-5D-5L/AQoL-8D) and the WHO-5 are complementary measures, as all five WHO-5 items were grouped into a standalone factor.

Table 4 Exploratory factor analysis comparing the WHO-5 and two HRQoL instruments

Discussion

This study evaluated Chinese infertility patients’ HRQoL and SWB based on EQ-5D-5L/AQoL-8D and WHO-5, respectively. This study demonstrated the psychometric properties of generic HRQoL and SWB instruments in infertility patients, as well as the complementary relationship between HRQoL and SWB instruments.

The mean score for the total infertility patients was 0.96 (SD: 0.05) based on EQ-5D-5L and was almost the same as the norm of the Chinese urban population (0.957, SD: 0.069) [55]. The mean score of Chinese primary infertility patients was 0.78 (95%CI 0.76, 0.79) based on the AQoL-8D, which was similar to the Dutch primary infertility patients’ mean utility value of 0.79 (95%CI 0.77, 0.81) based on TTO [17]. Furthermore, this study found that patients diagnosed with primary infertility had significantly lower HRQoL and SWB than patients diagnosed with secondary infertility, which was consistent with previous studies [56]. Existing research showed that primary infertility patients were more likely to suffer from greater levels of distress and depression than secondary infertility patients [57]. Women with primary infertility reported greater sensitivity to comments about their childlessness, and they experienced greater levels of fertility-related social concern (e.g., sense of social isolation or alienation) and decreased enjoyment of sex [58, 59].

This study found that the mean utility values tended to be higher for males (0.97/0.83) than for females (0.96/0.79) based on the EQ-5D-5L and AQoL-8D, respectively. Previous empirical research has shown that infertile males’ coping ability and psychological adjustment were better than those of infertile females [60]. The literature review has also shown that infertility has a more intense impact on women’s (health-related) quality of life than on men’s [21]; Chinese women in particular appear to bear more of the blame for infertility [8, 11].

With regard to the psychometric properties of the two generic preference-based HRQoL instruments, poor absolute agreement was found. In particular, the EQ-5D-5L had a very high “ceiling effect” (47.7%). Among the five dimensions, 46.0% of participants reported anxiety/depression problems, which was higher than the Chinese urban population norm (26.85%) [55]. A possible reason is that infertility has negative effects on the psychological wellbeing and sexual relationships of couples [10]. Previous studies have also reported that infertility was likely to influence family and marital relationships in Chinese culture [8, 11]. Although the EQ-5D-5L is the most frequently used health state utility instrument in economic evaluations, it lacks items about psychosocial health [61] and is not sensitive in some diseases [62]. By comparison, the AQoL-8D has 5 dimensions and 25 items related to psychosocial health [23, 35]. This difference may also explain why a much lower correlation was found between the EQ-5D-5L and WHO-5 than between the AQoL-8D and WHO-5.

Regarding the relationship between HRQoL and SWB in infertility patients, the EFA showed that the items of the SWB instrument loaded on a separate factor that was distinct from those characterizing the HRQoL instruments. This result indicated that the two HRQoL instruments (EQ-5D-5L/AQoL-8D) and the WHO-5 are complementary rather than substitutable. HRQoL generally picks up changes in certain health-related domains and focuses on deficits in functioning (e.g., pain). However, these domains may fail to pick up the broader impacts of healthcare on patients’ lives [19, 63]. Infertility is not just a reproductive dysfunction; it also leads to psychological problems and influences psychological wellbeing [16]. Recent studies have reported that Chinese women undergoing frozen embryo transfer and patients with repeated implantation failure have poor psychological status and quality of life [64, 65]. SWB covers a wider range of domains of patients’ lives, and health is one of its most important determinants [63]. In traditional Chinese thinking, childbearing is a natural and essential part of married life, and children are an important part of maintaining family stability [66, 67]. Previous studies have shown that infertility usually affects patients’ SWB and family happiness, especially under the Chinese concept of family inheritance [68, 69]. SWB can help provide a more complete picture of the effects of healthcare [19]. Previous studies showed that generic instruments (e.g., SF-36) were mostly used for assessing HRQoL in infertile couples, whereas disease-specific instruments (e.g., FertiQoL) were rarely used [14, 15]. Disease-specific instruments have been shown to be valid measures for the evaluation of infertility problems and their treatment effects [14]. Further research using SWB and disease-specific instruments is needed to measure infertility patients’ health outcomes comprehensively.

Our study has some limitations that deserve mention. Firstly, although all patients’ diagnoses were verified against the hospital information system to ensure accuracy, limited clinical information was collected in this study. Consequently, potential comorbidities were not included in this study. Future studies could validate the findings of this paper. Secondly, the scoring algorithm of the AQoL-8D is based on preferences from Australians. However, empirical evidence from the literature suggests that using a country-specific scoring algorithm has only a minor impact on the results [70]. Thirdly, this study was conducted in one hospital, so the sample may not be representative of the Chinese infertility population. However, owing to its reputation, this hospital attracts infertility patients from other provinces of China. Finally, given the cross-sectional study design, we were not able to explore the responsiveness of the different instruments in infertility patients during treatment or after successful pregnancy.

Conclusion

Patients diagnosed with primary infertility had significantly lower HRQoL and SWB than patients diagnosed with secondary infertility. Female infertility patients also tended to have poorer HRQoL than males. Poor agreement was found between the two preference-based HRQoL instruments in infertility patients, and the psychosocial health component may explain the difference. The AQoL-8D, which includes more psychosocial items, could be a better instrument than the EQ-5D-5L for measuring HRQoL, although both are complementary to SWB as measured by the WHO-5. More research is needed to explore HRQoL and SWB among infertility patients in China.