Background

Pregnancy is a natural process that constitutes an important period of a woman’s life [1]. This process is characterized by physical, cognitive, emotional, sociocultural and anatomical changes in women that maintain fetal growth and development and prepare the mother for birth [2]. The signs and symptoms that occur due to these changes are physiological and should not be confused with diseases. Insufficient adaptation to the changes in pregnancy has negative consequences for the quality of life of pregnant women [3].

The World Health Organization (WHO) defines quality of life as “a person’s perception of his/her position in life within the context of the culture and value systems in which he/she lives and in relation to his/her goals, expectations, standards, and concerns” [4]. This definition portrays quality of life as a concept that encompasses a number of states of existence. Within this context, health, together with life satisfaction and well-being, is one of the most valuable qualities of existence [5, 6]. As the first stage of human existence, pregnancy should be considered a special period during which life satisfaction and well-being should be increased.

Prenatal care, which includes interventions to improve the quality of care for pregnant women and newborns, may increase positive pregnancy experiences. Previous studies have suggested that new, period-specific instruments, rather than the existing methods, may be beneficial for measuring positive pregnancy experiences [7].

International human rights law includes fundamental commitments of States to enable women to survive pregnancy and childbirth as part of their enjoyment of sexual and reproductive health rights and living a life of dignity [8]. The WHO envisions a world in which every pregnant woman and newborn receives high-quality healthcare during the perinatal period [9]. However, the WHO reported that about 295,000 women around the world died during and following pregnancy and childbirth in 2017. Moreover, 94% of these maternal deaths occurred in low-resource settings and most could have been prevented [10]. Similarly, the global number of stillborn babies in 2015 was about 2.6 million [11]. These statistics reveal the need for healthcare interventions to reduce or prevent comorbidities and complications related to pregnancy and the postpartum period. Women’s subjective perceptions of health and quality of life are key to measuring the quality and effectiveness of maternal and child interventions.

Quality of life has both objective and subjective indicators. Objective indicators include daily life activities, self-care and satisfaction whereas subjective indicators are related to how people feel [6]. Various scales are used to evaluate quality of life in health. Instruments measuring health-related quality of life (HRQOL) may vary according to the target population. These instruments may be grouped into generic and specific instruments. Generic instruments aim to measure HRQOL in general across various groups. While these instruments can give an idea of overall conditions, they are not sufficient to detect clinically significant changes following treatment or interventions. Specific instruments, on the other hand, are necessary for specific fields and hypotheses or when generic instruments cannot measure a particular field of study [12].

Generic instruments are frequently used to assess HRQOL [13]. Although SF-36 is the most frequently used instrument to measure HRQOL during pregnancy [14,15,16,17], other generic instruments, such as SF-12 [18,19,20], WHOQOL-BREF [21] and Nottingham Health Profile [22] are also used [13]. General measurement tools help to form an idea of the overall situation.

Although there is great interest in the use of HRQOL instruments before, during and after pregnancy [14,15,16,17,18,19,20,21,22], the domains covered by the existing instruments and their psychometric properties are a matter of debate. There are a number of problems with using HRQOL instruments during pregnancy. The first problem is related to the conceptualization of HRQOL. While some scholars do not propose a definition of HRQOL [16, 23], others use more than one definition of the concept [24]. These problems may pose a risk to the validity and interpretability of the studies. Lack of a clear conceptualization may cause problems with operationalization and with the choice of instruments for measuring the concept. Additionally, data obtained from studies without a clear conceptualization of HRQOL may not provide the necessary information for clinicians, policy makers and other stakeholders. Secondly, instrument development should not be based solely on theory. Rather, the opinions of experts and the target population should be taken into consideration and evidence from relevant fields should be used to increase the validity of the instruments [5, 13]. The final problem is that the generic instruments used to measure the quality of life during pregnancy have not been validated in pregnant women [13, 14, 22]. For this reason, there is not enough evidence on the psychometric properties of instruments used during pregnancy.

The majority of the studies on HRQOL in pregnant women either include only general psychometric properties [14,15,16,17,18,19,20,21,22] or focus on specific problems during pregnancy [25]. Evaluation of period-specific HRQOL has become more important for investigating the effectiveness of preventive and therapeutic programs developed for women in the pregnancy and postpartum periods. Valid, reliable and clinically usable HRQOL instruments with higher sensitivity and specificity are needed to appropriately measure the effectiveness of interventions during the pregnancy and postpartum periods [6, 13].

This study developed an instrument to measure HRQOL in pregnant women using an extensive literature review and expert opinions. This new instrument may be used by nurses, especially obstetrics and gynecology nurses, midwives and other health professionals, who have a critical role in providing healthcare to pregnant women, planning and implementing necessary interventions and evaluating health outcomes.

Methods

This instrument development study aimed to develop the Quality of Life in Pregnancy Scale (PREG-QOL) as a new instrument and to test its content and construct validity, factor structure and reliability.

Study design

The PREG-QOL was developed in three stages: (1) creating an item pool, (2) preliminary evaluation of items, and (3) refining the scale and evaluating psychometric properties. Instrument development guidelines proposed by DeVellis (2021) and Carpenter (2018) were used to develop the PREG-QOL (Fig. 1).

Fig. 1 Stages of the development of PREG-QOL

Phase 1. Generating the item pool

An extensive literature review and interviews were used to prepare the draft instrument. Firstly, the key words “pregnancy”, “quality of life”, “quality of life in/during pregnancy” and “HRQOL” were searched for in the online databases MEDLINE/PubMed, Scopus, Web of Science and Google Scholar to review the literature on the factors related to HRQOL during pregnancy. Frequently used expressions about HRQOL in pregnancy were used to develop the items. In the next stage, we conducted in-depth face-to-face interviews with 20 pregnant women using the semi-structured interview method. The interviews revolved around pregnancy, the physical, social and emotional changes during pregnancy and the relationship between spouses. Qualitative data collected during the interviews were analyzed using the content analysis method. The item pool was generated based on the interviews and the guidelines suggested by the literature [26, 27].

During the generation of the item pool, the statements of the participants were first reviewed by the researchers independently. Key expressions directly related to the quality of life in pregnancy were identified and their implicit and explicit meanings were analyzed. Expressions related to opinions and feelings about the quality of life during pregnancy were used to develop the items. Secondly, the researchers held five meetings to decide on the items in the pool. The definition “situations that influence the quality of life during pregnancy” was used to identify the items to be included in the item pool. Two experts in obstetrics and women’s health and one expert in the field of measurement and evaluation attended two of the meetings. Initially, 64 items were developed. Some of the items were revised in line with expert opinion and the number of items was reduced to 44. While developing the items, ambiguous, negatively worded and directive expressions were avoided and the items were worded in a clear and concise way. A 5-point Likert scale was used to score the items.

Phase 2. Preliminary evaluation of the items

Expert opinion

Expert opinion was obtained from 11 experts in the fields of obstetrics and gynecology nursing, psychiatric nursing, measurement and evaluation, statistics and linguistics. The Davis method was used to analyze content validity. The item-content validity index (I-CVI) for each item and the scale-content validity index (S-CVI) for the total instrument were calculated [28]. According to this method, I-CVI and S-CVI should be at least 0.80 in newly developed instruments [28]. Expert opinion revealed that I-CVI was higher than 0.80 for every item, S-CVI was 0.98, and 10 items had similar meanings. Although similar expressions increase the number of items and the reliability of the instrument, they have negative consequences for the primary aim of developing the instrument since they produce better results compared to other items [29]. For this reason, these items were removed from the scale, leaving a draft instrument of 34 items.
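The content validity indices described above can be illustrated with a minimal Python sketch. It assumes that each expert rates each item on a 4-point relevance scale and that ratings of 3 or 4 count as "relevant", and it computes S-CVI by averaging the I-CVI values; neither the rating scale nor the S-CVI variant is stated in the text, and the ratings below are hypothetical.

```python
from typing import Sequence


def item_cvi(ratings: Sequence[int], relevant_from: int = 3) -> float:
    """I-CVI: share of experts who rated the item as relevant.

    Assumption: a 4-point relevance scale where 3 and 4 count as relevant.
    """
    if not ratings:
        raise ValueError("at least one expert rating is required")
    agree = sum(1 for r in ratings if r >= relevant_from)
    return agree / len(ratings)


def scale_cvi_average(item_cvis: Sequence[float]) -> float:
    """S-CVI by the averaging approach (assumed): mean of the I-CVI values."""
    if not item_cvis:
        raise ValueError("at least one I-CVI value is required")
    return sum(item_cvis) / len(item_cvis)


# Hypothetical ratings from 11 experts for two items.
ratings_item_a = [4, 4, 3, 4, 4, 3, 4, 4, 4, 3, 4]  # 11 of 11 rate it relevant
ratings_item_b = [4, 3, 4, 4, 2, 4, 3, 4, 4, 4, 3]  # 10 of 11 rate it relevant

icvi_a = item_cvi(ratings_item_a)
icvi_b = item_cvi(ratings_item_b)
s_cvi = scale_cvi_average([icvi_a, icvi_b])

print(round(icvi_a, 2), round(icvi_b, 2), round(s_cvi, 2))
```

With 11 experts, an item endorsed by 10 of them has an I-CVI of 10/11, or about 0.91, which is above the 0.80 threshold cited in the text.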

Pilot study

The draft instrument was administered to pregnant women during face-to-face interviews. Participants were asked to first complete the draft instrument and then evaluate it in terms of comprehensibility, readability and content of responses. They were also asked to comment on difficulties in completing the scale and to offer suggestions for improvement, including specifying any additional item statements they felt were missing or items that should be deleted [30, 31]. The participants reported that three items were difficult to understand, so these items were reformulated after consulting experts in measurement and evaluation and in linguistics. The final version of the instrument had 34 items.

Phase 3. Refining the PREG-QOL and evaluating psychometric properties

Item reduction

The performance of each item was evaluated by computing the corrected item-total correlation coefficient [32].
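As an illustration of this step, the following Python sketch computes corrected item-total correlations, that is, the Pearson correlation between each item and the sum of the remaining items. The response matrix is hypothetical, and the 0.30 cut-off for flagging items is the one reported in the Results.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def corrected_item_total(responses: Sequence[Sequence[float]]) -> List[float]:
    """responses[r][i] is the answer of respondent r to item i.

    Each item is correlated with the total score of all other items,
    so the item itself does not inflate the correlation.
    """
    n_items = len(responses[0])
    result = []
    for i in range(n_items):
        item = [row[i] for row in responses]
        rest = [sum(row) - row[i] for row in responses]
        result.append(pearson(item, rest))
    return result


# Hypothetical answers of five respondents to four 5-point items.
data = [
    [4, 5, 4, 2],
    [2, 3, 2, 4],
    [5, 4, 5, 1],
    [3, 3, 2, 3],
    [1, 2, 1, 5],
]

correlations = corrected_item_total(data)
# Item numbers (1-based) at or below the 0.30 cut-off are flagged for review.
flagged = [i + 1 for i, r in enumerate(correlations) if r <= 0.30]
print([round(r, 2) for r in correlations], flagged)
```

In this made-up data set the fourth item runs against the others, so it is the only one flagged.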

Construct validity

Exploratory factor analysis (EFA) (n = 350) and confirmatory factor analysis (CFA) (n = 150) were performed to evaluate construct validity [33]. The Kaiser-Meyer-Olkin (KMO) coefficient and Bartlett’s test of sphericity were used to evaluate the suitability of the data before performing EFA [34]. The number of factors was determined by the scree plot and an eigenvalue greater than 1. Principal component analysis (PCA) and direct varimax rotation were used to determine the factor structure of the PREG-QOL with EFA. PCA was preferred in EFA since it is one of the best prediction methods that can capture the highest variance and identify key factors by simplifying complex data [35]. A strong factor structure was obtained by eliminating cross-loading items, items with a factor loading lower than 0.30 or an eigenvalue lower than 0.30, and items with a difference between factor loadings lower than 0.10. A factor loading over 0.30 is a sufficient criterion for the inclusion of an item in the instrument [29]. For this reason, 0.30 was set as the cut-off value.

The factorial structure revealed by EFA was tested with CFA. Fit indices were evaluated to obtain a better model. The chi-square goodness of fit test, goodness-of-fit index (GFI), root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR) and comparative fit index (CFI) were computed. After CFA, the parallel test method was used to evaluate the correlation between the PREG-QOL and the SF-36 scale.

Reliability

Internal consistency of the scale was evaluated with Cronbach’s α coefficient. The test-retest method was used to assess the stability of the scale over time. Fifteen days after the first administration, the PREG-QOL was re-administered to 40 pregnant women with characteristics similar to the sample, and the intraclass correlation coefficient (ICC) was calculated.
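For readers unfamiliar with the internal consistency coefficient named above, the following minimal Python sketch computes Cronbach’s α from a respondent-by-item matrix using the usual formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The data are hypothetical and the sketch is not the SPSS procedure used in the study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for responses[r][i] = answer of respondent r to item i.

    Uses sample variances (n - 1 in the denominator) throughout.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five respondents to three 5-point items.
data = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 4, 5],
    [3, 3, 2],
    [1, 2, 1],
]

alpha = cronbach_alpha(data)
print(round(alpha, 2))
```

The same function can be applied to the items of a single factor to obtain a factor-level coefficient.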

Setting and sample

Sample size in instrument development studies has been a contested issue and there is no universally accepted view on the required sample size [36]. An inadequate sample size may lead to instability of factors and prevent generalization. A large dataset is required to evaluate the factorial structure of instruments [37]. The general consensus on sample size is 10 participants per item. Sample sizes independent of the number of items have also been proposed. The adequacy of sample size in instrument development studies has been evaluated as 50, very poor; 100, poor; 200, fair; 300, good; 500, very good; and 1000 or more, excellent [38]. A proper sample size is a prerequisite for developing an instrument with strong psychometric properties. Each stage of our study was conducted on a different sample. The numbers of participants in the item pool generation, pilot study, test-retest, item reduction and EFA, and CFA stages were 20, 20, 40, 350 and 150 pregnant women, respectively. The total number of participants was 580. Inclusion criteria for all phases of the study were as follows: willingness to participate in the study, being over 18 years of age, and not having high-risk factors during pregnancy. Potential participants were excluded if they did not complete the questionnaires fully or wanted to withdraw from the study.
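The sample size reasoning above can be checked with a short Python sketch. It assumes that the values listed in [38] act as lower bounds of each adequacy category, which the text does not state explicitly; the stage sizes and the 34-item count are taken from the text.

```python
from typing import Dict, List, Tuple

# Adequacy categories from the text [38]; treating each value as a
# lower bound is an assumption made for this illustration.
ADEQUACY_SCALE: List[Tuple[int, str]] = [
    (1000, "excellent"),
    (500, "very good"),
    (300, "good"),
    (200, "fair"),
    (100, "poor"),
    (50, "very poor"),
]


def adequacy_label(n: int) -> str:
    """Return the label of the highest category whose lower bound n reaches."""
    for lower_bound, label in ADEQUACY_SCALE:
        if n >= lower_bound:
            return label
    return "below the listed scale"


def required_by_item_rule(n_items: int, per_item: int = 10) -> int:
    """Sample size implied by the 10-participants-per-item rule."""
    return n_items * per_item


# Stage sizes as reported in the text.
stage_sizes: Dict[str, int] = {
    "item pool generation": 20,
    "pilot study": 20,
    "test-retest": 40,
    "item reduction and EFA": 350,
    "CFA": 150,
}

total = sum(stage_sizes.values())
needed_for_efa = required_by_item_rule(34)  # 34-item draft instrument
print(total, needed_for_efa, adequacy_label(stage_sizes["item reduction and EFA"]))
```

Under these assumptions, the EFA sample of 350 exceeds the 340 participants implied by the per-item rule for a 34-item draft.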

Data collection

The study was carried out at a research and training hospital of a state university. In-depth interviews were conducted during the generation of item pool. Pregnant women were informed about the aim and scope of the study. Interviews were conducted with voluntary participants at a suitable time and place and were recorded by a tape recorder. Interviews took approximately 25 min.

During the stages in which the factorial structure and psychometric properties of the instrument were evaluated, the researcher visited pregnant women in the obstetrics and gynecology ward, informed them about the aim and scope of the study, and asked voluntary participants to complete the PREG-QOL. The self-administered instrument was completed in about 15–20 min and collected by the researcher. Data were collected between April 2021 and August 2021.

Data analysis

IBM SPSS version 24.0 was used to evaluate the data obtained from the study and AMOS version 26 was used for the CFA. Qualitative data obtained from the interviews were evaluated using content analysis. Number, percentage, mean and standard deviation were used for descriptive analysis. Content validity was evaluated using the Davis (1992) method, and I-CVI and S-CVI were computed. The quality of the items was evaluated using the item-total correlation coefficient. Construct validity was evaluated using EFA, CFA and the parallel test method. The validity of the factors was tested using the chi-square goodness of fit test, GFI, RMSEA, SRMR and CFI. Internal consistency was evaluated using Cronbach’s α coefficient, whereas the test-retest method (ICC) was used to evaluate the stability of the scale over time.

Results

General characteristics of participants

The age and body mass index (BMI) of the participants ranged from 18 to 44 years (mean = 28.1, SD = 5.21) and from 18 to 40 (mean = 28.3, SD = 4.36), respectively. In addition, 38.9% of the participants were in the third trimester, 40.6% had had one pregnancy in total, 40.9% were high school graduates and 63.1% had an income exceeding monthly expenses (Table 1).

Table 1 General characteristics of the participants

Content validity

Using Davis method, expert opinion was obtained from 11 experts in the fields of statistics, linguistics, obstetrics and gynecology nursing, psychiatric nursing, and measurement and evaluation. I-CVI scores ranged from 0.91 to 1.00 whereas S-CVI was 0.98.

Item reduction

Items with an item-total correlation coefficient ≤ 0.30 were re-evaluated [29]. Item-total correlation coefficients of the 34 items ranged from 0.30 to 0.52.

Exploratory factor analysis

The KMO coefficient (0.84) and Bartlett’s test of sphericity (χ2 = 2961.62, df = 325, p < .001) indicated that the data were suitable for factor analysis. The scree plot and the eigenvalue-greater-than-1 criterion were used to determine the number of factors. The scree plot showed that the eigenvalues of six factors were higher than 1 and that the eigenvalues decreased markedly after the sixth factor. These findings indicated that the PREG-QOL had a six-factor structure.
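As a consistency check on the reported degrees of freedom, Bartlett’s test statistic can be written in its commonly used form. The symbols n (sample size), p (number of items) and R (item correlation matrix) are introduced here for illustration and do not appear in the original text.

```latex
\[
\chi^2 = -\left[(n-1) - \frac{2p+5}{6}\right]\ln\lvert R\rvert,
\qquad
df = \frac{p(p-1)}{2}
\]
\[
p = 26 \;\Rightarrow\; df = \frac{26 \times 25}{2} = 325
\]
```

The reported df = 325 therefore corresponds to 26 items, which suggests that the reported statistics refer to the retained 26-item solution rather than to the initial 34 items (for which df would be 561).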

Eight items that cross-loaded or had a factor loading lower than 0.30 and an eigenvalue lower than 0.30 were removed during EFA to obtain a strong factor structure. The EFA, which was recomputed after the removal of these items, revealed a six-factor PREG-QOL with 26 items. Factor loadings of the PREG-QOL ranged from 0.41 to 0.90. The first factor was composed of 10 items (10, 5, 6, 7, 9, 11, 8, 22, 29, 1), the second factor of 4 items (18, 16, 15, 17), the third of 4 items (24, 26, 32, 23), the fourth of 3 items (3, 2, 20), the fifth of 2 items (27, 28), and the sixth of 3 items (13, 14, 34). These six factors explained 56.17% of the total variance. The variances explained by the individual factors were 24.24%, 10.38%, 6.77%, 5.56%, 4.98% and 4.23%, respectively (Table 2).

Table 2 EFA and reliability analysis of PREG-QOL (n = 350)

Confirmatory factor analysis

The six factors and 26 items suggested by the EFA were tested using CFA. Modification indices were evaluated to improve model fit. Bivariate correlations between pairs of items did not exceed 0.90. Tukey’s test for non-additivity was performed to evaluate the additivity of the scale. A p value below 0.05 (p < .05) in this test indicates that the scale is non-additive. Since the PREG-QOL had a non-additive structure (F = 56.74; p < .001), we performed a first-order CFA, which revealed that factors 3 and 5 covered similar themes. For this reason, these two factors were combined under the physical domain factor and the CFA of a five-factor model was performed (Fig. 2). Given the content of the items, this combination was theoretically and logically appropriate. In the next step, the factors were labelled according to the content of the items. These factors were perception of general satisfaction, emotional domain, physical domain, health support systems and social domain, respectively. A CFA model was developed based on the five-factor structure. Fit indices of the final model were GFI = 0.822, CFI = 0.872, CMIN = 455.692, DF = 286, CMIN/DF = 1.593, RMSEA = 0.063, and SRMR = 0.072 (Table 3).
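Two of the reported fit indices can be reproduced from the chi-square value, the degrees of freedom and the CFA sample size. The Python sketch below uses the common point-estimate formula for RMSEA with N − 1 in the denominator; this is an assumption about how the value was obtained, since some software uses N instead (which gives the same result to three decimals here).

```python
from math import sqrt


def cmin_df_ratio(cmin: float, df: int) -> float:
    """Relative chi-square: chi-square statistic divided by degrees of freedom."""
    return cmin / df


def rmsea(cmin: float, df: int, n: int) -> float:
    """RMSEA point estimate, assuming the (n - 1) form of the formula."""
    return sqrt(max(cmin - df, 0.0) / (df * (n - 1)))


# Values reported in the text for the five-factor model (CFA sample n = 150).
CMIN, DF, N = 455.692, 286, 150

ratio = cmin_df_ratio(CMIN, DF)
rmsea_value = rmsea(CMIN, DF, N)
print(round(ratio, 3), round(rmsea_value, 3))
```

Both values agree with those reported in the text (CMIN/DF = 1.593, RMSEA = 0.063).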

Fig. 2 Standardized path coefficients for PREG-QOL

Table 3 Factor loadings and indices of fit for the PREG-QOL (n = 150)

Parallel test method

Using parallel test method, we calculated the correlation between PREG-QOL and SF-36 scale. The analysis showed that most of the correlation coefficients were statistically significant (p < .05), positive but very weak or weak (Table 4).

Table 4 Analysis of the relationship between the dimensions of the PREG-QOL and the SF-36

Reliability analysis

Cronbach’s α coefficients of the six factors were 0.883, 0.654, 0.727, 0.705, 0.827, and 0.622, respectively. Cronbach’s α of the PREG-QOL was 0.885. The test-retest method was used to evaluate the stability of the PREG-QOL over time. The PREG-QOL was administered twice to 40 pregnant women with a 15-day interval. The ICC was computed to compare the scores obtained from the test and the retest. The ICC scores for the perception of general satisfaction, emotional domain, physical domain, health support systems and social domain were 0.98 (95% CI = 0.96–0.99, p < .001), 0.97 (95% CI = 0.94–0.98, p < .001), 0.98 (95% CI = 0.96–0.98, p < .001), 0.98 (95% CI = 0.96–0.98, p < .001), and 0.97 (95% CI = 0.95–0.98, p < .001), respectively (Table 5).

Table 5 Intraclass correlation coefficient of the PREG-QOL (n = 40)

Final instrument

The PREG-QOL was composed of 26 items in 5 factors, namely perception of general satisfaction (10 items), emotional domain (4 items), physical domain (6 items), health support systems (3 items) and social domain (3 items). The instrument was developed to evaluate quality of life during pregnancy and items were self-scored on a 5-point Likert scale. The scoring system was based on the calculation of mean scores for each factor, which ranged from 1 to 5. Items 11, 12, 13, 14, 15, 16, 19, 20, 21, 22, 23, 25 and 26 were reverse-scored. A total score was not calculated. Higher factor scores indicated a higher quality of life during pregnancy (Appendix 1).
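The scoring rule described above can be sketched in Python as follows. The list of reverse-scored items and the factor sizes come from the text, but the assignment of item numbers to factors is given only in Appendix 1, so the sequential mapping below is a hypothetical placeholder. Reverse scoring is assumed to map a response x on the 1 to 5 scale to 6 − x.

```python
from typing import Dict, Mapping, Sequence

# Reverse-scored items as listed in the text.
REVERSE_ITEMS = {11, 12, 13, 14, 15, 16, 19, 20, 21, 22, 23, 25, 26}

# HYPOTHETICAL item-to-factor mapping: only the factor sizes (10, 4, 6, 3, 3)
# are taken from the text; the real assignment is in Appendix 1.
HYPOTHETICAL_FACTORS: Dict[str, Sequence[int]] = {
    "perception of general satisfaction": list(range(1, 11)),
    "emotional domain": list(range(11, 15)),
    "physical domain": list(range(15, 21)),
    "health support systems": list(range(21, 24)),
    "social domain": list(range(24, 27)),
}


def reverse_score(value: int, low: int = 1, high: int = 5) -> int:
    """Mirror a Likert response around the scale midpoint (1<->5, 2<->4)."""
    return low + high - value


def factor_means(
    answers: Mapping[int, int],
    factors: Mapping[str, Sequence[int]],
) -> Dict[str, float]:
    """Mean score per factor after reverse-scoring; no total score is formed."""
    scores = {}
    for name, items in factors.items():
        values = [
            reverse_score(answers[i]) if i in REVERSE_ITEMS else answers[i]
            for i in items
        ]
        scores[name] = sum(values) / len(values)
    return scores


# Hypothetical respondent who answers 4 to every item.
answers = {item: 4 for item in range(1, 27)}
scores = factor_means(answers, HYPOTHETICAL_FACTORS)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Because only factor means are reported, each score stays on the original 1 to 5 scale regardless of how many items a factor contains.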

Discussion

Although traditional measures, such as pregnancy-related mortality and morbidity rates, are the primary indicators of pregnancy and postpartum outcomes, they are no longer sufficient on their own. Population health is important not only in terms of saving lives but also in terms of improving the quality of health [39]. Indicators associated with pregnancy should be evaluated based on evidence, and the period of pregnancy should be effectively reported, monitored and evaluated. We did not find any instruments that specifically measure the quality of life during pregnancy. This study provided evidence on the validity and reliability of the PREG-QOL, which was developed to evaluate the quality of life during pregnancy.

Psychometric properties of PREG-QOL

Content validity of the PREG-QOL was evaluated using the classification proposed by Davis (1992). I-CVI and S-CVI were higher than the acceptable lower limit (> 0.80) [40]. The scores indicated that the items sufficiently represented the construct. Although EFA followed by CFA is recommended for analyzing construct validity in instrument development studies, performing parallel tests can further strengthen the instrument [41]. In this study, the KMO coefficient (> 0.70) and Bartlett’s test of sphericity (p < .001) indicated that the data were normally distributed and that the sample size was adequate for factor analysis [40].

The EFA results indicated that the total variance explained by the six-factor PREG-QOL was within the desired range (50%-60%) for multifactor scales [42]. Elimination of items with factor loadings lower than 0.30 improved the representativeness of the instrument and the variance levels [29, 35]. The CFA was performed on a different sample, and factors 3 and 5 were combined under a single factor, namely, physical domain. Parallel test results showed a statistically significant relationship between the factors of the SF-36 and the PREG-QOL (p < .05), though the correlation levels were not high. The low level of correlation may be a consequence of the differences between the SF-36 and the PREG-QOL in terms of structure, instructions, number of items and factors. These findings indicate that the five-factor structure of the PREG-QOL was adequate. The results of the EFA, CFA and parallel test showed that the PREG-QOL was a valid instrument.

Reliability of the scale was evaluated using Cronbach’s α coefficient and test-retest method. Cronbach’s α is suggested to be > 0.60 for factors and > 0.70 for the total instrument [43]. Cronbach’s α values of the PREG-QOL and the factors were higher than these limits. The ICC results showed that the instrument was stable over time [44]. These findings indicated that the PREG-QOL was a valid and reliable instrument.

Scale content

Developed to evaluate the quality of life during pregnancy, the PREG-QOL was composed of five factors, namely, the perception of general satisfaction, emotional domain, physical domain, health support systems and social domain. The instrument reflected the domains that influence the quality of life in women during pregnancy.

Factor 1 was labelled as the perception of general satisfaction since the items in this factor reflected the general health perception that interacts with the quality of life during pregnancy. This factor is consistent with a number of studies, which found that perception of general satisfaction was an important variable influencing the quality of life during pregnancy: satisfaction with physical appearance may be associated with perception of general satisfaction whereas gaining weight may have a negative effect on general satisfaction [45, 46].

Having 4 items, factor 2 was labelled as the emotional domain since the items were related to the positive and negative feelings and opinions of pregnant women. Expressions of concern about pregnancy, birth and the health of the baby are examples of the emotional domain. Various studies found that the health status of babies had emotional effects on pregnant women [47, 48].

Factor 3 was labelled as the physical domain since the items in this factor were related to physiological changes and problems during pregnancy. A number of studies also found that physical problems, such as back pain, nausea, vomiting, weakness and fatigue, affected the daily life of pregnant women, which, in turn, had a negative effect on the quality of life during pregnancy [14, 22, 49].

Factor 4 was labelled as health support systems since it included items about access to and satisfaction with healthcare and information provided by health professionals. This factor included statements about satisfaction with and ease of access to follow-up, care, diagnosis and treatment services, which are needed more during pregnancy and thus influence the quality of life. The WHO recommended at least eight antenatal care (ANC) contacts, one in the first, two in the second and five in the third trimester, to reduce perinatal mortality and improve the satisfaction of pregnant women. The need for ANC increases as pregnancy progresses [50].

Finally, factor 5 was labelled as the social domain since its three items were related to personal relations, social support, recreational activities and leisure. Other studies also found that social interactions had an effect on the quality of life during pregnancy [21, 51].

Limitations

Although we followed the steps proposed in the literature to develop an instrument with strong psychometric properties, the study has a major limitation. The PREG-QOL was administered to Turkish pregnant women, so the findings may not be generalizable. For this reason, further studies should evaluate the psychometric properties of the instrument in different countries. In addition, since administering the PREG-QOL in different cultures may lead to conflicting results, findings should be interpreted carefully.

Conclusion

This study, which was conducted in three stages and six steps, found that the newly developed PREG-QOL was a valid and reliable instrument. The 26-item instrument was composed of five factors, namely, the perception of general satisfaction (10 items), emotional domain (4 items), physical domain (6 items), health support systems (3 items) and social domain (3 items). With its good psychometric properties, the PREG-QOL may be used to evaluate multiple dimensions of the quality of life during pregnancy.