Introduction

The World Health Organization estimates that by the year 2030, there will be 27 million new cases of cancer, 17 million deaths from cancer, and 75 million people living with cancer each year. The greatest effect of this increase will be felt in underdeveloped and developing countries [1]. Breast cancer is the most common cancer in women. The global incidence has increased from 641,000 cases in 1980 to 1,643,000 cases in 2010, a yearly growth rate of 3.1 % [2].

A breast cancer diagnosis brings great anxiety and distress, and during surgical or conservative treatment, the patient can suffer physical limitations such as reduced shoulder movement and psychological changes such as depression and low self-esteem [1, 3–5]. Considering the high incidence of breast cancer and the effect that it can have on women’s lives, greater emphasis has been placed on research into the quality of life of women with breast cancer in the last few years [6, 7].

Quality-of-life assessment has become more prominent as a measurement of the results of treatment in medicine. It is basically conducted by administering questionnaires, most of them written in English and geared toward an English-speaking audience. Quality-of-life assessment in cancer has also followed that trend and today a large number of specific questionnaires can be found in the literature, such as the European Organization for Research and Treatment of Cancer breast cancer-specific quality-of-life questionnaire (EORTC QLQ-BR23) [8], the Functional Assessment of Cancer Therapy-Breast (FACT-B), and the Functional Assessment of Cancer Therapy-Breast plus Arm Morbidity (FACT-B+4) [9], among others. However, for these questionnaires to be used in a language other than the source language, they must be translated and cross-culturally adapted into a new equivalent version of the original questionnaire.

The aim of the cross-cultural adaptation process is to produce a new version that is semantically and idiomatically equivalent to the original version [10]. Because of language and cultural differences, a simple translation is not sufficient [10]. To date, several guidelines [11–13] describe adequate procedures for the translation and adaptation of measurement instruments. However, after the cross-cultural adaptation, it is crucial to test the psychometric properties of the adapted questionnaire in the target population to ensure that the new version is reproducible, valid, and responsive [12, 14–16].

It is clear that there is an increasing number of questionnaires that assess quality of life in breast cancer. However, choosing a questionnaire can be difficult due to the methodological gaps and deficiencies in the processes of translation, cross-cultural adaptation, and assessment of measurement properties. Therefore, the objectives of the present study were: (1) to identify, through a systematic review, the breast cancer-specific questionnaires that have been cross-culturally adapted and (2) to critically analyze the quality of the translation, cross-cultural adaptation, and evaluation of measurement properties of the versions found in the literature.

Methods

Inclusion and exclusion criteria

To be included in the analysis, the studies had to assess breast cancer-specific quality-of-life questionnaires translated into a language other than the source language without restriction of year or language of publication and applied exclusively to women with breast cancer. Studies were excluded if they were duplicated in another database, if the title and abstract were not related to the topic, or if the questionnaires were in the source language. We also excluded notes or letters from the editor, systematic reviews, conference papers, books, dissertations and theses, as well as articles not found in libraries or not provided by the authors after email contact.

Search strategy

The search for articles was performed in the databases MEDLINE, EMBASE, CINAHL, and SciELO. The terms used in the search were based on the MeSH descriptors in English and DeCS in Portuguese. For example, in the databases MEDLINE, EMBASE, and CINAHL, some of the terms used were: “Questionnaires,” “Quality of life,” “Breast cancer,” “Translate,” “Validity,” and “Cross-cultural.” In SciELO, the terms were: “Questionários,” “Qualidade de vida,” and “Câncer de mama.” The terms were interconnected by the search operators “OR” and “AND.” The only limit used was “Human.” A detailed description of the search strategy is provided in “Appendix.” First, the resulting studies were analyzed based on the information in the title and abstract, and then, the remaining studies were read in full. The last search date was June 20, 2013.

Data extraction and methodological quality assessment of eligible studies

The description of the analyzed studies included language, year of publication, time of application and sample size. Data were extracted to describe all cross-cultural adaptation procedures (i.e., how the translation procedures were performed) and all measurement properties from each included study. Additionally, the translation, cross-cultural adaptation procedures, and the measurement properties were rated by the Guidelines for the Process of Cross-Cultural Adaptation of Self-Report Measure [12] and by the Quality Criteria for Psychometric Properties of Health Status Questionnaire [14], respectively. All the data extraction was conducted independently by two assessors, and the assessment of the methodological quality was determined by consensus.

The process of translation and cross-cultural adaptation evaluated in this study included initial translation, synthesis of translation, followed by the back translation, expert committee review, and test of pre-final version. The measurement properties evaluated in this study were data quality (ceiling and floor effects), construct validity, internal consistency, reproducibility (agreement and reliability), and responsiveness. These procedures are described with more detail in Table 1.
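As an illustration of the data quality property listed above, the following minimal sketch computes floor and ceiling effects for a set of scale scores. The function name `floor_ceiling` and the example scores are hypothetical, and the 15 % cut-off is an assumption about the cited quality criteria [14]; it is not stated in this text.

```python
def floor_ceiling(scores, min_score, max_score, threshold=0.15):
    """Return the share of respondents at the lowest and highest possible score.

    threshold=0.15 is an assumed cut-off (see lead-in); an effect is flagged
    when more than that share of respondents sits at an extreme of the scale.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_share = sum(s == min_score for s in scores) / n
    ceiling_share = sum(s == max_score for s in scores) / n
    return {
        "floor": floor_share,
        "ceiling": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Hypothetical scores on a 0-100 scale for ten respondents.
example = floor_ceiling([0, 0, 50, 60, 100, 70, 80, 90, 40, 30], 0, 100)
```

In this made-up sample, two of ten respondents score the minimum, so a floor effect would be flagged, while the single maximum score would not count as a ceiling effect.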

Table 1 Recommended criteria for translation, cross-cultural adaptation, and measurement properties according to the guidelines [12, 14]

Both the Guidelines for the Process of Cross-Cultural Adaptation of Self-Report Measure [12] and the Quality Criteria for Psychometric Properties of Health Status Questionnaire [14] evaluate not only the quality of the translation, adaptation, and evaluation of each measurement property but also the methodological quality of each included study, such as the number of translators required, adequate sample size, test–retest interval, and others (Table 2). This methodological quality assessment has already been used in previous systematic reviews [17, 18].

Table 2 Quality assessment of the translation and cross-cultural adaptation process and of the measurement properties of health status questionnaires [12, 14, 17, 18]

Results

A total of 2,561 studies were found in the databases MEDLINE, EMBASE, CINAHL, and SciELO. Four [19–22] of those studies would have been eligible because they included the relevant information in their abstracts; however, they were not found in libraries and were not provided by the authors after a request was sent by email. Therefore, only 24 articles assessed quality-of-life questionnaires that specifically evaluated breast cancer (Fig. 1). Most of the articles (13 in all) assessed the translation and measurement properties of the instrument EORTC QLQ-BR23, and eight articles assessed the FACT-B. Of the 24 eligible articles, 11 contained some information on the process of translation and cross-cultural adaptation and 23 contained information on the assessment of measurement properties, with 10 containing information on both the translation and adaptation process and the measurement property assessment.

Fig. 1 Flow diagram of the selection process of the articles included in the analysis

Table 3 summarizes the analyzed studies and characteristics related to the language and year of publication, measurement properties examined in each study, time of application and sample size. Table 4 shows the list of six breast cancer-specific questionnaires assessed according to the guidelines for translation and cross-cultural adaptation of questionnaires [12]. Translation and back translation were the most tested stages, with 10 articles [23–32] completing these parts of the translation process. In contrast, synthesis of the translation was the least frequent stage, having been described by only four articles [23, 24, 28, 31]. More than half of the articles, 14 in all [33–46], used translated versions of the instruments, but did not provide information on the stages of translation and cross-cultural adaptation.

Table 3 Characteristics of included studies
Table 4 Evaluation of breast cancer-specific quality-of-life questionnaires, according to the guidelines for the procedure of translation and cross-cultural adaptation [10, 12, 13]

Regarding the languages of the new versions of the breast cancer-specific instruments, the most common was Chinese, with five articles [38, 42–45]. However, none of these articles included information on the process of translation. There were also versions in German [24, 25] and in Indian languages [28, 29]. The EORTC QLQ-BR23 was the instrument with the most cross-cultural adaptations and the most tested measurement properties.

Regarding the measurement properties, Table 5 shows a list of the 24 articles (6 questionnaires) analyzed according to the criteria proposed for measurement properties of health status questionnaires [14]. Internal consistency was the most frequently tested property among the eligible articles and received a doubtful classification in 15 articles [25, 26, 28, 32–35, 37–43, 46]. In contrast, no information was found on agreement. Construct validity was adequately tested in three studies [29, 31, 41] that used the FACT-B [31] and the EORTC QLQ-BR23 [29, 41]. Reliability results were obtained from eight articles [33, 35, 38, 39, 42–45], with four showing positive classification [35, 38, 42, 44]. Of the remaining four, three articles [39, 43, 45] measured reliability using correlation and, despite the acceptable design, were classified as questionable, and one article [33] showed negative classification, with an intraclass correlation coefficient (ICC) <0.7. Responsiveness was tested in four articles [23, 27, 30, 36], and ceiling and floor effects were tested in only three articles [33, 35, 41].

Table 5 Evaluation of breast cancer-specific quality of life questionnaires, according to the guidelines of measurement properties of health status questionnaires [11–13]

Discussion

The present study assessed the translation and cross-cultural adaptation procedures and the measurement properties of breast cancer-specific quality-of-life questionnaires. Most of the studies assessed the instrument EORTC QLQ-BR23. Based on the data for methodological quality, it is clear that the breast cancer-specific instruments with the most tested translations and measurement properties were the FACT-B and EORTC QLQ-BR23.

This review was based on current, internationally recognized guidelines [12, 14]. However, it is worth noting that there are other methods of assessing translation, cross-cultural adaptation, and measurement properties that do not include all of the stages tested in the present study, such as the recommendations of the EORTC (European Organization for Research and Treatment of Cancer) followed by some of the articles. In addition, Hui and Triandis proposed a framework for cross-cultural validation [47, 48]. They postulated four dimensions of equivalence when attempting to measure a construct such as quality of life internationally: functional equivalence (adequacy of translation), scale equivalence (comparability of response scales), operational equivalence (standardization of psychometric testing procedures), and metric equivalence (transferability of scoring results from one culture to another) [47, 48]. Cross-cultural validation and the assessment of measurement properties of a quality-of-life questionnaire remain difficult. The current, internationally recognized guidelines [12, 14] are useful for estimating functional equivalence and operational equivalence, but they are not perfect, because it is difficult to investigate scale equivalence and metric equivalence with them. In other words, there is still no gold standard for cross-cultural validation, and research on these issues is ongoing.

Considering translation and cross-cultural adaptation, only 11 articles [23–32, 40] described this methodological process, and the process was not completely described for any of the questionnaires. The FACT-B was incompletely assessed in four of the articles included in this study. Deficiencies in the process of translation and cross-cultural adaptation were observed in all stages [24, 28, 31, 40]. The EORTC QLQ-BR23 was described by seven articles included in this study [23, 25–27, 29, 30, 32]. The cross-cultural adaptations of this instrument followed the recommendations of the EORTC [15–17, 19, 20, 22]; however, only five articles [23, 26, 29, 30, 32] included some description of these adaptations. Despite this description, the process of translation and cross-cultural adaptation of the EORTC QLQ-BR23 was incomplete in all the studies. The eligible studies that evaluated the other questionnaires included in this systematic review (FACT-B+4, IBCSG, LSQ-32 and QLICP-BR) did not provide any information about the process of translation and cross-cultural adaptation.

Regarding year of publication, 20 studies [23–26, 28, 29, 31–36, 38, 40–46] were published after the year 2000, when the guidelines for translation and cross-cultural adaptation were published [12]; however, none of the studies followed precisely the recommendations already available in the literature. Additionally, 14 studies [33–46] did not provide any information related to the translation. As none of the studies followed the guidelines, it is difficult to apply the translated questionnaire in a breast cancer population, because there is no guarantee that this questionnaire is semantically and idiomatically equivalent to the original version [10].

In the assessment of measurement properties, 23 articles included some information on breast cancer questionnaires, but none of them tested all of the measurement properties. The criteria for measurement properties were published in 2007 [14]; however, 13 of the articles [24, 27–32, 36–39, 43, 45] included in this systematic review were published in 2007 or earlier. This helps explain the large number of inadequate or incomplete analyses in these studies. Still, 11 articles [23, 25, 26, 33–35, 40–42, 44, 46] were published after the criteria were published, and none of them followed the steps adequately and thoroughly, suggesting a lack of interest by the researchers in conducting high methodological quality studies.

The internal consistency of the questionnaire EORTC QLQ-BR23 received a doubtful or negative classification in all the eligible studies [23, 25–27, 29, 30, 32–34, 38, 41, 43, 46] because this property was not adequately assessed in accordance with the guidelines. This classification is usually due to the absence of factor analysis or an insufficient sample size. Factor analysis is important to determine the dimensionality of a questionnaire and consequently affects the assessment of the internal consistency of an instrument [14]. Construct validity was classified as positive in two studies [29, 41]. In these two studies, the authors formulated hypotheses, described them in their methods, and at least 75% of the results were in accordance with these hypotheses [14]. These hypotheses are essential and must be specified in advance, since without them it is tempting to explain away low correlations retrospectively rather than to conclude that the questionnaire may not be valid [14].
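The two checks discussed above can be sketched as follows. This is a minimal illustration, not the procedure used by any of the reviewed studies: the function names and example data are hypothetical, Cronbach's alpha is used here as the internal consistency statistic (an assumption, since the text does not name one), and factor analysis, which the criteria require before alpha is interpreted, is not covered. The 75 % rule is taken from the text.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one dimension.

    responses: one list per respondent, holding that person's item scores.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


def construct_validity_positive(hypothesis_results, required_share=0.75):
    """True when at least 75 % of the pre-specified hypotheses were confirmed.

    hypothesis_results: list of booleans, one per hypothesis stated in advance.
    """
    if not hypothesis_results:
        raise ValueError("hypotheses must be specified in advance")
    return sum(hypothesis_results) / len(hypothesis_results) >= required_share


# Hypothetical data: two perfectly consistent items, three respondents.
alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
# Hypothetical outcome: three of four hypotheses confirmed.
valid = construct_validity_positive([True, True, True, False])
```

Raising an error when no hypotheses are supplied mirrors the point made in the text: a correlation reported without a pre-defined hypothesis cannot earn a positive rating.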

This review aimed to fully analyze reproducibility through both reliability and agreement; however, although agreement is easy to interpret because it is expressed in the instruments’ measurement units, the eligible studies tested only reliability. Reliability of the EORTC QLQ-BR23 was classified as positive in only one study [38]. Responsiveness and floor and ceiling effects received a doubtful or negative classification. For the questionnaire FACT-B, the same scenario is observed. Only one translated version received a positive classification for reliability [42]. In the same way, a translated version of FACT-B+4 and of QLICP-BR received a positive classification for reliability [35, 44], and a translated version of the questionnaire IBCSG received a positive classification for responsiveness [36].
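To make the distinction between reliability and agreement concrete, the sketch below computes both from the same hypothetical test–retest data. It assumes a two-way random-effects, absolute-agreement ICC for single measurements and uses the standard error of measurement (SEM) as the agreement parameter; the text specifies neither choice, and the function name and data are illustrative only.

```python
from math import sqrt


def icc_and_sem(scores):
    """scores: one row per patient, one column per administration (test, retest).

    Returns (ICC, SEM). The ICC is unitless; the SEM is in scale points.
    """
    n, k = len(scores), len(scores[0])
    grand_mean = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: patients, administrations, residual.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    # Clamp at zero to guard against tiny negative values from rounding.
    ms_error = max(ss_error / ((n - 1) * (k - 1)), 0.0)

    icc = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    # Systematic difference between administrations, clamped at zero.
    var_administration = max((ms_cols - ms_error) / n, 0.0)
    sem = sqrt(var_administration + ms_error)
    return icc, sem


# Hypothetical test-retest scores for five patients.
icc, sem = icc_and_sem([[10, 12], [20, 21], [30, 33], [40, 41], [50, 53]])
```

For these made-up scores the ICC lies well above the 0.7 threshold mentioned in the Results, while the SEM expresses the measurement error directly in scale points, which is what makes agreement easier to interpret clinically.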

After analyzing all of the articles and the quality classification of the translation, cross-cultural adaptation, and measurement properties, we can conclude that there is a deficiency in the methodological quality of the breast cancer instruments used around the world as a result of inadequate testing and assessment of translation, adaptation, and measurement properties. The choice of the best breast cancer-specific quality-of-life questionnaire depends on meeting the explicit quality criteria that were evaluated in this systematic review [14]. In some cases, the translated version can show measurement properties that differ from those observed in the original version because of cultural differences between populations. Therefore, new studies are needed to fully assess these measurements in a clear and adequate fashion so that the best instrument can be used in this population. Other systematic reviews that applied the same criteria as the present study to other instruments confirm the presence of methodological deficiencies [17, 18, 49–51].

Other studies that assessed the measurement properties and cross-cultural adaptations of breast cancer quality-of-life questionnaires [50, 52] used different forms of assessment and identified methodological aspects needing improvement that were also found in the present study. For example, the assessment of internal consistency must be improved, the assessment of construct validity must be related to pre-defined hypotheses, reliability must be calculated using acceptable measurements such as ICC and Kappa, and the process of translation must be reviewed to avoid low methodological quality that could affect the validity of the questionnaire. The main limitation of our study was the loss of some articles that might have contained important information, but that were excluded because they were not found in libraries and were not provided by the authors, despite request via email.
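For categorical responses, the Kappa statistic mentioned above can be sketched as below. This is an unweighted Cohen's kappa on hypothetical test and retest answers; for ordinal items a weighted variant may be more appropriate, and that is not shown here. The function name and data are illustrative assumptions.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two sets of categorical ratings.

    first, second: answers of the same patients on two occasions, same order.
    """
    if len(first) != len(second) or not first:
        raise ValueError("both lists must be non-empty and of equal length")
    n = len(first)
    # Share of patients giving the same answer on both occasions.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal category frequencies.
    counts_first, counts_second = Counter(first), Counter(second)
    expected = sum(counts_first[c] * counts_second[c] for c in counts_first) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


# Hypothetical yes/no (1/0) answers from four patients at test and retest.
kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

Unlike a simple correlation or raw percentage agreement, kappa corrects for the agreement that would be expected by chance alone.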

This systematic review aimed to follow the internationally recommended guidelines for translation and cross-cultural adaptation and for measurement property testing, as well as show the importance of conducting tests adequately to determine the best instrument to be used in clinical practice or scientific research. In conclusion, the available evidence on measurement properties and cross-cultural adaptations is very limited; therefore, it is recommended that caution be exercised when using breast cancer-specific quality-of-life questionnaires that have been translated, adapted, and tested. Finally, studies with high methodological quality are necessary to compensate for the insufficient and inadequate assessment of cross-cultural adaptations and measurement properties.