Abstract
This chapter serves as an introduction and guide to the methodology for performing systematic reviews of the measurement properties of Patient Reported Outcome Measures (PROMs). Its aim is to inform clinicians of the most commonly used terms, definitions and processes in the field, so that they can participate meaningfully in relevant research projects, bearing in mind the limitations discussed in the chapter. A step-wise approach is followed, first introducing the definitions related to PROMs and, most importantly, explaining what the measurement properties entail. The methodology for performing a systematic review is then discussed. The authors have opted to follow the methodological recommendations proposed by the COSMIN initiative (Consensus-based Standards for the selection of health Measurement Instruments), which has produced significant publications in the field, providing detailed guidance for each step. All steps are reviewed and discussed, with particular focus on the evaluation of content validity, internal structure and the remaining measurement properties. Examples and tables of the steps needed to perform these assessments are presented throughout. Lastly, the process by which the results of this evaluation are combined to produce the systematic review is presented.
Keywords
- Patient Reported Outcome Measures (PROMs)
- Quality of life
- Measurement properties
- Content validity
- Construct validity
- Reliability
- Internal structure
- Responsiveness
- Interpretability
- COSMIN initiative (Consensus-based Standards for the selection of health Measurement Instruments)
- Systematic Review
Patient Reported Outcome Measures can be assessed by evaluating their Measurement Properties.
A systematic review can be performed in order to compare and evaluate PROMs, to make recommendations regarding their use, and to identify any gaps or the need for the design of a new instrument.
The COSMIN initiative (Consensus-based Standards for the selection of health Measurement Instruments) has provided thorough methodological guides for performing such a systematic review.
This involves a step-wise approach, to assess separately content validity, internal structure and the remaining measurement properties.
Following the current advancements and increased scientific interest in research relating to quality of life, particularly with the use of patient reported outcome tools, clinicians are frequently involved in relevant studies.
A clinician may be interested in investigating which tool is most appropriate for their practice, and supporting that is the purpose of this methodological overview.
Nevertheless, although a clinician can benefit greatly from a more in-depth understanding of this methodology, it is strongly advised that such studies be undertaken in close collaboration with epidemiologists and biostatisticians.
Introduction
Aim of the Chapter
This chapter aims to discuss and present the currently used methodology for performing studies and systematic reviews on the measurement properties of PROMs.
It first provides some insight into the terms most commonly used when designing PROMs and when interpreting published studies and results on them.
The process of designing and generating a new PROM is beyond the scope of this chapter and is only discussed as part of the assessment and evaluation of studies for a systematic review.
What Are Patient Reported Outcomes (PROs) and Patient Reported Outcome Measures (PROMs)
Patient-reported Outcomes (PROs) have long been established in current medical research, as both primary and secondary outcomes of studies.
According to the FDA, a Patient-Reported Outcome (PRO) is any report of the status of a patient’s health condition that comes directly from the patient, without interpretation of the patient’s response by a clinician or anyone else [1].
Patient-Reported Outcome Measures (PROMs), alternatively called PRO instruments, are the instruments used to measure PROs or capture PRO data, such as questionnaires completed by patients [1].
In the relevant literature, when referring to a PROM or a PRO instrument, authors may be discussing either a questionnaire as a whole or a single question.
What Are the Measurement Properties of PROMs
Measurement properties are essential criteria in the design and evaluation of a PROM.
Broadly, these are Validity, Reliability, Responsiveness and Interpretability. Detailed definitions will be discussed below.
Why Perform Systematic Reviews on Measurement Properties of PROMs
Provided that PROMs covering an area of interest already exist (developed and/or validated), a systematic review may be performed in order to compare the measurement properties of these PROMs, evaluate the quality of each PROM, identify the advantages and disadvantages of each PROM and, ultimately, recommend which PROMs should be used in future studies.
In addition, if the results indicate a rather low quality of the available PROMs, or inadequate measurement of the area of interest, then the systematic review may inform and guide the design of a new PROM.
Current Methodology: The COSMIN Initiative
The vast majority of guidance and tools on the interpretation of PROMs has been provided by the Consensus-based Standards for the selection of health Measurement Instruments (COSMIN) initiative [2].
The COSMIN initiative, after initially identifying the lack of clear definitions and of a widely accepted methodology [3], has specified the definitions of the measurement properties of PROMs [4]. It also provides comprehensive guidance for performing a systematic review of outcome measurement instruments, as well as handbooks for the interpretation and assessment of each measurement property in PROMs.
Definitions and Taxonomy
In order to perform a systematic review on measurement properties of PROMs, the researcher must be familiar with the measurement properties, and their definitions.
As mentioned previously, the COSMIN initiative, following a Delphi study, has recommended definitions for the measurement properties [4].
Most importantly, the initiative agreed on a taxonomy, incorporating the measurement properties [4].
According to this taxonomy, COSMIN identifies three main domains of measurement properties for assessing the quality of a PROM: Validity, Reliability and Responsiveness. Interpretability is considered as a fourth domain [4] (Fig. 4.1 and Table 4.1).
Performing a Systematic Review
General
A systematic review on measurement properties of PROMs shares some common methodological features with any other systematic review. We will focus more on discussing the process of assessing the measurement properties.
The COSMIN initiative has provided summarising guidelines for performing a systematic review [5] as well as a more detailed user manual, describing the methodology in more depth [6].
In this section, we will present and discuss the processes recommended in these documents. All tables and figures are adopted from these sources.
The overall process, and the steps that need to be followed, are shown in the following flowchart [5].
As shown in the flowchart, a systematic review consists of three stages (Fig. 4.2).
Initially, as per routine practice, a literature search is performed, followed by a thorough assessment of the measurement properties. Finally, recommendations are formulated and the review is reported.
Literature Search
The initial stage consists of the standard steps (steps 1–4) for performing systematic reviews.
-
Step 1: Formulating the aim
When deciding and developing the aim of the review, the four key elements that need to be included are the construct of interest, the population, the type of the instrument and the measurement properties of interest.
-
Step 2: Formulating the Eligibility Criteria
Not all studies mentioning the PROMs of interest are to be included. Eligible studies should fulfil the aforementioned four key elements. Most importantly, given the large number of studies on different PROMs, the main focus should be on studies that assess and evaluate one (or more) of the measurement properties of the PROM, and certainly not on studies that merely use the PROM as an outcome measurement.
-
Step 3: Performing the literature search.
Standard Cochrane methodology should be followed for performing the literature search. The four key elements of the aim need to be included, as shown in the following flowchart, which depicts the search strategy and terms described by the COSMIN initiative [5] (Fig. 4.3).
-
Step 4: Selection of abstracts and full-text articles
Selection and review of the abstracts and full texts is performed in a routine manner with the general recommendation for this to be performed by two reviewers independently.
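As a rough illustration of step 3, the four key elements of the aim can be combined into a single boolean search string. The sketch below assumes that synonyms within one element are joined with OR and that the four elements are joined with AND. The function name `build_search_query` and all search terms are hypothetical placeholders, not a validated search filter, and the exact syntax will depend on the database being searched.

```python
def build_search_query(construct, population, instrument_type, measurement_properties):
    """Combine the four key elements of the review aim into one boolean query.

    Each argument is a list of search terms (synonyms) for that element.
    Assumption: terms within an element are OR-ed, elements are AND-ed.
    """
    elements = [construct, population, instrument_type, measurement_properties]
    blocks = []
    for terms in elements:
        if not terms:
            raise ValueError("each key element needs at least one search term")
        # Quote every term so that multi-word phrases are searched as phrases.
        blocks.append("(" + " OR ".join(f'"{term}"' for term in terms) + ")")
    return " AND ".join(blocks)


# Hypothetical example terms, for illustration only.
query = build_search_query(
    construct=["quality of life"],
    population=["adults"],
    instrument_type=["questionnaire", "PROM"],
    measurement_properties=["validity", "reliability"],
)
print(query)
```

In practice the term lists would be developed with an information specialist and adapted to each database; the sketch only shows how the four elements fit together.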
Evaluation of Measurement Properties
As demonstrated in the flowchart in Fig. 4.2, this is done in three main stages. Given the significance of content validity and internal structure, these are assessed separately, followed by assessment of the remaining properties.
1. Content Validity
2. Internal Structure
3. Remaining Properties (Reliability, Measurement error, Criterion validity, Hypotheses testing for construct validity, Responsiveness)
Evaluation of Content Validity
The COSMIN initiative, given the significance and complexity of the evaluation of content validity, provides a separate user manual, with the relevant methodology [7].
According to the COSMIN recommendations, there are three aspects of content validity in a PROM:
-
Relevance
-
Comprehensiveness
-
Comprehensibility
In order to assess these, COSMIN recommends ten criteria for good content validity, which have been formulated following a Delphi study [8], as shown in Table 4.2.
To assess the above, a stepwise process is used:
-
Step 1—Evaluation of the quality of the PROM development
-
Step 2—Evaluation of the quality of content validity studies on the PROM
-
Step 3—Evaluation of the content validity of the PROM
A more detailed description of the steps is provided below, although not in full. For each step, COSMIN provides boxes that summarise the process succinctly. These are also presented below.
Step 1: Evaluating the Quality of the PROM Development
This step is further subdivided into steps 1a and 1b.
In step 1a, the quality of the PROM design is assessed (evaluating relevance).
In step 1b, the quality of any cognitive interview studies or pilot studies assessing the PROM is examined (evaluating comprehensibility and comprehensiveness) (Table 4.3).
To perform the above steps, a number of items/questions need to be answered, as per the flowchart shown below (Fig. 4.4).
This describes 13 items/questions for Part 1a, and 22 items/questions for Part 1b. The detailed items are not presented here, and we would recommend reading the full manual, where the items are presented, along with further explanations and examples.
Step 2: Evaluating the Quality of Content Validity Studies on the PROM
In this step, we assess how patients and professionals were asked about the relevance, comprehensibility and comprehensiveness, either as part of the PROM design process, or as a separate content validity study (Table 4.4).
This can be broadly divided into steps 2a, 2b and 2c (asking patients about relevance, comprehensiveness and comprehensibility), and steps 2d and 2e (asking professionals about relevance and comprehensiveness), as shown in the respective flowchart. Overall, there are 31 items/questions to be assessed (Fig. 4.5).
For Steps 1–2
As mentioned previously, the exact items that are utilised in each step are not presented here.
What is important to note is how ratings are provided for each item. A 4-point rating scale is utilised, as shown here.
-
Very good
-
Adequate
-
Doubtful
-
Inadequate
For each item, the COSMIN manuals provide detailed examples of what criteria should be fulfilled to achieve each rating. Below we provide an example for Item 5 from step 1a (Table 4.5).
To ensure high quality, COSMIN recommends using a ‘worst score counts’ method, where the lowest rating is utilised as an overall rating.
For Step 1, the lowest rating in the respective items will correspond to the overall rating for the PROM development.
For Step 2, the lowest rating in the respective items will correspond to the overall rating of the content validity studies on the PROM.
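The 'worst score counts' rule can be sketched in a few lines. This is a minimal illustration and not part of the COSMIN materials: the function name `worst_score_counts` and the lowercase rating labels are assumptions made for the example, and the ordering simply follows the 4-point scale listed above.

```python
# The 4-point scale, ordered from worst to best.
RATING_ORDER = ["inadequate", "doubtful", "adequate", "very good"]


def worst_score_counts(item_ratings):
    """Return the overall rating for a step: the lowest rating of any item."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    # list.index raises ValueError for a label that is not on the scale.
    return min(item_ratings, key=RATING_ORDER.index)


# Hypothetical item ratings for one PROM development study.
overall = worst_score_counts(["very good", "adequate", "doubtful", "very good"])
print(overall)  # doubtful
```

A single 'doubtful' or 'inadequate' item therefore determines the overall rating, which is what makes the method deliberately conservative.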
Step 3: Evaluating the Content Validity of the PROM
For this step, content validity of the PROM is evaluated by examining the quality and results of already performed studies on the PROM. This, again, is further subdivided into three steps.
For step 3a, ratings need to be provided for relevance, comprehensiveness and comprehensibility, using the ten criteria for good content validity (presented previously), for three different aspects, as per the table shown below.
-
Methods and results of PROM development study
-
Content validity studies on the PROM
-
Reviewers’ own ratings of the PROM (Table 4.6)
Essentially, the ratings for the methods and results of the PROM development studies, and the content validity studies, are the ones already assessed in steps 1 and 2, according to the respective COSMIN boxes, and are utilised in this table.
With regards to the potential ratings of each criterion, these can be:
-
Sufficient (+): ≥85% of the items of the PROM (or sub-scale) fulfil the criterion
-
Insufficient (−): <85% of the items of the PROM (or sub-scale) fulfil the criterion
-
Indeterminate (?): No(t enough) information available or quality of (part of a) the study inadequate
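The three possible ratings listed above can be expressed as a small decision rule. The sketch below is only an illustration under stated assumptions: the function name `rate_criterion` and its parameters are hypothetical, and the flag `evidence_adequate` stands in for the reviewer's judgement that enough information of adequate quality is available.

```python
def rate_criterion(items_fulfilling, total_items, evidence_adequate=True):
    """Rate one content validity criterion as '+', '-' or '?'.

    '+' : at least 85% of the items fulfil the criterion
    '-' : fewer than 85% of the items fulfil the criterion
    '?' : not enough information, or study quality inadequate
    """
    if not evidence_adequate or total_items <= 0:
        return "?"
    if not 0 <= items_fulfilling <= total_items:
        raise ValueError("items_fulfilling must lie between 0 and total_items")
    # Integer comparison avoids floating point issues exactly at the 85% threshold.
    if items_fulfilling * 100 >= 85 * total_items:
        return "+"
    return "-"


print(rate_criterion(17, 20))         # 85% of items
print(rate_criterion(16, 20))         # 80% of items
print(rate_criterion(20, 20, False))  # evidence judged inadequate
```

How the ratings of the individual criteria are then combined into ratings for relevance, comprehensiveness and comprehensibility is described in the COSMIN manual and is not reproduced in this sketch.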
After ratings have been provided for each criterion, a final rating can be generated for relevance, comprehensiveness and comprehensibility. These three ratings are then combined to provide the Overall Content Validity Rating.
For these processes, COSMIN provides further tables and guidance in the manual, which are not presented here.
Importantly, given the individual importance of relevance, comprehensiveness and comprehensibility, it is recommended to report on them separately, if found relevant (different ratings/different importance), and not only as an Overall Content Validity Rating.
For step 3b, a qualitative summary of available studies is performed, providing a rating for relevance, comprehensiveness and comprehensibility, resulting in an overall rating for each domain, which will be added in the respective boxes of the aforementioned table.
Lastly, for step 3c, the ratings achieved from step 3b, are assessed with regards to the quality of the evidence that generated them, to determine how reliable these ratings are.
To do this, the GRADE approach is used, as shown in the table below [9] (Table 4.7).
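The general idea of grading can be sketched as follows. This is a simplified illustration and not the COSMIN procedure itself: it assumes that the evidence starts at 'high' and drops one level for every downgrade point, and the factor names used in the example are placeholders. The actual factors and downgrading rules are those given in Table 4.7 and the COSMIN manual.

```python
# Quality-of-evidence levels, ordered from best to worst.
GRADE_LEVELS = ["high", "moderate", "low", "very low"]


def grade_quality(downgrades):
    """Return the quality of evidence after applying downgrade points.

    `downgrades` maps a factor name to the number of levels to downgrade.
    Assumption: grading starts at 'high' and cannot go below 'very low'.
    """
    if any(points < 0 for points in downgrades.values()):
        raise ValueError("downgrade points cannot be negative")
    total = sum(downgrades.values())
    return GRADE_LEVELS[min(total, len(GRADE_LEVELS) - 1)]


# Hypothetical judgement for one rating (factor names are illustrative).
quality = grade_quality({"risk of bias": 1, "inconsistency": 1})
print(quality)  # low
```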
Summary of Content Validity Assessment
In summary, as per the COSMIN guidelines and the methodology to assess content validity, a structured and step-by-step approach was presented.
A number of aspects are examined systematically in sequence, and the relevant outcomes need to be reported in a systematic review:
-
Quality of PROM development process (step 1)
-
Quality of content validity studies on the PROM (step 2)
-
Overall ratings for relevance, comprehensiveness and comprehensibility, as well as a summative overall content validity rating (step 3)
Evaluation of Internal Structure
When evaluating internal structure, the properties that need to be assessed include structural validity, internal consistency and cross-cultural validity, as defined previously.
As per the definition of internal structure, at this stage reviewers need to evaluate whether the items in a scale or subscale are appropriately correlated manifestations of the same underlying construct. Consequently, this step is relevant only for PROMs based on a reflective model (not a formative one).
COSMIN recommends three steps for assessing internal structure, which are summarised in Fig. 4.6.
In the first step, the COSMIN Risk of Bias Checklist is utilised, by completing the relevant boxes for structural validity, internal consistency and cross-cultural validity/measurement invariance, as demonstrated below [10].
In the second step, data extraction is performed from studies on PROMs, focusing on patient characteristics, methods and timings of administration, interpretability, feasibility and results on measurement properties.
COSMIN provides tables that facilitate and guide this data extraction (Fig. 4.7). The extracted results are then evaluated against the criteria for good measurement properties (Table 4.8).
In the third step, reviewers should perform a quantitative pooled analysis or a qualitative summary, and evaluate the result against the criteria for good measurement properties. Lastly, as described previously, the evidence needs to be graded using the GRADE approach (Table 4.9).
These tables are presented as examples, with the intention of providing the researcher with an initial overview of the process. The thorough and extensive work done by the COSMIN initiative has given us a very precise methodology, which we would be duplicating if we were to describe these processes in more detail. Therefore, we strongly recommend that researchers refer to the relevant manuals and checklists, as cited throughout the chapter, which can also be found on the COSMIN website.
Evaluation of Reliability, Measurement Error, Criterion Validity, Hypotheses Testing for Construct Validity and Responsiveness
The remaining measurement properties are assessed by a similar process, using the respective COSMIN Risk of Bias Checklist boxes, examples of which are shown below (Tables 4.10, 4.11, 4.12, 4.13, and 4.14).
Report and Selection of Most Suitable PROM
This final stage consists of evaluating interpretability and feasibility, formulating the recommendations and reporting the systematic review.
-
Evaluation of Interpretability and Feasibility (Fig. 4.8)
These are assessed with the use of the relevant tables.
-
Formulation of Recommendations
COSMIN suggests dividing PROMs into three categories, according to the quality of evidence. In that way, the reviewers can assess and define which of the PROMs they assessed would be recommended for further use in the field, which require further studies and improvements, and which should not be used.
The categories are shown below.
-
(A)
Recommended
PROMs with evidence for sufficient content validity (any level) AND at least low quality evidence for sufficient internal consistency
-
(B)
Further research required
PROMs not categorised in A or C
-
(C)
Not recommended
PROMs with high quality evidence for an insufficient measurement property
-
Reporting the Systematic Review
Reporting should be performed following the PRISMA guidelines [11], and it is suggested that the report follows the flowchart that was presented initially (Fig. 4.9).
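Returning to the formulation of recommendations, the three categories (A, B and C) described above can be sketched as a simple decision rule. The class `PropertyEvidence`, the function `categorise_prom` and the dictionary keys are hypothetical names chosen for this illustration. The sketch also assumes that category C is checked before category A, which is a conservative choice of ours and is not stated in the text above.

```python
from dataclasses import dataclass


@dataclass
class PropertyEvidence:
    rating: str   # "+" sufficient, "-" insufficient, "?" indeterminate
    quality: str  # "high", "moderate", "low" or "very low"


def categorise_prom(evidence):
    """Assign a PROM to category 'A', 'B' or 'C'.

    `evidence` maps a measurement property name to its PropertyEvidence.
    """
    # C: high quality evidence for an insufficient measurement property.
    # Assumption: C is checked first, so it overrides A.
    if any(e.rating == "-" and e.quality == "high" for e in evidence.values()):
        return "C"
    content = evidence.get("content validity")
    consistency = evidence.get("internal consistency")
    # A: sufficient content validity (any level of evidence) AND at least
    # low quality evidence for sufficient internal consistency.
    if (
        content is not None
        and content.rating == "+"
        and consistency is not None
        and consistency.rating == "+"
        and consistency.quality in ("high", "moderate", "low")
    ):
        return "A"
    # B: everything not categorised in A or C.
    return "B"


# Hypothetical summary of the evidence for one PROM.
example = {
    "content validity": PropertyEvidence("+", "very low"),
    "internal consistency": PropertyEvidence("+", "low"),
}
print(categorise_prom(example))  # A
```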
Limitations and Considerations
We have chosen to present the COSMIN methodology as a roadmap for performing systematic reviews on measurement properties of PROMs, mainly due to the structured approach and detailed recommended process.
Researchers who are interested in performing a systematic review on measurement properties of PROMs need to be aware of potential limitations prior to committing to this methodology.
In a recent article, McKenna and Heaney raised several points, which we consider useful to mention briefly here [12].
The authors claim that there is a lack of evidence to support the COSMIN recommendations. They discuss that the guidelines have been produced based on empirical evidence and the experience of the COSMIN steering committee.
In addition, although Delphi studies were performed to reach agreement and produce recommendations in a scientifically robust manner, there may be concerns about how inclusive the panel of participating professionals was.
A further point concerns the omission of several aspects of PROM assessment that the authors consider significant, such as construct theories, fundamental measurement, unidimensionality, and item generation and reduction.
Moreover, they note that there has been no actual evaluation of the COSMIN guidelines themselves. Overall, the critique concludes that the COSMIN guidelines and recommendations are not evidence-based.
Lastly, the most significant point relates to who utilises and attempts to follow the COSMIN methodology.
As the vast majority of the researchers performing these reviews are clinicians, and given the complexity of the COSMIN guidance, it may be inferred that they lack the necessary expertise to interpret and evaluate the relevant information, and hence produce inaccurate reviews and recommendations.
Overall, we feel that this chapter can introduce a researcher to the basics of performing systematic reviews on measurement properties of PROMs, and that the COSMIN methodology and guidelines can be used because they offer a thorough, step-wise approach.
Nevertheless, the limitations discussed carry some weight, particularly with regard to the researcher's expertise and background in the field. These should be meticulously taken into account, and the research team should certainly consider the involvement of professionals with a strong background in measurement, psychometrics, statistics and health-related quality of life research.
References
1. Chapter 18: Patient-reported outcomes. Cochrane training. Available at https://training.cochrane.org/handbook/archive/v6/chapter-18. Accessed 4 Oct 2022.
2. COSMIN. Improving the selection of outcome measurement instruments. Available at https://www.cosmin.nl/. Accessed 4 Oct 2022.
3. Mokkink LB, et al. Evaluation of the methodological quality of systematic reviews of health status measurement instruments. Qual Life Res. 2009;18:313–33.
4. Mokkink LB, et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol. 2010;63:737–45.
5. Prinsen CAC, et al. COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual Life Res. 2018;27:1147–57.
6. Mokkink LB, Prinsen CAC, Patrick DL, Alonso J, Bouter LM, de Vet HCW, Terwee CB. COSMIN methodology for systematic reviews of Patient-Reported Outcome Measures (PROMs): user manual. 2018.
7. Terwee CB, et al. COSMIN methodology for assessing the content validity of PROMs. Available at https://www.cosmin.nl/wp-content/uploads/COSMIN-methodology-for-content-validity-user-manual-v1.pdf. Accessed 4 Oct 2022.
8. Terwee CB, et al. COSMIN methodology for evaluating the content validity of patient-reported outcome measures: a Delphi study. Qual Life Res. 2018;27:1159–70.
9. GRADE Home. Available at https://www.gradeworkinggroup.org/. Accessed 4 Oct 2022.
10. Mokkink LB. COSMIN Risk of Bias checklist [PDF File]. The Amsterdam Public Health Research Institute; 2018. p. 1–37.
11. Page MJ, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.
12. McKenna SP, Heaney A. Setting and maintaining standards for patient-reported outcome measures: can we rely on the COSMIN checklists? J Med Econ. 2021;24:502–11. https://doi.org/10.1080/13696998.2021.1907092.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Argyriou, O., Chatzikonstantinou, M., Patel, V., Athanasiou, T. (2023). Methodology for Systematic Reviews on Measurement Properties of Patient Reported Outcome Measures (PROMS). In: Athanasiou, T., Patel, V., Darzi, A. (eds) Patient Reported Outcomes and Quality of Life in Surgery. Springer, Cham. https://doi.org/10.1007/978-3-031-27597-5_4
DOI: https://doi.org/10.1007/978-3-031-27597-5_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-27596-8
Online ISBN: 978-3-031-27597-5
eBook Packages: Medicine (R0)