Introduction

Preimplantation genetic testing with in vitro fertilization, specifically when testing for monogenic conditions (IVF+PGT-M), reduces the risk of having a child affected by a known heritable condition. Patients undergoing the IVF process who are carriers for heritable conditions can have embryos screened before implantation with 97–100% diagnostic accuracy [1,2,3]. IVF+PGT-M facilitates conception of unaffected children. Despite the potential benefits of IVF+PGT-M for people with heritable conditions or traits who wish to have biological children, PGT-M is used in fewer than 2% of total IVF cycles [4, 5]. The barriers to IVF+PGT-M uptake by families with elevated genetic risk include cost, ethical concerns, emotional strain, and lack of awareness among potential users [6,7,8].

Since the first use of assisted reproductive technologies (ART), physicians have struggled to support patient awareness and understanding of their options [9]. Most patients at high risk of passing on genetic conditions are unaware of IVF+PGT-M [9,10,11,12]. A systematic review of studies among adults with known hereditary risk for cancer reported a cumulative 35% awareness of PGT-M as a reproductive option [9]. Patient education materials (PEMs) are a way to fill this knowledge gap. However, 43% of Americans (93 million people) have basic or below basic literacy skills [13], and most PEMs are written without attention to patient understanding and at reading levels that are inaccessible to this population [14,15,16,17,18,19,20,21,22,23].

When the mismatch between information provided in PEM and what patients understand was identified [24], scales were developed to assess facets of the literacy demand of PEM. Quantitative tools, like the Simple Measure of Gobbledygook (SMOG), measure readability through syllable and sentence count. Readability scores given by tools such as the SMOG are scaled to estimate the American school grade level required to understand the material [25]. Qualitative tools use agree/disagree questionnaires to determine the presence or absence of characteristics shown to improve patient comprehension. The Agency for Healthcare Research and Quality created and validated the Patient Education Materials Assessment Tool (PEMAT) to measure PEM understandability [26, 27], and the CDC created and validated the Clear Communication Index (Index) to measure PEM clarity and comprehensibility [28, 29].

The Joint Commission and the Institute of Medicine have both examined the negative effects of low health literacy on patient care and provided recommendations for assessment and improvement [30, 31]. The Joint Commission indicates that PEM should be written at or below a 5th grade reading level [32], but across medical disciplines, significant deviation from this recommendation occurs [14,15,16,17,18,19,20,21,22,23].

Whether PEM about IVF+PGT-M meets readability guidelines issued by the Joint Commission, understandability standards set by the Agency for Healthcare Research and Quality (AHRQ), or clarity/comprehensibility metrics created by the CDC is unknown. The purpose of this study was to assess the readability, understandability, and clarity/comprehensibility of existing PEM about IVF+PGT-M. We hypothesized that IVF+PGT-M information would fail to meet the Joint Commission, AHRQ, and CDC standards [26, 28, 32].

Material and methods

This study was considered exempt by the Johns Hopkins Institutional Review Board and occurred in two stages. First, we performed an environmental scan to identify PEM. Environmental scanning is a validated method for performing qualitative needs assessments. The technique applies a rigorous, systematic search protocol to academic databases and patient-facing sources, such as consumer search engines [33]. Second, we assessed the PEM that met study inclusion criteria using three validated tools: the SMOG, PEMAT, and Index.

Environmental scan to identify patient education materials

The environmental scan was designed using previously published approaches to this method [34,35,36,37]. One member of the study team (ME) performed the environmental scan using a search protocol designed in consultation with an informationist at the Johns Hopkins School of Medicine’s Welch Medical Library. The methods are summarized in Table 1. To identify peer-reviewed literature, searches were performed in EMBASE, PubMed, PubMed Central, CINAHL, EBSCO Health, Cochrane Library, SCOPUS, and Web of Science [34,35,36]. PEM was identified from the websites of professional organizations related to ART and the five most common genetic conditions for which PGT-M is used: balanced translocations, sickle cell disease, cystic fibrosis, Huntington’s disease, and Duchenne muscular dystrophy [38]. Known databases containing decision aids and PEM were browsed or searched [34]. We also reviewed materials identified among the first fifty [39] results for several queries on Google, a consumer search engine. To decrease the chance that the environmental scan overlooked widely used, high-quality materials or databases, we contacted ten genetics counselors from the Johns Hopkins University McKusick-Nathans Institute of Genetic Medicine and the Division of Maternal-Fetal Medicine to learn their most frequently used IVF+PGT-M resources; three of the genetics counselors provided answers for the group.

Table 1 Search strategy for environmental scan

PEM inclusion/exclusion criteria

ASRM practice committee guidelines indicate that PGT-M became an established intervention after 2004 [40]. Therefore, identified materials were included if they were published or updated after 2004. Materials were also required to meet the following inclusion criteria: written in or translated into English, described in vitro fertilization and preimplantation genetic testing (or preimplantation genetic diagnosis), targeted patients as the primary audience, and accessible as a free resource or via academic subscription. Materials were excluded if any of the following criteria were met: communicated the majority of content through audio or visual format, failed to describe the in vitro fertilization process, or mentioned PGT/PGD in passing without describing the purpose or procedure of the technology.

Patient education material evaluation

The SMOG [25], PEMAT [27], and Index [28] are publicly available and intended for use by untrained reviewers [27, 28]. A previous study used this combination of tools [14] because they evaluate different aspects of literacy demand and provide a multi-faceted assessment of how PEM content and format affect a patient’s ability to comprehend the presented information. We selected these three scales to holistically evaluate multiple aspects of the identified materials. Two members of the research team (ME, PK) scored the PEMs independently. Final scores were reached as described below. Table 2 provides an overview of the purpose, format, and standards associated with each scale.

Table 2 Reference table for evaluation scales

Simple Measure of Gobbledygook [25]

The SMOG provides a quantitative estimate of the grade level in the American public education system one would have to complete to understand a PEM. The assessment is performed by counting the number of words with three or more syllables in thirty representative sentences from each material. The scale is reported to have a standard deviation of precision of 1.5 grade levels. Reviewers count syllables following standardized, published procedures. Existing guidelines do not provide formal guidance on how to count abbreviations. Abbreviations are monosyllabic terms that are usually used to simplify multi-syllabic phrases. In PEMs that use many abbreviations, how abbreviations are scored can affect outcomes. Most research teams calculate SMOG scores by counting the number of syllables in the words the letters represent [41, 42]. The abbreviations IVF, PGT, and PGD occur frequently in the PEM evaluated in this study. Using typical scoring assumptions, these would, respectively, count as one (fertilization), two (preimplantation and genetic), and three (preimplantation, genetic, and diagnosis) polysyllabic words. To determine whether defining abbreviations as monosyllabic or polysyllabic would change the PEM reading level, the two graders used different assumptions: grader 1 (PK) counted all abbreviations as monosyllabic and grader 2 (ME) counted the abbreviations as the words the abbreviations represent.
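The effect of the two abbreviation assumptions on the grade estimate can be sketched as follows. This is a minimal illustration, not the study's scoring procedure: it uses the commonly cited SMOG regression formula (the text does not state which variant of the formula was applied), the function names and the sample counts are hypothetical, and only the per-abbreviation polysyllable counts (IVF = 1, PGT = 2, PGD = 3) come from the description above.

```python
import math

# Polysyllabic words represented by each abbreviation, as described in the text.
EXPANDED_POLYSYLLABLES = {"IVF": 1, "PGT": 2, "PGD": 3}


def smog_grade(polysyllables: int, sentences: int = 30) -> float:
    """Commonly cited SMOG formula (assumed variant), normalized to 30 sentences."""
    if sentences <= 0:
        raise ValueError("sentences must be positive")
    return 1.0430 * math.sqrt(polysyllables * (30 / sentences)) + 3.1291


def polysyllable_total(base_count: int, abbreviation_counts: dict, expand: bool) -> int:
    """Total polysyllabic words in the sample.

    base_count: polysyllabic words counted by hand, excluding abbreviations.
    abbreviation_counts: occurrences of each abbreviation in the sample.
    expand: False treats abbreviations as monosyllabic (they add nothing);
            True counts the polysyllabic words each abbreviation stands for.
    """
    if not expand:
        return base_count
    return base_count + sum(
        EXPANDED_POLYSYLLABLES[abbr] * n for abbr, n in abbreviation_counts.items()
    )


# Hypothetical sample: 100 polysyllabic words plus 10 uses of IVF and 5 of PGT.
sample_abbreviations = {"IVF": 10, "PGT": 5}
grade_monosyllabic = smog_grade(polysyllable_total(100, sample_abbreviations, expand=False))
grade_expanded = smog_grade(polysyllable_total(100, sample_abbreviations, expand=True))
print(round(grade_monosyllabic, 1), round(grade_expanded, 1))
```

Because the grade grows with the square root of the polysyllable count, expanding abbreviations raises the estimate most in materials that use them heavily.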

Patient Education Materials Assessment Tool [26, 27]

The PEMAT qualitatively evaluates PEM understandability using a series of agree/disagree questions related to content characteristics, such as word choice, formatting, and illustrations. The PEMAT includes two free-standing sections, actionability and understandability. The phrasing of actionability questions is more appropriate for PEM encouraging behavior change. Since PEMs for IVF+PGT-M are not encouraging behavior change, only understandability was assessed. Scores are reported as percentages.
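As a rough sketch of how agree/disagree ratings become a percentage, the snippet below assumes that each item is rated agree, disagree, or not applicable, and that not-applicable items are left out of the denominator. That handling reflects our reading of the PEMAT user's guide and should be checked against it; the function name and the example ratings are hypothetical.

```python
from typing import Optional, Sequence


def pemat_percent(ratings: Sequence[Optional[bool]]) -> float:
    """Percentage of applicable items rated 'agree'.

    True = agree, False = disagree, None = not applicable (assumed to be
    excluded from the denominator).
    """
    applicable = [r for r in ratings if r is not None]
    if not applicable:
        raise ValueError("at least one applicable item is required")
    return 100.0 * sum(applicable) / len(applicable)


# Hypothetical material: 13 items agreed, 4 disagreed, 2 not applicable.
example_ratings = [True] * 13 + [False] * 4 + [None] * 2
print(round(pemat_percent(example_ratings), 1))
```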

CDC Clear Communication Index [28, 29]

The Index is used to evaluate clarity and comprehensibility through questions that evaluate the presence of a main message and clear discussion of risk, among other characteristics known to support patient understanding. Scores are reported as percentages. PEMs that optimize clarity/comprehensibility approach 100% [27]. The Index “passing score” is 90% [28].

Scoring

For the quantitative scale measuring reading level, the SMOG, the final score was determined by taking the average of each reviewer’s independent score. For the qualitative scales, two graders (ME and PK) each read the PEMAT and Index’s user’s guides [27, 28], and independently evaluated the materials. The reviewers then reached consensus on each item of the PEMAT and Index in order to improve trustworthiness of results, specifically the confirmability [43]. Final scores for the PEMAT and Index were calculated according to the respective user’s guides after the reviewers reached consensus on each line item in the scales.

Data analysis

The SMOG grade level for each material was calculated as the arithmetic mean of the two graders’ scores. Trends regarding readability, understandability, and clarity/comprehensibility were reported as median scores with IQR for the sample of 17 materials. Inter-rater reliability was calculated for each scale using the appropriate statistical test: Cohen’s kappa for the categorical PEMAT and Index and Spearman’s rho for the ordinal SMOG scores.
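The summary statistics named above can be illustrated with a short, standard-library sketch. It is not the authors' analysis code: the function names are hypothetical, the quartile method (inclusive) is an assumption because the text does not specify one, and Spearman's rho is computed as the Pearson correlation of average ranks.

```python
from collections import Counter
from statistics import mean, median, quantiles


def median_iqr(scores):
    """Median with first and third quartiles (inclusive method, assumed)."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores), q1, q3


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical ratings of the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sample is constant")
    return cov / (var_x ** 0.5 * var_y ** 0.5)


# Hypothetical grader scores for one material: the final SMOG grade is their mean.
final_smog = mean([13.0, 15.0])
```

Kappa is applied to the item-level agree/disagree ratings before consensus, whereas rho compares the two graders' grade-level estimates across materials.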

Results

An environmental scan yielded 17 materials that met inclusion criteria (Table 3). Among these, four were from patient education databases, six were from the consumer search engine queries, five appeared on the webpages of professional organizations, and two were identified by following links on professional organization webpages. Key informant input from genetics counselors did not lead to the identification of additional materials. All materials were free online; four were also available in PDF form. PEM ranged in length from 594 words to 3405 words. Most (n = 14/17) provided visual aids including graphics (n = 10), photos (n = 9), or videos (n = 2).

Table 3 Title, producing organization, and search strategy approach for identified materials

Readability was evaluated using the SMOG. The median SMOG reading level was grade 14.5 (IQR, 13.9–15.5). Of the 17 materials, 15 scored at or above the collegiate reading level (13+ grade) and two scored at the 12th grade reading level, or the level of an American high school graduate (Table 4). Inter-rater reliability yielded a Spearman’s rho of 0.88, indicating a “very strong correlation” [44]. On 14 materials, grader 1 and grader 2 reported the same grade level or grade levels one apart, a negligible difference. On three materials, grader 1, who was counting abbreviations as monosyllabic, reported reading levels two grades lower than grader 2, who was counting abbreviations as the words they represented.

Table 4 Readability, understandability, and clarity/comprehensibility scores for PEM about IVF+PGT-M

Understandability was evaluated using the PEMAT. The median PEMAT score was 74.2% (IQR, 67.3–82.8%) (Table 4). Prior to consensus-building, kappa inter-rater reliability was 0.51, indicating “moderate” agreement [45]. Most or all PEM used clear numbers (n = 17/17), employed visual cues (n = 16/17), stated the purpose of the material clearly (n = 14/17), and used appropriate titles or captions with visual aids (n = 11/14). Only one provided a summary. About half used common/everyday language (n = 8/17), omitted content that distracts from the purpose (n = 9/17), or appropriately defined medical terms (n = 9/17). Figure 1 shows the percent of materials that met selected itemized PEMAT evaluation standards.

Fig. 1 Composite performance on questions for PEMAT (P) and Index (I) scales

Clarity/comprehensibility was evaluated using the Index [28]. No PEM received a “passing” Index score of 90%; the median score was 73.3% (IQR, 51.6–80.3%) (Table 4). Prior to consensus-building, kappa inter-rater reliability was 0.54, indicating “moderate” agreement [45]. Most PEM clearly stated and explained numbers (n = 17/17), explained risks and benefits (n = 17/17), and organized materials into chunks with headings (n = 15/17). Few “always” selected words the audience uses (n = 2/17), employed visual cues (boldface, font size, boxes) to support the main message (n = 5/17), or included illustrations or graphics that supported the main message (n = 6/17). Figure 1 depicts the percent of materials that met selected Index standards of evaluation.

Conclusions

Among the 17 patient education materials about IVF+PGT-M identified using a rigorous environmental scan protocol, none met the standards set by the Joint Commission [32] or the CDC [28, 29]. The median PEM reading level of grade 14.5 (IQR, 13.9–15.5) indicates these materials are accessible only to readers with some college education. These materials are inaccessible to the almost 40% of Americans who never seek higher education [46]. Even when abbreviations were counted as monosyllabic words, materials were written at the 12th grade level or above. All PEMs for IVF+PGT-M failed to meet the Joint Commission requirement that PEM be written at or below a 5th grade reading level [32].

The median PEMAT score was 74.2% (IQR, 67.3–82.8%), and the highest PEMAT score was 88%. A 70% passing score was used in the initial PEMAT validation study, but there is no established PEMAT score goal, and effective materials approach 100% [26]. Five of the reviewed materials scored less than 70% and are therefore considered “not understandable.” Two of these five materials were from an academic medical center and a professional society.

According to the CDC, the Index passing score is 90% [28]; no evaluated materials met this score. The evaluated PEM had a median Index score of 73.3% (IQR, 51.6–80.3%), roughly seventeen percentage points below the recommendation. The highest score was 86.7%, which still fell below the CDC’s PEM performance threshold.

These results indicate that PEMs about IVF+PGT-M are not produced at a literacy demand level that most patients can understand, which is consistent with findings reported in the literature related to other materials from a variety of medical disciplines. Studies involving PEM regarding neurosurgery, otolaryngology, sickle cell disease, and others have uniformly demonstrated poor readability, understandability, and comprehensibility [14, 16,17,18, 20]. In obstetrics and gynecology, studies of PEM addressing labor analgesia, overactive bladder, and general obstetrical and gynecological queries have reading levels from ninth to seventeenth grade and PEMAT and Index average percentages ranging from 60 to 80% [15, 21,22,23]. We suspect that frequently appearing, polysyllabic vocabulary associated with IVF+PGT-M, including “artificial insemination” and “preimplantation,” may explain why the PEMs we evaluated had even higher reading levels than those reported in other obstetrics and gynecology materials. The field must find alternative ways to explain these complex topics in order to educate all patients sufficiently.

Based on the data from this study, existing PEMs usually failed to meet the following criteria: providing a summary, using common language, omitting tangential information, and supporting the main message with visual cues and illustrations (Fig. 1). Previous studies reporting question-by-question PEMAT and Index scores for PEMs from diverse fields also found that materials performed poorly on these characteristics [16,17,18, 20, 47]. Among these studies, some PEMs also demonstrated low performance on appropriately captioning visuals and stating the purpose of the material clearly [16,17,18, 20], but the IVF+PGT-M materials in this study scored well in these domains. Organizations, clinics, or individuals developing new PEMs can improve their product by addressing these common shortcomings. The CDC webpage “Plain Language Materials & Resources” offers helpful resources to those interested in improving their communication.

Clinical implications

The high literacy demand of PEM about IVF+PGT-M may have consequences for patients and families. In a pair of nationwide studies, 95.3% of 535 neurologists and psychiatrists [48] and 90.7% of 220 internists [49] felt underinformed to discuss PGT-M with their eligible patients. Yet doctors who manage chronic genetic conditions, like neurologists, psychiatrists, and internists, are well-positioned to introduce patients to reproductive options and refer them to reproductive endocrinologists or genetics counselors. A readable, understandable, clear pamphlet could support physicians as they introduce this technology to patients. Even physicians who are uncomfortable discussing IVF+PGT-M, or who lack time during clinic visits to do so, could provide patients with this information using such a written education tool. Alternatively, if patients had access to understandable PEMs independent of their physicians, through patient portals, online tools, or community organizations, they could be more proactive in self-referring for this intervention.

Women of every education level should be aware of comprehensive reproductive options and have access to standards of care. Patients who use ART to achieve pregnancy are more likely to be white and to have higher levels of education and income than those with infertility who do not seek treatment [50, 51]. Patients eligible for ART who have lower education levels are less likely to be counseled regarding or referred for ART [52]. Certainly, high cost of treatment is a significant barrier to care. However, disparities persist in settings of equal access, such as the military or states with ART insurance coverage mandates [50, 53]. Patients’ difficulties in understanding diagnoses and treatment options have been implicated in preventing less educated or literate patients from seeking treatment [51]. Assessing and improving patient education material to match the literacy levels of all eligible patients (i.e., by matching the literacy level of the adult US population) would remove one barrier to ART use.

Strengths and limitations

Among the strengths of this study is the rigorous study design. The search strategy was systematic and included peer-reviewed sources, gray literature, databases, and a consumer search engine. This systematic and reproducible approach identified materials from diverse sources. Second, this study assessed literacy demand using three assessment tools. While some studies limit PEM evaluation to scales that consider sentence and word length, this study included quantitative and qualitative assessments of PEM, providing a more thorough and thoughtful assessment of PEM quality. This kind of thorough evaluation facilitates concrete suggestions for improvement.

Another strength of our study was the high inter-rater reliability. Inter-rater reliability for the SMOG scale was “very strong” according to standards in the statistics literature with a Spearman’s rho of 0.88 [44]. Inter-rater reliability for PEMAT and Index, Cohen’s kappa scores of 0.51 and 0.54, respectively, was considered “moderate” based on standards in the literature [45]. To provide context, the original study validating the PEMAT had a kappa of 0.57 [27]. We did not identify a study that reported a Cohen’s kappa value for Index scores, likely because the Index was developed recently. Additionally, after scoring independently, the two graders reached consensus on each individual item of the PEMAT and the Index. This consensus-building process forced deeper thought and deliberation, which increases the trustworthiness of the scores [54].

The study also has several limitations. We have no certainty that these materials are used by patients or in the clinical setting. However, the environmental scan methodology included society websites, a consumer search engine, and key informant interviews to maximize the likelihood of capturing materials used in clinics. Reading levels may be over- or under-estimated, based on our handling of abbreviations, as described above. Even with this adjustment in scoring technique, these materials would still have a median reading level of 13th grade, eight grade levels above the Joint Commission’s recommendation [32]. Developing PEM for IVF+PGT-M is challenging. Avoiding polysyllabic words (such as embryo, uterus, and preimplantation) to explain IVF+PGT-M may not be possible, making the 5th grade reading level as estimated by SMOG an inappropriate target.

Future directions

Assessment of the accessibility of multimedia education platforms for patients and physicians was beyond the scope of this work. Multimedia education platforms are important and increasingly popular [55]. However, many patients prefer to learn from printed education materials [56]. Paper and audio-video formats contribute uniquely to patient understanding, and future studies could compare literacy-appropriate PEM to multimedia platforms to assess how each improves patients’ understanding. Our study focused specifically on PGT for monogenic conditions. Future studies could apply a similar assessment to the challenge of explaining PGT for aneuploidy. Others could assess the effect of presenting a literacy-appropriate PEM to eligible patients on attitude towards use of IVF+PGT-M as a reproductive option. Finally, issues of cost and coverage, which differ by state, medical indication, and insurance status, can change abruptly. Future studies could assess patient preferences for when and how they learn about the costs of reproductive technologies.

In this systematic study of 17 PEMs for IVF+PGT-M, the reading level and literacy demand of the materials exceeded the literacy level of average Americans, and the materials failed to meet Joint Commission or CDC guideline recommendations. Materials that communicate information about IVF+PGT-M at a level that can be understood by general patient populations are needed.