Dear Editor,

In the Journal’s recent issue, Qorbani et al. [1] conducted a systematic review and meta-analysis to determine whether a dietary diversity score is associated with cardio-metabolic outcomes. Given the importance of evidence synthesis for nutrition policy and practice, robust review methodology is necessary to avoid spurious conclusions about the health impacts of a mixed diet.

Dietary diversity is a long-standing nutritional concept describing whether a diet comprises a mix of different types of foods or food groups. In fact, there remains no standardized methodology for assessing dietary diversity, although this heterogeneous literature typically uses two broad approaches: (1) between-group diversity (also known as total dietary diversity), a score for the total diet measuring counts, or proportions, of different food items or food (sub)groups; and (2) within-group diversity, a score for a given food group measuring counts, or proportions, of different food items or subgroups. Similar methods are often described using diverse nomenclature, with at least 13 score names (Table 1) often used interchangeably within a single study. This interchangeable use of terms complicates evidence synthesis on this topic, since relevant studies either use the same methodology to assess dietary diversity (e.g. a count of different food groups consumed) under different exposure descriptions (e.g. DD, DDS, or DVS) [2,3,4], or use different methods to assess diversity (e.g. count vs proportion of foods) under the same exposure label (DDS). The diversity of nomenclature and methods in this field raises the question of why this review neither used more comprehensive inclusion criteria nor justified its exclusion of relevant exposure measures.

Table 1 Different terms used to define total dietary diversity in the literature

This topic warrants a depth of nutritional knowledge and a nuanced understanding of nutritional epidemiology that appear to be limited in this review, adding to our concern that the review did not adhere to its PROSPERO protocol. The authors chose only two terms for their search strategy, leading to significant selection bias in the included studies, which were predominantly cross-sectional in design. Hence, this review cannot provide evidence to support causal inference, which is needed to link dietary diversity with risk of diabetes. By searching only for ‘dietary diversity score’ and ‘food variety score’, this review omitted at least seven other studies on diabetes (Table 2), including a robust and rare longitudinal paper measuring dietary diversity by food groups [2]. The review’s selection bias is further exacerbated by the omission of EMBASE and conference publications, despite both being specified in the pre-registered search strategy. This selection bias has largely affected the internal validity of the review, in which the authors conclude there is no association between dietary diversity and most cardio-metabolic risk factors, whereas the excluded studies show mixed results for total dietary diversity (Table 2), with two longitudinal studies reporting that greater diversity within vegetables significantly reduces the risk of diabetes [1].

Table 2 Studies on total dietary diversity and diabetes omitted from the review

Another methodological and conceptual concern about this review relates to the incoherence between its stated aims and its inclusion/exclusion criteria. The authors state in both the abstract and the main text that they consider dietary diversity to be one aspect of diet quality. However, dietary diversity/food variety is a measure of diet quality only in the context of developed countries. There is a large body of scientific literature on dietary diversity as a measure of child malnutrition in low-income settings, where food scarcity, not diet quality, is the public health concern. Hence, the same term lacks construct validity across the development spectrum, and it would not be methodologically appropriate to combine effect sizes from studies in such different settings. Unfortunately, the current review lacked this specification in its search criteria, and its conclusions may therefore be invalid due to the inappropriate inclusion of some studies.

Finally, one of the unique insights offered by evidence synthesis is the quality appraisal of included studies, and for those insights to be meaningful the appraisal tool and its purpose must be appropriate and clear. Herein lies our greatest concern with this review. Although the authors reported that they assessed the ‘risk of bias’ of included studies, they employed the Newcastle–Ottawa Quality Assessment Scale (NOS), which is not a tool for that purpose. Rather, the NOS assesses the quality of studies without directly determining the level of bias; its scoring system (scaled to 10 for cross-sectional studies and 9 for cohort studies) is based on awarding asterisks, with some questions worth two asterisks. An arbitrary cut-point of 6 is usually used to classify the final NOS score as indicating high (≥ 6) or low (< 6) quality of evidence. Even if we accept the authors’ claim that the NOS assesses bias, the high number of affirmative responses in their Table 5 would indicate a high level of bias in the included studies; however, the authors reported that these studies were assessed as “having low risk of bias across domains”. The distinction between quality and bias seems relevant here.

A further methodological concern is the unexplained alterations made to the NOS criteria. For example, the NOS has one more question for cohort studies (n = 8) than for cross-sectional studies (n = 7), yet the authors indicated that they appraised study quality using eight criteria regardless of study design. Moreover, the authors changed the NOS asterisk approach to yes/no scoring, which obscures how affirmative responses to the quality criteria can enable adequate and robust grading of the evidence. The authors also did not appear to use their quality appraisal when interpreting their data or synthesizing the non-significant results, which they attributed to inconsistency across studies. The reasons for this inconsistency should have been explored using the NOS; for example, the lack of pooled associations might be due to null findings from low-quality studies with comparability issues arising from a failure to adjust statistically for the quantity of food intake.

Evidence synthesis through meta-analysis is necessary to improve public health nutrition strategies, but a review can add value only when the scientific literature is carefully and transparently searched, extracted and quality appraised. This review missed available longitudinal data, used a poorly defined exposure and lacked appropriate quality appraisal. Caution is therefore warranted when interpreting the review’s findings, as they may offer only limited insight into a topic that remains understudied.