
8.1 Introduction

8.1.1 What Is Translational Research

Translational medicine (TranM), according to the European Society for Translational Medicine definition, is “an interdisciplinary branch of the biomedical field supported by three main pillars: bench-side, bed-side and community.” The goal of TranM is to combine disciplines, resources, expertise, and techniques within these pillars to promote enhancements in the prevention, diagnosis, and therapy of diseases, including oral diseases. Our own interpretation is that “TranM is a branch of science where what has been learned from the laboratory bench is tested in clinical settings and finally applied to the population to reduce human diseases.” Thus, translational research (TranR) pertains to all the research activities preceding TranM, including laboratory assays and their application to individual patients and to populations at large. At every step of this application, it is vital that the principles of epidemiology, especially the evidence-based principles, are adhered to.

Even as early as 400 BC, Hippocrates recognized the importance of precise fact-finding methodology in medicine. To assess scientific facts precisely, one needs appropriate methodologies, which often require accurate laboratory procedures that generate reliable, consistent, and reproducible data when measuring biomarkers or other pertinent molecules. Both simple/classical and state-of-the-art technologies can meet the prerequisites for robust results, but both remain subject to the principles of the evidence-based approach. Elaborate laboratory experiments that employ sophisticated technology do not remove the need to recognize and apply these principles, which include:

  1. Development of a scientifically sound rationale
  2. Establishing a causal relationship
  3. Controlling for other competing factors (confounding)
  4. Minimizing biases both in human subjects and instruments
  5. Reducing measurement errors
  6. Utilizing the appropriate statistical methods

Traditionally, not all laboratory scientists are well-informed in these areas of evidence-based methodology, and those who are may overlook these principles as they become weakened or obscured by various competing interests. Evidence-based principles therefore remain a critical and invaluable component of translational research and warrant elaboration in this chapter. We will detail the importance of an evidence-based approach in dental TranR and highlight some examples where these principles were applied successfully and not so successfully. We also hope to guide laboratory and bench researchers who, while well versed in complex molecular pathways, are not fully aware of these principles, so that this information can be carried into and applied in the clinical setting.

For example, our own cross-sectional study “Salivary immunoglobulins and prevalent coronary artery disease” reported an association between salivary immunoglobulin A (sIgA) and coronary artery disease (CAD) [1]. The word “prevalent” in the title signals that this is a cross-sectional study in which the predictor and the outcome were assessed at the same time. Therefore, this study could not have assessed causality. Prevalence is disease occurrence at one time point, without regard to when the disease began. The disease could therefore have occurred before the predictor was assessed. Incidence, in contrast, refers to the occurrence of new cases after the start of the study, when the predictor was assessed. An incidence rate therefore indicates a longitudinal assessment of the relationship between the predictor and the outcome. However, a longitudinal relationship does not automatically establish causality. We will discuss the establishment of causality in detail in Sect. 8.3.

Although our IgA study adjusted for all pertinent confounding, the cross-sectional study design prevents the inference of causality [1]. Nevertheless, this study is quite important in other ways. If infections are considered a causal driver of the inflammation involved in atherosclerosis, the infection should be assessed on the mucosa, where pathogens first contact the host. Thus, IgA, a marker of mucosal immunity, is an appropriate marker for infection. Many renowned investigators overlooked the importance of choosing the appropriate immunoglobulin [2,3,4]; only a few recognized it [1, 5]. For this reason, Chlamydia pneumoniae IgA was significantly associated with coronary heart disease [5] while Chlamydia pneumoniae IgG was not [4]. Our study showed that oral mucosal immunity measured by salivary IgA was positively associated with CAD, while systemic immunity measured by salivary IgG was inversely associated with CAD [1].

Studies of non-causal association usually report risk as an odds ratio (OR), while studies designed to assess causality usually report an incidence rate (IR). However, an incidence rate does not always establish causality. For example, a newly published study reported that canakinumab, an interleukin-1β inhibitor, decreased the incidence of lung cancers [6]. Although canakinumab lowered the number of new cases of lung cancer, this does not mean that interleukin-1β-mediated inflammation is a causal factor for lung cancer. Lung cancer has a long latency (30–50 years) [7], which suggests that the true cause initiated the pathogenic process 30 or more years earlier. Thus, a factor whose effect was observed within 2 years cannot be the initiating cause, although it is possible that canakinumab modified disease progression after the initiation of pathogenesis. The cancer microenvironment creates a low-immunity milieu that allows cancer cells to avoid detection and destruction by the immune system [8]. The cancer might have been caused by other factors, while inhibiting the inflammatory process might have decelerated its manifestation. In summary, dental and medical researchers need to be critically aware of the difference between association and causality.
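The distinction between the two measures can be made concrete with a small calculation. The sketch below is illustrative only: the function names and all counts are hypothetical and are not taken from the cited studies. It computes an odds ratio from a 2x2 table of prevalent cases and an incidence rate ratio from new cases per person-year of follow-up.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio from a 2x2 table (e.g., a cross-sectional study of prevalent disease)."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


def incidence_rate(new_cases, person_years):
    """New cases per person-year among people who were disease-free at baseline."""
    return new_cases / person_years


def incidence_rate_ratio(cases_exposed, py_exposed, cases_unexposed, py_unexposed):
    """Incidence rate in the exposed divided by the incidence rate in the unexposed."""
    return incidence_rate(cases_exposed, py_exposed) / incidence_rate(cases_unexposed, py_unexposed)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the arithmetic.
    print("Odds ratio:", round(odds_ratio(40, 60, 20, 80), 2))
    print("Incidence rate ratio:", round(incidence_rate_ratio(30, 1500, 15, 1500), 2))
```

The odds ratio needs no information about when the disease began, whereas the incidence rate requires follow-up time after the predictor was assessed. Neither number, by itself, establishes causality.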

8.2 Development of Sound Rationale

This criterion is of foremost importance. No matter how well other criteria are fulfilled, without a sound rationale, the end results can be misleading or may cause a potentially serious and undesirable diversion of health resources. This oversight can be quite subtle, and even very experienced scientists may fail to notice the problems in an ill-conceived rationale at the study initiation stage. If the study involved is a randomized clinical trial (RCT), the repercussions are even greater, because RCTs are considered the gold standard, and it is assumed that all the confounding factors will be balanced across the groups under comparison. However, the bias resulting from an ill-conceived rationale cannot be corrected by the study design or statistical analyses.

One prime example is the Women’s Health Initiative (WHI) group’s estrogen replacement therapy (ERT) and cardiovascular diseases trial [9]. Everything in this trial was done following the principles of RCT epidemiology. However, the investigators failed to recognize irreversible changes that occur with age and included women 70 years or older in the study. Older women not only have subclinical vascular changes that precede cardiovascular events but also show a thrombotic tendency that increases dramatically with age [10].

Thus, additional estrogen increased thrombotic activity and resulted in the increased risk of cardiovascular disease (CVD) seen in that study. Consequently, it is not clear whether estrogen or old age was the cause of the increased CVD risk. The uproar following the publication of the results of this trial forced the WHI investigators to reanalyze the data and publish the results of post hoc subgroup analyses that included only those women who were given estrogen immediately after menopause. These post hoc analyses showed no detrimental effects of ERT given at the right time to appropriate cohorts, and ERT may even have had some beneficial effects on various inflammatory diseases [11,12,13,14]. The key point is that age is an effect modifier in ERT, and the specific timing and the appropriate cohort should be considered.

Another important example is the set of clinical trials, and their meta-analyses, of bisphosphonates (BISPs) given alone or as an adjuvant to chemo-/hormonal therapy in cancer patients, which yielded inconsistent results on whether BISPs have anticancer effects [15,16,17,18,19,20,21]. A more recent report of clinical trials found that, at the doses used for osteoporosis, neither alendronate nor zoledronic acid reduced the risk of breast cancer [22], contrary to the protective effect reported in several observational studies [18, 19]. Furthermore, analysis of data from adjuvant bisphosphonate trials showed no effect on local recurrence or contralateral breast cancer incidence [23]. Hence, although many clinical trials have been conducted over several years, the inconsistencies remain to date. Multiple deficiencies were observed in these studies:

(a) Ill-designed rationale.

(b) Lack of clear understanding of pharmacokinetics of the drug.

(c) Impact of performing clinical trials with mixed patient background such as postmenopausal (ages ~50–70) and premenopausal (ages 35–50), as well as pooling those who are undergoing hormonal therapy or without hormonal therapy.

(d) Predetermined biased attitudes of the investigators based on findings from animal experiments, which showed beneficial effects at doses at least ~100-fold higher than the maximum possible dose for human use.

(e) Different drug dose usage in different clinical trials and variations in length of time for observations.

(f) Different analytical methods used for evaluation of the outcome of drug action.

8.2.1 Preconditions for Mendelian Randomization

A currently popular equivalent of the longitudinal study is Mendelian randomization, which uses genetics to determine the subsequent risk of disease occurrence. This method was hailed as an alternative to the longitudinal study that circumvents long follow-up, confounding, and biases without conducting a traditional randomized trial. Because genes are present at birth, they always precede any disease that occurs later in life. However, many studies ignored the precondition that the disease of interest must be under strong genetic influence. One recent study reported that the gene loci associated with obesity such as FTO, MC4R, and TMEM18 did not predict periodontitis [24]. In our opinion, periodontitis is influenced by epigenetic and lifestyle factors such as aging, smoking, diabetes, and general immune dysfunction. This study showed that genetics play a minor role in the relationship between periodontitis and obesity.

Only 20% of BMI can be explained by genetics [25]. The underlying causes of obesity include complex interactions between genetic traits, low physical activity levels, excess caloric intake, and a type of diet that encourages certain microbial growth in the gut, as well as environmental factors such as access to affordable, healthy food and socioeconomic status [26]. Some twin studies report that 60–70% of BMI can be explained by genetics [27, 28]. However, it should be noted that a cohort of twins cannot be considered an independent population, and this result should not be applied to heterogeneous populations. Moreover, sophisticated gene sequencing cannot overcome a misguided study rationale. Conversely, it could be said that epigenetics and lifestyle factors such as smoking, physical activity, diet quality, and caloric intake may have a stronger influence than genetics on the development of high BMI or periodontitis. These epigenetic factors are modifiable risk factors, and further understanding of epigenetic mechanisms may help prevent burgeoning BMI and/or periodontitis.

In another study, even the leptin receptor gene predicted only a small portion of body weight in a genetically homogeneous population [29]. If lifestyle factors have stronger impacts on disease phenotype, then the genes associated with BMI such as FTO (rs1121980), MC4R (rs17782313), and TMEM18 (rs6548238) will not show any association with periodontitis. Consequently, null results cannot clarify whether “BMI is truly not related to periodontitis” or “BMI to periodontitis relationship is not strongly affected by genes”; they merely point to an inappropriate study rationale. An obvious example of an inappropriate study rationale would be “studying sexually transmitted disease in nuns.”

Another early example of the misguided use of Mendelian randomization was the first report of this kind, regarding the causal role of C-reactive protein (CRP) in CVD. Two studies reported null results discrediting CRP in causal relation to CVD [30, 31]. Our argument does not concern whether CRP is a cause of CVD. Rather, we question the validity of examining genes to determine CRP’s role in CVD. CRP levels change largely under epigenetic and metabolic influences such as increased BMI [32,33,34,35], which is a modifiable CVD risk factor arising from an imbalance between caloric intake and expenditure. As noted above, only 20% of BMI can be explained by genetics [25]; because CRP is a BMI-driven inflammatory marker, the genes associated with CRP may not show any relationship to CVD. In other words, these studies missed 80% of CRP’s role in CVD pathogenesis.

8.3 Criteria for Causality

To reduce human disease, it is necessary to identify the factors that cause the disease and clarify how to minimize exposure to these causative risk factors. In 1965, Sir Austin Bradford Hill, an English epidemiologist, proposed a set of criteria that may suggest a potential causal relationship between factors. The six criteria considered in this chapter are:

(a) Temporality

(b) Strength

(c) Consistency

(d) Specificity

(e) Biological gradient

(f) Biologic plausibility

8.3.1 Temporal Relationship

This criterion is the most important of all the criteria for establishing causality and must always be satisfied. In other words, the cause must precede the outcome at all times. What happens after the disease manifests cannot be the cause of the disease. This seems obvious, but a disease often has a long latency with subclinical pathology progressing for a long time, and it can be difficult to determine whether the predictor is the cause or the result of the disease. For example, self-reported periodontitis was recently found to be associated with non-Hodgkin’s lymphoma (NHL) in a prospective follow-up study [36]. NHL in this study included several slow-growing lymphatic malignancies such as chronic lymphocytic leukemia, small lymphocytic lymphomas, diffuse large B-cell lymphomas, and follicular lymphomas. Certainly, the temporality requirement was satisfied, i.e., the predictor, periodontitis, was assessed before the diagnosis of NHL. However, causality is not as clear in this case because NHL has a long asymptomatic latency that is accompanied by low immunity [37]. Immune dysfunction prior to the cancer diagnosis is quite possible because many immature lymphocytes, which cannot generate strong immunity, crowd the circulatory system. As such, periodontitis may be one manifestation of low immunity originating from yet-to-be-diagnosed NHL. In fact, anemia and leukemia manifest in the periodontium as periodontitis and gingivitis [38]. Therefore, reverse causation is quite possible in the relationship between periodontitis and NHL [39].

Due to the temporality requirement, cross-sectional studies which assess the predictor and the outcome at the same time cannot prove a causal relationship. Unfortunately, in dental research, this causality consideration is often neglected. The caveat is that a significant predictor-outcome relationship even in a longitudinal study does not certify causality [40]. All other confounding variables must be controlled, and the rationale has to be sound and biologically plausible.

8.3.2 Strength (Effect Size)

Although a small effect size does not preclude causality, a large effect size is more likely to suggest a causal relationship. For example, if smokers are eight times as likely to have periodontitis as nonsmokers, then smoking may be a causal risk factor for periodontitis [41]. In contrast, if the association amounts to only a 20% increase in risk, as in the case of periodontitis and the risk of future CVD [42], it is more likely to be non-causal, arising from residual confounding, measurement errors, or even chance.

8.3.3 Consistency (Reproducibility)

If different scientists at different time periods report similar findings, this suggests the likelihood of a causal relationship. However, reproducibility as a marker for causality must be interpreted with caution. If several groups used a similarly flawed methodology, the consistency does not support causality. Rather, it supports the view that flawed methods consistently generate similar erroneous conclusions. In one example, a questionnaire was used to assess periodontitis and to test whether having periodontitis increased the risk of CVD. The investigators observed no relationship (null result) [43]. A second study used exactly the same questionnaire and found similar null results [44]. A subsequent meta-analysis showed that using an imprecise questionnaire to assess the predictor caused underestimation of the relative risk due to non-specific (non-differential) misclassification [42]. Non-specific misclassification moves the results toward the null: in other words, the contrast between the compared groups diminishes because of the mix-up in the categorization of the exposure.
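The attenuation toward the null can be checked with simple arithmetic. The following sketch assumes a hypothetical cohort and a hypothetical questionnaire whose sensitivity and specificity do not depend on the outcome; none of the numbers come from the cited studies. It computes the relative risk that would be expected after the exposure has been misclassified.

```python
def observed_relative_risk(n_exposed, n_unexposed, risk_exposed, risk_unexposed,
                           sensitivity, specificity):
    """Expected relative risk when exposure is misclassified independently of the outcome.

    sensitivity: probability that a truly exposed person is classified as exposed.
    specificity: probability that a truly unexposed person is classified as unexposed.
    """
    cases_exposed = n_exposed * risk_exposed
    cases_unexposed = n_unexposed * risk_unexposed

    # People (and cases) who end up in the "classified as exposed" group.
    n_class_exp = sensitivity * n_exposed + (1 - specificity) * n_unexposed
    cases_class_exp = sensitivity * cases_exposed + (1 - specificity) * cases_unexposed

    # People (and cases) who end up in the "classified as unexposed" group.
    n_class_unexp = (1 - sensitivity) * n_exposed + specificity * n_unexposed
    cases_class_unexp = (1 - sensitivity) * cases_exposed + specificity * cases_unexposed

    return (cases_class_exp / n_class_exp) / (cases_class_unexp / n_class_unexp)


if __name__ == "__main__":
    # Hypothetical cohort: true risks of 20% vs 10%, i.e., a true relative risk of 2.
    perfect = observed_relative_risk(1000, 1000, 0.20, 0.10, 1.0, 1.0)
    imprecise = observed_relative_risk(1000, 1000, 0.20, 0.10, 0.6, 0.8)
    print("Perfect classification:", round(perfect, 3))
    print("Imprecise questionnaire:", round(imprecise, 3))
```

With these assumed values the relative risk is pulled from 2 toward 1 even though nothing about the underlying biology has changed. Two studies that share the same imprecise instrument would be expected to share the same attenuation.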

8.3.4 Specificity

Causation is more likely if the association occurs in a specific population and in specific tissues or organs with no other overlapping factors. One negative example is C-reactive protein (CRP). Minor CRP increases (2 mg/L) are observed in about 50% of the US population [45] and are associated with over 100 biological conditions, including aging and strenuous physical activities [46, 47]. Minor increases in CRP are presumed to indicate cell stresses that may or may not be pathologic [47]. Thus, holding CRP responsible for one disease may be a difficult task because it is necessary to control for over 100 other comorbidities or pathologies. Similarly, IL-6 is a pleiotropic signaling molecule involved in many biologic actions. It plays an important role in the immune response, hematopoiesis, inflammation, oncogenesis, and the expression of other transcription factors. Thus, IL-6 is not specific enough to prove its role in one disease or in one pathway.

8.3.5 Biological Gradient

This is also called dose-response. Lower-level exposures generate less serious outcomes, while greater exposures bring about more severe outcomes. A dose-response does not always mean causality. In some diseases, there may be a distinct threshold rather than a dose-response, and yet the predictor may be a causal risk factor. For example, some causal risk factors show significant risks in the top quartile but no increased risks at the lower levels.
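The difference between a graded dose-response and a threshold pattern can be seen by comparing risks across exposure quartiles. The sketch below uses hypothetical quartile risks and hypothetical helper names, and the 10% tolerance is an arbitrary assumption made for illustration.

```python
def risk_ratios(quartile_risks):
    """Risk in each quartile relative to the lowest-exposure quartile."""
    reference = quartile_risks[0]
    return [risk / reference for risk in quartile_risks]


def is_graded(quartile_risks):
    """True when risk rises at every step from one quartile to the next."""
    return all(lower < higher for lower, higher in zip(quartile_risks, quartile_risks[1:]))


def is_threshold(quartile_risks, tolerance=0.1):
    """True when only the top quartile is elevated.

    The lower quartiles must stay within `tolerance` (relative) of the reference risk.
    """
    ratios = risk_ratios(quartile_risks)
    lower_flat = all(abs(ratio - 1.0) <= tolerance for ratio in ratios[:-1])
    return lower_flat and ratios[-1] > 1.0 + tolerance


if __name__ == "__main__":
    graded = [0.05, 0.08, 0.11, 0.14]     # hypothetical: risk climbs with each quartile
    threshold = [0.05, 0.05, 0.05, 0.14]  # hypothetical: only the top quartile is elevated
    for name, risks in (("graded", graded), ("threshold", threshold)):
        rounded = [round(ratio, 2) for ratio in risk_ratios(risks)]
        print(name, rounded, "graded:", is_graded(risks), "threshold:", is_threshold(risks))
```

A test for linear trend would flag the first pattern more readily than the second, yet either pattern is compatible with a causal risk factor.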

8.3.6 Biological Plausibility

Many bench scientists can conjure up biological plausibility. However, we must consider other parallel possibilities. For example, the recent hypothesis that trimethylamine N-oxide (TMAO), a gut microbial metabolite of choline, increases CVD risk generated considerable interest [48]. Several reasons prevent us from getting overly excited about the role of trimethylamine (TMA) or TMAO in human diseases. First, there are 100 trillion microbes in a human body, with complex interactions involving huge quantities and a diverse range of organisms. Thus, identifying one or several microbes in a disease relationship is nearly impossible. Second, the gut microbiome is not readily accessible without special procedures. Many researchers use the fecal microbiome to estimate alterations in the gut microbiome. This is a gross violation of the temporality requirement of causality. The fecal microbiome is at the terminal end of the alimentary tract and comes after, not before, the gut microbiome. Thus, the fecal microbiome cannot be the cause of the biologic activities in the gut. Third, many foods generate TMAO, and the results were too non-specific. Consequently, the biologic plausibility appears to be weak. One recent study actually reported that TMAO analyses may be biased: in stroke patients, TMAO levels were lower than in asymptomatic persons, and the patients presented dysbiosis with more opportunistic pathogens, such as Enterobacter, Megasphaera, Oscillibacter, and Desulfovibrio, but fewer commensal or beneficial genera, including Bacteroides, Prevotella, and Faecalibacterium [49].

Not all of these criteria need to be present for a relationship to be causal, and conversely their presence does not guarantee causality. In other words, satisfying all six criteria does not assure that the relationship is causal, nor does failing to satisfy some of them preclude causality. However, the foremost minimal criterion is temporality: the cause must occur before the outcome in all causal relationships. We must keep in mind, though, that satisfying the temporal relationship does not ensure causality [40]. Temporality is the minimum requirement, and causality has to be evaluated in each case by carefully adjusting for competing factors.

Another case in point deserves consideration: a popular topic in research at present is fecal microbiome analyses to determine the causative microbiota for inflammatory bowel diseases (IBD), such as Crohn’s disease or ulcerative colitis. Are alterations in the fecal microbiome the cause for inflammatory bowel disease or the consequence of it? Anatomically, feces come after the gut and cannot be the cause for the pathology in the gut. However, many prominent scientists analyze fecal microbiome to evaluate the cause for IBD.

Fecal analyses suggested that fecal bacteria that produce butyric acid are associated with health, and human colonic butyrate producers are predominantly Gram-positive Firmicutes but are phylogenetically diverse. The most abundant groups that generate butyrate are Eubacterium rectale, Eubacterium ramulus, and Roseburia cecicola. These bacteria were enriched in healthy individuals [50]. However, other studies reported that Firmicutes were increased in obesity [51,52,53]. Does this mean obesity is a sign of health? The main question is “are these bacteria bringing health?” or “are they the results of health?” Certainly, examining fecal microbiome could not answer this causality question. Microbiome diversity changes according to the diet [54]. Thus, the eventual causal factor may be the diet. And yet, millions of health research dollars go to fecal microbiome sequencing studies.

Here we list the inconsistencies in fecal microbiome sequencing studies: Backhed et al. reported that germ-free mice were protected from developing obesity [55]. The mechanism includes (1) decreased absorption of glucose, (2) generation of short-chain fatty acids from the gut lumen, (3) the associated reduction in hepatic lipogenesis, (4) increase in fatty acid oxidation, and (5) decrease in deposition of triglycerides in adipocytes. The same group reported that, after gastric bypass surgery, the patients’ fecal microbiome had changed independent of BMI. When these patients’ feces were transplanted into germ-free mice, the resulting mouse microbiome promoted less fat deposition [56]. This indicates that certain microbiotas may be associated with obesity, that the weight loss may be due to forced dietary changes after gastric bypass surgery, and that the microbiome may be the consequence of these dietary changes. Again, the temporality of diet, gut microbiome change, and obesity has to be determined to identify the true cause of obesity. Others, however, reported that obesity caused spontaneous endotoxemia, i.e., an elevated serum lipopolysaccharide (LPS) level, and subsequent microbiome alteration [54, 57]. Thus, diets that induce obesity appear to initiate this cascade. The sequence of events and the jumbled cause-effect relationships among diet, obesity, microbiome, and metabolic inflammation need to be elucidated in the future.

One other baffling example of biologic plausibility in a causal context is dysbiosis. Dysbiosis can be defined as “an alteration of microbial community composition from a normal healthy state.” It has been suggested that dysbiosis may cause periodontitis [58, 59]. However, for dysbiosis to be a causal risk factor, we must prove that it precedes periodontitis. So far, we have not seen a longitudinal assessment of oral dysbiosis causing periodontitis. Let us be reminded of the Hippocratic comment that “Conclusions which are merely verbal cannot bear fruit, only those do which are based on demonstrated fact.”

8.4 Controlling for Confounding

“Confounding” can be defined as the effect of “other competing factors” that are related to both the predictor and the outcome. A prime example is smoking in the relation of periodontitis to CVD.

Smoking promotes periodontitis development via low immunity due to reduced interferon, antigen-presenting cells, and immunoglobulin production [60] and is also a strong risk factor for CVD by itself. Therefore, we must control for the effects of smoking in the relationship of periodontitis to CVD. By the same token, obesity and diabetes also increase the risk of periodontitis, and they themselves directly increase the risk of cardiometabolic diseases. Thus, the confounding must be controlled in the relationship of periodontitis to CVD, as is illustrated in Fig. 8.1 (described by a red dotted x).

Fig. 8.1 It has been reported that periodontitis increases the risk of cardiometabolic diseases. It also has been proven that smoking or obesity increases periodontitis and that they independently contribute to cardiometabolic diseases. Thus, smoking or obesity becomes a confounding factor for periodontitis. Therefore, if we wish to establish the unbiased relationship between periodontitis and cardiometabolic diseases, the effects of smoking or obesity that coincide with periodontitis must be controlled.

In a complex biological system such as human physiology, combinations of confounding factors can determine the health or disease state, and they usually vary widely between individuals. Therefore, although complete control is difficult to achieve, there are means to reduce or eliminate some of their impact, as illustrated in Fig. 8.1, and cross-correlation approaches can be used to optimize the final results.
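One common way to control for a confounder such as smoking is to stratify by it and pool the stratum-specific estimates. The sketch below uses the Mantel-Haenszel risk ratio as one such method; this choice is ours for illustration and is not necessarily the method used in the studies cited in this chapter. All counts are hypothetical and were constructed so that periodontitis has no effect on CVD within either smoking stratum.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


def crude_risk_ratio(strata):
    """Risk ratio after collapsing all strata, i.e., ignoring the confounder."""
    totals = [sum(column) for column in zip(*strata)]
    return risk_ratio(*totals)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel pooled risk ratio across strata of the confounder.

    Each stratum is (cases_exposed, n_exposed, cases_unexposed, n_unexposed).
    """
    numerator = 0.0
    denominator = 0.0
    for cases_exp, n_exp, cases_unexp, n_unexp in strata:
        stratum_total = n_exp + n_unexp
        numerator += cases_exp * n_unexp / stratum_total
        denominator += cases_unexp * n_exp / stratum_total
    return numerator / denominator


# Hypothetical cohort; exposure = periodontitis, outcome = CVD.
STRATA = [
    (80, 400, 20, 100),  # smokers: CVD risk 20% with or without periodontitis
    (5, 100, 20, 400),   # nonsmokers: CVD risk 5% with or without periodontitis
]

if __name__ == "__main__":
    print("Crude risk ratio:", round(crude_risk_ratio(STRATA), 3))
    print("Smoking-adjusted risk ratio:", round(mantel_haenszel_risk_ratio(STRATA), 3))
```

In this constructed example the crude analysis suggests that periodontitis roughly doubles CVD risk, while the adjusted estimate shows no association at all. The entire crude effect comes from smokers being over-represented among those with periodontitis.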

Although we previously assumed that the innate immune system is activated by invading pathogens only, as our knowledge expands, we now know that obesity and diabetes endogenously activate innate immunity and generate low-grade inflammation [61, 62]. Pischon et al. reported that periodontal treatment resulted in decreased E-selectin levels. Unfortunately, this study did not provide the pretreatment characteristics of the cohort. We have no way of knowing whether metabolic inflammation could have biased the results. Although it was a “self as control” study design, metabolic inflammation would have altered the serum inflammatory markers. Thus, it is important to adjust for some measure of metabolic inflammation [63].

In recent years, the gut microbiome was publicized as “a new organ” causing obesity [53]. Diet will provide substrate for gut microbiome and will alter the intestinal microbial composition. Indeed, African children who eat a high fiber diet showed a significant enrichment in Bacteroidetes and depletion in Firmicutes (P < 0.001), with an abundance of bacteria from the genus Prevotella and Xylanibacter. These bacteria are known to have genes that hydrolyze cellulose and xylan. Meanwhile, these findings were not observed in European children [64].

Two opposing theories, “diet drives microbiome change” and “microbiome alters dietary absorption,” appear to be conflated in explaining obesity, which suggests that a third factor may be involved. A recent study explained that “microbes are highly varied between individuals and fluctuate within an individual” [65]. Furthermore, another study reported “no simple taxonomic signature of obesity in the microbiota of the human gut” [66]. In a meta-analysis, Sze and Schloss concluded that most of these sequencing studies are underpowered and used inappropriate statistical methods and, more importantly, that they may show associations but not causality [67].

Dental researchers who are not familiar with the concept of confounding often combine groups, such as those who have both diabetes and periodontitis or those who smoke and have periodontitis, and then claim that periodontal treatment improved CVD or glycemic control. In these cases, the confounding by diabetes or smoking must be controlled meticulously, or the results will be biased.

Another point to consider in data management is how smoking is dichotomized: it should never be coded as smoking = “yes/no.” Even though CVD risk declines with increasing time since smoking cessation, past smokers remain at an increased risk of CVD, and this dichotomy wrongly places them in the “no” category. If one must dichotomize smoking, it should be “ever smoke = yes/no.” In this scenario, past smokers and current smokers are grouped together, which is more appropriate. Alternatively, a continuous measure of smoking exposure, such as pack-years, can be employed to distinguish those with little smoking exposure from those with heavy smoking exposure.
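The coding choices described above can be sketched in a few lines. The status labels, function names, and example values below are hypothetical. Pack-years are computed in the conventional way, as packs smoked per day multiplied by years of smoking, assuming 20 cigarettes per pack.

```python
CIGARETTES_PER_PACK = 20  # conventional pack size assumed for pack-year calculations

VALID_STATUSES = ("never", "former", "current")  # hypothetical questionnaire categories


def current_smoker_only(status):
    """The dichotomy warned against: former smokers are counted as 'no'."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown smoking status: {status!r}")
    return status == "current"


def ever_smoker(status):
    """The preferred dichotomy: current and former smokers are grouped together."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown smoking status: {status!r}")
    return status in ("current", "former")


def pack_years(cigarettes_per_day, years_smoked):
    """Continuous exposure measure: packs per day multiplied by years of smoking."""
    return cigarettes_per_day / CIGARETTES_PER_PACK * years_smoked


if __name__ == "__main__":
    # A hypothetical former smoker with 30 years at 10 cigarettes per day.
    print("Current-only coding:", current_smoker_only("former"))
    print("Ever-smoker coding:", ever_smoker("former"))
    print("Pack-years:", pack_years(10, 30))
```

Under the current-only coding this person would be analyzed together with never smokers despite substantial cumulative exposure. The ever-smoker coding and the pack-year measure both retain that information.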

A recent classic example of a hidden confounder that has misled biomedical research and clinical trials involves bisphosphonates (BISPs). With the high-dose regimens given to cancer patients, repeated over 3–5 years, the cumulative dose in the bone reaches a level high enough to affect osteocytes and bone lining cells. In preclinical and clinical trials, investigators evaluating cancer bone metastasis and cancer bone burden observed a ~20–30% reduction, which was interpreted as a direct anticancer effect of BISPs. However, the impact of BISPs on the bone, which reflects degeneration of the local bone cells and loss of bone vitality, becomes a confounding factor: dead bone cannot support cancer colonization, and hence the reduction in cancer bone burden was not directly attributable to an effect on cancer cells.

8.5 Minimizing Biases

Recently, elaborate 16S rRNA sequencing was performed on subgingival crevicular fluid from patients with systemic lupus erythematosus (SLE) with and without periodontitis. The investigators observed dysbiosis in the group with periodontitis [68]. The question still remains: “Is dysbiosis due to periodontitis?” or “Are both dysbiosis and periodontitis phenotypes that cluster with the immunosuppressive treatments of SLE?” In translational research, as in any research activity, the ultimate goal is to reduce disease. To achieve this goal, we must decrease exposure to causal risk factors. Therefore, it is of utmost importance to find causal factors if we wish to lower the occurrence of human diseases.

8.5.1 Simpson’s Paradox

Simpson’s paradox is defined as “the results indicate the reverse of the true relationship because a confounding factor is not considered.” A source of bias in some translational research originates from the lack of epidemiologic understanding among bench scientists. Some researchers reported, “Obesity alters gut microbial ecology” [51]. The same group also reported “gut microbiome contributes to energy harvest from the diet and energy storage in the host (i.e., caused obesity)” [69]. These two claims have opposing cause-effect directions. Backhed et al. also reported that “introduction of a gut microbiota into adult germ-free mice caused a 57% increase in body fat” [69].

Alternatively, many researchers reported that diet-induced obesity alters the gut microbiome [26, 57, 70,71,72,73] and that this process involves toll-like receptor activation followed by cytokine production, which manifests as metabolic inflammation [74,75,76,77]. Using antibiotics to change the gut microbiome in leptin-deficient ob/ob mice suppressed metabolic endotoxemia, inflammation, and associated disorders [54]. The absence of CD14 in the same mouse group produced effects similar to those of antibiotics, suggesting that innate immune sensing is involved in obesity; CD14 acts as a co-receptor (along with the Toll-like receptor TLR4 and MD-2) for the detection of bacterial lipopolysaccharide (LPS). We must remember that the innate immune system can be activated by both microbial and metabolic stimuli [61, 78,79,80]. Indeed, Fleissner et al. reported that the absence of intestinal microbiota does not protect mice from diet-induced obesity [81], and another study refuted the highly cited claim by Turnbaugh et al. [53] that energy harvest from short-chain fatty acids by microbiota in the gut caused obesity [82]. Murphy and colleagues observed a progressive increase in Firmicutes in both HF-fed and ob/ob mice (we interpret this to mean that diet and obesity altered the gut microbiome), but the changes in the microbiota were not associated with the marker for energy harvest [82]. At this time, it is not clear whether an obesogenic diet causes gut microbial changes or whether gut microbiome alteration causes obesity.

A potential Simpson’s paradox is possible in the case of childhood infections or antibiotic use causing preadolescent obesity [83, 84]. All these studies ignored the fact that infectious inflammation can be confounded by metabolic inflammation [61, 85]. In a recent longitudinal study, infection and antibiotic use in infancy were reported to be causally associated with obesity in adolescents [86]. However, this study ignored the main culprits of obesity, namely, an obesogenic diet, insufficient physical activity, and the resultant energy imbalance [26, 61].

The mainstream thesis on the cause of obesity is still excess energy due to high caloric intake and lack of physical activity [26]. One study compared obesity between groups with high and low Toll-like receptor 5 (TLR5) gene expression and concluded that the group with high TLR5 gene expression was obese because of the flagellin-producing microbiota detected by TLR5 [87]. However, a Simpson’s paradox may have occurred because those who had high expression of the TLR5 gene were fatter at baseline (BMI = 30.6 vs 20.7, p = 0.04) and much more insulin and leptin resistant. This finding is directly opposite to that of a highly touted animal study in molecular science [88]. In the latter study, Vijay-Kumar et al. reported that TLR5 knockout mice developed spontaneous obesity and metabolic syndrome [88] and that the transfer of the fecal microbiome from TLR5-deficient mice to wild-type germ-free mice caused obesity and metabolic syndrome. We are not certain whether mice and humans have the same innate immune response triggered by TLR5 activation or whether there are some built-in biases in these studies. Clearly, a more incisive review of the studies using TLRs in animals and humans is needed. Our opinion is that TLR activation is the result of obesity, which also causes metabolic inflammation, as we and others have reported [61, 89]. Numerous studies support this thesis: TLRs were activated in nonalcoholic fatty liver disease [90] and other obesity-related cardiometabolic diseases [91, 92].

In another study, the third trimester (T3) stool of pregnant women showed the strongest signs of inflammation and energy loss. When the fecal microbiota was transferred to germ-free mice, T3 microbiota induced greater adiposity and insulin insensitivity than first-trimester microbiota [93]. This study ignored the fact that pregnancy is an immune-tolerant state and that, as the fetus grows, inflammation increases because immunity is relaxed so as not to reject the semi-allogeneic fetus. Thus, it is plausible that T3 stool would display more prominent metabolic dysfunction and inflammation.

8.5.2 Conundrum in Microbiome Research

Lately, popular topics in research include the gut microbiome and gene sequencing. These clinical and translational research studies have significant potential to improve understanding and, ultimately, to be applied to dental disease and cardiovascular disease. But since this is a new area of research, it is fraught with many deficiencies arising from inappropriate methodology, misconceived study rationale, and misinterpretation of results. Consequently, there are many conflicting reports. Beyond potential reverse causation due to using feces to estimate gut microbiome function, additional reasons for this disparity exist: the high functional redundancy in host-microbiome interactions, normal individual variation in microbiome composition, differences between studies in study design, diet composition, and host system, and inherent limitations to the resolution of rRNA-based microbial profiling [94].

Initial evidence for obesity-altered microflora came from an animal study in which leptin-deficient ob/ob mice displayed a decrease in Bacteroidetes and a proportional increase in Firmicutes compared with lean siblings (ob/+) given the same diet [51]. Confounders in the relationship between the gut microbiome and obesity are diet and the genetic lineage of the animals [71].

Some microbiome studies used germ-free mice to prove that gut microbiota cause obesity. We question the validity of using germ-free mice and extrapolating the results to humans. Although infection can change metabolism, the germ-free state in humans is unnatural, and its clinical interpretability is limited. At birth, the gut of a human newborn is sterile, but through passage through the birth canal, subsequent breast feeding, and the introduction of solid foods, the infant’s gut is colonized with a microbial community [95]. This colonization has multiple benefits because the microbiome educates the developing immune system [96] and trains it to distinguish harmful pathogens from harmless commensals, or part of self, and to react accordingly [97, 98]. If this introduction of microbiota is disturbed, some autoimmune diseases, such as Type 1 diabetes, may occur [99]. Another benefit of having a well-colonized microbiome is that it breaks down indigestible food components, degrades potentially toxic food compounds like oxalate, and synthesizes certain vitamins and amino acids [100]. Additionally, a more powerful driver of obesity and metabolic syndrome is the balance of diet and physical activity [26]. Diet changes the gut microbiome and intestinal permeability, which allows some microbiota to translocate into the blood stream [57].

Additionally, the gut microbiome is usually assessed in the feces and is likely to be the result of obesity rather than the cause. It is therefore plausible that fecal transplant from lean persons improved insulin sensitivity, because lean donors have a normal microbial community that is not affected by obesity [101]. However, it is not clear whether these changes are permanent or whether insulin resistance will return as soon as the recipients resume their obesogenic diets. Moreover, microbiome change is transient [102], and it is perhaps more prudent to make dietary changes and increase energy expenditure by exercising more. As we stated in the previous section, fecal microbiome assessment is not appropriate for establishing the causal role of microbiota in gut immunity. In addition, modulating the gut microbiome with antibiotics appeared to improve insulin signaling and glucose tolerance by reducing circulating LPS levels and inflammatory signaling in mice [103]. However, this phenomenon was not duplicated in humans in a recent randomized trial [104]. Moreover, using antibiotic treatment to mimic the germ-free state in an attempt to prevent obesity by changing the gut microbiome in humans has some obvious problems, such as the development of resistance to antibiotics. A further problem with antibiotics is that the gut microbiome is necessary for protecting the host from invading pathogens, for energy extraction, and for developing the immune system [100]. Moreover, reduced exposure to important gut bacteria may result in a higher incidence of human allergies and autoimmune diseases [105].

Current knowledge of the mucosa-associated bacterial communities in the intestine and colon is limited because it is largely based on fecal microbiome analyses. It was reported that the lumenal and fecal bacterial communities were significantly different [106, 107]. It was demonstrated that the cecum contained 100 times more bacteria than the terminal ileum [108]. Admittedly, collecting colonic samples is difficult because of their viscosity and the difficulty of ensuring adequate anaerobic conditions. These are potential sources of discrepancy between the aerobic fecal and anaerobic intestinal microbiome. Only intubation and pyxigraphy can be performed in healthy subjects, and both should be repeated to study the stability of the flora or the influence of various parameters on its composition [106]. Some microorganisms, such as the methanogens, represent <0.003–0.03% of bacteria in the right colon, whereas the same bacteria account for 5–12% or more of all bacteria in the feces. Strict anaerobes analyzed using probes specific for the Bacteroides (Bacteroides, Porphyromonas, and Prevotella spp.) and Clostridium groups (Clostridium, Eubacterium, and Ruminococcus spp.) represented 44% of fecal bacterial rRNA and only 13% of cecal bacterial rRNA. These differences suggest that studying the right-sided colonic flora would be more appropriate than studying feces for diseases involving the right part of the colon, such as ileocecal Crohn’s disease [106].

Many studies report that dietary intake shapes the gut microbiome [109] as well as causing obesity [54, 110, 111]. However, equally numerous studies report that the gut microbiome causes obesity independent of dietary effects. Some studies, once scrutinized carefully, erroneously reported that the gut microbiome causes obesity although dietary factors preceded the alteration in the gut microbiome [53, 94, 112]. Clearly, a lack of understanding of the causality principle, namely, the temporal relationship, led them to describe an imprecise association as a causal relationship [53, 112]. Furthermore, the gut microbiome consists of approximately 1000 species, and its composition can change due to antibiotics, illness, stress, aging, bad dietary habits, and other lifestyle factors [113]. The gut microbiome evolves with human development from the germ-free state of newborn infants to approximating the adult microbiome by the age of 1–3 years [114]. Incidental environmental exposures play a major role in determining the distinctive characteristics of the microbial community [114].

In a recent randomized trial, Lactobacillus rhamnosus GG was shown to decrease neuropsychiatric disorders later in childhood by stabilizing gut permeability and restoring epithelial barrier function through tight junction control, mucin production, and antigen-specific immunoglobulin A production [115]. The underlying pathology in many autoimmune or allergic disorders is increased intestinal permeability, which brings dysregulation of immune responses as well as dysbiosis in response to ubiquitous environmental antigens. It should be noted that obesity causes increased intestinal permeability [116]. Thus, the very first trigger, once found, may prove to be the causal factor.

8.5.3 Toll-Like Receptors in Infection and Metabolism

Germ-free mice [55] were protected from developing obesity even with a high-fat diet, while Toll-like receptor 5 (TLR5) knockout mice became obese and hyperphagic [88]. These results prompted the conjecture that infection or the gut microbiota may be at the root of obesity [88]. However, in an in vitro study, subcutaneous adipocytes cultured and exposed to saturated fatty acids expressed increased TLR4 and MyD88 and upregulated NF-κB activity, with significantly increased secretion of IL-6 and TNF-α [117]. This suggests that fatty acids caused TLR4 expression, rather than TLR4 causing the fat-related inflammation observed in obesity. It further supports the view that TLR4 can be activated by metabolic factors [61].

As we wrote in the previous section, high-fat diet (HFD)-fed mice showed increased LPS in the serum (metabolic endotoxemia) [57] and activated TLR4. TLR4, in turn, induced enteric neuronal apoptosis through a p-JNK1-dependent pathway [118]. The authors also observed that the HFD-fed mice had a statistically significant reduction in Bacteroidetes (P < 0.001) and a significant increase in Firmicutes, Bifidobacteria, and E. coli (P < 0.001) relative to mice fed a regular diet. When they supplemented the mice’s diet with oligofructose (a prebiotic), the level of endotoxemia decreased in HFD-fed mice. The researchers interpreted this to mean that prebiotics reversed dysbiosis, but we consider that prebiotics prevented the high-fat diet-induced increase in intercellular permeability, which resulted in a lesser degree of dysbiosis. Again, the high-fat diet was shown to initiate increased intercellular permeability, metabolic endotoxemia, and gut dysbiosis and also to activate TLR4, which in turn induced intestinal neuronal apoptosis resulting in reduced gut motility [118]. It is important to recognize which factor initiated the sequence of events; that factor should be considered the cause.

A recent human trial largely refuted the animal studies reporting that gut microbial colonization may cause obesity. Reijnders and colleagues manipulated the gut microbiota with antibiotics (7-day administration of amoxicillin, vancomycin, or placebo) and observed host metabolism in 57 obese, prediabetic men. Vancomycin, but not amoxicillin, decreased bacterial diversity but did not affect tissue-specific insulin sensitivity, energy/substrate metabolism, postprandial hormones and metabolites, systemic inflammation, gut permeability, or adipocyte size. More importantly, energy harvest, adipocyte size, and whole-body insulin sensitivity were not altered at 8 weeks of follow-up, despite considerable alteration in microbial composition [104]. We interpret this to mean that antibiotics, or the lack of an innate immune sensor such as TLR5, may alter the gut microbiome but may not affect metabolism or obesity. This was also the view of an expert who first reported diet-induced endotoxemia, that is, increased serum levels of LPS due to increased intestinal permeability [119].

Germ-free mice colonized with Bacteroides thetaiotaomicron had improved host nutrient absorption, which potentially increases the possibility of developing obesity [120]. However, the multicomponent ileal/cecal flora produced no significant change in levels of either mRNA compared with germ-free controls [120]. We interpret these results as follows: colonizing germ-free mice with one or two microbes may introduce bias because the inoculation acts as an infection, whereas multi-microbial inoculation may have balancing effects among the microbes and thus produce less detrimental impacts.

8.6 Reducing Measurement Errors

Traditionally, translational studies tend to have a lesser degree of measurement error than large epidemiologic studies. However, it is still possible that measurement errors may lead to paradoxical results. For example, the gut microbiome includes over 1000 microbial species [121], and identifying a few microbes that are causally associated with the disease of interest is truly a daunting task. This also applies to all the large-scale proteomics studies by mass spectrometry and/or protein arrays in which a candidate biomarker for causation or diagnosis is evaluated.

Traditional culture-dependent methods have numerous drawbacks, such as the time and money required, difficulties in identifying the different colonies grown on agar, the lack of sensitivity, a predilection for the most common culture conditions that favor fast-growing and easy-growing species, and the neglect of species present in low concentration or requiring unusual culture conditions, such as anaerobic conditions. Conversely, the cutting-edge analytical method of 16S rRNA sequencing also has several limitations. Firstly, the accuracy of identification is directly dependent on the completeness of the reference database. Secondly, the identification power is lower at the species level than at higher taxonomic levels. Thirdly, many studies use only a fragment of the gene, which restricts its discriminatory power even more. Fourthly, many bacterial species have more than one copy of the 16S rRNA gene, and inter-copy sequence variations may be present [121].

In addition, the presence of a microbe in diseased tissue does not prove that the microbe caused the disease. Some microbes have the unusual capability of slipping through intercellular spaces and are ubiquitously present in many diseased and non-diseased tissues. Some examples are Fusobacterium nucleatum (F. nucleatum) and Porphyromonas gingivalis (P. gingivalis). Whether they are innocent bystanders or truly causal microbiota has yet to be proven, because the majority of the studies have methodological flaws. For example, oral gavage with P. gingivalis resulted in intestinal dysbiosis in mice [122]. This study provided the novel concept that orally ingested microbial species can cause gut dysbiosis, linking the oral cavity to the gut microbiome. However, to prove that P. gingivalis is unique in causing gut dysbiosis, the control group should have received other microbiota, such as Salmonella, Escherichia coli, or Staphylococcus. By using saline as the control, the investigators proved only that ingestion of “bacteria,” not specifically P. gingivalis, caused dysbiosis, which is not unlike food poisoning. Additionally, ingesting P. gingivalis is not the same as having P. gingivalis present in human periodontitis. To be a cause of an infection, microbiota must overcome several obstacles [123]: first, they must outcompete the huge number of commensal bacteria [124]; second, they must disrupt epithelial barrier function [125]. Several mechanisms for manipulating epithelial barrier function have been recognized, including overexpression of IL-6 [126] and manipulation of the actin cytoskeleton [127]. Here we need to be reminded that obesity and metabolic inflammation lead to overexpression of IL-6 [128, 129] and also increase intercellular permeability.

Another example of the use of an inappropriate control group is the highly touted “Justification for the Use of Statins in Primary Prevention: An Intervention Trial Evaluating Rosuvastatin (JUPITER)” trial. The cohort was overweight, and many of the subjects smoked and were hypertensive but had not yet developed heart disease. These subjects were given rosuvastatin, and the results were compared with those of a control group who took placebo. Certainly this cohort needed to lower their body weight and decrease smoking and hypertension through lifestyle changes. Thus the appropriate comparison group should have received lifestyle changes comparable to the pharmaceutical intervention. Moreover, the outcome of cardiac events included “hospitalization for unstable angina” in the arithmetic sum of all cardiac events. This means that “hospitalization for unstable angina” had the same weight as myocardial infarctions or cardiac deaths, which is clinically inappropriate. When we look at the major cardiac events only, the cardiac event rate in the rosuvastatin group was 83/8901, or 0.009, which means that less than 1% had a cardiac event. The placebo group event rate was 157/8901, or 0.018, which is less than 2%. And yet, the relative risk decreased by about 50% (0.009 vs 0.018) with rosuvastatin administration. Despite the low actual number of events in this cohort, statin treatment for the asymptomatic population is now accepted as a standard of care. A minute improvement in outcome can be made highly significant by increasing the sample size because the power of a study (i.e., the probability of obtaining a statistically significant result) depends on the sample size. The P-value is derived from a Z-score, which indicates how many standard deviations the observed value is away from the mean.
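The contrast between relative and absolute risk in this example can be checked with a few lines of arithmetic. The sketch below uses only the event counts quoted above; the number needed to treat is a standard derived summary that we add for illustration, and it is a crude figure that ignores follow-up time.

```python
# Event counts for major cardiac events, as quoted in the text.
events_statin, n_statin = 83, 8901
events_placebo, n_placebo = 157, 8901

# Absolute risk (event rate) in each arm.
risk_statin = events_statin / n_statin
risk_placebo = events_placebo / n_placebo

# Relative measures compare the two rates as a ratio.
relative_risk = risk_statin / risk_placebo
relative_risk_reduction = 1 - relative_risk

# Absolute measures compare the two rates as a difference.
absolute_risk_reduction = risk_placebo - risk_statin

# Number needed to treat: an added illustrative summary, not reported in the text.
number_needed_to_treat = 1 / absolute_risk_reduction

print(f"risk, rosuvastatin: {risk_statin:.4f}")
print(f"risk, placebo:      {risk_placebo:.4f}")
print(f"relative risk reduction: {relative_risk_reduction:.1%}")
print(f"absolute risk reduction: {absolute_risk_reduction:.2%}")
print(f"number needed to treat:  {number_needed_to_treat:.0f}")
```

With these counts the relative risk reduction is roughly 47%, while the absolute risk reduction is under one percentage point, so on the order of 120 people would need treatment to avoid one event over the trial period.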

Let us review how the P-value is derived. In the standard normal curve, when a value is located about ±1.96 standard deviations or more away from the mean, that value is deemed significant because only 2.5% of the area on each end (5% combined) of the standard normal curve lies at or beyond such values. (This is why we set the α-level at 0.05.) The Z-score is calculated as shown in the equation below.

$$ Z=\frac{\overline{X}-\mu }{\sigma /\sqrt{n}} $$

X̄: sample mean; μ: population mean or true mean; σ: standard deviation (SD); n: sample size.

As we know, the sample mean (X̄) and the standard deviation (σ) come from the experimental results, which should not be changed. However, the sample size can be manipulated by recruiting a large number of participants. If the sample size (n) increases, the denominator becomes smaller because the denominator is the SD divided by the square root of the sample size. If the denominator is small, even minute changes in the numerator (the outcome) can generate a large Z-score, and the P-value becomes significant. Thus, it is important to realize that although the P-value is significant, the results may not be clinically meaningful. Inevitably, with a large sample size, the measurement will often be done by using questionnaires or proxies that are less precise. Consequently, the results are often imprecise but highly significant. Nevertheless, journals and funding agencies tend to believe results from studies with large sample sizes. Hence, it is always important to ask whether statistical significance is actually clinically relevant. An extension of this concept can be found in the misleading conclusions drawn from experimental data that differ by only 10–20% between the comparators with a P-value <0.05 and are hence touted as “statistically significant.” In most cases, a difference of 10–20% between the compared groups, although it may be statistically significant, is not biologically significant or relevant.
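The dependence of the Z-score on sample size can be seen directly by holding the observed difference fixed and varying only n. The following is a minimal sketch with hypothetical values (a difference of 0.1 SD between the sample mean and the true mean); the function names are ours, and the two-sided P-value is taken from the standard normal distribution.

```python
import math


def z_score(sample_mean, true_mean, sd, n):
    """Z = (sample mean - true mean) / (SD / sqrt(n))."""
    return (sample_mean - true_mean) / (sd / math.sqrt(n))


def two_sided_p(z):
    """Two-sided P-value for a Z-score under the standard normal curve."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical values: the same small difference (0.1 SD) at every sample size.
sample_mean, true_mean, sd = 0.1, 0.0, 1.0

results = []
for n in (25, 100, 400, 1600, 10000):
    z = z_score(sample_mean, true_mean, sd, n)
    p = two_sided_p(z)
    results.append((n, z, p))
    print(f"n = {n:>5}: Z = {z:5.2f}, P = {p:.3g}")
```

The effect size never changes, yet the P-value falls from clearly nonsignificant at n = 25 to below 0.05 at n = 400 and to vanishingly small values at n = 10000. Only the sample size has been altered.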

8.7 Utilizing the Appropriate Statistical Methods

Translational research often involves small sample sizes because the elaborate laboratory methodology requires time and money to conduct. Also, the results are affected by the techniques used (mass spectrometry, polymerase chain reaction, etc.), the researchers’ skill in performing the experiment, and the animal models or species used. In some research, using the appropriate animal model is important. For example, in the assessment of short-chain fatty acids, murine models may be of limited value, while pigs or dogs are much better models for estimating the human relationship between short-chain fatty acids and gut microbiota.

The pervasive problems in translational studies are sample sizes that are too small and the use of inappropriate statistical methods. When the sample size is five or six in each group, we cannot expect these data to have a normal distribution. However, many researchers use the t-test, which assumes that the underlying data have a normal distribution. Also, multigroup comparisons often use ANOVA, but ANOVA assumes that each compared group is normally distributed, and it is most robust when the groups have equal sample sizes. Even with a total sample size of 32 (four groups of n = 8), each group (n = 8) must be normally distributed. In addition, for ANOVA, the homogeneity of variance assumption is crucial to obtaining valid statistical results. Particularly in laboratory studies involving count data, variance may increase exponentially with group means, which can be problematic. In some 16S rRNA sequencing studies especially, the usual sample size is less than ten due to constraints of cost, time, and computing ability. The sample size issue was raised with regard to gut microbiota research in a recent meta-analysis, in which the median classification accuracy for predicting obesity from gut microbiome composition was very modest, between 33.01% and 64.77% [67].

One of our own students conducted a four-group comparison of how various reagents affect microbial growth. The total sample size was over 200, but the underlying assumption of homogeneity of variance was violated, and several experimental variations were involved, such as the timing of adding reagents, different numbers of microbes added at the beginning of the experiment, and differing composition of microbes. Due to these limitations in study design and data distribution, we could not use parametric regression methods. Thus, we created subgroups reflecting the variations in methodology and compared each with an appropriate reference via nonparametric methods. In addition to concerns about sample size and variance, problems arise when the underlying distributions under comparison are highly skewed. For most biological data in which the groups being compared are highly skewed, it is generally more appropriate to utilize nonparametric testing.
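As one illustration of nonparametric testing with very small groups, the sketch below computes an exact two-sided rank-sum (Wilcoxon/Mann-Whitney type) P-value by enumerating every possible assignment of the pooled observations to the two groups. This is not the analysis used in the study described above; the function names and the data are hypothetical and chosen only to show a skewed group of five compared with a reference group of five.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def exact_rank_sum_p(group_a, group_b):
    """Exact two-sided P-value for the rank sum of group_a by full enumeration."""
    pooled = list(group_a) + list(group_b)
    r = ranks(pooled)
    n_a = len(group_a)
    observed = sum(r[:n_a])
    expected = n_a * (len(pooled) + 1) / 2  # rank sum expected under no difference
    observed_deviation = abs(observed - expected)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), n_a):
        total += 1
        # Count assignments at least as far from expectation as the observed one.
        if abs(sum(r[i] for i in idx) - expected) >= observed_deviation - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical measurements: the treated group is strongly right-skewed.
control = [1.2, 0.8, 1.5, 1.1, 0.9]
treated = [2.3, 9.7, 3.1, 2.8, 45.0]
p_value = exact_rank_sum_p(control, treated)
print(f"exact two-sided P = {p_value:.4f}")
```

Because the test uses only ranks, the extreme value of 45.0 carries no more weight than any other largest observation, and no normality assumption is needed. Full enumeration is practical only for small groups (here 252 assignments), which is exactly the setting discussed above.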

Some randomized trials select egregiously poor reference groups to amplify the efficacy of their interventions. For example, if the intervention is giving milk to school children and assessing obesity outcome, the correct reference group should be drinking water [130] or diet soda [131]. If the chosen reference group is sugared soda, which has been established as obesogenic [132], the results overstate the effect and do not substantiate a health benefit from drinking milk itself.

Another example of using an inappropriate reference group can be found in a heart failure medicine trial. The pharmaceutical company-sponsored PARADIGM-HF trial [133] tested LCZ696, which adds a new ingredient (the neprilysin inhibitor sacubitril) to the previously marketed angiotensin receptor blocker valsartan. Since the new product adds an ingredient to valsartan, the appropriate reference group should have been valsartan alone, in order to substantiate that the added ingredient is safe and equally efficacious or better. However, the investigators compared its efficacy to that of enalapril, an early angiotensin-converting enzyme inhibitor with the well-known side effect of cough in many patients. In our opinion, since this drug will be given to patients with advanced heart disease, its safety should be tested carefully. We do not comprehend why this point is not recognized by the leaders of the American Cardiologists group.

In conclusion, even in translational research, all of the epidemiologic principles such as developing scientifically sound rationale, establishing causal relationship, controlling for confounding, minimizing biases and measurement errors, and using appropriate statistical methods are of paramount importance.