Introduction

Epidemiologic research on religion and health has reached a certain state of maturity [1,2,3]. The evidence base on the topic has grown considerably, the study designs and methodology in this area of research have notably improved, and the number of religious-spiritual exposures that have been examined and the variety of health and well-being outcomes for which rigorous evidence is now available have dramatically increased [3,4,5]. However, the current research literature is still subject to a number of limitations. One of these limitations is that the vast majority of the research has been carried out in the United States. Much less is known about the applicability of its findings in other countries, cultures, or contexts. Do the associations persist in more secular countries? Do the associations vary across religious groups? Comparatively little evidence is available on these questions. The paper by Ahrenfeldt et al. [6] makes a considerable contribution to the literature in this regard by examining some of these associations, using rigorous designs and methodology, in various countries in Europe. The paper does not contribute much to our understanding of these associations for non-Christian religious groups, but it does at least expand the geographic scope of large rigorous research studies for a number of exposures and outcomes to include Europe.

Religion and health in Europe

The Ahrenfeldt et al. paper [6] has a number of strengths. Most religion-health analyses have used data from the United States; theirs is conducted with data from Europe. Most religion-health analyses are cross-sectional [1]; Ahrenfeldt et al. have 5 waves of longitudinal data. Most religion-health analyses that do have longitudinal data only have measures of service attendance; their study has several measures, including prayer and having received a religious education, in addition to standard measures on participation in religious organizations. In addition to examining each of these exposures separately, they also examined two distinct patterns of religiosity: first, those who were religiously educated, still participated in a religious organization, and also prayed (what they called the “more religious”) and second, those who prayed only and neither participated in religious organizations nor were religiously educated (the “less religious”); they compared both of these groups to the non-religious. They hypothesized that for the former group, religious participation would have protective associations with health; but for the latter group they hypothesized adverse associations with health, with prayer among the “less religious” arising, perhaps at least sometimes, from a state of crisis. Ahrenfeldt et al. also examined a number of outcomes, including limitations in activities of daily living, self-rated health, long-term pain or health problems, and depression. Their study was further strengthened by the use of multiple waves of data and therefore the ability to control for baseline outcomes in examining associations between religion and health, to attempt to rule out reverse causation, e.g., that only those who are healthy are able to attend religious services [5]. By making use of the multiple waves of longitudinal data, examining associations with a number of mental and physical health outcomes, and using a variety of religiosity measures, the paper makes a notable contribution to the literature.

Many of the patterns that have been uncovered in prior research are manifest here also. Specifically, first, of the three religiosity measures examined separately, participation in a religious organization (religious service attendance) had the most consistent associations with health. In their primary analyses (Table 2, Model 3), participation in religious organizations had point estimates for all associations in a protective direction, with more substantial evidence specifically for protective associations with the global activity limitation index (OR 0.86, 95% CI 0.75, 0.98) and with depression (OR 0.80, 95% CI 0.69, 0.93). Second, there was little evidence that prayer, considered on its own, was associated with the various health outcomes, which likewise matches prior literature suggesting much weaker or null associations between private practice religious-spiritual measures and health [7, 8].

However, in addition to confirming, within Europe, some of the same patterns of association found in the United States, the paper also made a number of further novel contributions. Ahrenfeldt et al. examined associations with having received a religious education, an exposure that has been relatively unexamined in prior research, and found this was longitudinally associated with a lower risk of poor self-rated health (OR 0.81, 95% CI 0.70, 0.93) and a lower risk of long-term health problems (OR 0.84, 95% CI 0.74, 0.95). Ahrenfeldt et al. furthermore looked at patterns of more integrated religiosity with consistent organizational religious involvement, and religious education, and prayer (the “more religious”) and found that this group, compared to others, longitudinally had fewer activities of daily living (ADL) limitations (OR 0.76, 95% CI 0.58, 0.99) and lower risk of depression (OR 0.77, 95% CI 0.64, 0.92). Ahrenfeldt et al. further hypothesized that those who only prayed, without religious organizational involvement or religious education (a pattern they referred to sometimes as “less religious” and sometimes as “crisis religiosity”), would subsequently have worse outcomes and indeed found some evidence of this at least for depression (OR 1.46, 95% CI 1.15, 1.86). An interesting open question, commented upon but left unresolved by their data, was whether such “crisis religiosity” is simply a form of reverse causation, with those in worse health or more problematic settings more likely to pray, or whether there might be mechanisms whereby prayer itself, detached from organizational involvement or religious education, might in fact causally lead to worse outcomes. Such poorer outcomes could conceivably occur as a causal consequence of such prayer if, for example, such prayer led to increased rumination, or unjustified hope for relief, or perhaps avoiding medical care. Further research on such questions would be of interest.

Methodological challenges

The Ahrenfeldt et al. study attempted to address a number of challenging methodological problems. One of the strengths of the study was their use of multiple waves of data. This allowed for adjustment for baseline outcomes in order to try to rule out reverse causation [5]. This is especially important in research examining religion and health since it may be the case that only those who are relatively healthy can attend services. Indeed, there is evidence from longitudinal studies that those who become depressed are subsequently more likely to stop attending services [9,10,11], and likewise evidence that those with new physical disabilities are, at least in the short run, less likely to attend [12]. This is problematic for research on religion and health, because, without control for baseline outcomes, one might find a protective association simply because those who are unhealthy cease attending. Longitudinal data is needed to try to rule out this possibility, and this is precisely what Ahrenfeldt et al. [6] attempted to do by using multiple waves of data.

Another methodological challenge encountered by Ahrenfeldt et al. concerns appropriate statistical adjustment for multiple testing. Their study considered a number of different outcomes including activity limitations (three different assessments), self-rated health, long-term pain or health problems, and depression; their study also examined a number of different religious exposures. Considered separately, a number of their estimates suggested evidence for an effect of religious participation on various health outcomes, as judged by a 95% confidence interval for the estimate bounded some way away from the null. However, after correction for multiple testing, these associations did not, in general, pass a p = 0.05 threshold. That threshold itself is of course somewhat arbitrary [13,14,15,16], but does at least indicate some of the difficulties of assessing evidence under multiple testing. The standard Bonferroni correction is often very conservative, and the Ahrenfeldt et al. paper instead used the somewhat preferable Holm–Bonferroni method, but still the associations did not generally pass this threshold. In practice, associations are often evaluated with a single exposure and a single outcome, one at a time per paper, with no adjustment made. Studies that examine multiple outcomes and/or exposures are thus effectively penalized relative to a research literature built up from such one-at-a-time evaluations.
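
To make the contrast between the two corrections concrete, the following is a minimal sketch in Python. The function names and the five p-values are hypothetical and chosen only for illustration; they are not taken from Ahrenfeldt et al. Bonferroni compares every p-value to alpha/m, whereas Holm–Bonferroni sorts the p-values and compares the k-th smallest to alpha/(m − k + 1), stopping at the first failure.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H_i when p_i <= alpha / m, with m the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_bonferroni(p_values, alpha=0.05):
    """Holm's step-down procedure.

    Sort the p-values in ascending order and compare the k-th smallest
    (k = 1..m) with alpha / (m - k + 1). Stop at the first p-value that
    fails; that hypothesis and all later ones are not rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):  # rank = k - 1
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject


# Hypothetical p-values for five exposure-outcome associations.
p_values = [0.001, 0.011, 0.02, 0.04, 0.30]

print("Bonferroni:     ", bonferroni(p_values))
print("Holm-Bonferroni:", holm_bonferroni(p_values))
```

With these illustrative inputs, Bonferroni rejects only the smallest p-value, while Holm–Bonferroni also rejects the second. Both control the family-wise error rate, and neither makes use of the correlation among outcomes.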

Better methods and metrics are needed for handling multiple testing in the presence of correlated outcomes. In Table 2 of Ahrenfeldt et al., for the four comparisons for which they hypothesize a protective association (religious organization, religious education, more religious vs. others, and more religious vs. less religious), across the six outcomes, 22 of the 24 associations are in a protective direction, one is null, and one is in a slightly adverse (OR 1.01) direction. While many of the confidence intervals themselves contain the null, these confidence intervals and the multiple testing procedures do not pick up on this sort of nuance across outcomes. Again, the development of better procedures for evaluating associations with multiple comparisons with numerous correlated outcomes, and/or exposures, would be an important development. Evaluating numerous outcomes at once, as in the Ahrenfeldt et al. paper [6], is both a more efficient use of research resources and better able to assess the public health importance of specific exposures, since their effects on numerous outcomes are considered simultaneously. Such “outcome-wide epidemiology” [17] may often be preferable to the current standard practice of slowly accumulating evidence from single exposure–outcome relationships. But if epidemiology moves in the direction of such outcome-wide analyses, better approaches to handle multiple testing problems will be needed in the evaluation of evidence.
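
As a rough illustration of the directional consistency that per-test procedures ignore, the sketch below computes a one-sided sign test on the counts quoted above (22 protective, 1 adverse, with the single null estimate dropped as a tie). This is not an analysis from Ahrenfeldt et al., and the function name is hypothetical. The calculation assumes the 23 estimates are independent, which they are not, since they share exposures, outcomes, and participants. The resulting p-value therefore substantially overstates the evidence and is shown only to indicate the kind of information a better procedure would need to handle properly.

```python
from math import comb


def one_sided_sign_test(n_protective, n_adverse):
    """P(at least n_protective of n estimates fall in the protective
    direction) if each direction were equally likely and the estimates
    were independent. Ties (null estimates) are excluded beforehand.
    """
    n = n_protective + n_adverse
    tail = sum(comb(n, k) for k in range(n_protective, n + 1))
    return tail / 2 ** n


# Counts as quoted from Table 2 in the text: 22 protective, 1 adverse.
p = one_sided_sign_test(22, 1)
print(f"Naive sign-test p-value (independence assumed): {p:.2e}")
```

Because the outcomes are positively correlated, the effective number of independent comparisons is smaller than 23, and a valid procedure would have to account for that dependence.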

Ahrenfeldt et al. used individual-level longitudinal data in their analyses. Such data is strongly preferable to cross-sectional data for the reasons discussed above. Moreover, such analyses should be sharply distinguished from analyses using group-averaged data, as it is then impossible to control for individual-level confounding, and, in the absence of individual-level data, even control for country-level variables is not sufficient to remove bias from country-level confounding [18, 19]. Such methodological problems are sometimes collectively referred to as the problem of the “ecologic fallacy”: one cannot draw conclusions about causation for individual-level outcomes from group-averaged data. Unfortunately, use of such group-averaged data is also common in the religion and health literature. This is problematic. For example, although religious service attendance tends to be associated with greater life satisfaction at the individual level across a number of cultural contexts [1, 3, 20, 21], there are some reports that more religious countries have lower average levels of life satisfaction [20, 22, 23]. However, these latter analyses are almost certainly confounded by a country’s economic development [20]. Indeed, a study by Diener et al. [20] that uses both individual and societal religiosity, and controls for individual and societal difficult circumstances, suggests that both individual and societal religiosity are associated with greater subjective well-being, though the analysis is still with cross-sectional, not longitudinal, data. In any case, data (ideally longitudinal) on both individual service attendance and country aggregates of individual attendance or religiosity are needed to examine the contextual effects of country-wide religious service attendance on health.
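
The divergence between individual-level and group-averaged associations can be illustrated with a small simulation. Everything in the sketch below is hypothetical: the 20 simulated countries, the development index, and all coefficients are invented for illustration and do not come from any cited study. Within each country, attendance is given a positive association with life satisfaction (+0.5), but less developed countries are made both more religious and less satisfied on average, so the country-level slope comes out negative.

```python
import random
from statistics import mean


def slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


random.seed(0)
countries = []
for c in range(20):
    development = c / 19                       # 0 = least, 1 = most developed
    p_attend = 0.8 - 0.6 * development         # poorer countries attend more
    attend, satisfaction = [], []
    for _ in range(500):
        a = 1 if random.random() < p_attend else 0
        # Individual-level effect of attendance is +0.5 by construction;
        # development raises satisfaction for everyone in the country.
        s = 5 + 3 * development + 0.5 * a + random.gauss(0, 1)
        attend.append(a)
        satisfaction.append(s)
    countries.append((attend, satisfaction))

# Average of the within-country individual-level slopes.
within = mean(slope(a, s) for a, s in countries)

# Slope across countries using only group averages (ecologic analysis).
between = slope([mean(a) for a, _ in countries],
                [mean(s) for _, s in countries])

print(f"within-country slope:  {within:+.2f}")
print(f"country-average slope: {between:+.2f}")
```

In this constructed example the ecologic slope has the opposite sign to the individual-level association, purely because development confounds the country averages.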

Broader cultural contexts

One outcome that Ahrenfeldt et al. did not examine, apparently because of unavailability of linked data across countries, was mortality. With research in the United States, this is arguably the outcome for which the evidence is most substantial [1,2,3, 7, 24,25,26,27]. However, even in Europe, at least two moderately large longitudinal mortality studies with religious service attendance as the exposure have been carried out, suggesting a protective association between service attendance and mortality in Finland [28] and in Denmark [29], two countries that are relatively low in religiosity [20]. Had mortality data been available to Ahrenfeldt et al. in the Survey of Health, Ageing and Retirement in Europe, this could have expanded the evidence base considerably, given the nine countries included in the study. Mortality is also one outcome for which longitudinal associations with service attendance have been found for other non-Christian religious groups, including predominantly Jewish [30] and Taoist/Buddhist [31] populations.

The number of countries included in the study also allowed Ahrenfeldt et al. to examine associations by region of Europe, which they categorized as Northern (Denmark and Sweden), Southern (Italy and Spain) and Western (Austria, Belgium, Germany, Switzerland, and the Netherlands). There has been some indication in the research literature on religion and health, at least from mortality studies [32], that protective associations between religion and health may be more substantial in regions that are more religious. One might then expect the association to be weakest or least protective in Northern Europe, but from the interaction analyses in Table 3 of their paper, it is not at all clear that this is so. Of course, such an analysis is also potentially confounded by affiliation, with Northern Europe predominantly Protestant, Southern Europe predominantly Catholic and Western Europe mixed. This issue may, however, not be especially problematic insofar as there is relatively little evidence that associations between religious service attendance and health vary by Catholic versus Protestant affiliation, except for the outcome of suicide [33,34,35]. In any case, it does not seem that the analyses of Ahrenfeldt et al. add much evidence either for or against the hypothesis that associations with health are stronger in areas that are themselves more religious. It may also be that limited sample size is an issue here, with much larger samples and a greater number of countries needed to examine the evidence for the hypothesis more conclusively.

If we expand the perspective from Europe to other regions of the world as well, the existing analyses with individual-level data do indeed suggest some evidence that the associations between religion and health may be stronger in more religious regions [32, 36]. Such evidence comes from cross-cultural analyses with individual-level cross-sectional data on self-rated health [32, 36]; similar associations with cross-sectional data have been reported for subjective well-being [20]. Unfortunately, longitudinal data suggesting somewhat similar patterns for mortality is only available in the United States [26].

Person-culture fit?

Of course, even if the hypothesis that the associations are more protective in those regions which are more religious holds true, the fact that the association is in a protective direction at all still requires some explanation. It is sometimes postulated that if the protective associations are greater in those regions with greater levels of religious service attendance then the protective associations may entirely arise from person-culture fit [32]. Very likely person-culture fit does play some role in the protective associations, but the culture-fit hypothesis only explains why the associations are even more protective in some regions than others. It does not explain why the associations are almost universally protective to begin with. One study [32] examining 59 countries, albeit with cross-sectional data on self-rated health, found that, after covariate control, there was evidence for an adverse association between self-rated health and religiosity in only two countries (Albania and Moldova), whereas there was evidence for protective associations in 20 countries. Again, one must be cautious here because of the use of cross-sectional data, but the consistency itself of this, and, more importantly, of the many individual-level longitudinal mortality studies [1,2,3, 7, 24,25,26,27,28,29,30] needs to be explained. Person-culture fit may eventually help explain the varying magnitude of the association, but it is not sufficient to explain the direction. As a contrast, we would not expect an exposure like smoking to become protective in areas in which smoking is particularly prevalent because of person-culture fit, and we certainly would never think it to be protective nearly universally. Again, the direction and consistency of the protective association between service attendance and health itself needs to be explained.

Other mechanisms for religion and health

A number of explanations have been offered for the associations between religious participation and health. For mortality, evidence from longitudinal mediation analysis [37] suggests that better social support, reduced smoking, greater optimism, and less depression may all be important mechanisms [27]. Other proposed mechanisms have included greater meaning and purpose in life and greater self-control [1, 3]. While social support does seem to explain some of the associations between attendance and mortality, it does not seem to explain the majority of the effect [3, 25, 27]. Again, other mechanisms seem to be present. Current evidence from longitudinal studies suggests that service attendance is subsequently associated not only with all-cause mortality but also with less smoking, alcohol abuse, and drug use [1, 38]; less depression and lower suicide rates [1, 3, 10, 34, 35]; greater meaning and purpose [39]; greater optimism [27]; greater happiness and life-satisfaction [21]; lower crime [40, 41]; greater social support and greater likelihood of developing a more extensive social and friendship network [21, 25, 27]; as well as lower likelihood of divorce [25, 42, 43]. The mechanisms by which religious participation affects health may thus be numerous and quite diverse, and current evidence perhaps suggests that it may be the small contribution of many different pathways, rather than the substantial contribution of any specific one, that supplies religious service attendance with its powerful effects on health. Religious community thus appears to be a major pathway to human well-being [44]. The study of Ahrenfeldt et al. provides evidence that this is so not only in the United States, but also, for some outcomes at least, in Europe. How these associations play out in populations that are not predominantly Christian and in cultural contexts beyond Europe and the Western world remains an important open question. The study of Ahrenfeldt et al. has contributed substantially to the research literature for Europe; much more in other contexts remains to be done.