Introduction

For the epidemiologist, Hodgkin lymphoma (HL) is unique among neoplasms for several reasons. As noted in previous chapters, the malignant giant multinucleated (Hodgkin-Reed Sternberg [HRS]) cells are rare and are surrounded by an infiltrate of nonmalignant lymphocytes and other immune cells. The histopathologic classification, based on the 2001 WHO classification (Jaffe et al. 2001), divides HL into classical Hodgkin lymphoma (cHL), comprising >95% of the cases, and nodular lymphocyte predominant Hodgkin lymphoma (NLPHL). cHL is further subdivided into two common subtypes, nodular sclerosis (NSHL) and mixed cellularity (MCHL), and two uncommon subtypes, lymphocyte rich (LRHL) and lymphocyte depleted (LDHL). Because the HRS cells morphologically resemble giant cells associated with chronic infection and because the symptoms of fever, malaise, and lymphadenopathy are also reminiscent of infection, the condition was originally thought to be an infectious disease. Eventually, the multifocal and progressive nature of the disease with tumors in lymph nodes, spleen, liver, and other organs, resulting in death if untreated, demonstrated its malignant nature.

cHL is unusual because of the epidemiologic risk pattern, with three age-specific modes of incidence that vary by time period, geography, race/ethnicity, gender, and socioeconomic status (SES) (MacMahon et al. 1971; Cozen et al. 1992; Glaser et al. 2008; Correa and O’Conor 1973), all suggesting strong environmental determination. HL is also a highly heritable cancer with an evident genetic contribution (Mack et al. 1995). Progress in understanding this etiologic heterogeneity has been impeded because many investigators initially examined HL as a single disease, with the effect of dampening or misrepresenting subtype-specific associations.

In this chapter, we present an overview of the epidemiology of HL, including descriptive, nongenetic, and genetic epidemiology. We emphasize the unique epidemiologic risk pattern of HL, the importance of early life exposures, the complicated role of Epstein-Barr virus (EBV), the immune-related risk factors, and the strong influence of heritability and genetic factors including that of HLA.

Descriptive Epidemiology: Person, Place, and Time

Understanding the pattern of HL occurrence is essential to understanding fundamental questions about its causation. HL is the most common form of malignant lymphoma affecting people under the age of 30 in economically developed countries, with some variation by race/ethnicity and especially socioeconomic status (SES) (Cozen et al. 1992; Mueller and Grufferman 2006). Figure 8.1 depicts the histologic pattern by age of all cases of HL diagnosed among the roughly 10 million people of Los Angeles County over 20 years. Characteristic of the occurrence in developed countries, especially in North America, is the high incidence of NSHL, the lesser frequency of MCHL, and the relative paucity of cases of the other cHL subtypes and of NLPHL.

Fig. 8.1

Age-specific number of cases of Hodgkin lymphoma by histologic subtype, Los Angeles County, 1995–2014

Figure 8.2 depicts the age-specific incidence of the two major histologic subtypes, NSHL and MCHL, among males and females in Los Angeles County diagnosed from 1995 to 2014. Note the dramatic prominence of the young adult mode in cases from ages 15 to 35 of NSHL (now labeled adolescent/young adult Hodgkin lymphoma [AYAHL]), the absence of any such young adult mode among cases of MCHL, and the lower incidence of MCHL in females overall and especially at older ages.

Fig. 8.2

Age-specific incidence rates of nodular sclerosis (NSHL) and mixed cellularity (MCHL) by gender, Los Angeles County, 1995–2014

In the USA, trends in the incidence of HL over time have been generally stable for at least four decades as reflected nationally by the National Cancer Institute Surveillance, Epidemiology, and End Results (SEER) Program and by the constituent registries in Los Angeles County and California (https://seer.cancer.gov). However, over the last decade, a decrease of no more than 1–2% has occurred in the overall incidence rate per 100,000; it is not clear whether this represents a true decrease or is a result of misclassification as non-Hodgkin lymphoma (NHL) due to the increasing use of needle biopsies for diagnosis (Glaser et al. 2015).

Figure 8.3 shows gender-specific comparisons in recent age-specific incidence for internationally diverse populations from North and South America, South and East Asia, and Europe (Forman et al. 2014). The AYAHL mode varies by gender and geography/demography and is generally correlated with the level of economic development in the source population, as was first noted by MacMahon (MacMahon 1957). Moreover, as MacMahon also first pointed out, AYAHL cases more often appear first in the mediastinum. These clear differences in the pathology, presentation, and pattern of occurrence are strongly consistent with a substantial difference in etiology (Cozen et al. 1992).

Fig. 8.3

Age-specific incidence rates of Hodgkin lymphoma in males (a) and females (b) by international population, 2003–2007 (Forman et al. 2014)

Changes in incidence patterns according to the level of economic development (Fig. 8.3a, b) imply that age-specific incidence of AYAHL should also vary with changes in development over calendar time and place as this demographic characteristic changes. Population-based registration in the USA and Europe began well after the attainment of economic stability, but there are some available pertinent observations. A secular increase in AYAHL was first reported from Connecticut from a cancer registry dating from 1935; records from the first 45 years have been separately published (O’Conor et al. 1973), and those from an additional 35 years are included in the 10-volume International Agency for Research on Cancer (IARC) series “Cancer Incidence in Five Continents” (Forman et al. 2014; Doll et al. 1966) and in the US Surveillance, Epidemiology, and End Results (SEER) Program records (https://seer.cancer.gov). From these data, it appears that the secular increase in incidence rates of AYAHL began in cohorts born in the 1940s.

In middle age, after about age 40, age-specific incidence increases throughout the world (Fig. 8.3a, b), with a pattern similar to that of NHL (Morton et al. 2008) and many solid neoplasms. At advanced ages, the average level of incidence and the rate of increase with age are greater among men than among women and greater among US whites than among Europeans or Asians. They are also greater among non-Hispanic whites than among Hispanic whites or African-Americans (data not shown). Of particular interest is the extremely low rate seen throughout advanced age in residents of Shanghai, a pattern characteristic of other B-cell lymphomas among East Asians (Gale et al. 2000).

At the other end of the age spectrum, the earliest incidence mode has occurred historically in the first decade of life among children in less developed countries (Correa and O’Conor 1971, 1973). Childhood cases are more similar in histopathologic subtype to those of older persons than to those with AYAHL, usually being MCHL. A hint of the earlier mode is seen among male children from both Cali and Mumbai (Fig. 8.3a, b). Recent reports from Brazil (Ferreira et al. 2012) and Mexico (Rendon-Macias et al. 2016) have contrasted the patterns of childhood and adolescent HL by levels of regional economic development within those countries, with lower rates and a larger proportion of NSHL in the more economically developed regions and higher rates and more MCHL in the less developed regions. As these reports indicate, the early childhood mode of MCHL is gradually diminishing worldwide (MacFarlane et al. 1995), leaving a low baseline frequency (Hjalgrim et al. 2016). In between the age extremes, NSHL predominates, accounting for the large and growing AYAHL mode, with incidence rates roughly equal among males and females.

Also included in the IARC volumes are rates from Slovenia, of special interest because, unlike most of the other western European countries, the timing of a sudden social upturn was reflected in the rapid appearance of the AYAHL mode (likely to be mostly NSHL) (Fig. 8.4). The age-specific rates of AYAHL in Slovenia appear flat in 1975 but dramatically increase by 2005, with no such secular increase evident in HL at older ages. Similar sudden increases in AYAHL rates have occurred in Finland and Estonia (Forman et al. 2014). Over a period of more gradual SES development in Asia, similar young adult incidence modes have been documented, e.g., in Singapore (Hjalgrim et al. 2008), as well as in Japan (Chihara et al. 2014) and Korea (Koh et al. 2013), in spite of the overall lower incidence of B-cell tumors in these populations (Morton et al. 2006). Even among US African-Americans and Hispanics, the beginning of similar changes in AYAHL incidence has recently been observed in Los Angeles (unpublished). These increases over calendar time in relation to SES are presumed to be largely due to increases in the rate of NSHL.

Fig. 8.4

Age-specific incidence rates of Hodgkin lymphoma in Slovenia by gender and period of occurrence

Epstein-Barr Virus

EBV is a ubiquitous DNA virus with many latent, incompletely understood genes and is a very successful human parasite (Crawford 2001). EBV infects most residents of developing countries as young children, usually without symptoms, but residents of developed countries are often infected a decade or so later, occasionally resulting in the clinical syndrome of infectious mononucleosis (IM), presumably in those who are genetically susceptible (McAulay et al. 2007). IM is familiar to generations of teenagers and can be identified by a simple blood test (Crawford 2001). EBV has a crucial biological association with multiple malignancies, including epithelial carcinoma of the nasopharynx, Burkitt lymphoma, other NHL, and HL (Coghill and Hildesheim 2014). In the case of nasopharyngeal carcinoma, causal interactions with both genetic and behavioral determinants are known (Hildesheim and Levine 1993), and in the case of other lymphomas, the association appears under circumstances of reduced immune competence (MacMahon et al. 1991; Clarke et al. 2013).

During EBV infection, a variable proportion of B lymphocytes become and remain latently infected with the virus. These B-cells can both circulate and remain covertly within the germinal centers of lymph nodes (Mohamed et al. 2014). Following infection, after a short period of early convalescence and recovery, antibodies to certain viral antigens, such as Epstein-Barr Nuclear Antigen 1 (EBNA1), are maintained more or less permanently (Crawford 2001), and other antibodies, such as those against viral capsid antigen (VCA), are also present. The virus then exists in B lymphocytes in a latent-lytic cycle, with different patterns of latency measured by different EBV gene expression patterns and by the corresponding antibodies. T-helper Type 1 (TH1) immunity keeps EBV latent, but when TH1 immunity temporarily wanes, the virus can become reactivated, leading to a lytic stage, which can promote oncogenesis through a number of mechanisms (Khanna and Burrows 2000). During initial infection, antibodies to EBNA2 predominate over antibodies to EBNA1, but with recovery, this ratio reverses. Persistently higher anti-EBNA2 than anti-EBNA1 antibody levels may indicate chronic infection. When EBV is activated and replicating, the number of EBV copies (viral load) can be measured by PCR in serum (Gallagher et al. 1999), plasma (Ghandi et al. 2006), or whole blood (Gulley and Tang 2010). Thus, both prior infection and current activity of EBV can be measured using a variety of techniques.

In a minority of HL patients (~40% overall), EBV DNA is clonally present as an episome in HRS cell DNA, strongly suggesting that the cells were infected prior to malignant expansion (Weiss et al. 1989, 1991). Tumors demonstrating EBV-positive HRS cells are classified as EBV-positive HL. In general, risk factors for EBV-positive HL correspond to those for MCHL, including a relative increase among males, those of lower SES, and those diagnosed at age extremes (<8 or >50 years old) (Glaser et al. 2008; Flavell et al. 1999a; Chang et al. 2004a). Measures of maternal deprivation (Flavell et al. 1999b) and birthplace outside the USA (among US Hispanic cases) (Glaser et al. 2014) also predict EBV-positive HL (relative to EBV-negative HL). In acquired immunodeficiency conditions, such as infection with the human immunodeficiency virus (HIV)/AIDS, and in transplant recipients, HL is almost always EBV-positive (Glaser et al. 2003a; Quinlan et al. 2010).

Serological evidence of past infection is present in essentially all cases of MCHL and most (but not all) cases of NSHL (some are known to be infected after HL develops) (Gallagher et al. 2003). An important set of studies conducted by Mueller and colleagues demonstrated an association between elevated titers of antibodies to EBV and the risk of developing HL years later. First, historical samples from a blood bank in Norway were linked to the cancer registry to identify subjects who later developed HL. Higher-than-average titers of IgG and IgM antibodies to VCA and of antibodies to early antigen (EA) strongly predicted future HL (Mueller et al. 1989). The results were later confirmed in a nested case-control study of 128 cases within a cohort of US military active duty personnel (Levin et al. 2012), but only in EBV-positive cases. In a case-control study, higher IgG and IgA antibody titers to VCA and EA and an EBNA1/EBNA2 ratio <1.0 were associated with older age, male sex, lower educational level, smoking, and MCHL (vs. NSHL) (Chang et al. 2004a). The associations were attenuated when the variables were adjusted for each other, suggesting that the variables are correlated.

A number of investigators (Connelly and Christine 1974; Munoz et al. 1978; Miller and Beebe 1973; Rosdahl et al. 1974; Carter et al. 1977; Kvale et al. 1979; Gutensohn and Cole 1981; Bernard et al. 1987; Serraino et al. 1991; Mack et al. 2015) consistently found an association between a history of IM and risk of HL, mainly in the AYA group, the age at which the majority of cases are EBV-negative NSHL. As it became possible to assess the presence of EBV DNA in tumors, Hjalgrim and colleagues (Hjalgrim et al. 2003) conducted a population-based registry linkage study and found up to a 20-fold increased risk for EBV-positive HL within 5 years of documented IM, but no increase in risk for EBV-negative HL. A large well-conducted Scandinavian case-control study by the same group confirmed the large discrepancy in the association between IM and EBV-positive and EBV-negative HL (Hjalgrim 2007a). Another case-control study in the U.K. (Alexander et al. 2003) found a less clear distinction, with IM conferring a significant 3-fold risk of EBV-positive and a non-significant 1.9-fold risk of EBV-negative HL among young adults. A significant association was observed for EBV-positive and EBV-negative HL separately when all ages were combined. Finally, a U.S. study (Chang et al. 2004a) found no difference in association with IM when EBV-positive and EBV-negative cases were compared to each other, but the study lacked power. In summary, there is strong evidence for an association between IM and EBV-positive cHL specifically, but an association with EBV-negative HL is less certain. This finding is paradoxical because IM occurs in adolescence and young adulthood and reflects delayed primary EBV infection associated with higher SES, factors more commonly associated with EBV-negative than with EBV-positive HL (Flavell and Murray 2000).

Mueller and colleagues found that HL cases were more likely to have an EBNA1/EBNA2 ratio <1.0 compared to their unaffected siblings (Mueller et al. 2012). Similarly, the unaffected siblings with a history of IM also had a low EBNA1/EBNA2 compared to those who did not report IM. There were too few cases to examine the effect of histological subtype, and EBV tumor status was not provided; nevertheless it is clear from these studies that a dysregulated response to EBV after primary infection is associated with development of HL.

While peripheral blood mononuclear cell, serum, or plasma EBV DNA has been detected significantly more often in EBV-positive compared to EBV-negative cHL patients (Gallagher et al. 1999; Ghandi et al. 2006), it has not been systematically examined by histological subtype (Gallagher et al. 2003).

There is evidence of familial (Rostgaard et al. 2014) and genetic determinants of IM, including HLA class I genetic variants (McAulay et al. 2007), and of genetic determinants for antibody levels to EBNA1 (Rubicz et al. 2013). Besson and colleagues showed that the correlation between EBV viral load and levels of Ig against VCA was similar in HL patients and their relatives (Besson et al. 2006a). Although there is abundant evidence of HLA-specific risk alleles associated with EBV-positive cHL (reviewed in detail below in the genetic section), these appeared to be independent of IM (Johnson et al. 2015).

In summary, children with HL occurring before 8 years of age have invariably been infected with EBV and usually have the MCHL subtype and EBV-positive disease, similar to HL cases occurring in older adults. There appears to be a strong association between poor control of and/or response to EBV, including IM, and EBV-positive HL. It is still not clear whether this association reflects a more generalized immune dysregulation resulting in an ineffective immune response to EBV and HL as parallel consequences or whether actively replicating EBV itself is the causal agent. In contrast, there is little evidence for a role for EBV in the etiology of AYAHL EBV-negative NSHL (the majority of cases in developed countries).

Socioeconomic Status (SES)

SES can be characterized at the individual or at the population level according to the characteristics of the place of birth or residence at diagnosis based on the average local levels of income and education. It is important to realize that even within a given community, economic development does not impact all residents equally or at the same rate. Therefore, disease risk associated with different SES levels will be distributed through the community and will create a gradient of risk. Higher SES is strongly and positively associated with AYAHL but is inversely associated with HL in childhood and at older ages (Cozen et al. 1992; Gutensohn and Cole 1981; Clarke et al. 2005; Glaser et al. 2002; DeLong et al. 1984). SES is an index so generic that it provides a measure of almost any aspect of human circumstance or lifestyle, indicating the direction of a gradient but not the underlying causal exposure. Individual-level factors reflecting SES, such as family income (Abramson 1974), single family housing and higher maternal education (Gutensohn and Cole 1981), educational attainment (Serraino et al. 1991), intelligence quotient (LeShan et al. 1959), prestigious occupation (Vianna et al. 1974), and military rank (Cohen et al. 1964) also predict AYAHL. Maternal deprivation was associated with a higher risk of EBV-positive HL compared to EBV-negative HL in a UK study (Flavell et al. 1999b).

Body Size and Reproductive Factors

Height and higher body mass index (BMI) in association with HL have been repeatedly investigated, usually with respect to AYAHL, with most studies reporting modestly elevated levels of association (Mack et al. 2015; Li et al. 2013; Paffenbarger et al. 1977; Murphy et al. 2013). A recent meta-analysis (Larsson and Wolk 2011) provided stronger evidence of a positive relationship. Height alone either in childhood (Keegan et al. 2006; Isager and Andersen 1978) or adulthood (Mack et al. 2015; Murphy et al. 2013) has been more consistently associated with HL than has BMI. There is some heterogeneity of the association by age: positive associations have generally been found for both BMI and height among younger cases and inverse associations for older cases (Li et al. 2013; Keegan et al. 2006). Birth weight has not predicted subsequent HL diagnosis (Isager and Andersen 1978; Barker et al. 2013; Langagergaard et al. 2008); however, a high rate of fetal growth was positively associated in a study of 943 Swedish HL cases diagnosed prior to age 38 (Crump et al. 2012). Strenuous physical exercise was modestly inversely related to risk (Keegan et al. 2006). An association between AYAHL and greater consumption of saturated fat and lower consumption of monounsaturated fat was suggested (Gao et al. 2013). In the same study, HL cases showed a special fondness for sweets compared to controls (Epstein et al. 2015). Both adult height and BMI reflect cumulative nutrition, likely to reflect childhood affluence, although alternatively, each could also reflect increased levels of B-cell activating growth factors. A nested case-control study of male AYAHL cases permitted the adjustment of both BMI and height by measures of childhood family affluence and found that the latter was likely to be the operative determinant (Mack et al. 2015).

Reproductive behavioral measures, including low parity (Kravdal and Hansen 1993), late age at first pregnancy, cumulative breastfeeding experience (by the case), and hormone replacement therapy (Glaser et al. 2003b), predict HL risk. The associations could be explained by greater access to medical care, a biologic endocrine- or immune-related exposure, or higher SES. Maternal breastfeeding has also been found to predict childhood HL (Davis et al. 1988; Grufferman et al. 1998).

Measures of Early Life Microbial Exposure

Family Structure

The number of siblings emerged as a potential explanation for the association with high childhood SES and is often interpreted as a surrogate for microbial exposures in early life (with more exposure corresponding to more, closely spaced, siblings). Even after accounting for SES, sibship size has been repeatedly linked to HL in adolescence and young adulthood, with risk decreasing as the number of siblings increases (Mack et al. 2015; Westergaard et al. 1977; Chatenoud et al. 2005; Bernard et al. 1987; Bonelli et al. 1990; Altieri et al. 2006) (Fig. 8.5). In contrast, among cases occurring in early childhood, the opposite association is seen, with an increasing number of siblings linked to increasing risk (Westergaard et al. 1977). Sibship size has not been carefully studied in relation to HL among the elderly.

Fig. 8.5

Odds ratio of young adult Hodgkin lymphoma according to sibship size

In addition to sibship size, AYAHL occurrence seems to be related to birth order and, within birth order, to the interval in age between siblings (although even in the largest study to examine these factors, sample sizes were not large after adjustment) (Mack et al. 2015). In case-control studies conducted with subjects ascertained since 2004, the association with sibship size has attenuated, but attendance at day care early in life was protective (Chang et al. 2004b, c).

Childhood Infections

Another marker of childhood microbial exposure is the recalled history of common infections of childhood. AYAHL patients have generally reported fewer than expected infections such as measles and mumps at early ages compared to controls (Mack et al. 2015; Paffenbarger et al. 1977; Gutensohn and Cole 1977; Glaser et al. 2005; Alexander et al. 2000). In contrast, HL cases occurring in early childhood have reported more infections compared to controls (Linabery et al. 2014). In AYAHL patients, these observations have been taken to imply a lack of exposure (generally or specifically) to a ubiquitous infection in childhood, with increased susceptibility in adolescence or beyond. In a Jerusalem birth cohort, HL diagnosed at <40 years of age occurred less commonly among infants hospitalized in the first year of life for a respiratory infection, while NHL diagnosed during the same follow-up period was significantly more common (Paltiel et al. 2006).

Tonsillectomy and Appendectomy

A marker of childhood risk that has been frequently reported but is difficult to interpret is a history of tonsillectomy. Studies have been inconsistent (Grufferman and Delzell 1984), but it seems likely (Vestergaard et al. 2010; Mueller et al. 1987) to be a predictor of AYAHL. In a cohort study of >55,000 tonsillectomized patients in Sweden, having a tonsillectomy before age 12 was associated with an increased risk of HL diagnosis prior to age 25 (1.7 cases expected, 7 observed, standardized incidence ratio [SIR]= 4.1) (Liaw et al. 1997). There was a 20-fold increased risk associated with having a tonsillectomy before age 5, based on an even smaller number of cases.
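The standardized incidence ratio quoted above is simply the observed count divided by the expected count. The sketch below reproduces that arithmetic from the two counts given in the text (7 observed, 1.7 expected) and adds an exact confidence interval under the conventional assumption that the observed count is Poisson distributed. That assumption, the function names, and the resulting interval are illustrative only; they are not taken from Liaw et al. (1997), and the interval should not be read as the one reported by that study.

```python
import math


def poisson_cdf(k, mu):
    """Return P(X <= k) for X ~ Poisson(mu)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mu / i    # P(X = i) from P(X = i - 1)
        total += term
    return total


def bisect_root(f, lo, hi, tol=1e-10):
    """Find a root of a monotone function f that changes sign on [lo, hi]."""
    f_lo = f(lo)
    for _ in range(200):
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0


def sir_with_exact_ci(observed, expected, alpha=0.05):
    """SIR = observed / expected, with an exact Poisson interval (assumed model)."""
    sir = observed / expected
    if observed == 0:
        mu_lower = 0.0
    else:
        # Lower limit: Poisson mean at which P(X >= observed) equals alpha / 2.
        mu_lower = bisect_root(
            lambda mu: (1.0 - poisson_cdf(observed - 1, mu)) - alpha / 2.0,
            0.0, float(observed))
    # Upper limit: Poisson mean at which P(X <= observed) equals alpha / 2.
    search_max = observed + 10.0 * math.sqrt(observed + 1.0) + 10.0
    mu_upper = bisect_root(
        lambda mu: poisson_cdf(observed, mu) - alpha / 2.0,
        float(observed), search_max)
    return sir, mu_lower / expected, mu_upper / expected


if __name__ == "__main__":
    # Counts quoted in the text for tonsillectomy before age 12.
    sir, lower, upper = sir_with_exact_ci(observed=7, expected=1.7)
    print(f"SIR = {sir:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```

With only seven observed cases, the interval under this model is wide (roughly 1.7 to 8.5), which illustrates why estimates based on even fewer cases, such as the 20-fold figure for tonsillectomy before age 5, call for caution.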

With respect to the significance of this association, it is possible that HL cases have some underlying immune susceptibility resulting in many respiratory infections and are therefore more often candidates for tonsillectomy. This explanation is consistent with findings from the Children’s Oncology Group study for HL cases diagnosed under 14 years (Linabery et al. 2014), but not with the observations in the Jerusalem birth cohort study cited above (Paltiel et al. 2006) (i.e., severe respiratory infections were protective for HL occurring up to age 39 years). A higher rate of tonsillectomy could also be correlated with higher childhood SES and therefore more access to medical care, which includes this common childhood surgery that has always had imprecise indications. However, this explanation is less likely to explain the results in the Swedish tonsillectomy cohort study (Liaw et al. 1997) since access to care is available equally to all in Sweden. Finally, regardless of the indication for tonsillectomy, it does have the effect of removing a lymphoid barrier to infections, which could have an impact on susceptibility.

Appendectomy has also occasionally (Henderson et al. 1979) albeit inconsistently (Silingardi et al. 1982) been found to be positively associated with AYAHL. While a Swedish linkage study (Cope et al. 2003) found the expected number of appendectomies preceding HL diagnoses, a study of 188 twin pairs discordant for AYAHL showed that appendectomy was more common among the HL-affected twin compared to their unaffected co-twin, a comparison naturally controlled for genotype and childhood SES (Cozen et al. 2009a).

One possible explanation for a true association with tonsillectomy and/or appendectomy is a tendency for a hyper-inflammatory response in lymphoid tissue, regardless of the trigger, among cases.

In addition to evidence of specific infections or surgeries related to them, evidence of general microbial exposure can be assessed in specific study circumstances. Disease-discordant twins make excellent subjects for this line of research because they can recollect, and therefore validate, each other’s childhood exposures. In the aforementioned study of AYAHL-discordant co-twins, the twin who had more oral exposures in infancy or early childhood, indicated by a history of relatively more thumb- and finger-sucking and the use of a pacifier, was significantly less likely to develop AYAHL, with risk reduced by 50–80% (Cozen et al. 2009a). In a follow-up study, the fecal microbiome of the HL-affected twin had fewer unique microbial taxa than that of their unaffected co-twin (Cozen et al. 2013). Lower fecal microbial diversity is linked to a higher level of SES in childhood (Björkstén et al. 1999), fewer older siblings (Laursen et al. 2015), a T-helper Type 2 (Th-2) skewed immune response, and a higher risk of atopy (Riiser 2015), factors which overlap to some extent with characteristics of AYAHL.

Adult Lifestyle

A large number of investigators have studied the link between cigarette smoking and HL over the years with mixed results. Since 2005, two hospital-based case-control studies (Monnereau et al. 2008; Gallus et al. 2004) found no increased risk. However, 3 cohort studies (Kroll et al. 2012; Nieters et al. 2008; Lim et al. 2007) found relative risks of 1.5, 2.1, and 2.3; 3 population-based case-control studies (Hjalgrim et al. 2007b; Briggs et al. 2002; Glaser et al. 2004) found odds ratios (ORs) of 1.6 and 1.8; and 2 large multicenter pooled case-control studies (Besson et al. 2006b; Kamper-Jorgensen et al. 2013) found ORs of 1.4 and 1.2. In the latter, control for known determinants was generally accomplished, a dose-response relationship was observed, and risk was increased in men, in the elderly, and in those with MCHL and/or EBV-positive HL. In some studies, risk diminished with time since last exposure. Two meta-analyses based on 17 (Castillo et al. 2011) and 50 (Sergentanis et al. 2013) reports, respectively, found summary increased ORs of 1.2 and 1.4 and a suggestion that the risk was specific to EBV-positive HL.

Studies of alcohol consumption adjusted for tobacco use are remarkably consistent in showing a substantial protective effect, especially among non-smokers. In most instances, all adult cases were included (Monnereau et al. 2008; Kroll et al. 2012; Lim et al. 2007; Gorini et al. 2007; Klatsky et al. 2009; Deandrea et al. 2007; Tramacere et al. 2012; Ji et al. 2014). Pain experienced in untreated HL after alcohol consumption may impact its use (Atkinson et al. 1976), and thus reverse causality cannot be ruled out. In a study of the long-term use of anti-inflammatory agents, including COX-2 inhibitors, low-dose aspirin was suggestively protective against HL, but significance was marginal (Chang et al. 2010). Also seemingly protective was exposure to solar (UV) radiation, as measured by a history of sunburn and of recalled sun exposure (Smedby et al. 2005). This initial study was followed by others (Petridou et al. 2007; Grandin et al. 2008; Boffetta et al. 2008), using different measures of UV exposure and with disparate results. These were all included in a pooled analysis that attempted to explain the diverse results by examining subgroups of HL and concluded that protection was especially pertinent to EBV-positive HL (Monnereau et al. 2013). More recently, studies of HL and a latitude gradient in Australia (Van Leeuwen et al. 2013) and the U.S. (Bowen et al. 2016), adjusted for age (and, in the U.S., for race) but not social class, support a protective effect with a dose-response relationship for NSHL, MCHL, and, in the U.S., all other cell types.

Clustering

There has long been interest in variation in the frequency of HL in time and space. Geographic differences in the incidence of HL between US cities are substantial (Glaser 1987). An early report of seasonality of the birth dates of HL patients (Langagergaard et al. 2003) was not confirmed by a second investigation (Crump et al. 2014). Several investigators have reported non-random variation in season at diagnosis (Newell et al. 1985; Westerbeek et al. 1998; Douglas et al. 1998; Chang et al. 2005a), generally finding peaks in the late winter or early spring, especially among young adults. This pattern is consistent with seasonality found for IM incidence (Douglas et al. 1996). HL mortality has also been linked to season of diagnosis (Porojnicu et al. 2005). A recent comprehensive report based on SEER data found an association between early spring (March) and latitude and HL incidence and mortality within 3 years of diagnosis (Borchmann et al. 2017). The authors suggest, as have others, that a reduction in vitamin D production due to the low solar flux in the preceding months could explain the seasonal effect. Although the observations appear sound, they must be interpreted with caution, especially with respect to etiology. A long latency between causal exposure and presenting symptom is the norm in cancer, including HL. In the registry linkage study, Hjalgrim and colleagues (Hjalgrim et al. 2003) estimated a median incubation period for EBV-positive HL following IM of 4.1 years (95% CI 1.8–8.3 years), inconsistent with a short latency period. Moreover, a similar early spring excess in diagnoses of etiologically distinct non-Hodgkin lymphoma was found using the same data source and similar methodology (Koutros et al. 2009). Borchmann et al. (Borchmann et al. 2017) and others (Chang et al. 2009) suggest an alternative hypothesis: the early spring diagnostic peak could be due to infection with seasonal viruses that bring undiagnosed younger patients in contact with medical care, where the HL is then discovered incidentally. However, the association between early spring diagnosis and mortality requires additional investigation and explanation.

Allegations of geographically localized high risk (Vianna et al. 1971) have been unconvincing (Grufferman et al. 1979), but remaining suspicions of an infectious etiology have led to multiple studies of the tendency of HL to cluster in time and space (Alexander et al. 1989; Glaser 1990), as would be produced by a common-source or person-to-person infection (Grufferman 1977). Reports of cases with social connections are to be expected by chance, and some have been more widely publicized than closely scrutinized (Vianna et al. 1971). To result in a time-space cluster, cases would have had to have a common time-space exposure or a time-space link between cases in successive generations. Clustering of a rare condition like HL would be difficult to detect, since the need for statistical power mandates a large geographic and chronological range, and this requirement is likely to lengthen the average time-space distance between cases beyond biological credibility. Most of those who have studied clustering have been too willing to assume that the incubation period between putative exposure and clinical onset is short, as it is with most familiar person-to-person viral infections. Malignancy takes time to progress to symptomatic disease, and the longer the incubation period, the more probable it is that the geographic link between cases will have been lost if it is presumptively based on a common residence or workplace at onset. Most importantly, investigators have failed to recognize that an underlying association with SES will invariably produce an apparent geographic clustering (Alderson 1982; Greenberg et al. 1983). Two studies that did consider potential time-space connections searched for common social contact (Smith et al. 1973; Matthews et al. 1984). These investigators found no difference in the observed frequency of common social contact among case-case, case-control, and control-control pairs. These studies provide some evidence against person-to-person, congenital, or common-source infection.

Occupation and Environment

In a search for environmental determinants of HL, links between occupation and HL have repeatedly been investigated. With person-to-person transmission in mind, a modest increase in risk of HL among physicians (Vianna et al. 1974) was found but is also consistent with their higher SES (Grufferman et al. 1976; Smith and Kinlen 1974). In that context, no convincing increase in risk has been observed among teachers (Grufferman et al. 1976), who also would have increased levels of exposure to cHL by virtue of exposure to students.

Historically, the most frequent finding to be reported was an association with woodworking occupations (Milham and Hesser 1967; Abramson et al. 1978). Less attention has been given to this exposure since an IARC review of the relationship failed to confirm a suggestive pattern (Demers and Boffetta 1998).

A significant inverse association between occupational exposure to allergens and risk of HL was seen in the EPILYMPH multi-country case-control study (Espinosa et al. 2013). With an eye toward the possible effects of immunogenic high molecular weight molecules, occupations involving exposure to materials such as latex were found to convey substantially increased risk (Kogevinas et al. 2004). Another set of studies concentrated on chemicals, with studies of chemists, workers in the chemical industry, and those exposed to a variety of commercial chemicals; however, no individual chemical or class of chemicals was shown to consistently predict occurrence (Grufferman et al. 1976; McCunney 1999). More recently, HL has been found in single reports to be weakly associated with exposure to the pesticide chlorpyrifos, moderately associated with exposure to ionizing radiation from uranium (Karunanayake et al. 2009), and substantially and significantly more common among gas-station workers (Neasham et al. 2011). Childhood MCHL but not NSHL has been found to be associated with household pesticide use during pregnancy (Rudant et al. 2007). No confirmatory studies have yet appeared for any of these reports.

Comorbid Conditions

Other conditions linked to HL can be grouped into several categories: conditions producing risk of HL by virtue of immunosusceptibility, conditions caused by HL and/or treatment for HL, and conditions likely to share environmental/genetic determinants.

In the first category, the major example is HIV infection (Biggar and Rabkin 1996), which produces a more than tenfold increase in risk of HL (Goedert 2000; Grulich and Vajdic 2015); HIV-associated HL is almost always EBV-positive (Rapezzi et al. 2001). Unlike NHL, the risk of HL decreases with decreasing CD4 counts, and this decrease is especially precipitous for NSHL, which does not occur in patients with CD4 counts <50 cells/μL (Biggar et al. 2006). Cases also occur within a few months of first treatment with combination anti-retroviral therapy (cART) (Lee et al. 2016). Patients receiving transplantation have a fourfold increase in risk of HL (Quinlan et al. 2010), particularly if the transplant occurs at an early age (Clarke et al. 2013). The majority histologic subtype in both acquired immunodeficiencies is MCHL; however, NSHL is increasing among HIV+ persons treated with cART who have higher CD4 counts (Biggar et al. 2006; Jagadeesh et al. 2012).

In the second category, HL and/or treatment for HL increases the risk of other hematological neoplasms and solid tumors, and these risks persist long after treatment (Swerdlow et al. 2011; Castellino et al. 2011). Those cancers which occur in excess include NHL, both lymphoid and myeloid leukemias, viral carcinomas such as that of the cervix, and carcinomas of the lung, kidney, and especially breast and thyroid, probably because of proximity to the radiation therapy field (Swerdlow et al. 2011; Kaldor et al. 1990; Schonfeld et al. 2006). For example, HL patients diagnosed at 25–34 years old and treated with chemotherapy alone have a 33-fold increased risk of leukemia, and those treated with both chemotherapy and radiation have a 17-fold increased risk of non-Hodgkin lymphoma (Swerdlow et al. 2011). A Scandinavian study estimated the excess risk of acute myelocytic leukemia within the first 10 years following HL diagnosis at 7.9%, more pronounced in patients diagnosed prior to 1984 compared to after, probably due to modifications in chemotherapy (Schonfeld et al. 2006). A genetic variant in the gene PRDM1 is associated with increased susceptibility to radiation-associated second malignancies in HL patients, especially those receiving radiation therapy under age 20 (Best et al. 2010). Nonmalignant comorbidities related to both chemotherapy and radiation therapy include cardiomyopathy, pulmonary fibrosis, esophageal strictures, and reflux (Castellino et al. 2011).

Of more etiological interest is the third category of independent comorbidity. Case-control studies have identified a list of chronic diseases affecting both those with HL and their first-degree relatives more often than would be expected by chance. The descriptive commonality between HL and multiple sclerosis has long attracted attention (Newell 1970; Vineis et al. 2001; Kurtzke and Hamtoft 1976). A modest but significant increase in the frequency of HL and multiple sclerosis (from the population-based Danish registry) in the first-degree relatives of cases of the other condition has been reported, and anecdotal examples of individuals affected by both conditions have been noted (Hjalgrim et al. 2004). An attempt to confirm this association using hospital discharges in both Sweden and Denmark confirmed the Danish findings but found a modest and nonsignificant association in Sweden (Landgren et al. 2005a). More recently, modest evidence of genetic overlap was observed (Khankhanian et al. 2016), discussed in more detail in the genetic section below.

Using the linkage systems of Denmark and Sweden, a Scandinavian case-control study based on serology and personal interview found an association between personal histories and serological evidence of rheumatoid arthritis and HL, especially for EBV-positive and/or MCHL. The risk increased with subgroups of more definitive or more advanced rheumatoid arthritis (Hollander et al. 2015). In 2006, Swedish investigators selected a roster of population-based HL cases, identified their first-degree relatives, and searched hospital discharge registries for cases of autoimmune disease among them (Landgren et al. 2006). Cases of HL themselves had significantly increased lifetime risks of systemic lupus erythematosus, rheumatoid arthritis, sarcoidosis, and immune thrombocytopenic purpura, and among the relatives of cases, significant increased risks were found for sarcoidosis and ulcerative colitis, albeit based on small numbers. In a separate report of the same study design (Landgren et al. 2005b) but based only on the Swedish linkage system, the same investigators found a substantial deficit of type II diabetes mellitus in both HL cases and their first-degree relatives.

In 2014, a second team addressed the same question beginning with Swedish cases of autoimmune disease and identifying the cases of HL occurring among them and their relatives (Fallah et al. 2013). Again mostly with small numbers, this team found significant increases of NSHL compared to expected, among cases of sarcoidosis, immune thrombocytopenic purpura, rheumatoid arthritis, multiple sclerosis, and psoriasis. MCHL appeared in significant excess in cases of rheumatoid arthritis, sarcoidosis, systemic lupus erythematosus, immune thrombocytopenic purpura, Sjogren syndrome, and polymyositis/dermatomyositis. In addition to persons affected with certain of these same conditions, cases of HL of unspecified histology occurred among cases of Crohn’s disease. Having a parent with an autoimmune disease increased the risk of EBV-positive but not EBV-negative HL in children under 15 years old (Linabery et al. 2014). The most recent effort, again from Sweden and Denmark, examined the risk of autoimmune disease in a case-control study of HL and found a strong (>twofold risk) association with rheumatoid arthritis and EBV-positive MCHL (Hollander et al. 2015). Such autoimmune conditions may share genetic determinants, may be commonly influenced by EBV infection, or may share unknown environmental determinants of aberrant immune function.

T-helper Type 2 (TH2)-related immune abnormalities, such as asthma and hay fever, have been inconsistently linked to HL (Hollander et al. 2015; Martínez-Maza et al. 2010). Exceptions include a strong and significant positive association between a childhood history of eczema and risk of AYAHL in twin pairs discordant for the neoplasm (Cozen et al. 2009b) and between allergy in a parent or sibling and risk of childhood (under age 15) EBV-negative HL (Linabery et al. 2014).

A hallmark of HL is evidence of immunological dysfunction. As above, abnormalities in T-cell function, indicative of atopy or poor T-cell response, were more prevalent in relatives of HL cases compared to unrelated individuals (Merk et al. 1990). A twin study found higher levels of interleukin-6 (IL6) (Cozen et al. 2004) and lower levels of interleukin-12 (IL12) (Cozen et al. 2008) in the unaffected identical co-twins of cases (surrogate cases) compared to controls, suggesting a genetically determined immunophenotype. A recent case-control study nested in a cohort of US military recruits confirmed that higher IL6 levels prior to diagnosis predicted future development of HL, in addition to higher levels of CD30 (Levin et al. 2017). In the same study, detectable (vs. undetectable) levels of interleukin-10 (IL10) years prior to diagnosis predicted risk of EBV-positive HL only. Thus, the association between immune-related abnormalities and risk of HL and HL subsets provides evidence for the highly immunological nature of HL etiology.

Heritability, Familial, and Genetic Risk

Familial Risk

There is general consensus that HL is a heritable cancer. Early studies reported a three- to seven-fold higher risk of HL in relatives of cases (Grufferman et al. 1977; Kerazin-Storrar et al. 1983) and a nine-fold risk of HL in persons with a family history (Bernard et al. 1987). Subsequently, investigators in Sweden (Altieri and Hemminki 2006), Israel (Paltiel et al. 2000) and Iceland (Amundadottir et al. 2004) made use of cancer registries and/or large population-based family registries to assess familial risk by comparing observed to modeled expected incidence, and found a three- to six-fold risk of HL for individuals with affected first-degree relatives, or conversely, for the risk of HL in first-degree relatives of cases. In Utah (Kerber and O’Brien 2005), Sweden (Goldin et al. 2004; Chang et al. 2005b; Goldin et al. 2009), and in all 5 Nordic countries combined (Kharazmi et al. 2015), the frequency of HL occurrence among first-degree relatives of cases was compared to the frequency among those of controls, again with risks of HL given an affected relative of three- to nine-fold. (The exception was a risk of 81 for LRHL among persons with an affected relative, based on six cases [Kharazmi et al. 2015].) Another Swedish group (Crump et al. 2012) conducted a birth cohort study to examine risk factors for HL, following the entire population from birth in 1978 to 2008 and linking to various Swedish registries to find HL-affected family members. They estimated that having a parent or sibling affected with HL conferred a seven- to nine-fold risk of HL to the cohort member, based on 13 HL-affected relatives identified for the entire cohort (Crump et al. 2012). The only study to examine familiality by EBV tumor status was conducted within the Children’s Oncology Group; the authors found that having a first-degree relative with HL conferred a roughly twofold nonsignificant higher risk for both EBV-positive and EBV-negative childhood HL (diagnosed at less than 14 years) (Linabery et al. 2015).

Finally, a registry-based study conducted in Finland examined the risk of cancers among 4126 relatives of 693 rarely studied NLPHL cases (Saarinen et al. 2013). There were 12 occurrences of NLPHL among the relatives, with a very high risk compared to the expected based on population rates (SIR = 19, 95% CI = 8.8–36) and an even higher risk among those whose relative was diagnosed <30 years old, based on three cases (Saarinen et al. 2013). There was an especially high SIR for familial NLPHL among females (SIR = 48) compared to males (SIR = 17), regardless of the sex of the relative.

Of special interest in distinguishing between environmental and genetic factors is the study of twins. The 179 MZ (initially unaffected) co-twins of HL cases had a significant, much higher histopathologically documented risk of developing HL than that expected based on population incidence (10 observed vs. 0.1 expected), whereas no increased risk was observed among 187 DZ co-twins (0 observed vs. 0.1 expected) (Mack et al. 1995). Moreover, the majority of the originally unaffected MZ co-twins developed HL within an average of 4.5 years after diagnosis in the proband. All co-twins’ diagnoses occurred before age 50, and most had EBV-negative NSHL, consistent with the more common prevalence of the subtype at a young age (Mack et al. 1995). MZ co-twins ascertained prior to their diagnosis and followed prospectively had the same higher risk as those identified in retrospect. In a review of cancer in twins born in England and Wales, HL risk to the like-sex co-twins of HL cases was found to be higher than that to the unlike-sex co-twins of cases (Swerdlow et al. 1996), consistent with the higher reported risk to genetically identical MZ twins (Mack et al. 1995). In the largest HL study from a family database (Kharazmi et al. 2015), the risk of developing HL was 57 for co-twins of HL cases (6 co-twins out of 46 like-sex pairs). In 4 of the 6 twin pairs, both members were diagnosed before age 30. In summary, family and twin studies indicate that HL, especially but not exclusively AYAHL, is among the most familial of neoplasms. The relative contribution of genetic vs. environmental factors has not yet been determined.
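The observed-versus-expected comparison reported for the twin cohort (10 observed vs. 0.1 expected among MZ co-twins; 0 observed vs. 0.1 expected among DZ co-twins) can be illustrated with a short calculation. The sketch below is illustrative only: it assumes the number of cases follows a Poisson distribution with the quoted expected count, which is a common convention for such ratios but is not stated in the source, and the function names are hypothetical.

```python
import math


def observed_to_expected_ratio(observed, expected):
    """Ratio of observed to expected cases (expected must be > 0)."""
    return observed / expected


def poisson_upper_tail(observed, expected, n_terms=200):
    """P(X >= observed) for X ~ Poisson(expected), summed term by term.

    Summing the upper tail directly avoids the loss of precision that
    1 - P(X < observed) would suffer when the tail is tiny.
    """
    if observed <= 0:
        return 1.0
    # First term, P(X = observed), computed in log space for stability.
    log_term = -expected + observed * math.log(expected) - math.lgamma(observed + 1)
    term = math.exp(log_term)
    total = 0.0
    k = observed
    for _ in range(n_terms):
        total += term
        k += 1
        term *= expected / k  # P(X = k) from P(X = k - 1)
    return min(total, 1.0)


# Counts quoted in the text (Mack et al. 1995).
mz_ratio = observed_to_expected_ratio(10, 0.1)
mz_p = poisson_upper_tail(10, 0.1)
dz_ratio = observed_to_expected_ratio(0, 0.1)
dz_p = poisson_upper_tail(0, 0.1)

print(f"MZ co-twins: O/E = {mz_ratio:.0f}, Poisson P(X >= 10) = {mz_p:.2e}")
print(f"DZ co-twins: O/E = {dz_ratio:.0f}, Poisson P(X >= 0) = {dz_p:.2f}")
```

Under this assumption, the MZ excess is far beyond what chance would produce, while the DZ count of zero is entirely compatible with the expected 0.1 cases.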

In the Comorbid Conditions section we discussed the occurrence of NHL following an HL diagnosis, presumably related to treatment. However, this association raises the question of other hematological neoplasms in family members of HL patients, suggesting common determinants. A low relative incidence (up to three-fold) has been documented in investigations based on cancer registries linked to family databases in Israel (Paltiel et al. 2000) and Sweden (Goldin et al. 2005), and in a case-control study in Scandinavia (Chang et al. 2005b). Further examination of links between HL and NHL subtypes in relatives has been conducted in a pooled international multicenter case-control study within the InterLymph consortium (Wang et al. 2007) and again using the Swedish resources (Goldin et al. 2009a). These studies confirmed a modest association between HL and both diffuse large B-cell lymphoma (DLBCL) and follicular lymphoma. No increased familial risk between HL and chronic lymphocytic leukemia (CLL) (Goldin et al. 2009b) or multiple myeloma (MM) (Schinasi et al. 2016) was observed. None of these associations approaches the magnitude of the HL-specific familial risk described above.

Genetic Risk Factors

Based on the evidence of strong heritability, a decades-long search to find the specific genetic risk factors in question has been ongoing. Different strategies have been employed that focus on either highly penetrant, rare variants associated with a very large risk, or common variants associated with a low risk but possibly a larger public health impact in terms of attributable risk (Manolio et al. 2009). Generally, studies of multiplex families are used to identify high-risk, rare variants, and very large case-control studies are employed to identify common loci at lower levels of risk.

Family Studies

HL clustering in families has been reported numerous times (Robertson et al. 1987; Cragen and Fraumeni 1972; Halazun et al. 1972; Shibuya et al. 1984). Family study designs, including segregation and linkage analyses in multiplex families and case-parent trio studies, are effective at identifying rare variants that explain risk in families using modern genotyping and sequencing technology. A group at the National Cancer Institute has followed HL families for over 40 years. Initially, HLA alleles were examined (Harty et al. 2002) (see HLA section below), and subsequently, an early genome screen using microsatellites in 44 of these families identified a significant peak at chromosome 4p16 flanked by loci D4S2935 and D4S394, with a nonparametric LOD score of 2.6 (p = 0.0002), suggesting a recessive model (Goldin et al. 2005). The biological implication of this locus is not known, but it has been associated with a variety of diseases including familial systemic lupus erythematosus (Xing et al. 2007), Wolfram syndrome (Larsen et al. 2004; Ohata et al. 1998), and progression of colon cancer (Al-Mulla et al. 2006).

In another study, a family with three siblings affected with NSHL as young adults and a mother who died of an unspecified mediastinal tumor was found to have an inherited translocation, t(2;3)(q11.2;p21.31) (Salipante et al. 2009). The translocation was localized to a breakpoint carrying the Kelch domain protein 8B (KLHDC8B), which was under-expressed in the affected family members (Salipante et al. 2009). A C > T substitution in the 5′-untranslated region of this gene was detected in affected probands from 3 out of 52 replication families with at least 2 cHL-affected members compared to 4 out of 307 controls, resulting in an OR of 4.64 (95% CI = 1.01–21.4) (Salipante et al. 2009). The KLHDC8B mutation was linked with cHL and lung cancer in the three families. One sporadic (non-familial) cHL patient’s malignant HRS cells showed loss of heterozygosity in a region that includes KLHDC8B. When the protein level of the gene was halved, there was disruption of cytokinesis resulting in an increase of binucleated cells, suggesting a link between the protein level and development of multinucleated HRS cells. This mutation has not been demonstrated in other sporadic or familial cHL patients but remains a suggestive candidate for follow-up.
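The odds ratio quoted above for the KLHDC8B 5′-UTR variant can be approximately reproduced from the stated counts (3 of 52 familial probands vs. 4 of 307 controls). The following is a minimal sketch, not the original authors' analysis: it assumes a simple 2 × 2 table and a Wald (Woolf) confidence interval on the log odds ratio, and the function name is hypothetical. The source does not say which interval method was used.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval for a 2 x 2 table.

    a: carriers among cases      b: non-carriers among cases
    c: carriers among controls   d: non-carriers among controls
    All four cells must be non-zero for this approximation.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts as stated in the text (Salipante et al. 2009).
carrier_probands, total_probands = 3, 52
carrier_controls, total_controls = 4, 307

or_est, ci_low, ci_high = odds_ratio_wald_ci(
    carrier_probands,
    total_probands - carrier_probands,
    carrier_controls,
    total_controls - carrier_controls,
)
print(f"OR = {or_est:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.1f}")
```

Under these assumptions the estimate falls close to the reported values. The very wide interval reflects how few carriers were observed in either group.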

In the sequencing era, a single family with three affected out of five siblings provided samples for exome sequencing. All three siblings had EBV-positive tumors; two were diagnosed with NSHL at ages 5 and 12 and the third was diagnosed with MCHL at age 12 (Ristolainen et al. 2015). In a comparison with controls with other phenotypes and with publicly available exome data, a single locus with a homozygous deletion and resulting frame shift in the gene aggrecan (ACAN) was identified that resulted in 19 missing amino acids. ACAN is a component of the extracellular matrix of cartilage, including intervertebral disks (Gruber et al. 2011). In addition, the three affected siblings and unaffected father had a single nucleotide polymorphism (SNP) in KIAA0141, and a removal of a stop codon in fusion protein LY75-CD302, also expressed in HRS cells (Kato et al. 2003).

The Finnish group that demonstrated the strong heritability of NLPHL from registry data (Saarinen et al. 2013) identified a single family from the same registry with four cousins affected with NLPHL. Based on samples from 11 family members, a heterozygous 2 base pair deletion resulting in a frame shift and stop codon in exon 13 of the nuclear protein coactivator of histone transcription (NPAT) gene was observed in the affected cases and 3 healthy relatives but in none of the unrelated healthy Finnish controls (Saarinen et al. 2011). Furthermore, the deletion corresponded to lower expression of the NPAT mRNA. A different NPAT mutation appeared to be overrepresented in some sporadic HL cases, but the power was low and the results were therefore not conclusive. The function of NPAT in relation to NLPHL risk may be related to its interactions with the ataxia-telangiectasia mutated gene (ATM), known to be involved in leukemias and lymphomas (Taylor et al. 1996a). Of note, none of the members in this family had the KELCH (Salipante et al. 2009) mutation.

NCI investigators conducted another study using exome sequencing to examine 17 multiplex HL families and found an association with a single locus that replicated in an additional 48 families (Rotunno et al. 2016). The C > T missense mutation (rs56302315) at chromosome 4q12, located in the kinase insert domain receptor (KDR) gene, was found in one NHL patient and three HL patients in the offspring generation in one family, and in two HL patients in two different generations in the second family (Rotunno et al. 2016). KDR is one of two receptors for vascular endothelial growth factor (VEGF) and mediates induction of endothelial proliferation, survival, and migration. It is expressed in endothelial cells in the liver and spleen among other sites, has been implicated in progression of a number of solid tumors including esophageal, head and neck, and other cancers, and presumably facilitates metastasis. Several additional mutations were found in individual families but none replicated (Rotunno et al. 2016). There were no shared variants identified in the previously reported KELCH (Salipante et al. 2009), ACAN (Ristolainen et al. 2015) and NPAT (Saarinen et al. 2011) high-risk genes.

HLA Type as a Genetic Risk Factor

Overview

The HLA region, located at chromosome 6p21.3, is the most gene-dense region of the entire human genome and includes the genes that encode the HLA molecules, antigen processing genes such as TAP1 and TAP2, and other immune response genes such as TNFα. The highly polymorphic HLA receptors present antigen processed from intracellular pathogens, phagocytized extracellular pathogens, endogenous degraded material, tumor antigens, or environmental antigens (such as those causing allergic reactions). HLA class I receptors include A, B, and C types, are expressed on every nucleated cell in the body, and present processed antigen to CD8+ T-cells. HLA class II receptors include DR, DQ, and DP, which are expressed only on professional antigen-presenting cells including dendritic cells, monocytes/macrophages, and B-cells and present antigen to CD4+ helper T-cells. The HLA class III region is located between the class I and class II regions and contains genes encoding some other proteins involved in the immune response. The HLA receptors include a single or heterodimer chain and a binding pocket with variable amino acid sequences; they are the most highly variable set of proteins in the organism, conferring variation in antigen-specific binding, presentation, and subsequent T-cell signaling, permitting differential immune response to the antigenic stimulation that varies across individuals. There are over 2600 possible alleles representing up to 17 HLA class I and II genes. The DNA sequences encoding the series of HLA class I, II, and III immune genes are contained within haplotype blocks in very long sequences, sometimes over 1,000,000 DNA base pairs, and are thus highly linked, making it difficult to determine the actual causal set of variants within the haplotype. Furthermore, because of geographic-specific natural selection, genetic drift, and even differential mating preferences, HLA haplotypes vary by genetic origin, and they must be evaluated within ethnic strata. In addition, because of the immense variability of genetic haplotypes and the molecular structure of the various receptors present in any individual, large sample sizes are needed to identify more specific haplotype associations.

HLA alleles were originally studied using antibodies formed against the major HLA molecules with serology and have subsequently been examined using genetic sequencing techniques correlating DNA polymorphisms to HLA receptor proteins. HLA phenotypes and more recently genotypes emerged as early candidates to explain the obvious heritability of HL. An excellent and detailed review of this topic is presented by McAulay and Jarrett (2015). A summary of the major results for HLA class I and class II associations is presented below.

HLA Class I

Because HLA class I proteins present antigen processed from intracellular pathogens such as viruses and because it is suspected that a viral infection may be a trigger in the development of at least some HL, it could be presumed that HLA class I alleles would be more strongly associated than class II alleles. Indeed, the most striking initial finding was an association between HLA-A1 alleles and increased risk of HL, using the serological techniques available prior to sequencing technology (Hafez et al. 1985). The association with the HLA-A*01 alleles has been replicated using both serological and genotyping techniques in nine case-control and secondary analytic studies targeting HLA (Johnson et al. 2015; Hafez et al. 1985; Falk and Osoba 1974; Svegaard et al. 1975; Kissmeyer-Nielsen et al. 1975; Niens et al. 2007; Hjalgrim et al. 2010; Huang et al. 2012a; Hansen et al. 1977), with five of these studies showing that risk was specific for MCHL (Kissmeyer-Nielsen et al. 1975) or EBV-positive HL (Johnson et al. 2015; Niens et al. 2007; Hjalgrim et al. 2010; Huang et al. 2012a). Other HLA class I alleles associated with increased HL risk include HLA-B*05 (Falk and Osoba 1974; Svegaard et al. 1975), HLA-B*08 (Falk and Osoba 1971, 1974; Svegaard et al. 1975; Kissmeyer-Nielsen et al. 1975; Bjorkholm et al. 1975), and HLA-B*37 (Johnson et al. 2015; Huang et al. 2012a; Greene et al. 1979), while HLA-A*02 (Niens et al. 2007; Hjalgrim et al. 2010; Huang et al. 2012a), HLA-A*03 (Falk and Osoba 1974; Enciso-Mora et al. 2010) and HLA-A*11 (Svegaard et al. 1975; Falk and Osoba 1971) have been associated with decreased risk. In more detailed examinations, HLA-A*02 was specifically associated with a decreased risk of EBV-positive but not EBV-negative HL (Niens et al. 2007; Hjalgrim et al. 2010). One study showed that the HLA-A*02:07 allele, common among Chinese, was associated with a decreased risk of EBV-negative HL and an increased risk of EBV-positive HL (Huang et al. 2012b). Using microsatellite markers in a case-family control study in the Netherlands, Diepstra and colleagues (Diepstra et al. 2005) identified two microsatellite markers in the HLA class I region, D6S265 and D6S510, that conferred a high risk of EBV-positive HL. The location of these markers corresponds to the HLA-A*01 and HLA-A*02 alleles mentioned above (McAulay and Jarrett 2015).

A decade later, technology permitted agnostic genome-wide association studies (GWAS) that examined genetic variation in markers spaced across the genome that could be imputed utilizing known linkage disequilibrium to include numbers of markers larger by orders of magnitude. The variation in the markers (single nucleotide polymorphisms, SNPs) among cases is compared to that in controls, adjusting for ancestry because of the differential linkage in different ancestral populations. The p-value threshold for significance is roughly 0.05 divided by the total number of independent blocks of linked genes in Europeans, thought to be 1,000,000; thus the genome-wide significance threshold is usually set at 5 × 10⁻⁸. This strategy has been very successful in identifying much of the genetic contribution for some conditions and traits, including prostate cancer (Olama Al et al. 2014) and multiple sclerosis (International Multiple Sclerosis Genetics Consortium et al. 2011). For rare diseases like HL, it can be challenging to compile the very large numbers of subjects needed to find significant associations. In addition, in order to find important causal variants without the dampening effect of misclassification, GWAS should be conducted separately for HL subtypes (EBV/age/histology), decreasing available numbers further. When significant associations are identified, the genetic variants must be validated by a more direct genotyping method like quantitative real-time PCR and confirmed in an independent set of cases and controls.
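The arithmetic behind the genome-wide significance threshold described above is a Bonferroni-style correction: a nominal significance level of 0.05 divided by roughly 1,000,000 independent tests. A minimal sketch follows; the function names are hypothetical, and the two example p-values are those quoted in this chapter for rs6903608 (overall cHL in the first GWAS, and MCHL in the later meta-analysis).

```python
def genome_wide_threshold(alpha=0.05, n_independent_tests=1_000_000):
    """Bonferroni-style threshold: nominal alpha divided by the number of tests."""
    return alpha / n_independent_tests


def is_genome_wide_significant(p_value, threshold=None):
    """True if p_value falls below the genome-wide threshold."""
    if threshold is None:
        threshold = genome_wide_threshold()
    return p_value < threshold


threshold = genome_wide_threshold()
print(f"Genome-wide threshold: {threshold:.0e}")

# p-values quoted in the text for rs6903608.
print(is_genome_wide_significant(2.84e-50))  # cHL overall
print(is_genome_wide_significant(0.19))      # MCHL
```

The stringency of this threshold is one reason why splitting a rare disease into subtypes, as recommended above, makes adequately powered studies harder to assemble.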

Four main GWAS of HL were conducted from 2010 to 2014, two in Europe (Enciso-Mora et al. 2010; Urayama et al. 2012) and two in the USA (Best et al. 2010; Cozen et al. 2012), each combining subjects from other studies. (The larger US GWAS was first published together with the smaller one (Best et al. 2010) in a meta-analysis (Cozen et al. 2012).) In spite of the challenges, a small number of highly associated risk variants (“low-hanging fruit”) were identified, probably because of the strong heritability of HL.

HLA class II associations were observed in all of these studies (see below); however, only one European study (Urayama et al. 2012) identified risk loci in the HLA class I region, specifically associated with EBV-positive and not EBV-negative HL. These loci were confirmed in a meta-analysis that added an additional 22 cases (Cozen et al. 2014). Of note, the risk estimates for these HLA-A loci were both >2.0, among the highest of the GWAS associations. HLA typing was available for the majority of the cHL patients (257 EBV-positive and 642 EBV-negative patients), and the two HLA class I variants were in strong linkage with the previously reported HLA-A*01 (rs2734986) and HLA-A*02 (rs6904029) associations (Hjalgrim et al. 2010).

HLA Class II

HLA class II alleles are more complex than HLA class I because there are two chains (α and β) compared to the one chain for HLA class I, and for one group, the DRB genes, the number of genes is variable between individuals and can include a DRB3, DRB4, and/or DRB5, in addition to the universal DQ and DP. Examination of HLA class II alleles began in earnest in the 1990s, mainly in the UK, with the advent of molecular typing. Seven publications from several groups with multiple studies per group found that HLA-DPB1*03:01 was associated with an increased risk of HL (Johnson et al. 2015; Oza et al. 1994; Alexander et al. 2001; Taylor et al. 1996b, 1999; Tonks et al. 1992; Klitz et al. 1994), with a relative risk of approximately 1.95 in the largest, a multinational study of 741 cases (Oza et al. 1994). Cases with EBV-positive tumors were slightly more likely to be positive for HLA-DPB1*03:01 compared to cases with EBV-negative tumors (Johnson et al. 2015; Alexander et al. 2001). The lone US study found an elevated risk of NSHL associated with the HLA-DPB1*03:01 allele (Klitz et al. 1994); however, no study examined the allele by both histology and EBV status together. Asian patients from Japan and Taiwan were included in a multinational study, and among these patients, the allele DPB1*04:01 was associated with a decreased risk of cHL (Oza et al. 1994). DRB1*15:01 was associated with an increased risk of NSHL in 2 studies (Harty et al. 2002; Klitz et al. 1994), including a study of 16 multiplex families (Harty et al. 2002). These two studies also found that DQB1*06:02 was a risk allele for NSHL and that a haplotype including both DRB1*15:01 and DQB1*06:02 accounted for the same level of risk as either allele alone, possibly explained by tight linkage (Harty et al. 2002; Klitz et al. 1994). A third study modeled previously generated data from the UK and found that DRB1*15:01 and DPB1*01:01 were associated with a decreased risk of EBV-positive cHL, the opposite of the effect seen for NSHL (Johnson et al. 2015). Finally, a study conducted in the Netherlands found an inverse association between DRB1*07:01 and cHL (Huang et al. 2012a); the association was confirmed in the UK modeling study, with a suggestion of a stronger inverse association with NSHL (Johnson et al. 2015).

The importance of the HLA class II region was confirmed in the agnostic GWAS mentioned above. In the first published GWAS, a SNP at 6p21.3 (rs6903608) was the locus most significantly associated with cHL, with an OR of 1.7 and a p-value of 2.84 × 10⁻⁵⁰ in the combined discovery and replication sets (Enciso-Mora et al. 2010). The SNP is located near the DRA and DRB1 gene regions (usually assigned to DRA because of proximity). This SNP was confirmed in a combined meta-analysis of the other three GWAS, where it was strongly associated with NSHL (p = 1 × 10⁻²⁶), EBV-negative cHL (p = 7 × 10⁻³³), and NSHL in young adults from 15 to 35 years old (p = 6 × 10⁻²⁷), but not with MCHL (p = 0.19) or EBV-positive cHL (p = 0.68) (Cozen et al. 2014). The UK HLA modeling study also confirmed that rs6903608 was the best predictor of the risk of EBV-negative cHL, accounting for all other common risk alleles (Johnson et al. 2015). In a study combining the two US GWAS (Cozen et al. 2012), a protective haplotype of 5 SNPs mapping to DRB1*07:01 was associated with a ~50% decreased risk of (mainly EBV-negative) NSHL, similar to results in other studies (Johnson et al. 2015; Huang et al. 2012a). Finally, a SNP located near HLA-DPB1 (rs6457715), identified by one of the European GWAS, was linked to EBV-positive HL independent of histological subtype (Delahaye-Sourdeix et al. 2015).

The functional significance of the strong HLA associations observed for cHL has yet to be determined, but there are some plausible hypotheses. The increased risk of EBV-positive cHL associated with HLA-A*01:01 may be due to a poor immune response to EBV infection, through either weak binding or downstream signaling, whereas the HLA-A*02 allele, which is protective, may be associated with a stronger immune response resulting in better control of the EBV infection (McAulay and Jarrett 2015; Kushekhar et al. 2014). Interestingly, some of the HLA associations have opposite effects for EBV-positive and EBV-negative/NSHL (Johnson et al. 2015; McAulay and Jarrett 2015). That, combined with the weaker association of IM with EBV-negative HL than with EBV-positive HL (Hjalgrim et al. 2007a), suggests that HLA alleles associated with an effective immune response to EBV may protect against the consequences of a poorly controlled EBV infection, which include IM and EBV-positive HL.

There is less evidence for a mechanism that explains the association between HLA class II types and risk of EBV-negative HL and NSHL, but one possibility is a differential ability to present tumor antigens and a propensity for a strong CD4+ TH2 response in the microenvironment that generally promotes tumor cell survival (Kushekhar et al. 2014). Of note, HLA class II alleles and haplotypes have been among the strongest genetic associations identified in GWAS of B-cell malignancies, including CLL (Conde et al. 2010), follicular lymphoma (Smedby et al. 2011), and multiple myeloma (MM) (Beksac et al. 2016).

HL has also been linked to the blood group B antigen, another cell surface receptor (Mack et al. 2015).

Non-HLA Genetic Associations

A large number of genetic associations in non-HLA genes have also been identified and have been classified as relating to the immune response, carcinogen metabolism, DNA repair, or folate metabolism (Sud et al. 2017). A comprehensive review focused on published reports of 13 genetic variants in 10 genes and included meta-analyses of published results and false-positive report probabilities (Sud et al. 2017). The authors concluded that the only risk SNP that maintained a significant association after these considerations was rs17655 located in the DNA repair gene XPG/ERCC5 (meta-analysis OR = 2.05, p = 0.046). This meta-analysis highlights the importance of achieving adequate power to appropriately measure the association between polymorphic variants and an increased risk of HL in candidate gene association studies. Additional significant variants in immune-related and DNA repair genes were found by including GWAS data, but these were interpreted with caution.
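To illustrate why a nominally significant candidate-gene result can still be fragile, the sketch below computes a false-positive report probability (FPRP) using the widely used formulation FPRP = α(1 − π) / [α(1 − π) + (1 − β)π], where α is the observed significance level, 1 − β is the statistical power, and π is the prior probability of a true association. Only the p-value of 0.046 comes from the text above; the power of 0.8 and the prior probabilities are hypothetical values chosen for illustration and are not the inputs used by Sud et al. (2017).

```python
def fprp(alpha: float, power: float, prior: float) -> float:
    """Return the false-positive report probability.

    alpha : observed significance level (p-value) of the association
    power : statistical power (1 - beta) to detect the assumed effect
    prior : prior probability that the association is real
    """
    if not (0.0 <= alpha <= 1.0 and 0.0 < power <= 1.0 and 0.0 < prior <= 1.0):
        raise ValueError("alpha must be in [0, 1]; power and prior in (0, 1]")
    false_positive_mass = alpha * (1.0 - prior)  # null is true, test is 'significant'
    true_positive_mass = power * prior           # association is real and detected
    return false_positive_mass / (false_positive_mass + true_positive_mass)


# p = 0.046 is taken from the text; power and priors below are assumptions.
observed_p = 0.046
assumed_power = 0.8
for assumed_prior in (0.25, 0.10, 0.01, 0.001):
    value = fprp(observed_p, assumed_power, assumed_prior)
    print(f"prior = {assumed_prior:<6} FPRP = {value:.3f}")
```

Under these assumed inputs, the FPRP rises steeply as the prior probability falls, which is one reason a borderline p-value from a candidate-gene study is usually read with caution.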

Harty et al. identified an Ile333Val polymorphism of TAP1, an antigen processing gene located within the HLA III region, associated with familial NSHL in the multiplex family study described above (Harty et al. 2002). A microsatellite marker in the same region was also associated with EBV-negative HL in a case-control study from the Netherlands (Diepstra et al. 2005).

Seven additional genome-wide significant associations with risk loci outside of the HLA region have also been identified: 2p16, 3p24, 5q31, 6q23, 8q24, 10p14, and 19p13.3 in or near the genes REL, EOMES, IL13, HBS1L-MYB, PVT1, GATA3, and TCF3, respectively (Enciso-Mora et al. 2010; Urayama et al. 2012; Cozen et al. 2014; Frampton et al. 2013). The effect sizes associated with these loci were weaker than those observed for the HLA alleles, ranging from 19 to 50% increases in risk. There was heterogeneity by subtype, with variants in REL, EOMES, IL13, and GATA3 linked to EBV-negative/NSHL and AYAHL risk, and there was a suggestion of a similar pattern for the genetic variant in TCF3. In contrast, the loci at HBS1L-MYB and PVT1 had similar effects and p-values across all cHL subtypes.

There is biological plausibility for at least some of these GWAS-identified loci. REL encodes c-REL, a member of the NFκB family of proteins that facilitate activation of the NFκB pathway. NFκB is expressed in HRS cells and promotes proliferation while suppressing apoptosis (Kuppers 2009). The NFκB pathway can be activated by CD40 or alternatively by the expressed EBV genes LMP1 or LMP2, but since these risk variants did not appear to be associated with EBV-positive cHL, the significance of EBV as an activator of the pathway is questionable. EOMES encodes the protein eomesodermin, which regulates embryological limb development as well as lymphocyte effector function. It is a member of the TBR1 family of T-box genes and interacts with TBET (TBX21) to promote CD8+ T-cell differentiation and function, necessary for an effective tumor response in cHL (Zhang et al. 2011). The highly variable 8q24 region, a gene desert, harbors a SNP (rs2019960) that is similarly associated across HL subtypes. This region has been associated with risk of multiple cancer types and is thought to interact with and regulate MYC (Grisanzio and Freedman 2010). The cHL variant is not in the same LD block as the loci associated with other cancers, but it may have a similar functional significance with respect to interaction with MYC, a master oncogene. The SNP is located near PVT1, a gene that regulates MYC expression, although the role of MYC in HL pathogenesis is unclear.

More interesting is the missense SNP (rs20541) located in the interleukin-13 (IL13) gene at chromosome 5q31, moderately associated with EBV-negative HL and NSHL (Urayama et al. 2012; Cozen et al. 2014). The SNP results in an amino acid change from glutamine to arginine, resulting in increased IL13 expression. Another SNP (rs2069757) at the same locus is even more strongly associated, primarily with EBV-negative HL. IL13 is a TH2 cytokine that is associated with increased IgE levels and atopy and with increased production of collagen and sclerosis (wound-healing response) (Wynn 2008). It is also expressed by HRS cells, and the protein is sometimes observed in the nonmalignant tumor microenvironment (Skinnider and Mak 2002). IL13 could function in a number of ways, including promoting TH2-mediated immune suppression of cytotoxic T-cell activation, producing sclerosis, and/or acting as an autocrine growth factor for HRS cells (Kushekhar et al. 2014). Genetic variants in the TH2 transcription factor GATA3 at chromosome 10p14 are also associated with the EBV-negative NSHL subset, especially in young adults (Cozen et al. 2014). Like IL13, GATA3 is expressed both in the HRS cells and in the T-cells in the HL microenvironment and can promote a TH2, T-cell exhausted environment (Steidl et al. 2011; Greaves et al. 2013).

In the largest GWAS meta-analysis, which combined 3 of the 4 previous GWAS for a total of 1816 HL cases and 7877 controls, a novel risk locus (rs1860661) was found on chromosome 19 in intron 2 of the TCF3/E2A gene (Cozen et al. 2014). The minor (G) allele was significantly correlated with increased gene expression in controls, and a mutation at the same locus was found in one HL cell line. TCF3/E2A expression promotes stability of the B-cell phenotype; it is inhibited in HL tumor cells, resulting in the hallmark downregulation of the B-cell receptor and other essential B-cell markers (Mathas et al. 2006). Thus, evidence suggests that the G allele protects against HL by increasing TCF3/E2A expression, which stabilizes the B-cell phenotype. Similar to most of the other variants, the associations were mainly observed for EBV-negative, NSHL subtypes and not for EBV-positive subtypes.

In summary, a number of genome-wide significant loci have been identified for cHL, with the strongest associations in the HLA class II region, especially for the AYA, EBV-negative, NSHL subtype. In addition, there are several associations suggesting an interaction with EBV among patients with EBV-positive tumors. The association of HLA and TH2-related genetic risk variants with EBV-negative NSHL suggests that deficient TH1 surveillance in the setting of virally infected B-cells may have a role in susceptibility, but there is currently no candidate virus for this subset. It has been proposed that this same immunophenotype may enhance survival of HRS precursor cells (Zhang et al. 2011), but this seems further down the path of pathogenesis than risk and more related to progression. Thus, at present, the causal mechanisms for these common risk SNPs identified by GWAS remain unknown. For a comprehensive discussion of the risk mechanisms of the reported susceptibility alleles, see Kushekhar et al. (2014).

Genetic Overlap with Other Diseases

To date, two studies examined GWAS data for cHL in combination with GWAS for other diseases. Law et al. used a method that compares subsets of studies, accounting for shared controls and multiple testing, to examine commonalities in HL, CLL, and MM, all cancers of B-cell origin at different stages of development. Novel loci inversely or positively associated with the three B-cell tumors were identified. An allele in HLA class II DRB1 (Ser37+Phe37) was significantly positively associated with both HL and CLL (and inversely associated with MM), and an allele in HLA class II DQB1 (Gly70) was inversely associated with HL and positively with CLL. A variant at 3q22.2 (rs11715604) was also inversely associated with HL risk and positively associated with CLL risk. This SNP maps to the gene NCK1 which, among many other functions, increases T-cell proliferation and activation. In addition, risk loci for CLL (BAK1, IRF4), MM (ULK4), and both (TERC) were all positively associated with HL.

Because of the previously noted association between cHL and autoimmune disease, a meta-analysis of a cHL GWAS (1816 patients) and a multiple sclerosis (MS) GWAS (9772 patients), with a total of 25,255 controls, was conducted in order to identify common risk loci. Both diseases had many HLA region risk SNPs in common. Polygenic risk allele scores composed of the MS risk alleles explained ~4.5% of HL risk. In a genetic diseasome network, in which disease proximity was calculated from published GWAS data and plotted, HL was genetically much more closely aligned with autoimmune diseases than with solid tumors (meaning more SNPs and similar loci in common) (Fig. 8.6).

Fig. 8.6

Human disease network shows distinct autoimmune and solid cancer clusters and places hematologic cancers in context. In a network of disease proximity, constructed using systematic GWAS data, autoimmune diseases (purple) tightly cluster. Solid tumors (orange) also form a cluster but exhibit less genetic relatedness to HL compared to autoimmune diseases (from Khankhanian et al. 2016). Autoimmune diseases: AR alopecia areata, AS ankylosing spondylitis, Beh Behcet's disease, Cel celiac disease, CD Crohn's disease, GD Graves' disease, IGA IgA glomerulonephritis, KAW Kawasaki disease, MS multiple sclerosis, PBC primary biliary cirrhosis, PSA psoriatic arthritis, RA rheumatoid arthritis, PSQ sclerosing cholangitis, SLE systemic lupus erythematosus, SS systemic scleroderma, TID type 1 diabetes mellitus, UC ulcerative colitis, Vit vitiligo. Hematologic neoplasms: CLL chronic lymphocytic leukemia, HL Hodgkin lymphoma, MM multiple myeloma. Solid cancers: BCC basal cell carcinoma, BLC bladder carcinoma, BRC breast cancer, CNS central nervous system, OESC oesophageal carcinoma, LUA lung adenocarcinoma, LUC lung carcinoma, MEL melanoma, OVC ovarian carcinoma, PAC pancreatic carcinoma, PRC prostate carcinoma, RCC renal cell carcinoma, SCC squamous carcinoma, STC stomach carcinoma, THC thyroid carcinoma

Heritability

Heritability of a trait or phenotype is a theoretical statistical concept that represents the proportion of variation in the occurrence of the disease in a population that is explained by genetic variation. Assumptions about the contribution of environmental factors and chance undermine its accuracy, but it is a useful measure when comparing the genetic contribution between different diseases. Using GWAS data on 906 cases from Northern Europe with replication from the Swedish Family Cancer Registry, Thomsen and colleagues estimated heritability for all HL together at greater than 35%. About 20% of this heritability estimate is attributable to genetic variation in the HLA region. Another heritability analysis, also based on the Swedish Family Cancer Registry (without genetic data), estimated heritability at 28.4%.
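As a rough illustration of the definition above, heritability can be written as the share of total phenotypic variance attributable to genetic variance. The notation below is ours rather than that of the cited studies, and it assumes the simplest additive decomposition with no gene-environment covariance or interaction. The second line applies the reported figures, taking 35% as the lower bound of the heritability estimate and reading "about 20%" as a fraction of that estimate.

```latex
% Heritability as a variance ratio (simplifying assumption: \sigma^2_P = \sigma^2_G + \sigma^2_E)
h^2 = \frac{\sigma^2_G}{\sigma^2_P} = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_E}

% Illustrative arithmetic for the HLA contribution, using the figures quoted in the text
h^2_{\mathrm{HLA}} \approx 0.20 \times h^2 \approx 0.20 \times 0.35 = 0.07
```

On this reading, variation in the HLA region would account for roughly 7 percentage points of the total variation in HL occurrence, with the remainder of the heritable component spread across non-HLA loci.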

Summary

HL comprises multiple etiological entities that share fundamental immunological dysregulation and a rare neoplastic cell. There are both genetic and environmental determinants that vary by age at diagnosis, SES, histological subtype, and EBV prevalence in the tumor cells. It is difficult to separate the presence of EBV in the HRS cells from the characteristics of the microenvironment (histology) because they are correlated; nevertheless, there appear to be distinct determinants for each. NSHL/AYAHL is the most common type, varying greatly in place and time in relation to the social changes that accompany economic development in societies and increasing affluence in individuals. It is highly heritable and is linked to specific HLA class II alleles and genotypes and TH2-related genes. The behavioral factors responsible for the relatively sudden appearance of this condition at times of affluent social change are unknown; there are suggestions that a deficit of early childhood exposure to microbes may result in a susceptible immune response, whether specific or nonspecific. EBV-positive HL, like MCHL, is more common at the age extremes and in settings of social deprivation or acquired immunodeficiency. It appears to be associated with an ineffective response to EBV infection and with specific HLA class I alleles and genotypes. Both genetic and environmental risk factors tend to show opposite gradients for EBV-positive MCHL and EBV-negative NSHL. NLPHL is quite rare, also appears to be highly heritable, and is almost never associated with EBV. The other types are too rare to study separately in an epidemiological context.

What must we still learn about the epidemiology of HL? The most pressing epidemiological question is what causes the relatively rapid appearance of AYAHL/NSHL in response to increasing economic development. Could it be related to a general deficit of early childhood microbial exposures mediated by the microbiome, or to delayed exposure to an as yet unidentified specific virus? A related question is why IM is not associated with risk of AYAHL/NSHL, since both preferentially affect affluent adolescents. Another important question involves the relationship of EBV to HL. Could EBV be a marker for an immunosusceptible phenotype, or is it causal? Some of these important questions may be answerable in the future through large international collaborations providing sufficient sample size to study individual determinants and entities by EBV tumor status, age, and histopathology simultaneously.