Introduction

Type 2 diabetes mellitus (T2DM) is a lifestyle-related disease that affects approximately 382 million patients worldwide [1]. Recently, an association between T2DM and cancer risk, particularly hepatocellular carcinoma (HCC), has been reported. Several cohort studies in Japan revealed that the risk of liver cancer incidence was extremely high among individuals with a history of diabetes mellitus (hazard ratio [HR] 1.94, 95 % CI 1.65–2.36), even higher than that of pancreatic cancer (HR 1.85, 95 % CI 1.46–2.34) [2]. Among 820,900 participants from 97 prospective studies, diabetes mellitus was associated with death from cancer (HR 2.32, 95 % CI 2.11–2.56) with the highest HR (2.16, 95 % CI 1.62–2.88) seen for liver cancer [3]. Several clinical variables such as age, sex, and body mass index (BMI) are associated with HCC prevalence [4], and T2DM patients often exhibit risk phenotypes associated with these factors. However, T2DM patients with these clinical risk phenotypes do not often develop HCC; this observation indicates that genetic factors might influence HCC development in patients with T2DM.

Multiple reports indicate that susceptibility genes associated with comorbid diseases such as T2DM are also associated with various cancers. Notably, single-nucleotide polymorphisms (SNPs) that are reportedly associated with susceptibility to T2DM are also associated with prostate cancer [5] and colorectal cancer [6]. Therefore, we hypothesized that T2DM and HCC may be influenced by the same susceptibility genes. Many SNPs that are associated with T2DM have been identified and confirmed in multiple genome-wide association studies (GWASs) [7–18]. However, relationships between HCC development and these SNPs have not been examined.

In addition, GWASs have successfully identified the rs738409 SNP in patatin-like phospholipase domain-containing protein 3 (PNPLA3) (I148M) as being associated with nonalcoholic fatty liver disease (NAFLD)/nonalcoholic steatohepatitis (NASH) [19]. Furthermore, PNPLA3 variants are associated with not only hepatic fibrosis, but also NAFLD-related HCC in European Caucasians [20]. Because the prevalence of NAFLD/NASH is high in patients with T2DM [21], we reasoned that PNPLA3 genotype could influence HCC development in patients with T2DM. The aim of this study was to identify genetic variants that predispose T2DM patients to HCC by genotyping the SNP rs738409 in PNPLA3 and 50 other SNPs reportedly associated with T2DM.

Methods

Patients

We recruited 389 T2DM patients, 59 with HCC and 330 without HCC, from Kurume University Hospital, Hiroshima University Hospital, Aichi Medical University Hospital, Nara City Hospital, and Kohnodai Hospital in Japan. Each participant satisfied the following three inclusion criteria: (1) negative for HBs-Ag and anti-HCV Ab, (2) alcohol intake <60 g/day, and (3) disease duration of T2DM >10 years. The presence of T2DM was determined on the basis of fasting blood glucose levels >126 mg/dl or hemoglobin A1c (HbA1c) >6.5 % in accordance with the diagnostic criteria for diabetes mellitus, or by the use of anti-diabetic agents. To minimize the effects of population stratification on genetic associations, only individuals of Japanese ancestry were included in this study.

The 59 T2DM patients diagnosed with primary HCC were classified into the DM–HCC group. HCC was diagnosed by a combination of imaging procedures and tests for serum tumor markers such as alpha-fetoprotein and des-gamma-carboxy prothrombin. For 49 of the 59 cases, a surgical specimen or tumor biopsy was available, and in each case (49/49), the HCC diagnosis was confirmed histologically.

The 330 T2DM patients without HCC were classified into the DM–non-HCC group. DM–non-HCC patients were subsequently stratified based on liver stiffness measurements (LSMs) determined via transient elastography (Fibroscan, Echosens, Paris, France); this stratification was potentially significant because the DM–non-HCC group may include patients with liver cirrhosis who are at a higher risk of developing HCC in the future. Among the 330 DM–non-HCC patients, 198 showed no indication of advanced hepatic fibrosis and were classified as the DM-control group. The inclusion criteria for DM-control were as follows: (1) >65 years old and (2) LSM <7 kilopascals [22] (Fig. 1).

Fig. 1 Scheme of classification for the study group. The allele frequencies for each of the 51 SNPs that passed the filtering criteria were compared between the DM–HCC and DM–non-HCC groups and separately between the DM–HCC and DM-control groups

Written informed consent was obtained from each participant. This study was conducted in accordance with provisions of the 1975 Declaration of Helsinki and approved by the Institutional Ethics Committee of National Center for Global Health and Medicine (NCGM-A-000206) and each hospital participating in the study.

DNA preparation

Genomic DNA was extracted from peripheral blood lymphocytes via the phenol–chloroform DNA extraction method; purified DNA was resuspended in TE buffer. Each sample was stored at −20 °C until use.

Genotyping

The SNP rs738409 is a non-synonymous variant located in the third exon of PNPLA3; it was genotyped in each sample via TaqMan SNP genotyping assays (Applied Biosystems, Foster City, CA, USA) that were conducted with the LightCycler 480 Real-Time PCR System (Roche, Mannheim, Germany).

In addition, 58 other SNPs were selected as candidate SNPs for this study; each SNP has a documented association with T2DM in GWASs published during 2003–2010, and they are located in or near the following loci: EIF2AK4 [7], KRT4 [7], FAM60A [7], ANGPT4 [7], SPDEF [7], A2BP1 [7], Intergenic [7], GCKR [7], SLC30A8 [7–10], HHEX [7–10], CDKN2B [7–10], IGF2BP2 [7–10], CDKAL1 [7–10], TCF7L2 [7–9], KCNJ11 [7–10], PPARG [7–9], WFS1 [8], JAZF1 [8, 9], TSPAN8/LGR5 [8, 9], ADAMTS9 [8, 9], CDC123/CAMK1D [8, 9], NOTCH2 [8, 9], THADA [8, 9], FTO [8, 9], CALPN10 [9], HNF1A [9], VEGFA [9], BCL11A [9], HNF1B [7, 9], KCNQ1 [7, 11, 12], SRR [12], PTPRD [12], MAF/WWOX [12], ACACB [13], CACNA1D [13], CLIC5 [13], KCNJ15 [14], GCK1 [15], SLC12A3 [16], NCALD [17], and ELMO1 [18]. Each of the 58 SNPs was genotyped via a multiplex SNP typing assay (DigiTag 2 assay [23]). Of the 58 SNPs, eight were excluded from further analysis because of a low call rate (<0.95). None of the remaining 51 SNPs (including PNPLA3 rs738409) deviated significantly from Hardy–Weinberg equilibrium (HWE p < 0.001).

After the quality control filtering (i.e., SNP call rate ≥95 %, and HWE p value ≥0.001), the allele frequencies of the 51 SNPs (including PNPLA3 rs738409) were compared in this analysis.
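The two filters above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the paper does not state which software or which HWE test was used, so the Pearson χ² test with one degree of freedom is an assumption, and the function names (`hwe_p_value`, `passes_qc`) are hypothetical.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def hwe_p_value(n_aa, n_ab, n_bb):
    """Pearson chi-square test of Hardy-Weinberg equilibrium from genotype counts.

    Assumption: a 1-df chi-square test; an exact test may have been used instead.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: no deviation can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2_sf_1df(chi2)


def passes_qc(n_aa, n_ab, n_bb, n_samples, min_call_rate=0.95, min_hwe_p=0.001):
    """Apply the two thresholds quoted in the text: call rate >= 95 % and HWE p >= 0.001."""
    call_rate = (n_aa + n_ab + n_bb) / n_samples
    return call_rate >= min_call_rate and hwe_p_value(n_aa, n_ab, n_bb) >= min_hwe_p


# Illustrative genotype counts only; these are not data from the study.
print(passes_qc(100, 200, 100, n_samples=400))
```

A SNP whose genotype counts match Hardy–Weinberg proportions but that was called in fewer than 95 % of samples is rejected on call rate alone, which mirrors how the eight excluded SNPs were handled.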

Statistical analysis

Baseline data on 19 continuous variables for each of three groups (DM–HCC, DM–non-HCC, and DM-control) are shown in Table 1 with the mean value and standard deviation; p values were calculated using the Mann–Whitney U test. Categorical values are also shown in Table 1 with the observed number of samples and the percentage of observations (%); p values were calculated with the χ² test.

Table 1 Clinical characteristics

The statistical significance of associations between each of the 51 SNPs and HCC was assessed via the χ² test. To reduce false-positive results due to multiple testing, the statistical significance level was set at 0.001 (0.05/51). Subsequently, a multivariable logistic regression analysis was conducted; this analysis incorporated genotypes and biologically relevant covariates such as age, sex, and BMI that are each known to be associated with risk of progression to HCC [4].
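The threshold and the per-SNP test described above can be illustrated as follows. This is a sketch under stated assumptions: the paper does not say whether a continuity correction was applied or how the confidence intervals were derived, so the uncorrected Pearson χ² and the Woolf (log-scale) interval are assumptions, and `allelic_test` and `bonferroni_alpha` are hypothetical helper names.

```python
import math


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def allelic_test(case_risk, case_other, ctrl_risk, ctrl_other):
    """Allelic association test on a 2x2 table of allele counts.

    Returns (odds ratio, (lower, upper) 95 % CI, chi-square, p value).
    Assumptions: Pearson chi-square without continuity correction (1 df)
    and a Woolf confidence interval on the log odds ratio.
    """
    a, b, c, d = case_risk, case_other, ctrl_risk, ctrl_other
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square upper tail, 1 df
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, (lower, upper), chi2, p_value


alpha = bonferroni_alpha(0.05, 51)  # about 0.00098, reported as 0.001

# Illustrative allele counts only; these are not data from the study.
odds_ratio, ci, chi2, p_value = allelic_test(60, 40, 40, 60)
print(odds_ratio, ci, chi2, p_value, p_value < alpha)
```

With 51 tests, a nominal p value of 0.03 or 0.04 does not clear the corrected threshold, which is why associations at that level are described as marginal in the Results.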

Results

Clinical characteristics

DM–HCC patients were significantly older than DM–non-HCC patients (76.2 ± 7.5 years for DM–HCC and 70.0 ± 10.9 years for DM–non-HCC, p < 0.0001), and there were significantly more males in the DM–HCC group than in the DM–non-HCC group (p = 0.0336). The clinical characteristics related to liver injury, such as white blood cell count (WBC), platelet count (PLT), serum albumin (Alb), aspartate aminotransferase (AST), alkaline phosphatase (ALP), gamma-glutamyltranspeptidase (GGT), total bilirubin (T-Bil), FIB-4 index, LSM, ferritin, and hyaluronic acid, were each significantly different between the DM–HCC and DM–non-HCC groups. There were no differences in anti-HBc Ab between the groups. No significant difference was observed in the use of sulfonylurea or exogenous insulin (sulfonylurea: p = 0.33, insulin: p = 0.28). The proportion of patients using metformin was significantly lower in the DM–HCC group than in the DM–non-HCC group (OR 0.32, 95 % CI 0.14–0.74, p = 0.0051). When DM–HCC was compared with DM-control instead of DM–non-HCC, a similar tendency was observed (Table 1).

The association between HCC development and the SNPs examined in this study

Each of the 58 SNPs with documented associations with T2DM was genotyped in each of the 389 Japanese T2DM patient samples. An additional SNP, PNPLA3 rs738409, which is associated with severity of nonalcoholic fatty liver disease in Japanese patients, was also genotyped in the same set of 389 samples. Quality control filtering (i.e., SNP call rate ≥95 %, and HWE p value ≥0.001) identified 51 SNPs (including PNPLA3 rs738409) that fulfilled the filtering criteria; the allele frequencies of these 51 SNPs were used for comparisons between the DM–HCC and DM–non-HCC groups and separately between the DM–HCC and DM-control groups (Supplementary Table 1).

PNPLA3 rs738409 showed the strongest association of all 51 SNPs with the presence of HCC in patients with T2DM (Table 2). The allele frequency of the PNPLA3 rs738409 G allele was significantly higher among DM–HCC individuals than among DM–non-HCC individuals (unadjusted OR 2.53, 95 % CI 1.66–3.87, p = 1.05 × 10⁻⁵). Of the other 50 SNPs, rs3785233 (which is located in an intron of A2BP1) and rs7754840 (which is located in an intron of CDKAL1) each showed a marginal association with the presence of HCC (unadjusted OR 1.63, 95 % CI 1.01–2.61, p = 0.0422 for A2BP1; unadjusted OR 1.54, 95 % CI 1.03–2.31, p = 0.0336 for CDKAL1).

Table 2 The SNPs associated with HCC in T2DM patients

The OR calculated for PNPLA3 rs738409 allele frequencies in the comparison between the DM–HCC and DM-control groups (unadjusted OR 3.22, 95 % CI 2.07–5.01, p = 1.01 × 10⁻⁷) was higher than that calculated for the comparison between the DM–HCC and DM–non-HCC groups. Likewise, the OR calculated for A2BP1 rs3785233 allele frequencies in the comparison between the DM–HCC and DM-control groups (unadjusted OR 1.85, 95 % CI 1.11–3.08, p = 0.0166) was higher than that calculated for the comparison between the DM–HCC and DM–non-HCC groups.

The impact of JAZF1 on HCC development in patients homozygous for the PNPLA3 G allele with T2DM

A significant association of PNPLA3 with HCC was observed not only with an allelic model but also with a recessive model (unadjusted OR 3.73, 95 % CI 2.02–6.88, p = 1.33 × 10⁻⁵) (Table 2). However, 43 patients homozygous for the PNPLA3 G allele were in the DM-control group; these individuals each had an LSM <7 kPa and had not developed HCC. To further clarify the genetic factors that might predispose individuals to develop HCC, we carried out association tests that only involved individuals homozygous for the PNPLA3 G allele; specifically, we compared PNPLA3 G homozygotes who developed HCC to those who did not (Supplementary Table 2). We found that the SNP rs864745, which is located in an intron of the juxtaposed with another zinc finger 1 gene (JAZF1), showed a significant association with the presence of HCC both in a comparison between DM–HCC and DM–non-HCC PNPLA3 G homozygotes (unadjusted OR 3.44, 95 % CI 1.77–6.71, p = 0.0002) and in a comparison between DM–HCC and DM-control PNPLA3 G homozygotes (unadjusted OR 6.06, 95 % CI 2.48–14.83, p = 2.44 × 10⁻⁵) (Table 3). Additionally, rs4523957 and rs391300, which are located within separate introns of serine racemase (SRR), each showed a marginal association with HCC in the comparison between the DM–HCC and DM-control groups (unadjusted OR 4.34, 95 % CI 1.67–11.31, p = 0.0015 for rs4523957; unadjusted OR 3.24, 95 % CI 1.30–8.05, p = 0.0024 for rs391300). These SRR SNPs, rs4523957 and rs391300, were in high linkage disequilibrium (r² = 0.83) with each other. The associations of the JAZF1 SNP and of the SRR SNPs with HCC were stronger for the DM–HCC and DM-control comparisons than for the DM–HCC and DM–non-HCC comparisons.
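The difference between the allelic and recessive models mentioned above lies only in how the three genotype counts are collapsed into a 2×2 table. The sketch below shows both collapses for a C>G SNP; the function names are hypothetical and the counts are invented for illustration, not taken from Table 2.

```python
def allelic_table(cases, controls):
    """Collapse (n_CC, n_CG, n_GG) genotype counts into allele counts.

    Returns (case G, case C, control G, control C).
    """
    def alleles(genotypes):
        n_cc, n_cg, n_gg = genotypes
        return (2 * n_gg + n_cg, 2 * n_cc + n_cg)

    return alleles(cases) + alleles(controls)


def recessive_table(cases, controls):
    """Collapse genotype counts into GG carriers versus all other genotypes.

    Returns (case GG, case CC+CG, control GG, control CC+CG).
    """
    def split(genotypes):
        n_cc, n_cg, n_gg = genotypes
        return (n_gg, n_cc + n_cg)

    return split(cases) + split(controls)


def odds_ratio(table):
    """Odds ratio of a 2x2 table given as (a, b, c, d)."""
    a, b, c, d = table
    return (a * d) / (b * c)


# Invented genotype counts (n_CC, n_CG, n_GG), for illustration only.
cases = (10, 20, 30)
controls = (30, 20, 10)
print(odds_ratio(allelic_table(cases, controls)))    # allelic model: counts alleles
print(odds_ratio(recessive_table(cases, controls)))  # recessive model: counts GG individuals
```

The allelic model counts each chromosome once, so every individual contributes two observations, whereas the recessive model counts individuals and asks only whether they carry two copies of the G allele.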

Table 3 The SNPs associated with HCC among T2DM patients homozygous for the PNPLA3 G allele

Several other SNPs, which were located in ANGPT4, CDKN2B, or TCF7L2, showed minimal association with HCC in the comparison between the DM–HCC and DM–non-HCC groups.

Multivariate logistic regression analysis for PNPLA3 and JAZF1

A multivariate logistic regression analysis was performed to identify any potential confounding non-genetic factors affecting HCC progression in patients with T2DM. Sex, age, and BMI were included along with the PNPLA3 G allele in this analysis (Table 4). When comparing the DM–HCC and DM–non-HCC groups, three factors (PNPLA3 G allele, sex, and age) were each identified as independent risk factors for HCC (PNPLA3: adjusted OR 2.47, 95 % CI 1.60–3.82, p = 4.70 × 10⁻⁵; sex: adjusted OR 2.16, 95 % CI 1.16–4.04, p = 0.0153; age: adjusted OR 1.08, 95 % CI 1.04–1.13, p = 3.47 × 10⁻⁵). Interestingly, in the comparison between the DM–HCC and DM-control groups, a stronger association between the PNPLA3 G allele and HCC was observed (PNPLA3: adjusted OR 2.83, 95 % CI 1.81–4.42, p = 4.73 × 10⁻⁶).
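The adjusted OR for age looks small next to the genetic effect because it applies to a single unit of the covariate. As a rough reading aid, and assuming that age entered the model as a continuous, linear term in years (the text does not state this explicitly), a per-unit OR can be rescaled to a larger difference by exponentiation. The helper name `rescale_or` is hypothetical.

```python
import math


def rescale_or(or_per_unit, units):
    """Odds ratio for a difference of `units` in a covariate.

    In a logistic model the log odds are linear in the covariate, so the
    per-unit odds ratio is raised to the power of the difference.
    """
    return math.exp(units * math.log(or_per_unit))


# Reported adjusted OR for age, assumed here to be per year of age.
or_per_year = 1.08
or_per_decade = rescale_or(or_per_year, 10)
print(round(or_per_decade, 2))  # about 2.16 per 10 years under this assumption
```

Under this assumption, a 10-year age difference corresponds to roughly a doubling of the odds, although the figure inherits the rounding of the published per-unit estimate and should be read as approximate.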

Table 4 Multivariate analysis of the effect of the PNPLA3 G allele on HCC risk

Among patients with T2DM who were homozygous for the PNPLA3 G allele, the association of the JAZF1 G allele with HCC was independent of three potentially confounding factors (age, sex, and BMI) in both multivariate logistic regression analyses: the one comparing the DM–HCC and DM–non-HCC groups (adjusted OR 3.38, 95 % CI 1.64–6.95, p = 0.0009) and the one comparing the DM–HCC and DM-control groups (adjusted OR 4.38, 95 % CI 1.81–10.60, p = 0.0011) (Table 5).

Table 5 Multivariate analysis of the effect of the JAZF1 G allele on HCC risk among T2DM patients homozygous for the PNPLA3 G allele

Even when the presence of cirrhosis was included as a factor in the multiple logistic regression analyses to adjust for the influence of liver fibrosis, the PNPLA3 G allele and the JAZF1 G allele remained independent risk factors for HCC in this cohort (Table 6). In addition, multiple logistic regression analyses that included platelet count instead of the presence of cirrhosis showed that the associations of PNPLA3 and JAZF1 with HCC were also independent of platelet count (Supplementary Tables 3, 4). These associations were also independent of the use of anti-diabetic medications (Supplementary Tables 5, 6).

Table 6 Multivariate analysis for PNPLA3 or JAZF1 including the presence of cirrhosis

Discussion

In 2007, the Japan Society of DM reported that malignancies were the most frequent cause of death (34.1 %) among 18,385 patients with DM during 1991–2000, and that among the malignancies, HCC showed the highest frequency (8.6 %). Surprisingly, the frequency of deaths caused by liver cirrhosis was 4.7 %; cumulatively, 13.3 % of these patients with DM died of liver diseases. Notably, neither the incidence of HBV or HCV infection nor the quantity of alcohol intake was reported in that study [24]. Subsequently, Shima et al. concluded that the majority of liver injuries in Japanese DM patients were associated with NAFLD/NASH [21], and NASH is currently considered to be a leading cause of non-HBV- and non-HCV-related HCC.

In 2008, Romeo et al. reported an association between the risk of liver fat accumulation and the PNPLA3 SNP rs738409 [19]. Subsequently, it was reported that the PNPLA3 variant was significantly associated with the severity of NAFLD [25, 26] and alanine aminotransferase (ALT) levels [19, 27]; additionally, the association of PNPLA3 was validated in different populations around the world, from children to adults [28, 29]. In addition, this PNPLA3 SNP is associated with HCC in European Caucasian patients with NAFLD [20, 30], but an association between PNPLA3 and HCC was not replicated in a study of Japanese patients with NAFLD [31].

Here, we found that the rs738409 C>G polymorphism was significantly associated with HCC, even in our restricted cohort of Japanese patients with T2DM. This effect was independent of potentially confounding factors including age, sex, BMI, and even the presence of cirrhosis. We believe that NASH may have been a confounding factor in our study because T2DM was present in 59 % of the patients with NASH who had developed HCC in a cross-sectional multicenter study in Japan [32]. However, only 24.5 % (12/49) of the DM–HCC patients who had provided a liver biopsy specimen were diagnosed with steatohepatitis based on histology in our current study. Thus, larger prospective cohort studies should be conducted for further examination of associations of PNPLA3 with HCC in Japanese patients with NAFLD/NASH.

A previously unidentified susceptibility SNP for HCC was identified by stratifying our cohort by PNPLA3 genotype. The analysis involving only individuals with the PNPLA3 GG genotype showed that the SNP rs864745 located in JAZF1 increases susceptibility to HCC among patients with T2DM who are homozygous for the PNPLA3 G allele. The frequency of the JAZF1 rs864745 G allele was significantly higher among DM–HCC individuals than among DM–non-HCC or DM-control individuals, and this association was independent of age, sex, BMI, and even the presence of cirrhosis.

The SNP rs864745 resides within intron 1 of JAZF1, which encodes a transcriptional repressor [33]. JAZF1 is expressed in the pancreas, and expression of JAZF1 is downregulated in patients with T2DM [34]. Notably, JAZF1 rs864745 variants influence traits of insulin secretion [33, 35]. According to these reports, JAZF1 rs864745 is associated with susceptibility to T2DM because it is associated with a lower acute insulin response, and insulin is considered to be a background factor that influences the onset and progression of cancer [36, 37].

Here we found that the JAZF1 rs864745 G allele is a risk factor for development of HCC. The studies that identified the association of this SNP with T2DM also show that the A allele is associated with T2DM. Although our hypothesis was that risk alleles for HCC would be the same as those for T2DM, our results showed that the JAZF1 A allele, which confers susceptibility to T2DM, reduced the risk of HCC development in patients with the PNPLA3 GG genotype who had T2DM. The SNP rs864745 in JAZF1 also influences susceptibility to colorectal cancer [6]. Interestingly, the A allele of rs864745, which is associated with increased risk for T2DM, is also associated with a decreased risk of colorectal cancer; this pattern is similar to the pattern of association between JAZF1 and HCC found in this study. JAZF1 also reportedly influences susceptibility to prostate cancer [38].

Several additional T2DM susceptibility loci showed a trend towards significant association with HCC. Associations between HCC and variants located in A2BP1 (rs3785233) or SRR (rs4523957 and rs391300) were stronger when DM–HCC patients were compared with DM-control patients instead of DM–non-HCC patients. A large-scale clinical study is required to confirm or refute these associations because these loci may be candidate low-penetrance genes for susceptibility to HCC in Japanese patients with T2DM.

The current study provides a proof of concept for our approach to searching for susceptibility genes that influence HCC development; specifically, the list of known candidate genes involved in the pathogenesis of comorbid T2DM is probably highly enriched for genes that confer susceptibility to HCC in patients with T2DM. A similar approach could be useful for analyses of other diseases, especially when the incidence of certain cancers is higher in patients with a comorbidity that is itself subject to genetic influence.

In conclusion, PNPLA3 rs738409 was found to be associated with HCC development in a cohort of Japanese patients with T2DM. Additionally, JAZF1 rs864745 was identified for the first time as a risk factor for HCC among T2DM patients with the GG genotype at PNPLA3 rs738409. We believe that inclusion of these SNPs in multi-factorial risk assessments may help physicians to identify T2DM patients who have a high risk of developing HCC among the expanding population of T2DM patients.