Introduction

Nonalcoholic fatty liver disease (NAFLD) represents a spectrum of conditions resulting from excessive accumulation of fat in the liver, commonly estimated as the percentage of hepatocytes containing lipid observed by microscopic examination of liver biopsies. The prevalence of NAFLD in the general population ranges from 15 to 30 %, and increases with obesity [1, 2]. The spectrum of NAFLD includes two major variants which differ substantially in natural history and prognosis. Simple steatosis is generally considered a benign process, but on rare occasions can progress to more serious liver disease [3, 4]. Nonalcoholic steatohepatitis (NASH) can progress to life-threatening liver disease and liver failure, especially in those with elevated liver transaminases [5, 6]. Biopsy-based epidemiological studies of NAFLD in extreme obesity indicate that insulin resistance, diabetes mellitus, and the metabolic syndrome are established risk factors [2, 7].

Overweight, obesity, and especially visceral obesity are recognized risk factors for NAFLD [8, 9]. Studies of patients with extreme obesity who undergo liver biopsy at the time of bariatric surgery have consistently found the prevalence of NAFLD and NASH to be as high as 90 and 37 %, respectively [10–12]. In studies of patients with extreme obesity, determinants of visceral obesity demonstrate a higher correlation with NAFLD than BMI alone [13, 14]. The prevalence of NAFLD is dependent upon the detection method used, with liver biopsy established as the gold standard for diagnosis because it is the only reliable method of distinguishing NAFLD from NASH and providing accurate disease staging [2, 15].

Knowledge of the epidemiology and natural history of NAFLD is limited because of the lack of a reliable non-invasive methodology for diagnostic screening and staging of the disease. Clinical assessment using non-invasive tests such as liver ultrasound and measurement of serum transaminase levels lacks sufficient sensitivity and specificity for reliable screening. Algorithmic predictors of NAFLD using combinations of common serum markers, BMI and related comorbidities, abdominal ultrasound, and other parameters have been reported, but none has demonstrated sufficient clinical utility [15–17]. In the absence of suitable clinical, laboratory, and imaging technologies, liver histology remains the only available tool for accurate diagnosis and disease staging. Use of liver biopsy for population-based screening is undesirable because of risk, cost, patient time lost from usual activity, and the fact that liver steatosis is often of little clinical significance.

The setting of abdominal surgery provides an opportunity to obtain liver biopsies under controlled conditions, particularly in severely obese patients undergoing bariatric surgery [18]. Bariatric surgery may thus be an ideal setting in which to obtain liver biopsies for diagnosis and staging of NAFLD. However, whether liver biopsies should be obtained on all patients or only those with certain risk factors is not clear since there is little data comparing the reliability of commonly performed clinical assessments such as hepatic ultrasound, alanine transaminase (ALT) and aspartate aminotransferase (AST) levels, and intra-operative visual observation of macroscopic liver appearance with liver histology in patients undergoing bariatric surgery. The purpose of this study is to determine the predictive accuracy of liver ultrasound, AST level, ALT level, and visual inspection of the liver at surgery for assessing NAFLD and NASH.

Methods

Patients

Patients participating in the bariatric surgery program in the Center for Nutrition and Weight Management at Geisinger Health System (Danville, PA) were recruited into a research program in obesity. All patients provided informed consent, and the research was approved by the institutional review board. Consented patients who underwent Roux-en-Y gastric bypass surgery with intraoperative wedge biopsy by one of two laparoscopic bariatric surgeons from January 1, 2004 through February 9, 2009 were included in the analysis. The bariatric surgery program consisted of a 6–12-month pre-operative evaluation, education, and medical treatment phase followed by bariatric surgery. During the preoperative program, consented patients underwent a complete medical evaluation including a comprehensive medical history and physical examination, laboratory studies including AST and ALT, and liver ultrasound examination. In addition, a preoperative diet program designed to achieve loss of 10 % of excess weight was provided for all patients.

Assessment of Hepatic Histology

Intra-operative wedge biopsies of the liver were obtained 10 cm to the left of the falciform ligament prior to any liver retraction or surgery on the stomach. The liver specimen was fixed in 10 % neutral buffered formalin, and stained with hematoxylin and eosin for routine histology and Masson’s trichrome for assessment of fibrosis. All sections were read by an experienced pathologist and subsequently re-reviewed by a second experienced pathologist using the Brunt criteria for NASH [19] as follows: steatosis grade 0 (no intracellular lipid), 1 (0 to <33 %), 2 (34 to <66 %), and 3 (>67 %); lobular inflammation grade 0 (no inflammation), grade 1 (mild inflammation), grade 2 (moderate inflammation), and grade 3 (severe inflammation); fibrosis grade 0 (no perisinusoidal fibrosis), grade 1 (mild perisinusoidal fibrosis), grade 2 (moderate perisinusoidal fibrosis), grade 3 (bridging fibrosis), and cirrhosis.

Laboratory Assessment of Liver Function

Liver function tests (LFTs) were evaluated with ALT and AST levels. These were measured using Roche automated clinical chemistry technology by the Geisinger Clinical Laboratory. ALT and AST values of >35 U/L (female) or >50 U/L (male) were considered abnormal.
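The sex-specific cutoffs above amount to a simple classification rule. The following is a minimal sketch in Python, given only as an illustration: the function name `is_abnormal_transaminase` and the "F"/"M" sex coding are assumptions and do not describe the laboratory's actual workflow.

```python
def is_abnormal_transaminase(value_u_per_l: float, sex: str) -> bool:
    """Return True if an ALT or AST value exceeds the sex-specific cutoff.

    Cutoffs follow the text: >35 U/L (female) or >50 U/L (male).
    The "F"/"M" coding is an assumption made for this illustration.
    """
    cutoffs = {"F": 35.0, "M": 50.0}
    key = sex.strip().upper()
    if key not in cutoffs:
        raise ValueError(f"unknown sex code: {sex!r}")
    # Strictly greater than the cutoff counts as abnormal.
    return value_u_per_l > cutoffs[key]
```

Because the comparison is strict, a value exactly at the cutoff (for example, 35 U/L in a female patient) is treated as normal in this sketch.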

Ultrasound

A right upper quadrant (RUQ) ultrasound was performed preoperatively in order to evaluate liver size and degree of steatosis. All ultrasounds were read by an experienced radiologist, and steatosis was classified as mild, moderate, or severe according to the fall in echo amplitude with penetration to the deep portion of the liver, the extent of discrepancy in echo amplitude between liver and kidney, and the extent of loss of echoes from the portal vein [20, 21].

Visual Inspection

In preparation for this study, the two laparoscopic bariatric surgeons agreed on criteria for grading the liver appearance by visual inspection as normal or mild, moderate, or severe fatty infiltration. Grading was based on perception of weight, as well as visual inspection of size, color, and character of the liver. Livers were classified as normal on the basis of a burgundy color, absent golden speckling, and a normal weight when retracted. Mild fatty infiltration was defined as a normal weight when retracted, burgundy color, <25 % golden speckling, and a firm consistency when cut. Both normal livers and those with mild fatty infiltration had a sharp liver edge (Fig. 1). Livers classified as having moderate steatosis were heavy when retracted and grossly enlarged. These livers had a dull or rounded edge, a softer and more friable parenchyma, and >50 % golden speckling when incised (Fig. 2). Livers classified as having severe steatosis had evidence of capsular bulging and nodular irregularities suggestive of cirrhosis. They were so heavy that they were difficult to retract using the standard retractor (Care Fusion, San Diego, CA) and had enlarged, rounded or dull edges as well as nearly uniform golden parenchyma, often with a gritty, granular consistency when cut (Fig. 3).

Fig. 1 The liver at bariatric surgery classified as normal or mild fatty infiltration

Fig. 2 The liver at bariatric surgery classified as moderate steatosis

Fig. 3 The liver at bariatric surgery classified as severe steatosis

Statistical Analysis

Descriptive statistics such as means and percentages were used to describe the study population. The percentage of patients with histologically documented steatosis, steatohepatitis, or fibrosis was compared with the results of the non-invasive assessments (i.e., LFTs, liver ultrasound, and visual assessment of the liver during bariatric surgery). ALT and AST values of >35 U/L for females or >50 U/L for males were considered abnormal and were compared to histology results using chi-square tests. Liver ultrasound results were classified on an ordinal scale (normal, mild, moderate, or severe), as were visual inspection results (mild, moderate, or severe), and both were compared with histology results using Cochran-Armitage trend tests. p values <0.05 were considered statistically significant. The diagnostic utility of each non-invasive assessment for the presence of the different histological components of NAFLD/NASH was evaluated by calculating sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy. Values were calculated separately for histological classification of steatosis, steatohepatitis, or fibrosis using each non-invasive assessment, i.e., normal AST versus abnormal AST, normal ALT versus abnormal ALT, normal ultrasound versus mild/moderate/severe ultrasound, and mild visual inspection (VI) versus moderate/severe VI. Finally, to determine whether combinations of the non-invasive assessments improved the overall diagnostic utility, measures of accuracy were calculated using any abnormal result (i.e., abnormal LFT or mild/moderate/severe ultrasound or moderate/severe VI) and all abnormal results (i.e., abnormal LFT and mild/moderate/severe ultrasound and moderate/severe VI). SAS version 9.2 was used for all statistical analyses.
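The diagnostic measures and the "any abnormal" and "all abnormal" combinations described above can be illustrated with a short sketch. The analyses in this study were run in SAS. The Python below is only an assumed, simplified restatement of the standard 2 × 2 definitions, and the names `DiagnosticMetrics`, `diagnostic_metrics`, `any_abnormal`, and `all_abnormal` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    accuracy: float


def diagnostic_metrics(test_positive, histology_positive):
    """Compute 2x2 diagnostic measures from paired boolean sequences.

    test_positive[i] is True when the non-invasive assessment is abnormal;
    histology_positive[i] is True when histology shows the finding.
    """
    pairs = list(zip(test_positive, histology_positive))
    tp = sum(1 for t, h in pairs if t and h)
    fp = sum(1 for t, h in pairs if t and not h)
    fn = sum(1 for t, h in pairs if not t and h)
    tn = sum(1 for t, h in pairs if not t and not h)

    def ratio(num, den):
        # A measure is undefined (NaN) when its denominator is zero.
        return num / den if den else float("nan")

    return DiagnosticMetrics(
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        ppv=ratio(tp, tp + fp),
        npv=ratio(tn, tn + fn),
        accuracy=ratio(tp + tn, tp + fp + fn + tn),
    )


def any_abnormal(*tests):
    """Per-patient result that is positive if at least one test is abnormal."""
    return [any(row) for row in zip(*tests)]


def all_abnormal(*tests):
    """Per-patient result that is positive only if every test is abnormal."""
    return [all(row) for row in zip(*tests)]
```

By construction, the "any abnormal" rule can only raise sensitivity relative to each single test (at the cost of specificity), while the "all abnormal" rule can only raise specificity (at the cost of sensitivity).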

Results

A total of 580 patients were initially selected for inclusion in the study; 514 (89 %) had available results for ALT, AST, ultrasound, and VI. The remaining 66 patients were excluded because of incomplete data (52 without ultrasound and VI, 6 without LFTs, 4 without ultrasound, 3 without VI, 1 without LFTs and VI). One additional patient was excluded due to a diagnosis of hepatitis, leaving 513 for analysis. Of the 513 patients studied, 419 (82 %) were female, 504 (98 %) were Caucasian, with a mean age of 44.2 (SD = 10.2) years (Table 1). All patients had a BMI of greater than 35 kg/m2 at initial visit with a mean BMI of 47.2 kg/m2. Of the 513 patients, steatosis was present in 68 % of patients (n = 348) including 38 % with mild steatosis (n = 197), 19 % with moderate steatosis (n = 99), and 10 % with severe steatosis (n = 52). Steatohepatitis (defined as lobular inflammation with or without fibrosis) was present in 28 % (n = 146), most with mild levels of inflammation. Some degree of fibrosis was present in 23 % (n = 116), and four patients had cirrhosis (Table 2). NASH (defined as the presence of steatohepatitis, fibrosis or cirrhosis) was present in 32 %.

Table 1 Description of study population (N = 513)
Table 2 Description of liver histology results (N = 513)

We first determined what fraction of patients with normal and abnormal ALT and AST levels had histologically documented steatosis, steatohepatitis, or fibrosis (Table 3). The percentage of patients with steatosis was significantly higher among those with abnormal ALT levels than among those with normal ALT levels (79 vs. 65 %), consistent with fatty liver-induced liver damage. The same was true for abnormal and normal AST levels, though the difference was slightly smaller (76 vs. 67 %). Sixteen percent of patients with normal ALT levels had steatohepatitis; 19 % with normal ALT levels had fibrosis. Patients with abnormal ALT levels were more than three times as likely to have steatohepatitis relative to patients with normal ALT levels (50 vs. 16 %, p < 0.0001) and about twice as likely to have fibrosis (37 vs. 19 %, p < 0.0001). A similar trend was present for abnormal AST levels (47 vs. 27 % for steatohepatitis and 38 vs. 21 % for fibrosis).

Table 3 Histology compared to LFTs, ultrasound, and VI

We then evaluated the ability of hepatic ultrasound to distinguish steatosis, steatohepatitis, and fibrosis. In patients considered normal by liver ultrasound, 38 % of patients had histological steatosis. A total of 70 % of patients with mild fatty liver by ultrasound had steatosis, increasing to 88 % with moderate US-defined fatty liver and to 94 % with severe ultrasonographic findings. A similar but much lower trend was present for steatohepatitis, ranging from 9 % steatohepatitis in ultrasonographic normal liver to 57 % with severe ultrasound findings. Fibrosis detection ranged from 8 % in ultrasonographically normal liver to 44 % with severe ultrasound findings.

Intra-operative VI was fairly good at identifying steatosis but was a relatively poor modality for detecting histological evidence of steatohepatitis and fibrosis. In patients considered to have normal/mild fatty liver involvement by VI, 58 % had steatosis. A total of 81 % of patients with moderate visual findings had histological steatosis, increasing to 88 % with severe VI-identified fatty liver. However, VI was less able to detect steatohepatitis and fibrosis, varying from 18 % with steatohepatitis and 12 % with fibrosis in patients with mild VI findings, to 39 % steatohepatitis and 36 % fibrosis in patients with moderate VI findings, and 63 % steatohepatitis and 48 % fibrosis in patients considered severe by VI.

The sensitivity, specificity, positive, and negative predictive values and accuracy were then calculated for each of the four clinical assessments (abnormal ALT, abnormal AST, findings of mild fatty infiltration or greater by liver ultrasound and intra-operative VI evidence of NAFLD/NASH) used to detect the different histological components of NAFLD/NASH. Despite significant correlations between clinical assessments and histological findings, the sensitivity and specificity of these clinical tests for histological evidence of NAFLD/NASH pathology varied (Table 4). For steatosis, a mild or greater finding by liver ultrasound was the most sensitive (80 %) method, versus abnormal AST as the least sensitive (10 %). In patients with abnormal findings on all four methods, the sensitivity was only 12 %. In patients with an abnormal finding for any one of the four methods, the sensitivity to detect steatosis was 89 %. For all four methods, the sensitivity to detect steatohepatitis and fibrosis was greater than for steatosis ranging from 14 % sensitivity for abnormal AST to detect steatohepatitis to 97 % for any abnormal test to detect steatohepatitis or fibrosis.

Table 4 Accuracy of clinical assessments for predicting steatosis, inflammation, or fibrosis on histology (N = 513)

The four methods were better when assessing specificity for steatosis, with a low of 68 % for mild or greater liver ultrasound findings to 93 % for abnormal AST. In patients with abnormal findings for all four methods, the specificity was 98 %. In contrast to sensitivity, the specificity for three of the four clinical assessments was less for steatohepatitis and fibrosis, with abnormal AST maintaining a specificity of 93 %.

The positive predictive values (PPVs) for each of the four methods for steatosis ranged from 76 % for abnormal AST to 84 % with mild or greater ultrasound findings. In patients with all abnormal findings, the positive predictive value was 93 %. The PPVs for the four methods were much less for steatohepatitis and fibrosis than for steatosis. The opposite effect was seen for negative predictive values (NPVs). The four methods had greater NPVs for steatohepatitis and fibrosis than for steatosis. The highest NPVs were any abnormal result for steatohepatitis and fibrosis at 97 %. For steatosis, ultrasound had the best NPV at 62 % with all normal findings at 68 %. A similar trend at lower levels was present for accuracy. Liver ultrasound identification of steatosis had the highest accuracy at 76 % while abnormal AST for steatosis was the least accurate at 37 %.
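The reported predictive values for ultrasound-detected steatosis can be cross-checked against the reported sensitivity (80 %), specificity (68 %), and steatosis prevalence (68 %) using Bayes' rule. The sketch below is only a consistency illustration built from those rounded figures. It is not the patient-level computation performed in SAS, and the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Derive PPV, NPV, and accuracy from sensitivity, specificity, prevalence.

    All inputs and outputs are proportions between 0 and 1.
    """
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = tp + tn
    return ppv, npv, accuracy


# Rounded values reported in the text for ultrasound and steatosis.
ppv, npv, accuracy = predictive_values(0.80, 0.68, 0.68)
print(round(ppv * 100), round(npv * 100), round(accuracy * 100))  # 84 62 76
```

The rounded inputs reproduce the PPV (84 %), NPV (62 %), and accuracy (76 %) reported for ultrasound. The relatively low NPV for steatosis reflects its high prevalence in this population.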

Discussion

Our results are consistent with other studies demonstrating a high prevalence of NAFLD-related pathology in patients with extreme obesity [22–27], with roughly two-thirds of patients manifesting steatosis, over a quarter with steatohepatitis, and over a fifth with fibrosis. We sought to determine whether preoperative transaminases, liver ultrasound, and intra-operative visual assessment could be used to identify histologically verified NAFLD and NASH classified from intra-operative biopsies. Our data indicate that no individual clinical assessment was greater than 93 % sensitive or specific for the major histological findings in NAFLD/NASH, with PPVs not greater than 50 % for steatohepatitis or fibrosis. The use of any single one of these four clinical assessments to direct the use of liver biopsy does not seem to have sufficient clinical utility. These results are consistent with the findings of other investigators who have found that abnormal levels of ALT and AST do not have a high correlation with the presence of NAFLD, steatohepatitis, or fibrosis [26, 28–30]. This lack of correlation between abnormal transaminase levels and NAFLD has stimulated investigations directed toward the identification of biomarkers that are more sensitive and specific for NAFLD and NASH [1, 15, 31].

Data on the reliability of ultrasound in assessing histological steatosis, fibrosis, and the presence of NASH in patients undergoing bariatric surgery are also inconsistent [15, 32, 33]. Ultrasound is notoriously operator dependent, and verification of the ultrasound findings by a second reviewer was not carried out in this study.

Reports on the utility of visual inspection of the liver also show varying reliability in assessing NAFLD [34, 35]. There are also no validated standards or parameters for observers to use for visual inspection of liver to determine whether liver biopsy should be performed. The extent of variability within and between reviewers for either measure is unknown. As with ultrasound, our data suggest that visual inspection of the liver at the time of surgery may provide useful information regarding the presence of steatosis, but does not provide reliable information regarding the severity of steatosis or the presence of more serious variants of NAFLD. Although visual examination criteria were established before the study by the two highly experienced bariatric surgeons, they are arbitrary, operator dependent, and a possible limitation in this study.

No individual clinical assessment for the presence and severity of NAFLD was highly predictive of liver histology. However, the important finding of this study is that the combination of the results of all clinical assessments (all normal or all abnormal) may have some clinical utility. Abnormal results across all clinical assessments did not predict abnormal histology to a high degree, with only 70 % PPV for steatohepatitis dropping to 54 % PPV for fibrosis. However, normal results across LFTs, ultrasound, and VI may be useful in ruling out steatohepatitis or fibrosis; 97 % of patients who were normal for all four clinical assessments did not have histological evidence of inflammation or fibrosis. Even so, 3 % of patients with potentially serious hepatic pathology would not be detected. Although liver biopsy remains the recommended procedure for identification and staging of NAFLD because of the unreliability of non-invasive assessments, the presence or absence of abnormalities of all four assessment modalities may prove useful in decision making in regard to indications for liver biopsy at the time of bariatric surgery.

The high prevalence of NAFLD in patients with extreme obesity and the significant morbidity and mortality risk [2] associated with the more advanced forms of NAFLD, together with the inability of non-invasive tests to distinguish between non-NASH and NASH variants of NAFLD, argue for liberal use of liver biopsy at the time of bariatric surgery. These results confirm the high prevalence of more severe variants of NAFLD in bariatric surgery patients. Although surgical weight loss does improve NAFLD, the improvement does not occur in all patients, and appropriate hepatology follow-up is indicated because of the potential for progression of the liver disease. In the absence of reliable non-invasive diagnostic tools, liver biopsy remains the gold standard for diagnosis and staging of this condition. Current best practice recommendations are to consider percutaneous liver biopsy for patients with metabolic syndrome, as this condition is associated with the more serious NAFLD variants, and for patients at risk for steatohepatitis and liver fibrosis on the basis of the NAFLD Fibrosis Score [8]. The NAFLD score is based on age, BMI, hyperglycemia, platelet count, albumin, and AST/ALT ratio (http://nafldscore.com) [36].

A limitation in this study is the lack of exclusion or stratification of patients according to alcohol intake. Alcohol abuse is a contraindication to bariatric surgery, and patients are carefully screened before surgery. Additional limitations include the lack of independent corroboration of the ultrasound readings, and the arbitrary nature of visual inspection.

In this study, there was only one documented complication of liver biopsy. A single bile leak mandated laparoscopic drainage. There was a 2.0 % incidence of self-limited bleeding, but the bleeding source was not established.

Until reliable markers for the identification of NAFLD and its severity are available for clinical use, liver biopsy will remain the most reliable diagnostic approach. Liver biopsy at the time of bariatric surgery allows for diagnosis and staging, which allows for identification of patients at risk, referral to hepatologists, and recommendations for treatment and close follow-up. Similar to other studies of NAFLD in extreme obesity, our results suggest that liver biopsy can be performed safely at the time of bariatric surgery and should be considered for all candidates for surgery.