Introduction

The adenoma–carcinoma sequence is widely recognized as the main carcinogenic process that describes the development of colorectal carcinomas, the incidence of which is increasing in many countries [1–4]. Thus, screening colonoscopy is important for the early detection and endoscopic resection of colorectal tumors [5]. Several studies have shown only a low risk of lymph node metastasis from early colorectal carcinoma that involves the superficial layer of the submucosa, less than 1000 μm from the muscularis mucosae [6–8]. Therefore, most colorectal tumors can be treated by endoscopic resection. Polypectomy and endoscopic mucosal resection are widely practiced worldwide, and endoscopic submucosal dissection is gaining increased acceptance as an effective, minimally invasive approach to many colorectal tumors.

Magnifying endoscopy provides details of the surface of the gastrointestinal tract and thus allows examination of the pit pattern (shape of the openings of colorectal crypts) of colorectal tumors, even during routine colonoscopy performed with indigo carmine dye or crystal violet staining [6]. Classifying the stereomicroscopic pit pattern (differentiating between types I through V) of colorectal tumors, according to the system proposed by Kudo et al. [9] and modified by Kudo and Tsuruta, is useful for evaluating the histopathologic features of the lesions (Fig. 1) [6, 10, 11]. In addition, more recent subclassifications of the type VI pit pattern [12–14] have made it relatively easy to decide upon therapeutic strategies. However, such classifications and, hence, diagnoses are subjective and can vary among individual endoscopists. A better approach would be objective evaluation of the pit pattern.

Fig. 1 Classification of pit patterns of colorectal lesions

Quantitative image analysis, based on computational analysis of a digital image, offers an objective means to improve the consistency and speed of diagnosis at the point of care. Texture analysis is an established technique in the field of image analysis [15–17], and its clinical application has been described with respect to ultrasonography and microendoscopy [18, 19]. For such analysis of clinical images, properties governing the distribution of and relations between gray-level values, which represent texture, are analyzed statistically. Dense scale-invariant feature transform (SIFT) is an algorithm that has been used recently to detect and describe dense local features in images [20, 21].

To our knowledge, the feasibility of quantitative analysis of the pit patterns of the neoplastic colorectal mucosal surface and histopathologic features of colorectal tumors has not been studied. Quantification of pit patterns should allow for the more precise diagnosis of colorectal lesions, particularly if the quantification process is computer-automated. The present study was conducted to quantify the pit patterns of neoplastic colorectal tumors observed by magnifying endoscopy and to investigate the relations between the quantified pit patterns and histopathologic features of the colorectal tumors by means of texture analysis and dense-SIFT-based discriminant function values.

Methods

Lesions and colonoscopic observation

We analyzed 165 colorectal neoplastic lesions, observed by high-resolution magnifying endoscopy with crystal violet staining, in 165 patients who underwent endoscopic resection or surgery at Hiroshima University Hospital during the period April 2007 through December 2009. Colorectal lesions considered unsuitable for evaluation (out of focus, insufficient staining, blurring, halation) were not included. Included were 56 tubular adenomas (TAs), 52 carcinomas showing intramucosal (M) to scant submucosal (SM-s) invasion, and 57 carcinomas showing massive submucosal (SM-m) invasion. Detailed images of the pit patterns were obtained from among the magnified endoscopic images of the mucosal surfaces (types IIIL/IV, n = 44 cases; type VI-mildly irregular, n = 36 cases; type VI-severely irregular, n = 45 cases; and type VN, n = 40 cases). The pit pattern types and corresponding histologic diagnoses are shown in Table 1. The lesions were first detected by conventional white-light colonoscopy and then observed under maximum optical magnification with crystal violet staining, as reported previously [6]. The instrument used in this study was a magnifying videoendoscope system (CF-H260AZI; Olympus Optical, Tokyo, Japan), which provides up to 75× optical magnification on a 19-inch monitor. After magnifying endoscopy, all lesions were endoscopically or surgically resected. Thereafter, the images were digitized and stored on an Olympus EICP-D HDTV recorder (1440 × 1080 pixels). The study was conducted with approval from the Hiroshima University Hospital Ethics Committee, and informed consent was obtained from patients and/or family members for endoscopic examination.

Table 1 Histologic features of colorectal tumors in relation to pit patterns

Extraction of pit regions

From a magnified endoscopic image recorded at maximum optical magnification, a 150 × 150- to 300 × 350-pixel region judged to represent a particular pit pattern was selected manually as a region of interest (ROI) (Fig. 2). All ROIs were identified by one endoscopist (K.O.), who was well trained in magnifying colonoscopy and had no knowledge of the histologic features of any of the study cases. The endoscopist classified the pits according to the Kudo and Tsuruta criteria (Fig. 1) [6, 9–11] and subclassified the type VI pit patterns as type VI-mildly irregular or type VI-severely irregular (Fig. 3) [12]. Only one ROI was selected for each lesion.

Fig. 2 Extraction of a pit region. a Observation and recording of stained (crystal violet) image at maximum optical magnification. b A region of interest is selected for analysis

Fig. 3 Magnified features of colorectal lesions with a type VI pit pattern. a Example of a type VI-mildly irregular pattern. b Example of a type VI-severely irregular pattern

Texture analysis

To estimate the textural properties of the pit patterns, a total of 14 different textural features were extracted from each ROI on the magnified images; that is, algorithms were used to obtain the following: (1) gray-level histogram moments (GLHM): mean, variance, skewness, kurtosis; (2) spatial gray-level dependent matrices (SGLDM): energy, entropy, correlation, local homogeneity, inertia; (3) gray-level difference matrices (GLDM): contrast, angular second moment, entropy, mean, inverse difference moment [15–17, 22–25]. Table 2 lists the 14 different textural features.

Table 2 Textural features
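As an illustration of how the GLHM and SGLDM features named above can be computed from a gray-scale ROI, the following is a minimal sketch using textbook definitions. The function names, the 16-level quantization, the assumption of 8-bit input, and the one-pixel horizontal offset are all assumptions made for the example; the study does not state these parameters, and the scaling of the values it reports may differ.

```python
import numpy as np

def glhm_moments(roi):
    """Gray-level histogram moments of an ROI: mean, variance, skewness, kurtosis."""
    x = np.asarray(roi, dtype=np.float64).ravel()
    mean, var = x.mean(), x.var()
    if var == 0.0:  # flat image: higher standardized moments are undefined, report 0
        return {"mean": float(mean), "variance": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    z = (x - mean) / np.sqrt(var)
    return {"mean": float(mean), "variance": float(var),
            "skewness": float((z ** 3).mean()), "kurtosis": float((z ** 4).mean())}

def cooccurrence(roi, levels=16, dx=1, dy=0):
    """Normalized, symmetric co-occurrence matrix for a non-negative offset (dx, dy).

    Assumes 8-bit gray values (0..255); `levels` is an illustrative choice.
    """
    q = (np.asarray(roi, dtype=np.float64) * levels / 256.0).astype(np.intp)
    q = np.clip(q, 0, levels - 1)
    h, w = q.shape
    a = q[0:h - dy, 0:w - dx].ravel()   # reference pixels
    b = q[dy:h, dx:w].ravel()           # neighbors at the given offset
    m = np.zeros((levels, levels))
    np.add.at(m, (a, b), 1.0)           # count each gray-level pair
    m = m + m.T                         # make the matrix symmetric
    return m / m.sum()

def sgldm_features(p):
    """Energy, entropy, correlation, local homogeneity, and inertia of a matrix p."""
    i, j = np.indices(p.shape)
    nz = p[p > 0]
    mu = (i * p).sum()                  # row and column means coincide (p is symmetric)
    var = ((i - mu) ** 2 * p).sum()
    corr = ((i - mu) * (j - mu) * p).sum() / var if var > 0 else 0.0
    return {
        "energy": float((p ** 2).sum()),
        "entropy": float(-(nz * np.log(nz)).sum()),
        "correlation": float(corr),
        "local_homogeneity": float((p / (1.0 + (i - j) ** 2)).sum()),
        "inertia": float(((i - j) ** 2 * p).sum()),
    }
```

A typical call would be `sgldm_features(cooccurrence(roi))` on a two-dimensional array of gray values. Local homogeneity approaches 1 when neighboring pixels share similar gray levels and falls as neighboring values diverge.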

We first analyzed all magnified images, which included type IIIL/IV, type VI-mildly irregular, type VI-severely irregular, and type VN pit patterns, to quantify the pit patterns of the colorectal tumors. From among the total 14 different textural features, the best performing features were identified, and we used these “best” features for analysis. We then analyzed differences between the various quantified pit patterns themselves and differences in the texture analysis values in relation to differences between the histopathologic classifications of the colorectal tumors.

Analysis of discriminant function values by means of dense-SIFT

We first produced a set of training images from among the magnified endoscopic images not used for analysis (89 type IIIL/IV images and 105 type VI-severely irregular images). We used dense-SIFT descriptors as local features [20, 21, 26, 27]. SIFT descriptors are computed at points of interest on a regular grid, and also at several different scales on the local gray-scale patch centered over the interest point [27]. The magnified endoscopic images listed in Table 1, which included type IIIL/IV, type VI-mildly irregular, and type VI-severely irregular pit patterns, were used as test images. We used the same dense-SIFT descriptors on the test images that were used on the training images. With the training images, clustering was performed to generate K clusters, and the training and test images were then converted into feature vectors by assigning each SIFT descriptor to the nearest cluster. We used spectral regression discriminant analysis and calculated discriminant function values using MATLAB software (The MathWorks, Natick, MA, USA) [28]. Likewise, we analyzed relations between the discriminant function values, pit patterns, and histopathologic features.
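The pipeline described above (cluster the training descriptors into K clusters, convert each image into a feature vector by nearest-cluster assignment, then compute a discriminant function value) can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the study: descriptor extraction is not shown and descriptors are taken as given arrays, plain k-means is assumed for the clustering step, and a ridge-regularized least-squares fit onto class targets stands in for spectral regression discriminant analysis. All function names and parameters are hypothetical.

```python
import numpy as np

def assign(descriptors, centers):
    """Index of the nearest cluster center for each descriptor (squared Euclidean)."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(descriptors, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, dim) array of cluster centers."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[start].astype(np.float64)
    for _ in range(n_iter):
        labels = assign(descriptors, centers)
        for c in range(k):
            members = descriptors[labels == c]
            if len(members) > 0:        # keep the old center if a cluster empties
                centers[c] = members.mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    """Feature vector of one image: normalized counts of descriptors per cluster."""
    labels = assign(descriptors, centers)
    hist = np.bincount(labels, minlength=len(centers)).astype(np.float64)
    return hist / hist.sum()

def fit_discriminant(X, y, alpha=1e-3):
    """Ridge-regularized least squares onto centered -1/+1 class targets.

    Stand-in for the discriminant analysis step; y holds 0/1 class labels.
    """
    mu = X.mean(axis=0)
    Xc = X - mu
    t = np.where(y == 1, 1.0, -1.0)
    t = t - t.mean()
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ t)
    return w, mu

def discriminant_value(x, w, mu):
    """Signed score of one feature vector; larger values lean toward class 1."""
    return float((x - mu) @ w)
```

In this sketch the cluster centers and the discriminant weights are fitted on training images only, and test images are converted with the same centers, mirroring the separation of training and test images described in the text.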

Histologic examination

Resected specimens were pinned to a board and fixed in 10% buffered formalin for 12–48 h. The specimens were then cut into 2- to 3-mm blocks. Hematoxylin and eosin-stained sections were examined. Histologic diagnosis was based on the World Health Organization criteria [29]. Massive submucosal invasion was defined as invasion to a depth of 1000 μm or more, as previously described [6–8]. The depth of submucosal invasion was measured according to the General rules for clinical and pathological studies on cancer of the colon, rectum and anus of the Japanese Society for Cancer of the Colon and Rectum [30].

Statistical analysis

Values are reported as means ± SD. Significance levels after Bonferroni correction for multiple testing were P = 8.33 × 10⁻³ (0.05/6) for differences in values between type IIIL/IV and type VN pit patterns and P = 1.67 × 10⁻² (0.05/3) in the other evaluations.
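The corrected thresholds quoted above follow from dividing the nominal level of 0.05 by the number of comparisons. The short sketch below reproduces that arithmetic. Reading the divisors 6 and 3 as the numbers of pairwise comparisons among four and three groups, respectively, is an assumption made for illustration; the text gives only the divisors.

```python
from math import comb

def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons

# Assumed reading: all pairwise comparisons among four groups, then among three.
four_group = bonferroni_threshold(0.05, comb(4, 2))   # 0.05 / 6
three_group = bonferroni_threshold(0.05, comb(3, 2))  # 0.05 / 3
print(f"{four_group:.2e}", f"{three_group:.2e}")
```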

Results

“Best” textural features

Of the 14 different textural features, the GLDM inverse difference moment and SGLDM local homogeneity differed significantly between the different pit patterns and were thus considered the best performing features.

GLDM- and SGLDM-based quantification of type IIIL/IV and type VN pit patterns and differences in relation to histologic features

The GLDM inverse difference moments per pit pattern (Fig. 4a) were as follows: 6.9 ± 1.1 for type IIIL/IV, 6.3 ± 0.9 for type VI-mildly irregular, 5.7 ± 0.8 for type VI-severely irregular, and 8.4 ± 1.0 for type VN. SGLDM local homogeneity values (Fig. 4b) were as follows: 7.7 ± 1.0 for type IIIL/IV, 7.2 ± 0.8 for type VI-mildly irregular, 6.7 ± 0.8 for type VI-severely irregular, and 8.9 ± 0.8 for type VN. The differences in the values of both properties between pit patterns were significant.

Fig. 4 Histograms showing results of texture analysis of type IIIL/IV and type VN pit patterns. a Analysis by gray-level difference matrices (GLDM) inverse difference moment. b Analysis by spatial gray-level dependent matrices (SGLDM) local homogeneity

The GLDM inverse difference moments per histologic type (Fig. 5a) were as follows: 6.8 ± 1.0 for TAs, 6.2 ± 1.1 for M/SM-s lesions, and 7.4 ± 1.6 for SM-m lesions. The SGLDM local homogeneity values (Fig. 5b) were as follows: 7.6 ± 0.9 for TAs, 7.1 ± 1.0 for M/SM-s lesions, and 8.1 ± 1.4 for SM-m lesions. The differences in the values of both properties between lesion types were significant.

Fig. 5 Histograms showing results of texture analysis per histologic type of type IIIL/IV and type VN pit patterns. a Analysis by GLDM inverse difference moment. b Analysis by SGLDM local homogeneity. TA Tubular adenoma, M carcinoma with intramucosal invasion, SM-s carcinoma with scant submucosal invasion, SM-m carcinoma with massive submucosal invasion

GLDM- and SGLDM-based quantification of type IIIL/IV, type VI-mildly irregular, and type VI-severely irregular pit patterns and differences in relation to histologic features

The GLDM inverse difference moments per pit pattern (Fig. 6a) were as follows: as stated above, 6.9 ± 1.1 for type IIIL/IV, 6.3 ± 0.9 for type VI-mildly irregular, and 5.7 ± 0.8 for type VI-severely irregular. The SGLDM local homogeneity values (Fig. 6b) were as follows: as stated above, 7.7 ± 1.0 for type IIIL/IV, 7.2 ± 0.8 for type VI-mildly irregular, and 6.7 ± 0.8 for type VI-severely irregular. The differences in the values of both properties between pit patterns were significant.

Fig. 6 Histograms showing results of texture analysis and dense scale-invariant feature transform (SIFT)-based discriminant function values of type IIIL/IV and VI pit patterns. a Analysis by GLDM inverse difference moment. b Analysis by SGLDM local homogeneity. c Analysis by dense-SIFT

The GLDM inverse difference moments per histologic type (Fig. 7a) were as follows: 6.8 ± 1.0 for TAs, 6.0 ± 0.9 for M/SM-s lesions, and 5.8 ± 1.0 for SM-m lesions. The SGLDM local homogeneity values (Fig. 7b) were as follows: 7.6 ± 0.9 for TAs, 6.9 ± 0.8 for M/SM-s lesions, and 6.7 ± 0.9 for SM-m lesions. The value for TAs was significantly higher than that for M/SM-s or SM-m lesions. The value for M/SM-s lesions tended to be higher than that for SM-m lesions, but the difference did not reach statistical significance.

Fig. 7 Histograms showing results of texture analysis and dense-SIFT-based discriminant function values per histologic type of type IIIL/IV and type VI pit patterns. a Analysis by GLDM inverse difference moment. b Analysis by SGLDM local homogeneity. c Analysis by dense-SIFT

Dense-SIFT quantitative analysis and the relation to pit patterns and histologic features in type IIIL/IV and type VI-severely irregular pit patterns

The dense-SIFT-based discriminant function values per pit pattern (Fig. 6c) were as follows: −5.6 × 10⁻² ± 3.1 × 10⁻² for type IIIL/IV, −2.6 × 10⁻² ± 2.9 × 10⁻² for type VI-mildly irregular, and 0.78 × 10⁻² ± 2.2 × 10⁻² for type VI-severely irregular. The differences in the values of this property between pit patterns were significant.

The dense-SIFT-based discriminant function values per histologic type (Fig. 7c) were as follows: −4.5 × 10⁻² ± 3.4 × 10⁻² for TAs, −1.1 × 10⁻² ± 3.7 × 10⁻² for M/SM-s lesions, and 0.11 × 10⁻² ± 2.1 × 10⁻² for SM-m lesions. The value for TAs was significantly lower than that for M/SM-s or SM-m lesions. The value for SM-m lesions tended to be higher than that for M/SM-s lesions, but the difference did not reach statistical significance.

Discussion

The pit pattern types initially proposed by Kudo et al. [9] and modified by Kudo and Tsuruta correspond to the histologic characteristics of colorectal lesions [6, 10, 11]. As previously reported, magnifying colonoscopy is useful in differentiating neoplastic and non-neoplastic lesions [6, 31–34] and in assessing the depth of invasion of early colorectal carcinoma [6, 35–37]. Type I and II pit patterns predict non-neoplastic lesions, whereas type III, IV, and V pit patterns predict neoplastic lesions. Lesions with a type III or IV pit pattern are almost always adenomas, and thus endoscopic resection is indicated. The type V pattern can be subclassified into types VI and VN. Whereas almost all lesions with a type VN pit pattern are carcinomas invading the submucosa to 1000 μm or more, the type VI pit pattern includes TA, M/SM-s carcinoma, and SM-m carcinoma; thus, some investigators have proposed dividing the VI pit pattern into subtypes [12–14]. Our institution subclassifies lesions with a type VI pit pattern as mildly irregular or severely irregular. Mildly irregular lesions are found significantly more often in association with adenoma, carcinoma with mucosal invasion, and carcinoma with scant submucosal invasion than in association with carcinoma with massive submucosal invasion [12]. Nevertheless, such pit pattern classification is subjective and based on experience, and quantification is difficult. We recently described quantification and computer-aided detection of the regular pit patterns of colorectal lesions; the type V pit pattern, including subtypes VI and VN, was not included in that study [38]. Furthermore, no group has yet described quantitative analysis in relation to the histopathologic features of colorectal tumors. Therefore, we applied texture analysis as well as analysis of discriminant function by dense-SIFT to the full range of pit patterns of neoplastic colorectal lesions observed by magnification endoscopy, and we investigated quantification of the pit patterns in relation to histologic features.

Texture analysis is used to evaluate the position and intensity of signal features, i.e., pixels, and their gray-level intensity in digital images [15–17]. Textural features, which are mathematical parameters computed from the distribution of pixels, characterize the texture type and thus the underlying structure of the objects shown in the image. In the present study, we used a statistical approach based on representations of texture using properties governing the distribution of and relations between gray-level values in the images. GLDM inverse difference moment and SGLDM local homogeneity were the two textural features that performed the best, and both reflect the homogeneity of the digital images. Low values are associated with low homogeneity. SIFT descriptors, as proposed by Lowe [20, 21], have been used recently in the field of image recognition to describe local features [26, 27]. SIFT bundles a feature detector and a feature descriptor. The detector extracts from an image a number of frames in a way that is consistent with variations in the illumination, viewpoint, and other viewing conditions. The descriptor associates with each region a signature, which identifies its appearance compactly and robustly. We computed dense-SIFT descriptors, in particular, for dense sampling. We chose dense-SIFT descriptors because SIFT-based local descriptors are known to perform better [39] than other descriptors such as principal components analysis-based SIFT [40] descriptors. Other types of descriptors, such as histogram of oriented gradients [41] descriptors, are not well suited to pit pattern analysis because they were developed to detect fixed-shape objects such as human faces.
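The link between low feature values and low homogeneity can be illustrated with the GLDM features computed on a smooth and an irregular synthetic image. This is a sketch under assumptions: the definitions are textbook gray-level difference statistics, the 16-level quantization of 8-bit values and the one-pixel horizontal offset are illustrative choices, and the synthetic images are not endoscopic data. Absolute values will therefore not match the scale of the values reported in the study.

```python
import numpy as np

def gldm_features(roi, levels=16, dx=1, dy=0):
    """Gray-level difference statistics for a non-negative offset (dx, dy).

    Assumes 8-bit gray values (0..255); `levels` is an illustrative choice.
    """
    q = (np.asarray(roi, dtype=np.float64) * levels / 256.0).astype(np.intp)
    q = np.clip(q, 0, levels - 1)
    h, w = q.shape
    # Absolute gray-level difference between each pixel and its offset neighbor
    d = np.abs(q[0:h - dy, 0:w - dx] - q[dy:h, dx:w]).ravel()
    p = np.bincount(d, minlength=levels) / d.size   # probability of each difference
    k = np.arange(levels)
    nz = p[p > 0]
    return {
        "contrast": float((k ** 2 * p).sum()),
        "angular_second_moment": float((p ** 2).sum()),
        "entropy": float(-(nz * np.log(nz)).sum()),
        "mean": float((k * p).sum()),
        "inverse_difference_moment": float((p / (1.0 + k ** 2)).sum()),
    }

# Synthetic stand-ins: a smooth horizontal ramp and uniform random noise
rng = np.random.default_rng(0)
regular = np.tile(np.linspace(0, 255, 64), (64, 1))
irregular = rng.integers(0, 256, size=(64, 64))

print(gldm_features(regular)["inverse_difference_moment"])
print(gldm_features(irregular)["inverse_difference_moment"])
```

The smooth ramp yields the larger inverse difference moment because most neighboring pixels fall in the same or an adjacent gray level, whereas the noise image spreads its differences over many levels.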

Texture analysis in the present study yielded high but descending values for the type IIIL/IV, type VI-mildly irregular, and type VI-severely irregular pit patterns (in that order); however, the value was high for the type VN pit pattern. Irregularity of the pit patterns increased in ascending order (type IIIL/IV, type VI-mildly irregular, type VI-severely irregular), whereas the type VN pattern showed an area of obvious non-structure, because of loss or decrease of pits with an amorphous structure, reflecting histologic destruction of the glands and the covering epithelium and the presence of desmoplastic reactions at the lesion surface. In other words, the pattern of the mucosal surface structure increased in complexity in correspondence to the type IIIL/IV, type VI-mildly irregular, and type VI-severely irregular pit patterns, whereas the mucosal surface structure corresponding to the type VN pattern was simplified. With respect to histologic features, texture analysis yielded high but descending values for TA and M/SM-s (in that order) but a relatively high value for SM-m. This can be explained by the fact that SM-m tumors showed mainly the type VN pattern (37/57 cases). Accordingly, when the type VN pit pattern was excluded, texture analysis yielded high but descending values for the type IIIL/IV, type VI-mildly irregular, and type VI-severely irregular pit patterns (in that order). Similarly, with respect to histologic types, texture analysis then yielded high but descending values for TA, M/SM-s, and SM-m (in that order).

When we used dense-SIFT descriptors and discriminant analysis for type IIIL/IV and type VI-severely irregular pit patterns, the differences were superior to those obtained by texture analysis. This technique is being applied more and more in the field of image recognition, and it appears to be more precise than the more traditional techniques. With respect to histologic types, dense-SIFT-based discriminant function values tended to be higher for SM-m than for M/SM-s, but the difference did not reach statistical significance. This can be explained by the fact that many type VI-severely irregular lesions [51.1% (23/45)] were M/SM-s carcinomas.

Our study was a single-center, retrospective study, and only one ROI judged to represent a particular pit pattern was selected by a single endoscopist because we compared the texture analysis values and dense-SIFT-based discriminant function values against histopathologic features. When we reach the point at which we feel we have succeeded in developing a fully automated computer-aided system for pit pattern identification, including automatic selection of ROIs, we will need to select ROIs and create a database of pit pattern classifications determined by the consensus of several endoscopists at multiple centers. We will then conduct a prospective study in which the size of lesions will also be considered.

In conclusion, we successfully quantified pit patterns of neoplastic colorectal tumors and characterized the relation between the quantified pit patterns and the histologic features of the tumors. We anticipate that further development to full automation will allow for computer-aided diagnosis of pit patterns on magnified endoscopy images and, hence, assist in the diagnosis of colorectal lesions.