1 Introduction

Ground-penetrating radar (GPR) is a non-invasive, high-resolution geophysical method that can accurately image and characterize near-surface targets by means of electromagnetic (EM) waves in the 10 MHz to 6 GHz frequency range. Its successful application to geologic materials requires a sufficiently large contrast of electromagnetic impedance (Annan 2003). Such conditions are frequently met in archaeological exploration, where the contrast between artifacts and background is large in most environments (e.g. Vaughan 1986; Pipan et al. 1999, 2001; Piro et al. 2007; Forte and Pipan 2008; Conyers and Leckebusch 2010; Conyers 2013; Urban et al. 2014). The use of GPR for archaeological imaging and interpretation has evolved from single 2-D reflection profiles to 3-D volume mapping. Time slices and iso-surface renderings are now common tools to visualize and interpret GPR data from archaeological sites. They can be generated from 3-D GPR data acquired with 2.5-D grids (e.g., Goodman et al. 1995; Leckebusch and Peikert 2001; Piro et al. 2003; Pérez-Gracia et al. 2009; Bini et al. 2010; Zhao et al. 2015a), or less frequently with true 3-D multichannel arrays (Trinks et al. 2010; Verdonck et al. 2013; Lualdi and Lombardi 2014).

A better understanding of GPR datasets can be achieved by extracting further quantitative information from processed 2-D and 3-D data, beyond amplitude and travel time analysis. Attributes of radar data provide an effective tool for this purpose, and integrated analysis based on different sets of attributes can facilitate the identification and characterization of subsurface structures and improve the overall quality and efficiency of GPR interpretation (e.g. Grasmueck 1996; McClymont et al. 2008; Sassen and Everett 2009; Forte et al. 2012; Zhao et al. 2016), particularly in archaeological prospecting (Böniger and Tronicke 2010; Creasman et al. 2010; Zhao et al. 2013a, b, 2015b).

Shape and structure (i.e., geometrical relationships) can help identify subsurface features through statistical pattern recognition methods. Texture is the repeating pattern of local variations in image intensity (Haralick et al. 1973; Chopra and Marfurt 2007), and is one of the most effective attributes to define amplitude patterns and to describe the spatial arrangement of neighboring features (Gao 2003). Textural analysis was first applied to 2-D image analysis (Kaizer 1955; Haralick et al. 1973; Reed and DuBuf 1993), and subsequently became a popular tool in seismic reflection data interpretation to improve identification and interpretation of complex stratigraphic features. Several authors proposed different analysis methods to extract the maximum amount of information from this attribute (e.g., Zhang and Simaan 1987; Pitas and Kotropoulos 1992; Randen et al. 2000; Patel et al. 2008). An effective algorithm is based on the gray-level co-occurrence matrix (GLCM) (Love and Simaan 1984; Reed and Hussong 1989; West et al. 2002; Gao 2003; Chopra and Alexeev 2006; Angelo et al. 2009; Yenugu et al. 2010; de Matos et al. 2011; Gao 2011; Eichkitz et al. 2013). However, the potential of texture analysis for GPR data is largely untapped, and its application has been reported in only a few cases (Moysey et al. 2006; Deiana 2008; McClymont et al. 2008).

In this study, we focus on the applicability and effectiveness of texture analysis of GPR datasets for archaeological prospection. After a basic GPR processing sequence (data editing, geometry header definition, DC removal, amplitude analysis, band-pass filtering, background noise removal, true amplitude recovery, velocity analysis, topographic correction, and f–k migration), we tested textural attributes on processed 2-D GPR profiles and 3-D GPR volumes acquired at three archaeological sites that differ markedly in surface topography, buried structures, period of the archaeological remains, and contrast between targets and background. Following Haralick et al. (1973), we use the GLCM to compute the textural features contrast, energy, entropy, and homogeneity, which supply different information about the GPR datasets, such as uniformity or complexity, to better image and map different buried archaeological targets.

2 Methodology

Textural attributes based on GLCM come from image processing and were originally developed to capture roughness/smoothness of an image. The algorithm extracts textural features through a co-occurrence matrix, and was initially proposed and used for image classification by Haralick et al. (1973). A co-occurrence matrix (COM) is a 2-D matrix whose elements are the relative occurrence frequencies of pairs of quantities that are separated by a certain distance and lie in a given direction (Fig. 1). In the case of GLCM, the quantities are gray levels of pixels. If distance and direction are, respectively, indicated by d and θ, the COM can be constructed in any considered direction, with P representing a COM: P (0, d), P (45, d), P (90, d), and P (135, d). Thus, the spatial relationships are fully described by distance d and angle θ.
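The role of distance d and angle θ can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names `com` and `averaged_com`, the 0-based gray levels, and the convention that row indices increase downward (so that 45° points up and to the right) are assumptions made here for illustration only.

```python
import numpy as np

# (row step, column step) for the four directions, assuming that row
# indices increase downward in the image array (an assumed convention).
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def com(image, levels, theta, d=1):
    """Co-occurrence counts P(theta, d) for gray levels 0..levels-1."""
    dr, dc = OFFSETS[theta]
    dr, dc = dr * d, dc * d
    rows, cols = image.shape
    counts = np.zeros((levels, levels), dtype=int)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            # Count the pair only if the neighbor lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r, c], image[r2, c2]] += 1
    return counts

def averaged_com(image, levels, d=1):
    """Mean of the symmetric, normalized COMs over the four directions."""
    mats = []
    for theta in OFFSETS:
        counts = com(image, levels, theta, d)
        sym = counts + counts.T
        mats.append(sym / sym.sum())
    return np.mean(mats, axis=0)

# Hypothetical 3 x 3 test image with three gray levels (0, 1, 2).
img = np.array([[0, 0, 1],
                [1, 2, 2],
                [0, 1, 2]])
p0 = com(img, levels=3, theta=0)
p_avg = averaged_com(img, levels=3)
```

Averaging the four directional matrices is only one possible way to combine them; a direction-dependent analysis would keep the matrices separate.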

Fig. 1 Possible directions to calculate the GLCM

Consider an image with four gray levels represented by matrix A in Fig. 2. If the spatial relationship used to construct the GLCM is the occurrence of a gray level immediately to the right of the reference pixel, with the inter-sample distance d equal to 1, the GLCM is:

$${\text{GLCM}} = \left[ {\begin{array}{*{20}c} 1 & 3 & 0 & 0 \\ 0 & 0 & 4 & 0 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \\ \end{array} } \right].$$
Fig. 2 Simplified test image with four amplitude levels used to form a COM

The elements in the first row of the matrix count how many times each of the four gray levels occurs immediately to the right of gray level 1 in Fig. 2. In this example, element \(g_{11}\) is 1, because gray level 1 occurs only once in the pixel to the right of another gray level 1 pixel in the whole matrix A (pixels \(a_{21}\) and \(a_{22}\)). Element \(g_{12}\) is 3, because gray level 2 occurs three times to the right of gray level 1 in matrix A, and so on. To obtain a symmetrical matrix, we add the matrix to its transpose:

$${\text{GLCM}}^{\text{T}} = \left[ {\begin{array}{*{20}c} 1 & 0 & 0 & 0 \\ 3 & 0 & 0 & 0 \\ 0 & 4 & 1 & 0 \\ 0 & 0 & 2 & 1 \\ \end{array} } \right],$$

the symmetrical matrix then becomes

$${\text{GLCM}}_{\text{Sym}} = \left[ {\begin{array}{*{20}c} 2 & 3 & 0 & 0 \\ 3 & 0 & 4 & 0 \\ 0 & 4 & 2 & 2 \\ 0 & 0 & 2 & 2 \\ \end{array} } \right].$$

The next step is to normalize the matrix \({\text{GLCM}}_{\text{Sym}}\) by dividing each element by the sum of all elements (24 in this example), so that the co-occurrence matrix directly represents the joint probability of occurrence of gray levels i and j for two pixels with the defined spatial relationship, that is

$$P = \left[ {\begin{array}{*{20}c} {0.083} & {0.125} & {0.000} & {0.000} \\ {0.125} & {0.000} & {0.167} & {0.000} \\ {0.000} & {0.167} & {0.083} & {0.083} \\ {0.000} & {0.000} & {0.083} & {0.083} \\ \end{array} } \right],$$

where P denotes that the GLCM is an estimate of probability. Matrices generated from all directions are always combined into an averaged COM, as no directional variations in texture are expected.
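The worked example can be reproduced numerically with the short sketch below. Matrix A of Fig. 2 is not reproduced in the text, so the 4 × 4 array used here is a hypothetical image chosen only because it yields the same co-occurrence counts (including gray level 1 at positions \(a_{21}\) and \(a_{22}\)); the function name `glcm_right` is likewise an assumption.

```python
import numpy as np

def glcm_right(image, levels):
    """Count gray level j immediately to the right of gray level i (d = 1)."""
    ref = image[:, :-1].ravel() - 1    # reference pixels, shifted to 0-based
    neigh = image[:, 1:].ravel() - 1   # right-hand neighbors, 0-based
    counts = np.zeros((levels, levels), dtype=int)
    np.add.at(counts, (ref, neigh), 1)  # accumulate repeated index pairs
    return counts

# Hypothetical stand-in for matrix A (gray levels 1..4); NOT taken from Fig. 2.
A = np.array([[1, 2, 3, 3],
              [1, 1, 2, 3],
              [1, 2, 3, 4],
              [2, 3, 4, 4]])

glcm = glcm_right(A, levels=4)     # directional counts
glcm_sym = glcm + glcm.T           # symmetrical matrix
P = glcm_sym / glcm_sym.sum()      # normalization by the total count (24)
print(np.round(P, 3))
```

With this stand-in image the three matrices agree with the ones given in the text, which suggests that the counting, symmetrization, and normalization steps are interpreted consistently.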

The interested reader can find more details about GLCM in the GLCM Texture Tutorial by Hall-Beyer (2007). However, GLCMs are not suitable for direct interpretation and the large amount of information that they can provide can be better extracted through additional parameters. Haralick et al. (1973) introduced 14 different measures of textural features from the input GLCMs. Contrast, energy, entropy, and homogeneity are the most common attributes used in geophysical interpretation.

2.1 Contrast

Contrast measures the amount of local variation present in an image. Mathematically, it is given by

$$\sum\limits_{i} {\sum\limits_{j} {(i - j)^{2} P_{i,j} } } ,$$

where \(P_{i,j}\) denotes the element in the ith row and the jth column of the normalized GLCM.

The value of contrast is high in areas with large local amplitude variations.

2.2 Energy

Energy measures the textural uniformity of an image. Mathematically, it is given by

$$\sum\limits_{i} {\sum\limits_{j} {P_{i,j}^{2} .} }$$

The value of energy is low when all elements in the GLCM are equal and high when the occurrences are concentrated in a few elements.

2.3 Entropy

Entropy measures the disorder or complexity of the image. Mathematically, it is given by

$$- \sum\limits_{i} {\sum\limits_{j} {P_{i,j} \log P_{i,j} } } .$$

The value of entropy is large for images that are not texturally uniform. In such cases, many GLCM elements have small and nearly equal values.

2.4 Homogeneity

Homogeneity measures the overall smoothness of an image. Mathematically, it is given by

$$\sum\limits_{i} {\sum\limits_{j} {\frac{{P_{i,j} }}{{1 + (i - j)^{2} }}} } .$$

It quantifies the similarity of neighboring pixels and reaches high values when most of the occurrences in the GLCM are concentrated near the main diagonal.
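As an illustration, the four attributes can be evaluated on the normalized matrix P of the worked example in Sect. 2. The sketch below is only one possible reading of the formulas: the function name `texture_attributes` is hypothetical, the logarithm is assumed to be natural, zero elements are skipped (taking 0 · log 0 as 0), and entropy carries the leading minus sign of Haralick et al. (1973) so that it is non-negative.

```python
import numpy as np

def texture_attributes(P):
    """Contrast, energy, entropy, and homogeneity of a normalized GLCM."""
    P = np.asarray(P, dtype=float)
    i, j = np.indices(P.shape)      # row and column index grids
    diff2 = (i - j) ** 2            # squared distance from the main diagonal
    nonzero = P > 0                 # skip zeros: 0 * log(0) is taken as 0
    return {
        "contrast": float(np.sum(diff2 * P)),
        "energy": float(np.sum(P ** 2)),
        "entropy": float(-np.sum(P[nonzero] * np.log(P[nonzero]))),
        "homogeneity": float(np.sum(P / (1.0 + diff2))),
    }

# Symmetrical GLCM of the worked example, normalized by its sum (24).
glcm_sym = np.array([[2, 3, 0, 0],
                     [3, 0, 4, 0],
                     [0, 4, 2, 2],
                     [0, 0, 2, 2]], dtype=float)
P = glcm_sym / glcm_sym.sum()
attrs = texture_attributes(P)
print(attrs)
```

For this matrix, all off-diagonal occurrences lie one step from the main diagonal, so the sketch gives a contrast of 0.75 and a homogeneity of 0.625; the energy is 70/576 (about 0.12).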

For seismic/GPR applications, the GLCM with dimensions n × n, where n is the number of gray levels, is calculated by a sliding window procedure and normalized to approximate co-occurrence probability, to capture the amplitude response of the reference position in the rows and the amplitudes of the neighboring position in the columns. The number of gray levels depends on the dynamic range, and is normally a power of two (Gao 2003). Another important factor used in the calculations is the size of the window, which depends on the dominant frequency of the data and should typically cover approximately one average period of the signals.
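The sliding window procedure can be sketched as follows for a 2-D section stored as a time-by-trace array. This is a simplified illustration under stated assumptions, not the workflow used in this study: amplitudes are quantized linearly between the minimum and maximum of the section, only the horizontal direction with d = 1 is used, the attribute is energy, and the names `quantize`, `window_glcm`, `sliding_energy`, `half_t`, and `half_x` are hypothetical.

```python
import numpy as np

def quantize(section, levels):
    """Map amplitudes linearly onto gray levels 0..levels-1 (assumed scheme)."""
    lo, hi = section.min(), section.max()
    if hi == lo:
        return np.zeros(section.shape, dtype=int)
    scaled = (section - lo) / (hi - lo)
    return np.minimum((scaled * levels).astype(int), levels - 1)

def window_glcm(win, levels):
    """Symmetric, normalized GLCM of one window (horizontal pairs, d = 1)."""
    ref = win[:, :-1].ravel()
    neigh = win[:, 1:].ravel()
    counts = np.zeros((levels, levels))
    np.add.at(counts, (ref, neigh), 1)
    counts = counts + counts.T
    total = counts.sum()
    return counts / total if total > 0 else counts

def sliding_energy(section, levels=9, half_t=3, half_x=3):
    """Energy attribute at every sample with a full window around it."""
    q = quantize(section, levels)
    nt, nx = q.shape
    out = np.full((nt, nx), np.nan)   # edges without a full window stay NaN
    for it in range(half_t, nt - half_t):
        for ix in range(half_x, nx - half_x):
            win = q[it - half_t:it + half_t + 1, ix - half_x:ix + half_x + 1]
            P = window_glcm(win, levels)
            out[it, ix] = np.sum(P ** 2)
    return out

# Synthetic section (random numbers), used only to exercise the functions.
rng = np.random.default_rng(0)
section = rng.normal(size=(20, 20))
energy = sliding_energy(section, levels=9, half_t=3, half_x=3)
```

In practice the half-widths would be chosen from the sampling interval and the dominant period of the data, and the energy line could be replaced by any of the other attributes.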

3 Case Study I: Texture Analysis of GPR Data to Characterize Archaeological Features Linked to Geological Units

Case study I focuses on a 2-D archaeological GPR survey performed in the San Vincenzo village, Stromboli Island, Italy, with a ProEx system (Malå Geoscience) equipped with 500 MHz central-frequency antennas. A rough topographic surface and geometrically irregular archaeological remnants characterize the site. In 1980, archaeologists found the first evidence of a prehistoric village (Cavalier 1981), which belongs to the Capo Graziano culture and dates back to the Early Middle Bronze Age (about 2300–1400 B.C.). The site has been inhabited since then and exhibits a superposition of cultural layers, with archaeological remains that belong to different periods, including Classical, Hellenistic, and Contemporary. In addition, an alternation of ashes, lava flows, and loose volcaniclastic debris covers the target zone (Kokelaar and Romagnoli 1995; Rosi et al. 2000), as Stromboli is known worldwide for its peculiar, persistent explosive volcanic activity. The goal of the GPR survey is to provide new information about the buried cultural heritage in the 0.5–2 m depth range. The superposition of different cultural levels and the overall low signal-to-noise ratio of the GPR data make the site a challenging test for methodological validation.

Figure 3 shows two parallel processed GPR profiles separated by 5 m. Several high-amplitude GPR reflections/diffractions can be identified in the shallow part, which may be associated with potential archaeological features. The subsurface is characterized by geological layers with complicated morphology, since the soil at the test site is composed of pyroclastic products with local alluvial and colluvial sediments and lava blocks. The texture analysis of the GPR data aims at improving the interpretation and, in particular, at discriminating between archaeological and geological targets and possibly providing a classification of very elusive and discontinuous archaeological structures.

Fig. 3 Case study I: examples of 2-D processed 500 MHz GPR profiles. a Profile I; b Profile II

Figure 4a–c shows the results of the energy calculation using 4, 9, and 16 gray levels, respectively. A higher number of gray levels may in theory improve the quality of the GLCM output, but according to Chopra and Marfurt (2007), significant differences in the computed quantities are hardly observed in applications to seismic datasets. In our case, the differences among texture attributes calculated with different numbers of gray levels are large up to the limit of 16 levels (Fig. 4); the number of gray levels also affects resolution and must be carefully selected to optimize the results in the depth window of interest. High values of energy in Fig. 4 indicate high-amplitude continuous reflections, associated with potential archaeological features. The archaeological targets of the San Vincenzo site are located in the 0.5–2 m depth range. After extensive tests concentrated on this depth range, we selected nine gray levels for texture analysis. Moreover, we selected a 6 ns time window, as the optimal window size depends on the dominant frequency of the data of interest and should typically cover approximately the average GPR wavelet length.

Fig. 4 Case study I: results of energy calculated on Profile I with different numbers of gray levels. The number is equal to 4 (a), 9 (b) and 16 (c)

Figure 5 shows contrast (a, b), entropy (c, d), homogeneity (e, f), and schematic archaeological interpretations (g, h) generated from the GPR data of Profile I (a, c, e) and Profile II (b, d, f). Features associated with potential buried cultural heritage are highlighted by high contrast, high entropy, and low homogeneity, and labeled as zones I-A, I-B, I-C, and I-M. The information thus obtained is much clearer than in the amplitude data of Fig. 3, and target identification is simpler based on patterns with distinct characteristics. Localized high values of homogeneity in zone I-B (see the black dotted lines in Fig. 5e) can be interpreted as possible interruptions in the continuity of archaeological structures. Geological interfaces are also visualized more clearly by homogeneity characteristics, indicated by turquoise and orange arrows and lines (Fig. 5e). Contrast is less effective than homogeneity in highlighting targets, possibly because of the complicated topography. The archaeological team validated the GPR results by excavating trenches located on the basis of the GPR results. Trench I and Trench II (Fig. 5e, f) revealed Bronze Age remnants buried at about 1.5 m and building materials of the Classical Period buried at about 0.4 m. Notably, the correlation between textural features and actual subsurface structures allows the identification of different sub-volumes that can be linked to specific cultural layers. As an example, zones I-A and I-M (Fig. 5e, f; Profiles I and II, respectively) exhibit similar textural characteristics and are linked to archaeological features of the same historic period. Detailed interpretation results are displayed in Fig. 5g, h.

Fig. 5 Case study I: results of 2-D texture analysis and archaeological interpretations on Profile I and Profile II. a, b Contrast calculated on Profiles I/II; c, d entropy calculated on Profiles I/II; e, f homogeneity calculated on Profiles I/II; g, h archaeological interpretations on Profiles I/II

4 Case Study II: Texture Analysis of GPR Data to Characterize Archaeological Features in Homogeneous Subsurface Conditions

Case study II is based on a common-offset 2.5-D GPR survey performed in an area close to the church of San Giovanni in Duino (NE Italy). We collected the data with a Zond Groundtracer GPR equipped with 300 MHz central-frequency antennas and acquired 8 profiles along a regular grid with 1 m line spacing and 0.05 m trace distance. The objective of the survey was to image features of potential archaeological interest at depths not larger than 2 m, in support of the design and planning of possible archaeological excavations. The surface topography is almost flat and the subsurface conditions are not as complicated and challenging as in the previous case study, but a shallow water table (at about 1–2 m below the surface) is responsible for high attenuation at larger depths.

The interpretation of GPR data for archaeological prospection starts from the analysis of individual 2-D profiles to evaluate the characteristics of the records and to perform a preliminary identification and classification of signals and targets. Figure 6a shows an example of a processed GPR section that highlights the main characteristics of the data from the proposed archaeological test site. Archaeological targets are here associated mainly with high-amplitude GPR reflections and diffractions: the white rectangular frame highlights the characteristic response obtained from the buried archaeological remains in this area, and the signal is totally attenuated below 30 ns (i.e., about 1.2 m depth, assuming a velocity of about 0.08 m/ns). We applied texture analysis to the processed section and obtained the energy, entropy, and homogeneity attributes (Fig. 6b–d, respectively). Figure 6b shows low energy associated with the potential targets. Low textural energy corresponds to high entropy and low homogeneity.
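The depth quoted for the 30 ns limit follows from the usual conversion of two-way travel time to depth. The relation below is a simplified illustration that assumes a constant velocity along the whole ray path; the symbols z (depth), v (velocity), and t (two-way travel time) are introduced here only for this purpose:

$$z = \frac{v\,t}{2} = \frac{0.08\ {\text{m/ns}} \times 30\ {\text{ns}}}{2} = 1.2\ {\text{m}}.$$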

Fig. 6 Case study II: example of a processed 300 MHz GPR profile and results of 2-D texture analysis. a Amplitude profile; b energy; c entropy; and d homogeneity

To map the potential archaeological targets in terms of extension, shape, and depth of the imaged features, 3-D (or 2.5-D) data analysis is crucial. Figure 7a shows the amplitude time slice of the GPR data volume at 30 ns. A clear circular feature with a diameter of approximately 10 m shows up in the image and is highlighted by the white dotted line. It is compatible with the remains of an ancient building (namely a baptistery) adjacent to the church. We further performed texture analysis of GPR time slices. Textural energy and entropy are displayed in Fig. 7b, c. The circular structure is apparent in all time slices, but its border seems better defined by the entropy (Fig. 7c). Based on the GPR results, preliminary excavations were performed, which uncovered the foundations of a circular wall about 60 cm wide (Fig. 7d). As shown by the photograph in Fig. 7d, the water table was at a depth of about 1.2 m during the archaeological excavation, thus preventing the digging of a deeper trench.

Fig. 7 Case study II: time slices of the 3-D GPR data volume at 30 ns and results of the preliminary archaeological excavations. a Amplitude slice; b energy slice; c entropy slice; and d photograph of the excavations where a clear circular wall (W) is highlighted by the white dotted line as well as the water table

5 Case Study III: Texture Analysis of GPR Data to Characterize Superimposed Archaeological Features

Case study III is based on a 2.5-D GPR survey in the area of the Aquileia Archaeological Park (NE Italy). We performed the survey with a GPR system equipped with 250 MHz central-frequency antennas, and we acquired 40 profiles along a regular grid with 0.5 m line spacing and 0.05 m trace distance. The ancient Roman town of Aquileia was founded in the second century B.C. and rapidly became one of the most important fluvial harbors of the Roman Empire (Tavano 1986). It was first razed to the ground by Attila in the fifth century and subsequently abandoned for approximately 250 years before the beginning of the ninth century. During this period, the whole area evolved into a marsh due to an uncontrolled water supply from previously canalized watercourses and possible tectonic effects. This test site is characterized by several interconnected or superimposed foundations and remains of walls in the 0.5–4.0 m depth range. Chaotic subsurface conditions are due to a mixture of debris and sediments ranging from sand to clay in the shallow layer, which is responsible for the often poor signal-to-noise ratios and the high attenuation of the GPR records. The surface topography is almost flat, so topographic corrections were not mandatory.

Figure 8a shows a processed GPR profile, selected from the 2.5-D grid, which provides a typical example of the conditions of the study area. In this section, radar signals that can be linked to potential archaeological targets are observed in the time range between 10 ns and about 70 ns, with several superimposed or interconnected features. Such conditions make it difficult to correlate GPR signals with the corresponding subsurface structures on the basis of amplitude profiles alone.

Fig. 8 Case study III: example of a processed 250 MHz GPR profile and results of 2-D texture analysis. a Amplitude profile; b contrast; c energy; d entropy; and e homogeneity

Texture analysis can distinguish features via their shape or their structural characteristics. Figure 8b–e shows contrast, energy, entropy, and homogeneity calculated on the previous 2-D GPR section, respectively. From these pictures, we can see that the textural attributes highlight locally homogeneous characteristics and allow a subdivision of the section in three zones, i.e., zones III-A, B, and C. Figure 9 shows the histograms of amplitude and the corresponding textural attributes within the three zones, while Table 1 summarizes the results of the qualitative analysis of such distributions.

Fig. 9 Case study III: histograms of amplitude and corresponding textural attributes. a–e Histograms of amplitude, contrast, energy, entropy, and homogeneity of zone III-A; f–j histograms of amplitude, contrast, energy, entropy, and homogeneity of zone III-B; and k–o histograms of amplitude, contrast, energy, entropy, and homogeneity of zone III-C

Table 1 Results of the analysis of the distributions of the different parameters from Fig. 9a–o

We observe a strong attenuation of the electromagnetic waves between zone III-A and zone III-B, which seems associated with a variation in water saturation of homogeneous sediments, based on the results of archaeological excavation. The high-amplitude continuous reflections have relatively high contrast, low energy, high entropy, and low homogeneity (see the zone III-A1 in Fig. 8), while the low-amplitude discontinuous reflections have low contrast, high energy, low entropy, and high homogeneity (see e.g. the zone III-A2 in Fig. 8). Low-frequency high-amplitude anomalies generally exhibit low contrast, high energy, low entropy, and high homogeneity (e.g., the zone III-B1 in the yellow rectangular frames in Fig. 8, associated with geometrically coherent anomalies). Cross plots of amplitude-energy, amplitude-entropy, and energy-entropy provide a useful correlation tool to establish relationships among amplitude and textural attributes in the two zones III-A1 and III-B1 (Fig. 10). Such results better characterize the different zones and confirm the previous qualitative analysis.
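A cross plot of two attributes inside a zone can be complemented by a simple correlation coefficient. The sketch below is an illustrative assumption rather than the analysis performed in this study: zones are taken as rectangular index ranges of a time-by-trace array, the Pearson coefficient is used as the measure, and the names `zone_values` and `cross_plot_correlation` are hypothetical.

```python
import numpy as np

def zone_values(attribute, t_range, x_range):
    """Flatten the samples of a 2-D attribute section inside a rectangle."""
    t0, t1 = t_range
    x0, x1 = x_range
    return attribute[t0:t1, x0:x1].ravel()

def cross_plot_correlation(attr_a, attr_b, t_range, x_range):
    """Pearson correlation of two attributes sampled in the same zone."""
    a = zone_values(attr_a, t_range, x_range)
    b = zone_values(attr_b, t_range, x_range)
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic attributes with a built-in inverse relation (not field data).
rng = np.random.default_rng(1)
energy = rng.random((50, 40))
entropy = 1.0 - 2.0 * energy
r = cross_plot_correlation(energy, entropy, t_range=(10, 30), x_range=(5, 25))
```

The pairs returned by `zone_values` are also the points that would be drawn in the corresponding cross plot.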

Fig. 10 Case study III: example of two-dimensional cross plots within the defined zones III-A1 and III-B1. a–c Amplitude-energy, amplitude-entropy, and energy-entropy-based data extraction from III-A1; d–f amplitude-energy, amplitude-entropy, and energy-entropy-based data extraction from III-B1

The extension of the analysis to the 2.5-D GPR grid helps in understanding the spatial extent and shape of the potential targets. Figure 11a, b shows energy and homogeneity along a time slice at 50 ns. Some of the anomalies in the slices may correspond to buried remains, but the interpretation is not straightforward. Therefore, we propose composite displays to obtain a better characterization of subsurface targets and zones.

Fig. 11 Case study III: 3-D results of energy (a) and homogeneity (b)

Figure 12a shows the amplitude time slice at the same position: the signal-to-noise ratio is rather low and the identification of anomalies is difficult, because potential buried remains lack a clear geometrical coherence. By contrast, the instantaneous amplitude slice calculated at 50 ns images several geometrically coherent anomalies more clearly. Overlays of homogeneity on instantaneous amplitude and mixed displays of energy and instantaneous amplitude (Fig. 12b, c, respectively) provide more reliable information and emphasize the lateral limits of the subsurface structures. These combinations help highlight potential targets and anomalous zones of the image, and allow the identification of different targets. From the archaeological point of view, the blocks marked by green triangles in Fig. 12 can be associated with building foundations; the fairly continuous features marked by white triangles are probably related to wall remains; and the striped features marked by cyan triangles can be realistically interpreted as ancient roads/passageways, because they are approximately 5 m wide and exhibit the typical pattern of Roman roads. Detailed interpretation results of the main structures are displayed in Fig. 12d. Such features are hardly visible in the amplitude time slice (Fig. 12a), while they are quite apparent on the composite attribute displays (Fig. 12b, c).
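The text does not specify how the instantaneous amplitude or the mixed displays were computed, so the following Python sketch is only one plausible reading. It assumes that the instantaneous amplitude is the modulus of the analytic signal of each trace, obtained here with an FFT-based Hilbert transform, and that a mixed display is a weighted sum of two slices rescaled to the range 0 to 1. The names `instantaneous_amplitude`, `rescale`, `mixed_display`, and `weight` are hypothetical.

```python
import numpy as np

def instantaneous_amplitude(section):
    """Envelope of each trace of a 2-D (time x trace) section (time = axis 0)."""
    n = section.shape[0]
    spectrum = np.fft.fft(section, axis=0)
    # Keep DC (and Nyquist), double positive frequencies, drop negative ones.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(spectrum * weights[:, None], axis=0)
    return np.abs(analytic)

def rescale(values):
    """Rescale an array linearly to the range 0..1, ignoring NaN values."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.nanmin(values), np.nanmax(values)
    if hi == lo:
        return np.zeros_like(values)
    return (values - lo) / (hi - lo)

def mixed_display(attribute_slice, envelope_slice, weight=0.5):
    """Weighted sum of a rescaled attribute slice and a rescaled envelope slice."""
    return weight * rescale(attribute_slice) + (1.0 - weight) * rescale(envelope_slice)

# Synthetic check: a pure cosine with an integer number of cycles has a flat
# envelope equal to its amplitude.
t = np.arange(64)
section = np.cos(2 * np.pi * 4 * t / 64)[:, None] * np.ones((1, 3))
envelope = instantaneous_amplitude(section)
```

In a 2.5-D workflow, the envelope would be computed trace by trace before the time slice is extracted, and the weight would be tuned visually for each display.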

Fig. 12 Case study III: comparison between amplitude time-slice (a), homogeneity overlaid on instantaneous amplitude slice (b), energy mixed with instantaneous amplitude slice (c) at 50 ns, and archaeological interpretations of main structures (d)

6 Discussions and Conclusions

The applicability and the effectiveness of texture attribute analysis to GPR datasets in archaeological prospection are demonstrated by the proposed case studies. The sites selected to test the method, namely San Vincenzo, Duino, and Aquileia, are three archaeological sites with different characteristics of buried targets and target/background contrasts, as well as topographic conditions. The analysis of textural attributes highlights characteristic differences that allow discrimination and classification of specific reflectors and zones/volumes of the radar dataset. All of the four attributes based on GLCM can provide a clear picture of the location of different potential targets, and show excellent discrimination between the different reflection patterns of potential targets and background. Combinations of textural attributes can improve identification and interpretation of coherent anomalies associated with continuous geological interfaces or potential archaeological targets.

The number of gray levels (n) is a crucial parameter and requires a test and selection phase to properly image subsurface features. The resolution of texture images is also constrained by the dimensions of the analysis window (w). These two parameters (n, w) depend on the resolution and characteristics of the data and require preliminary tests to optimize the results and to improve the detection of patterns of interest.

Composite displays, such as overlays and mixed displays, are crucial in complicated conditions (e.g., low S/N, low contrast) to extract more information and to enhance target detection, particularly if pseudo 3-D data are available. The combination of textural attributes with conventional ones can provide further improvements in detection, classification, and discrimination of signals in GPR records in a well-constrained, automatic, and objective manner. The main advantages of the combination are linked to the different sensitivities of the different attributes to particular components of the recorded wave field (see case study III). Furthermore, neural network techniques are often used for statistical analysis and data classification (Van der Baan and Jutten 2000), as well as to combine attributes related to different physical parameters (Forte et al. 2012). The analysis ability of the network is stored in the connection weights, obtained by a process of adaptation to, or learning from, a set of examples (i.e., different attributes for multi-attribute analysis). Computationally, the connectivity of the nodes within a general neural network can efficiently perform pattern classification (West et al. 2002). Ongoing developments in neural networks to classify textural attributes could provide quantitative measurements for facies classification, thereby reducing the overall level of subjectivity in the analysis of composite displays and making the multi-attribute approach even more attractive.