Introduction

Regional geochemical maps are often interpolated from point data, which are usually sampled in surficial media such as stream or lake sediments. These data may contain large amounts of information that is critical for environmental studies and geochemical exploration (e.g., Xu and Cheng 2001). The spatial distribution of elements in a given geological–geochemical environment is the end product of human activities and geological processes such as volcanic or intrusive activities, sedimentary processes, tectonism, metamorphism, and mineralization. Characterizing geochemical patterns and delimiting geochemical anomalies are crucial tasks in environmental studies and mineral exploration. Conventional methods for these tasks are limited, because they are generally based on the frequency distribution of sampled values. These approaches process geochemical data with a variety of techniques, including the calculation of threshold values corresponding to the mean plus twice the standard deviation (Hawkes and Webb 1962), probability graphs (Sinclair 1974), exploratory data analysis (Tukey 1977; Behrens 1997; Reimann 2005a, b; Carranza 2010), and multivariate statistics (e.g., Zuo 2011a, b; Zuo et al. 2009a, b, 2013a, b; Yousefi et al. 2012, 2014), each of which may be limited depending on the type of elemental distribution. Such techniques have been extensively and, in some cases, successfully used in geochemical data processing, especially for the separation of geochemical anomalies from the background. However, most conventional methods assume that concentrations of geochemical elements in the crust follow a normal or log-normal distribution (e.g., Davis 2002; Li et al. 2003), with some requiring the samples to be collected uniformly over the region of interest (Cheng et al. 1994). In addition, the fundamental geological assumption for these methods is that the populations generated by different geological processes are statistically distinguishable (Xu and Cheng 2001). Furthermore, conventional methods based on frequency distributions often ignore spatial variations in the geochemical data, even though these can provide valuable information for mineral exploration and environmental studies.

Allègre and Lewin (1995) demonstrated that the ordinary distribution of trace elements can be normal, non-normal, or multimodal. Since fractals were first applied to the field of geochemistry in the early 1990s (e.g., Bölviken et al. 1992), research on fractal/multifractal models has shown that most geological processes, such as surficial weathering and erosion, generate scale-invariant patterns (Lavallee et al. 1993). This applies to surficial elemental concentrations (e.g., Cheng et al. 1994, 1999, 2000; Xu and Cheng 2001; Li et al. 2003; Lima et al. 2003; Cheng 2007; Zuo and Cheng 2008; Zuo et al. 2009a, b, 2013a, b, 2015; Zuo 2011a, b; Afzal et al. 2010, 2011, 2012, 2013; Agterberg 2012; Arias et al. 2012; Mehran Heidari et al. 2013; Wang et al. 2015).

The properties that could potentially be used to differentiate between distinctive populations of geochemical data include the geochemical value frequency, spatial variability of geochemical values, geometrical characteristics of anomalies, and scaling properties of a geochemical anomaly (e.g., Cheng et al. 1994, 1996, 1997, 2000; Xu and Cheng 2001; Li et al. 2003; Lima et al. 2003; Afzal et al. 2010, 2011, 2012, 2013; Agterberg 2012; Zuo et al. 2015). There is no doubt that the most effective way to identify geochemical anomalies or quantify the characteristics of geochemical patterns is to adopt a comprehensive technique that combines the properties mentioned above. Over the past few decades, many attempts have been made to develop such models. For instance, nonlinear models based on fractal/multifractal theory are widely acknowledged as powerful tools, e.g., the grade-tonnage model (Turcotte 2002), size-frequency distribution analyses of giant mineral deposits (Agterberg 1996), the concentration-area model (C-A: Cheng et al. 1994), the spectrum-area model (S-A: Cheng et al. 2000), the concentration-distance model (C-D: Li et al. 2003), singularity analysis (Cheng 2007; Cheng and Agterberg 2009), and the concentration-volume method (C-V: Afzal et al. 2011). These methods are gradually being adopted as an effective and efficient means of decomposing geochemical patterns into different components (e.g., Cheng and Agterberg 2009; Zuo 2011a; Zuo and Cheng 2008; Zuo and Xia 2009; Zuo et al. 2009a, b, 2013a, b, 2015; Agterberg 2012).

In this study, we present a simple MATLAB-based program for processing geochemical data by means of fractal/multifractal modeling. Our program has two main functions. The first quantifies the spatial distribution characteristics of geochemical patterns using the multifractal spectrum, and the second identifies geochemical anomalies using C-A, S-A, and singularity analysis. The applicability of this program is demonstrated by processing a soil geochemical dataset from Inner Mongolia, China.

Multifractal spectrum

Multifractals are spatially intertwined fractals with a continuous spectrum of fractal dimensions (e.g., Cheng and Agterberg 1996). The basic concepts involved in a multifractal model include the partition function \( \chi_q(\varepsilon) \), mass exponent τ(q), singularity exponent α(q), and the multifractal spectrum f(α). Let \( \mu_i(\varepsilon) \) be the total amount of a measure μ (or total concentration of an element) in the ith cell of linear size ε. The partition function \( \chi_q(\varepsilon) \) can then be defined as (Evertsz and Mandelbrot 1992):

$$ {\chi}_q\left(\varepsilon \right)={\displaystyle \sum_{i=1}^{N\left(\varepsilon \right)}{\mu_i}^q\left(\varepsilon \right)}, $$
(1)

where N(ε) is the total number of cells of size ε. If the distribution of \( \mu_i(\varepsilon) \) is multifractal, the partition function \( \chi_q(\varepsilon) \) has a simple power-law relation with the cell size ε for q with −∞ ≤ q ≤ +∞, or

$$ {\chi}_q\left(\varepsilon \right)\propto {\varepsilon}^{\tau \left(q\right)}, $$
(2)

where ∝ represents proportionality, and τ(q) is the mass exponent of order q. The singularity exponent α(q) and the multifractal spectrum can be obtained by a Legendre transformation (Evertsz and Mandelbrot 1992):

$$ \alpha (q)=\frac{d\tau (q)}{dq}, $$
(3)
$$ f\left[\alpha (q)\right]=\alpha (q)q-\tau (q). $$
(4)

Further details on multifractal models and the moment method can be found in Cheng and Agterberg (1996) and Evertsz and Mandelbrot (1992). The moment method is used to create the multifractal spectrum. The specific process consists of the following steps:

  (1) Generate a grid map by interpolating the original geochemical sample data;

  (2) Define a series of square boxes with edge sizes \( \varepsilon_i \) and a series of moment values \( q_i \);

  (3) Obtain the total amount of the measure \( \mu_i(\varepsilon) \) in the ith cell of linear size ε as

$$ {\mu}_i\left(\varepsilon \right)={c}_i\times {\varepsilon}^2, $$
(5)

  where \( c_i \) is the element concentration in the ith cell;

  (4) Calculate the mass partition function of order q using \( {\chi}_q\left(\varepsilon \right)={\displaystyle \sum_{i=1}^{N\left(\varepsilon \right)}{\mu_i}^q\left(\varepsilon \right)} \);

  (5) If \( \mu_i(\varepsilon) \) follows a multifractal distribution, a power-law relationship exists between \( \chi_q(\varepsilon) \) and ε, i.e., \( \chi_q(\varepsilon) \propto \varepsilon^{\tau(q)} \), so τ(q) can be estimated as the slope of a least-squares line on a log-log plot of \( \chi_q(\varepsilon) \) against ε;

  (6) Obtain the singularity exponent α and the multifractal spectrum f(α) by a Legendre transformation: \( \alpha(q) = d\tau(q)/dq \), \( f[\alpha(q)] = \alpha(q)q - \tau(q) \).
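The moment-method steps can be sketched in a few lines of Python. This is an illustrative sketch only, not the MATLAB program described in this paper. The function names (`ls_slope`, `box_measures`, `multifractal_spectrum`) are hypothetical. Normalising the box measures by the grid total, discarding incomplete boxes at the grid edge, and differentiating τ(q) by finite differences are all assumptions made here for simplicity.

```python
import math

def ls_slope(x, y):
    """Slope of the least-squares line through the points (x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def box_measures(grid, eps, total):
    """Measure mu_i(eps) of every full eps-by-eps box, divided by the grid total.

    Assumption: unit cell area, so the box measure is the sum of its cell values.
    """
    rows, cols = len(grid), len(grid[0])
    measures = []
    for r0 in range(0, rows - eps + 1, eps):
        for c0 in range(0, cols - eps + 1, eps):
            box = sum(grid[r][c]
                      for r in range(r0, r0 + eps)
                      for c in range(c0, c0 + eps))
            measures.append(box / total)
    return measures

def multifractal_spectrum(grid, scales, moments):
    """Return (tau, alpha, f) for each moment q, following Eqs. (1)-(4)."""
    total = sum(sum(row) for row in grid)
    log_eps = [math.log(e) for e in scales]
    tau = []
    for q in moments:
        # Partition function chi_q(eps) = sum_i mu_i(eps)^q, Eq. (1);
        # empty boxes are skipped so that negative moments stay finite.
        log_chi = [math.log(sum(m ** q for m in box_measures(grid, e, total) if m > 0))
                   for e in scales]
        # tau(q) is the log-log slope of chi_q(eps) against eps, Eq. (2).
        tau.append(ls_slope(log_eps, log_chi))
    # alpha(q) = d tau / dq, Eq. (3): central differences, one-sided at the ends.
    alpha = []
    last = len(moments) - 1
    for k in range(len(moments)):
        lo, hi = max(k - 1, 0), min(k + 1, last)
        alpha.append((tau[hi] - tau[lo]) / (moments[hi] - moments[lo]))
    # f(alpha) = alpha * q - tau(q), Eq. (4).
    f = [a * q - t for a, q, t in zip(alpha, moments, tau)]
    return tau, alpha, f
```

For a spatially uniform grid the sketch returns τ(q) = 2(q − 1) and α = f(α) = 2 for every moment, i.e., the spectrum collapses to a single point, which is a convenient sanity check before applying it to real data.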

Concentration-area multifractal model

Cheng et al. (1994) proposed the C-A model to separate geochemical anomalies from the background. This represented the first significant progress in the fractal/multifractal modeling of geochemical data (e.g., Zuo et al. 2012), and is a fundamental technique that is frequently used to model geochemical anomalies (Carranza 2009). The C-A model gives

$$ A\left(\rho \right)\propto {\rho}^{-\beta }, $$
(6)

where A(ρ) denotes the area with concentration values greater than or equal to ρ, ∝ represents proportionality, and β is the fractal dimension. Several straight lines can be fitted on a log-log plot of A(ρ) against ρ by means of the least-squares method, and we can estimate β from the slope of these lines. The values of ρ corresponding to the breaks in these straight lines act as thresholds that separate a geochemical map into areas of high, moderate, and low anomalies (e.g., Cheng et al. 1994; Carranza 2009; Afzal et al. 2010; Arias et al. 2012; Zuo 2011a; Zuo et al. 2013a; Mehran Heidari et al. 2013).
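As a rough illustration of Eq. (6), the following Python sketch computes the concentration-area pairs from a gridded map and fits one straight line per segment on the log-log plot. The function names and the `cell_area` parameter are hypothetical. The breakpoints are supplied by the caller, mirroring the interactive selection in the program, rather than detected automatically.

```python
import math

def concentration_area(grid, thresholds, cell_area=1.0):
    """A(rho): total area of the cells whose value is >= rho, per threshold."""
    values = [v for row in grid for v in row]
    return [cell_area * sum(1 for v in values if v >= rho) for rho in thresholds]

def loglog_slope(x, y):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx

def segment_betas(thresholds, areas, breakpoints):
    """Fit A(rho) ~ rho^(-beta) separately between consecutive breakpoints.

    Returns one beta (the negated log-log slope) per segment.
    """
    edges = [0.0] + sorted(breakpoints) + [math.inf]
    betas = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        pts = [(r, a) for r, a in zip(thresholds, areas) if lo <= r < hi and a > 0]
        if len(pts) < 2:
            raise ValueError("each segment needs at least two points")
        rho, area = zip(*pts)
        betas.append(-loglog_slope(rho, area))
    return betas
```

A call such as `segment_betas(thresholds, concentration_area(grid, thresholds), [rho_break])` then yields one exponent for the background segment and one for the anomalous segment, where `rho_break` stands for a threshold read off the plot.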

Spectrum-area multifractal model

By extending the idea of the C-A model into the frequency domain, Cheng et al. (2000) developed the S-A model to characterize the power spectrum density-area relationship (Zuo and Xia 2009; Zuo et al. 2013a, b). The S-A model can be expressed as:

$$ A\left(\ge E\right)\propto {E}^{-\beta }, $$
(7)

where E represents the power spectrum density as a function of the wave number vector, A(≥E) denotes the area (in units of the wave number) with power spectrum density values greater than or equal to E, and β is an exponent that can be estimated from the slope of a log-log plot of A(≥E) against E. Breakpoints on this plot are subsequently used to construct the filters.

To implement the S-A model and identify geochemical anomalies, we perform the following steps (Xu and Cheng 2001; Zuo 2011a, b; Zuo et al. 2013a, b; Afzal et al. 2012, 2013):

  (1) Generate a grid map by interpolating the original geochemical sampling data;

  (2) Convert the interpolated map from the spatial domain to the Fourier domain using a fast Fourier transform;

  (3) Calculate the power spectrum of the converted map, and form a dataset consisting of the power spectrum density (E) and the area with power spectrum density values greater than or equal to E. Plot these data on a log-log scale;

  (4) Determine breakpoints to divide the data pairs into several segments with different scaling properties, and use these to build up filters;

  (5) Apply the filters to the map in the frequency domain, and transform the filtered map back to the spatial domain using the inverse Fourier transform.

This fractal filtering method enables a geochemical map to be divided into a background map and an anomaly map (Cheng et al. 2010; Zuo 2011a, b, 2012).
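The filtering procedure can be illustrated with a small, dependency-free Python sketch. To stay self-contained it uses a plain discrete Fourier transform written with `cmath` instead of a fast Fourier transform, so it is only practical for small grids. The function names are hypothetical, and treating each wave-number cell as one unit of area is an assumption of this sketch.

```python
import cmath

def dft(seq, inverse=False):
    """Plain 1-D discrete Fourier transform (O(n^2)); a stand-in for an FFT."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = [sum(seq[k] * cmath.exp(sign * 2j * cmath.pi * m * k / n)
               for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out

def dft2(grid, inverse=False):
    """2-D transform: 1-D transforms along rows, then along columns."""
    rows = [dft(list(row), inverse) for row in grid]
    cols = [dft(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def power_spectrum(grid):
    """Power spectrum density E = |F|^2 for every wave-number cell."""
    return [[abs(v) ** 2 for v in row] for row in dft2(grid)]

def spectrum_area(power, thresholds):
    """A(>=E): number of wave-number cells whose power is >= E."""
    values = [v for row in power for v in row]
    return [sum(1 for v in values if v >= e) for e in thresholds]

def filter_by_power(grid, e_low, e_high):
    """Keep only components with e_low <= E < e_high, then invert the transform."""
    spectrum = dft2(grid)
    kept = [[v if e_low <= abs(v) ** 2 < e_high else 0 for v in row]
            for row in spectrum]
    return [[v.real for v in row] for row in dft2(kept, inverse=True)]
```

With two breakpoints E1 < E2 taken from the log-log plot, `filter_by_power(grid, E2, float("inf"))`, `filter_by_power(grid, E1, E2)`, and `filter_by_power(grid, 0, E1)` would give three components. Which power range is read as background, anomaly, or noise depends on the interpretation of the fitted segments.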

Singularity analysis

The concept of a singularity can be used to depict the characteristics of singular processes, i.e., those that result in anomalous energy releases or material accumulations within a narrow spatio-temporal interval (Cheng 2007). From a multifractal perspective, the phenomenon derived from a singular process can be described by the following power-law model:

$$ C\left(\varepsilon \right)\propto {\varepsilon}^{\alpha -E}, $$
(8)

where ε is a normalized distance measure such as the block cell edge, E is the Euclidean dimension, C(ε) represents the element concentration in a space whose characteristic scale is ε, and α is the exponent of the power-law relationship. This exponent is the singularity index, which characterizes how the element concentration varies in the defined space with respect to ε. When α < E, we say there is abnormal enrichment of the element concentration, whereas α > E indicates a depletion of the element concentration. The case α ≈ E represents a nonsingular location.

To estimate the singularity index from a geochemical map, Cheng (2007) proposed a window-based approach that formed the original singularity analysis method. The algorithm can be described as follows. First, given a location on the geochemical map, we define a series of sliding windows \( A(r_i) \) (square, circular, rectangular, or other shapes) with variable window sizes \( r_{\min} = r_1 < r_2 < \dots < r_n = r_{\max} \). Next, the average element concentration \( C[A(r_i)] \) is calculated for each window size (this is equal to the sum of all the cells’ concentrations divided by the total number of cells within the window). We plot \( C[A(r_i)] \) against \( r_i \) on a log-log plot, and then obtain the linear relationship:

$$ \log C\left[A\left({r}_i\right)\right]=C+\left(\alpha -2\right) \log {r}_i. $$
(9)

Using this relationship, the singularity index α at the current location can be estimated as the slope plus 2. We repeat these steps to create a singularity distribution map, and then delineate areas of geochemical anomalies on the basis of the singularity index.
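The window-based estimate of Eq. (9) can be sketched as follows for square windows. The function names are hypothetical. Clipping windows at the map edge, taking the window edge length 2h + 1 (in cells) as \( r_i \), and requiring positive window means are all assumptions of this sketch.

```python
import math

def window_mean(grid, r, c, half):
    """Mean value in the square window of half-width `half` centred on (r, c).

    Cells falling outside the map are ignored (the window is clipped).
    """
    rows, cols = len(grid), len(grid[0])
    vals = [grid[i][j]
            for i in range(max(r - half, 0), min(r + half + 1, rows))
            for j in range(max(c - half, 0), min(c + half + 1, cols))]
    return sum(vals) / len(vals)

def singularity_index(grid, r, c, half_widths):
    """Estimate alpha at (r, c) as the log-log slope of Eq. (9) plus 2."""
    x = [math.log(2 * h + 1) for h in half_widths]                   # log r_i
    y = [math.log(window_mean(grid, r, c, h)) for h in half_widths]  # log C[A(r_i)]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope + 2.0

def singularity_map(grid, half_widths):
    """Apply singularity_index to every cell of the grid."""
    return [[singularity_index(grid, r, c, half_widths)
             for c in range(len(grid[0]))]
            for r in range(len(grid))]
```

On a flat map the mean does not change with window size, so the slope is zero and α = 2. An isolated high value makes the mean fall as the window grows, giving α < 2 (enrichment), and an isolated low value gives α > 2 (depletion).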

Program description

The software was written in MATLAB 2012a. The program uses a graphical user interface (Fig. 1), and the overall structure and operation of the software are summarized in Fig. 2. We now describe the various parameters and options in each model, and identify several problems with parameter selection.

Fig. 1 Anomaly identification system (AIS) graphical user interface

Fig. 2 Overall structure and operating procedure of AIS

The program has the following menus:

  (1) File. This menu contains three submenus: Set path, Open, and Exit. When the program is opened, only the File and Help menus are active; the other menus are grayed out because their features are not yet available. Users must first set the path in which the results of further operations will be saved, and all menus are activated once the user opens the geochemical data to be processed. The program can process both ASCII data and point vector data in the default “.xls” format. For the point vector data, columns one and two represent the X and Y coordinates, respectively, and the other columns represent element concentrations. The first row of the .xls file should contain the characters X, Y and the abbreviated geochemical element names. For the ASCII data, the six header rows document the size of the data matrix (e.g., N columns and M rows), the coordinates of the lower-left corner (e.g., xllcorner = 30 and yllcorner = −20), the cell size, and a flag for data voids; these are followed by the values arranged as a data matrix.

  (2) Preprocessing. This menu consists of two submenus, Spatial interpolation and Descriptive statistics. Spatial interpolation is only available for point vector data, and refers to common interpolation methods (e.g., the inverse distance weighted method). It is used to convert original sampling point data into a grid map. The other option calculates the statistical properties of the input data, such as the mean, standard deviation, median, kurtosis, and so on. Certain graphical outputs are also available, i.e., histogram, cumulative histogram, boxplot, and quantile-quantile (Q-Q) plot.

  (3) Anomaly identification. One of the core functions of this software, this menu contains three submenus: Singularity analysis, Spectrum-area model, and Concentration-area model. Singularity analysis executes the singularity mapping technique. It has two parameters, the increment of the window radius and the number of windows. By default, the minimum window size is 3 × 3, so if the increment of the window radius is set to 1, the next window size is 5 × 5; the number of windows is self-explanatory. There are no fixed rules for selecting values for these two parameters. When the Spectrum-area model is selected, the user must first specify the number of points in the X direction, which represent the power spectrum density. This number must not be too small, otherwise the inflection points will not be revealed, nor too large, otherwise the computation time will be excessive. It also cannot exceed the total number of input data points. Once a value has been entered, the log-log plot of the area against the spectrum density is produced, with a progress bar monitoring the computation. Users can then choose the number of breakpoints by means of toolbar buttons, and place the split lines by clicking the left mouse button within the plot axes. The number of breakpoints to select depends mainly on the shape of the curve and on practical demands. The exact abscissas of the breakpoints and the goodness of the linear fit for each segment are displayed at the bottom of the interface. These breakpoints can be used as reference values to set thresholds for the Pattern separation option. After the filters have been constructed, the corresponding anomalous and background components can be obtained. The Power spectrum option allows users to obtain the power spectrum for the interpolated data in the Fourier domain. Note that users are prompted to save the results of every manipulation, and the data pairs of the S-A model and fitting parameters can be exported via a toolbar button. The Concentration-area model has a similar interface and operation.

  (4) Multifractal modeling. This menu refers to the multifractal spectrum analysis. For this option, there are two key parameters: the scale vector and the moment vector. The scale should consist of positive integers, and the increment is set to a default value of 1. The moment vector should be symmetric about the origin, and its absolute value should not be too large. For a large positive moment, the partition function is dominated by a few large cell measures, whereas for a negative moment, the partition function is mainly determined by a few small measures. This indicates that the partition functions of different moments reflect different distributions of the measure (Xie and Bao 2004). Unlike the scale vector, the increment for the moment vector can be set via a pop-up menu. The analysis results can be visualized as graphs, or saved for further analysis.

  (5) Help. This menu provides contact information for the author. Users are encouraged to contact the author if they encounter any problems.
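To make the ASCII input format described under the File menu more concrete, the following Python sketch parses a grid with six header rows followed by the data matrix. Only `xllcorner` and `yllcorner` are named in the text. The remaining keywords (`ncols`, `nrows`, `cellsize`, `NODATA_value`) are assumed here to follow the common Esri ASCII raster convention, and the function name `read_ascii_grid` is hypothetical.

```python
def read_ascii_grid(text):
    """Parse six 'keyword value' header rows followed by a data matrix.

    Returns (header, rows); cells equal to the data-void flag become None.
    Assumption: header keywords follow the Esri ASCII raster convention.
    """
    lines = text.strip().splitlines()
    header = {}
    for line in lines[:6]:
        key, value = line.split()
        header[key.lower()] = float(value)

    nodata = header["nodata_value"]
    rows = []
    for line in lines[6:]:
        cells = [float(v) for v in line.split()]
        rows.append([None if v == nodata else v for v in cells])

    # Check the matrix against the size declared in the header.
    if len(rows) != int(header["nrows"]):
        raise ValueError("row count does not match nrows")
    if any(len(r) != int(header["ncols"]) for r in rows):
        raise ValueError("column count does not match ncols")
    return header, rows
```

The lower-left corner and cell size in the returned header are what tie each matrix cell back to map coordinates, which is why a file missing any of the six header rows cannot be placed on a map.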

Case study

Study area and data

To demonstrate the applicability of the software program, we examine geochemical data taken from the northeast of Dong Ujimqin Banner district (45.66°N–46.17°N, 117.5°E–118°E). Located in Inner Mongolia, near the boundary between China and Mongolia, this is one of the most important Ag polymetallic belts in the north of China (Wang 2003). This grass-covered district is mainly composed of Tertiary and Quaternary sediments overlying Devonian, Permian, and Jurassic Formations. The Devonian Formation consists of sandstone, siltstone, slate, and volcaniclastic rocks, and is rich in ore-forming elements (Huang et al. 2013). Intrusions, also overlain by the sediments, are distributed across the whole district, largely extending along a NE-trending belt, and there are well-developed faults in the NE and NW directions that control the spatial distribution of deposits (Fig. 3a). Yanshanian intrusions, characterized by plutonic granitic rocks, are associated with hydrothermal mineralization (Huang et al. 2013). Three Ag polymetallic deposits have been discovered in recent years (Jiang et al. 2007; Yu et al. 2011). More detailed information on the geological setting can be found in Liu (2011).

Fig. 3 a Simplified geochemical map of the study area in Inner Mongolia, northern China (after China Geological Survey); b geochemical sampling data of Ag

A total of 1974 soil samples were collected from depths of 20 to 80 cm at a density of 1–2 samples per km² (Liu et al. 2013). We select the data on Ag, one of the main ore-forming elements, to illustrate the validity of the program. Concentrations of Ag (detection limit 0.02 ppb) were determined by electrospray mass spectrometry. In the original point data, the Ag concentration ranges from 21.9 to 373.9 ppb (mean of 83.4 ppb). The higher concentrations of Ag mainly occur in the upper-right part of the study area (Fig. 3b). More detailed information on sample preparation and analysis can be found in Liu (2011).

Statistical properties of the original data

To investigate the overall distribution characteristics and features of outliers, we used the Descriptive statistics option to produce a histogram, cumulative histogram, Q-Q plot, and boxplot (Fig. 4). The histogram is broadly symmetric, except for the heavy-tailed distribution of extremely high values. This characteristic is corroborated by the Q-Q plot, in which high values deviate heavily from the straight line. A high percentage of outliers can be observed on the boxplot. Therefore, the assumption of a normal distribution for the original data may not be accurate or effective for dealing with singular values. Instead, the original geochemical data should be analyzed by means of fractal/multifractal modeling via the proposed program.

Fig. 4 Statistical properties of original geochemical data of Ag

Quantifying the spatial distribution characteristics of geochemical patterns

A grid map was created from the original geochemical data by means of the inverse distance weighted method. To quantify the spatial distribution characteristics of Ag, we calculated the multifractal spectrum of the element concentrations using the method of moments via the Multifractal modeling option. The moment (q) varied from −6 to +6 in steps of 0.5, and the results imply that this interval was sufficiently wide to generate the necessary information (Fig. 5). The scale vector ranged from 1 to 7 in steps of 1. As shown in Fig. 5a, \( \chi_q(\varepsilon) \) is plotted against ε on a log-log scale, and the slopes of the straight lines (the mass exponents τ(q)) are estimated by least-squares fitting. The estimated mass exponents are plotted in Fig. 5b. The singularity exponent α(q) can then be derived by numerically differentiating the data, as shown in Fig. 5c. Finally, the multifractal spectrum f(α) was obtained through a Legendre transformation, as displayed in Fig. 5d. We can observe that the multifractal spectrum of Ag is a continuous curve, but it is not symmetric and deviates distinctly to the left. This asymmetry may reflect the fact that the spatial distribution of concentrations has a continuous multifractal nature, and has undergone a certain degree of local superimposition or other modifications (e.g., Xie and Bao 2004).

Fig. 5 The moment method used to deduce the multifractal spectrum. a Log-log plot of the mass partition function \( \chi_q(\varepsilon) \) versus scale ε; b relationship between mass exponent τ(q) and moment q; c relationship between singularity exponent α(q) and moment q; d relationship between multifractal spectrum f(α) and singularity exponent α

Identifying anomalies by means of singularity analysis

Prior to the calculation of the singularity index, several parameters must be set. We used a series of square windows with half-window sizes ranging from 1 to 17 km at intervals of 2 km. Singularity index values of α < 2 and α > 2 represent enrichment and depletion, respectively. The patterns given by the singularity index illustrate that the anomalous areas coincide with the locations of three known Ag deposits (Fig. 6), particularly for the two deposits located in the northeastern part of the study area. This suggests that the singularity index can readily identify geochemical anomalies.

Fig. 6 Estimated singularity index α by means of original singularity analysis

A number of anomalous areas are delineated in Fig. 6. These include one located to the lower left of the two known northeastern deposits, and some less intense anomalies that are mainly distributed in the middle and northern parts of the study area. Such anomalous areas should be further investigated in the next round of mineral exploration in the study area.

Separation of anomalies from the background using the S-A model

The interpolated map was converted to the frequency domain by means of a Fourier transformation. The power spectrum values were calculated and plotted with the Spectrum-area model option. Two vertical lines were selected to divide the calculated data pairs, and three straight lines were fitted by the least-squares method. The abscissas of the breakpoints and goodness of the linear fitting are shown in the lower part of the interface (Fig. 7). These data imply the existence of three subsets of frequencies in terms of distinct scaling properties.

Fig. 7 MATLAB interface for the log-log plot showing the relationship between power spectrum value E and area A(≥E). Three straight lines are fitted by means of least-squares fitting. The abscissas of the breakpoints and the goodness of the linear fit are shown at the bottom

The breakpoints indicated by the two vertical lines were taken as thresholds to construct the corresponding filters based on the power spectrum given by the Pattern separation option. By applying these filters to the Fourier-transformed functions and then converting them back to the spatial domain, we generated three components corresponding to the background, anomalies, and noise. The background and anomaly maps are shown in Figs. 8 and 9, respectively. The anomalous areas obtained from the S-A model had a high spatial correlation with the known deposits, which suggests that our model can effectively decompose mixed patterns into a varied geochemical background and an anomalous map.

Fig. 8 Background map obtained from the S-A model

Fig. 9 Anomaly map obtained from the S-A model

Comparing the results obtained by the S-A model (Fig. 9) and the singularity analysis (Fig. 6), we can observe a high spatial correlation, and the three known Ag deposits are located in or near the highly anomalous areas. This indicates that both the S-A model and singularity analysis are powerful tools for identifying anomalies associated with mineralization.

Conclusions

The software program presented in this paper is a MATLAB-based graphical user interface for processing geochemical data by means of fractal/multifractal modeling. The models provided by this program are powerful tools for characterizing the distribution of geochemical data and separating geochemical patterns into several components. The application of this software is not limited to the geochemical data considered in our case study, and could also be applied to geophysical data. The graphical user interface allows for easy data input and graphical representation of the analysis results.

One of the key advantages of this program is its ease of use. Another benefit is that the analysis results can be further investigated in ArcGIS by means of a simple format conversion. The main limitation is the inability to add map features to data when the analysis results are exported as a graph. Any criticisms or suggestions arising from the use of this program will be warmly welcomed.