Introduction

Volcanic eruptions often have devastating effects. A basic step towards mitigating their consequences is forecasting the time evolution of pre-eruptive unrest at volcanoes. The present state of knowledge of the complex physical processes responsible for volcanic eruptions makes a theoretical approach to forecasting rather difficult. In this situation, the empirical identification of possible repetitive schemes (patterns) in the pre-eruptive unrest of volcanoes may represent a viable strategy to significantly improve our forecasting capability and our physical knowledge of the system. Any patterns identified might not only indicate whether the unrest is evolving into an eruption but also provide an estimate of the energy associated with the volcanic eruption, for instance through the Volcanic Explosivity Index (VEI).

Volcanic unrest has a complex nature, involving different interacting processes. An almost complete picture of the phenomena requires a large variety of signals, for example different seismicity variables, deformation, gas emission, and so on. A robust technique that aims to identify possible pre-eruptive patterns in volcanic unrest has to take into account all, or at least some, of these measurements simultaneously. This necessity has recently been discussed by Sparks (2003) as the key to successful forecasting, and it is the basic concept underlying many studies by expert volcanologists dealing with volcanic unrest (see, for instance, the volume by Newhall and Punongbayan 1996 and, in particular, Harlow et al. 1996; Voight et al. 1999; Hill et al. 2002). In these studies, different parameters are taken into account simultaneously to set empirical rules based mainly on human experience. Here, the human brain of the expert volcanologist works as a qualitative pattern recognition or neural network code, identifying empirical multivariate patterns from past experience in order to establish some rules for future pre-eruptive unrest phases. However, even if the human brain is far more flexible than a computer in applying a pattern recognition or neural network code, it has some shortcomings that might limit the use of human experience and favor the use of computer codes (e.g., Cammarata 1997). In particular, the human brain can hardly deal with three or more variables simultaneously. In other words, the higher the number of variables considered, the lower the capability of the human brain to detect patterns (even simple ones). Also, the personal and subjective experience of a scientist is usually more difficult to extend to the rest of the volcanological community than rules obtained through a quantitative and reproducible approach. Moreover, the latter produces results that are much easier to submit to rigorous scientific validation. For these reasons, it might also be useful to consider a computer-based approach, looking for strictly objective and quantitative rules.

In fact, quantitative rules have already been sought in forecasting eruptive activity. These studies are very often based on the retrospective analysis of a single volcanic event (e.g., Shibata and Akita 2001; Gottsmann and Rymer 2002), or of events from a single volcano (e.g., Aki and Ferrazzini 2000; Londoño and Sudo 2002), and almost always take into account only one variable rather than a multivariate dataset. The approach based on a single volcano, and above all on a single eruptive episode, has an intrinsically strong limitation: the analysis, even though detailed, does not allow the general pre-eruptive patterns to be discriminated from the peculiarities of the volcano or eruption considered. However, possible general pre-eruptive patterns are the most important ones because they contain information useful for improving the knowledge of the physics of the erupting system. At the same time, their identification may furnish quantitative rules that can be profitably used to forecast the time evolution of the unrest at other volcanoes. In practice, we often have to cope with very dangerous volcanoes, for instance Mount Vesuvius, where no quantitative measurements relative to past pre-eruptive phases are available. In such cases, it becomes very difficult to understand when an unusual behavior of the volcano is really linked to an impending large eruption. Hence, the common experience acquired at other erupting explosive volcanoes (e.g., Shimozuru 1972; McNutt 1996) becomes the most relevant information.

The idea of this paper is to look for quantitative and complex (i.e., coming from a multivariate dataset) pre-eruptive patterns common to many different volcanic areas of the world. The main difficulties in reaching this goal are (1) the scarce availability and the incompleteness of pre-eruptive data, and (2) the ability of the methods used in objectively recognizing possible complex and quantitative pre-eruptive patterns.

As regards point (1), we have to collect a sufficient number of multivariate data coming from different volcanoes. In fact, some effort has been dedicated until now to collecting multivariate pre-eruptive data coming from different volcanoes into a single dataset. Remarkable examples are the catalogs compiled by Newhall and Dzurisin (1988) and Benoit and McNutt (1996), just to mention a few. The latter provides seismic data, in a time period of 10 years, relative to pre-eruptive phases on more than 100 volcanoes. In spite of the huge effort made by the authors, the collected data are rather rough, being very often categorical, strongly heterogeneous (for instance, the magnitude measurements) and in some cases qualitative.

In our opinion a definite improvement in this field can be achieved only through an international and coordinated effort, such as the WOVOdat project (www.wovo.org/wovodat.htm).

As regards point (2), the main difficulty is predominantly technical. A complication arises in the fact that the available data are usually few, often categorical, correlated, and their statistical distributions are seldom Gaussian (see Benoit and McNutt 1996). This precludes the use of all the parametric multivariate techniques successfully used in many other scientific fields (e.g., Fukunaga 1990). Note that, when few data are available, neural network codes cannot be used either, because they need a large amount of data to correctly perform the training, validating and testing steps (Tarassenko 1998). Furthermore, in this paper we are mainly interested in the physical meaning of the possible common patterns. With a neural network approach, we might be able to classify different types of unrest, but it would not be clear what the physical rules are that allow the network to discriminate between different types of unrest.

In this work, we provide a possible strategy of analysis that properly takes into account all of the issues discussed above. In particular, we apply two different non-parametric pattern recognition codes to search for common pre-eruptive patterns in a catalog containing data recorded during several volcanic unrest episodes around the world. The dataset consists of the Benoit and McNutt catalog (1996) plus all the available information concerning the seismic swarms related to the largest (VEI≥4) explosive eruptions that occurred during the last century, and other episodes of volcanic unrest for which information was found in the literature. The main goal is to provide new insights concerning the following questions:

  • Do the seismic unrest episodes occurring before volcanic eruptions have common patterns?

  • Do these possible common patterns reflect the magnitude of the following eruption?

  • Do these possible common patterns reflect the type of the following eruption or the initial state of the conduit (closed or open)?

Although the catalog used is certainly the best available, its intrinsic quality may still be insufficient (few data, with missing measurements) for obtaining quantitative and useful rules to forecast volcanic events. At the same time, it is possible to achieve interesting scientific insights for improving our knowledge of the physics of the eruptive processes. In any case, independently of the scientific results obtained here, an ambitious aim of this paper is to introduce a new quantitative perspective in approaching the eruption forecasting issue. As soon as a worldwide catalog of volcanic unrest of good quality is available, the strategy of analysis described here can provide a very powerful tool in the recognition of quantitative rules for forecasting the temporal evolution of unrest in volcanic areas.

The dataset

The literature on volcanic eruptions dates back to historical times, but only for the catastrophic events (such as Vesuvius, AD 79). Furthermore, in these cases, the reports are purely qualitative: morphological descriptions, eruptive products and descriptive temporal evolution. This information is not particularly useful for a quantitative approach to eruption forecasting. Thanks to the evolution in instrumentation, in the last few decades quantitative investigations have begun together with geophysical data reports. It is now possible to find a great number of different data relative to volcanic unrest in the form of seismological records, deformation measurements, detections of temperature or magnetic field variations, and so on. Among these, seismological information is the most widely available and reliable, mainly because seismometers are far more widespread than other instrumentation. Furthermore, among all the precursors of an eruption, volcanic earthquakes almost always characterize periods of volcanic unrest. For these reasons, we concentrated our study on the seismic data relative to the episodes of unrest which occurred in the last 50 years.

The dataset that we collected and analyzed consists of measurements relative to 217 seismic swarms in volcanic areas (see Table 1). For each swarm, we collected as many measurements as possible that are potentially related to the occurrence of a volcanic eruption and/or to the estimation of its VEI, in case an eruption occurs. Concerning the seismicity that characterizes the unrest followed by either low explosive events or not followed by eruptive activity (we call the latter isolated unrest), we mainly referred to Benoit and McNutt (1996). These two types of unrest are quite frequent, and in that catalog we were able to find a sufficient number of cases to be analyzed.

Table 1. This is a part of the dataset of seismic swarms we used in this work. In particular, all the largest post-1950 eruptions (VEI≥4) included are shown. A negative value for MXM, PRE and/or TRE means that a measurement is missing

On the other hand, many more difficulties were encountered concerning the seismicity that characterizes the unrest before the most explosive eruptions (VEI≥4), because this type of event is rare. Because of this, further research was necessary.

For the VEI≥4 events from 1950 to 1994, we primarily consulted Volcanoes of the World (Simkin and Siebert 1994). We completed the list up to 2001 through personal communications from Lee Siebert.

One problem was where to find information about the seismicity characterizing the unrest that preceded the VEI≥4 eruptions. We started by systematically seeking articles about these eruptive events since 1950. Some of the articles are hard to obtain (not having been published in an easily accessible journal, e.g., Taylor 1957). Others, not being related to seismicity, were not useful for our purposes (e.g., Buell and Stoiber 1976). However, all the papers provided a bibliography containing a further list of references. As regards the last events of this century, we found the most interesting information on web sites.

For every useful article found, we had to be careful in giving the right meaning to the information since the author's interpretation can greatly influence the data. In order to obtain continuity among the different types of eruptions, acknowledging that they are subjective, we interpreted the data according to the definitions of seismic swarm and its duration used by Benoit and McNutt (1996) in their database.

To summarize, we consulted the database of Benoit and McNutt (1996), the catalog of Simkin and Siebert (1994), the Bulletin of Volcanic Eruptions (1963–90), and the available literature on the large eruptions (VEI≥4) of the last century, especially in its second half (Gorshkov 1959; Gorshkov and Dubik 1970; Simkin and Howard 1970; Zobin 1971; Reeder et al. 1977; Faberov 1983; Fedotov et al. 1983; Gorel'chik et al. 1983; Zobin 1983; Decker and Decker 1981; Jensen et al. 1983; Tokarev 1985; Swanson and Kienle 1988; Smithsonian Institution's Global Volcanism Network 1990; Miller and McGimsey 1998; Paolo Papale, personal communication, 2002; http://www.volcano.und.nodak.edu; http://www.volcano.si.edu; http://www.vulcan.wr.usgs.gov).

For each swarm, we found measurements of the following variables:

  • The duration (DUR) of the swarm (in days)

  • The repose time (REP) associated to the swarm, i.e., the time (in years) elapsed between the end of the last eruption and the beginning of the swarm

  • The maximum magnitude (MXM) recorded in the swarm

  • A binary indicator (PRE) of the occurrence of a previous swarm (0=no, 1=yes)

  • A binary indicator (TRE) of the occurrence of volcanic tremor (0=no, 1=yes)

  • The φ function value (PHI). Considering the k-th swarm occurring in a certain volcanic area, PHI ($\phi^{(k)}$) is a perturbation function (Marzocchi 2002) that mimics the stress induced on this volcanic system by all the large remote earthquakes that occurred in the 35 years preceding the k-th swarm. In particular:

$$\phi^{(k)} = \sum_{j=1}^{N} M_{0j}\,\omega\left(d_{jk}\right) \qquad (1)$$

where $N$ is the number of earthquakes that occurred in the 35 years preceding the onset time of the k-th swarm, $M_{0j}$ is the seismic moment of the j-th earthquake and $\omega(d_{jk})$ is a weight function dependent on the relative distance between the location of the k-th swarm and the epicenter of the j-th earthquake (see Fig. 3 in Marzocchi 2002). The seismic data are taken from the catalog of Pacheco and Sykes (1992) for the period 1900–1989, and from the CMT Harvard catalog (Dziewonsky et al. 1981; Dziewonsky and Woodhouse 1983) for recent years. The earthquakes considered are the events with $M_s \geq 7$ and depth ≤70 km.
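The sum in Eq. 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weight function used by Marzocchi (2002) is not reproduced in this paper, so `placeholder_weight` is a hypothetical stand-in, and the record fields (`onset_year`, `m0`, `ms`, `depth_km`, and so on) are invented names rather than fields of the catalogs cited above.

```python
import math

EARTH_RADIUS_KM = 6371.0

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def placeholder_weight(distance_km):
    """Hypothetical stand-in for omega(d); NOT the function of Marzocchi (2002)."""
    return 1.0 / (1.0 + distance_km / 1000.0)

def phi(swarm, quakes, weight=placeholder_weight, window_years=35.0):
    """Eq. 1: sum of M0_j * omega(d_jk) over the qualifying earthquakes."""
    total = 0.0
    for q in quakes:
        age = swarm["onset_year"] - q["year"]
        if not (0.0 < age <= window_years):
            continue  # outside the 35 years preceding the swarm onset
        if q["ms"] < 7.0 or q["depth_km"] > 70.0:
            continue  # selection stated in the text: Ms >= 7, depth <= 70 km
        d = great_circle_km(swarm["lat"], swarm["lon"], q["lat"], q["lon"])
        total += q["m0"] * weight(d)
    return total
```

Passing the weight function as an argument keeps the summation separate from the choice of ω, so the published weight could be substituted without touching the rest of the sketch.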

While DUR, PHI and REP have been retrieved for almost the whole catalog, retrieving PRE, TRE and MXM has been much more difficult (see Fig. 1). The magnitude measurements are reported on different scales from country to country, and have been assumed to be consistent. For the parameters TRE and PRE, a 1 value simply means that some information regarding the feature has been reported, as in Benoit and McNutt (1996). If a report states that, for example, "TRE measurements have been conducted," the TRE feature is set to 1 in Benoit and McNutt (1996), which we have also done in our dataset, regardless of the occurrence of TRE. A 0 value means that a negative result on the occurrence of TRE or PRE was reported.

Fig. 1. Relative frequencies of the features in the dataset

The measurements which could not be retrieved are set to a number standing for a missing value that will not be used in the analysis. As a final remark, we emphasize that the resulting catalog, even though it is still rough and needs further improvement, is certainly the largest one available at present.

Pattern recognition analysis

Pattern recognition (PR) is a set of very powerful multivariate analysis techniques allowing, in principle, the identification of possible repetitive schemes or patterns among the objects belonging to distinct classes. While usual data analysis takes into account only one variable of the process at a time, PR methods are able to extract information from any possible combination (linear or not) of variables that are suspected to have an influence on the process. Moreover, PR methods do not need the construction of a theoretical model, but are usually based on a basic and sole hypothesis, i.e., the assumption that the phenomenon under study is governed by a finite number of complex, but repetitive patterns of the variables.

Because of these appealing properties, we believe that PR might also be a very promising tool in earth science. Until now, the few notable efforts in this direction are the CN and M8 algorithms (Keilis-Borok et al. 1988; Keilis-Borok and Kossobokov 1990), and applications to volcanology (Mulargia et al. 1991; Vinciguerra et al. 2001). Most of these algorithms, including CN and M8, are based on a particular type of PR analysis: the so-called logical PR. This type of PR analysis requires the arbitrary choice (by the user) of several parameters influencing the behavior of the algorithm. Because of this, the risk of overfitting the data increases drastically. Furthermore, no systematic evaluation of how the values chosen for the parameters influence the performance of these algorithms has yet been conducted. For these reasons, in this study we prefer to use algorithms based on a different approach: the so-called statistical PR. The algorithms belonging to this category do not need the selection of parameter values by the user.

From a technical point of view, the main goal of PR methods is to classify objects. Every object is represented by an array of qualitative or quantitative variables. The procedure of analysis consists of three different steps: the learning phase, the voting phase, and the control experiments. In the learning phase, a set of known and classified objects is analyzed in order to recognize all the possible patterns that characterize each class, i.e., the combinations of variables that allow for the discrimination of objects belonging to different classes. This step turns out to be very useful also from a theoretical point of view, since it allows for the recognition, among all the suspected variables, of those that really play an important role in the process under study. In the voting phase, the patterns identified during the learning are used to classify new objects, whose class is unknown to the algorithm. Finally, the control experiments allow one to check the stability of the results by repeating the learning and the voting phases with different values of the algorithm's parameters.

In the present study, the main goal of the analysis is to recognize the prominent characteristics of the seismic swarms preceding a volcanic eruption and to find possible relationships with the VEI of the impending eruption. Due to the very limited amount of data available, in this paper we will perform only the learning phase, and attempt to recognize, as a first step, all the possible patterns in our dataset. Although the results cannot be tested on independent data (voting), we have used some empirical strategies to check for possible overfit in the results.

Before performing the learning phase, we first had to:

  1. Define the objects to be analyzed and the classes involved in the problem, and

  2. Select the statistical PR algorithm that is most suitable to the problem we are dealing with

We shall explain these two steps more accurately in the following.

Definition of the objects and of the classes

The objects of the analysis are the seismic swarms. Any object is represented by a vector that contains all the measurements (the features) that we can associate to the object. Due to the large differences between the maximum and minimum measurements in the catalog for DUR, REP and PHI, we decided to use the logarithm of these features. Thus, each vector has the following components: Log (DUR), Log (REP), MXM, PRE, TRE and Log (PHI).
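A minimal sketch of how such a feature vector might be assembled is given here. The dictionary keys, the use of `None` for a missing measurement, and the choice of the natural logarithm are all assumptions made for illustration; the base of the logarithm is not stated in the text, so it is left as a parameter.

```python
import math

FEATURES = ("DUR", "REP", "MXM", "PRE", "TRE", "PHI")
LOG_FEATURES = {"DUR", "REP", "PHI"}  # features that span several orders of magnitude

def feature_vector(swarm, log=math.log):
    """Return [Log DUR, Log REP, MXM, PRE, TRE, Log PHI] for one swarm.

    `swarm` is a dict keyed by feature name; a missing measurement is
    represented by None (hypothetical convention) and is passed through.
    """
    vector = []
    for name in FEATURES:
        value = swarm.get(name)
        if value is None:
            vector.append(None)
        elif name in LOG_FEATURES:
            vector.append(log(value))
        else:
            vector.append(float(value))
    return vector
```

For example, `feature_vector(s, log=math.log10)` would build the same vector with base-10 logarithms instead.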

Each vector then has a further component: the VEI associated with the eruption (if any) following the swarm described by the vector. If the swarm has not been followed by an eruption, a fictitious VEI equal to −1 is associated with it. In this paper, the attribution of an object to its class depends on the VEI of the subsequent eruption (if any). Since we had eight different values for the VEI (−1, 0, 1, 2, 3, 4, 5, 6) in our catalog, in principle we had eight different classes of objects. For simplicity, the VEIs of the swarms were grouped in order to reduce the problem to a two-class problem, i.e., class 1 versus class 2. We kept at least one unit of VEI between the lowest VEI of the upper class and the highest VEI of the lower class. For example, in order to find patterns that distinguish a swarm preceding a small eruption from one preceding a large eruption, we considered as class 1 all the swarms with VEI≥4, and as class 2 all the swarms with 0≤VEI≤2. The VEI=3 events were excluded to emphasize the distinction between the classes. In this way we more safely avoided, with no loss of generality, any overlap between the classes. Note that we are interested in the most general features distinguishing the two classes.

The complete list of the various analyses (class 1 vs. class 2) performed is provided in the following.

Selection of the most suitable statistical PR algorithms

In this paper, we will try to identify repetitive patterns between two distinct categories of objects. Many statistical PR 2-class algorithms, both parametric (e.g., maximum likelihood estimation, see Duda and Hart 1973) and non-parametric (e.g., binary decision tree, Fisher's analysis, K-nearest neighbors, linear or quadratic discriminant analysis, see Rounds 1980; Duda and Hart 1973; Fukunaga 1991), have been successfully used in other scientific fields such as engineering, biology, economics and medicine. In these disciplines, the available datasets are large and continuous, and the variables are normally distributed.

Our dataset, like most datasets in earth sciences, does not have these "nice" features. In particular, it is composed of a small amount of data, some of the variables (if not all) are not normally distributed (e.g., the duration of the swarm, and the occurrence of previous swarms and tremor), and some might also be correlated (e.g., the duration and the maximum magnitude). Moreover, some of the variables we have collected in the catalog are probably completely irrelevant to the eruptive process. Indeed, we compiled our catalog by taking the largest possible number of potentially relevant variables available for each seismic swarm, because we did not know which (if any) of these variables are important for the subsequent occurrence of a volcanic eruption, or for the determination of the VEI of that eruption.

As a result, we needed a statistical PR algorithm that could perform satisfactorily on small datasets characterized by continuous and discrete or categorical variables that may be correlated. Because we may be including in the analysis some variables that do not affect the eruption occurrence or its VEI, it was also necessary to use a statistical PR algorithm able to extract the variables having a predominant influence on the processes related to volcanic unrest. According to these considerations, in this work we used two statistical PR 2-class algorithms that we had previously tested on synthetic data and that had proved capable of recognizing patterns satisfactorily on small datasets, even with correlated and/or discrete (including categorical) data, and of identifying the variables having a predominant role in the process (Sandri and Marzocchi 2003). These two non-parametric algorithms are called binary decision tree (BDT; Rounds 1980; Mulargia et al. 1992) and Fisher discriminant analysis (FIS; e.g., Duda and Hart 1973). The use of both algorithms, based on very different approaches, allowed us to check whether the results that we obtained depend on the type of algorithm used. Although the risk of overfit can be excluded only by voting on a set of independent data, the stability of the results obtained by these two different algorithms is indirect evidence that the risk of overfit is reduced.

Algorithm BDT was originally designed for hierarchically ordered data, but it has also exhibited very good performance on other kinds of data. It builds up a decision tree where, at each level, a threshold value for a certain variable determines which branch has to be followed. BDT automatically provides the subset of variables playing an important role in the process.
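To make the idea of a threshold on a single variable concrete, the sketch below finds one such threshold for one level of a tree. It is a deliberately simplified illustration, not the BDT algorithm used in this work: it picks the split that minimizes the number of misclassified learning objects, which is an assumed criterion chosen for brevity, and the function name is hypothetical.

```python
def best_stump(values, labels):
    """Search one feature for the threshold that best separates two classes.

    `values` are the feature measurements, `labels` the classes (1 or 2).
    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (threshold, class_assigned_above_threshold, training_errors),
    or None if the feature takes fewer than two distinct values.
    """
    distinct = sorted(set(values))
    best = None
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2.0
        for class_above in (1, 2):
            class_below = 3 - class_above  # the other class
            errors = sum(
                1
                for v, y in zip(values, labels)
                if (class_above if v > threshold else class_below) != y
            )
            if best is None or errors < best[2]:
                best = (threshold, class_above, errors)
    return best
```

Running the same search on every feature and keeping the one with the fewest errors would give a crude indication of which variable dominates the discrimination, in the spirit of the feature selection described above.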

Algorithm FIS is a non-parametric method because, although it assumes that the boundary between the classes is a hyperplane, it does not make any a priori assumption on the distribution of the data. It is a type of linear discriminant analysis, in which the original data are projected along a direction maximizing the ratio of the dispersion between the two classes to the dispersion inside each class. This algorithm, according to Fukunaga (1990), is here applied through a so-called branch-and-bound technique in order to identify the relevant features of the process. The feature selection performance by the branch-and-bound technique has been previously tested on synthetic data as well.
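The projection direction described above can be illustrated with the textbook form of Fisher's criterion, in which the direction is proportional to the inverse of the within-class scatter matrix applied to the difference of the class means. The sketch below is restricted to two features so that it needs no linear-algebra library; it is an assumed, simplified illustration and does not include the branch-and-bound feature selection used in this work.

```python
def mean2(points):
    """Mean of a list of (x, y) pairs."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter2(points, m):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix about the mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy

def fisher_direction_2d(class1, class2):
    """Return w = Sw^{-1} (m1 - m2) for two classes of 2-D points."""
    m1, m2 = mean2(class1), mean2(class2)
    a1, b1, c1 = scatter2(class1, m1)
    a2, b2, c2 = scatter2(class2, m2)
    a, b, c = a1 + a2, b1 + b2, c1 + c2  # within-class scatter Sw = [[a, b], [b, c]]
    det = a * c - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m2[0], m1[1] - m2[1]
    # Closed-form inverse of a symmetric 2x2 matrix applied to (dx, dy)
    wx = (c * dx - b * dy) / det
    wy = (-b * dx + a * dy) / det
    return wx, wy
```

Projecting each standardized object onto this direction reduces the two-class problem to a single axis, on which a decision boundary can then be placed.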

For a more complete definition of the algorithms and of the branch-and-bound technique, see Appendixes A, B and C.

Results of the analysis and discussion

We performed three different 2-class analyses with different goals. In particular they are:

  1. VEI≥1 vs. VEI=−1, where class 1 is represented by all the swarms followed by a volcanic eruption (VEI≥1) and class 2 by all the isolated swarms (VEI=−1). This analysis was done in order to recognize the general differences between the swarms preceding a volcanic eruption and the isolated swarms.

  2. VEI≥4 vs. VEI=−1, where class 1 is represented by all the swarms followed by a strongly explosive eruption (VEI≥4) and class 2 by all the isolated swarms (VEI=−1). This analysis was done in order to recognize the differences between the swarms preceding a strongly explosive volcanic eruption and the isolated swarms.

  3. VEI≥4 vs. 0≤VEI≤2, where class 1 is represented by all the swarms followed by a strongly explosive eruption (VEI≥4) and class 2 by all the swarms followed by moderate eruptions (0≤VEI≤2). This analysis was done in order to recognize the differences between the swarms preceding a strongly explosive volcanic eruption and the swarms preceding small or moderate eruptions.
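The three class definitions listed above can be written down compactly as sets of VEI values. The sketch below is illustrative only; the dictionary keys and the function name are hypothetical, and swarms whose VEI falls in neither set are simply excluded from that analysis.

```python
ISOLATED = -1  # fictitious VEI for swarms not followed by an eruption

# Each analysis maps to (VEI values of class 1, VEI values of class 2)
ANALYSES = {
    "VEI>=1 vs VEI=-1": ({1, 2, 3, 4, 5, 6}, {ISOLATED}),
    "VEI>=4 vs VEI=-1": ({4, 5, 6}, {ISOLATED}),
    "VEI>=4 vs 0<=VEI<=2": ({4, 5, 6}, {0, 1, 2}),
}

def assign_class(vei, analysis):
    """Return 1 or 2 for the given analysis, or None if the swarm is excluded."""
    class1, class2 = ANALYSES[analysis]
    if vei in class1:
        return 1
    if vei in class2:
        return 2
    return None
```

In each analysis, at least one VEI value (0 in the first, 0 to 3 in the second, 3 in the third) separates the two classes, so swarms with those values return `None`.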

In each analysis, we used only "complete" objects, i.e., the objects having no missing values for the features considered in the analysis. We start by considering all the six features. Due to the missing measurements, the analysis that considers all the six features was carried out on a low number of objects (see Tables 2, 3, 4). In order to perform the analysis on a higher number of objects, and to check the stability of the results obtained on different learning datasets, we performed two additional learning phases that concern a smaller number of features. In particular, we repeated the statistical PR analysis concerning (1) DUR, REP, MXM and PHI, and (2) DUR, REP and PHI (see Tables 2, 3, 4). The choice of the features in (1) and (2) was due to their more common reporting (allowing for a larger number of complete objects) and to their importance in the process as suggested by the analysis carried out on all the six features (see below).
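The trade-off between the number of features and the number of complete objects can be sketched as follows. The three records are invented solely to show the effect and are not taken from the catalog; `None` is again an assumed marker for a missing measurement.

```python
def complete_objects(objects, features):
    """Keep only the objects with no missing (None) value among `features`."""
    return [obj for obj in objects if all(obj.get(f) is not None for f in features)]

# The three feature sets used in the learning phases described in the text
FEATURE_SETS = {
    "six features": ("DUR", "REP", "MXM", "PRE", "TRE", "PHI"),
    "four features": ("DUR", "REP", "MXM", "PHI"),
    "three features": ("DUR", "REP", "PHI"),
}

# Hypothetical records, for illustration only
swarms = [
    {"DUR": 12.0, "REP": 3.0, "MXM": 4.1, "PRE": 1, "TRE": 0, "PHI": 0.8},
    {"DUR": 2.0, "REP": 10.0, "MXM": 3.0, "PRE": None, "TRE": None, "PHI": 0.2},
    {"DUR": 30.0, "REP": 50.0, "MXM": None, "PRE": None, "TRE": 1, "PHI": 1.5},
]

counts = {
    name: len(complete_objects(swarms, features))
    for name, features in FEATURE_SETS.items()
}
```

With these invented records, dropping the sparsely reported features increases the number of usable objects from one to three, which mirrors the motivation for the two additional learning phases.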

Table 2. VEI≥1 (class 1) vs. VEI=−1 (class 2). In the first column, the features used in the learning phase are shown; in the second column, the number of available objects with no missing values for any of the features used is shown for each class; in the two last columns, the features recognized as relevant are reported (third column, for BDT algorithm, fourth column for FIS algorithm). An empty box means that the algorithm does not recognize any pattern
Table 3. VEI≥4 (class 1) vs. VEI=−1 (class 2). In the first column, the features used in the learning phase are shown; in the second column, the number of available objects with no missing values for any of the features used is shown for each class; in the two last columns, the features recognized as relevant are reported (third column for BDT algorithm, fourth column for FIS algorithm)
Table 4. VEI≥4 (class 1) vs. 0≤VEI≤2 (class 2). In the first column, the features used in the learning phase are shown; in the second column the number of available objects with no missing values for any of the features used is shown for each class; in the two last columns, the features recognized as relevant are reported (third column for BDT algorithm, fourth column for FIS algorithm). An empty box means that the algorithm does not recognize any pattern

Since we were interested in the recognition of possible patterns in our swarms dataset, in each analysis we used all of the available complete objects for the learning phase. This allowed us to make use of as much data as possible to define the patterns in the data.

VEI≥1 vs. VEI=−1

As shown in Table 2, both algorithms recognize DUR as the predominant variable for the discrimination between class 1 and class 2. In particular, swarms preceding a volcanic eruption are generally longer than isolated swarms. This agrees with the results obtained by Benoit and McNutt (1996). Due to the very limited amount of data available, the parameters of the pattern (i.e., the thresholds in DUR by which the algorithms classify a swarm as pre-eruptive or isolated) have a very large uncertainty. However, just to give an idea of the magnitude, the thresholds are on the order of a few days (a week). For example, Figs. 2 and 3 show the case in which four features (DUR, MXM, REP and PHI) are considered in the analysis for the BDT and FIS algorithms, respectively. The BDT plot (Fig. 2) is very intuitive. The FIS plot (Fig. 3) instead needs a little explanation. In Fig. 3, we plotted the frequency of the learning objects (class 1 and class 2 separately) as a function of the pattern found, i.e., the combination of relevant variables identified. In this case, it is only Log (DUR). The data shown are standardized (mean and variance values are given in the figure caption). Should a new object have to be voted, we should first standardize it, then project it along Fisher's criterion line (in this case it is simply the standardized Log (DUR) axis). Then, the new object to be voted will be attributed to class 1 (precursory swarm) if it falls to the right of the decision boundary (i.e., if its standardized Log (DUR) is larger than 0.075); otherwise it will be attributed to class 2.

Fig. 2. Pattern recognition results for BDT relative to the first experiment, in which class 1 is given by swarms preceding VEI≥1 eruptions and class 2 by isolated unrest when the features DUR, MXM, REP and PHI are considered. The classification was performed on the basis of the DUR and of the threshold value indicated. In total, 91 swarms are correctly classified and 40 are not

Fig. 3. Pattern recognition results for FIS relative to the first experiment in which class 1 is given by swarms preceding VEI≥1 eruptions and class 2 by isolated unrest when the features DUR, MXM, REP and PHI are considered. The classification was performed on the basis of DUR, which is the only relevant variable identified by FIS algorithm in this case. The frequency of the learning objects is given as a function of this variable. The mean of Log (DUR) for the whole set of learning data is 2.24, and the variance is 4.5. The standardized voting objects should be attributed to class 1 if they are larger than 0.075; otherwise, class 2. Out of our learning swarms, 90 swarms are correctly classified and 41 are not
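As a rough worked example, the standardized boundary quoted in the caption of Fig. 3 can be mapped back to Log (DUR) using the mean and variance given there, assuming the standardization is the usual (value − mean)/standard deviation:

$$\mathrm{Log\,(DUR)}_{\mathrm{boundary}} = 2.24 + 0.075\,\sqrt{4.5} \approx 2.40$$

If Log denotes the natural logarithm (an assumption, since the base is not stated here), this corresponds to roughly 11 days, which would be consistent with the threshold of the order of a week mentioned in the text; with base-10 logarithms it would instead be about 250 days.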

We interpreted the longer pre-eruptive swarm duration as an indication of a prolonged instability during pre-eruptive unrest. We did not observe any significant difference in the magnitude of the earthquakes among the episodes of unrest belonging to the two classes. Because of this, if we assume that the seismic rate among the episodes of unrest is comparable, we might interpret the prolonged duration of precursory unrest as an indication of higher energy involved.

VEI≥4 vs. VEI=−1

As shown in Table 3, here we find again that the predominant variable is DUR for both algorithms. A swarm preceding a large explosive eruption is generally longer than an isolated swarm. The same considerations regarding the parameters of the pattern made above apply here. Figures 4 and 5 are shown as examples for BDT and FIS, respectively, when all six features are considered in the analysis. In this particular case, FIS recognized two relevant features (the second one is PHI). To vote a new swarm, we would first standardize its Log (DUR) and Log (PHI) according to the means and variances given in the caption of Fig. 5. Then, we would project it along Fisher's criterion line, which is a linear combination of the relevant variables identified (given on the x-axis of Fig. 5). Finally, the object is attributed to class 1 if its standardized and projected value is larger than 0.40; otherwise, it is attributed to class 2. Even though FIS recognizes two relevant variables, most of the discriminating capability in Fig. 5 is provided by DUR (see the much larger coefficient for Log (DUR), compared to the one for Log (PHI), in Fisher's criterion line).

Fig. 4.

Same as in Fig. 2 but relative to the case in which class 1 is given by swarms preceding VEI≥4 eruptions and class 2 by isolated unrest when all the features are considered. The classification was performed on the basis of the DUR and of the threshold value indicated. In total, 16 swarms are correctly classified and 2 are not

Fig. 5.

Same as in Fig. 3, but relative to the case in which class 1 is given by swarms preceding VEI≥4 eruptions and class 2 by isolated unrest when all the features are considered. The classification was performed on the basis of DUR and PHI, which are the relevant variables identified by the FIS algorithm in this case. The mean of Log (DUR) for the whole set of learning data is 1.63, and the variance is 5.2. The mean of Log (PHI) for the whole set of learning data is −0.74, and the variance is 7.2. The standardized voting objects should be projected along Fisher's criterion line, i.e., their standardized Log (DUR) and Log (PHI) values should be linearly combined as x=1.36 Log (DUR)+0.49 Log (PHI). Then, they should be attributed to class 1 if their projected values are larger than 0.40; otherwise, to class 2. In total, 17 swarms are correctly classified and 1 is not
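When FIS identifies more than one relevant variable, voting requires a projection onto Fisher's criterion line before the threshold is applied. The following sketch illustrates this with the means, variances, coefficients and decision boundary reported in the caption of Fig. 5. The function and dictionary names are hypothetical, the inputs are assumed to be already expressed as Log values (so no assumption on the base of the logarithm is needed), and the example input is invented for illustration.

```python
import math

# Learning-set statistics and Fisher coefficients from the caption of Fig. 5.
FEATURES = {
    "DUR": {"mean": 1.63, "var": 5.2, "coef": 1.36},
    "PHI": {"mean": -0.74, "var": 7.2, "coef": 0.49},
}
BOUNDARY = 0.40  # decision boundary on the projected axis


def project(log_values, features=FEATURES):
    """Standardize each Log feature and combine them along Fisher's line."""
    x = 0.0
    for name, params in features.items():
        z = (log_values[name] - params["mean"]) / math.sqrt(params["var"])
        x += params["coef"] * z
    return x


def vote(log_values, features=FEATURES, boundary=BOUNDARY):
    """Return 1 (swarm preceding a VEI>=4 eruption) or 2 (isolated swarm)."""
    return 1 if project(log_values, features) > boundary else 2


if __name__ == "__main__":
    # Hypothetical swarm: Log(DUR) above the learning-set mean, Log(PHI) at the mean.
    swarm = {"DUR": 3.0, "PHI": -0.74}
    print("projected value:", round(project(swarm), 3))
    print("class:", vote(swarm))
```

A swarm whose Log values coincide with the learning-set means projects to zero and is therefore attributed to class 2. Because the coefficient of Log (DUR) is much larger than that of Log (PHI), a change in duration moves the projected value more than a comparable standardized change in PHI.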

Again, we interpreted this result as an indication of the prolonged instability during unrest that precedes large explosive eruptions, compared to isolated unrest. The same considerations made in the previous subsection apply here.

VEI≥4 vs. 0≤VEI≤2

As shown in Table 4, in this case there is no evidence of a difference in magnitude or duration between these two types of seismic swarm, which suggests that the intrinsic characteristics of the seismic swarm may not be indicative of the eruption magnitude. This result agrees with statements from Newhall and Hoblitt (2002). Here, the only (or the most) relevant feature identified by both algorithms is REP (see Table 4). Generally, the swarms corresponding to the most explosive eruptions have a longer repose time than those related to moderate eruptions (Simkin and Siebert 1994; Newhall and Hoblitt 2002), as shown in Figs. 6 and 7 (for the case in which DUR, REP and PHI are considered in the analysis for BDT and FIS, respectively). In this case, FIS recognizes both REP and PHI as relevant variables. To vote a new object, we would first standardize its Log (REP) and Log (PHI) according to the means and variances given in the caption of Fig. 7. Then, we would project it along Fisher's criterion line, which is a linear combination of the relevant variables identified (given on the x-axis of Fig. 7). Finally, the object is attributed to class 1 if its standardized and projected value is larger than 0.145; otherwise, it is attributed to class 2. Even though FIS recognizes two relevant variables, most of the discriminating capability in Fig. 7 is provided by REP (see the much larger coefficient for Log (REP), compared to the one for Log (PHI), in Fisher's criterion line).

Fig. 6.

Same as in Fig. 2 but relative to the case in which class 1 is given by swarms preceding VEI≥4, class 2 by swarms preceding 0≤VEI≤2 eruptions when the features DUR, REP and PHI are considered. The classification was performed on the basis of REP and of the threshold value indicated. In total, 41 swarms are correctly classified and 11 are not

Fig. 7.

Same as in Fig. 3 but relative to the case in which class 1 is given by swarms preceding VEI≥4, class 2 by swarms preceding 0≤VEI≤2 eruptions when the features DUR, REP and PHI are considered. The classification was performed on the basis of REP and PHI, which are the relevant variables identified by the FIS algorithm in this case. The mean of Log (REP) for the whole set of learning data is 1.77, and the variance is 7.2. The mean of Log (PHI) for the whole set of learning data is −1.76, and the variance is 10.5. The standardized voting objects should be projected along Fisher's criterion line, i.e., their standardized Log (REP) and Log (PHI) values should be linearly combined as x=1.02 Log (REP)+0.25 Log (PHI). Then, they should be attributed to class 1 if their projected values are larger than 0.145; otherwise, to class 2. In total, 41 swarms are correctly classified and 11 are not

A long repose time might indicate that the volcanic system has had sufficient time since the last eruption to recharge and to reach a closed-conduit regime. In this way, the volcano can accumulate enough gas to produce a large explosive eruption. Indeed, according to Newhall and Hoblitt (2002), most large eruptions are preceded by long repose times, but most long repose times are not followed by large explosive eruptions.

As in the two previous subsections, the parameters of the pattern have large uncertainties. However, the repose time typical of unrest followed by large explosive eruptions is of the order of 10 years or longer.
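The counts of correctly and incorrectly classified swarms given in the captions of Figs. 2 to 7 can be summarized as apparent success rates. The short sketch below does only this arithmetic; the dictionary layout and function names are hypothetical. These rates are computed on the learning set itself, so they are likely optimistic and should not be read as the performance expected on an independent dataset.

```python
# (correctly classified, misclassified) learning swarms, from the captions of Figs. 2-7.
COUNTS = {
    ("VEI>=1 vs isolated", "BDT"): (91, 40),
    ("VEI>=1 vs isolated", "FIS"): (90, 41),
    ("VEI>=4 vs isolated", "BDT"): (16, 2),
    ("VEI>=4 vs isolated", "FIS"): (17, 1),
    ("VEI>=4 vs 0<=VEI<=2", "BDT"): (41, 11),
    ("VEI>=4 vs 0<=VEI<=2", "FIS"): (41, 11),
}


def apparent_success_rate(correct, wrong):
    """Fraction of learning swarms that the pattern classifies correctly."""
    return correct / (correct + wrong)


def summarize(counts=COUNTS):
    """Return the apparent success rate, in percent, for each experiment and algorithm."""
    return {
        key: 100.0 * apparent_success_rate(correct, wrong)
        for key, (correct, wrong) in counts.items()
    }


if __name__ == "__main__":
    for (experiment, algorithm), rate in summarize().items():
        print(f"{experiment:22s} {algorithm}: {rate:.1f}%")
```

The small number of swarms in the second experiment (18 in total) means that a single misclassified swarm changes the rate by more than five percentage points, which is one way to see why the pattern parameters carry large uncertainties.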

Concluding remarks

The main goal of this paper is to identify common pre-eruptive patterns in worldwide volcanic unrest. For this purpose, we applied non-parametric pattern recognition codes to a catalog of seismic swarms recorded in volcanic areas. The use of two algorithms based on very different "philosophies" allows us to check the stability of the results and reduces the risk of overfitting. We used seismic data because they were the easiest to retrieve and because seismic information is of prominent importance in characterizing unrest in volcanic areas.

The results obtained in this study are quantitative patterns distinguishing different types of volcanic unrest. However, the still poor quality of the dataset does not allow us to use these quantitative patterns as reliable rules for eruption forecasting. In particular, the limited amount of data produces large uncertainties in the parameters of each pattern found, and does not allow us to evaluate the performance of the patterns, i.e., the percentage of missed events and false alarms on an independent dataset.

In any case, the results reported here provide interesting insights into the physics of the pre-eruptive processes. In particular, there is evidence of a prolonged instability in pre-eruptive periods of unrest, compared to the isolated ones, both when only large explosive eruptions (VEI≥4) are considered and when all the eruptions with VEI≥1 are considered. Given that no significant difference is found in the maximum magnitude recorded in these two types of swarms, a longer seismic unrest might be interpreted as an indication of an energetic difference in the processes responsible for pre-eruptive and isolated swarms. In contrast, no significant magnitude or duration difference is found between unrest episodes preceding large explosive eruptions (VEI≥4) and moderate eruptions (0≤VEI≤2), which suggests that the energy released during precursory unrest is not a good indicator of the VEI of the impending eruption. This may also indicate that the magnitude of the eruption (i.e., the VEI) is mostly due to random factors, as is the case for other complex systems such as earthquakes and landslides (Bak et al. 1988). Here, the only pattern found, although less evident, is a longer repose time before the unrest that precedes the largest eruptions, compared to the unrest that precedes small to moderate events. The link between a longer repose and a large eruption might reflect the time needed to recharge the feeding system and to reach the state of a closed-conduit volcano. In this way, the volcano can accumulate a sufficiently large amount of gas to produce a large explosive eruption.

As a final consideration, we want to stress that the quality and the practical usefulness (eruption forecasting) of the results can be dramatically improved by using this kind of technique on large worldwide datasets of volcanic unrest such as the one proposed in the WOVOdat project.