Definition of the Subject

Definition

The problem of earthquake prediction is to find when and where a strong earthquake will occur. A prediction is formulated as a discrete sequence of alarms (Fig. 1). The accuracy of a prediction method is captured by probabilities of errors (false alarms and failures to predict) and by the total space-time occupied by alarms. (Sect. “Error Diagram”).

Figure 1

Possible outcomes of prediction

In terms of prediction studies this is algorithmic prediction of individual extreme events having low probability but large impact. This problem is necessarily intertwined with problems of disaster preparedness, dynamics of solid Earth, and modeling of extreme events in hierarchical complex systems.

Predictability (“order in chaos”). Complex systems, lithosphere included, are not predictable with unlimited precision. However, after a coarse‐graining (i. e., in a not‐too‐detailed scale) certain regular behavior patterns emerge and a system becomes predictable, up to certain limits ([13,20,24,26,36,46,52,83]). Accordingly, earthquake prediction requires a holistic analysis, “from the whole to details”. Such analysis makes it possible to overcome both the geo‐complexity itself and the chronic imperfection of observations.

Premonitory patterns. Certain behavior patterns emerge more frequently as a strong earthquake draws near. Called premonitory patterns, they signal destabilization of the earthquake‐prone lithosphere and thus an increase in the probability of a strong earthquake. Premonitory patterns do not necessarily contribute to causing a subsequent strong earthquake; both might be parallel manifestations of the same underlying process – the tectonic development of the Earth in multiple time-, space-, and energy scales. For that reason premonitory patterns might emerge in a broad variety of observable fields reflecting lithosphere dynamics, and in different scales.

The algorithms considered here, based on premonitory seismicity patterns, provide alarms lasting years to months. There is ample evidence that major findings made in developing these algorithms are applicable to premonitory patterns in other fields, to predicting other geological and geotechnical disasters, and probably to determining shorter and longer alarms (Sect. “Further Goals”).

Importance

Algorithmic earthquake prediction provides pivotal constraints for fundamental understanding of the dynamics of the lithosphere and other complex systems. It is also critically important for protecting the global population, economy, and environment. Vulnerability of our world to earthquakes is rapidly growing, due to proliferation of high-risk construction (nuclear power plants, high dams, radioactive waste disposal sites, lifelines, etc.), deterioration of ground and infrastructure in megacities, destabilization of the environment, population growth, and escalating socio‐economic volatility of the global village. Today a single earthquake with its ripple effects may take up to a million lives; destroy a megacity; trigger a global economic depression (e. g. if it occurs in Tokyo); trigger an ecological catastrophe, rendering a large territory uninhabitable; or destabilize the military balance in a region. Regions of low seismicity have become highly vulnerable, e. g. the European and Indian platforms, and the Central and Eastern parts of the U.S. As a result, earthquakes have joined the ranks of the major disasters that, in the words of J. Wiesner, have become “a threat to civilization survival, as great as was ever posed by Hitler, Stalin or the atom bomb”. Earthquake prediction is necessary to reduce the damage by escalating disaster preparedness. Predictions useful for preparedness should have known, but not necessarily high, accuracy. Such is the standard practice in preparedness for all disasters, wars included.

Introduction

Earthquakes occur in some parts of the outer shell of the solid Earth, called the lithosphere; its thickness ranges from a few kilometers near the mid-ocean ridges to a few hundred kilometers in certain continental regions. At many continental margins the lithosphere bends downward, penetrating the underlying mantle as seismically active subduction zones. In seismically active regions a significant part of tectonic development is realized through the earthquakes.

About a million earthquakes with magnitude 2 (energy about \( { 10^{15} } \) erg) or more are detected each year worldwide by seismological networks. About a hundred of these cause considerable damage, and a few times in a decade a catastrophic earthquake occurs.
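For orientation only: the standard magnitude–energy relation \( { \log_{10} E \approx 11.8 + 1.5 M } \) (E in erg), which is not part of this article's argument, gives \( { E \approx 10^{14.8} } \) erg for \( { M = 2 } \), consistent with the figure quoted above.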

Catalogs of earthquakes provide the data for detecting premonitory seismicity patterns. Typically for complexity studies, we do not have a complete set of fundamental equations that govern dynamics of seismicity and unambiguously define earthquake prediction algorithms. This is due to the multitude of mechanisms controlling seismicity – see Sect. “Generalization: Complexity and Extreme Events”. In lieu of such equations “…we have to rely upon the hypotheses obtained by processing of the experimental data” (A. Kolmogorov on transition to turbulence). Formulating and testing such hypotheses involves exploratory data analysis, numerical and laboratory modeling, and theoretical studies (Sect. “General Scheme of Prediction”).

Diversity of methods and urgency of the problem make learning by doing a major if not the major form of knowledge transfer in prediction of extreme events (http://cdsagenda5.ictp.it/full_display.php?da=a06219).

Reliability of the existing algorithms has been tested by continuous prediction of future strong earthquakes in numerous regions worldwide. Each algorithm is self‐adapting, i. e. applicable without any changes in regions with different seismic regimes. Predictions are filed in advance at the websites (http://www.mitp.ru/predictions.html; http://www.phys.ualberta.ca/mirrors/mitp/predictions.html; and http://www.igpp.ucla.edu/prediction/rtp/).

Following is the scoring for four different algorithms.

  • Algorithms M8 [32] and MSc [44] (MSc stands for the Mendocino Scenario). Algorithm M8 gives alarms with a characteristic duration of years. MSc gives a second approximation to M8, reducing the area of alarm. An example of their application is shown in Fig. 2.

Figure 2

Prediction of the Sumatra earthquake, June 4th, 2000, \( { M = 8.0 } \) by algorithms M8 and MSc. The orange oval curve bounds the area of alarm determined by algorithm M8; the red rectangle is its reduction by algorithm MSc. Circles show epicenters of the Sumatra earthquake and its aftershocks. After [43]

Continually applied since 1992, algorithm M8 has predicted 10 out of 14 large earthquakes (magnitude 8 or more) which have occurred in the major seismic belts. Alarms occupied altogether about 30% of the time-space considered. Both algorithms applied together reduced the time-space alarms to 15%, but three more target earthquakes were missed by prediction.
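In the terms of the error diagram introduced below (Sect. “Error Diagram”), these scores correspond roughly to a failure-to-predict rate \( { n = 4/14 \approx 0.29 } \) at a space-time alarm fraction of about 0.30 for M8 alone, and \( { n = 7/14 = 0.50 } \) at about 0.15 for M8 and MSc applied together.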

  • Algorithm SSE or Second Strong Earthquake [43,91]. Its aim is to predict whether or not a second strong earthquake will follow the one that has just occurred. An alarm lasts 18 months after the first strong earthquake. An example of prediction is shown in Fig. 3. Testing by prediction in advance is set up for California, Pamir and Tien Shan, Caucasus, Iberia and Maghreb, the Dead Sea rift, and Italy. Since 1989 this algorithm has made 29 predictions, 24 of which were correct and 5 wrong.

Figure 3

Prediction of the Northridge, California earthquake, January 17th, 1994, \( { M = 6.8 } \) by algorithm SSE. The prediction was made by analysis of aftershocks of the Landers earthquake, June 28th, 1992, \( { M = 7.6 } \). An earthquake with \( { M = 6.6 } \) or larger was expected during the 18 months after the Landers earthquake within the 169-km distance from its epicenter (shown by a circle). The Northridge earthquake occurred on January 17th, 1994, 20 days after the alarm expired. After [43]

These scores demonstrate predictability of individual earthquakes. The predictions' accuracy is indeed limited, but sufficient to prevent a considerable part of the damage.

  • Algorithm RTP or Reverse Tracing of Precursors [37,81]. This algorithm gives alarms with a characteristic duration of months. An example of this prediction is shown in Fig. 4. Testing by prediction in advance started only a few years ago for California, Japan, the Northern Pacific, Eastern Mediterranean, and Italy with adjacent areas.

Figure 4

Prediction of Simushir, Kuril Islands earthquakes, November 15th, 2006, \( { Mw = 8.3 } \) and January 13th, 2007, \( { Mw=8.2 } \) by Algorithm RTP. An earthquake with magnitude \( { Mw \geqslant 7.2 } \) is predicted to occur within the time interval from September 30th, 2006, to June 30th, 2007 in the area bordered by the red curve. The red dots show epicenters of an earthquake‐forming premonitory chain. The blue stars show epicenters of the predicted earthquakes

Perspective. It is encouraging that only a small part of readily available relevant data, models and theories has been used for prediction so far. This suggests a potential for a substantial increase of prediction accuracy.

Lithosphere as a Hierarchical Complex System

Two major factors turn the lithosphere into a hierarchical dissipative complex system [29,36,87]. The first one is a hierarchical structure extending from tectonic plates to grains of rocks. The second factor is instability caused by a multitude of nonlinear mechanisms destabilizing the strength and stress fields.

Among extreme events in that system are the strong earthquakes. An earthquake may be an extreme event in a certainvolume of the lithosphere and a part of the background seismicity in a larger volume.

Structure

Blocks

The structure of the lithosphere presents a hierarchy of volumes, or blocks, which move relative to each other. The largest blocks are the major tectonic plates, of continental size. They are divided into smaller blocks, such as shields or mountain belts. After 15–20 consecutive divisions we come to about \( { 10^{25} } \) grains of rock of millimeter size.

Boundary zones

Blocks are separated by relatively thin and less rigid boundary zones. They are called fault zones high in the hierarchy, then faults, sliding surfaces, and, finally, interfaces between grains of rock. Except at the bottom of the hierarchy, a boundary zone presents a similar hierarchical structure with denser division. Some segments of the boundary zones, particularly in tectonically young regions, might be less explicitly expressed, presenting a bundle of small ruptures not yet merged into a fault, a flexure not yet ruptured, etc.

Nodes

These are even more densely fractured mosaic structures formed around the intersections and junctions of boundary zones. Their origin is due, roughly speaking, to collision of the corners of blocks [16,39,40,55]. The nodes play a singular role in the dynamics of the lithosphere. A special type of instability is concentrated within the nodes, and strong earthquakes nucleate in nodes. The epicenters of strong earthquakes worldwide are located only within some specific nodes that can be identified by pattern recognition [19,22].

Nodes are well known in structural geology and geomorphology and play a prominent textbook role in geological prospecting. However, their connection with earthquakes is less widely recognized.

The formalized procedure for dividing a territory into blocks ⇒ faults ⇒ nodes is given in [2].

Fault Network – A Stockpile of Instability

For brevity, the systems of boundary zones and nodes are called here fault networks. They range from the Circum Pacific seismic belt, with the giant triple junctions for the nodes, to interfaces between the grains of rocks, with the corners of grains for the nodes. Their great diversity notwithstanding, fault networks play a similar role in the lithosphere dynamics. Specifically, while tectonic energy is stored in the whole volume of the lithosphere and well beneath, the energy release is to a large extent controlled by the processes in relatively thin fault networks. This contrast is due to the following.

First, the strength of a fault network is smaller than the strength of blocks it separates: fault networks are weakened by denser fragmentation and higher permeability to fluids. For that reason, tectonic deformations are concentrated in fault networks, whereas blocks move essentially as a whole, with a relatively smaller rate of internal deformations. In other words, in the time scale directly relevant to earthquake prediction (hundreds of years or less) the major part of the lithosphere dynamics is realized through deformation of fault networks and relative movement of blocks.

Second, the strength of a fault network is not only smaller, but also highly unstable, sensitive to many processes there. There are two different kinds of such instability. The “physical” one originates at the micro level from a multitude of physical and chemical mechanisms reviewed in the next section. “Geometric” instability originates at the macro level and is controlled by the geometry of the fault network (Sect. “Geometric Instability”). These instabilities largely control dynamics of seismicity, the occurrence of strong earthquakes included.

“Physical” Instability [23,29]

As in any solid body, deformations and fracturing in the lithosphere are controlled by the relation of the strength field and stress field. The strength is in turn controlled by a great multitude of interdependent mechanisms concentrated in the fault network. We describe, for illustration, several such mechanisms starting with the impact of fluids.

Rehbinder Effect, or Stress Corrosion [14,85]

Mechanism

Many solid substances lose their strength when they come in contact with certain surface‐active liquids. The liquid diminishes the surface tension μ and consequently the strength, which is proportional to \( { \sqrt \mu } \) by the Griffith criterion. When the strength drops, cracks may emerge under small stress. Then liquid penetrates the cracks and they grow, with drops of liquid propelling forward, until they dissipate. This greatly reduces the stress required to generate the fracturing. Stress corrosion was first discovered for metals and ceramics. Then such combinations of solid substances and surface‐active liquids were recognized among the common ingredients of the lithosphere, e. g. basalt and sulphur solutions. When they meet, the basalt is permeated by a grid of cracks and the effective strength may instantly drop by a factor of 10 or more due to this mechanism alone.
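A minimal numeric sketch of the square-root scaling just mentioned (strength proportional to \( { \sqrt \mu } \)); the function name and the 1% figure are hypothetical illustrations, chosen only to reproduce the ten-fold strength drop quoted above.

```python
import math

def strength_ratio(mu_before, mu_after):
    """Ratio of effective strengths implied by the Griffith scaling
    (strength proportional to the square root of the surface tension mu)."""
    return math.sqrt(mu_after / mu_before)

# If a surface-active fluid cut the surface tension to 1% of its value
# (a hypothetical number), this scaling alone would cut the strength ten-fold.
print(strength_ratio(1.0, 0.01))  # -> 0.1
```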

Geometry of Weakened Areas

Orientation of such cracks at each point is normal to the main tensile stress. The stress field in the lithosphere may be very diverse. However, the shape of weakened areas where the cracks concentrate may be of only a few types, determined by the theory of singularities. Some examples are shown in Fig. 5, where thin lines show the trajectories of cracks; each heavy line is a separatrix, dividing the areas with different patterns of trajectories.

Figure 5

Instability caused by stress corrosion. The geometry of weakened areas depends on the type of singularity and the place where the chemically active fluid comes in. After [14]

If a liquid infiltrates from a place shown in Fig. 5 by arrows, the cracks concentrate in the shaded area, and its strength plummets. A slight displacement of the source across the separatrix may strongly change the geometry of such fatigue; it may be diverted to quite a different place and take quite a different shape, although not an arbitrary one. Furthermore, evolution of the stress field may change the type of a singularity, make it disappear, or create a new one, and the geometry of fatigue will follow suit.

Stress Corrosion is Highly Sensitive to Geochemistry of Fluids

For example, gabbro and dolerite are affected only in the presence of iron oxides; Kamchatka ultrabasic rocks are affected by the andesite lava liquids only in the presence of copper oxide, etc. Migration of fluids would cause observable variations of electromagnetic and geochemical fields.

Summing Up

Stress corrosion brings into the lithosphere a strong and specific instability, which may explain many observed premonitory seismicity patterns. However, the basic configurations of fatigue, as shown in Fig. 5, might be realizable only in not‐too‐large areas. This limitation stems from the dissipation of fluids and/or from the inhomogeneity of the stress field.

Other Mechanisms

Boundary zones feature several other mechanisms, potentially as important and certainly as complicated. A few more examples follow.

Mechanical Lubrication

by fluids migrating through a boundary zone [7]. The ensuing instability will be enhanced by fingers of fluids springing out at the front of migration [6].

Dissolution of Rocks

Its impact is magnified by the Rikke effect – an increase of solubility of rocks with pressure. This effect leads to a mass transfer. Solid material is dissolved under high stress and carried out in solution along the stress gradient to areas of lower stress, where it precipitates. The Rikke effect might be easily triggered in a crystalline massif at the corners of rock grains, where stress is likely to concentrate.

Petrochemical Transitions

Some of them tie up or release fluids, as in the formation or decomposition of serpentines. Other transitions cause a rapid drop of volume (rise of density), such as in the transformation of calcite into aragonite. (This would create a vacuum and unlock the fault; the vacuum will be closed at once by hydrostatic pressure, but a rupture may be triggered.)

Instability is created also by the sensitivity of dynamic friction to the local physical environment [50], by mechanical processes such as multiple fracturing, buckling, and viscous flow, and by numerous other mechanisms [49,70].

Most of the above mechanisms are sensitive to variations of pressure and temperature.

Geometric Instability [16]

The geometry of fault networks might be, and often is, incompatible with kinematics of tectonic movements, including earthquakes. This leads to stress accumulation, deformation, fracturing, and the change of fault geometry, jointly destabilizing the fault network. Two integral measures of this instability, both concentrated in the nodes, are geometric and kinematic incompatibility [16].

Each measure estimates the integrated effect of tectonic movements in a wide range of time scales, from seismicity to geodetic movements to neotectonics.

Geometric Incompatibility

The intersection of two strike-slip faults separating moving blocks (Fig. 6) is a simple example of geometric incompatibility. If the movements indicated by arrows in Fig. 6a could occur, the corners A and C would penetrate each other and an intersection point would split into a parallelogram (Fig. 6c). In the general case of a finite number of intersecting faults their intersection point would split into a polygon. Such splitting is not possible in reality; the collision at the corners leads to the accumulation of stress and deformations near the intersection, followed by fracturing and changes of fault geometry. The divergence of the corners will be realized by normal faulting.

Figure 6

Geometric incompatibility near a single intersection of faults. a, b initial position of the blocks; c, d extrapolation of the blocks' movement; a, c the locked node: movement is physically unrealizable without fracturing or a change in the fault geometry; b, d the unlocked node. After [16]

The expansion of that unrealizable polygon with time, \( { S(t) = Gt^{2}/2 } \), measures the intensity of this process. Here, S is the area of the polygon, determined by the slip rates on intersecting faults; t is the elapsed time from the collision, and G is the measure of geometric incompatibility.
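A minimal numeric sketch of the relation \( { S(t) = Gt^{2}/2 } \); the two-fault parallelogram construction below, together with its slip rates and angle, is a hypothetical illustration rather than a result from the cited work.

```python
import math

def geometric_incompatibility(polygon_area, elapsed_time):
    """Invert S(t) = G * t**2 / 2 for G, given the area S of the
    (unrealizable) polygon and the time t elapsed since the collision."""
    return 2.0 * polygon_area / elapsed_time**2

# Hypothetical two-fault case: the split intersection is a parallelogram whose
# sides grow as the slips v1*t and v2*t on faults crossing at angle theta,
# so S(t) = v1 * v2 * sin(theta) * t**2.
v1, v2, theta = 0.02, 0.01, math.radians(60)   # assumed slip rates (m/yr) and angle
t = 100.0                                      # years since the collision
S = v1 * v2 * math.sin(theta) * t**2
print(geometric_incompatibility(S, t))         # -> 2 * v1 * v2 * sin(theta)
```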

Such incompatibility of structure and kinematics was first described in [55] for a triple junction. The study established a condition under which a single junction can retain its geometry as the plates move, so that the stress will not accumulate. It was suggested in [39,40] that in the general case, when that condition is not satisfied, the ensuing fracturing would not dissolve the stress accumulation, but only redistribute it among newly formed corners. This triggers further similar fracturing, with the result that a hierarchy of progressively smaller and smaller faults is formed about the initial intersection. This is a node, recognizable by the dense mosaic structure, with probably self‐similar fractal geometry [39].

A real fault network contains many interacting nodes. Incompatibility G is additive and can be estimated for a network as a whole. An analogue of the Stokes theorem connects the total value of G within a territory with observations on its boundary. This removes the nearly impossible task of taking into account the complex internal structure of the nodes. One can instead surround the system of nodes by a contour crossing the less complicated areas. Then the geometric incompatibility can be realistically evaluated from the movements of the fewer faults that cross the contour.

Geometric incompatibility in different nodes is interdependent, because the nodes are connected through the movements of the blocks‐and‐faults system. A strong earthquake in a node would redistribute the values of G in other nodes, thus affecting the occurrence of earthquakes there. Observations indicating the interaction of nodes have been described by [73,74]. These studies demonstrate the phenomenon of long‐range aftershocks: a rise of seismic activity in the area where the next strong earthquake is going to occur within about 10 years.

So far, the theory of geometric incompatibility has been developed for the two‐dimensional case, with rigid blocks and horizontal movements.

Kinematic Incompatibility

Relative movements on the faults would be in equilibrium with the absolute movements of blocks separated by these faults (one could be realized through the other) under the well-known Saint‐Venant condition of kinematic compatibility [8,56,57]. In the simplest case, shown in Fig. 6, this condition is \( { K=\sum v_{i} = 0 } \), where \( { v_{i} } \) are the slip rates on the faults meeting at the intersection (thin arrows in Fig. 6). The value of K is the measure of the kinematic incompatibility, causing accumulation of stress and deformation in the blocks. A simple illustration of that phenomenon is the movement of a rectangular block between two pairs of parallel faults. The movement of the block as a whole has to be compensated for by relative movements on all the faults surrounding it: if, for example, the movement takes place on only one fault, the stress will accumulate at the other faults and within the block itself, thus creating kinematic incompatibility.
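A minimal sketch of the Saint-Venant condition as stated above; the slip rates and their sign convention are hypothetical.

```python
def kinematic_incompatibility(slip_rates):
    """Saint-Venant condition at a fault intersection: the signed slip rates
    v_i on the faults meeting there should sum to zero; a nonzero sum K
    measures the kinematic incompatibility."""
    return sum(slip_rates)

# Hypothetical node with four faults (signed slip rates, e.g. in mm/yr).
print(kinematic_incompatibility([+2.0, -1.5, +0.5, -1.0]))  # K = 0.0 -> compatible
print(kinematic_incompatibility([+2.0, -1.5, +0.5, -0.4]))  # K = 0.6 -> stress accumulates
```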

Like geometric incompatibility, the values of K are additive: one may sum up the values at different parts of the network. And an analogue of the Stokes theorem links the value of K for a region with observations on its boundary.

Generalization: Complexity and Extreme Events

Summing up, dynamics of the lithosphere is controlled by a wide variety of mutually dependent mechanisms concentrated predominantly within fault networks and interacting across and along the hierarchy. Each mechanism creates strong instability of the strength‐stress field, particularly of the strength. Except for very special circumstances, none of these mechanisms alone prevails in the sense that the others can be neglected.

Even the primary element of the lithosphere, a grain of rock, may act simultaneously as a material point, a viscoelastic body, an aggregate of crystals, and a source or absorber of energy, fluids, and volume, with its body and surface involved in different processes.

Assembling the set of governing equations is unrealistic and may be misleading as well: a well‐known maxim in nonlinear dynamics holds that one cannot understand a chaotic system by breaking it apart [12]. One may rather hope for a generalized theory (or at least a model), which directly represents the gross integrated behavior of the lithosphere. That brings us to the concept that the mechanisms destabilizing the strength of fault networks altogether turn the lithosphere into a nonlinear hierarchical dissipative system, with strong earthquakes among the extreme events. At the emergence of that concept the lithosphere was called a chaotic system [29,66,87]; the more general term is complex system [20,24,31,53,78,83].

General Scheme of Prediction

Typically for a complex system, the solid Earth exhibits a permanent background activity, a mixture of interacting processes providing the raw data for earthquake prediction. Predictions considered here are based on detecting premonitory patterns of that activity (Sect. “Definition”).

Pattern Recognition Approach

Algorithms described here treat prediction as a pattern recognition problem: given the dynamics of relevant fields in a certain area prior to some time t, to predict whether a strong earthquake will or will not occur within that area during the subsequent time interval \( { (t, t+\Delta) } \). Some algorithms also reduce the area where it will occur.

In terms of pattern recognition, the object of recognition is the time t. The problem is to recognize whether it belongs or not to the time interval Δ preceding a strong earthquake. That interval is often called the TIP (an acronym for the time of increased probability of a strong earthquake). Such prediction is aimed not at the whole dynamics of seismicity but only at the rare extraordinary phenomena, strong earthquakes.
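As a minimal illustration of this formulation (with hypothetical event times, not taken from any catalog used by these algorithms), the recognition target can be written as a binary label on the time axis:

```python
def in_tip(t, strong_quake_times, delta):
    """True if time t falls inside the interval of length delta preceding
    some strong earthquake, i.e. inside a TIP (time of increased probability)."""
    return any(0.0 < tq - t <= delta for tq in strong_quake_times)

# Hypothetical times (in years) of two strong earthquakes and a 3-year Delta.
strong = [1994.05, 2000.42]
print(in_tip(1992.5, strong, 3.0))   # True: within 3 yr before the 1994 event
print(in_tip(1996.0, strong, 3.0))   # False: more than 3 yr before the next event
```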

Pattern recognition of rare events proves to be very efficient in that approach to prediction. This methodology has been developed by the school of I. Gelfand for the study of rare phenomena of complex origin [9,19,34,71].

Data Analysis

Prediction algorithms are designed by analysis of the learning material – a sample of past critical events and the time series hypothetically containing premonitory patterns. Analysis comprises the following four steps:

  1.

    Detecting premonitory patterns. Each time series considered is robustly described by the functionals \( { F_{k}(t) } \), \( { k = 1, 2, \dots } \), capturing hypothetical patterns (Fig. 7); a minimal code sketch of steps 1–3 is given at the end of this subsection. Hypotheses on what these patterns may be are provided by universal modeling of complex systems (Sect. “Fourth Paradigm: Dual Nature of Premonitory Phenomena”), modeling of Earth‐specific processes, exploratory data analysis, and practical experience, even if it is intuitive. Pattern recognition of rare events is an efficient common framework for formulating and testing such hypotheses, their diversity notwithstanding.

    Figure 7

    General scheme of prediction. After [29]

    With a few exceptions the functionals are defined in sliding time windows; the value of a functional is attributed to the end of the window. In the algorithms described here the time series were earthquake sequences.

  2.

    Discretization. Emergence of a premonitory pattern is defined by the condition \( { F_{k}(t) \geqslant C_{k} } \). The threshold \( { C_{k} } \) is chosen in such a way that a premonitory pattern emerges on one side of the threshold more frequently than on the other side. That threshold is usually defined as a certain percentile of the functional \( { F_{k} } \). In such a robust representation of the data, pattern recognition is akin to the exploratory data analysis developed in [86].

  3.

    Formulating an algorithm. A prediction algorithm will trigger an alarm when a certain combination of premonitory patterns emerges. This combination is determined by further application of pattern recognition procedures [36,71].

  4.

    Estimating reliability of an algorithm. This is necessary, since an algorithm inevitably includes many adjustable elements, from selecting the data used for prediction and definition of prediction targets, to the values of numerical parameters. In lieu of a closed theory determining all these elements a priori, they have to be adjusted retrospectively, by predicting past extreme events. That creates the danger of self‐deceptive data‐fitting: If you torture the data long enough, it will confess to anything. Validation of the algorithms requires three consecutive tests.

    • Sensitivity analysis: varying adjustable elements of an algorithm.

    • Out of sample analysis: applying an algorithm to past data that has not been used in the algorithm's development.

    • Predicting in advance – the only decisive test of a prediction algorithm.

Such tests take the lion's share of data analysis [17,19,36,93]. A prediction algorithm makes sense only if its performance is (i) sufficiently better than a random guess, and (ii) not too sensitive to variation of adjustable elements. Error diagrams described in the next section show whether these conditions are satisfied.
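The following is a minimal sketch of steps 1–3, with a deliberately simple functional (event count in a sliding window) and a percentile threshold; the functional, window length, percentile and synthetic catalog are illustrative assumptions, not the definitions used by M8, MSc, SSE or RTP.

```python
import numpy as np

def functional_count(event_times, t, window):
    """Step 1: a simple sliding-window functional F(t): the number of
    events in (t - window, t], attributed to the end of the window."""
    event_times = np.asarray(event_times)
    return int(np.sum((event_times > t - window) & (event_times <= t)))

def declare_alarms(event_times, eval_times, window, percentile=90.0):
    """Steps 2-3: discretize F(t) by a percentile threshold C and declare
    an alarm at every evaluation time where F(t) >= C."""
    values = np.array([functional_count(event_times, t, window) for t in eval_times])
    threshold = np.percentile(values, percentile)
    return [t for t, v in zip(eval_times, values) if v >= threshold]

# Synthetic "catalog" (event times in years), evaluated on a monthly grid.
catalog = np.sort(np.random.default_rng(0).uniform(1990.0, 2000.0, 300))
monthly = np.arange(1991.0, 2000.0, 1.0 / 12.0)
print(declare_alarms(catalog, monthly, window=1.0)[:5])
```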

Error Diagram

Definition

An error diagram shows three major characteristics of a prediction's accuracy. Consider an algorithm applied to a certain territory during the time period T. During the test N strong earthquakes have occurred there and \( { N_{m} } \) of them have been missed by alarms. Altogether, A alarms have been declared and \( { A_{f} } \) of them happened to be false. The total duration of alarms is D.

Performance of an algorithm is characterized by three dimensionless parameters: the relative duration of alarms, \( { \tau =D/T } \); the rate of failures to predict, \( { n=N_{m}/N } \); and the rate of false alarms, \( { f = A_{f}/A } \). These three parameters are necessary in any test of a prediction algorithm regardless of the particular methodology. They are juxtaposed on the error diagrams schematically illustrated in Fig. 8. Also called Molchan diagrams, they are used for validation and optimization of prediction algorithms and for joint optimization of prediction and preparedness [59,60,61,62,63]. In many applications the parameter f is not yet considered. In earlier applications such diagrams were called ROC diagrams, for relative operating characteristic (e. g., [54]).
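A minimal sketch of how these three coordinates could be computed from the outcome counts of a test; the numbers in the example are hypothetical.

```python
def error_diagram_point(total_time, alarm_durations, n_strong, n_missed,
                        n_alarms, n_false):
    """Relative alarm duration tau = D/T, rate of failures to predict
    n = Nm/N, and rate of false alarms f = Af/A."""
    tau = sum(alarm_durations) / total_time
    n = n_missed / n_strong
    f = n_false / n_alarms
    return tau, n, f

# Hypothetical test: T = 20 yr, three alarms totalling D = 6 yr,
# N = 14 strong earthquakes with Nm = 4 missed, A = 12 alarms with Af = 3 false.
print(error_diagram_point(20.0, [2.0, 1.5, 2.5], 14, 4, 12, 3))
# -> (0.3, 0.2857..., 0.25)
```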

Figure 8

Scheme of an error diagram. Each point shows the performance of a prediction method: the rate of failures to predict, n, the relative duration of alarms, τ, and the rate of false alarms, f. Different points correspond to different algorithms. The diagonal in the left plot corresponds to the random guess. Point A corresponds to the trivial optimistic strategy, when an alarm is never declared; point B marks the trivial pessimistic strategy, when an alarm takes place all the time; other points correspond to non‐trivial predictions. Best combinations (n, τ) lie on the envelope of these points Γ. After [63]

Four Paradigms

Central for determining premonitory patterns is what we know about them a priori. In other words – what are a priori constraints on the functionals \( { F_{k}(t) }\) that would capture these patterns (Sect. “Data Analysis”). These constraints are given by the four paradigms described in this section. They have been first found in the quest for premonitory seismicity patterns in the observed and modeled seismicity. There are compelling reasons to apply them also in a wide variety of prediction problems.

Prehistory. New fundamental understanding of the earthquake prediction problem was formed during the last 50 or so years, triggering entirely new lines of research. In hindsight this understanding stems from the following unrelated developments in the early sixties.

  • F. Press initiated the installation of the state‐of‐the‐art World-Wide Standardized Seismographic Network (WWSSN), later succeeded by the Global Seismographic Network (GSN). Thus a uniform data base began to accumulate, augmented by expanding satellite observations.

  • E. Lorenz discovered deterministic chaos in an ordinary natural process, thermal convection in the atmosphere [51]. This triggered recognition of deterministic chaos in a multitude of natural and socio‐economic processes; however, the turn of seismicity, and geodynamics in general, came about 30 years later [4,29,66,87]. The phenomenon of deterministic chaos was eventually generalized by the less rigorously defined and more widely applicable concept of complexity [20,24,25].

  • I. Gelfand and J. Tukey, working independently, created a new culture of exploratory data analysis that allows coping with the complexity of a process (e. g., [19,86]).

  • R. Burridge and L. Knopoff [11] demonstrated that a simple system of interacting elements may reproduce a realistically complex seismicity, fitting many basic heuristic constraints. The models of interacting elements developed in statistical physics were thus extended to seismology.

  • L. Malinovskaya found a premonitory seismicity pattern reflecting the rise of seismic activity [33]. This was the first reported formally defined earthquake precursor, featuring long‐range correlations and worldwide similarity.

With broader authorship:

  • Plate tectonics established the connection between seismicity and large-scale dynamics of the lithosphere [41].

  • Research in experimental mineralogy and rock mechanics revealed a multitude of mechanisms that may destabilize the strength in the fault zones [70].

First Paradigm: Basic Types of Premonitory Patterns

The approach of a strong earthquake is indicated by the following premonitory changes in the basic characteristics of seismicity:

  • Rising: Seismic activity, earthquake clustering in space‐time, earthquake correlation range, and irregularity of earthquake sequences. Rise of activity sometimes alternates with seismic quiescence.

  • Transforming: Magnitude distribution (the Gutenberg–Richter relation). Its right end (at larger magnitudes) bends upward, and its left end bends downward.

  • Reversing: territorial distribution of seismicity.

  • Patterns of two more kinds, as yet less explored: rising response to excitation and decreasing dimensionality of the process considered (i. e. rising correlation between its components).

These patterns resemble asymptotic behavior of a thermodynamical system near the critical point in phase transition. Some patterns have been found first in observations and then in models; other patterns have been found in the opposite order. More specifics are given in [15,17,30,31,35,36,67,79,80,83,84,93].

Patterns capturing the rise of intensity and clustering have been validated by statistically significant predictions of real earthquakes [43,65]; other patterns undergo different stages of testing.

Second Paradigm: Long‐Range Correlations

The generation of an earthquake is not localized about its future source. A flow of earthquakes is generated by a fault network, rather than each earthquake – by a segment of a single fault. Accordingly, the signals of an approaching earthquake come not from a narrow vicinity of the source but from a much wider area.

What is the size of such areas? Let M and L(M) be the earthquake magnitude and the characteristic length of its source, respectively. In the intermediate‐term prediction (on a time scale of years) that size may reach \( { 10L(M) } \); it might be reduced down to \( { 3L } \) or even to L in a second approximation [43]. On a time scale of about 10 years that size reaches about \( { 100L } \). For example, according to [71], the Parkfield (California) earthquake with M about 6 and \( { L \approx 10 } \) km “… is not likely to occur until activity picks up in the Great Basin or the Gulf of California”, about 800 km away.
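A rough numeric restatement of the sizes quoted above, using only the factors 10L and 100L from the text; the function and its inputs are illustrative, and the source length L(M) must be supplied externally (no magnitude–length scaling is assumed here).

```python
def correlation_radius_km(source_length_km, horizon="intermediate"):
    """Rough size of the area where premonitory patterns form, as a multiple
    of the source length L(M): ~10L on the intermediate term (years),
    ~100L on the ~10-year scale (factors quoted in the text)."""
    factor = {"intermediate": 10, "decade": 100}[horizon]
    return factor * source_length_km

# Parkfield-style illustration from the text: M about 6, L about 10 km.
print(correlation_radius_km(10, "intermediate"))  # ~100 km
print(correlation_radius_km(10, "decade"))        # ~1000 km, cf. the ~800 km example
```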

Historical perspective. An early estimate of the area where premonitory patterns are formed was obtained in [33] for a premonitory rise of seismic activity. C. Richter, who was sceptical about the feasibility of earthquake prediction, made an exception for that pattern, specifically because it was defined in large areas. He wrote [75]: “… It is important that (the authors) confirm the necessity of considering a very extensive region including the center of the approaching event. It is very rarely true that the major event is preceded by increasing activity in its immediate vicinity.”

However, such spreading of premonitory patterns has often been regarded as counterintuitive in earthquake prediction research, on the grounds that earthquakes cannot trigger each other at such distances. The answer is that earthquakes forming such patterns do not trigger each other but reflect an underlying large‐scale dynamics of the lithosphere. Among the indisputable manifestations of that correlation are the following phenomena: migration of earthquakes along fault zones [47,52,58,90]; alternate rise of seismicity in distant areas [71] and even in distant tectonic plates [76]. Global correlations have been found also between major earthquakes and other geophysical phenomena, such as Chandler wobble, variations of magnetic field, and the velocity of Earth's rotation [34,72]. These correlations may be explained by several mechanisms not mutually exclusive. Such mechanisms range from micro‐fluctuations of large-scale tectonic movements to the impact of migrating fluids (e. g., [1,5,7,10,69,71,82,84,89]).

Third Paradigm: Similarity

Premonitory phenomena are similar (identical after normalization) in extremely diverse environments and in a broad energy range (e. g., [1,33,36]). The similarity is not unlimited, however, and regional variations of premonitory phenomena do emerge.

Normalized prediction algorithms retain their prediction power in active regions and platforms, with the magnitude of target earthquakes ranging from 8.5 to 4.5. Furthermore, similarity extends to induced seismicity, and to multiple fracturing in engineering constructions and laboratory samples (e. g., [3,35,43]). Ultimately, a single but explicit demonstration of similarity was obtained for starquakes – ruptures of the crust of a neutron star [45], where the conditions are extremely different from those in the Earth. Altogether the corresponding elastic energy release ranges from ergs to \( { 10^{25} } \) ergs (even to \( { 10^{46} } \) ergs if the starquake is counted in).

However, the performance of prediction algorithms does vary from region to region (see [21,35,63]). It is not yet clear whether this is due to imperfect normalization, or to limitations on similarity itself.

Fourth Paradigm: Dual Nature of Premonitory Phenomena

Some premonitory patterns are “universal”, common for hierarchical complex systems of different origin; others are specific to the geometry of fault networks or to a certain physical mechanism controlling the strength and stress fields in the lithosphere.

Universal patterns. These are most of the patterns so far known. They can be reproduced on models not specific to the Earth, e. g. models of a statistical physics type (direct or inverse cascade, colliding cascades, percolation, dynamical clustering) and models of critical phenomena in fluid dynamics, as well as on Earth‐specific models themselves.

A complete analytical definition of premonitory patterns was obtained recently on the branching diffusion model [18]. The definition includes only three control parameters, thus strongly reducing uncertainty in data analysis (Sect. “Data Analysis”).

Reviews of such models can be found in [15,17,36,66,83,89,93]. Discussion of particular patterns is given also in [25,42,67,68,88,92].

An example of an earthquake sequence generated by a universal model is shown in Fig. 9 [17]. The modeled seismicity exhibits major features of real seismicity: seismic cycle, switching of seismic regime, the Gutenberg–Richter relation, foreshocks and aftershocks, long‐range correlation, and, finally, the premonitory seismicity patterns.

Figure 9

Synthetic earthquake sequence consecutively zoomed. Shaded areas mark zoomed intervals. The model shows the rich variety of behavior on different timescales. Note that the ratio of timescales for the top and bottom panels is \( { 10^{2} } \). After [17]

Earth‐specific patterns are not yet incorporated in prediction algorithms. We discuss here the patterns reflecting the state of the nodes – the structures where the strong earthquakes are nucleated (see Sect. “Structure”). A quantitative characteristic of that state is the geometric incompatibility G (Sect. “Geometric Instability”). It shows whether the nodes are locked up or unlocked and quantifies their tendency to fracture and to change the fault geometry. A change of G might create or dissolve such features as asperities, relaxation barriers, weak links, and replacement of seismicity by creep or “silent” earthquakes [16]. These features would migrate from node to node with a velocity typical of seismicity migration: tens to hundreds of km/year [90]. All this makes monitoring of G highly relevant to detecting premonitory patterns. A simple pattern of that kind is seismic quiescence around the soon-to-break nodes (e. g., [44,58,77]). A simple, highly promising possibility is to consider separately premonitory phenomena inside and outside of nodes (e. g., [77]).

Earthquake Prediction and Earthquake Preparedness

Given the limited accuracy of predictions, how do we use them for damage reduction? The key to this is to escalate or de‐escalate preparedness depending on the following: content of the current alarm (what and where is predicted), probability of a false alarm, and cost/benefit ratio of disaster preparedness measures. Prediction might be useful if its accuracy is known, even if it is not high. Such is the standard practice in preparedness for all disasters, war included.

Diversity of Damage

Earthquakes hurt population, economy, and environment in very different ways: destruction of buildings, lifelines, etc.; triggering fires; releasing toxic, radioactive and genetically active materials; triggering other natural disasters, such as floods, avalanches, landslides, tsunamis, etc.

Equally dangerous are the socio‐economic and political consequences of earthquakes: disruption of vital services (supply, medical, financial, law enforcement, etc.), epidemics, drop of production, slowdown of economy, unemployment, disruptive anxiety of population, profiteering and crime. The socio‐economic consequences may be inflicted also by the undue release of predictions.

Different kinds of damage develop at different time and space scales, ranging from immediate damage to chain reactions lasting tens of years and spreading regionally if not worldwide.

Diversity of Disaster Preparedness Measures

Such diversity of damage requires a hierarchy of disaster preparedness measures, from building codes and insurance to mobilization of post-disaster services to red alert. Different measures take different times to undertake, from decades to seconds; having different costs, they can be maintained for different time periods; and they have to be spread over different territories, from selected sites to large regions. No single stage can replace another one for damage reduction and no single measure is sufficient alone.

On the other hand many important measures are inexpensive and do not require high accuracy of prediction. An example is the Northridge, California, earthquake, 1994, which caused economic damage exceeding $30 billion. Its prediction, published well in advance [48], was not precise – the alarm covered a time period of 18 months and an area 340 km in diameter with dramatically uneven vulnerability. However, low-cost actions, undertaken in response to this prediction (e. g. an out of turn safety inspection) would be well justified if even just a few percent of the damage were prevented.

Joint Optimization of Prediction and Preparedness

The choice of preparedness measures is by no means unique. Different measures may supersede or mutually exclude one another, leaving the decision‐maker a certain freedom of choice [38]. The definition of the prediction algorithm is not unique either. The designer of the algorithm has a certain freedom to choose the tradeoff between different characteristics of its accuracy (rate of failures to predict, duration of alarms, and rate of false alarms) by varying adjustable elements of the algorithm (Sect. “General Scheme of Prediction”). That leads to a problem typical of decision‐making with incomplete information: to optimize jointly prediction and preparedness. Figure 10 shows the scheme of such optimization. This figure also shows the advantages of a new formulation of prediction: parallel application of several versions of an algorithm.
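A minimal sketch of the choice this optimization involves: among several versions of an algorithm, each giving a point (n, τ) on the error diagram, pick the one minimizing a loss function; the linear loss and all numbers below are hypothetical stand-ins for the loss curves γ of Fig. 10.

```python
def best_strategy(points, cost_per_missed, cost_per_alarm_time):
    """Pick, among candidate error-diagram points (n, tau), the one that
    minimizes a hypothetical linear loss:
    cost_per_missed * n + cost_per_alarm_time * tau."""
    return min(points, key=lambda p: cost_per_missed * p[0] + cost_per_alarm_time * p[1])

# Hypothetical (n, tau) points for three versions of an algorithm.
candidates = [(0.30, 0.30), (0.15, 0.50), (0.50, 0.10)]
print(best_strategy(candidates, cost_per_missed=10.0, cost_per_alarm_time=3.0))
# -> (0.15, 0.5)
```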

Figure 10

Joint optimization of prediction and preparedness based on the theory of optimal control. Dots show points on the error diagram. Γ is their envelope. Thin contours (γ) show loss curves with constant value of a prevented loss. Optimal strategy is the tangent point of contours Γ and γ. After [63]

Further discussion can be found in [27,28,63,64].

Further Goals

Particularly encouraging for further earthquake prediction research is the wealth of relevant data, models, and theories that are available and yet untapped (the want amidst plenty pattern, Conference and School on Predictability of Natural Disasters for our Planet in Danger. A System View: Theory, Models, Data Analysis, 25 June – 6 July 2007, Trieste, ICTP, http://cdsagenda5.ictp.it/full_display.php?ida=a06204). Likely within reach is a new generation of prediction algorithms, about five- to ten‐fold more accurate than existing ones.

In the general scheme of things, this is a part of wider developments: the emergence of the newly integrated dynamics of the solid Earth, extending from a fundamental concept succeeding plate tectonics to predictive understanding and (with luck) control of geological and geotechnical disasters; and predictive understanding of extreme events (critical phenomena) in the complex systems formed, separately and jointly, by nature and society.