1 Introduction

Environmental scientists are concerned with multidisciplinary problems that incorporate various scales in space and time. When producing, for instance, an extreme weather forecast on a local scale, meteorologists use data from regional studies to define a background from which the initial model and boundary conditions can be derived. Similarly, regional weather forecasts use global atmospheric models. In the same way, short-time weather predictions with high temporal resolution use long-time forecasts as background. Analogous considerations hold also, e.g., for hydrogeological studies, seismological risk management, and monitoring of pollution on land and in the seas. But it is not only the different scales that are interwoven; environmental problems are seldom described by a single scientific discipline. For instance, a hydrogeological study might also include meteorological models, geological and geophysical data, and geographic information. The Earth must be seen as a holistic system, and therefore, the collaboration between scientists in different places and from different disciplines is useful and necessary. The grid/cloud computing paradigm was developed during the last two decades as an answer to these challenges, to facilitate collaboration and sharing of data, means, methods, and results.

A second important factor that triggered this development was the sheer amount of available data and its continuously improving quality. Today, the size of datasets and the computational demand for creating and analyzing them have increased dramatically. While in the year 2000 datasets of some hundreds of megabytes were a typical size, we work with terabytes of data today. The current limit for datasets that can still be managed and processed in a reasonable time lies in the order of exabytes. Scientific fields where this limit has been reached are, for instance, meteorology, genomics, connectomics, complex physics simulations, and biological and environmental research (Reichman et al. 2011). The main reasons why datasets and especially environmental data grew so much are the ubiquitous mobile information-sensing devices, aerial sensory technologies (remote sensing), software logs, cameras, microphones, radio-frequency identification readers, and wireless sensor networks (Hilbert and López 2011). According to IBM, every day 2.5 quintillion bytes of data are created—so much that 90 % of the data in the world today has been created in the last 2 years alone. For this so-called big data, big data platforms are needed that are often shared by multiple private and/or governmental organizations.

Even though the computing power of supercomputers has also increased, from teraflops at the end of the last century to petaflops today, this growth could not keep pace with the growth of data and storage. However, the even faster increase in network connectivity could compensate for this. Research centers and universities in all parts of the world are today interconnected with fast broadband Internet connections.

For environmental science, i.e., mainly public and governmentally funded research, sharing of computing resources, applications, and storage is the natural solution that reflects the scale and complexity of the posed problems and the need for holistic solutions. The huge advances in network and communication technology that have been achieved so far strongly support this kind of effort. Over the last decades, network performance doubled every 9 months, resulting in an improvement of six orders of magnitude within the last 17 years (Foster 2002). Since the exponential increase of storage capacity (doubling time 12 months) and in particular of computing power (doubling time 18 months) was falling behind the development in communication bandwidth, spatially distributed computation and storage grids became more and more attractive. As a consequence, grid computing emerged in the early 1990s as the most efficient answer to the global challenges of the twenty-first century that are threatening both the natural environment and human society (Cossu et al. 2010; Hoffa et al. 2008). The basic idea was to create a distributed computing infrastructure able to provide computation in a similar way as electrical power or water is provided today. Ten years later, cloud computing emerged, mainly driven by the rise of virtualization technology, utility computing, and the fast evolution of commercial web services, as the successor of the hitherto coexisting paradigms of concurrent, parallel, distributed, and grid computing (Udoh 2011). Hashemi and Bardsiri (2012) define cloud computing as a model for enabling ubiquitous and convenient on-demand network access to a shared pool of configurable computing resources that can be allocated and released with minimal management effort. Apart from the historical development, it is difficult today to draw a clear line between grid and cloud computing. According to Hoffa et al. (2008), the two have become indistinguishable from each other.

The main fields of application of grid/cloud infrastructures are as follows: science portals, distributed computing, large-scale data analysis, computer-in-the-loop instrumentation, and collaborative work (Foster 2002). In the following, two examples will be presented, both developed under the GRIDA3 project (Murgia et al. 2009; Lecca et al. 2011) and follow-up initiatives. The first example describes an innovative approach to real-time seismic data processing for near-surface imaging, based on combining open-source state-of-the-art processing software and cloud computing technology that allows the effective use of distributed computation and data management with administratively distant resources. We will discuss how demanding user-side hardware and software requirements can be substituted by remote access to high-performance cloud computing facilities. As a result, data processing can be done quasi in real time, being ubiquitously controlled via the Internet by a user-friendly web-browser interface. To demonstrate the functionality of this portal, we will present processing results for two different types of data obtained from seismic reflection and multi-offset ground-penetrating radar experiments.

The second cloud computing portal presented in this chapter consists of an integrated system for weather and wildfire propagation forecast. The service may be used by the authorized personnel of fire-fighting units as an aid to coordinate their actions during an emergency, as a tool for training of personnel, and to provide scenarios that can be used for prevention activities. As an example of the application of this portal, we show the simulation of a real wildfire that caused severe damage and compare the results with the effects of the real event. Another promising example from geoscience was recently published by Versteeg and Johnson (2013). Finally, in the conclusion, we will evaluate the completed tasks and discuss future developments.

2 GRIDA3: A Shared Resources Manager for Environmental Data Analysis and Applications

The GRIDA3 (shared resources manager for environmental data analysis and applications) project was started in 2006 as a multidisciplinary research project funded by the Italian Research Ministry. The goal was to develop an integrated grid computing environment able to deliver solutions to current environmental challenges that serves a wide range of users, from decision makers without technological expertise to technical and scientific experts. The main idea was to manage the complexity and size of environmental systems by setting up a single web portal that allows integration and sharing of data, skills, and human know-how; high-performance computing (HPC) resources, sensors, and instrumentation; and scientific applications for simulation, inversion, and visualization over multiple sites across federated domains. The target applications are restoration of polluted sites, sustainable use of natural resources, real-time imaging of near-surface structures, high-precision forecast of extreme meteorological events, and real-time forest fire simulation. From the user's point of view, the GRIDA3 portal (http://grida3.crs4.it) constitutes the entry point from where users can access different grid-empowered science sub-portals. Its web pages are built on Web 2.0 technology interacting with the grid computing framework EnginFrame, which provides a simple-to-use, efficient, and stable infrastructure to access grid computing resources via the Internet. The EnginFrame environment acts as an agent between the web server and the middleware, here the load sharing facility (LSF), and fulfills tasks like user authentication/authorization, monitoring of hardware resources, control of actions like data up- and download, execution and monitoring of jobs, and gathering and transformation of results into formats required on the client side. Application interfaces can be tailored to the specific users’ skills or access rights, permitting access to and control of computing and engineering resources. Intuitive, standard-compliant web-browser-based user interfaces are built from service definition files that follow customizable XML schemata. Thanks to the use of advanced web widgets specifically designed for computing- and data-intensive activities, applications can be controlled in a user-friendly and effective way, preserving user productivity even when handling complex tasks. Summing up, the technological objective of the GRIDA3 portal is—following the distributed information system paradigm—to provide authorized users who are part of a system of virtual organizations with a transparent, easy, and safe access point to cloud/grid computing and storage facilities, sensors, databases, and scientific software.

The grid-enabled web applications span over five domains:

  • AGISGRID, dealing with the development of a series of applications based on GIS (geographic information system) technologies

  • AQUAGRID, focused on subsurface hydrology and water resources management

  • BONGRID, related to remediation and monitoring of contaminated sites

  • EIAGRID, enabling geophysicists to perform real-time subsurface characterization in the field by on-the-fly seismic data processing

  • PREMIAGRID, centered on atmospheric modeling, offering a wide range of services for weather forecast and climate studies at regional scale

Each tool was implemented using previously developed stand-alone applications. Porting these applications to the cloud/grid environment shifted the user interaction from a single desktop to a collaboratively used web-based HPC system. This made it necessary, on the one hand, to adapt the core algorithms to a massively parallel environment where this had not already been done and, on the other hand, to dedicate attention to an appropriate design of the user interaction with the portal, particularly in view of multi-user collaboration and sharing and of implementing a fine-grained access policy for different user groups and virtual organizations. The experience with different kinds of applications and their specific needs also provided impetus for new features and further development of EnginFrame, e.g., for interacting with environments and tools for building spatially enabled Internet applications (MapServer, msCross) and databases (PostgreSQL). Regarding multi-user management, it was important to consider different user experience levels, such as guest, standard, and expert users, when designing user interfaces and the possible workflows users might perform, depending on the type of input/output data and on the actions allowed for a user category. The graphical user interface (GUI) elements offered by EnginFrame can be customized and combined via XML scripts that describe, e.g., actions represented by calls to external scripts written in XML, Perl, shell, or Java through which input parameters are read and commands related to the requested service are executed. For more complex services, auxiliary Java files are called that define specific features like including descriptive figures into the input parameter fields, choosing dates from a pop-up calendar, or displaying a progress bar for file uploading. Parallel job execution on the cloud/grid environment is managed by the middleware LSF and implemented by including LSF commands for job management into the executable instructions that launch the applications. Using LSF under EnginFrame enables useful features such as displaying the job evolution or cluster load directly in the browser. As a result of the project, the suitability of EnginFrame for science portals was demonstrated and improved by applying it to the field of earth sciences, and some of the features requested by the application porting have finally been integrated into the stable release of EnginFrame. In the following section, we will briefly summarize the technologies that were necessary to build GRIDA3.
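
To illustrate the kind of LSF-based job handling described above, the following minimal Python sketch submits a job with bsub and polls its state with bjobs. The queue name, job script, and helper functions are hypothetical placeholders and are not part of GRIDA3 or EnginFrame; only the standard LSF client commands are assumed to be available.

```python
# Minimal sketch of LSF-based job handling (assumptions: the LSF client tools
# "bsub" and "bjobs" are on PATH; queue name and script path are placeholders).
import re
import subprocess

def submit_lsf_job(command, job_name, n_slots=4, queue="grida3"):
    """Submit a parallel job to LSF and return its numeric job ID as a string."""
    bsub = ["bsub", "-J", job_name, "-n", str(n_slots), "-q", queue,
            "-o", f"{job_name}.%J.out", command]
    out = subprocess.run(bsub, capture_output=True, text=True, check=True).stdout
    # bsub answers with a line like: Job <12345> is submitted to queue <grida3>.
    return re.search(r"Job <(\d+)>", out).group(1)

def job_status(job_id):
    """Return the LSF state of a job (PEND, RUN, DONE, EXIT, ...)."""
    out = subprocess.run(["bjobs", "-noheader", job_id],
                         capture_output=True, text=True).stdout.split()
    return out[2] if len(out) > 2 else "UNKNOWN"

if __name__ == "__main__":
    job_id = submit_lsf_job("./run_crs_stack.sh project_01", "crs-stack")
    print(job_id, job_status(job_id))
```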

2.1 The GRIDA3 Hardware Infrastructure

The system architecture of the GRIDA3 hardware infrastructure comprises a complex set of technologies that can be structured in four layers:

  1.

    Computing, storage, network connections, and basic services systems. The computing system consists of a low-latency grid of three small Linux PC clusters with 70 quad-core nodes in total, interconnected at 2 Gbit/s. A high-speed storage system, based on a distributed file system, is directly connected to the grid, sharing the same network switch. Backup and snapshot management are connected to medium-speed storage. The basic services on this level consist of user authentication and authorization, web servers, and security.

  2.

    Grid management. It is performed by the commercially supported grid portal EnginFrame. The main building blocks of EnginFrame are services, defined by an XML representation of computing-related facilities such as a seismic-imaging application or a query to the load sharing facility, e.g., to find pending jobs. The use of pluggable server-side XML services allows the decoupling of the current grid environment and the grid computing framework and enables easy integration of common workload schedulers.

  3.

    Applications level. It contains the set of scripts and configuration files required to run the different applications hosted on the portal.

  4.

    Web user interaction. Thanks to intuitive Web 2.0 interfaces, end users manage their computing and engineering resources via a web browser. The technological complexity of the lower levels is completely hidden from the end user.

To access a service of the GRIDA3 web portal, a registered end user logs in to the web site generated by the EnginFrame server and can browse service offerings and documentation as on a normal web site. He or she can monitor jobs and the status of computing resources, select services from the left-side frame, and interact with the selected service, e.g., upload data from remote, insert parameters, select input files, or submit jobs and view or download the job output. After the service is executed, the results are collected in a spooler zone, a system scratch area private to the user, which allows him or her to browse and download output files.

3 EIAGRID Portal for Real-Time Imaging of Seismic and GPR Data

The basic idea that led to the development of the EIAGRID cloud computing portal for real-time imaging of seismic and GPR data is depicted in Fig. 1. In-field processing equipment is substituted by remote access to high-performance grid computing facilities which can be controlled by a user-friendly web-browser interface from the field using a wireless Internet connection. For the development, it was necessary to focus on two main aspects: the geophysical applications hosted on the portal and the computing framework in which these applications were embedded. Even though EIAGRID started as a grid computing portal, we will use the terms grid and cloud synonymously in the following. At the current state of development, the portal can be classified as a private cloud that provides software as a service (SaaS). With an increasing number of users and hosted applications, it could evolve into a community cloud providing a platform as a service (PaaS) for geophysical studies.

Fig. 1 Main scheme of the cloud computing service: in-field processing equipment is substituted by remote access to high-performance grid computing facilities which can be controlled by a user-friendly web-browser interface even from the field using a wireless Internet connection

3.1 Reflection Methods for Subsurface Imaging

In order to make the aim, scope, and functionality of the EIAGRID portal more transparent to readers from other fields of environmental sciences, a brief introduction to the fundamentals of seismic reflection imaging will be given in the next three sections. There, we will discuss how the data is acquired, which preprocessing steps it has to undergo before entering the imaging phase, and what imaging methods are used to produce a structural image of the subsurface together with important subsurface properties as needed by many kinds of environmental, geotechnical, hydrogeological, and archaeological studies. We use the example of reflection seismics for this introduction but want to point out that EIAGRID and the presented methods are also valid for multi-offset ground-penetrating radar (GPR) data. The main difference between these methods is the type of waves that are used to probe the subsurface and thus also the source and receiver technology. GPR uses high-frequency (~1 to 1,000 MHz, usually polarized) radio waves generated by an emitting antenna and transmitted into the subsurface. There, waves are reflected or diffracted by buried objects or boundaries with different dielectric constants. The part of the wavefield that finally reemerges at the surface is recorded by receiving antennas. The principles involved are similar to reflection seismology, except that electromagnetic energy is used instead of elastodynamic energy and reflections appear at boundaries with different dielectric constants instead of different acoustic impedances. Electromagnetic waves used for GPR have a much smaller wavelength and therefore a higher resolution power (~0.01 to 1 m) but also a smaller penetration depth (~0.1 to 25 m) than the different types of elastodynamic waves used for reflection seismic surveys (resolution power ~1 to 25 m and penetration depth ~10 to 10,000 m).

3.1.1 Seismic Data Acquisition

The first seismic reflection survey in history was carried out near Oklahoma City, USA, on the 4th of June in the year 1921 (Brown 2002). Since that time, seismic exploration methods have evolved considerably, and seismic data acquisition is carried out all over the world—on land as well as at sea. For marine surveys, 3D acquisition has become standard, routinely applied by the oil industry. On the open sea, sources and receivers can be distributed without problems on the measurement surface, resulting in a 5D multi-coverage dataset from which a 3D image of the subsurface structure can be produced. Just to give an idea of the dimensions, in the year 2013, the multiclient data library of PGS, a big geophysical service company, covered 425,000 km² of 3D and 294,000 km of 2D data. For land seismics, data acquisition is in general more demanding due to topography, housing, infrastructure, etc., and thus, 2D surveys, where sources and receivers follow a line and a 2D cross section of the subsurface is produced, are still frequently applied, particularly for environmental projects. The cloud computing portal presented in the following is limited to this 2D case. Therefore, our introduction is also confined to this case. A future update of the presented cloud portal services to 3D is straightforward, and 3D versions already exist for all the implemented 2D applications.

Moving-coil electromagnetic geophones that sense particle velocity or acceleration in the vertical direction (P-waves) or horizontal direction (S-waves) are usually employed as receivers in land seismic acquisition. The seismic source on land is usually either a sledgehammer, an explosive charge planted in a borehole, a seismic shotgun, or Vibroseis, a vibrating mechanism mounted on a heavy vehicle.

Typically, for a reflection seismic survey, hundreds (2D) to tens of thousands (3D) of seismic source events, so-called shots, are generated. The typical geometry of a single shot experiment is depicted in Fig. 2. Each shot emits seismic energy into the subsurface that is partly transmitted and partly reflected at velocity/density discontinuities. The reflected energy which finally reaches the surface is then recorded at different distances from the source by receivers called geophones that transform ground movement or pressure into an electrical voltage. The amplitudes of this voltage, representing the subsurface response at a receiver location in a certain moment, are recorded in a time series called seismogram or trace. All traces of a certain shot use the time of the source event as reference time zero. For 2D acquisition, the data can be represented by a 3D data cube, e.g., with the axes: shot coordinate, shot–receiver distance, and time. As we will see in the following, for processing, it is often preferable to use another coordinate system where the traces are sorted with respect to the midpoint position x_m of shot and receiver and the half-offset h, i.e., half of the distance between shot and receiver.
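
As a concrete illustration of this re-sorting, the following Python sketch converts shot/receiver coordinates into midpoint and half-offset and groups traces into CMP order. The trace container (a list of dictionaries) and the field names are purely illustrative assumptions, not a format used by the portal.

```python
# Minimal sketch of the coordinate change described above: each trace, recorded
# for a shot at x_s and a receiver at x_r, is re-indexed by midpoint x_m and
# half-offset h. The trace container is purely illustrative.
def to_midpoint_halfoffset(traces):
    """Annotate traces with midpoint and half-offset and sort them into CMP order."""
    for tr in traces:
        tr["x_m"] = 0.5 * (tr["x_s"] + tr["x_r"])   # midpoint of shot and receiver
        tr["h"] = 0.5 * abs(tr["x_r"] - tr["x_s"])  # half the shot-receiver distance
    # Sorting by (x_m, h) groups all traces of a common-midpoint (CMP) gather.
    return sorted(traces, key=lambda tr: (tr["x_m"], tr["h"]))

gather = to_midpoint_halfoffset([
    {"x_s": 0.0, "x_r": 20.0, "data": []},
    {"x_s": 10.0, "x_r": 10.0, "data": []},
])
```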

Fig. 2 Common shot experiment. A common shot (CS) gather contains all traces that have one and the same shot coordinate in common. The rays depicted in blue are transmitted by the first and reflected by the second interface (Figure according to Höcht 2002)

3.1.2 Preprocessing

Before an acquired dataset can enter the processing phase (imaging/inversion), it is subject to many preprocessing steps. Initially, it contains a multitude of different wave types. For seismic reflection imaging, only primary body waves of a specified wave mode, usually compressional waves (P-waves), but sometimes also shear waves (S-waves), are considered as signal.

All other wave types including multiply reflected waves (multiples), surface waves, refracted waves, converted waves, and often also primary reflections of other wave modes are treated as coherent noise. In addition, the data contains also incoherent noise, i.e., random noise, caused, for instance, by traffic, industry, or wind shaking of trees. Examples for some of the different kinds of coherent and incoherent noise can be seen in the shot seismogram depicted in Fig. 3.

Fig. 3 Shot gather extracted from a land seismic dataset. In addition to the reflection events (some of them marked by the letter D), various other kinds of events can be observed, e.g., the direct wave (A), head wave (B), and ground roll (surface waves) (C) (Figure according to Mann et al. 2004)

Generally speaking, the main issue of every seismic-imaging workflow is, besides the imaging itself, the removal of all components of the data which are not intended to be imaged. The first step in this direction is the preprocessing phase during which:

  • Geometry information is assigned to the traces, and bad traces, e.g., resulting from a corrupted receiver, are zeroed out.

  • Samples at small traveltimes that are not expected to be related to reflection events are muted.

  • Deconvolution is applied to increase the temporal resolution. During a predictive deconvolution, reverberations or short-period multiple reflections are removed from seismic traces by applying a prediction-error filter.

  • Band-pass filtering is used to suppress noise that lies outside the expected signal bandwidth.

  • A dip filter serves to remove coherent noise in the f–k domain, since such events can often be distinguished by their much steeper traveltime-versus-offset dip.

  • Trace balancing is applied to correct for amplitude variations along the line caused, e.g., by varying geophone ground coupling and changing near-surface conditions.

  • Field static corrections compensate the influence of the topography and the weathering layer as far as possible.

For a more comprehensive list of preprocessing steps, we refer the reader to Yilmaz (1987). In practice seismic preprocessing is cumbersome and detail-laden. The applied corrections typically vary with respect to location within the survey area, source event, source–receiver offset, and time within the seismic trace. As a result, the seismic processor must usually perform a tedious analysis of the dataset to select appropriate parameter values for every processing operation.

3.1.3 Seismic Imaging/Data Processing

Generally speaking, the final aim of seismic reflection imaging is to obtain a depth image of the subsurface from the time-domain multi-coverage prestack data. This process can be roughly divided into two principal tasks:

  • Stacking, i.e., suitable summation of the recorded prestack data with the purpose of reducing its amount for further processing and of using its inherent redundancy to attenuate incoherent noise (a small numerical illustration of this effect is given after this list). Assuming a perfect summation of all seismic energy that stems from the same reflection point in depth, the signal-to-noise ratio can theoretically be improved by a factor of \( \sqrt{N} \) (Yilmaz 1987), where N is the number of contributing traces. Aside from improving the signal-to-noise ratio, the illumination of one and the same subsurface point by many different experiments is necessary to obtain information about the wave propagation velocity that provides the link between traveltime and distance.

  • Migration, i.e., a transformation of the time-domain records to a depth-domain image by placing reflections at the correct reflector positions and focusing diffractions at the associated diffraction points. An intermediate process is the so-called time migration where the reflected energy is represented in time but at a location where a diffracted (image) ray would emerge vertically at the surface. Besides a few exceptions (e.g., Mann 2002; Bonomi et al. 2012), virtually all time/depth migration procedures demand at least the knowledge of an initial macro-velocity model. In contrast to the short-wavelength velocity variation which gives rise to the recorded seismic reflection events, this model can be thought of as representing the long-wavelength component of the true subsurface velocity distribution. Information about the velocity model is obtained from the seismic data itself, together with existing geological information.
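
The \( \sqrt{N} \) improvement quoted above can be checked numerically. The short Python sketch below stacks synthetic noisy copies of one reflection wavelet; all numbers (trace count, wavelet, noise level) are arbitrary choices for the illustration and have no connection to the field data discussed later.

```python
# Numerical illustration of the sqrt(N) signal-to-noise gain of a straight stack
# (synthetic data only; wavelet, noise level, and trace count are arbitrary).
import numpy as np

rng = np.random.default_rng(0)
n_traces, n_samples = 48, 500
t = np.linspace(0.0, 0.5, n_samples)
signal = np.sin(2 * np.pi * 30 * t) * np.exp(-((t - 0.25) ** 2) / 0.002)  # 30 Hz wavelet
gather = signal + rng.normal(0.0, 1.0, size=(n_traces, n_samples))        # noisy copies

stack = gather.mean(axis=0)                     # simple straight stack
noise_single = (gather[0] - signal).std()       # noise level of one raw trace
noise_stack = (stack - signal).std()            # residual noise after stacking
print(f"noise reduced by a factor of {noise_single / noise_stack:.1f} "
      f"(theory: sqrt(48) = {np.sqrt(48):.1f})")
```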

The order in which stacking and migration are applied is not fixed. On the one hand, it is possible to stack the data in the time domain and to apply the migration afterward (poststack migration). On the other hand, the migration can be applied first, before the migrated data is stacked (prestack migration). In practice, both approaches are closely interrelated since the information used to build an initial macro-velocity model for migration is usually obtained by stacking in the time domain. Since no depth axis has to be introduced, a time-domain velocity model, which is far easier to obtain than its depth-domain counterpart, suffices for the time-migration process.

Various migration methods exist, but most of them are based on the assumption that each reflection point in the subsurface can be treated as a diffraction point. Utilizing a known macro-velocity model, the associated diffraction operator is calculated analytically, by ray tracing or by finite-difference methods, and all seismic energy along this operator is summed up. In other words, a summation over all possible reflections at the common-reflection-point (CRP) is carried out, with the assumption that only the true reflection will constructively contribute to the summation result and that everything else will be subject to destructive interference. Under this assumption, prestack migration theoretically provides the best possible image of the subsurface. Practically, the required velocity model is not known a priori. It has to be derived from an initial model by means of iterative application of prestack migration itself and sophisticated methods to update the velocity model until the migration result is consistent with the data.

In order to separate the summation of amplitudes from the migration, which requires a velocity model, a macro-velocity model-independent stacking approach can be deployed. In the ideal case, the stacking process would have to identify and sum up all amplitudes related to one and the same reflection point in depth. In other words, the stack would have to be applied along the so-called CRP trajectories. However, a strict identification of the CRP trajectories and their associated reflection points in depth is generally impossible, as the exact velocity distribution of the subsurface would need to be known. A pragmatic solution to this problem is to employ an approximate description of the CRP trajectories by parameterizing them in such a way that the parameters can be directly determined from the prestack data. This allows the amplitudes pertaining to a certain reflection point to be summed directly in the prestack data.

The depth location of the reflection points remains unknown and, thus, subject to a subsequent migration algorithm, whereas the offset dependency of the reflection traveltimes associated with a single CRP then provides the information needed for the construction of a velocity model, apart from borehole measurements and geological a priori knowledge. The parameterization of the CRP response should make as few assumptions as possible regarding the subsurface structure but involve only a reasonable number of free parameters. Furthermore, a sound physical interpretation of these parameters should exist. Such a space–time adaptive, data-driven approach for stacking and migration in the time domain is used by the EIAGRID portal, which will later be presented in detail. Here we will have a brief look at the fundamentals on which the methods implemented in EIAGRID are built. The classical but still frequently used approach for stacking and velocity analysis is the common-midpoint (CMP) stack, an early predecessor of the common-reflection-surface (CRS) stack used in EIAGRID. The CMP stack method introduced by Mayne (1962) was a breakthrough in the early days of seismic data processing. At that time, the available computing power was still very limited. Therefore, the parameterization of reflection events used for stacking had to be as simple as possible. Mayne assumed a horizontally layered medium, where the reflection events measured on different traces in a CMP gather stem from a CRP in the subsurface located directly beneath the CMP location (see Fig. 4a).

Fig. 4 Common-midpoint geometry. (a) CMP geometry for a horizontal reflector. (b) CMP geometry for a dipping reflector. In (a) the model consists of a single horizontal reflector embedded into two constant velocity layers. All rays associated with one CMP location illuminate the same subsurface point. In (b) a single dipping reflector separates two constant velocity layers. In this case, the CMP experiment illuminates more than one subsurface point (reflection point dispersal) (Figure taken from Müller 1999)

For a horizontal reflector over a homogeneous layer, the reflection response of a CRP is exactly described by the equation

$$ {t}^2(h)={t}_0^2+\frac{4{h}^2}{v_{\mathrm{NMO}}^2}. $$
(1)

This hyperbolic equation is parameterized by a single parameter, the so-called normal moveout (NMO) velocity, which for such a simple medium is equal to the root-mean-square (RMS) velocity. Such a simple geometry is of course not always met. If the reflector dip is not zero, reflection point smearing occurs, as shown in Fig. 4b, degrading the lateral resolution of the stacking result.

For CMP processing, a time-dependent NMO velocity function is created for every CMP gather (or a representative subset) by interpolating the NMO velocities detected for the most prominent reflection events. The latter is done using the so-called semblance spectra that display for every time sample and for every velocity value within a certain range the coherence between moveout prediction and measured data (see Fig. 5). Normally, handpicked maxima of these spectra provide the searched-for NMO velocities.
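
The following Python sketch implements the velocity scan just described: Eq. (1) is used to apply trial NMO corrections, and the semblance coherence measure of Neidell and Taner (1971) is evaluated per zero-offset time. The array layout, window length, and parameter names are our own illustrative assumptions, not the portal's implementation; offsets are full source–receiver distances, i.e., x = 2h in the notation of Eq. (1).

```python
# Sketch of a semblance-based stacking velocity scan over a single CMP gather.
# cmp_gather has shape (n_offsets, n_samples); all parameter names are illustrative.
import numpy as np

def semblance_spectrum(cmp_gather, offsets, dt, velocities, win=11):
    """Return semblance as a function of zero-offset time t0 and NMO velocity."""
    offsets = np.asarray(offsets, dtype=float)
    n_tr, n_t = cmp_gather.shape
    t0 = np.arange(n_t) * dt
    spec = np.zeros((n_t, len(velocities)))
    kernel = np.ones(win) / win                 # short time window for the semblance
    for iv, v in enumerate(velocities):
        # hyperbolic moveout, Eq. (1) with full offset x = 2h: t^2 = t0^2 + x^2/v^2
        t_nmo = np.sqrt(t0[None, :] ** 2 + (offsets[:, None] / v) ** 2)
        idx = np.clip(np.round(t_nmo / dt).astype(int), 0, n_t - 1)
        corrected = np.take_along_axis(cmp_gather, idx, axis=1)   # NMO-corrected gather
        num = corrected.sum(axis=0) ** 2                          # stacked energy
        den = n_tr * (corrected ** 2).sum(axis=0) + 1e-12         # total energy
        spec[:, iv] = np.convolve(num, kernel, "same") / np.convolve(den, kernel, "same")
    return spec  # maxima indicate the NMO velocities of reflection events
```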

Fig. 5 Stacking velocity analysis. (a) Muted CMP gather. (b) Velocity spectrum. Semblance values are plotted as a function of ZO traveltime and NMO velocity. Maxima (dark) correspond to reflection events in (a) (Figure taken from Duveneck 2004)

If only a subset of CMPs was used, the CMP velocity functions need to be laterally interpolated to perform the NMO correction of the complete multi-coverage data. During this process, the offset-dependent reflection times are corrected to the corresponding zero-offset (ZO) reflection time, which is related to the configuration depicted in Fig. 6. After NMO correction, reflection events recorded on different traces of a single CMP gather should be flat and sum up constructively when stacked in offset direction. The result is a single trace per CMP with a signal-to-noise ratio that is much higher than that of the initial prestack traces. In other words, the prestack dataset is replaced by a much smaller poststack dataset of much higher signal quality.

Fig. 6 In zero-offset configuration, the single experiments have coincident shot and receiver locations, a geometry that is very favorable for interpretation but not practical for seismic experiments. Standard GPR data acquisition uses this configuration (Figure according to Höcht 2002)

Later, Levin (1971) added a correction term to Eq. (1) to consider plane reflectors with dip Φ:

$$ {t}^2(h)={t}_0^2+\frac{4{h}^2{ \cos}^2\Phi}{v_{\mathrm{NMO}}^2}. $$
(2)

However, this correction removes only the influence of the dip on the velocity but does not remove the reflection point dispersal itself.
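
A small worked example, with purely illustrative numbers, shows the size of this velocity bias: for a medium velocity of 2000 m/s and a reflector dip of Φ = 30°, fitting the dip-free Eq. (1) to the moveout of Eq. (2) yields an apparent stacking velocity of

$$ v_{\mathrm{app}}=\frac{v_{\mathrm{NMO}}}{ \cos \Phi}=\frac{2000\ \mathrm{m/s}}{ \cos 30^{\circ}}\approx 2309\ \mathrm{m/s}, $$

i.e., an overestimation of roughly 15 % that would propagate into any velocity model derived from the picked stacking velocities.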

Nowadays, the so-called dip moveout (DMO) correction (see Deregowski 1986; Hale 1991) is applied to precondition the data for CMP processing. This process can be seen as a partial migration with the aim to remove the influence of the reflector dip from the prestack data in such a way that the reflection response of a CRP is again located within the CMP gather—as in the case of a non-dipping reflector. This is done for a specific ZO traveltime by summing up for each offset the contributions of all possible dips along the DMO operator and putting the result into the CMP gather. This is justified by a similar assumption as used by prestack depth migration and many other migration methods: it is assumed that only the amplitudes along the true CRP trajectory will result in a constructive summation of signal, whereas the summation along the remaining trajectories will be subject to destructive interference of noise. A drawback of such a “blind” stack approach is that no explicit knowledge of reflector dips is obtained that could later complement the NMO velocity information. DMO processing has greatly extended the accuracy and usefulness of the CMP method. While oil exploration seismics is shifting more and more to imaging workflows based on prestack depth migration, the workflow NMO/DMO/stack plus poststack migration has remained the workhorse in environmental science due to its robustness and low hardware requirements.

4 EIAGRID Project

During the last two decades, the successful employment of seismic reflection surveys using P- or S-wave data has been reported for many engineering, geotechnical, environmental, and hydrogeological studies (e.g., Goforth and Hayward 1992; Woorely et al. 1993; Ghose et al. 1998; Liberty 1998; Benjumea et al. 2003; Guy et al. 2003; Bradford et al. 2006; Pugin et al. 2009). In a quite similar way, multi-offset ground-penetrating radar (GPR) surveys have demonstrated their great potential for imaging archaeological targets, geological stratigraphy, and hydrogeological structure of shallow and ultra-shallow subsurface regions (e.g., Berard and Maillol 2006, 2008; Booth et al. 2008; Perroud and Tygel 2005; Goodman et al. 2011). These developments went hand in hand with improvements in instrumentation and acquisition methods. As a result, the acquisition of both seismic data and multi-offset GPR data became much more economical in terms of instrumentation cost and required field work (see, e.g., Van der Veen et al. 2001; Sambuelli et al. 2001; Pugin et al. 2009). Nevertheless, reflection seismics and multi-offset GPR are still not considered standard tools in the abovementioned fields even though these high-resolution imaging methods can provide very useful results. In our opinion there are two main reasons for this.

With respect to stacking velocity analysis and residual static corrections, the following considerations can be drawn. For shallow and ultra-shallow surveys, seismic velocities have large and often unpredictable variations (Miller and Xia 1998 and references given therein). Velocity changes of an order of magnitude can happen within only a few meters of vertical or horizontal displacement. Huge changes of velocity are typically encountered at the interface between bedrock and overlying sediments as well as at the separation between the unconsolidated part of the vadose zone and the underlying saturated zone (Birkelo et al. 1987; Miller and Xia 1997). The searched-for normal moveout (NMO) velocities are integral values that depend on all the layers above the reflector under consideration. Therefore, depth acts as a smoothing filter, and NMO velocities at several hundred or thousand meters of depth vary much more smoothly than for shallow and ultra-shallow surveys where the target depth lies between a few meters and a few hundred meters. Residual static corrections are usually considered to be surface consistent and thus independent of target depth. Nevertheless, we find a higher static shift-to-dominant-period ratio in shallow and ultra-shallow surveys. Obtaining a good stack section, correctly interpretable from both structural and stratigraphic points of view, depends strongly on the detail with which velocity analysis and residual static corrections are carried out. Carefully analyzing velocity spectra for every single CMP gather is a very time-consuming and usually the most expensive processing step for shallow seismic imaging. In addition, severe stretch muting, as required by conventional NMO correction techniques to avoid stretch-related artifacts, can seriously harm the signal-to-noise ratio, and more sophisticated methods require a considerable time commitment (Miller and Xia 1998; Brouwer 2001; Masoomzadeh et al. 2010).

Acquisition parameters such as maximum offset, source spacing, receiver spacing, recording time, sampling interval, or source energy must be chosen with respect to the aim of the survey but also with respect to the a priori often unknown surface and subsurface conditions encountered at the site on the day of acquisition. Above all, temporal changes of the soil moisture can have a significant influence on near-surface data (Jefferson et al. 1998). Despite the large improvements in acquisition technology, finding the optimum acquisition parameters has remained a difficult task that requires much experience and intuition. To this day, in-field processing equipment able to do real-time data analysis and imaging directly in the field is mostly unavailable for near-surface studies. The reasons for this are, besides the economic constraints and the brief acquisition time frames, also the aforementioned near-surface-specific difficulties for processing. Often, only the raw traces can be displayed by the recording device, and a trial-and-error adjustment of acquisition parameters in the field is not possible. If days later, when the data is finally processed, the results turn out to be disappointing because a crucial acquisition parameter was chosen wrongly, the whole acquisition campaign has to be repeated or considered a failure. For these reasons, the development of real-time data processing tools for the field has long been anticipated by environmental geophysicists and engineers (see, e.g., Steeples et al. 1997).

In hydrocarbon exploration, where reflection seismic surveys have always been the standard high-resolution imaging tool, these problems are less severe due to the larger target depth, acquisition time frame, and budget that allow for the use of recording trucks equipped with powerful in-field processing hardware. The presented grid computing portal aims at reducing these problems in near-surface applications as well and at supporting in this way a broader use of seismic reflection profiling and multi-offset GPR surveys.

4.1 Web Portal

The primary objective of the EIAGRID computing portal was to emulate, according to the scheme depicted in Fig. 1, heavy and expensive in-field processing equipment by a cloud computing solution that requires just a simple laptop or PC connected via the Internet to a cloud of high-performance computing resources. Besides this, the portal serves as a platform for sharing and remote collaboration that can be useful not only for data acquisition but also on a much broader scope. A user-friendly web-browser interface provides secure and transparent access for a group of authorized users, displays the status of the computing resources, allows the creation of projects, and—most importantly—controls a set of advanced visualization and processing tools. For mobile use, a fast wireless Internet connection is required to upload the raw data and to ensure a smoothly functioning web interface. The need for time-consuming human interaction is minimized by the implementation of computationally intensive but highly data-driven algorithms. For the parallelized applications, the huge available computing power allows processing times to be reduced significantly. Furthermore, the grid deployment permits the parallel testing of alternative processing sequences and parameter settings, a feature which considerably shortens the time needed to obtain the final results.

The hosted applications were selected with the objective to construct typical 2D time-domain seismic-imaging workflows as used for shallow and ultra-shallow applications. For data visualization and preprocessing, we chose the free software package Seismic Un*x provided by the Colorado School of Mines (Cohen and Stockwell 2000). We ported tools for trace balancing, amplitude gaining, muting, frequency filtering, dip filtering, deconvolution, and image rendering as services on the cloud computing portal, each one with a customized choice of options. For structural imaging and velocity analysis, we developed a grid version of the common-reflection-surface (CRS) stack (see, e.g., Jäger et al. 2001; Mann 2002; Heilmann 2007). This data-driven imaging method can largely benefit from the hardware parallelism provided by the cloud deployment due to its high level of automation. CRS-based residual static corrections (Koglin et al. 2006) are calculated as a by-product of the stack and can be applied in an iterative way. Besides a simulated zero-offset section of high signal-to-noise ratio, a coherence section and three stacking parameter sections are also obtained. The latter provide the input for the estimation of a smooth time-migration velocity model. As a final imaging step, a parallelized prestack time-migration scheme reverses the effects of wave propagation in order to transform the preprocessed data into an image which resembles the shape and location of the geological interfaces better than a simulated zero-offset section. We chose time migration because it is far less sensitive to velocity errors than depth migration. The resulting time-migrated image is still defined in the space–time domain, but diffraction events are collapsed to points and triplications are transformed into synclinal structures.

Processing can be done step-by-step or using a graphical workflow editor that can launch a series of pipelined tasks. The status of the cluster and of submitted jobs is monitored by dedicated services. An example of viewing the status of the submitted jobs is depicted in Fig. 7, which shows the My Jobs view after starting the CRS stacking service. Under My Data, all imaging results, stored in the project spooler as image files, can be downloaded or viewed directly in the browser. Processing results are stored in permanent storage folders as data files that can be chosen as input for successive processing steps.

Fig. 7 My Jobs view of the grid portal, showing 15 jobs that are currently running (yellow), 9 jobs that are pending (gray), and 6 jobs that are already done (green)

4.2 Services

In the following, we will discuss the principal services provided by the portal: data upload and format conversion, data visualization, data preprocessing, stacking, residual static correction, velocity model estimation, and time migration.

4.2.1 Data Upload

A typical 2D seismic survey is carried out by performing a multitude of so-called common shot experiments. The portal allows uploading the recorded data to the remote computing facilities immediately after each shot. Using the web-browser-based graphical user interface, a single shot gather or the complete range of shots can be uploaded, preprocessed, processed, and visualized while the acquisition still takes place (see Fig. 8). Particularly for the wireless data upload, high-speed 3G networks using UMTS (Universal Mobile Telecommunications System) or HSUPA/HSDPA (High-Speed Uplink/Downlink Packet Access) protocols are required. Supported data formats are SEGY and SEG2. At any stage of the data collection, newly uploaded shot gathers can be concatenated with those that are already on the server, creating a new dataset ready to be processed.

Fig. 8 Simultaneous upload of several shot gathers in the SEG2 format

4.2.2 Visualization and Preprocessing

For GPR and seismic reflection imaging, accurate preprocessing has a strong influence on the reliability of the final subsurface image. During data acquisition, a basic collection of preprocessing tools with a limited choice of parameter settings is usually sufficient for a quality control (QC) of the acquired data. For the visualization and preprocessing services, we chose the applications of the free software package Seismic Un*x. This package features a multitude of data visualization, manipulation, and processing tools that are applied via the command line in a Unix-like manner. We included a small subset of these tools with a customized choice of options into the EIAGRID GUI (see Fig. 9). In this way, the following preprocessing tasks can be performed conveniently from any device that supports a web browser (a minimal command-line sketch is given after the list):

Fig. 9 Tools for data preprocessing and visualization. The output of the previous process (Gain) can be viewed (Display shot gathers) and used as input for the next process (e.g., Filter) to conduct a complete preprocessing workflow

  • Visualization of shot gathers as well as frequency, frequency–wavenumber, and autocorrelation spectra

  • Muting of, e.g., early arrivals not related to reflection events

  • Application of a time-dependent gain function, e.g., to remove the effect of spherical divergence

  • Trace amplitude balancing, e.g., to correct for variations along the line

  • Band-pass filtering, e.g., to suppress the noise that lies outside the signal bandwidth

  • Dip filtering in the f–k domain, e.g., to suppress ground roll or air blast

  • Deconvolution to increase the temporal resolution and/or to remove reverberations and short-period multiple reflections
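
As a rough illustration of how such a chain could be driven server-side, the sketch below pipes a shot gather through three of the Seismic Un*x modules mentioned above (sugain, sufilter, supef) from Python. File names and all numeric parameter values are placeholders, and the exact option set should be checked against the installed SU release; the portal's actual service scripts are not reproduced here.

```python
# Sketch of a server-side preprocessing chain built from Seismic Un*x modules.
# File names and parameter values are placeholders chosen for illustration.
import subprocess

def preprocess_shot(infile="shot_0001.su", outfile="shot_0001_prep.su"):
    pipeline = (
        f"sugain tpow=2 < {infile} | "                 # gain against geometrical spreading
        f"sufilter f=10,20,120,160 amps=0,1,1,0 | "    # trapezoidal band-pass filter (Hz)
        f"supef minlag=0.004 maxlag=0.04 "             # predictive deconvolution (lags in s)
        f"> {outfile}"
    )
    subprocess.run(pipeline, shell=True, check=True)

preprocess_shot()
```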

4.2.3 CRS Stack-Based Imaging

The graphical user interfaces for CRS stack imaging (including residual static corrections), as well as for velocity model building and time migration, are very similar to the one used for preprocessing. The user has to choose one of the existing input data files and fill in the fields for the process parameters before he or she can press the Submit Simulation button. Shortly after, the My Jobs view will open and display the status of the submitted job and its subprocesses, as shown in the example given in Fig. 7.

4.2.4 CRS Stack and Velocity Analysis

To date, many examples of 2D and 3D CRS stack applications in the field of oil and gas exploration have been reported by both academic and industrial research teams for many different areas of the world (e.g., Mann et al. 1999; Bergler et al. 2002; Heilmann et al. 2006; Prüssmann et al. 2008). Besides the generation of a simulated zero-offset section of high resolution and signal-to-noise ratio, residual static corrections can also be obtained from CRS results (Koglin et al. 2006; Heilmann et al. 2006), and valuable input for velocity model building is provided (Perroud and Tygel 2005; Heilmann 2007; Prüssmann et al. 2008). We believe that the CRS method offers a great potential also for shallow and ultra-shallow seismic reflection imaging. To support this argument, we will briefly review the main concepts and principles behind CRS stacking and discuss basic processing parameters that are important for our field data examples. For a more extensive and profound description of the CRS stack method in general and its various target-specific extensions (e.g., rough topography, common offset, vertical seismic profiling, prestack data enhancement), other sources in the literature are available (e.g., Müller 1998; Jäger et al. 2001; Höcht 2002; Mann 2002; Bergler et al. 2002; Hertweck et al. 2007; Heilmann 2007; von Steht 2008; Baykulov et al. 2011).

The CRS method is based on a generalized velocity analysis and stacking procedure. In the case of 2D data, a three-parameter stacking surface is employed to obtain a simulated zero-offset section. The stacking process is not confined to single CMP gathers (stacking in offset direction), but it also includes neighboring CMPs (stacking in midpoint direction). The stacking aperture is the so-called CRS supergather, which covers all traces that contain energy reflected from a certain common reflector segment in depth centered at the theoretical reflection point of the zero-offset ray. In contrast to the normal moveout (NMO) or normal moveout/dip moveout (NMO/DMO) methods (Deregowski 1986; Yilmaz 1987), which approximate reflection traveltimes assuming a planar horizontal (NMO) or a planar dipping (NMO/DMO) reflector segment, the CRS method assumes a reflector segment of arbitrary dip and curvature, including diffraction points and planar reflectors. For a more detailed comparison between these conventional methods and CRS, see Hertweck et al. (2007). The hyperbolic CRS traveltime operator is defined by the following approximation:

$$ {t}^2\left({x}_m,h\right)={\left[{t}_0+\frac{2 \sin \left(\alpha \right)\left({x}_m-{x}_0\right)}{v_0}\right]}^2+\frac{2{t}_0{ \cos}^2\alpha }{v_0}\left[\frac{{\left({x}_m-{x}_0\right)}^2}{R_{\mathrm{N}}}+\frac{h^2}{R_{\mathrm{N}\mathrm{IP}}}\right], $$
(3)

where x_m and h are the midpoint and half-offset coordinates, respectively. The summation result is placed in the zero-offset section at the point (x_0, t_0), which represents the traveltime and emergence point of the zero-offset (or central) ray, i.e., the ray reflected at the center of the common reflection segment. The three stacking parameters in Eq. (3), α, R_NIP, and R_N, are the emergence angle of the zero-offset ray and the two wavefront curvatures of the theoretical eigen-waves denoted as normal-incidence-point (NIP) wave and normal wave (Hubral 1983; Jäger et al. 2001). These kinematic wavefield attributes are determined automatically from the prestack data by means of coherence analysis. Finally, v_0 denotes the near-surface velocity at x_0. This a priori information does not influence the stack itself but provides the link between the searched-for reflection traveltime surface and the physical interpretation of the stacking parameters α, R_NIP, and R_N.
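
For readers who prefer code to formulas, the CRS operator of Eq. (3) is transcribed below for one zero-offset sample. The attribute values (emergence angle, wavefront curvatures, near-surface velocity) and the grid extents are illustrative assumptions only.

```python
# Direct transcription of the hyperbolic CRS operator, Eq. (3), for one
# zero-offset sample (x0, t0). Attribute values below are illustrative only.
import numpy as np

def crs_traveltime(x_m, h, x0, t0, alpha, r_nip, r_n, v0):
    """Return the CRS traveltime t(x_m, h) according to Eq. (3)."""
    dx = x_m - x0
    term1 = (t0 + 2.0 * np.sin(alpha) * dx / v0) ** 2
    term2 = 2.0 * t0 * np.cos(alpha) ** 2 / v0 * (dx ** 2 / r_n + h ** 2 / r_nip)
    return np.sqrt(term1 + term2)

# Evaluate the stacking surface on a small midpoint/half-offset grid around x0.
x_m, h = np.meshgrid(np.linspace(90.0, 110.0, 41), np.linspace(0.0, 50.0, 26))
t_stack = crs_traveltime(x_m, h, x0=100.0, t0=0.2, alpha=np.deg2rad(5.0),
                         r_nip=120.0, r_n=800.0, v0=400.0)
# The prestack amplitudes along t_stack(x_m, h) would be summed into (x0, t0).
```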

The use of a spatial stacking operator increases the number of contributing traces, which allows the use of sparser surveys without loss in imaging quality (Gierse et al. 2009) and/or a more stable velocity analysis and higher signal-to-noise ratio, particularly for very shallow, strongly curved, or steeply dipping reflectors. The data-driven implementation determines independently for every zero-offset sample (x_0, t_0) those three stacking parameter values that maximize the coherence of the prestack data along the stacking operator given by Eq. (3). Semblance is used as the coherence measure (Neidell and Taner 1971). As a result, time-consuming human interaction in prestack velocity analysis, such as manual picking in velocity spectra, can be avoided, as well as the detrimental stretching effects that conventional NMO correction imposes on the traces (Mann and Höcht 2003). The latter allows larger offset ranges to be used for velocity analysis and stacking, and thus a better signal-to-noise ratio can be expected.

For CRS stacking, a spatial definition of the stacking aperture is required. In practice, the choice of the right stacking apertures is crucial as it substantially affects the lateral resolution and signal-to-noise ratio. The software used for our portal is based on the CRS stack version 4.7 (see Mann 2002) but includes several additional features such as residual static corrections, redatuming, and the support of rough top-surface topography (see Heilmann 2007). It employs a tapered traveltime-dependent stacking aperture of elliptic shape in the midpoint/offset plane with user-defined half axes given by a midpoint aperture for h = 0 and an offset aperture for x_m = x_0. This choice accounts for the approximate nature of the CRS operator, which is a hyperbolic approximation of a second-order Taylor expansion of the reflection traveltime centered at h = 0, x_m = x_0. By default, the software creates two stacked sections, one that corresponds to the user-defined aperture and one that corresponds to a midpoint aperture that is an approximation of the projected Fresnel zone W_F calculated from the stacking parameters and the estimated dominant period T of the source wavelet (for details see Mann 2002) according to the formula

$$ \frac{W_{\mathrm{F}}}{2}=\left|{x}_m-{x}_0\right|=\frac{1}{ \cos \alpha}\sqrt{\frac{v_0T}{2\left|\frac{1}{R_{\mathrm{N}}}-\frac{1}{R_{\mathrm{N}\mathrm{IP}}}\right|}}. $$
(4)

If velocity information is needed for other tasks such as poststack and/or prestack migration, time-to-depth conversion, and/or geotechnical site characterization, it can be obtained from the kinematic wavefield attributes that are determined for each stacking operation through coherence (semblance) analysis on the prestack data. As shown, e.g., by Perroud and Tygel (2005), this process fully replaces the traditional CMP velocity analysis and allows the 1D Dix velocity conversion (Dix 1955) to be substituted with a 2D tomographic inversion approach (Duveneck 2004).
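
For reference, the 1D Dix conversion mentioned above is, in its usual textbook form (Dix 1955), the layer-stripping relation

$$ v_{\mathrm{int},n}^2=\frac{v_{\mathrm{rms},n}^2\,{t}_n-v_{\mathrm{rms},n-1}^2\,{t}_{n-1}}{{t}_n-{t}_{n-1}}, $$

where t_n and v_rms,n are the zero-offset traveltime and RMS (stacking) velocity down to the nth interface; its strictly 1D character is what the 2D tomographic approach overcomes.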

4.2.5 Residual Static Correction

The estimation of surface-consistent residual static corrections is part of the CRS stack implementation. Usually, several iterations of stacking and residual static corrections have to be applied to obtain optimum results. The algorithm to estimate the static time shifts is based on a maximization of the stack power, similar to the super-trace cross-correlation method of Ronen and Claerbout (1985). The cross-correlations between stacked pilot traces and measured prestack traces are performed within the moveout-corrected CRS supergathers. The moveout correction makes use of the previously obtained wavefield attributes. Due to the spatial extent of the stacking operator, such a supergather contains many neighboring CMP gathers. For each supergather, corresponding to a specific zero-offset location, the moveout correction will, in general, be different. Since each prestack trace is included in many different supergathers, it contributes to far more cross-correlations than in methods using individual common shot or common receiver gathers. The cross-correlations of the stacked pilot trace and the moveout-corrected prestack traces are summed for each shot and receiver location. This summation is performed for all supergathers contained in the specified target zone. The searched-for residual time shifts which are finally used to correct the prestack traces are associated with the maxima in the cross-correlation stacks and can be extracted in different ways (see Koglin et al. 2006). Here we take 30 % of the global or local maximum closest to a zero time shift as minimum threshold and take the center of the area that exceeds this threshold as the estimated time shift. Furthermore, more than 50 cross-correlations have to contribute to a cross-correlation stack for the time shift to be applied. For the next iteration of residual static correction, the entire attribute search and stacking process is repeated, now using the corrected prestack dataset.
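
The core cross-correlation step described above can be sketched in a few lines of Python. The function below correlates one moveout-corrected prestack trace with a stacked pilot trace and returns the best-fitting time shift within an allowed range; the search window and array conventions are illustrative assumptions, and the actual implementation additionally stacks such correlations per shot and receiver location before the shift is picked.

```python
# Sketch of the pilot-trace cross-correlation used for residual statics.
# The maximum allowed shift and the trace arrays are illustrative.
import numpy as np

def residual_shift(pilot, trace, dt, max_shift=0.02):
    """Time shift (s) that best aligns `trace` with `pilot`, limited to +-max_shift.

    A positive value means the trace arrives later than the pilot.
    """
    xcorr = np.correlate(trace, pilot, mode="full")
    lags = (np.arange(xcorr.size) - (pilot.size - 1)) * dt
    window = np.abs(lags) <= max_shift
    return lags[window][np.argmax(xcorr[window])]
```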

4.2.6 Velocity Model Estimation

Time-migration velocity $v_{\mathrm{tm}}$ can be calculated from the CRS attributes, according to Mann (2002):

$$ v_{\mathrm{tm}}^2=\frac{2 v_0^2 R_{\mathrm{NIP}}}{2 R_{\mathrm{NIP}} \sin^2\alpha + v_0 t_0 \cos^2\alpha}. $$
(5)
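A direct transcription of Eq. (5) into Python/NumPy might look as follows (the function and argument names are ours):

import numpy as np

def time_migration_velocity(alpha, r_nip, v0, t0):
    """Time-migration velocity from the CRS attributes, Eq. (5)."""
    v_tm_sq = 2.0 * v0**2 * r_nip / (2.0 * r_nip * np.sin(alpha)**2
                                     + v0 * t0 * np.cos(alpha)**2)
    return np.sqrt(v_tm_sq)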

Besides the stack, coherence, and NMO velocity sections, a time-migration velocity panel is created by the portal as default output. The latter is used by the velocity model building service for creating a smooth time-migration velocity model by means of an iterative 2D smoothing and regularization algorithm which fills the gaps between reflections where no reliable wavefield attributes, and thus no time-migration velocity, can be obtained. The algorithm is fast and highly automated. Only a coherence threshold to discriminate unreliable attributes and an initial time-migration velocity range have to be specified. For the first iteration, all low-coherence gaps in the previously obtained time-migration velocity section are filled with a 1D gradient model based on the user-given velocity range. The resulting velocity model is smoothed using the Seismic Un*x command smooth2, which smooths uniformly sampled 2D arrays in a user-defined window via a damped least-squares technique. To keep it simple for the user, the window size is set by default to ten grid points in the time and space directions. For the next iteration, the gaps between reliable velocity values on reflection events are filled by the result of the previous iteration. After a sufficiently large number of iterations (n > 100), the gaps are filled with a two-dimensional inhomogeneous velocity distribution that fits well to the velocities picked at the events. For a more accurate velocity model building approach, it would have been necessary to move the zero-offset time-migration velocity values, before the regularization, to the apex location of the corresponding diffraction operator (see Spinner and Mann 2005), i.e., the origin of a so-called image ray that emerges vertically at the surface (Hubral and Krey 1980). We omitted this step since time migration is, in any case, less sensitive to velocity model errors than depth migration and results in well-focused images even for a roughly approximated velocity model.
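The gap-filling strategy can be sketched in a few lines of Python; here the damped least-squares smoother of smooth2 is replaced by a simple moving-average filter, and the array names, the SciPy filter, and the fixed iteration count are our own illustrative choices.

import numpy as np
from scipy.ndimage import uniform_filter

def build_velocity_model(v_picked, coherence, v_gradient,
                         coh_threshold=0.25, window=10, n_iter=100):
    """Fill low-coherence gaps in a picked velocity section and smooth iteratively."""
    reliable = coherence > coh_threshold                 # velocities picked on reflection events
    model = np.where(reliable, v_picked, v_gradient)     # first iteration: fill gaps with 1D gradient
    for _ in range(n_iter):
        smoothed = uniform_filter(model, size=window)    # stands in for Seismic Un*x smooth2
        model = np.where(reliable, v_picked, smoothed)   # keep picked values, update only the gaps
    return model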

4.2.7 Time Migration

For the prestack time-migration service, we chose a parallelized implementation, described in Spinner (2007), which employs the so-called straight-ray assumption to calculate analytically the approximate diffraction traveltime surface of a scattering point in the subsurface using a double-square-root equation. It assumes a medium that can be described by an effective velocity above each scatterer. On the one hand, this approach leads to significant savings in computational time, since no ray tracing is needed; on the other hand, it is less general than time-migration schemes that utilize complex non-hyperbolic operators in order to account for ray bending (see Robein 2003 and references given therein). Since our current velocity model building approach cannot resolve velocity distributions that are laterally strongly heterogeneous, the straight-ray assumption is sufficient for our purpose. Furthermore, this implementation calculates, from the previously obtained CRS results, an optimum migration aperture for each zero-offset location, which corresponds to the projected Fresnel zone centered at the stationary point, where migration operator and reflection event are tangent.
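Under the straight-ray assumption, the diffraction traveltime takes the standard double-square-root form; the following Python sketch shows this generic form and is not necessarily the exact expression used in Spinner (2007).

import numpy as np

def dsr_traveltime(x_image, t0, v_eff, x_source, x_receiver):
    """Double-square-root diffraction traveltime under the straight-ray assumption.

    x_image    : lateral position of the image point [m]
    t0         : two-way zero-offset time of the image point [s]
    v_eff      : effective (time-migration) velocity above the scatterer [m/s]
    x_source   : source position [m]
    x_receiver : receiver position [m]
    """
    t_half = 0.5 * t0
    t_source = np.sqrt(t_half**2 + ((x_source - x_image) / v_eff)**2)
    t_receiver = np.sqrt(t_half**2 + ((x_receiver - x_image) / v_eff)**2)
    return t_source + t_receiver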

4.3 Data Examples

In the following we present two real-data examples for which a detailed description of survey, geological setting, processing results, and their interpretation was published in Perroud and Tygel (2005) and Deidda et al. (2006), respectively. The aim of these examples is to demonstrate the ability of the presented grid computing portal to obtain results close to the published ones, which were obtained with much more elaborate and time-consuming methods. We do not want to repeat the published results here; instead we would like to invite the reader to conduct his or her own comparison. The processing parameters that we used for stacking these two datasets are displayed in Table 1. Another test that we successfully concluded with an ultra-shallow S-wave dataset (see Deidda et al. 2012) is omitted here for the sake of brevity. So far we have used preexisting data, but we will soon have the occasion to use the portal directly during data acquisition.

Table 1 CRS stack processing parameters used for the Muravera (Cagliari, Italy) and the Larreule (France) datasets

4.3.1 P-Wave Dataset from Muravera (Cagliari, Italy)

The seismic reflection survey used for the first test was conducted in the Flumendosa River Delta, close to the South Sardinian town of Muravera (Cagliari, Italy). As described in Deidda et al. (2006), P-wave data was recorded using explosive sources along two seismic lines, both approximately 1.1 km long. The acquisition area lies only a few meters above sea level, and the topography is flat with a maximum variation of ±0.5 m. The data was acquired using single 50 Hz geophones attached to a 48-channel seismograph system with 18-bit recording capability. The sample interval was 0.5 ms and the record length 1,024 ms. A 50 Hz low-cut filter with 24 dB/octave roll-off helped to attenuate the ground roll, while a standard common-midpoint (CMP) roll-along technique in an end-on configuration with 48 active geophones was employed for data recording. The 0.25 kg explosive sources were buried at approximately 2 m depth. With a few exceptions, all source locations lay below the groundwater table. A geophone spacing of 5 m and a source spacing of 10 m provided twelvefold CMP coverage with a CMP spacing of 2.5 m. The maximum source–receiver offset of 245 m was chosen to allow the determination of stacking velocities for reflections from a depth of about 200–300 m, which was expected to be the depth of the bedrock surface. Because the topography was flat, only residual static corrections were calculated and applied. We use here the data of line 1, which has a total size of 35 MB.

Starting from the raw seismograms, we applied trace balancing by dividing all amplitudes by the RMS amplitude value of the respective trace and spherical divergence correction by multiplying the amplitudes with the square root of the traveltime. To attenuate air wave and ground roll components, a dip filter rejecting phase velocities between 250 and 500 m/s and a trapezoidal band pass with the corner frequencies 30, 50, 270, and 300 Hz were applied. Finally, a deconvolution served to eliminate ringing and to reduce the temporal extent of the wavelet. To give an example of the data quality and the different preprocessing steps applied, a single shot gather is depicted in Fig. 10 in three states of preprocessing.
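The two amplitude corrections mentioned above can be written compactly; the sketch below shows one possible ordering of the two steps (the function name, array layout, and ordering are our own illustrative choices).

import numpy as np

def preprocess_gains(traces, dt):
    """Apply spherical divergence correction and RMS trace balancing.

    traces : 2D array (n_traces, n_samples) of raw seismograms
    dt     : sample interval [s]
    """
    n_samples = traces.shape[1]
    t = np.arange(n_samples) * dt
    gained = traces * np.sqrt(t)                          # multiply by the square root of traveltime
    rms = np.sqrt(np.mean(gained**2, axis=1, keepdims=True))
    rms[rms == 0.0] = 1.0                                 # avoid division by zero for dead traces
    return gained / rms                                   # divide each trace by its RMS amplitude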

Fig. 10
figure 10

Shot gather no. 77 of seismic data recorded in the Flumendosa delta close to Muravera (Cagliari, Italy). Before preprocessing (left), after application of gain and trace balancing (middle), and after dip filter, deconvolution, band-pass filter, muting, and residual static corrections (right)

Similar to the conventional processing described in Deidda et al. (2006), five iterations of surface-consistent residual statics with a maximum allowable shift per iteration of ±5 ms were applied to improve the continuity of the events. The cross-correlation stacks for the source locations 100–150 and a time window of ±5 ms are displayed in Fig. 11 for the original data (top) and the data after five iterations of residual static corrections (bottom). In Fig. 12, the CRS stack section obtained for the Muravera (Cagliari, Italy) data is depicted. Figure 13 shows the migration velocity model created from the CRS stack results by extracting and regularizing attribute values on events with a coherence larger than 0.25. Even though this process runs sequentially on a single node, it took just half a minute. Figure 14 shows the prestack time-migration results for the Muravera (Cagliari, Italy) data. Using 20 cores it took less than 1 min to obtain this result.

Fig. 11
figure 11

Cross-correlation stack for the source locations for a P-wave dataset from Muravera (Cagliari, Italy). For the original data (top), the maxima are not centered at 0 ms but indicate residual static time shifts. The data after five iterations of residual static corrections (bottom) hardly shows any remaining residual statics. Similar pictures were obtained for the receiver positions

Fig. 12
figure 12

Stack section obtained after five iterations of residual static corrections for a P-wave dataset from Muravera (Cagliari, Italy)

Fig. 13
figure 13

Migration velocity model created for the Muravera (Cagliari, Italy) data from the coherence and stacking parameter sections by extracting and regularizing attribute values on events with a coherence larger than 0.25

Fig. 14
figure 14

Prestack time-migration results created for the Muravera (Cagliari, Italy) data obtained by using the velocity model depicted in Fig. 13

Both the stacked and the migrated sections we obtained with EIAGRID agree quite well with the ones obtained in the original work (Deidda et al. 2006). However, some stratigraphic and structural features appear less clearly imaged. For instance, the bowtie features (crossing reflections) are to some extent unresolved. In addition, in some parts of the sections, the shallowest reflection shows a strongly ringing character. To understand these differences, one has to note that great importance was given to processing rapidity; as a consequence, a rough preprocessing, not oriented to individual shot records, was carried out, which could neither address noise and signal variations along the line nor effectively improve the S/N ratio. This quick-and-dirty preprocessing strongly impacts both CRS stack and time migration and constitutes the main reason for the differences. Minor processing-related reasons are a mixing of reflection events with small discontinuities, caused in some locations by a too large midpoint aperture, and artifacts caused by poorly resolved conflicting dips (Deidda et al. 2012; Garabito et al. 2012). However, the main subsurface structures, and in particular the bedrock surface, are imaged well enough to quickly reveal the primary and somewhat surprising result highlighted in Deidda et al. (2006): a maximum bedrock depth twice as deep as expected. This clearly demonstrates the pressing need for in-field real-time data analysis and imaging. Concerning data acquisition, the incorrect bedrock depth assumed when setting up the far offset did not allow stacking velocities to be estimated with the accuracy needed to migrate the seismic data well or simply to perform the time-to-depth conversion correctly. In addition, Deidda et al. (2006) also noticed the closeness of the dipping bedrock reflector to the section ends, which was a further problem for seismic imaging.

4.3.2 Ground-Penetrating Radar Dataset from Larreule (France)

As a second data example, we present results of the reprocessing of a multi-offset GPR dataset recorded close to Larreule (France), for which detailed results are presented in Perroud and Tygel (2005). The authors of this publication kindly provided us the dataset as a test case for multi-offset GPR processing. It was recorded using a Mala Geophysics RAMAC-2 four-channel control unit and two pairs of unshielded 200 MHz antennas. The aim was to obtain, besides a structural image, also velocity information that can be used to recover groundwater properties such as water content and water conductivity. As described in Perroud and Tygel (2005), repeated profiling was performed with four antennas mounted on a PVC cart with varying spacings. 56 CMPs were recorded every 0.1 m on a 55 m-long profile, each one with 28 different offsets ranging from 0.6 to 6 m. The maximum recording time was 150 ns, which corresponds roughly to a 6 m penetration depth for a mean velocity of 7.5 cm/ns. The data provided to us was already preprocessed, including, besides the geometry setup, static shifts to zero time, mean amplitude removal, tapered band-pass filtering, muting of airwaves, and amplitude balancing. The stack result obtained by using our portal is depicted in Fig. 15. Figure 16 shows the smooth time-migration velocity model that was generated by iterative smoothing of the coherence-windowed CRS stacking parameters, with a gradient model ranging between 6.5 and 9.0 cm/ns serving as the initial model for closing the gaps. The time-migration result is depicted in Fig. 17. The turnaround times for processing this dataset were very similar to those of the Muravera (Cagliari, Italy) example. The obtained results are very close to the original work. In contrast to the preceding data example, preprocessing was not an issue for this test since the same preprocessed data as used by Perroud and Tygel (2005) was taken as input for the EIAGRID imaging workflow.

Fig. 15
figure 15

Stack section obtained for the GPR dataset from Larreule (France)

Fig. 16
figure 16

Migration velocity model created for the Larreule (France) data from the coherence and stacking parameter sections by extracting and regularizing attribute values on events with a coherence larger than 0.25

Fig. 17
figure 17

Prestack time-migration results created for the Larreule (France) data obtained by using the velocity model depicted in Fig. 16

4.4 Concluding Remarks

In-field real-time processing systems have been commonplace in electrical and electromagnetic surveys for several years now, as they allow efficient data acquisition and cost-effective results. Unfortunately, such systems are still lacking both for shallow reflection seismics and for multichannel GPR surveys. For this reason, we have combined the powerful computational capabilities of a cloud computing architecture with a data-driven CRS stack-based imaging workflow, setting up an innovative, easy-to-use, and reliable imaging system. We simulated the in-field real-time processing for two different datasets, showing that our highly automated imaging workflow is able to produce stacked and migrated sections comparable to the ones obtainable with conventional CMP processing. The given examples highlight the practical applicability of the grid/cloud portal during acquisition as a tool supporting field geophysicists in improving acquisition geometry, gauging data quality, and estimating characteristic values of the subsurface in a cost-effective way. Moreover, in the case of low-quality records, it is possible to run the imaging workflow repeatedly along the line until the operator reaches an optimum trade-off between different processing settings and obtains the best possible image with which to optimize his or her acquisition setup.

In conclusion, we have devised a cutting-edge procedure that could become commonplace for shallow seismic reflection and multi-offset GPR surveys in hydrogeological, environmental, agricultural, archaeological, and geotechnical studies, similar to what happened to electrical and electromagnetic methods in the past decades. We are aware that wireless Internet access and bandwidth are key requirements to employ our cloud computing imaging procedure. This may be a weak point, as today there are still locations with poor wireless coverage and poor capacity/speed connectivity or with no connectivity at all. However, we trust in the ongoing rapid improvement of mobile Internet connectivity and bandwidth, such as the upcoming 4G standard, which will surely push cloud computing and applications, such as the EIAGRID portal, onto mobile devices. The latter will be particularly useful in the management of the huge data volume that the upcoming new acquisition systems, equipped with wireless receivers and high-density shooting, will make available. Furthermore, we hope that our cloud computing proposal is a step toward in-field data interpretation, the next challenge for the near-surface community, as it fosters the integration of seismic data with other geophysical data and gives rise to more collaboration and knowledge-sharing opportunities.

5 The Forest Fire Web Portal, PREMIAGRID Project

5.1 Introduction

In the following we briefly describe the forest fire service, an integrated system for weather and wildfire propagation forecast developed inside the PREMIAGRID portal.

The purpose of this service is to provide a tool that can be used to understand and predict the behavior of wildland fires, in order to increase the safety of the public and of the firefighters, to reduce the risk, and possibly to minimize the economic and environmental damage in case of an event.

The prediction of fire behavior is certainly not a new concept; several attempts to produce models have been made in the past, with different levels of complexity, different purposes, and varying degrees of effective usage in fire suppression activities.

The simplest models are purely empirical and are based on observation and extrapolation of the fire spread velocity and fire intensity for a given type of vegetation under given conditions of wind and terrain slope.

The interpretation of these empirical data and their integration with data from controlled-condition experiments have led to the development of semiempirical models that, given some fundamental assumptions on the mechanism of fire spread, allow a good extrapolation of the fire behavior based on measurable data on the vegetation characteristics and on the terrain and atmospheric conditions.

These models can only give point-wise estimates of the fire behavior but are extremely useful in the field, since they provide quick and efficient information to support firefighters.

Several models of this kind have been developed around the world; of particular importance are those of the US, Canadian, and Australian forest services. The US Rothermel set of equations is our reference model in the development of our fire propagation system (Rothermel 1972).

A second step in model complexity comes from proper consideration of the variation of the fire environment with position. The terrain slope, the characteristics of the vegetation, and the weather conditions are clearly not constant in position and time; therefore, the fire behavior changes from place to place. It is thus possible to calculate in each geographic point the potential behavior of a fire and, by adding a fire spread model, to predict the fire propagation in a limited geographic area from a given ignition point.

These models do not require much computational power and can be used efficiently in the analysis of fire behavior with satisfactory results. They are difficult to use in real time, for instance during an emergency, due to the long setup time required by the supporting geographic information system to gather all the information on vegetation characteristics, terrain slope, and weather conditions; furthermore, they suffer from a lack of the fine-scale wind data needed for maximum accuracy. An excellent example of such systems is the FARSITE model developed by the US Forest Service (Finney 2004).

Apart from the need for high-quality and high-resolution data, the principal limitation of this kind of approach is the lack of feedback between the fire and the weather and atmospheric conditions. A wildfire constitutes a power source for the atmospheric system; the energy release of the combustion process can affect the local winds. To include such effects in the modeling of the phenomena, a coupling between the fire spread model and the atmospheric model is required; such a coupled model is much more complex and incurs significant computational costs. Several attempts are being made to obtain this coupling. In any case, the computational power required does not allow a real-time prediction of the fire behavior, and such approaches are therefore beyond the scope of our service.

It is worth mentioning that there are several attempts at a full three-dimensional fluid dynamics modeling of wildfire behavior; in such models, the whole set of mass, momentum, and energy balance equations is solved for a system involving both air and fuel. Such models may allow a deeper understanding of some fire behavior but are clearly limited by the enormous computing power required for a relatively small model size.

The service we developed is intended to be used by skilled and authorized personnel of the firefighting service to help the planning during an emergency, to help in the training of the personnel, and to provide scenarios that can be used for prevention activities. It is accessible through a web portal and is composed of a geographic information system for the management of soil and vegetation data, a limited-area meteorological model chain that generates a high-resolution weather forecast, a very high-resolution fluid dynamic calculation of the wind in the fire area, and finally a fire spread model that allows the prediction and visualization of the evolution of the flame front position in time.

None of the skills required to manage such systems should be required of the final user, who can thus focus quickly on the interpretation of the results and make decisions more efficiently.

The behavior of wildfires is deeply affected by the weather, vegetation characteristics, and topography (Hanson et al. 2000).

The factors related to vegetation, the fuel of this combustion process, include the type of vegetation present in the area, its moisture content, and its size, shape, density, and arrangement. The topography influences fire propagation through the terrain slope, which partly determines the rate of spread of a fire front, through features such as narrow canyons, which determine the direction of local winds, and through barriers such as creeks, roads, and areas without burnable vegetation.

These two factors are constant in time or have very limited variations; weather-related factors, on the other hand, are difficult to predict and are subject to very strong variations even during an event, not to mention the fact that the energy and gases released by the combustion can deeply influence the local characteristics of weather and winds (fire creates its own weather).

Weather influences fire propagation through winds, temperature, relative humidity, and precipitation, and even minimal variations in these quantities can have dramatic effects on the wildfire intensity and propagation direction. As mentioned, these factors are not constant in time, and they also show strong variations in space. Winds in particular may, at ground level, vary strongly in direction and intensity due to a complex topography or changes in the vegetation cover.

Wind deeply influences the rate and direction of spread of a fire front, and it is probably the most difficult and important factor to evaluate in order to obtain a reliable fire propagation forecast. It has to be determined with high precision at scales of the order of 20 m or less, comparable to the scale of flame length.

Our service is intended to provide real-time results to the user; the model complexity must therefore be limited and the user interaction for data input minimized.

Therefore, we set up within the GRIDA3 infrastructure the forest fire service, assembling and connecting the following modeling tasks (as shown in Fig. 18):

Fig. 18
figure 18

Forest fire service algorithm scheme. The blocks in green relate to the meteorological models, representing the three levels of nested simulations. Their results constitute the input of the wind modeling application. The high-resolution wind data and the GIS data on vegetation and terrain constitute the input for the fire spread algorithm, whose constitutive blocks are shown in blue

  • A hydrostatic–nonhydrostatic limited area weather forecast chain based on the ISAC-CNR models BOLAM (Buzzi et al. 1994) and MOLOCH (Tettamanti et al. 2002), with up to four nesting levels and initial and boundary conditions from NCEP-GFS daily data (NCEP Office Note 442, 2003) or from ECMWF ERA Interim dataset (Dee et al. 2011).

  • A preprocessing and GIS module used to prepare topography and fuel input data (elevation, slope, aspect, canopy cover, and fuel model), to manage raster layers, and to interpolate the weather forecast results at the resolution required for the fluid dynamic and fire propagation analysis.

  • A mass-consistent fluid dynamic analysis of the wind in the area of interest that provides the wind field with a spatial resolution of the order of 10 m over a terrain of 10 km per side, realized through a finite volume solver based on the open-source library OpenFOAM (http://www.openfoam.org).

  • A fire spread analysis that gives the evolution in time of the fire line and the burned area. This module takes as input the geographic data on the terrain configuration, the land use and type of vegetation, the meteorological conditions, and the wind conditions near the ground, and gives as output the time required for a fire starting from a given ignition point to reach any point in the map. The solver is built on a well-established library for the calculation of the velocity of advancement of the fire front and on an in-house solver, based on the level set method, for the calculation of the front evolution on the ground surface.

As an example of the forest fire portal, we show the simulation of the wildfire that occurred near the village of Budoni (Olbia, Italy) on August 26, 2004, where thousands of people (residents and tourists) were evacuated from their lodgings due to the fire danger.

5.2 BOLAM–MOLOCH Chain

For the forest fire service, we arranged a complex BOLAM–MOLOCH chain consisting of four levels of nesting, starting from the BOLAM model used at 0.33 degrees of resolution and ending with the MOLOCH model (the last level of nesting) used at 0.01 degrees (about 1 km at our latitudes).

BOLAM and MOLOCH are two closely related models: BOLAM is a hydrostatic meteorological model, with prognostic equations for the horizontal wind components, the absolute temperature, the surface pressure, the specific humidity, and the turbulent kinetic energy (TKE). Deep moist convection is parameterized using the Kain–Fritsch (Kain 2004) convective scheme. BOLAM implements a split-explicit temporal integration scheme, forward–backward for the gravity modes. MOLOCH is a nonhydrostatic, fully compressible model developed at ISAC-CNR from the BOLAM experience. The MOLOCH time integration scheme is characterized by an implicit scheme for the vertical propagation of sound waves and explicit, time-split schemes for the remaining terms of the equations of motion. It shares with BOLAM many physics parameterizations (such as atmospheric radiation, sub-grid turbulence, water cycle microphysics, and a soil model), but, by solving an equation for the vertical momentum, MOLOCH can compute convective phenomena directly.

The BOLAM–MOLOCH chain can be scheduled to execute one or two runs daily (forecast up to +72 h) with the GFS analysis starting at 00 UTC or 12 UTC. With three levels of nesting (MOLOCH at 0.033 degrees of resolution), the forecast chain takes about 4 h, depending on the grid load. MOLOCH, even though fully parallelized, is the bottleneck of the chain, partly because of the complexity of its equations and partly because of its dense grid and small timestep. So, if users need a fast forecast (in less than 2 h), the chain can be stopped before the MOLOCH run.

The meteorological model chain must run continuously over a given region (the Region of Sardinia in our case) to have continuously improved weather forecast data, which constitute the initial approximation on which a fast limited-area computational fluid dynamic model is run to produce the high-resolution wind data required by the fire model.

5.3 Wind Modeling

The high-resolution wind field necessary to evaluate the fire spread is calculated through a dedicated finite volume solver based on the open-source library OpenFOAM. The meteorological model chain gives the wind distribution down to a scale of 3 km, which is clearly not sufficient. The target spatial scale is in fact of the order of 10 m, and the area on which the wind data is required has a side of 10 km.

Realizing a complete CFD analysis of such a domain, solving the whole set of Navier–Stokes equations together with a closure model for turbulence in a time comparable to the typical duration of wildfires in the Mediterranean area, is beyond the limits of current computational power. In order to have a real-time service, some simplification is therefore necessary; we found that a good approximation may be obtained through a simple mass-consistent fluid dynamic model: of all the Navier–Stokes balance equations, only mass conservation is enforced, so some inconsistencies in the momentum and energy balances of the wind flow are allowed.

The procedure that we adopted may be viewed as a clever interpolation technique for the meteorological data that fulfills the mass balance requirement and includes high-resolution data on the orography and the vegetation.

The three-dimensional wind field is calculated with a two-step procedure consisting of an initial guess based on the meteorological results followed by a correction step to force mass consistency (Ratto et al. 2002).

The initial guess of the wind field is calculated as an interpolation of the meteorological results, taking proper consideration of the higher-resolution orography and of the form of the atmospheric boundary layer that is in turn influenced by the high-resolution data on the vegetation. While the meteorological wind field satisfies the Navier–Stokes balance equation, this high-resolution interpolation introduces some inconsistencies in the field, which are partly corrected in the second phase.

The correction step takes the initial guess and modifies it, forcing the wind field to satisfy the mass balance equation of the Navier–Stokes set.

Let $\boldsymbol{v}_0$ be the initial guess for the wind field, which is already a good approximation of the real conditions; the aim is to apply to it the minimum possible correction in order to make it satisfy the divergence-free condition on the whole computational domain Ω. We identify the corrected wind field with $\boldsymbol{v}$. The variational problem consists in minimizing the variance of the difference between the adjusted and the initial wind field, subject to the constraint that the divergence vanishes. It is necessary to minimize the integral norm of the difference between the two vector fields over the computational domain

$$ E\left(\boldsymbol{v}\right)={\displaystyle {\int}_{\Omega}{\left\Vert \boldsymbol{v}-{\boldsymbol{v}}_0\right\Vert}^2\mathrm{d}\Omega, } $$
(6)

under the strong constraint of mass conservation

$$ \nabla \cdot \boldsymbol{v}=0. $$
(7)

Here, for the sake of simplicity of the formulation, the air density is assumed constant.

Introducing the Lagrange multiplier λ, the problem may be rewritten as the minimization of the functional

$$ J\left(\boldsymbol{v},\lambda \right)={\displaystyle {\int}_{\Omega}\left({\left\Vert \boldsymbol{v}-{\boldsymbol{v}}_0\right\Vert}^2+\lambda\, \nabla \cdot \boldsymbol{v}\right)\mathrm{d}\Omega, } $$
(8)

which leads to the solution of the following elliptic equation:

$$ {\nabla}^2\lambda =-2\nabla \cdot {\boldsymbol{v}}_0. $$
(9)

Once this equation is solved for λ, the corrected wind field is calculated as

$$ \boldsymbol{v}={\boldsymbol{v}}_0+\frac{1}{2}\nabla \lambda . $$
(10)

For this problem, suitable boundary conditions able to ensure stability of the solution are a zero normal gradient of the wind on the lateral boundaries, a fixed zero value of λ on the sky boundary, and the no-slip condition on the ground.

Given a good estimate of the starting wind, obtained from the meteorological model chain, this procedure corrects the wind for mass consistency only where needed and yields an estimate that is at least in part responsive to the real detailed orography.
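A minimal Python/NumPy sketch of this two-step correction on a uniform grid is given below; it uses a plain Jacobi iteration and the simplifying assumption of λ = 0 on all boundaries instead of the mixed boundary conditions described above, so it is only an illustration of Eqs. (9) and (10), not the OpenFOAM-based solver used in the portal.

import numpy as np

def mass_consistent_correction(u0, v0, w0, h, n_iter=500):
    """Correct an interpolated wind field for mass consistency, Eqs. (9) and (10).

    u0, v0, w0 : components of the initial wind guess on a uniform 3D grid
    h          : grid spacing [m]
    """
    # right-hand side of Eq. (9): -2 * div(v0)
    rhs = -2.0 * (np.gradient(u0, h, axis=0)
                  + np.gradient(v0, h, axis=1)
                  + np.gradient(w0, h, axis=2))

    # Jacobi iteration for the Poisson equation with lambda = 0 on all boundaries
    lam = np.zeros_like(u0)
    for _ in range(n_iter):
        lam_new = np.zeros_like(lam)
        lam_new[1:-1, 1:-1, 1:-1] = (
            lam[2:, 1:-1, 1:-1] + lam[:-2, 1:-1, 1:-1]
            + lam[1:-1, 2:, 1:-1] + lam[1:-1, :-2, 1:-1]
            + lam[1:-1, 1:-1, 2:] + lam[1:-1, 1:-1, :-2]
            - h**2 * rhs[1:-1, 1:-1, 1:-1]) / 6.0
        lam = lam_new

    # Eq. (10): v = v0 + 0.5 * grad(lambda)
    gx, gy, gz = np.gradient(lam, h)
    return u0 + 0.5 * gx, v0 + 0.5 * gy, w0 + 0.5 * gz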

5.4 Fire Spread

The fire spread calculation is based on a semiempirical method for the flame front velocity, the well-known and tested Rothermel model, and on the level set method (Osher and Sethian 1988) as the numerical tool.

The Rothermel model is a semiempirical fire model consisting of a set of equations that allow the prediction of the rate of spread (ROS) of a fire front given the characteristics of the fuel in terms of humidity, latent heat, and heat released when burned, as well as the slope and wind intensity in the direction normal to the fire front.

Fire spread calculations are implemented using the GPLv2 library “Fire Behavior Software Developer Kit” (Bevins 2006).

The level set is a powerful method to track moving interfaces, originally introduced by Osher and Sethian. It embeds the interface in a higher-dimensional computational space, using a two-dimensional Cartesian grid to describe the evolution of the one-dimensional boundary. The method can automatically deal with topological changes that may take place during fire spreading, such as the merging of separate flame fronts or the formation of unburned “islands,” making it particularly appropriate for wildfire propagation problems.

We denote with Γ(x, t) the fire line contour; in a two-dimensional domain this can be represented as an isoline of an auxiliary function φ(x, t), i.e., Γ(x, t) = {x, t: φ(x, t) = φ_0 = constant}. Since φ is constant along the moving isoline, its material derivative vanishes:

$$ \frac{D\varphi}{Dt}=\frac{\partial \varphi }{\partial t}+\frac{\mathrm{d}\boldsymbol{x}}{\mathrm{d}t}\cdot \nabla \varphi =0. $$
(11)

If the motion of the surface points is directed toward the outward normal direction, then

$$ \frac{\mathrm{d}\boldsymbol{x}}{\mathrm{d}t}=\boldsymbol{V}\left(\boldsymbol{x},t\right)=V\left(\boldsymbol{x},t\right)\boldsymbol{n}, $$
(12)
$$ \boldsymbol{n}=\frac{\nabla \varphi}{\left\Vert \nabla \varphi \right\Vert }, $$
(13)

and the evolution of φ is calculated as

$$ \frac{\partial \varphi }{\partial t}+V\left(\boldsymbol{x},t\right)\left\Vert \nabla \varphi \right\Vert =0, $$
(14)

which is here referred to as the ordinary level set equation.

For our purpose, we take V(x, t) as the ROS of the fire front, the velocity at which the fire contour propagates along its normal, and we let φ(x, t) be an indicator function that takes positive values for unburned points and negative values for burned points, e.g., the signed distance of a point from the fire boundary. The area burned by the wildland fire may be defined as Ω(t) = {x, t: φ(x, t) < 0}. The boundary of Ω is Γ, the front line contour of the wildland fire.

When the ROS V(x, t), which depends on the environmental conditions and on the local orientation of the fire front, is known, the evolution of the fire front can be efficiently simulated by the numerical solution of Eq. (14).

At any given time, the burned area is easily calculated from the value of the indicator function, and a map of the time of arrival of the fire may be easily deduced.
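A minimal sketch of one explicit time step of Eq. (14) on a 2D grid is given below, using a first-order upwind gradient; the function name and array layout are our own, and a production solver would use higher-order schemes and periodic reinitialization of φ.

import numpy as np

def level_set_step(phi, ros, dx, dt):
    """Advance the level set function by one explicit time step of Eq. (14).

    phi : signed indicator function (positive = unburned, negative = burned)
    ros : rate of spread V(x, t) of the fire front [m/s] on the same grid
    dx  : grid spacing [m];  dt : time step [s]
    """
    # first-order upwind approximation of |grad(phi)| for an outward-moving front
    dmx = (phi - np.roll(phi, 1, axis=0)) / dx    # backward differences in x
    dpx = (np.roll(phi, -1, axis=0) - phi) / dx   # forward differences in x
    dmy = (phi - np.roll(phi, 1, axis=1)) / dx    # backward differences in y
    dpy = (np.roll(phi, -1, axis=1) - phi) / dx   # forward differences in y
    grad = np.sqrt(np.maximum(dmx, 0.0)**2 + np.minimum(dpx, 0.0)**2
                   + np.maximum(dmy, 0.0)**2 + np.minimum(dpy, 0.0)**2)
    # phi decreases where the front passes, so the burned area {phi < 0} expands
    return phi - dt * ros * grad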

5.5 Budoni (Olbia, Italy) Wildfire Test Case

As an example, the case of a wildfire in the hilly area close to the village of Budoni (Olbia, Italy) is examined, where 145 ha were burnt on August 26, 2004. The burnt area was covered by typical Mediterranean shrubland vegetation, with plant heights in the range of 1–4 m, and small surfaces covered by open wooded pastures and grasslands. The fire started at 5 p.m. in very windy weather conditions. The dominant wind came from the northwest (as the MOLOCH forecast shows in Fig. 19), while locally near Budoni (Olbia, Italy) the wind came from west–southwest with an average speed of 35 km/h; the temperatures were moderate, ranging from a minimum of 20 °C to a maximum of 24 °C.

Fig. 19
figure 19

The wind conditions and the ignition point for the Budoni (Olbia, Italy) wildfire on August 26, 2004. The picture on the left shows the wind conditions over Sardinia at 10 m above ground at 5 p.m., when the fire started. The picture on the right shows the orography in color and the position of the ignition point

The fire spread quickly toward the east, driven by the wind. In the southern area, potentially threatened by the fire, a fire suppression action was successfully conducted. On the opposite flank, intervention was impossible, but the fire slowed down naturally, probably thanks to the terrain slope and wind intensity. The fire lasted six and a half hours; afterward it was definitively stopped by means of aerial interventions and thanks to the decrease of the wind speed.

5.5.1 Fire Model Setup and Test Procedure

The computational domain for the fire spread calculation has the same size and the same resolution as the wind grid on the ground (a side of 8 km and a grid spacing of 10 m). To forecast realistic rates of spread, suitable fuel models were chosen according to the land cover inferred from the CORINE map (Bossard et al. 2000). In particular, the “chaparral” standard fuel model (Scott et al. 2005) for shrubland vegetation was replaced with a custom fuel model optimized for Mediterranean maquis (Bacciu et al. 2009). Figure 20 shows the CORINE land cover for the test area: every cover class represents a basic fuel model; we remapped these classes into standard fuel models plus a custom model for Mediterranean maquis.
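The remapping itself is a simple lookup from land cover class to fuel model; the Python sketch below illustrates the idea with placeholder class codes and fuel model identifiers that do not necessarily match the actual mapping used in the portal.

# Purely illustrative mapping from CORINE land cover classes to fuel models;
# the class codes and fuel model identifiers are placeholders, not the
# actual mapping used in the portal.
CORINE_TO_FUEL = {
    321: "GRASS",          # natural grasslands -> standard grass fuel model
    323: "MED_MAQUIS",     # sclerophyllous vegetation -> custom Mediterranean maquis model
    324: "SHRUB",          # transitional woodland-shrub -> standard shrub fuel model
}

def remap_land_cover(corine_grid):
    """Replace CORINE class codes with fuel model identifiers on a raster grid."""
    return [[CORINE_TO_FUEL.get(code, "NON_BURNABLE") for code in row]
            for row in corine_grid]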

Fig. 20
figure 20

GIS model of the land cover and terrain usage in the wildfire area. The different colors are representative of different terrain covers; red is for Mediterranean shrubland and yellow is for grasslands

Furthermore, to evaluate the sensitivity of the fire propagator to the wind conditions, we ran several tests using different wind data: the wind output from the BOLAM–MOLOCH chain, the wind downscaled with the CFD model, and the observed wind from the Regional Agency for Environmental Protection of Sardinia (SAR-ARPAS). All of these wind conditions were in agreement with the dominant mistral wind, but only the CFD downscaling was able to achieve a realistic interaction between the mistral and the very fine orography used for the fire modeling. Figure 21 shows an example of the high-resolution wind obtained with the mass-consistent model.

Fig. 21
figure 21

High-resolution wind results on the area of the wildfire at the time the wildfire started. The domain spans an area of 8 km × 8 km on the ground centered at the point of ignition and 2 km above the ground. The grid resolution at which wind results are obtained is 10 m

5.5.2 Test Results

A meaningful outcome for this test case is shown in Fig. 22: the forecast fire front is tracked every 30 min (inert fuel has been set in areas where the wildfire was suppressed), and the total burned area is in good agreement with the perimeter tracked by the firefighters (as can be seen at the bottom of the figure). The pattern of the burned area is very similar, but there is a significant overestimation in the upper area. These differences are due to uncertainties deriving mainly from:

Fig. 22
figure 22

Budoni (Olbia, Italy) wildfire, (at the top of the figure) time evolution of predicted front and (at the bottom of the figure) the observed fire perimeter after 6 h of burning

  • The simplified modeling of weather conditions at wildfire scale

  • The nontrivial interaction among wildfire, local wind, and topography

  • The simplified modeling of fuels and relative rate of spread

  • The real position and time of the wildfire ignition

  • Human action against the wildfire

We are confident that optimal results for real-time fire forecasting can be achieved by overcoming some of these limitations (as described in the following section).

5.6 Concluding Remarks

The developed service is a good example of the application of high-performance computing to the solution of complex environmental problems.

The results so far are very promising in terms of efficiency and quality. Nevertheless, important improvements are possible and foreseen in all steps of the computational chain.

The local meteorological models are quite stable and accurate. Nevertheless, more computing power would be necessary in order to have quicker forecasts and more detailed output, possibly adding a further level of nesting to the described chain.

The wind analysis module may be greatly improved by adding a parameterization of the atmospheric instability calculated from the meteorological model and a feedback from the fire related to its energy release. The model could be greatly improved by including the full set of balance equations and a closure model for turbulence. At the moment this upgrade is limited by the available computing power, since the run time of the simulation would be too long for operational use.

The fire propagation model is at the moment limited to surface fires and cannot model the transition to crown fires or some phenomena that appear in extreme conditions; such models are available, and their implementation is foreseen in the future.

6 Conclusions

A grid computing portal that can be accessed from all kinds of devices via a web browser provides the necessary computing power and gives much more freedom and flexibility to the user. Processing speed is increased (by a factor of 10 to 100 in our test cases), and even very large results can be stored safely in the cloud storage, from where they are directly accessible to colleagues and collaborators. Since data, application software, and results reside on a central server that is location independent and accessible by multiple clients, collaboration between researchers at spatially separated institutions is strongly facilitated. For example, meetings and seminars can be conducted via the Internet, where participants can view examples or perform experiments with different workflows or parameter settings without having any software installed locally on their laptops. So far our work has focused on real-time data analysis and processing. A lot of work still has to be done to unfold the full potential of the remote collaboration capabilities, and we hope to find in the future the resources to make this portal accessible to a larger community of users and thereby fully exploit the cloud computing solution.