1.1 Background

Seafloor exploration is an ancient activity that started thousands of years ago with human shallow diving [1]. Nowadays, underwater surveys have numerous scientific applications in the fields of archeology [2], geology [3, 4] and biology [5], involving tasks such as ancient shipwreck prospection [6], ecological studies [7, 8], environmental damage assessment [9, 10] or detection of temporal changes [11], just to name a few.

Because humans cannot dive to extreme depths or for long periods of time, underwater surveys are nowadays carried out by Underwater Vehicles (UVs). UVs can be either Autonomous Underwater Vehicles (AUVs) or Remotely Operated Vehicles (ROVs), the latter being manually controlled by a pilot. These vehicles are often equipped with advanced navigation sensors. Typical sensor suites may include an Ultra Short Base Line (USBL), a Long Base Line (LBL), a Doppler Velocity Log (DVL), accelerometers, inclinometers, acoustic imaging sensors and optical cameras, among others, depending on the type, size and cost of the vehicle.

Among the sensors listed above, optical cameras provide short-range, high-resolution visual information of the ocean floor. Archeologists, geologists and biologists can benefit from these images because, from a cognitive point of view, they provide the most precise and accurate representation of the areas surveyed, enabling a detailed analysis of the structures of interest.

Fig. 1.1 Illustration of an underwater vehicle acquiring images at low altitude due to the constraints imposed by the medium. Poor lighting conditions require the use of artificial lighting, which in addition to light attenuation leads to non-uniform lighting and induced shadows in the acquired images. Scattering and moving objects, such as fish or algae, are some of the other specific challenges that appear due to the particularities of the acquisition medium

Nevertheless, the underwater medium adds particular challenges to the image acquisition task (see Fig. 1.1). When an underwater vehicle acquires images in deep waters, light attenuation has a huge impact on the visibility range and color reproduction, especially when the vehicle navigates at changing altitude (i.e. distance from the camera to the seafloor). Due to the light attenuation phenomenon, image acquisition needs to be performed close to the seabed, considerably limiting the maximum area covered by a single photograph. Hence, optically mapping large seafloor areas can only be achieved by building image mosaics from a set of reduced-area pictures.
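
As a rough illustration of why low-altitude acquisition limits the area covered by a single photograph, the footprint of a downward-looking camera over a flat seabed can be sketched with a pinhole model. The function names, the field-of-view arguments and the overlap fraction are assumptions introduced here for illustration; refraction at the housing port and lens distortion are ignored.

```python
import math

def footprint(altitude_m, hfov_deg, vfov_deg):
    """Ground footprint (width, height, area) of a nadir-looking pinhole camera
    over a flat seabed. Refraction and lens distortion are ignored (assumption)."""
    width = 2.0 * altitude_m * math.tan(math.radians(hfov_deg) / 2.0)
    height = 2.0 * altitude_m * math.tan(math.radians(vfov_deg) / 2.0)
    return width, height, width * height

def images_needed(area_m2, altitude_m, hfov_deg, vfov_deg, overlap=0.5):
    """Rough lower bound on the number of images needed to cover area_m2 when
    neighbouring images overlap by `overlap` along both axes (hypothetical layout)."""
    w, h, _ = footprint(altitude_m, hfov_deg, vfov_deg)
    effective_area = (w * (1.0 - overlap)) * (h * (1.0 - overlap))
    return math.ceil(area_m2 / effective_area)
```

Under this model the footprint area grows with the square of the altitude, so halving the altitude to cope with attenuation roughly quadruples the number of images required for the same survey area.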

The history of large-scale, deep-sea optical mapping starts with the French-American Mid-Ocean Undersea Study (FAMOUS) project [12], in 1974. In this survey, the Alvin submersible explored the great rift valley of the Mid-Atlantic Ridge, southwest of the Azores. The cruise was planned based on large sequences of images supplied by the US Navy, which were manually aligned on a gymnasium floor to form a photo-mosaic.

Over the last decade, the relevance of photo-mosaicing has grown significantly. As a clear example, numerous off-the-shelf still cameras now include built-in algorithms to fuse several pictures from a panoramic sequence into a single wide-angle view. Furthermore, gigapixel photo-mosaics [13] of the entire Earth are easily available through the Internet, even over a limited-bandwidth connection. In most cases, such large mosaics are created from terrestrial, aerial or space-related imagery. The common photo-mosaicing problems for this kind of image, including the compensation of different exposures and non-uniform illumination, have been treated in the literature [14–20].

Unfortunately, performing underwater image surveys is a challenging task with a much higher level of complexity than conventional terrestrial or aerial image photo-mosaic generation. As stated in [21], and due to constrained image acquisition conditions, both the navigation data and the images acquired have to be used to recover an accurate estimate of the camera poses during the survey. This information fusion is often performed by means of Global Alignment (GA) techniques [21–24]. This is a mandatory step before generating precise visual maps of the seafloor. In most cases, the short distance between the camera and the seafloor produces parallax effects (see Fig. 1.2), which considerably affect 2D mosaicing approaches due to the violation of the planarity assumption, i.e. the assumption of a flat scene, which allows the computation of 2D transformations between images. Furthermore, suspended particles causing the scattering phenomenon [25] are commonly present. Moving elements, such as fish and algae, are examples of other common issues in underwater image processing.

Fig. 1.2 Sequence corresponding to a straight trajectory of an AUV depicting the parallax problems. It shows the side and camera views of the robot’s trajectory. One side of the chest disappears from the frame while the other arises due to the parallax effect

Fig. 1.3 Underwater mosaic of a dam and zoomed detail before (top) and after (bottom) the application of an image blending technique. In the blended mosaic, the elements on the dam wall (mainly algae) are clearly visible, whereas in the non-blended mosaic they are hardly distinguishable

Fig. 1.4 The first underwater image in history taken by William Thompson in February 1856 in Dorset (England) with an almost totally submerged photographic camera (source Christian Petron, History of Underwater Image, Digital Edition, 2011)

Using the navigation data collected by the UV allows us to estimate the camera poses during the acquisition. Consequently, from these camera poses the vehicle trajectory can also be recovered. Once an initial guess of this trajectory is obtained, it can be refined through global alignment techniques by using the information from the acquired images. As a result of this processing pipeline, the acquired images can be projected and rendered into a single and common reference frame. Nevertheless, it is necessary to perform one last step to give the heterogeneously appearing image dataset a continuous and uniform appearance in the form of a single large mosaic. This is achieved by means of image blending techniques (see Fig. 1.3).

Apart from the visual appearance, blending techniques are also important for proper interpretation and scientific exploitation of seafloor imagery (e.g. [26, 27]). The structures and objects of interest may cover a wide range of scales, from a few centimeters, i.e. macrofauna or rocks which would appear in individual images, to several tens or hundreds of meters, i.e. topographic scarps or fractures spanning thousands of images. To correctly analyze such varying features, and to understand the spatial relationships that may exist (e.g. faunal assemblages associated with geological features), it is preferable to have a single, wide area photo-mosaic, in which imaging artifacts are minimized, and identified features of interest may be accurately represented regardless of their size and imaging conditions.

1.2 Challenges of Underwater Optical Imaging

According to John F. Brown [28], the first underwater picture (Fig. 1.4) was taken by William Thompson in February 1856 in Dorset (England). The photographer lowered a housed 5” \(\times \) 4” plate camera to the seabed in Weymouth Bay and operated the shutter from an anchored boat. The exposure time used to acquire the picture was 10 min, during which the camera flooded; however, the film was salvaged. Scuba diving, which can be intuitively considered a more conventional way to acquire underwater images, did not exist as a common activity until several years later.

Acquiring optical images underwater is significantly more difficult than performing conventional land photography. Submerging a camera using an adequate housing and maneuvering it appropriately is a complex task by itself. However, the most important challenges are imposed by the properties of the underwater medium, where several phenomena condition the acquisition procedure. The two main phenomena strongly affecting image quality, and consequently the acquisition task, are light attenuation and scattering [19].

Apart from these two main phenomena, the camera parametrization is another key factor affecting image quality. When acquiring images underwater using a still camera, the automatic adjustment mechanism may try to slow the shutter speed and widen the aperture in order to better deal with the low light conditions. This setup is very sensitive to camera movement and thus unsuitable for a camera mounted on an AUV or ROV. When the acquisition is performed in shallow waters, the ambient light can be sufficient to acquire quality images, but in deep waters high-power artificial light sources are required. Artificial light, typically consisting of one or more directional sources, leads to another problem, namely non-uniform illumination of the scene, which particularly affects the registration of images when building a mosaic. Finally, when using artificial lighting, the shadows induced on the scene create an apparent motion which is opposite to the real motion of the camera.

Fig. 1.5 (Top-left) Example of backward scattering due to the reflection of rays from the light source on particles in suspension, hindering the identification of the seafloor texture. (Top-right) Example of forward scattering caused by the local inter-reflection of light on the suspended particles, hiding the terrain behind them. (Bottom-left) Image depicting the effects of light absorption in the underwater medium, where longer wavelengths are first absorbed, causing the bluish appearance of the scene structures at a lower depth. (Bottom-right) Effects produced by light attenuation of the water resulting in an evident loss of luminance in the regions farthest from the focus of the artificial lighting

Fig. 1.6 Light attenuation in the visible spectrum range (from 390 nm to 770 nm) prevents sunlight wavelengths from reaching long distances below the water surface. The longer wavelengths corresponding to the reddish tones are the first to be attenuated, while the shortest ones corresponding to the bluish tones are the last

Light Attenuation

Sunlight wavelengths in the visible spectrum for a typical human eye range from 390 nm (violet tones) to 770 nm (reddish tones) [29]. Light attenuation is caused by the absorption of light by water: the remaining intensity decays exponentially with depth, all wavelengths are affected to varying degrees, and the rate depends on the water body [30]. Therefore, sunlight cannot penetrate to any great depth and artificial lighting systems are required when acquiring images several meters below the surface (see Fig. 1.6). When using artificial light sources, such as continuous lights or strobes, the acquired images are brighter and richer in detail in the region on which these lights are focused, while the surroundings appear darker and low in contrast (Fig. 1.5-bottom-right). This effect is accentuated by the vignetting caused by the camera optics. Light attenuation also leads to color loss (Fig. 1.5-bottom-left). The longer wavelengths corresponding to the reddish tones are the first to be attenuated, while the shortest ones corresponding to the bluish tones are the last. This loss is the reason for the greenish or bluish appearance of objects in underwater scenes as the distance between object and camera increases. Some organic particles, such as phytoplankton frequently found in coastal waters, absorb light predominantly in the shortest wavelengths (corresponding to the blue and violet tones), allowing only the greenish tones to persist. In order to deal with light attenuation in seafloor mapping, high-power and appropriately distributed artificial light sources should be used, and image acquisition should be performed as close to the seabed as possible.
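
The exponential behavior described above is commonly written in the Beer-Lambert form. The symbols are introduced here only for illustration and are not taken from the remainder of the book: \(I_0(\lambda)\) is the intensity at wavelength \(\lambda\) at the start of the path, \(d\) is the distance travelled through the water, and \(c(\lambda)\) is an attenuation coefficient that depends on the wavelength and on the water body.

\[
I(\lambda, d) = I_0(\lambda)\, e^{-c(\lambda)\, d}
\]

In clear water \(c(\lambda)\) is larger for red than for blue wavelengths, so the red-to-blue ratio \(I(\lambda_\text{red}, d) / I(\lambda_\text{blue}, d)\) shrinks as \(d\) grows, which is consistent with the color loss described above. With artificial lighting, light travels from the source to the scene and back to the camera, so \(d\) is roughly twice the altitude under this simplified model.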

Scattering

The presence of organic and inorganic particles suspended in the volume of water intersected by the field of view of the camera and the illumination source (see “scattering volume” in Fig. 1.7) is the cause of the light scattering phenomenon. This is illustrated in Fig. 1.5-top. It can be strongly noticeable when caused by a suspended sediment load (also known as turbidity). The degree of scattering depends on the distance, the wavelength, and the characteristics of the particles (i.e. shape, density and refractive index). There are two types of scattering. On the one hand, backward scattering is an additive noise in the form of “marine snow” patterns, which appear when light from a natural or artificial source is reflected by the suspended particles in the direction of the camera. On the other hand, forward scattering appears due to the inter-reflections of local light among the particles, and becomes the most significant source of image degradation, leading to a non-uniform loss of contrast, definition and color fidelity. The scattering phenomenon can significantly affect the acquisition of images at a short distance from the seabed. The vehicle carrying the camera may also displace particles or sediment lying on the seafloor, increasing the probability of backward scattering.

Fig. 1.7 The scattering effect appears in the volume of water intersected by the field of view of the camera and the illumination source

1.3 Objectives

The numerous scientific applications of underwater optical imaging require providing experts with the most informative and visually pleasing representations of the seafloor possible. Underwater surveys carried out by either AUVs or ROVs generate a large volume of navigation and optical imaging data. This information needs to be post-processed and managed so that its study by scientists (e.g. [26]) becomes as easy as possible, or even feasible at all. In that sense, photo-mosaics are an adequate way to manage, unify and consistently fuse all this optical imaging data and to combine it with the navigation data to generate georeferenced maps. Giving the generated maps a convincing and reliable appearance serves not only aesthetic but also cognitive purposes. The interpretation of a given scene becomes more intuitive and effective when its representation emphasizes its features and has a smooth and continuous overall appearance.

Building a photo-mosaic from a large set of underwater images is a challenging task. The quality of every single picture might change considerably along the sequence due to the underwater lighting phenomena described above. Furthermore, the computational requirements to process this large amount of data from a given imaging survey limit the maximum size of the map generated.

Consequently, the goal of this book is to propose a complete blending approach using state-of-the-art methods capable of generating and blending large-scale optical maps. The blending technique developed is focused on two main ideas. Firstly, the richness of detail in the original images should not only be preserved but also enhanced when possible. Secondly, the algorithms should be able to deal with datasets of thousands of images covering large areas of the seafloor (on the order of several hundreds of thousands of m\(^2\)). Therefore, the processing strategy needs to deal with underwater imaging while being well-suited for large input sequences.

1.4 Outline of the Approach

A single, large image, i.e. a photo-mosaic, is easier to interpret than a long sequence of consecutive pictures or even a video record, inasmuch as it offers a spatially and photometrically consistent representation of the seabed. In order to ensure this image consistency, blending techniques are required. These techniques, which produce a seamless mosaic, enable the interpretation of the benthos by a scientist (biologist, geologist, archeologist, etc.).

There are three main concerns guiding image blending algorithms. Firstly, the effects of different illumination or exposure times between images should be minimized. Secondly, an adequate seam should be found in order to reduce the visibility of micro-registration misalignments and moving objects. Lastly, a smooth transition along the selected seam must be applied to reduce the visibility of seams between images.

The topology of a mosaic is initially estimated based on the navigation data and a feature-based pair-wise image registration. After this initial estimation, a global alignment strategy [21, 22] is required to reduce the cumulative error of a simple sequential pair-wise registration. The strength of global alignment arises from loop closures, which significantly improve the estimate of the camera trajectory when an already mapped area is revisited. In the absence of loop closures, and considering input sequences of thousands of images, the drift accumulated by the pair-wise transformations leads to significantly inconsistent (misaligned) photo-mosaics.
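
The accumulation of drift under purely sequential registration can be illustrated with a toy model. The sketch below chains 2D rigid motions and gives every pair-wise estimate the same small heading bias. The step length, the bias value and all function names are illustrative assumptions; real registration errors are noisy rather than constant, and the book's mosaicing framework is not limited to rigid motions.

```python
import math

def compose(a, b):
    """Compose two 2D rigid motions (theta, tx, ty): apply b in the frame reached by a."""
    ta, xa, ya = a
    tb, xb, yb = b
    return (ta + tb,
            xa + math.cos(ta) * xb - math.sin(ta) * yb,
            ya + math.sin(ta) * xb + math.cos(ta) * yb)

def chain(pairwise):
    """Absolute poses obtained by accumulating sequential pair-wise motions."""
    poses = [(0.0, 0.0, 0.0)]
    for motion in pairwise:
        poses.append(compose(poses[-1], motion))
    return poses

def drift(n_images, step_m=1.0, heading_bias_rad=0.001):
    """Position error at the last image of a straight transect when every
    pair-wise estimate carries the same small heading bias (toy error model)."""
    true_end = chain([(0.0, step_m, 0.0)] * n_images)[-1]
    est_end = chain([(heading_bias_rad, step_m, 0.0)] * n_images)[-1]
    return math.hypot(est_end[1] - true_end[1], est_end[2] - true_end[2])
```

In this model `drift(1000)` is far more than 100 times larger than `drift(10)`: the error grows faster than linearly with the number of chained images, which is the behavior that additional constraints such as loop closures are meant to counteract.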

Aside from exposure variations, which are a common issue in terrestrial images, the remaining problems are not directly addressed by conventional panorama generation software. To better deal with the inherent underwater imaging problems (non-uniform illumination, light attenuation, scattering, exposure variations, etc.), we perform image pre-processing, which, in our experience, is a key step that strongly impacts the quality of the final photo-mosaic rendering. A depth-adaptive inhomogeneous lighting compensation algorithm is proposed to deal with the non-uniform distribution of the artificial light sources in the scene, whose effects are emphasized by the light attenuation phenomenon. Concerning image detail enhancement, a gradient-based image enhancement that depends on the distance from the camera to the seabed is also proposed. Both scattering and light absorption phenomena may lead to highly variable appearances for images depicting the same area but acquired at significantly different depths. The aim of this enhancement is to make the appearance of the images involved as similar as possible in order to achieve a consistent fusion.

Once the images have been preprocessed, and thus made more suitable for blending, an image selection algorithm based on image quality is applied with two main aims: firstly, to reduce the number of images passed to the next step and consequently the computational cost; secondly, to prevent lower quality images from degrading the appearance of regions also covered by higher quality ones.

Next, a hybrid luminance-gradient graph-cut based optimal seam finding algorithm is proposed to locate the seams which minimize the photometric and morphological differences along the image boundaries. The proposed algorithm is able to robustly deal with differently exposed images, thanks to the gradient term, especially when image preprocessing is not enough to compensate for these differences.
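
The idea of a seam cost that mixes luminance and gradient differences can be sketched as follows. This is not the graph-cut formulation proposed in the book: for brevity, dynamic programming is swapped in for graph cuts, which restricts the seam to a monotone top-to-bottom path. The weighting parameter `alpha`, the use of horizontal gradients only, and the function names are assumptions made for this illustration.

```python
def hybrid_cost(a, b, alpha=0.5):
    """Per-pixel cost over the overlap of two grayscale images (lists of rows):
    alpha * luminance difference + (1 - alpha) * horizontal-gradient difference."""
    rows, cols = len(a), len(a[0])
    cost = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            c1 = min(c + 1, cols - 1)  # forward difference, clamped at the border
            lum = abs(a[r][c] - b[r][c])
            grad = abs((a[r][c1] - a[r][c]) - (b[r][c1] - b[r][c]))
            cost[r][c] = alpha * lum + (1.0 - alpha) * grad
    return cost

def vertical_seam(cost):
    """Minimum-cost top-to-bottom path that moves at most one column per row."""
    rows, cols = len(cost), len(cost[0])
    acc = [cost[0][:]]
    for r in range(1, rows):
        prev = acc[-1]
        acc.append([cost[r][c] + min(prev[max(c - 1, 0):min(c + 2, cols)])
                    for c in range(cols)])
    # Backtrack from the cheapest cell of the last row.
    c = min(range(cols), key=lambda j: acc[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 2, cols)
        c = min(range(lo, hi), key=lambda j: acc[r - 1][j])
        seam.append(c)
    return seam[::-1]
```

With `alpha=0.0` the cost is unchanged when a constant exposure offset is added to one image, which illustrates why a gradient term helps with differently exposed images.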

Then, we apply a gradient blending strategy in a narrow region around the optimally computed seams in order to ensure a smooth transition between the image patches involved. Additionally, the gradient nature of the blending also allows us to compensate for possible exposure differences between images.
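
A one-dimensional sketch may help to show why blending in the gradient domain absorbs exposure differences. It works on a single pair of aligned scanlines and is not the book's implementation: inside a narrow band around the seam, the output follows the gradients of the source images while being pinned to the composite values at both ends of the band, and the mismatch is spread evenly across the band (the least-squares solution in 1D). Function and parameter names are hypothetical.

```python
def blend_scanline(left, right, seam, half_width):
    """Gradient-domain blend of two aligned 1D scanlines around a seam.
    Pixels before `seam` come from `left`, the rest from `right`; inside the
    band the result follows the source gradients and matches both band ends."""
    n = len(left)
    out = [left[i] if i < seam else right[i] for i in range(n)]
    lo, hi = max(seam - half_width, 0), min(seam + half_width, n - 1)
    # Target gradient for each neighbouring pair, taken from a single image
    # so that the exposure step at the seam never enters the gradient field.
    grads = [(left[i + 1] - left[i]) if i + 1 < seam else (right[i + 1] - right[i])
             for i in range(lo, hi)]
    steps = len(grads)
    if steps == 0:
        return out
    # Mismatch between the integrated gradients and the fixed band ends.
    residual = (out[hi] - out[lo]) - sum(grads)
    value = out[lo]
    for k, g in enumerate(grads):
        value += g + residual / steps
        out[lo + k + 1] = value
    return out
```

For two flat scanlines with values 10 and 30, a direct composite shows a step of 20 at the seam, whereas the blended result ramps gradually between the two levels inside the band and leaves pixels outside the band untouched.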

Finally, a gigamosaic generation strategy is presented, based on the decomposition of large mosaics into tiles of reasonable size that can be processed on conventional computers without large amounts of resources.
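
The tile decomposition can be sketched as a regular grid over the mosaic frame, with each tile processed using only the images that overlap it. The tile size, the bounding-box representation and the function names below are assumptions for illustration and are not taken from the book.

```python
def tile_grid(width, height, tile_size):
    """Split a width x height mosaic into half-open tiles (x0, y0, x1, y1),
    with edge tiles clipped to the mosaic bounds."""
    return [(x, y, min(x + tile_size, width), min(y + tile_size, height))
            for y in range(0, height, tile_size)
            for x in range(0, width, tile_size)]

def images_for_tile(tile, image_boxes):
    """Indices of images whose bounding box (x0, y0, x1, y1) in mosaic
    coordinates intersects the given tile."""
    tx0, ty0, tx1, ty1 = tile
    return [i for i, (x0, y0, x1, y1) in enumerate(image_boxes)
            if x0 < tx1 and x1 > tx0 and y0 < ty1 and y1 > ty0]
```

Because each tile only needs the images returned by `images_for_tile`, the memory required at any time depends on the tile size rather than on the size of the full mosaic, and tiles can be processed independently.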

1.5 Contributions

The main contributions of this book can be summarized as follows:

  • A novel full mosaicing and blending pipeline optimized for underwater imaging is proposed. The effects of underwater phenomena such as non-uniform illumination and scattering are compensated for in an adaptive way, with the main aim of not only preserving, but also emphasizing, image detail richness.

  • An adaptive image enhancement algorithm has been developed to make fine image details sharper, also providing a continuous and consistent appearance to the whole mosaic image. The enhancement of a given image is determined by the detail richness of the adjacent images, but avoids overemphasizing the result.

  • The optimal seam finding algorithm used to determine the most adequate path for the cut between images is based on both luminance and gradient information. This domain combination allows us to ensure not only the lowest photometric differences along the path but also to avoid cutting objects, even in the case of significant exposure differences between images.

  • In order to address the problem of processing large datasets, a strategy allowing us to independently process different regions of the final mosaic is proposed. The area corresponding to a large dimension mosaic is divided into a regular grid of tiles, which are then individually processed, temporarily stored and finally fused to obtain the final single image. The appearance consistency between individual tiles is ensured thanks to an exposure equalization mechanism.

  • The full processing pipeline has been devised to use parallel processing in every step where possible in order to improve the overall performance of the approach.

1.6 Book Structure

The book is divided into the following chapters:

Chapter 2 presents an introduction to a feature-based 2D mosaicing framework. The main concepts of planar motion estimation and global alignment are introduced.

Chapter 3 reviews the state of the art in image blending techniques, presenting the two main principles guiding the algorithms. A classification of techniques is also proposed, based on their main features. The benefits and drawbacks of the different methods are discussed, as well as their suitability for underwater imaging purposes.

Chapter 4 details the proposed processing pipeline optimized for high resolution underwater image blending. All the steps involved, including original image preprocessing, image registration and global alignment, selection of image contribution, optimal seam finding strategy and gradient domain image blending, are described. Finally, a giga-mosaic blending strategy is presented.

Chapter 5 shows some experimental high-resolution results, based on large datasets, which are also discussed and compared to results obtained by other state-of-the-art approaches.

Chapter 6 presents the conclusions of this work, summarizes the contributions and identifies some future research directions.