1 Introduction

Over the last decade, projection mapping, or spatial augmented reality (SAR), has spread tremendously around the world. The goal is to seamlessly merge the physical and virtual worlds by superimposing computer-generated graphics onto real surfaces. One of its biggest differentiators compared to other augmentation techniques is that projection mapping lets many users directly experience the augmentation without wearing glasses or any other devices. Constant improvements in the size, pricing, and brightness of projectors have allowed many people to develop their own projection mapping projects. They have used a large variety of surfaces as projection targets: large buildings, cars, shoes, furniture, and even living creatures such as fish in an aquarium and human dancers. In these emerging application scenarios, there is strong demand for displaying desired appearances on non-planar, textured, and/or dynamically moving surfaces under environmental lighting.

Typically, projectors are designed and used to display images on a planar, uniformly white, and static screen in a dark environment. As a result, current projectors are not suitable for most projection mapping scenarios. In particular, their dynamic range, frame rate, latency, and depth-of-field (DOF) limit their applicability. These technical limitations of the projector hardware make it difficult to display desired appearances with the desired visual quality even when computational algorithms are applied. To overcome the limitations, researchers have applied the emerging “computational imaging and display” approach, which is a joint design of display hardware, optics, and computational algorithms [26].

This invited talk summarizes recent advances in projection mapping hardware solutions that display desired appearances on non-optimized real surfaces with enhanced visual quality. The following sections introduce computational display solutions that overcome the technical limitations mentioned above and achieve high dynamic range, high speed, and wide DOF projection. In addition, some works on an emerging technical issue in projection mapping research, dynamic projection mapping, are introduced in the last part of the paper. Note that this article was written by reorganizing a previously published state-of-the-art report by the same author [13] for an invited talk at the IAPR Computational Color Imaging Workshop (CCIW) 2019. Therefore, some of the text of this article overlaps with the previous one.

2 High Dynamic Range Projection

The dynamic range or contrast of a projection display is defined as the ratio of the maximum to the minimum luminance. The range of luminance values in the real world is extremely wide, from an outdoor scene in sunshine to an indoor scene under candlelight. Consequently, a high dynamic range (HDR) representation is required to realistically render and display both natural and computer-generated images. However, current projectors, except for laser-based devices, support only a significantly limited dynamic range; i.e., the simultaneous in-scene contrast is typically limited to between 1,000:1 and 6,000:1 [10]. Note that this section discusses the simultaneous dynamic range that is achieved without additional mechanical adjustments, such as auto-iris aperture control, which globally brightens or darkens all the pixels in a projection image and does not change the contrast within a single image. Theoretically, laser projectors are able to achieve extremely high dynamic range representations since they can completely turn off the laser beam when displaying black pixels. Therefore, a laser projector would be one of the best choices in normal projector usage scenarios such as a theater. On the other hand, due to eye-safety concerns, it is not practical in SAR or projection mapping applications where users’ eyes might be located between a projector and the augmented object. Several other solutions have been presented to overcome the contrast limitations. They can mainly be subdivided into methods that reduce the black level and hardware that locally amplifies the number of photons. We will discuss them in the following paragraphs.

HDR projection has been achieved by applying the double modulation principle, by which the emission of a light source is spatially modulated twice at cascaded light-blocking spatial light modulators (SLMs), e.g., a digital micromirror device (DMD) or a liquid crystal display (LCD), to reduce the luminance of dark pixels (the black level) while maintaining that of bright pixels. Researchers have proposed several double modulation methods so far, such as applying two LCD panels, and these have successfully increased the dynamic range of a projected image by significantly lowering the black level luminance that is perceived when displaying zero intensity values (for more details, see a state-of-the-art report [8]). Recent works have applied dual LCoS (liquid crystal on silicon) designs [14, 16].
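The effect of cascading two light-blocking modulators can be illustrated with a minimal sketch. It assumes an idealized model in which each SLM is a pure attenuator with a finite contrast, the two stages are perfectly pixel-aligned, and no light scatters between them; the function names and the contrast values are hypothetical and are not taken from the cited works. Under these assumptions the contrasts of the two stages multiply.

```python
def modulator_output(light_in, level, contrast):
    """Light leaving one idealized SLM.

    level is the commanded transmittance in [0, 1]; even at level 0 the
    modulator leaks a fraction 1/contrast of the incoming light.
    """
    leakage = 1.0 / contrast
    transmittance = leakage + (1.0 - leakage) * level
    return light_in * transmittance


def cascaded_contrast(contrast_a, contrast_b, source=1.0):
    """Ratio of full-white to full-black output for two SLMs in series."""
    white = modulator_output(modulator_output(source, 1.0, contrast_a), 1.0, contrast_b)
    black = modulator_output(modulator_output(source, 0.0, contrast_a), 0.0, contrast_b)
    return white / black
```

In this simplified model, two stages of 1,000:1 each give a combined contrast of about 1,000,000:1, because the black level is attenuated twice while the white level is nearly unchanged. Real systems fall short of this figure because of blur and scattering between the stages.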

Even when an ideal projector with infinite dynamic range is applied, environmental light and/or global illumination effects such as inter-reflection increase the reflected black level, which consequently decreases the dynamic range of the projection. In particular, it is natural to assume that many projection mapping applications are run with a small amount of environment light contribution and that non-flat, concave surfaces might be used as projection targets. Therefore, increasing the dynamic range of a projector is not sufficient; the whole projection system including the surface must be optimized. To this end, researchers have proposed to spatially modulate the reflectance pattern of a projection surface to suppress the elevation of the black level [7, 18, 19, 38]. More specifically, the luminance of projected light is theoretically computed as the product of the reflectance of a surface and the incident light illuminance. Therefore, it is possible to avoid undesirable black level elevation by decreasing the reflectance at places where a dark image should be displayed.
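The product relation above can be turned into a small numerical sketch. The model below assumes that reflected luminance equals reflectance times total incident illuminance (projector plus environment light), up to a constant factor and in arbitrary units; the function names and all numbers are hypothetical and serve only to show the trend.

```python
def reflected_luminance(reflectance, projector_illuminance, ambient_illuminance):
    """Luminance (arbitrary units) as reflectance times total incident light."""
    return reflectance * (projector_illuminance + ambient_illuminance)


def scene_contrast(white_light, black_light, ambient_light, r_bright, r_dark):
    """In-scene contrast between a bright and a dark image region.

    white_light / black_light: projector illuminance for white / black pixels.
    r_bright / r_dark: surface reflectance under the bright / dark region.
    """
    bright = reflected_luminance(r_bright, white_light, ambient_light)
    dark = reflected_luminance(r_dark, black_light, ambient_light)
    return bright / dark
```

With a uniform reflectance, environment light alone caps the achievable contrast regardless of the projector. Lowering the reflectance only where dark content is shown raises the contrast by the ratio of the two reflectances, for example by a factor of 18 when going from 0.9 to 0.05.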

Fig. 1. Experimental results of 3D HDR projection based on reflectance modulation using a full-color 3D printer [38].

Bimber and Iwai proposed to use printed media including an e-ink display for the reflectance modulation [7]. This was also extended to static 3D surfaces by applying a full-color 3D printer [38] (Fig. 1). Because these methods applied static or almost static reflection media, they are not suitable for dynamic image content such as movies. Jones et al. proposed to optimize the surface reflectance pattern to display a short periodic movie sequence in HDR [19]. A dynamic modulation of the reflectance pattern was also investigated by Iwai et al., who proposed to cover the projection surface with a photochromic material such that the surface reflectance can be spatiotemporally controlled by applying UV illumination [18].

3 High Speed Projection

High speed projection systems enabling a much higher frame rate than a normal video rate (e.g., 60 Hz) are required in low latency scenarios. This has been achieved using DLP projectors, which represent an 8-bit pixel intensity by controlling a MEMS mirror flip sequence, i.e., whether or not each mirror reflects light from the light source to the objective lens, at thousands of frames per second. Using this mechanism, researchers have developed high speed projection systems.
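The idea of building an 8-bit intensity from binary mirror states can be sketched as follows. This is a simplified illustration that assumes plain binary-weighted bit planes, where bit b is shown for a duration proportional to 2^b; actual DLP mirror sequences are device specific and typically more elaborate. The function names are hypothetical.

```python
def bit_planes(intensity):
    """Split an 8-bit intensity into (mirror_state, relative_duration) pairs.

    Planes are ordered from the most to the least significant bit; a state of
    1 means the mirror reflects light toward the objective lens.
    """
    if not 0 <= intensity <= 255:
        raise ValueError("intensity must be an 8-bit value")
    return [((intensity >> bit) & 1, 1 << bit) for bit in range(7, -1, -1)]


def perceived_intensity(planes):
    """Time-integrated light: total duration during which the mirror is on."""
    return sum(state * duration for state, duration in planes)
```

Because the eye integrates light over the frame period, the sum of the on-durations reproduces the original 8-bit value. One 8-bit frame therefore consumes eight binary patterns, which is why binary pattern rates in the thousands per second translate into 8-bit frame rates of several hundred hertz or more.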

Projection mapping applications generally require a precise alignment between a projected image and a physical surface. Even a small misalignment is salient and thus causes a significant degradation of the sense of immersion. This requirement becomes significantly more rigorous in dynamic projection mapping scenarios, in which a slight temporal delay of even a geometrically perfectly aligned projection causes a noticeable misalignment. For example, Ng et al. investigated the shortest noticeable latency for a touch panel interface [33]. They showed that participants perceived a misalignment when the latency between the touch input and the display of the visual feedback at the touch position was greater than 6.04 ms. This maps to a minimum desired frame rate of approx. 165 Hz and challenging latency requirements.
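The conversion from the latency threshold to a frame rate is simple arithmetic, sketched below. It assumes that one frame period is the only source of delay, which is optimistic: sensing, processing, and data transfer add further latency in a real system. The function name is hypothetical.

```python
def min_frame_rate_hz(latency_ms):
    """Frame rate whose frame period equals the given latency budget."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms
```

For the 6.04 ms threshold, the reciprocal is roughly 165.6 Hz, which matches the figure of approx. 165 Hz quoted above.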

Recently, two solutions have been presented to overcome this latency issue. First, the direction of an image from a normal projector is rapidly controlled using a dual-axis scanning mirror galvanometer system to project images onto a moving surface without perceivable delays [34, 40]. The latest work downsizes the system and realizes a portable high speed projector-camera system (procams) [27]. However, the frame rate of the projector is about 60 Hz, so the projected image content cannot be interactively updated according to the movement of the surface without noticeable latency. Therefore, this method assumes that the perspective projection of the surface on the projector’s image plane does not change while projecting, and consequently, the surface geometry is limited to simple shapes such as a sphere.

The second solution is to apply high-speed projectors that can display 8-bit images at several hundred frames per second with low latencies. Watanabe et al. developed a projection device that can project 8-bit monochrome images at a frame rate of up to 1,000 Hz [42]. To achieve the 1,000 Hz projection, the DMD’s mirror flip pattern as well as temporally adapted LED intensities are used. Combined with a high speed camera (1,000 FPS), this projector achieves dynamic projection mapping onto rigid and deformable surfaces without noticeable misalignments [31, 32, 41]. Kagami and Hashimoto succeeded in sticking a projection image onto a moving planar surface using a customized high-speed procams [20, 21]. Bermano et al. applied a high speed procams to human face augmentation [5]. For the latter, a commercially available 480 Hz projector was used. When handheld or wearable projectors are used, the projectors rather than the target surfaces move. Regan and Miller proposed a technique to reduce motion blur artifacts in such situations using a high speed projector [37]. Such systems have also been used in fields of virtual and augmented reality other than projection mapping, where researchers have tried to minimize latency [24, 45].

Fig. 2. Experimental results of focal sweep based extended DOF projection using an ETL [17].

The latest work developed a projector that can embed spatially varying imperceptible binary codes (max. 64 bits) in a full color 24-bit image. The imperceptible codes can be used for optical communication, for example to control various systems such as robots by projected light, while meaningful images are projected for human observers [15].

4 Increasing Focal Depth

Projectors are inherently designed with a large aperture to minimize the loss of light emitted from the light source. However, this optical design leads to a shallow depth-of-field (DOF). Consequently, an image projected on a surface with large depth variance can quickly become blurred. Therefore, extending the DOF of projectors is in high demand, especially in dynamic projection mapping applications where projection objects and/or projectors move in large spaces. Previous techniques fall into two categories: single-projector and multi-projector approaches.

Single-projector approaches digitally sharpen original images before projection so that an optically defocused projection closely approximates the original (i.e., unblurred) image. Defocus blur of a projected image is explained mathematically as the convolution of a PSF (point spread function) and the original image. If the PSF of a projector on an object’s surface is estimated correctly, a defocus-free image can be displayed by digitally correcting the original image using a deconvolution method, such as the Wiener filter [9]. Zhang and Nayar formulated image correction as a constrained optimization problem [44]. However, as summarized in a state-of-the-art report [8], such techniques suffer from the loss of high frequency components because the PSFs of normal projectors are generally low pass filters. In the last 10 years, new optical designs have been introduced to improve DOF extension for projectors. For example, researchers have applied coded apertures that have two-dimensional complex patterns instead of an ordinary circular aperture to make the PSFs more broadband [12, 25]. Another strategy is to apply a focus tunable lens (FTL), also known as an electrically tunable lens (ETL), to sweep the focusing distance through the scene to make the PSF invariant to scene depths [17] (Fig. 2).
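The Wiener-filter pre-correction described above can be sketched in one dimension. The example assumes a known, spatially uniform PSF, models defocus as circular convolution, and uses a constant noise-to-signal term as the regularizer; these simplifications, the function names, and the parameter value are assumptions for illustration. A naive DFT keeps the sketch free of external dependencies.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform (O(n^2)), adequate for short signals."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def project(image, psf):
    """Simulate defocused projection as circular convolution with the PSF."""
    n = len(image)
    return [sum(image[(t - s) % n] * psf[s] for s in range(n)) for t in range(n)]


def wiener_precorrect(image, psf, noise_to_signal=0.01):
    """Pre-sharpen image so that project(result, psf) approximates image.

    psf must have the same length as image, with its center at index 0
    (wrapping around the end of the list).
    """
    image_spectrum = dft(image)
    psf_spectrum = dft(psf)
    corrected = [
        value * h.conjugate() / (abs(h) ** 2 + noise_to_signal)
        for value, h in zip(image_spectrum, psf_spectrum)
    ]
    return [v.real for v in dft(corrected, inverse=True)]
```

Frequencies at which the PSF spectrum is close to zero cannot be restored, which is the loss of high frequency components mentioned above. The pre-corrected signal may also leave the displayable intensity range; this sketch does not clamp it, whereas handling that range is one motivation for a constrained formulation.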

Fig. 3. Experimental results of extended DOF projection using a mirror array based multi-projection system [29].

As a pioneering work of the multi-projector approach, Bimber and Emmerling realized multifocal projection using multiple projectors, each with a focal plane at a unique distance [6]. For each point on a projection surface, they selected the optimal projector that could display the sharpest image at that point. Their multi-projector approach does not require deconvolution. However, when an object moves, it does require the projection of spatial pattern images on the surface to estimate the PSFs of every projector. In addition, the black level rises with each superimposed projection. Nagase et al. proposed a model-based method that can select the optimal projector for each surface point even when the surface moves [29] (Fig. 3). This is achieved by estimating PSFs from geometric information, such as the shape of the surface and the pose of the surface relative to the projectors. A multi-projector system with a focal sweep technique realized wide field-of-view and extended DOF projection [30]. A more general solution is to apply a multi-projector light transport matrix that models the influence of each projector pixel on a camera image that is regarded as the observed image [1, 4, 43]. Each projector image can be determined by computing the inverse light transport matrix (Fig. 4).
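The light transport formulation can be sketched on a toy problem. Instead of explicitly inverting the matrix, the example below uses clamped iterative least squares (projected gradient descent), which is a substitute chosen here because it keeps projector values within the displayable range [0, 1]; it is not claimed to be the method of the cited works. The matrix layout (camera pixels as rows, stacked pixels of all projectors as columns), the function names, and the step size are assumptions.

```python
def apply_transport(transport, projector_pixels):
    """Camera image predicted by the light transport matrix."""
    return [sum(t * v for t, v in zip(row, projector_pixels)) for row in transport]


def solve_projector_images(transport, target, iterations=500, step=0.5):
    """Find projector pixel values in [0, 1] whose camera image matches target.

    The step size must be below 2 / (largest eigenvalue of T^T T) to converge.
    """
    n_proj = len(transport[0])
    pixels = [0.5] * n_proj
    for _ in range(iterations):
        observed = apply_transport(transport, pixels)
        residual = [c - o for c, o in zip(target, observed)]
        for j in range(n_proj):
            gradient = sum(transport[i][j] * residual[i] for i in range(len(transport)))
            # Gradient step followed by clamping to the displayable range.
            pixels[j] = min(1.0, max(0.0, pixels[j] + step * gradient))
    return pixels
```

For example, with two camera pixels and two projectors of two pixels each, a matrix such as `[[0.6, 0.1, 0.3, 0.0], [0.1, 0.6, 0.0, 0.3]]` lets both projectors contribute to each camera pixel, and the solver distributes the target intensity among them.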

5 Dynamic Projection Mapping

While projection mapping has been an active research field for a long time, most of the earlier research focused on the augmentation of static objects, or of slowly and rigidly moving objects, since any dynamic projection system adds considerably to system complexity and performance requirements. However, since the computational power of CPUs and GPUs has evolved quickly according to Moore’s law, and high-speed cameras and projectors are now becoming commercially available, more and more dynamic projection mapping systems have been published. These methods can be classified with respect to the degree of freedom of the dynamic components of the procams. Most of the systems define dynamic in the sense that the scene transforms rigidly (or at least the non-rigid transformation is already known), or that the projector or the camera is allowed to move. These approaches, although requiring very low latencies to generate convincing augmentations, can be supported by the application of known rigid geometry and potentially available tracking information.

Fig. 4. Experimental results of dynamic projection mapping: (left) visual markers cancellation [2, 3] and (right) scanning timing based online geometric registration for a laser projector [23].

Methods for the augmentation of rigid dynamic objects do not require a full dense online surface reconstruction, but only a pose estimation of the projector with respect to the geometry to understand how the already known, geometrically rigid computer graphics need to be rendered correctly by the devices. Applying a visual marker achieves a stable pose estimation. However, markers attached to a projection surface disturb the projected results, as the markers are visible as a texture of the surface. This issue is resolved by combining a radiometric compensation technique to visually cancel the markers [2, 3]. Other researchers replace the markers with tiny photosensors to measure the scanning timing of a projected beam from a laser projector [23]. Due to the raster-scanning mechanism, the pixel coordinate of the projected beam is uniquely identified from the measured time information. Once more than six photosensors measure the scanning timings and identify these pixel coordinates, the pose of the surface is estimated. Leveraging a 1,000 Hz high speed procams (cf. Sect. 3), a visual marker-based method achieves very low latency registration [41]. A stable marker position prediction is possible because the distance between the previous and current marker positions is short due to the small time difference (i.e., 1 ms).
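The mapping from a measured scanning time to a pixel coordinate can be sketched as follows. The example assumes an idealized raster scan: every line is scanned in the same direction, every pixel has the same dwell time, and there are no blanking intervals. Real laser projectors may scan bidirectionally or with non-uniform speed, so the function below, whose name and parameters are hypothetical, only illustrates why the time stamp identifies the pixel uniquely.

```python
def pixel_from_scan_time(time_since_frame_start, width, height, frame_period):
    """Return (column, row) of the beam at the given time within one frame.

    Assumes a unidirectional raster scan with uniform pixel dwell time.
    """
    pixel_period = frame_period / (width * height)
    index = int(time_since_frame_start / pixel_period)
    if not 0 <= index < width * height:
        raise ValueError("time stamp lies outside the frame period")
    return index % width, index // width
```

Each photosensor with a known 3D position on the surface then yields one 2D-3D correspondence in the projector's image plane, and the pose of the surface can be estimated from a sufficient number of such correspondences.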

A solution for dynamic projection mapping onto a deformable object is described by Punpongsanon et al. [36]. It is realized by painting invisible markers with infrared ink onto the surface; these are measured by an infrared camera and used to estimate the surface’s non-rigid deformation and to adapt the projection accordingly. A high-speed camera is used to robustly track dot cluster markers drawn with the same invisible ink [32]. Alternatively, retro-reflective markers are used to measure the surface deformation in the work of Fujimoto et al. [11]. However, fully dynamic tracking is not achieved by this method. The dot cluster markers were extended to also allow projection onto dynamic objects, as shown by Narita et al. [31].

A system to dynamically augment human faces using projection was presented by Bermano et al. [5]. It applies markerless human face tracking, estimates blend shapes describing the current expression, deforms a base mesh and applies a texture which is dynamically adapted depending on estimated expression, time, desired lighting, as well as the spatial location of the face. To simplify the overall processing pipeline, projector and camera were optically aligned allowing the whole augmentation pipeline to work in 2D space. The overall latency of the presented prototype is less than 10 ms. Although this might sound sufficiently fast, an extended Kalman filter (EKF) needed to be incorporated for motion prediction to keep the inevitable delay of the projection onto the surface below the visual perception threshold. Recently, a similar system based on the usage of depth sensors was presented [39]. While they show how such an augmentation can be carried out with optically unaligned depth cameras and multiple projectors, the latency of the incorporated depth sensors makes it currently impractical for any fast and sudden motions. However, with more advanced and faster hardware, such limitations might be overcome.
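The role of motion prediction in hiding residual latency can be illustrated with a much simpler filter than the one used in the cited system. The sketch below is a one-dimensional linear Kalman filter with a constant-velocity model, not an extended Kalman filter, and it is not the implementation of [5]; the class name, noise parameters, and the simplified process noise (added to the velocity variance only) are assumptions.

```python
class ConstantVelocityKalman:
    """1D Kalman filter with state (position, velocity)."""

    def __init__(self, process_var=1.0, measurement_var=1e-4):
        self.pos, self.vel = 0.0, 0.0
        self.cov = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q, self.r = process_var, measurement_var

    def step(self, measured_pos, dt):
        """Advance the state by dt, then correct it with a position measurement."""
        (p00, p01), (p10, p11) = self.cov
        # Predict: x <- F x and P <- F P F^T + Q, with F = [[1, dt], [0, 1]].
        self.pos += self.vel * dt
        n00 = p00 + dt * (p10 + p01) + dt * dt * p11
        n01 = p01 + dt * p11
        n10 = p10 + dt * p11
        n11 = p11 + self.q * dt  # simplified process noise on velocity only
        # Update with the position measurement (H = [1, 0]).
        innovation_var = n00 + self.r
        k0, k1 = n00 / innovation_var, n10 / innovation_var
        innovation = measured_pos - self.pos
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        self.cov = [[(1 - k0) * n00, (1 - k0) * n01],
                    [n10 - k1 * n00, n11 - k1 * n01]]

    def predict_ahead(self, latency):
        """Extrapolate the position to the moment the image reaches the surface."""
        return self.pos + self.vel * latency
```

Rendering for the extrapolated position rather than for the last measured one compensates for the system delay as long as the motion model holds; sudden changes of direction still produce a transient error.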

The latest work realized real-time projection onto fully non-rigid, dynamic, and unknown moving projection surfaces [28]. It applied high speed photometric stereo under IR light to estimate the normal directions of the projection target’s surface and projected direction-dependent images onto the target.
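The principle of photometric stereo can be sketched for a single pixel. The example uses the classical formulation with three known light directions and a Lambertian surface without shadows, which is an assumption made here for illustration and not necessarily the formulation of [28]; the function names are hypothetical. Each intensity is modeled as albedo times the dot product of light direction and normal, giving a 3 x 3 linear system that is solved with Cramer's rule.

```python
import math


def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def estimate_normal(light_dirs, intensities):
    """Return (unit normal, albedo) for one pixel from three observations.

    light_dirs: three unit vectors (list of lists) pointing toward the lights.
    intensities: the pixel's intensity under each of the three lights.
    """
    denom = det3(light_dirs)
    if abs(denom) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    scaled_normal = []
    for col in range(3):
        # Cramer's rule: replace one column by the intensity vector.
        replaced = [row[:] for row in light_dirs]
        for r in range(3):
            replaced[r][col] = intensities[r]
        scaled_normal.append(det3(replaced) / denom)
    albedo = math.sqrt(sum(v * v for v in scaled_normal))
    if albedo == 0.0:
        raise ValueError("pixel is completely dark; normal is undefined")
    return [v / albedo for v in scaled_normal], albedo
```

Because the computation is independent per pixel and involves only a small linear solve, it is well suited to high-speed implementations. Using more than three lights with a least-squares solve would make the estimate more robust to noise.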

6 Summary

This article described computational projection display technologies that overcome technical limitations stemming from projector hardware and improve projected image quality on arbitrary, imperfect surfaces beyond the capability of algorithmic solutions. It also covered a recent research trend, dynamic projection mapping. An interesting new research direction is to develop projection mapping technologies that take into account the perceptual properties of human observers. Several works have already achieved illusory visual effects, such as deforming real objects by projection mapping [22, 35]. Another important direction is to integrate the techniques described in this article to develop an ultimate projection mapping system that enhances the projected image quality with regard to all of the above-mentioned technical issues. I believe the technologies introduced in this article will open up new application fields for projection mapping and accelerate the development of useful products and services.