1 Introduction

Human–computer interaction (HCI) has grown rapidly in recent years and plays a pivotal role in smart living [1], healthcare [2, 3], and virtual/augmented reality (VR/AR) [4, 5]. HCI techniques include graphical user interfaces, voice control, biometric recognition, and gesture recognition. The user interface (UI) is the typical bridge for information transmission between humans and computers. UIs based on touch are already widely used. However, touch interfaces inevitably involve physical contact, which increases the risk of cross-infection by bacteria and viruses [6]. Contactless HCI is desirable in remote communication and control systems for its advantages in hygiene, convenience, and safety [7, 8]. Various contactless HCI methods have been proposed, including the virtual keyboard.

Keyboards in modern computers have driven an ever-increasing volume of text-based communication. As computing technology expands beyond the desktop, academic and industrial researchers have sought efficient text input methods to replace the conventional desktop keyboard. Mobile text input has been a recurring topic at nearly every HCI conference since the 2000s [9,10,11,12]. Resistive touch screens used with a stylus were an early alternative to soft keyboards on mobile devices; finger-operated capacitive touch screens with soft keyboards then became the dominant text input method. With the progress of computing systems, numerous studies on keyboard interaction have emerged, including new interactive keyboard designs [13,14,15], advanced algorithms for gesture control systems [16], and evaluation criteria and design rules for improving user experience [17]. A virtual keyboard typically comprises two components: gesture recognition and interface display. Gesture recognition enables various forms of interaction [8, 18,19,20,21] by detecting and analyzing the movement of users’ hands. In recent years, a variety of movement-tracking devices have emerged that can be employed to implement gesture UIs, and the advancement of diverse display techniques offers an intuitive perceptual experience for the interface. However, most studies on virtual keyboard systems focus on two-dimensional (2D) interfaces, with limited exploration of three-dimensional (3D) virtual keyboard interaction. This constrains the interactive objects and scenarios, thereby limiting the user experience.

A UI featuring 3D display can provide users with an immersive perceptual experience. 3D display techniques include binocular vision display, volumetric display, light field display, and holographic display [22, 23]. Binocular vision display provides psychological and physiological cues, but its vergence–accommodation conflict may cause visual fatigue and dizziness [24,25,26,27,28]. Volumetric display provides all physiological cues but lacks psychological cues such as occlusion, shadow, and texture [29,30,31,32]. Light field display offers psychological, binocular-depth, and monocular-depth cues, yet its focus cues are incomplete [33,34,35,36,37,38]. In contrast, holographic display can in principle provide all depth cues and thus a more realistic sensation [39,40,41,42]. In its original form, holography encodes the wavefront on photosensitive materials through interference. With the development of computer technology, the wavefront can now be calculated numerically and 3D scenes reproduced with programmable spatial light modulators (SLMs), an approach known as computer-generated holography (CGH). Following advances in CGH algorithms [43,44,45,46], fast and high-quality 3D reconstruction can now be achieved, paving the way for AR/VR [23, 47, 48] as well as the Metaverse [49].

Applying CGH in HCI systems is an active research direction. Holographic 3D gesture UIs have been used to build an interactive color electroholography system [50], a video system for rotating and scaling specific virtual objects [51], a holographic projection system for drawing fingertip trajectories [52], and an aerial writing and erasing system [53]. Shimobaba et al. proposed an interactive color electroholography system that used a field-programmable gate array and a time-division switching method for color reconstruction [50]. Adhikarla et al. reported a design for 3D gesture interaction with a full-horizontal-parallax light field display [54]. Yamaguchi et al. used a light field display to establish a 3D user interface [55]; by detecting the light scattered from the user’s fingertips on the 3D floating image, they realized a 3D touch interface [56]. Sang et al. presented an interactive system with a mouse based on a floating light field display [57]. Yamada et al. demonstrated an interactive, full-color holographic video system that rotated and scaled holographic objects [51]. Sando et al. combined a 3D holographic display system with digital micromirror devices and rotating mirrors, projecting viewable 3D videos with mouse interaction [58]. Suzuki et al. proposed a real-time holographic projection system for drawing trajectories with fingertips [52]. Sánchez Salazar Chavarría et al. proposed a 3D user interface based on a holographic optical element that detects light scattered from the fingertips [59]. Nishitsuji et al. demonstrated a holographic display system for drawing basic handwritten content with a tablet and a touch pen [60]. Takenaka et al. built a holographic aerial writing system to draw and erase finger trajectories [53]. Sánchez Salazar Chavarría et al. further put forward a method to register the user’s position with the reconstructed 3D content without calibration [61]. These studies demonstrate the advantages of holographic contactless systems, including quick initiation, an intuitive visual experience, and an accurate interaction process. However, they frequently rely on handheld tools, wearable devices, or projection screens, which may cause inconvenience. Moreover, research on 3D gesture UIs for virtual keyboard interaction is still lacking.

Drawing inspiration from the advantages of CGH, we propose a 3D virtual keyboard system that combines gesture recognition and holographic display. A hand-tracking sensor collects the gestures and fingertip positions, and an SLM generates the 3D display patterns. The hand-tracking sensor and the SLM are operated synchronously with feedback control by a personal computer. We conducted user-interaction experiments to evaluate the system’s accuracy and response time. No wearable devices, handheld tools, or projection screens are required, thereby eliminating potential user inconvenience. The robust 3D virtual keyboard system is expected to serve as a solution for mobile text input and to contribute to 3D user interfaces in VR and AR.

2 Methodology

2.1 System configuration

The system setup is schematically shown in Fig. 1a. A 532 nm fiber-coupled laser is used as the coherent source for illumination. The emitted light from the single-mode fiber is collimated, properly polarized, and then modulated by the SLM. By uploading a pre-calculated modulation pattern to the SLM, a corresponding 3D image can be reconstructed at the target location. To facilitate convenient hand interaction, we introduce a magnification module with two lenses, L1 and L2, producing an enlarged 3D scene within the detection area of a hand-tracking sensor. The hand-tracking sensor and the SLM are synchronously controlled by a computer. A camera is used to observe and record the interaction from the viewpoint of the human eye.

Fig. 1
figure 1

a Schematic diagram of the proposed system. CL collimating lens; P polarizer; BS beam splitter; L1 and L2 lenses; M1 and M2 mirrors. b Flowchart of the interaction process, including holographic display module and gesture recognition module

Figure 1b shows the two core modules: the holographic display module and the gesture recognition module. In the holographic display module, holograms are pre-calculated and uploaded to the SLM to reconstruct the 3D object. The gesture recognition module uses the hand-tracking sensor to detect the hand structure and accurately measure the fingertip positions, which are then used to determine the corresponding interaction behavior. An updated hologram is uploaded to the SLM accordingly. By operating in this closed-loop manner, the system achieves real-time, dynamic 3D display with interactive capability.
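The closed loop can be summarized in pseudocode. The following Python sketch is illustrative only: get_fingertip_positions, detect_pressed_button, upload_to_slm, and the HOLOGRAMS lookup table are hypothetical placeholders standing in for the hand-tracking SDK, the interaction criteria of Sect. 2.2, and the SLM driver, not the authors’ actual code.

```python
import time

# Hypothetical placeholders: get_fingertip_positions() wraps the hand-tracking
# SDK, detect_pressed_button() applies the criteria of Sect. 2.2, HOLOGRAMS maps
# each keyboard state to its pre-calculated phase-only hologram, and
# upload_to_slm() drives the SLM.
def interaction_loop():
    state = "idle"                              # initial state: all buttons popped up
    upload_to_slm(HOLOGRAMS[state])
    while True:
        tips = get_fingertip_positions()        # five (x, y, z) fingertip coordinates
        pressed = detect_pressed_button(tips)   # None, or the name of the pressed button
        new_state = pressed or "idle"
        if new_state != state:                  # only re-upload when the state changes
            upload_to_slm(HOLOGRAMS[new_state])
            state = new_state
        time.sleep(1 / 200)                     # sensor frame rate: 200 fps
```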

The system combines holographic display and gesture recognition techniques, thereby enabling real-time contactless interaction with 3D virtual objects. The magnification module effectively relaxes the constraint imposed by the limited light-field extent of the SLM, and the computer coordinates the hand-tracking sensor and the SLM, providing an intuitive user experience. Together, these features contribute to the system’s capability and extensibility.

2.2 Gesture recognition module

3D hand gesture recognition has attracted increasing research attention in the field of HCI. It can be achieved through vision-based or sensor-based approaches [62]. Techniques for obtaining 3D spatial–temporal data generally involve stereo cameras, motion capture systems, and depth sensors [63, 64]. Stereo cameras are based on human binocular vision. Motion capture systems use wearable markers or motion-tracking techniques for position estimation. Depth sensors include time-of-flight (ToF) cameras and structured-light cameras; popular examples are ToF cameras, Intel RealSense [65], Kinect [66], and Leap Motion [67]. Unlike sensors that capture full-body depth, Leap Motion, which we choose in this work, focuses on hand tracking. It has an accuracy of 0.01 mm in detecting hands and fingers. Its tracking area is an inverted quadrangular pyramid, with a horizontal field of view of 140°, a vertical field of view of 120°, and a depth ranging from 10 to 80 cm, as shown in Fig. 2a. The device comprises two grayscale infrared cameras, four infrared LEDs, and a top filter layer that transmits only infrared light.

Fig. 2
figure 2

a The tracking area of Leap Motion is an inverted quadrangular pyramid, with a horizontal view of 140°, a vertical view of 120°, and a measuring depth between 10 cm and 80 cm. b The hand structure detected and modeled by Leap Motion. c Schematic of the interaction criteria. The “Delete” button is pressed when the fingertip enters the cubic area shown in green

The gesture recognition process involves data acquisition, pre-processing, segmentation, feature extraction, and classification. When a hand enters the detection area, it is automatically tracked and a series of data frames is acquired. The raw data are subsequently pre-processed by built-in software. Figure 2b illustrates the hand structure measured by Leap Motion, which includes information about the fingers, gestures, position, velocity, direction, and rotation. The thumb, index, middle, ring, and pinky fingers and the wrist are recognized, along with the distal, intermediate, proximal, and metacarpal bones of each finger. Additionally, Leap Motion can capture high-speed movements at 200 frames per second.

In this work, we consider a keyboard interaction scenario, where the positions of the fingertips serve as the input. Figure 2c illustrates the criteria for pressing the button “Delete”. The coordinate origin is defined as the center point on top of the Leap Motion. The positions of the five fingertips are denoted as \((x_{n} ,y_{n} ,z_{n} ),n = 1,2,3,4,5\). The foremost fingertip, determined by the minimum position along the z-axis, is considered as the button press candidate:

$$z_{k} = \min \left\{ {z_{n} |n = 1,2,3,4,5} \right\}.$$
(1)

The activation of the button “Delete” occurs when the fingertip enters a pre-defined cubic area, where the following conditions are satisfied:

$$|z_{k} - z_{(0)} | < d,z_{(0)} = 0,$$
(2)
$$|x_{k} - x_{(0)} | < l/2,$$
(3)
$$|y_{k} - y_{(0)} | < w/2,$$
(4)

where \((x_{(0)} ,y_{(0)} ,z_{(0)} )\) denotes the center position of the “Delete” button, \(l\) and \(w\) denote its length and width, respectively, and \(d\) is the depth threshold along the z-axis. The interaction criteria for the other buttons are defined analogously.
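As a minimal sketch (not the authors’ code), the criteria of Eqs. (1)–(4) translate directly into a few lines of Python; the depth threshold d and the button layout entry below are hypothetical values chosen for illustration.

```python
def detect_pressed_button(fingertips, buttons, d=7.5):
    """fingertips: five (x, y, z) tuples in mm, in the sensor's coordinate system.
    buttons: dict mapping a button name to (x0, y0, z0, l, w) in mm."""
    # Eq. (1): the foremost fingertip (minimum z) is the press candidate.
    xk, yk, zk = min(fingertips, key=lambda tip: tip[2])
    for name, (x0, y0, z0, l, w) in buttons.items():
        # Eqs. (2)-(4): the candidate must lie inside the button's cubic volume.
        if abs(zk - z0) < d and abs(xk - x0) < l / 2 and abs(yk - y0) < w / 2:
            return name
    return None

# Hypothetical layout entry for the magnified "Delete" button (10 x 20 mm, see Sect. 3.2).
buttons = {"Delete": (17.0, 108.0, 0.0, 10.0, 20.0)}
```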

2.3 Holographic display

In a contactless interaction system, it is desirable to create 3D scenes that give the user a convincing perceptual experience. Holographic display can provide the full set of depth cues, leading to a realistic sensation, and with the progress of CGH algorithms, high-speed and high-quality holographic display has been achieved. In this work, the 3D display module is therefore based on CGH. Dynamic holographic display can be realized with programmable wavefront modulation devices such as SLMs and digital micromirror devices (DMDs). In our prototype system, a phase-only liquid-crystal SLM (LC-SLM) is used for its high efficiency compared with amplitude-based devices.

Given a target 3D scene, we employ the layer-oriented angular-spectrum method to calculate the 2D phase-only hologram (POH) [44]. Specifically, the free-space propagation of the wavefield is calculated based on the angular spectrum model as

$$U(x,y;\ z) = F^{ - 1} \{ F\{ U(x,y;\ z_{0} )\} \times H(f_{x} ,f_{y} ;\ s,\lambda )\} ,$$
(5)
$$H(f_{x} ,\ f_{y} ;\ s,\lambda ) = \exp ({\text{j}}ks\sqrt {1 - (\lambda f_{x} )^{2} - (\lambda f_{y} )^{2} } ),$$
(6)

where \(U(x,y;z_{0} )\) and \(U(x,y;z)\) denote the original and propagated 2D wavefield distribution at axial locations \(z_{0}\) and \(z\), respectively. \(H(f_{x} ,f_{y} ;s,\lambda )\) is the transfer function. \(\lambda\) denotes the illumination wavelength and \(k = 2\pi /\lambda\) denotes the wave number. \(f_{x}\) and \(f_{y}\) are the spatial frequencies. \(s = z - z_{0}\) denotes the propagation distance. \(F\) and \(F^{ - 1}\) represent the Fourier and inverse Fourier transform, respectively.
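Eqs. (5) and (6) translate directly into a few lines of NumPy. The sketch below is illustrative rather than the authors’ implementation; evanescent frequency components are simply clipped.

```python
import numpy as np

def angular_spectrum_propagate(u0, wavelength, pitch, s):
    """Propagate a complex field u0 (2D array) over a distance s; lengths in metres."""
    ny, nx = u0.shape
    fx = np.fft.fftfreq(nx, d=pitch)                     # spatial frequencies along x
    fy = np.fft.fftfreq(ny, d=pitch)                     # spatial frequencies along y
    FX, FY = np.meshgrid(fx, fy)
    k = 2 * np.pi / wavelength
    arg = np.maximum(0.0, 1.0 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2)
    H = np.exp(1j * k * s * np.sqrt(arg))                # transfer function, Eq. (6)
    return np.fft.ifft2(np.fft.fft2(u0) * H)             # Eq. (5)
```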

The 3D scene is discretized into a series of 2D layers along the axial direction, and for each layer the amplitude distribution is extracted from the corresponding 2D slice. Due to the ill-posedness of POH calculation, the directly obtained phase pattern may suffer from limited display contrast and speckle noise. We therefore adopt the Gerchberg–Saxton (GS) algorithm [68] to improve the display quality, which proceeds as follows:

(1) Initialize the complex amplitude at the SLM plane as

$$U_{1} = A_{0} \exp ({\text{j}}\phi_{1} ),$$
(7)

where \(\phi_{1}\) is a random initial phase, and \(A_{0}\) is the amplitude at the SLM plane, determined by the source intensity.

(2) Obtain the complex amplitude distribution at the target plane through forward propagation:

$$U_{2} = F^{ - 1} \{ F\{ U_{1} \} \times H(s)\} = A_{2} \exp ({\text{j}}\phi_{2} ),$$
(8)

where \(s\) denotes the propagation distance.

(3) Preserve the phase \(\phi_{2}\) and replace the amplitude with the target amplitude \(A_{i}\):

$$U_{3} = A_{i} \exp ({\text{j}}\phi_{3} ), \quad \phi_{3} = \phi_{2} .$$
(9)

(4) Obtain the complex amplitude distribution at the SLM plane through back-propagation:

$$U_{4} = F^{ - 1} \{ F\{ U_{3} \} \times H( - s)\} = A_{4} \exp ({\text{j}}\phi_{4} ).$$
(10)

(5) Preserve the phase \(\phi_{4}\) and replace the amplitude with the SLM-plane amplitude \(A_{0}\):

$$U_{1} ^{\prime} = A_{0} \exp ({\text{j}}\phi_{1} ^{\prime}), \quad \phi_{1} ^{\prime} = \phi_{4} .$$
(11)

(6) Steps (2)–(5) proceed iteratively until the phase distribution \(\phi_{m} \left( {m = 1,2,3,...} \right)\) converges, and the refined POH is obtained.
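A compact sketch of this loop is given below. It reuses the angular_spectrum_propagate function sketched after Eq. (6) and is an illustration of the GS refinement for a single target layer, not the authors’ code.

```python
import numpy as np

def gs_refine(A_target, A0, wavelength, pitch, s, iterations=50):
    """Refine a phase-only hologram for one target layer at distance s from the SLM."""
    phi = 2 * np.pi * np.random.rand(*A_target.shape)                     # step (1): random phase
    u_slm = A0 * np.exp(1j * phi)
    for _ in range(iterations):
        u_img = angular_spectrum_propagate(u_slm, wavelength, pitch, s)    # step (2): forward
        u_img = A_target * np.exp(1j * np.angle(u_img))                    # step (3): target amplitude
        u_back = angular_spectrum_propagate(u_img, wavelength, pitch, -s)  # step (4): backward
        u_slm = A0 * np.exp(1j * np.angle(u_back))                         # step (5): SLM amplitude
    return np.angle(u_slm)                                                 # step (6): refined POH
```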

A flowchart of the iterative algorithm is illustrated in Fig. 3b. To accurately simulate the diffusion effect of real-world 3D objects, the layers are initialized with random phases. Through back-propagation, a complex-amplitude optical field is obtained at the SLM plane for each layer. By adding the back-propagated field distributions of all the layers and extracting the phase, a POH is obtained. The entire process is depicted in Fig. 3a. Compared with other CGH algorithms, the layer-oriented angular-spectrum method is favored for its high computational efficiency and its accurate prediction of the complete diffraction field.
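The layer-oriented superposition itself can be sketched in the same style. This again assumes the angular_spectrum_propagate helper sketched after Eq. (6) and is illustrative only, not the authors’ implementation.

```python
import numpy as np

def layer_oriented_poh(layers, distances, wavelength, pitch):
    """layers: list of 2D amplitude slices; distances: SLM-to-layer distances in metres."""
    field = np.zeros(layers[0].shape, dtype=complex)
    for A, z in zip(layers, distances):
        diffuser = np.exp(2j * np.pi * np.random.rand(*A.shape))  # random phase: diffuse surface
        field += angular_spectrum_propagate(A * diffuser, wavelength, pitch, -z)  # back-propagate
    return np.angle(field)                                        # extract phase -> POH
```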

Fig. 3
figure 3

a The layer-oriented angular-spectrum method. \(L_{1} ,L_{2} ,...\) denote the 2D slices of the 3D scene, and \(H_{1} ,H_{2} ,...\) denote the corresponding complex-amplitude optical fields calculated at the SLM plane. b Flowchart of the GS algorithm. The light source determines the SLM-plane amplitude, and the target pattern determines the target amplitude. The iteration continues until the phase distribution converges

3 Experiment

3.1 Experimental setup

We used a fiber-coupled 532 nm laser for illumination. A reflective phase-only LC-SLM (GAEA-2, Holoeye) with a resolution of 3840 × 2160 pixels, a pixel size of 3.74 μm, and a refresh rate of 60 Hz, was used for wavefront modulation. To magnify the displayed 3D target, we introduced two lenses, L1 and L2, with focal lengths of 100 mm and 300 mm respectively. A Canon EOS 70D camera was employed to observe and record the displayed scene. The entire process, including gesture data acquisition, processing, and SLM control, was implemented with Python. The user sequentially pressed all the buttons of the keyboard using the index finger. This procedure allowed for comprehensive data collection for subsequent analysis and evaluation. The experimental setup is shown in Fig. 4 and the experimental settings are shown in Table 1.

Fig. 4
figure 4

Experiment setup of HCI system. CL collimating lens; P polarizer; BS beam splitter; L1 and L2 lenses; M1 and M2 mirrors

Table 1 Experimental settings

3.2 Experimental results

Figure 5 shows the designed pattern, the depth map, the calculated POH, and the display patterns when the button “Delete” is pressed. The initial virtual keyboard formed by the SLM has a size of 10 × 10 mm. The buttons “1” to “9” and “Dot” are 2.5 × 2.5 mm, the buttons “Enter” and “Delete” are 2.5 × 5 mm, and the button “0” is 5 × 2.5 mm. The virtual keyboard plane is positioned 17.7 mm from the SLM plane and the pressed button 18 mm from the SLM plane, giving a depth separation of 0.3 mm between the two layers. After magnification, the virtual keyboard measures 40 × 40 mm and the separation between the two layers becomes 15 mm; the buttons “1” to “9” and “Dot” are then 10 × 10 mm, the buttons “Enter” and “Delete” 10 × 20 mm, and the button “0” 20 × 10 mm. The magnified keyboard is suitable for hand interaction, given that the typical diameter of a human finger is approximately 10 mm. Setting the camera’s focal distance at z = 0 mm (Layer 1) and z = − 15 mm (Layer 2) brings the virtual keyboard and the pressed button into focus, respectively, as Fig. 5d and e show. The dimensions of the virtual 3D keyboard are summarized in Table 2.
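With the numbers reported here, the two layers of Fig. 5 could be assembled as follows. This is a hedged sketch only: render_keyboard_plane and render_pressed_button_plane are hypothetical drawing helpers, and the decomposition of the scene into two amplitude slices is an assumption, while the distances and SLM pitch are taken from the text; layer_oriented_poh refers to the sketch in Sect. 2.3.

```python
wavelength = 532e-9    # illumination wavelength
pitch = 3.74e-6        # SLM pixel pitch (GAEA-2)
z_keyboard = 17.7e-3   # keyboard plane, 17.7 mm from the SLM
z_button = 18.0e-3     # pressed-button plane, 18 mm from the SLM

# Hypothetical helpers returning 2D amplitude slices of the designed pattern.
layer1 = render_keyboard_plane()         # keyboard without the pressed "Delete" key
layer2 = render_pressed_button_plane()   # the pressed "Delete" key alone
poh = layer_oriented_poh([layer1, layer2], [z_keyboard, z_button], wavelength, pitch)
```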

Fig. 5
figure 5

a Designed keyboard pattern and b the depth map. c POH of the keyboard. d and e are the display patterns observed at Layer 1 and Layer 2, respectively. In d the virtual keyboard is in focus, and in (e) the button “Delete” is in focus

Table 2 Size of the virtual 3D keyboard

To demonstrate the functionality of our system, we conducted a series of experiments, sequentially pressing all the buttons. Figure 6 shows the captured images with the keyboard plane (Layer 1) in focus. The buttons “Delete”, “0”, “1”, and “5” were successfully pressed, as indicated by the positions of the interacting fingertip shown below each image. It can easily be verified that these positions satisfy the criteria of Eqs. (2)–(4). The successful interactions validate the accuracy of the press determination, with the uploaded holograms reconstructing the corresponding virtual patterns as expected.

Fig. 6
figure 6

Press (a) button “Delete”, (b) button “0”, (c) button “1”, and (d) button “5”. The virtual keyboard is in focus, and the pressed button is out of focus

To evaluate the functional stability of the holographic 3D virtual keyboard system, a series of interactive experiments was conducted. The transition of the 3D virtual keyboard from its initial state, in which all buttons pop up, to a state in which a specific button is pressed is referred to as a “pressed response”. Whenever the keyboard generated a “pressed response”, the fingertip position collected by Leap Motion was recorded. The interactive experiments were carried out by randomly pressing the virtual buttons a sufficient number of times. We collected and analyzed the statistics of the “pressed responses” for the buttons “Delete”, “0”, “1”, and “5”. The number of responses was counted, as shown in Table 3: the response counts for buttons “Delete”, “0”, “1”, and “5” were 63, 442, 722, and 28, respectively. The total response count was 1255, and all responses were accurate. The fingertip positions were recorded and marked in a coordinate system, as shown in Fig. 7a–d, with the button areas indicated in the same coordinate system. It can be observed that the fingertip positions lie within the predefined cubic areas of the respective buttons. Point “A” at (16.97, 108.78, 1.19) in Fig. 7a, point “B” at (− 17.15, 83.78, 0.23) in Fig. 7b, point “C” at (− 17.83, 94.47, 0.91) in Fig. 7c, and point “D” at (5.31, 103.58, 0.56) in Fig. 7d correspond to the captured images in Fig. 6a–d. Figure 7e and f show all collected fingertip positions and the corresponding button areas.

Table 3 Statistical results in interaction experiment of pressing buttons “Delete”, “0”, “1”, and “5”
Fig. 7
figure 7

Statistical results of interaction experiment of pressing (a) button “Delete”, (b) button “0”, (c) button “1”, and (d) button “5”. (e) All collected data points and (f) the button areas where they are located. “A”, “B”, “C”, and “D” are the corresponding collected data points in Figs. 6(a), (b), (c), and (d)

For an interaction system, response time is an important attribute. The response time of the proposed HCI system depends on the time required for gesture data collection, data processing, hologram uploading, and SLM refreshing. The hand-tracking sensor has a frame rate of 200 frames per second, and the SLM has a refresh rate of 60 Hz; the data processing and hologram uploading executed on the personal computer take a negligible amount of time (on the order of \({10}^{-8}\) s of computation). The response time for a complete interaction is therefore about 0.02 s, which is perceived as real time by users. In this work, the holograms are pre-calculated to ensure real-time rendering, which places greater demands on system memory as the number of display patterns increases. This can be addressed with parallel computing on graphics processing units (GPUs) and deep-learning-based CGH algorithms [69, 70].
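As a rough back-of-the-envelope check, assuming one sensor frame, the (negligible) computation time, and one SLM refresh period simply add up,

$$t_{\text{response}} \approx \frac{1}{200\,\text{fps}} + \frac{1}{60\,\text{Hz}} \approx 5\,\text{ms} + 16.7\,\text{ms} \approx 0.02\,\text{s}.$$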

It should be noted that in the optical system, lens L2 serves as a field lens that widens the field of view, helping the camera capture the entire holographic image. The purpose of lens L2 and the camera is to illustrate the approach in the experimental demonstration; in practical applications this part may not be necessary, so the system complexity can be further reduced. Expanding the viewing angle of holographic 3D display remains a topic for further research; it can be addressed with a liquid crystal grating [71], time- or space-multiplexing techniques, metasurface devices, and holographic optical elements.

4 Conclusion

In this paper, we proposed a contactless 3D virtual keyboard system that combines gesture recognition and holographic display. Specifically, we integrated an SLM, which generates the holographic 3D display patterns, with a hand-tracking sensor, which detects hand gestures and fingertip positions. The hand-tracking sensor and the SLM are operated synchronously with feedback control by a personal computer. The user-interaction experiments demonstrate the system’s 3D display capability, stable performance, and real-time responsiveness. By making minor adjustments to the virtual display patterns and the interaction instructions, the proposed HCI system can accommodate a broad spectrum of applications, particularly contactless interaction scenarios such as those arising during the COVID-19 pandemic. Holographic near-eye display is a potential application of bidirectional holographic display systems with haptic interfaces. The integration of holographic 3D gesture UIs with VR, AR, and MR offers a promising avenue for the development of user interfaces and mobile devices.