Abstract
Object tracking in computer vision can be done with either a marker-less or a marker-based approach. Computer vision systems have long used fiducial markers for pose estimation in applications such as augmented reality [5] and robot navigation [4]. With advances in Augmented Reality (AR), new tools such as ArUco [6] markers have been introduced to the literature. ArUco markers address the localization problem in AR by encoding each marker as a binary matrix, which allows camera pose estimation to be carried out efficiently; the binary matrix not only simplifies the process but also improves efficiency. As part of our initiative to create a cost-efficient, 24/7 accessible, Virtual Reality (VR) based chemistry lab for underprivileged students, we wanted an alternative way of interacting with the virtual scene. In this study, we use ArUco markers to build a low-cost keyboard from nothing more than a piece of paper and an off-the-shelf webcam. We believe this keyboard will benefit users because they can see the keys in the corner of the screen before typing, instead of relying on a cumbersome on-screen VR keyboard or a physical keyboard that is invisible while wearing a VR headset. As potential extensions of the base system, we have also designed and evaluated a stereo camera based system and an IMU sensor based system with various sensor fusion techniques. In summary, the stereo camera reduces occlusion-related problems, and the IMU sensor detects vibrations, which simplifies the KeyPress detection problem. We observed that the use of either of these additional sensors improves the overall system performance.
1 Introduction
Object tracking in computer vision can be done with either a marker-less or a marker-based approach. Computer vision systems have long used fiducial markers for pose estimation in applications such as augmented reality [5] and robot navigation [4]. With advances in Augmented Reality (AR), new tools such as ArUco [6] markers have been introduced to the literature. ArUco markers address the localization problem in AR by encoding each marker as a binary matrix, which allows camera pose estimation to be carried out efficiently; the binary matrix not only simplifies the process but also improves efficiency. As part of our initiative to create a cost-efficient, 24/7 accessible, Virtual Reality (VR) based chemistry lab for underprivileged students, we wanted an alternative way of interacting with the virtual scene. In this study, we use ArUco markers to build a low-cost keyboard from nothing more than a piece of paper and an off-the-shelf webcam. We believe this keyboard will benefit users because they can see the keys in the corner of the screen before typing, instead of relying on a cumbersome on-screen VR keyboard or a physical keyboard that is invisible while wearing a VR headset.
Our setup is straightforward and consists of a webcam and a piece of paper with a keyboard-like pattern printed on it, see Fig. 1. There is a numeric keypad with rectangular regions labeled from 0 to 9, and each region carries the ArUco code for the corresponding key value. When the system runs in “live” mode, users can use this printed paper as a keypad: every “touched” key value is translated into a keypress event, so the printed paper acts as a regular keyboard. The system needs both computer vision and smoothing/filtering techniques, which can be fine-tuned for an average user or a specific user.
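The per-frame bookkeeping behind this idea can be sketched as follows. In the real system the visible marker IDs come from OpenCV's ArUco detector; here the helpers are pure Python so the marker-to-key mapping is clear. The function names and the ID-to-key convention are illustrative assumptions, not the paper's actual code.

```python
# Marker IDs 0-9 are assumed to map directly to the keypad digits.
EXPECTED_IDS = set(range(10))

def blocked_markers(detected_ids):
    """Markers printed on the keypad but missing from this frame,
    i.e. candidates for being covered by the user's finger."""
    return EXPECTED_IDS - set(detected_ids)

def highest_blocked(detected_ids):
    """Key candidate for this frame, or None if nothing is blocked."""
    blocked = blocked_markers(detected_ids)
    return max(blocked) if blocked else None
```

In practice `detected_ids` would be the flattened output of OpenCV's ArUco marker detection run on each webcam frame.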
In this paper, we propose a real-time OpenCV-based computer vision approach combined with a state-machine based fast smoothing/filtering algorithm. The filter has a parameter, N, representing the filter strength. We first created a dataset of six-digit numbers typed by the same user on this paper-based keyboard. We then varied the filter strength N from 1 to 10 and measured the accuracy of the proposed paper-based keyboard. For a specific trained user, and for a specific dataset of size ten, the measured system accuracy is 0.0 for \(N<4\), 0.6 for \(N=4\), 1.0 for \(N=5,6,7\), 0.3 for \(N=8\), 0.1 for \(N=9\), and finally 0.0 for \(N=10\). The optimal values appear to be \(N=5,6,7\), but if we eliminate \(N=5\) and \(N=7\) as potential boundary cases, we get \(N=6\) as the optimal choice for this specific trained user.
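The accuracy metric used in this sweep can be sketched as exact-match scoring over the recorded six-digit sequences; the function name and calling convention here are assumptions for illustration, not the paper's evaluation code.

```python
def accuracy(decoded_sequences, ground_truth):
    """Fraction of typed six-digit numbers reproduced exactly.

    decoded_sequences: strings produced by the keyboard pipeline
    ground_truth:      the numbers the user actually typed
    """
    correct = sum(1 for out, ref in zip(decoded_sequences, ground_truth)
                  if out == ref)
    return correct / len(ground_truth)
```

Running this for each filter strength N over the ten recorded sequences yields the accuracy curve reported above.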
The ArUco keyboard used in this study is shown in Fig. 1, and the base system demo is presented in Fig. 4. As potential extensions of the base system, we have also designed and evaluated a stereo camera based system and an IMU sensor based system with various sensor fusion techniques. The stereo camera used in this research was a USB3 ZED camera, see Fig. 2, tested with a GeForce GTX 1050 Ti Max-Q 4 GB laptop running Ubuntu 18 LTS. We observed that the stereo camera reduces occlusion-related issues and results in more robust detection performance. The IMU sensor used in this research is a GY-521 accelerometer and gyroscope module, see Fig. 3, interfaced to an Arduino Uno board over the I2C interface. The IMU sensor detects keypress/touch-related vibrations and sends this information to the host computer. Most mobile devices used today have one or more cameras as well as IMU sensors, so the proposed extensions to our base system are quite realistic. In short, the IMU sensor detects vibrations, which simplifies the KeyPress detection problem.
In summary, we have observed that the use of either of these additional sensors, i.e. an additional camera and/or an IMU sensor, improves the overall system performance.
2 Base System
Our base system [1], shown in Fig. 4, uses a single webcam. The algorithm used in this base implementation is shown in Algorithm 1. In each OpenCV frame, we first detect all visible ArUco markers and then determine all blocked ArUco markers. For each frame, we also determine the highest blocked marker value. If the highest blocked marker has remained the same during the past N frames, we generate a KeyPress event. A KeyRelease event is generated in the first frame in which all ArUco markers are visible again.
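The N-frame debounce described above can be sketched as a small state machine; this is a minimal illustration of the idea, not the paper's actual Algorithm 1 implementation, and the class and marker-ID conventions are assumptions.

```python
EXPECTED_IDS = set(range(10))  # marker IDs 0-9 map to keypad digits

class KeypadFilter:
    def __init__(self, n=6):
        self.n = n            # filter strength: frames of agreement required
        self.history = []     # highest blocked marker seen in recent frames
        self.pressed = None   # key currently held down, if any

    def update(self, visible_ids):
        """Feed one frame's visible marker IDs; return an event or None."""
        blocked = EXPECTED_IDS - set(visible_ids)
        if not blocked:
            # First frame with all markers visible: release any held key.
            self.history.clear()
            if self.pressed is not None:
                key, self.pressed = self.pressed, None
                return ("KeyRelease", key)
            return None
        candidate = max(blocked)              # highest blocked marker wins
        self.history.append(candidate)
        self.history = self.history[-self.n:]  # keep only the last N frames
        if (self.pressed is None and len(self.history) == self.n
                and len(set(self.history)) == 1):
            self.pressed = candidate
            return ("KeyPress", candidate)
        return None
```

Feeding the filter one frame at a time, a KeyPress fires only after the same highest blocked marker has been observed for N consecutive frames, and a KeyRelease fires on the first fully visible frame.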
The detection performance of the system depends on the value of N. For a specific trained user, \(N=6\) value is found to be optimal for a webcam running at 25 frames/s. In general, the optimal N value depends on the frame rate and the user.
3 IMU Sensor Based System
The base system presented in the previous section works by detecting blocked ArUco markers in each frame. However, this single-camera system cannot differentiate between markers blocked without a touch and markers blocked because of a touch. Due to this limitation, a user must be trained not to keep his/her hand stationary for a “long” period of time (5/25 s, i.e. five frames at 25 frames/s) while visible to the camera. Although this is technically possible, and the training process was observed to be straightforward, we have developed an alternative approach to overcome this problem.
This new approach [2] uses an IMU sensor, see Fig. 5, to differentiate between the blocked-without-touch and blocked-because-of-touch cases. IMU sensors measure acceleration along the x, y, and z axes, and can detect even a slight tap on a surface. We used an InvenSense MPU6050 chip as our IMU sensor. A first-order digital low-pass filter is used for smoothing, and thresholding with hysteresis is used for tap detection. The microcontroller sends tap events to the host device, and only then does the host device execute Algorithm 1. See the full source code given in the appendix for the digital low-pass filter, thresholding, and hysteresis parameters.
4 Stereo Camera Based System
As a final improvement of the proposed ArUco keyboard system, we implemented a stereo camera based solution [3], shown in Fig. 6. A stereo camera provides more data, which can be used to improve the overall system performance, with or without an IMU sensor. Some ArUco markers may be blocked because of occlusion rather than touch: even after a touch or tap is detected, multiple ArUco markers may still be blocked. The priority scheme used in Algorithm 1 works for most cases, but the failure rate is non-zero and becomes more noticeable when the ArUco keyboard is rotated significantly. A stereo camera greatly improves detection performance in such cases.
If both cameras report a particular ArUco marker as not detected, the probability of failure, i.e. of the marker being undetected because of occlusion rather than touch, is smaller than with a single-camera system. Therefore, a stereo camera reduces both false KeyPress events and key value errors. It does, however, require more processing power and more complex hardware, which may not be practical for every use case.
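The fusion rule just described can be sketched in a few lines: a marker counts as blocked only when neither view sees it, which filters out single-view occlusions. The function name and the ID convention are illustrative assumptions.

```python
EXPECTED_IDS = set(range(10))  # marker IDs 0-9 map to keypad digits

def fused_blocked(left_visible, right_visible):
    """Markers missing from BOTH camera views.

    A marker occluded in only one view (e.g. by the hand's shadowing
    angle) is still treated as visible, so only a genuine cover-up
    of the key survives the fusion.
    """
    visible_in_either = set(left_visible) | set(right_visible)
    return EXPECTED_IDS - visible_in_either
```

The fused blocked set can then be fed into the same debounce logic as the single-camera system.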
5 Conclusion
In this paper, we have presented a paper-based numeric keypad using ArUco markers. Full source code is given in the appendix. The system can be quite useful as a low-cost, disposable keyboard for VR systems and camera-equipped mobile devices. We observed that the use of an IMU sensor greatly improves the overall system performance. Since almost all mobile devices, whether phones or tablets, have IMU sensors, the improved IMU-based keyboard can be used without any additional sensor or equipment. We also implemented a stereo camera based system, but to the best of our knowledge mobile devices with stereo cameras are not widely available; the stereo camera based implementation remains a feasible alternative for VR systems.
References
[1] ArUco keyboard demo video: base system. https://youtu.be/tnKc6zvXliY
[2] ArUco keyboard demo video: IMU sensor based version. https://youtu.be/sIuhZQpu0AE
[3] ArUco keyboard demo video: stereo camera version (USB3 ZED camera). https://youtu.be/ssbv2NqfAJg
[4] Bacik, J., Durovsky, F., Fedor, P., Perdukova, D.: Autonomous flying with quadrocopter using fuzzy control and ArUco markers. Intell. Serv. Robot. 10(3), 185–194 (2017). https://doi.org/10.1007/s11370-017-0219-8
[5] Billinghurst, M., Clark, A., Lee, G.: A survey of augmented reality. Found. Trends Hum.-Comput. Interact. 8(2–3), 73–272 (2015). https://doi.org/10.1561/1100000049
[6] Garrido-Jurado, S., Muñoz-Salinas, R., Madrid-Cuevas, F., Marín-Jiménez, M.: Automatic generation and detection of highly reliable fiducial markers under occlusion. Pattern Recogn. 47(6), 2280–2292 (2014). https://doi.org/10.1016/j.patcog.2014.01.005
Acknowledgments
Funding is provided by NSF-1919855, Advanced Mobility Institute grants GR-2000028, GR-2000029, and Florida Polytechnic University startup grant GR-1900022.
Appendices
Appendix I: ArUco Code Detection Module aruco_tools.py
Appendix II: Base System minikdb_mono.py
Appendix III: IMU Based System minikbd_imu.py
Appendix IV: Stereo Camera Based System minikbd_zed.py
Appendix V: IMU Sensor Code for Arduino Uno
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Toker, O., Karaman, B., Demirel, D. (2022). A Paper-Based Keyboard Using ArUco Codes: ArUco Keyboard. In: Kurosu, M. (eds) Human-Computer Interaction. Technological Innovation. HCII 2022. Lecture Notes in Computer Science, vol 13303. Springer, Cham. https://doi.org/10.1007/978-3-031-05409-9_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-05408-2
Online ISBN: 978-3-031-05409-9