1 Introduction

Camera calibration is a necessary step before machine vision algorithms can be applied to robotics tasks that require high-precision manipulation coordinated by digital cameras as the primary sensor. Calibration produces the intrinsic and extrinsic parameters of a camera, which define the correspondence between the 2D coordinates of an object point in the image plane and its 3D coordinates in a particular world frame, and provides distortion coefficients that compensate for camera lens imperfections. Classical calibration methods require human involvement in the calibration process: a person holds a classical checkerboard pattern [14] and manually moves it in front of the camera.
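
The role of the intrinsic and extrinsic parameters can be illustrated with a minimal sketch of an ideal pinhole projection. The function name `project_point`, the parameter names (`fx`, `fy`, `cx`, `cy`) and the numeric values are assumptions chosen for illustration; lens distortion is deliberately ignored.

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates (ideal pinhole, no distortion)."""
    # Extrinsic parameters: world frame -> camera frame, X_c = R * X_w + t
    xc, yc, zc = (
        sum(r * p for r, p in zip(row, point_world)) + t
        for row, t in zip(rotation, translation)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    # Intrinsic parameters: perspective division, focal lengths, principal point
    u = fx * xc / zc + cx
    v = fy * yc / zc + cy
    return u, v


# Hypothetical values: camera frame coincides with the world frame
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
u, v = project_point([0.1, -0.05, 1.0], IDENTITY, [0.0, 0.0, 0.0],
                     fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Calibration solves the inverse problem: given many observed 2D points whose 3D positions on a pattern are known, it estimates these parameters together with the distortion coefficients.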

Fig. 1. Russian humanoid robot AR-601M.

However, a more modern approach to camera calibration relies on fiducial marker systems. Fiducial marker systems are popular in many application areas, including physics, medicine, and augmented reality (AR). In robotics, fiducials are applied in navigation, localization, camera pose estimation and camera calibration. Such systems can outperform the classical chessboard system thanks to a more specialized approach to pattern detection and recognition. However, each system has strengths and drawbacks and cannot be equally effective in all of the above-mentioned application fields. We investigate how effectively fiducial markers perform in different environmental conditions and how the size of a marker influences its recognition rate, and we identify possible strengths and drawbacks of each selected fiducial marker system: ARTag, AprilTag and CALTag. The focus of our research was to select the most suitable marker system for autonomous camera calibration of the Russian humanoid robot AR-601M (Fig. 1) by comparing these three systems with each other. Fiducial systems are evaluated against a set of criteria, and the performance of a marker system is assessed with regard to each criterion. In our case, we plan to place the markers on the robot manipulators in order to allow autonomous calibration without human assistance, even though this naturally increases the likelihood of arbitrary occlusion of a marker.

In this paper we systematically estimate the performance potential of the selected fiducial marker systems and further analyze how applicable each system is in various scenarios (i.e., with small marker sizes, limitations on marker position and orientation induced by robot kinematic constraints, varying distance to the marker and uneven lighting conditions in field environments). The specifics of this application impose particular requirements on a marker system, i.e., the selected system should at least be resistant to some degree of marker overlap with other parts of the AR-601M robot.

2 Overview of Fiducials: Related Work

Each fiducial marker system is designed in such a way that its marker (or fiducial) can be automatically detected by a camera with the help of a detection algorithm. The particular design of a marker directly depends on the specific application area and, in most cases, a fiducial developed for one purpose may not be suitable for another application. However, most fiducials share a general shape: an external envelope (often a square or a circle) and an interior marking (an internal image), which encodes useful information (e.g., an identification code).

One of the first fiducial marker systems created for augmented reality applications is the ARToolKit system (its first release was in 1999 [8]). ARToolKit takes a simple approach to marker recognition in space. First, the ARToolKit system converts an image to grayscale and uses a threshold parameter for image binarization. After these steps, the system extracts edges and corners from the image. Based on the identified corners, the system calculates the 3D coordinates of a marker and defines its position. To identify a marker, a symbol (i.e., an image) inside the marker is matched against the set of ARToolKit templates. If the system succeeds in finding a match for the template, it retrieves the ID of the marker and projects a corresponding 3D virtual object (knowing the position and orientation of the marker) into a video frame. Digital interior recognition was absent in the original ARToolKit system, but it was implemented in later marker systems (including ARToolKit Plus, the next version of ARToolKit [4]).
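
The first two steps of this pipeline (grayscale conversion and threshold binarization) can be sketched as follows. This is only an illustrative sketch and not ARToolKit code: the function names, the plain channel averaging and the default threshold of 128 are assumptions.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to gray levels by plain averaging (an assumption)."""
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in rgb_image]


def binarize(gray_image, threshold=128):
    """Mark pixels darker than the threshold as 1 (candidate marker pixels), others as 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray_image]


# Hypothetical 3x3 image: a single dark pixel on a white background
WHITE, BLACK = (255, 255, 255), (10, 10, 10)
image = [
    [WHITE, WHITE, WHITE],
    [WHITE, BLACK, WHITE],
    [WHITE, WHITE, WHITE],
]
binary = binarize(to_grayscale(image))
```

A fixed global threshold is sensitive to uneven illumination, which is one reason why lighting conditions matter for detection.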

The drawbacks of ARToolKit were listed by Mark Fiala, who later developed a new system, ARTag, which uses a digital approach to pattern recognition [4]. The digital approach is utilized in many fiducial marker systems: the internal pattern of a marker is a grid of black and white square cells that represents a bit sequence, which is referred to as the marker ID. At present, a large variety of fiducial marker types exists: markers with a general square [2] or circular shape [9], markers that consist of dots [13], and markers based on a certain picture [6].
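
The digital approach can be illustrated with a minimal sketch that reads a grid of cells as a bit sequence and tries all four orientations of the marker. The function names and the row-by-row bit order are assumptions made for illustration; real systems such as ARTag additionally protect the bits with checksums and error correction, which are omitted here.

```python
def grid_to_id(cells):
    """Read a square grid of 0/1 cells row by row as one binary number."""
    value = 0
    for row in cells:
        for bit in row:
            value = (value << 1) | bit
    return value


def rotate90(cells):
    """Return the grid rotated by 90 degrees clockwise."""
    return [list(row) for row in zip(*cells[::-1])]


def decode(cells, known_ids):
    """Try all four orientations; return (marker_id, quarter_turns) or None."""
    for turns in range(4):
        candidate = grid_to_id(cells)
        if candidate in known_ids:
            return candidate, turns
        cells = rotate90(cells)
    return None


# Hypothetical 2x2 pattern (1 = black cell, 0 = white cell)
pattern = [[1, 0],
           [1, 1]]
result = decode(pattern, known_ids={14})
```

Because the interior is read as bits rather than matched against image templates, identification reduces to a lookup in the set of valid codes.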

For our investigation we selected three marker systems: ARTag (Fig. 2, left), AprilTag (Fig. 2, center) and CALTag (Fig. 2, right). The ARTag [3] system is based on ARToolKit [8], but uses a digital approach to read an internal pattern that is a binary code (barcode) [7]. AprilTag is visually similar to ARTag (a square with a binary code inside) but takes a different approach to marker detection and recognition. CALTag was proposed as an alternative solution for camera calibration [1] following an analysis of classical chessboard-based camera calibration and of fiducial marker approaches.

Fig. 2. From left to right: ARTag (ID 2), AprilTag (ID 4) and CALTag fiducials.

This work is an extension of our previous work on fiducial marker comparison under various types of occlusion and rotation (Fig. 3). In [11] we used a simple experiment design and an inexpensive web camera, Genius FaceCam 1000X, to investigate marker performance with low-cost video-capture equipment. For the experiments we printed the markers on white paper and fixed them on a flat surface of a neutral color to avoid false positive detections. The experiment design consisted of systematic and arbitrary occlusion experiments. For systematic occlusion (Fig. 3 shows an example), each tag was covered with a white paper template starting from the bottom so that the template occluded K percent of the marker's area. The occluded area K was gradually increased, taking its values from the five-value array [0, 10, 20, 50, 70]. In the case of arbitrary occlusion, each tag was randomly overlapped with one of two different objects (i.e., metal scissors and a white strip) so that the object was located entirely within the tag's area and thus the overlap percentage was always kept constant. In [12] we used the AR-601M front-facing camera Basler acA640-90gc (Fig. 5) in a more complicated experiment design, where we added rotations of a marker (Fig. 4).
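
The systematic occlusion procedure was performed physically with a paper template; a synthetic analogue on a digital marker image can be sketched as follows. The function name `occlude_from_bottom`, the white fill value 255 and the 10 x 10 test image are assumptions made for illustration.

```python
def occlude_from_bottom(image, percent, fill=255):
    """Cover the bottom `percent` of the image rows with the fill value (white)."""
    rows = len(image)
    covered = round(rows * percent / 100)
    first_covered = rows - covered
    return [
        [fill] * len(row) if y >= first_covered else list(row)
        for y, row in enumerate(image)
    ]


# Occlusion levels K from the text, applied to a hypothetical all-black 10x10 marker
K_VALUES = [0, 10, 20, 50, 70]
marker = [[0] * 10 for _ in range(10)]
occluded = {k: occlude_from_bottom(marker, k) for k in K_VALUES}
```

Each occluded image could then be passed to a detector to record whether the marker is still recognized at the given K.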

Fig. 3. ARTag ID 3 (top set of images), AprilTag ID 4 (middle) and CALTag 4x4 (bottom) occlusion for 10, 20, 50, 70 percent (from left to right) using FaceCam 1000X.

Fig. 4. Experiment design of marker rotation experiments.

All experiments in [11, 12] were conducted manually, which imposed limitations on the work: a reasonable (but small) number of trials, limited accuracy in measuring marker rotation angles during the experiments, and the amount of time consumed by the experiments. For this reason, we continued our work using a similar experiment design but partially automated the experiments with a humanoid robot, as explained in the next sections.

Fig. 5. Rotation of ARTag (ID 34) marker with regard to the Z axis using Basler acA640-90gc.

3 Experimental Setup

For the experiments we used the AR-601M humanoid robot (Fig. 1) developed by the Russian company "Android Technics" [10]. The robot has 41 active degrees of freedom (DoF), and each of its two manipulators has 5 DoFs. Its head is equipped with two Basler AG cameras: a rear-view camera (Basler acA1300-60gc) and a front camera (Basler acA640-90gc). For this work we used only the front camera and both of the robot's manipulators, and controlled the servo drives of the neck and the head. To view the AR-601M front camera stream and to take and save images, we used the Pylon Viewer program; every twentieth camera frame was stored in order to obtain frames with different marker and manipulator positions and orientations for further image processing.

The official source code of AprilTag and CALTag was compiled and used for the experiments. For ARTag we used the ArUco library, which also detects and recognizes various other tag families [5]. For the field experiments the tags were printed on white paper with the following sizes:

  • ARTag: \(5.6 \times 5.6\) cm, total area 31.36 cm\(^2\)

  • AprilTag: \(5.8 \times 5.8\) cm, total area 33.64 cm\(^2\)

  • CALTag 4x4: \(4.9 \times 4.9\) cm, total area 24.01 cm\(^2\)

Each ARTag and AprilTag marker has its own unique ID, which is encoded in the internal pattern of the tag. We randomly selected ARTag markers with IDs 2, 3, 6, and 34 and AprilTags with IDs 4, 6, 8, and 9 for all experiments (laboratory and pseudo field experiments).

The small size of the markers used in the field experiments is dictated by the size of the end-effector (the palm of the AR-601M robot arm), where the tags were placed.

As noted above, the design of a marker depends on its intended application area, and a marker developed for one purpose may not be suitable for another. In our case, we plan to place the markers on the robot manipulator end-effectors, which increases the likelihood of arbitrary occlusion of a marker. This imposes the requirement that a marker should be resistant to overlap. We therefore performed experimental work to compare the resistance of ARTag, AprilTag, and CALTag markers to occlusion, which is defined as a partial overlap of the marker with other objects in the scene, potentially including other parts of the robot.

The experiments consisted of laboratory and pseudo field experiments; the technical design and lighting conditions were the same for both types of experiments (i.e., we carried out the experiments in daylight with the ceiling lamps switched on), but the environments in which the experiments were carried out differed. To conduct an experiment with fiducial markers, we first set the desired joint angles that provide an initial pose (position and orientation) of the AR-601M end-effector with good visibility of the marker, and selected an initial pose for the neck and the head of AR-601M (Fig. 6). Each marker was printed on a small sheet of white paper and fixed on the back of the palm of each AR-601M manipulator. The two manipulators performed the marker rotations in turn. We used the AR-601M software shell to control the robot's servo drives, check joint states and manage the robot configuration. After a set of rotations (with simultaneous capturing of camera frames) was completed, we replaced the current marker with a new one and repeated the procedure. In this way, for each marker ID we obtained 14 distinct images (frames) containing the marker. Finally, these images were processed by the detection and identification software of the corresponding fiducial marker system.

Fig. 6. Example of an experiment with AprilTag marker.

4 Experimental Results

Table 1 presents the results of the laboratory experiments (Fig. 7). For each marker ID the robot moved its palm for 3 min, during which 200 frames were captured with random delays of 0.1 to 2 s between the frames. Next, we randomly selected a subset \(F_{s}\) of 14 different frames from the set of 200 frames in the following manner: the first frame was selected completely at random and added to the set \(F_{s}\), while every subsequent frame was required to differ significantly in its content (in pixels) and by at least 0.5 s in time from all frames already in \(F_{s}\). This subset \(F_{s}\) of 14 frames for each marker was then used for marker detection and identification with the appropriate algorithms.
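
The frame selection procedure can be sketched as a randomized greedy filter. This is a minimal sketch under stated assumptions: the function names, the pixel-difference measure and the threshold `min_diff` are hypothetical, since the text only requires a "significant" difference in content; the 0.5 s minimum time difference and the subset size of 14 come from the text.

```python
import random


def pixel_difference(a, b):
    """Fraction of pixels that differ between two equally sized frames."""
    flat_a = [px for row in a for px in row]
    flat_b = [px for row in b for px in row]
    return sum(x != y for x, y in zip(flat_a, flat_b)) / len(flat_a)


def select_frames(frames, count=14, min_dt=0.5, min_diff=0.1, seed=None):
    """Pick up to `count` frames from a list of (timestamp_s, image) pairs.

    Candidates are visited in random order, so the first accepted frame is
    random; each later frame must differ from every selected frame by at
    least `min_dt` seconds and by at least `min_diff` in pixel content.
    """
    rng = random.Random(seed)
    candidates = list(frames)
    rng.shuffle(candidates)
    selected = []
    for t, img in candidates:
        if all(abs(t - ts) >= min_dt and pixel_difference(img, im) >= min_diff
               for ts, im in selected):
            selected.append((t, img))
            if len(selected) == count:
                break
    return selected


# Hypothetical data: 200 tiny synthetic frames captured 0.3 s apart
frames = [(0.3 * i, [[i] * 4 for _ in range(4)]) for i in range(200)]
subset = select_frames(frames, seed=1)
```

With real camera frames, `min_diff` would have to be tuned so that near-duplicate views of the marker are rejected.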

As a result, the ARTag marker system was the most resilient in laboratory conditions: the same (best) success rate of 92.8% (i.e., 13 recognized markers out of 14 input frames) was obtained for markers ID 2, ID 3 and ID 34. At the same time, CALTag and AprilTag showed a still satisfactory but significantly lower level of success (e.g., CALTag 4x4 recognition succeeded in 9 out of 14 frames, and AprilTag succeeded in 9 to 11 out of 14 frames, with the number of successful frames varying between IDs). In laboratory conditions many factors affected the recognition of a marker: the size of the marker, the manipulator pose with regard to the robot camera, and the input set of images.
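
The success rates follow directly from the frame counts. A minimal sketch (the function name `success_rate` is an assumption) shows the arithmetic: 13 of 14 frames is about 92.86% before rounding, which the text quotes as 92.8%, and 9 of 14 frames is about 64.3%.

```python
def success_rate(recognized, total):
    """Percentage of frames in which the marker was recognized."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * recognized / total


artag_best = success_rate(13, 14)   # ARTag IDs 2, 3 and 34 in the laboratory
caltag_lab = success_rate(9, 14)    # CALTag 4x4 in the laboratory
```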

Fig. 7. Experiments in laboratory (left) and pseudo field (right) environments.

Table 1. Laboratory experiments with AR-601M humanoid robot.

Table 2 presents the results of the (pseudo) field experiments (Fig. 7). For this type of experiment we used 28 randomly selected frames for each marker ID. In comparison with the results of the previous experiments, AprilTag performed better, with an average success rate of 97.3%. The success rates of ARTag and CALTag were significantly lower: 83% and 64.3%, respectively. We believe that the results of the field experiments depended strongly on frame selection and lighting conditions; this issue is further discussed in Sect. 5.

Table 2. Pseudo field experiments with AR-601M humanoid robot.

5 Conclusions and Future Work

Fiducial markers are becoming an alternative to the classical camera calibration method based on a checkerboard pattern and its variations. In this paper we investigated the CALTag, ARTag and AprilTag fiducial marker patterns in laboratory and field environments. We conducted a series of experiments to study the weaknesses and strengths of the selected markers. Rotations and occlusions were selected as the comparison criteria because these are the situations that occur most frequently in real-world operation; e.g., a marker could be occluded by an object between the robot and the marker, or the marker could be rotated in various ways with regard to the robot camera due to robot locomotion within its workspace.

After the series of experiments we concluded that marker detection and identification results obtained under ideal conditions cannot be directly transferred to real-world calibration tasks. According to our previous work [11, 12], the experiments on marker detection and recognition under idealized conditions demonstrated that AprilTag and ARTag are highly sensitive to edge overlap, while CALTag, due to its design and detection algorithm, can be detected with the pattern's edge overlapped by up to 50% of the marker's area. Moreover, while AprilTag, CALTag and ARTag all showed resistance to overlap of their interior by small complex objects and small geometric objects, the CALTag system demonstrated the best resistance to such overlap. Overall, the manual experiments with large markers suggested that, among AprilTag, ARTag and CALTag, the best performance should be expected from CALTag.

Since the new experiments were similar in design to the previously conducted manual experiments, we expected similar results. Yet the new pilot experiments in the laboratory and the (pseudo) field environments demonstrated almost the opposite. ARTag demonstrated the highest success rate in the laboratory experiments, 89.25% on average. AprilTag demonstrated the highest success rate in the field experiments, 97.3% on average. CALTag this time had the lowest success rate, 64.3% on average, in the laboratory experiments as well as in the field experiments. We believe that the reason for the CALTag failure was its weak resistance to scaling of the marker size. Our ongoing work concentrates on extending these pilot experiments in order to confirm the obtained results with a statistically significant number of laboratory and field experiments. Ultimately, this should lead to a new framework for self-calibration of the cameras and manipulators of the AR-601M humanoid robot, which could be further extended to other types of robots.