1 Introduction

Augmented reality (AR) is a new kind of computer application and human-computer interaction technology that has emerged with the development of virtual reality technology [1]. It combines a computer-generated virtual environment with the user's real scene by means of photoelectric display technology, interactive technology, multi-sensor technology, computer graphics and multimedia technology, so that the user perceives the virtual environment as an integral part of the surrounding real scene. Unlike immersive virtual reality, augmented reality builds mainly on the existing real world, providing users with a new composite sensory and visual experience and expanding human cognition and perception of the world. Augmented reality technology not only offers a strong sense of realism and a small modeling workload, but is also more secure and reliable.

Mobile augmented reality applications rely on the software and hardware of the mobile device: the device camera captures real-world images, the relevant information is computed, virtual scenes are integrated, and the result is finally output to the screen, a projector or another display device [4]. When a two-dimensional barcode image is acquired by the camera of a mobile phone and decoded, the acquired two-dimensional code is inevitably tilted. When the tilt angle exceeds a certain range, the barcode cannot be decoded correctly. Therefore, it is necessary to correct the tilt of the two-dimensional code.

The traditional tilt correction algorithms for two-dimensional codes are the Hough transform and the Fourier transform [6]. The Fourier transform method takes the azimuth angle with the largest density in Fourier space as the tilt angle of the two-dimensional barcode. Its computational complexity is very high and it is rarely used at present. The Hough transform is the most commonly used method for detecting the inclination angle. However, in real-time processing on mobile phones, where the phone is used for the augmented reality display of 3D models, unpredictable errors may appear. The approach in this paper therefore first records the spatial coordinates of the two-dimensional code contour and then applies the least squares method (LSM) to them. Linear regression of these points by LSM gives the slope of the fitted line, from which the tilt of the two-dimensional code is known. This can effectively reduce the computation time.

2 Overview of 2D Code

As a new information storage and transmission technology, the two-dimensional code can express text in many languages as well as image information, and offers high capacity, high density, strong error correction capability, fast positioning and automatic encoding and decoding [3]. Scene information can be encoded into the marker and recovered by decoding the marker, which clearly distinguishes it from the traditional ARToolKit markers, which serve only for locating and recognizing [7]. A common marker is the QR code, as shown in Fig. 1. As a kind of two-dimensional code, the QR code builds on computer image processing technology combined with coding principles, and supports automatic recognition and reading.

Fig. 1. Two kinds of mark points

2.1 QR Code Internal Structure

The QR code is a matrix-type two-dimensional code in the shape of a square, consisting of a coding region (version information, format information, and data and error correction code words) and function patterns (finder patterns, separators, and alignment patterns) [5]. The coding region is the region that encodes data or error correction code words, while the function patterns are specific patterns in the symbol used for symbol localization and feature recognition. The finder patterns, located at three corners, help to determine the position, size and inclination of the symbol. The symbol is surrounded by a blank area. Figure 2 shows the internal structure of the QR code. In the symbol, a dark module represents the binary “1” and a light module represents the binary “0” [9].

Fig. 2. QR code structure

3 The Tilt Correction Principle Based on LSM

According to the structure of the QR code, we first locate the position detection pattern and record the spatial coordinates of its left boundary, constructing the data set {X, Y}, where X and Y are the input vectors that store the spatial coordinates of the boundary points of the contour [2].

After rotation, the left edge of the 2D barcode must be perpendicular to the X axis, then:

$$ \mathop {\hbox{min} }\nolimits_{{}} \sum\nolimits_{i = 1}^{n} {\left( {\omega \left( {x_{i} ,y_{i} } \right) - \omega \left( {\overline{x,y} } \right)} \right)^{2} } $$
(1)

A linear regression f(x) = ωx + b can be fitted to the constructed data set {X, Y}. The regression error of a sample point is:

$$ e_{i} = f\left( x \right) - Y = \omega x + b - Y $$
(2)

Substituting Eq. 2 into Eq. 1 gives:

$$ \begin{aligned} & \quad \quad \min \sum\limits_{i = 1}^{n} \left( \omega \left( x_{i} ,y_{i} \right) - \omega \left( \overline{x,y} \right) \right)^{2} \\ & = \min \sum\limits_{i = 1}^{n} \left( e_{i} - b + Y - \left( \overline{e - b + Y} \right) \right)^{2} \\ & = \min \sum\limits_{i = 1}^{n} \left( e_{i} \right)^{2} \\ \end{aligned} $$
(3)

Therefore, it suffices to perform a regression on the data set {X, Y}. When the variance of the error \( e_{i} \) is minimal, the parameter ω gives the tilt. In this way, the problem of optimizing the image tilt angle is transformed into a regression on the data set {X, Y} that identifies the parameter ω.

Suppose there exists a univariate regression model f(x) = ax + b, where f(x) is the regression of the random variable Y on the independent variable x. The univariate regression model is:

$$ y = ax + b + \varepsilon, \quad \varepsilon \sim N\left( 0,\sigma^{2} \right) $$

Here the parameters a and b do not depend on the independent variable x.

Let the left boundary of the contour of a two-dimensional code be described by N mutually independent feature points [6]. The deviation between a sample and its estimated value is:

$$ \delta_{i} = y_{i} - f\left( {x_{i} } \right) = y_{i} - ax_{i} - b $$

From Eq. 3, the tilt is obtained when the sum of squared deviations is minimal, that is:

$$ \mathop {\hbox{min} }\nolimits_{{}} \delta^{2} = \mathop {\hbox{min} }\nolimits_{{}} \sum\nolimits_{i = 0}^{n - 1} {\left( {y_{i} - ax_{i} - b} \right)^{2} } $$
(4)

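According to the KKT conditions, which for this unconstrained problem reduce to setting the partial derivatives with respect to a and b to zero, the optimal values of a and b are: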

$$ a = \frac{N\sum\nolimits_{i = 0}^{n - 1} x_{i} y_{i} - \sum\nolimits_{i = 0}^{n - 1} x_{i} \sum\nolimits_{i = 0}^{n - 1} y_{i}}{N\sum\nolimits_{i = 0}^{n - 1} x_{i}^{2} - \sum\nolimits_{i = 0}^{n - 1} x_{i} \sum\nolimits_{i = 0}^{n - 1} x_{i}} $$
(5)
$$ b = \frac{\sum\nolimits_{i = 0}^{n - 1} x_{i}^{2} \sum\nolimits_{i = 0}^{n - 1} y_{i} - \sum\nolimits_{i = 0}^{n - 1} x_{i} \sum\nolimits_{i = 0}^{n - 1} x_{i} y_{i}}{N\sum\nolimits_{i = 0}^{n - 1} x_{i}^{2} - \sum\nolimits_{i = 0}^{n - 1} x_{i} \sum\nolimits_{i = 0}^{n - 1} x_{i}} $$
(6)

Here N = n is the number of boundary points. The value of a obtained in this way is the slope of the straight line fitted to the left edge, from which the inclination angle of the two-dimensional code relative to the horizontal direction can be obtained.
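As an illustration, Eqs. (5) and (6) can be evaluated directly from the recorded boundary points. The following is a minimal sketch, not the implementation used in this paper; the function names `fit_left_edge` and `tilt_angle` and the representation of the points as (x, y) tuples are assumptions made for the example.

```python
import math


def fit_left_edge(points):
    """Least-squares line y = a*x + b through (x, y) points, per Eqs. (5)-(6).

    `points` is a hypothetical list of (x, y) boundary coordinates.
    """
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)

    # Common denominator of Eqs. (5) and (6).
    denom = n * sum_xx - sum_x * sum_x
    if denom == 0:
        # All x values are equal: the edge is vertical and the slope is undefined.
        raise ValueError("cannot fit a slope to points with identical x")

    a = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_xx * sum_y - sum_x * sum_xy) / denom
    return a, b


def tilt_angle(points):
    """Tilt angle in radians, taken as arctan of the fitted slope."""
    a, _ = fit_left_edge(points)
    return math.atan(a)
```

The sums are accumulated in a single pass over the points, so the cost grows linearly with the number of boundary points. When the edge is nearly vertical the denominator approaches zero and the slope becomes ill-conditioned; in that case one option is to swap the roles of x and y before fitting.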

4 QR Code Correction Process

To detect the QR code in mobile augmented reality, the captured 2D code is first binarized so that the later operations are more accurate. The coordinates of the boundary points of the QR code are then obtained, and the final image is produced through distortion correction, tilt correction and bilinear interpolation mapping. Finally, the 3D model is obtained by decoding the 2D code image and is output to the screen, a projector or another display device for augmented reality display. The flowchart is shown in Fig. 3:

Fig. 3. Flowchart

4.1 Image Preprocessing

There are two common formulas for converting a color image into a grayscale image:

$$ {\text{Y}} = 0.299{\text{R}} + 0.587{\text{G}} + 0.114{\text{B}} $$
(7)
$$ {\text{Y}} = \left( {{\text{R}} + {\text{G}} + {\text{B}}} \right)/3 $$
(8)

Here, we instead use the formula Y = (R + G)/2, ignoring the B component, for the grayscale conversion of the two-dimensional code image. Binarization uses an adaptive threshold segmentation algorithm. The basic idea of the algorithm is to compute the average gray value of S pixels while traversing the whole image. When the value of a pixel is lower than this average, the pixel is set to black; otherwise it is set to white.
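The preprocessing step can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how the S pixels are chosen, so the sketch takes them to be a square window of (2·radius + 1)² pixels centred on the current pixel and clipped at the image border. The function names and the nested-list image representation are likewise hypothetical.

```python
def to_gray_rg(rgb_image):
    """Grayscale conversion Y = (R + G) / 2; the B component is ignored.

    `rgb_image` is a list of rows, each row a list of (R, G, B) tuples.
    """
    return [[(r + g) / 2.0 for (r, g, _b) in row] for row in rgb_image]


def adaptive_binarize(gray, radius=1):
    """Binarize against a local mean.

    Assumption: the "S pixels" are the (2*radius + 1)^2 window around each
    pixel, clipped at the border. A pixel below its local mean becomes
    0 (black); every other pixel becomes 255 (white).
    """
    height, width = len(gray), len(gray[0])
    out = [[255] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            rows = range(max(0, y - radius), min(height, y + radius + 1))
            cols = range(max(0, x - radius), min(width, x + radius + 1))
            window = [gray[j][i] for j in rows for i in cols]
            mean = sum(window) / len(window)
            if gray[y][x] < mean:
                out[y][x] = 0
    return out
```

Recomputing the window for every pixel keeps the sketch short; a running sum or integral image would avoid the repeated work on a phone.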

4.2 Barcode Location

After the bar code is binarized, it needs to be located. According to the structure of the QR code, we locate the QR code, record the spatial coordinates of its boundary, and construct the data set {X, Y}, where X and Y are the input variables.

4.3 Distortion Correction

In practice, QR code images captured by the camera exhibit varying degrees of distortion. Any geometric distortion can be defined by an equation that transforms the undistorted coordinate system (x, y) into the distorted coordinate system \( \left( x^{\prime}, y^{\prime} \right) \), generally of the form:

$$ \left\{ {\begin{array}{*{20}c} {x^{{\prime }} = h_{1} \left( {x,y} \right)} \\ {y^{{\prime }} = h_{2} \left( {x,y} \right)} \\ \end{array} } \right. $$
(9)

Let f(x, y) be the original image without distortion and \( g\left( x^{\prime}, y^{\prime} \right) \) be the result of distorting f(x, y). The distortion process is assumed to be known and is defined by the functions h1(x, y) and h2(x, y):

$$ {\text{g}}\left( {{\text{x}}^{{\prime }} ,{\text{y}}^{{\prime }} } \right) = f\left( {{\text{x}},{\text{y}}} \right) $$
(10)

Equation (10) states that the grey value that should appear at pixel (x, y) of the image appears, because of the distortion, at \( \left( x^{\prime}, y^{\prime} \right) \). The distortion problem can therefore be solved by a mapping transformation. Given \( g\left( x^{\prime}, y^{\prime} \right) \), h1(x, y) and h2(x, y), the restoration process is as follows:

  1. For each point (x0, y0) in f(x, y), find the corresponding position in \( g\left( x^{\prime}, y^{\prime} \right) \): (M, N) = [h1(x0, y0), h2(x0, y0)], where M and N are the coordinate values of the spatial point. Since M and N are not necessarily integers, (M, N) generally does not coincide with any pixel in \( g\left( x^{\prime}, y^{\prime} \right) \).

  2. Find the point \( \left( x_{1}^{\prime}, y_{1}^{\prime} \right) \) in \( g\left( x^{\prime}, y^{\prime} \right) \) nearest to (M, N) and let f(x0, y0) = \( g\left( x_{1}^{\prime}, y_{1}^{\prime} \right) \), that is, assign the grayscale value of \( g\left( x_{1}^{\prime}, y_{1}^{\prime} \right) \) to f(x0, y0). Proceed in this way point by point until the entire image has been processed and the geometry is corrected.
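The two restoration steps above can be sketched in a few lines. This is a minimal illustration assuming that the distortion functions h1 and h2 are available as Python callables and that images are nested lists indexed as `image[y][x]`; the name `restore_nearest` and the `fill` value used for positions that map outside the distorted image are assumptions of the example.

```python
import math


def restore_nearest(g, h1, h2, fill=0):
    """Recover f from the distorted image g by nearest-neighbour mapping.

    g      : distorted image as a list of rows, g[y][x]
    h1, h2 : hypothetical callables giving x' = h1(x, y) and y' = h2(x, y)
    fill   : value used where (M, N) falls outside g (an assumption)
    """
    height, width = len(g), len(g[0])
    f = [[fill] * width for _ in range(height)]
    for y0 in range(height):
        for x0 in range(width):
            # Step 1: position (M, N) of this pixel in the distorted image.
            m, n = h1(x0, y0), h2(x0, y0)
            # Step 2: nearest pixel to (M, N); halves are rounded upwards.
            x1 = int(math.floor(m + 0.5))
            y1 = int(math.floor(n + 0.5))
            if 0 <= x1 < width and 0 <= y1 < height:
                f[y0][x0] = g[y1][x1]
    return f
```

Iterating over the output pixels rather than the distorted ones guarantees that every pixel of the restored image receives exactly one value, so no holes appear.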

4.4 Tilt Correction

After the distortion correction, we record the left border of the bar code and use Eqs. (5) and (6) to calculate the tilt angle \( \alpha = \arctan a \). Once the tilt angle is obtained, the bar code is rotated according to the rotation formula, that is

$$ \left[ {\begin{array}{*{20}c} {x_{new} } \\ {y_{new} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {\cos \,\alpha } & {\sin \,\alpha } \\ { - \sin \,\alpha } & {\cos \,\alpha } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {x_{old} } \\ {y_{old} } \\ \end{array} } \right] $$
(11)

If Eq. (11) is used directly, then for an image of width W and height H the rotation requires 4WH multiplications and 2WH additions, which is computationally expensive. In fact, for a two-dimensional code image of height H, the maximum deviation of the projection in the vertical direction is:

$$ Y_{max} = H\,\tan \,\alpha $$

The deviations for the jth column and the ith row in the image are, respectively (Fig. 4).

Fig. 4. Spatial mapping of control points

$$ \Delta y_{j} = \left( {\frac{j}{{Y_{max} + 1}}} \right) \times W $$
$$ \Delta x_{i} = \left( {\frac{i}{{X_{max} + 1}}} \right) \times H $$
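For reference, the direct rotation of Eq. (11), with the angle taken as the arctangent of the fitted slope a, can be sketched as below. It illustrates only the baseline point-by-point rotation, not the faster projection-based variant; the function names are assumptions of the example.

```python
import math


def rotate_point(x_old, y_old, alpha):
    """Apply the rotation matrix of Eq. (11) to a single point."""
    c, s = math.cos(alpha), math.sin(alpha)
    x_new = c * x_old + s * y_old
    y_new = -s * x_old + c * y_old
    return x_new, y_new


def level_points(points, slope):
    """Rotate (x, y) points by alpha = arctan(slope).

    Points lying on a line through the origin with the given slope end up
    on the x axis, which is what removing the tilt means here.
    """
    alpha = math.atan(slope)
    return [rotate_point(x, y, alpha) for x, y in points]
```

Each point costs four multiplications and two additions, which is where the 4WH and 2WH counts come from when the formula is applied to every pixel of a W × H image.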

4.5 Bilinear Interpolation

During the image rotation transformation, some points may fall on non-integer positions. A gray-scale interpolation algorithm is therefore required to produce a smooth mapping that maintains continuity and connectivity. Interpolation methods include nearest neighbor interpolation, bilinear interpolation and higher-order interpolation. Nearest neighbor interpolation is simple, but it produces images with visibly jagged boundaries. For the QR code, which contains only black and white, bilinear interpolation produces a satisfactory restoration result [8]. The mathematical model is shown in Fig. 5:

Fig. 5. Mathematical model of bilinear interpolation

First, first-order linear interpolation gives:

$$ {\text{f}}\left( {{\text{x}},0} \right) = {\text{f}}\left( {0,0} \right) + {\text{x}}\left[ {{\text{f}}\left( {1,0} \right) - {\text{f}}\left( {0,0} \right)} \right] $$

Similarly:

$$ \begin{aligned} {\text{f}}\left( {{\text{x}},1} \right) & = {\text{f}}\left( {0,1} \right) + {\text{x}}\left[ {{\text{f}}\left( {1,1} \right) - {\text{f}}\left( {0,1} \right)} \right] \\ {\text{f}}\left( {{\text{x}},{\text{y}}} \right) & = {\text{f}}\left( {{\text{x}},0} \right) + {\text{y}}\left[ {{\text{f}}\left( {{\text{x}},1} \right) - {\text{f}}\left( {{\text{x}},0} \right)} \right] \\ \end{aligned} $$

Combining the three formulas above gives:

$$ \begin{aligned} {\text{f}}\left( {{\text{x}},{\text{y}}} \right) & = \left[ {{\text{f}}\left( {1,0} \right) - {\text{f}}\left( {0,0} \right)} \right]{\text{x}} + \left[ {{\text{f}}\left( {0,1} \right) - {\text{f}}\left( {0,0} \right)} \right]{\text{y}} \\ & \quad \quad \quad \quad + \left[ {{\text{f}}\left( {1,1} \right) + {\text{f}}\left( {0,0} \right) - {\text{f}}\left( {0,1} \right) - {\text{f}}\left( {1,0} \right)} \right]{\text{xy}} + {\text{f}}\left( {0,0} \right) \\ \end{aligned} $$
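The combined formula translates directly into code. The sketch below is illustrative only: `bilinear_unit` evaluates the formula on the unit square, and `sample_bilinear` applies it to an image stored as nested lists indexed as `image[y][x]`. Both names, the image layout and the clamping at the right and bottom borders are assumptions of the example.

```python
import math


def bilinear_unit(f00, f10, f01, f11, x, y):
    """Combined bilinear formula on the unit square, with 0 <= x, y <= 1.

    f00 = f(0,0), f10 = f(1,0), f01 = f(0,1), f11 = f(1,1).
    """
    return ((f10 - f00) * x
            + (f01 - f00) * y
            + (f11 + f00 - f01 - f10) * x * y
            + f00)


def sample_bilinear(image, x, y):
    """Interpolate `image` at a non-integer position (x, y).

    The image needs at least 2 x 2 pixels, and (x, y) must lie inside it.
    """
    height, width = len(image), len(image[0])
    # Top-left corner of the enclosing cell, clamped so that the cell
    # stays inside the image at the right and bottom borders.
    x0 = min(int(math.floor(x)), width - 2)
    y0 = min(int(math.floor(y)), height - 2)
    return bilinear_unit(image[y0][x0], image[y0][x0 + 1],
                         image[y0 + 1][x0], image[y0 + 1][x0 + 1],
                         x - x0, y - y0)
```

At the four corners of a cell the result equals the stored pixel values, and at the centre it equals their average, which gives a quick sanity check for an implementation.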

After this series of operations, the original two-dimensional code image is transformed into a more standard image, as shown in Fig. 6:

Fig. 6. Comparison before and after correction

4.6 Download the 3D Model and Display It

As shown in Fig. 7, the content of the QR code is the URL address of a soldier model. The corresponding three-dimensional model can easily be downloaded from the server and displayed in augmented reality through the camera.

Fig. 7. Augmented reality display

5 Conclusion and Outlook

This paper focuses on a tilt correction method for two-dimensional codes in mobile AR. The LSM effectively improves the decoding time and the decoding success rate. The three-dimensional virtual model is first downloaded from the server and loaded into memory. The model is then rendered to the phone screen, ultimately achieving the augmented reality display on the target QR code. However, this method only supports the identification of a single QR code. The next step is to consider a method that supports the recognition of multiple QR codes and to combine it with current state-of-the-art local feature point algorithms to realize fast and effective recognition of multiple QR codes.