Keywords

1 Introduction

Barcode technology has been widely used in many industries since 1940s [1]. In retail industry, almost every item is labeled with at least one form of barcode. Barcodes are always considered as the identity of merchandise and have brought great convenience for commodity management. Recognition of barcode using dedicated scanner, such as laser scanner, is the most common way to obtain the information of a barcode. Yet, the dedicated scanner is inefficient since it relies on manual operations to locate each barcode. Hence, how to access the information of barcode efficiently and effectively becomes a key problem. In this paper, we discuss the scheme of detection and recognition multiple barcodes in the image in order to improve working efficiency.

Research interests and activities in accessing the information of barcodes based on images have been increased significantly. Given an image that contains a barcode, detection and recognition are two essential steps to obtain the information of it. Several approaches to locate the position of barcode have been proposed. Chai and Hock identified parallel line patterns at block level to determine the location of the barcode [2]. Zhang et al. proposed a method for automatically localizing the barcode in complex scenes based on edge orientation histograms [3]. Alexander Tropf and D. Chai proposed to localize the barcode in DCT domain, under the assumption that the barcode occupies more than 10 percent of the whole image [4]. As for barcode recognition, binarization and edge detection are two typical methods. For example, Wachenfeld et al. used a single scanning line, binarized by the adaptive threshold [5]. Liu computed the average gray value of each column and binarized the values by a single threshold [6]. Kresic-Juric et al. found the potential edges by computing the derivatives of the gray value on the scanning line [7]. Several approaches to identify the barcode using peak locations of waveform have been proposed [8, 9]. The waveform was obtained by scanning a barcode with the dedicated scanner. In contrast to previous approach, Orazio Gallo et al. proposed to recognize the barcode image using deformable templates without binarization [10], which have made significantly contributions to the development of barcode recognition.

However, most of these methods focus on the detection and recognition of a single barcode in an image. It is expected that an algorithm can process multiple barcodes in an image, so that the efficiency of barcode processing can be improved on a large scale. An example of the image containing multiple EAN-13 barcodes is shown in Fig. 1. It can be seen from this figure, the area of each barcode is quiet small compared to other commodity items, which brings great interference for barcode detection. The width of each barcode in the image is about 200 pixels. An EAN-13 barcode is composed of 95 modules while the bars and spaces are made up of \( i \) modules, where \( i = \{ 1,2,3,4\} \). Therefore, the width of the narrowest bar (or space) of an EAN-13 barcode is about 2 pixels. The exact width of each bar can hardly be obtained in pixels under such resolution. All of these factors, possibly combined with defocus and noise, make barcode detection and recognition difficult in such situation.

Fig. 1.
figure 1

An example of complex image that contains multiple barcodes

In the paper, we present an effective method for detecting and recognizing multiple barcodes in complex scenes. Experiments have been conducted for the detection and recognition algorithm on a variety of complex images. Satisfactory results have been achieved in the paper.

This paper is organized as follows: in the next section, the barcode detection algorithm is described in detail. Section 3 presents the specific procedures of barcode recognition. The experiment results of the approach and discussions are given in Sect. 4. Section 5 gives the summary of the paper.

2 Detection and Localization of Barcodes in Complex Scenes

As is shown in Fig. 1, the complex images are composed of various commodities and relevant barcodes. The area of each barcode is quiet small compared to the complex scenes. In order to locate the barcode areas in the complex scenes, we divide the images into sub-blocks of equal size and extract the features of each sub-block separately. Assuming that the size of the image is \( M \times N \), the size of each sub-block is selected as \( m \times n \) with \( m \ll M \) and \( n \ll N \). \( m \) and \( n \) are selected according to the resolution of original images and the area of each barcode. Then the feature map can be constructed by the feature values and the barcodes can be located by binarizing the feature map.

2.1 Feature Extraction

In order to detect barcode areas, we need to extract the feature of each sub-block. Barcode is composed of a collection of bars and spaces. Thus, the barcode image is characterized by strong horizontal gradients and weak vertical gradients due to its special texture feature. The horizontal gradient and the vertical gradient of the \( k \)-th sub-block can be defined as:

$$ H_{k} = \frac{1}{m \times n}\sum\limits_{j = 0}^{n - 1} {\left( {\left| {\sum\limits_{i = 0}^{m - 1} {f\left( {x_{k} + i,y_{k} + j} \right)} - \sum\limits_{i = 0}^{m - 1} {f\left( {x_{k} + i,y_{k} + j - 1} \right)} } \right|} \right)} $$
(1)

and

$$ V_{k} = \frac{1}{m \times n}\sum\limits_{i = 0}^{m - 1} {\left( {\left| {\sum\limits_{j = 0}^{n - 1} {f\left( {x_{k} + i,y_{k} + j} \right) - } \sum\limits_{j = 0}^{n - 1} {f\left( {x_{k} + i - 1,y_{k} + j} \right)} } \right|} \right)} $$
(2)

Where \( \left( {x_{k} ,y_{k} } \right) \) denote the coordinates of the pixel at upper-left corner of the sub-block. \( f\left( {x_{k} ,y_{k} } \right) \) denotes the gray value of the pixel at \( \left( {x_{k} ,y_{k} } \right) \). The sub-blocks that located in the barcode regions will produce large horizontal gradients and small vertical gradients, and the ratios of the horizontal gradients to the vertical gradients will be larger. While the sub-blocks that located in other regions may produce large horizontal gradients or small vertical gradients or even neither, the ratios will be smaller than that of barcode regions. Hence, we use the ratio of the horizontal gradient to the vertical gradient as the feature parameter. That is,

$$ r_{k} = \frac{{H_{k} }}{{\hbox{max} (V_{k} ,1)}} $$
(3)

Note that \( \hbox{max} \left( {V_{k} ,1} \right) \) in (1) is used as the denominator because the value of \( V_{k} \) may be close to zero under some special circumstance. The areas with large ratio values are regarded as barcode region candidate.

2.2 Barcode Detection

The ratios of all sub-blocks in the image can be obtained as mentioned above. The ratio map is constructed by assigning the ratio value of each sub-block to all the pixels in it. The ratio map of Fig. 1 is shown in Fig. 2a. The ratio map illustrates that the gray values of the barcode areas are much larger than other regions, since the barcodes possess large ratio values. In order to separate the barcode regions from the background, we binarize the ratio map with an adaptive threshold. The statistical distribution illustrates that most of the values distribute among the range of \( (\mu - 3\sigma ,\mu + 3\sigma ) \). The ratio values of barcode areas are much larger than that of other regions. Hence, the threshold is selected as:

Fig. 2.
figure 2

An example of the ratio map and the binary map

$$ T_{s} = \mu + 3\sigma $$
(4)

Where \( \mu \) and \( \sigma \) are the mean and standard deviation of the values of all pixels in the ratio map, respectively. Figure 2b shows the binary map of Fig. 2a. In the binary map, both the actual barcode regions and irregular interference areas are segmented from the background. The geometrical information of each connected domain can be obtained, such as the edges, the width and the height. The connected domains that satisfy the geometrical feature of barcode can be regarded as barcode areas.

The detailed operation steps of barcode detection are as follows:

  1. (i)

    Divide the original image into non-overlapping sub-blocks of equal size \( m \times n \).

  2. (ii)

    Compute the ratio of the horizontal gradient to the vertical gradient in each sub-block.

  3. (iii)

    Construct the ratio map by assigning the ratio value of each sub-block to all the pixels in it.

  4. (iv)

    Segment the potential barcode regions by binarizing the map with an adaptive threshold \( T_{s} \), which is define by the Eq. (4).

  5. (v)

    Search the connected domains in the binary map and regard the connected domains which satisfy the geometrical feature as barcode regions.

In this way, multiple barcodes in a complex image can be detected. Figure 3 gives the result of barcodes detection with our algorithm. The detected barcodes are highlighted in red frames. It can be seen that all of the barcodes in this complex image have been located successfully.

Fig. 3.
figure 3

The result of barcodes detection with our algorithm

3 Barcode Recognition

The information of a barcode is encoded in the width of bars and spaces. Hence, how to determine the width of each bar becomes a key question for barcode recognition. Unfortunately, the barcodes that detected from complex scenes are often of low quality. Figure 4 shows the example of the detected barcode image. It can be seen that the boundaries of adjacent bars are difficult to be detected in such low quality images. In order to obtain the accurate width of each bar, we propose to construct a binary barcode image based on the projection curve of the detected barcode.

Fig. 4.
figure 4

An example of the detected barcode

3.1 Barcode Projection Curve

In contrast to the previous barcode identification approaches which analyzed a single scan line of the barcode [710], we adopt the projection curve, which can be acquired by averaging the gray values of the image along the vertical direction, to represent the whole barcode image. However, the orientations of barcodes are not always vertical, in this case, the projection should be processed along the slanted directions. Moreover, the bars share different slanted angles within a single barcode under some special circumstance, which is shown in Fig. 5. The slanted direction of the barcode can’t be represented by a unitary angle. Hence, we divide the barcode image into three equal parts and compute the slanted angel of each part separately.

Fig. 5.
figure 5

A barcode with different slanted angles are divided into three equal parts

The slanted angle of each part is defined as:

$$ \theta_{i} = \arg \,\mathop {\hbox{max} }\limits_{\theta } \;V_{i} (\theta ),\,i = 1,2,3 $$
(5)

Where

$$ H_{i} (\theta ) = \sum\limits_{{y = \frac{w}{3}(i - 1)}}^{{\frac{w}{3}i - 1}} {\left( {\left| {\sum\limits_{x = 0}^{h - 1} {f(x,y + x \cdot \tan \theta ) - \sum\limits_{x = 0}^{h - 1} {f(x,y + x \cdot \tan \theta - 1)} } } \right|} \right)} ,\,i = 1,2,3 $$
(6)

is the horizontal gradient of the \( i \)-th part along the slanted angle of \( \theta \) with \( - 15^{ \circ } \le \theta \le 15^{ \circ } \). Where \( w \) and \( h \) are the width and the height of the barcode image, respectively. The angle \( \theta_{i} \) that produces the biggest horizontal gradient of the \( i \)-th part is regarded as the slanted angle of it.

In order to avoid the abrupt change of the slanted angles between different parts, we adopt proper adjustment towards the slanted angles of each column, so that the slanted angles can change in a linear fashion. Assuming that the slanted angles of the middle column of each part are \( \theta_{1} \), \( \theta_{2} \) and \( \theta_{3} \), respectively. The slanted angle of the \( j \)-th column between the middle column of the first part and the second part is defined as follows:

$$ \theta_{j} = \frac{{3(\theta_{2} - \theta_{1} )}}{w}(j - \frac{w}{6}) + \theta_{1} ,\quad \frac{w}{6} < j < \frac{w}{2} $$
(7)

Where \( w \) is the width of the detected barcode. Similarly, the slanted angle of other columns can be computed with the same approach.

In this way, the slanted angle of each column in the barcode can be obtained. The projection value of each column can be computed by averaging the gray values of pixels on its slanted direction. Let \( L(n) \) denotes the projection value of the \( n \)-th column of the barcode. All the projection values (\( L(n) \), \( n = 1,2, \ldots ,w \)) are connected by a curve, which is called the barcode projection curve, as is shown in Fig. 6. The horizontal coordinate of the curve denotes the column of the barcode, while the vertical coordinate denotes the projection value of the corresponding column. By comparing the projection curve with the original barcode, we find that the local extrema of the curve locate somewhere near the intermediate position of the bars and spaces. In addition, the peaks in the curve correspond to the spaces of the original barcode, while the valleys correspond to the bars. The location and number of local extrema are important features for barcode recognition. Moreover, it’s easier to locate the local extrema of the curve than to detect the edges of adjacent bars, especially for low-resolution barcode. Hence, the positions of local extrema are located prior to the edges of adjacent bars.

Fig. 6.
figure 6

An example of barcode projection curve

3.2 Localization of Local Extrema

In order to localize the entire barcode region, the first and the last local extrema, which correspond to the leftmost and rightmost bars of the barcode, need to be located at first. Since the leftmost and rightmost bars are bordered by the white regions, the values of the first and the last extremum points are much smaller than the projection values of the white regions. We proceed inward from each end of the projection curve and stop when we find a local minimum point whose value is less than \( Th_{1} \), which is defined as

$$ Th_{1} = \mu - \sigma $$
(8)

where \( \mu \) and \( \sigma \) are the mean and standard deviation of \( L(n) \) (\( n = 1,2, \ldots ,w \)), respectively. We use \( (e_{1} ,L(e_{1} )) \) and \( (e_{N} ,L(e_{N} )) \) to denote the first and last local extremum points in the projection curve, respectively.

Then, the rest local extrema can be located between \( (e_{1} ,L(e_{1} )) \) and \( (e_{N} ,L(e_{N} )) \). Generally, the total number of local extrema equals to summation of the number of bars and spaces. Owing to the ripples introduced by noises and illumination distortions, excessive extrema may occur in the projection curve, as is shown in Fig. 6. It can be seen from this figure, three extrema occur within a single space with 104 \( \le n \le \) 112. But only one extrmum can be regarded as effective extrmum. Another threshold is introduced to get rid of redundant extrema. The threshold is defined as

$$ Th_{2} = \sigma $$
(9)

Where \( \sigma \) is the standard deviation of \( L(n) \) (\( n = 1,2, \ldots ,w \)). A local extremum is treated as an effective extremum only when the absolute difference between the value of the current extremum and the previous effective extremum larger than \( Th_{2} \). In this way, the excessive extrema that introduced by noises can be eliminated. All the effective extrema can be represented as \( \{ (e_{1} ,L(e_{1} )),(e_{2} ,L(e_{2} )), \) \( \ldots ,(e_{N} ,L(e_{N} ))\} \).

3.3 High-Resolution Reconstruction of Barcode Image

The barcodes located in complex scenes suffer from blur and low-resolution. Most of these barcodes have a module width of 1–2 pixels. The exact width of each bar can hardly be obtained in pixels under such resolution. R. Shams and P. Sadeghi proposed to achieve sub-pixel accuracy by splitting the pixels that close to the boundaries of bars [9]. We adopt image interpolation algorithm to improve the resolution of barcode and reconstruct a binary barcode image by binarization, so that the exact width of each bar can be obtained.

Firstly, we extend every two adjacent midpoints in the projection curve linearly. The midpoints that corresponding to the spaces are extended to 255, while that corresponding to the bars to 0. Other pixels between two adjacent midpoints are extended linearly. That is,

$$ \hat{L}(n) = \left\{ {\begin{array}{*{20}c} {0,\quad {\kern 1pt} {\kern 1pt} {\kern 1pt} \quad \quad \quad L(n) \le \hbox{min} \left( {L\left( {e_{i} } \right),L(e_{i + 1} )} \right)} \\ {255,\quad \quad \quad \quad L(n) \ge \hbox{max} \left( {L\left( {e_{i} } \right), L(e_{i + 1} )} \right)} \\ {\frac{{L(n) - \hbox{min} (L(e_{i} ),L(e_{i + 1} ))}}{{\left| {L(e_{i} ) - L(e_{i + 1} )} \right|}} \times 255,\quad \quad else} \\ \end{array} } \right. $$
(10)

where \( e_{i} \) denotes the \( i \)-th effective extremum of the projection curve. \( (n,L(n)) \) denotes the point located between \( (e_{i} ,L(e_{i} )) \) and \( (e_{i + 1} ,L(e_{i + 1} )) \). \( \hat{L}(n) \) is the extended value of \( (n,L(n)) \).

Then, we interpolate \( k \) points between adjacent points in \( (n,\hat{L}(n)) \) by cubic spline interpolation. \( k \) is selected according to the form of barcodes. For example, EAN-13 code is composed of 95 modules, then \( k \) can be selected as 94, so that the module width of each barcode can be the integer multiple of pixel after interpolation. Moreover, the values of \( \hat{L}(e_{2k + 1} ) \) (\( k = 0,1,2, \ldots ,29 \)) are 0 and the values of \( \hat{L}(e_{2k} ) \) (\( k = 1,2, \ldots ,29 \)) are 255. Thus, the points within every three adjacent elements in \( (e_{i} ,\hat{L}(e_{i} )) \) (i.e. \( (e_{1} ,\hat{L}(e_{1} )) \), \( (e_{2} ,\hat{L}(e_{2} )) \), \( (e_{3} ,\hat{L}(e_{3} )) \); \( (e_{3} ,\hat{L}(e_{3} )) \), \( (e_{4} ,\hat{L}(e_{4} )) \), \( (e_{5} ,\hat{L}(e_{5} )) \); …) can be regarded as one cycle in a periodic function. Hence, the periodical condition is used as the boundary condition of cubic spline interpolation in the paper.

Finally, we binarize the interpolated sequence by a single threshold, which is selected by the method proposed by Otsu [11], to locate the edges of adjacent bars. In order to visually represent the result of binarization, the 0’s and 1’s in the binary sequence are replaced with a column of pixels with the gray value of 0 and 255, respectively. Figure 7 gives an example of the original barcode and the reconstructed barcode. The reconstructed barcode can be recognized according to the coding rules mentioned in [12], since the width of each bar has been obtained accurately.

Fig. 7.
figure 7

A sample of the original detected barcode and the reconstructed image with our algorithm

4 Experiments and Results

In order to assess the performance of the presented approach, experiments have been conducted a variety of practical images taken in supermarkets, which contain complex commodities as well as several EAN-13 barcodes. The images, which share the identical resolution of 3264 × 2448, are divided into two subsets according to the image quality. The sample set 1 is composed of 64 images with 183 barcodes, while the sample set 2 is composed of 62 images with 252 barcodes. All the images in the sample set 2 contain blurry barcodes, which are caused by noise or defocus, while all the images in the sample set 1 are clear. The experiment contains two parts: barcode detection and barcode recognition. Tables 1 and 2 show the numerical results of the barcode detection experiment and the barcode recognition experiment, respectively.

Table 1. The result of barcode detection
Table 2. The result of barcode recognition

As shown in Table 1, most barcodes in the images are correctly detected. Yet, there are miss detection and false detection in both subsets. As a consequence of low quality images, both the miss detection rate and false detection rate are higher in the sample set 2 compared with that in the sample set 1. Nevertheless, our localization algorithm guarantees a 97 % detection rate of the barcodes, even in the low quality subset.

As for the recognition stage, the barcodes, which are detected as mentioned above, are used as the input images. Hence, the number of barcodes to be recognized equals to the number of detected barcodes.

After obtaining the binary barcodes with the presented reconstruction algorithm, we decode the barcodes according to the coding rules mentioned in [12], and judge the results of barcodes recognition by the method mentioned in [2]. Our algorithm is tested against the recognition algorithm proposed by Liu [6]. Table 2 shows the numerical results of our tests.

As shown in Table 2, our algorithm shows higher correct rate than that of the approach proposed by Liu [6] in both of the sample sets. Owing to the low quality of the images, the correct rates of both algorithms falls significantly in the sample set 2. Unfortunately, 18 % of the barcodes fail to be recognized in the sample set 2. The image and the projection curve of the unrecognized barcode are shown in Fig. 8. It can be seen that, several adjacent bars and spaces overlap completely. Multiple extrema in the projection curve fail to be detected due to highly distortion of the barcode image. This kind of barcodes fails to be recognized with both algorithms.

Fig. 8.
figure 8

The image and the projection curve of the unrecognized barcode

5 Summary

In this paper, we proposed an effective algorithm for barcodes detection and recognition in complex scenes. The detection algorithm is designed based on the distinctive texture feature of the barcode. The ratio of horizontal gradient to vertical gradient was used as the feature parameter and extracted on each sub-block of the images. The feature map was constructed and binarized by the adaptive threshold to locate the accurate position of each barcode. In order to obtain the exact width of each bar, the projection curves of the barcodes were used to represent the detected barcode images. Prior to the edges of bars, the intermediate positions of the bars and spaces were located by extracting the local extrema of the projection curves. The projection curve was used to reconstruct the high-resolution binary barcode. The reconstructed barcodes were recognized according to the coding rules.

In the paper, the experiments for barcodes localization and recognition algorithm were conducted on two subsets of images. The experiment results given in the paper showed that more than 97 % of barcodes were successfully detected. The results also illustrates that 97 % of barcodes in sample set 1 were recognized correctly while the correct rate is lower in sample set 2 due to the low quality of images. These results confirm the effectiveness of the barcode localization and recognition approach presented in the paper.