1 Introduction

Register is defined as the accuracy of the relative positions between any two color images in polychrome printing. In the transfer process, if the image information deviates from its predetermined position, the result is called a “register error”, “register difference”, “mis-register” or “out of register”, which is a common printing defect. Figure 1 shows examples of register errors in several printed materials:

Fig. 1
figure 1

Register Errors in Printed Matter: (a, adapted from Fig. 3 of [37]) the whole picture shows a double image due to register error; the printing defect is most obvious in the red box. (b, adapted from Fig. 2 of https://wenku.baidu.com/view/d56bdfbd240c844769eaee78.html) several register errors on stamp patterns: the longitude and latitude lines in the left and middle pictures deviate from the globe, and in the right figure the white background shows through because of register deviation. (c, adapted from Fig. 1 of http://ebook.keyin.cn/magazine/pt-bzzh/201208/27-972928.shtml) the arrow indicates a register error between a color and an edge

As can be seen from Fig. 1, register errors in printed matter are usually very subtle and their features are not obvious. Traditional manual inspection mainly checks the register lines, which are cross-shaped marks: when the deviation between the register marks is too large, a register error is assumed, and the candidate images are then checked to find the defective ones. The cross lines used for register are shown in Fig. 2.

Fig. 2
figure 2

Cross Lines for Register: (a) the cross lines are laid out at the four corners of the press sheet; (b) the cross lines are printed in cyan, magenta, yellow and black; if there are register errors, the cross lines do not coincide; (c) when register is accurate, the cross lines overlap and appear black

Manual inspection of register is labor intensive and unreliable [27], and replacing it with automatic detection is the development trend. There are already commercial systems for printing flaw detection and even register measurement, such as Nota SaveCheck in Switzerland and the ErgoTronic and QualiTronic series from Koenig & Bauer (KBA) in Germany. Perhaps for reasons of commercial confidentiality, the algorithms used by these quality-inspection systems have not been published. Although their details are unclear, register detection methods can generally be inferred to follow two directions. One detects the coincidence of the cross lines, which is the traditional method. The other detects the images directly based on computer vision, which offers a higher degree of automation and precision.

The main problems of existing register control, measurement and defect detection can be listed as follows:

  1. Cross lines are mainly used for manual inspection, which has high labor intensity and unstable quality. A press sheet usually carries several copies of the same small format. Even when the press sheet as a whole is not registered properly, the deviation of individual small images may still be acceptable at a given quality level because of the stretchability of the paper. At the other extreme, the cross lines may be accurate while the small images are not. In these cases, the small-sheet products must be inspected manually to determine whether they meet the standard, which increases labor intensity and reduces detection efficiency.

  2. With computer vision, the influence of noise is large and feature extraction is difficult. This problem belongs to image preprocessing. The effectiveness of register measurement depends on the position features of image contours; however, because of lighting, camera (sensor) accuracy, shooting angle, image digitization and other factors, it is often difficult to obtain ideal position information. Therefore, filtering noise and extracting and amplifying the location features of the printed image are the key preprocessing steps for register measurement.

  3. High false-rejection rate with computer vision. This problem belongs to image classification. Existing high-precision algorithms for automatic register measurement are complex, yet missed and false detections still occur: a missed detection means a defective image is judged as good, while a false detection means a good image is judged as defective. In some special printing industries, such as maps, certificates, banknotes and security printing, quality must be assured, because the circulation of defective printed matter in the market would cause serious political, economic and social impact. Therefore, to avoid missed detection of defective products, special printing enterprises often set strict thresholds in machine detection. Although this practice largely prevents defective products from reaching the market, it also causes many qualified products to be intercepted as defective, raising the false-rejection rate and increasing the waste and cost of enterprises. According to statistics from one type of quality-inspection machine for printed products, the false-rejection rate fluctuates between 6.8% and 21.4% per 100,000 sheets, which is still considerable.

  4. The cost of computer vision is high. Computer vision needs high-performance hardware, which requires a large investment, and it also needs a lot of time for sample learning, during which the detection equipment cannot be used in production, creating an idle window period. These factors lead to higher costs.

Based on the above analysis, this paper investigates register control and measurement from the perspective of image recognition. The main contributions of this paper are:

  1. A new approach, denoted the Zernike-CNNs method, is proposed. Applying computer vision to printing register detection is a new research area, and computer vision is more advanced and reliable than cross lines. The method not only distinguishes standard products from defective products, but also subdivides the types of defects, which satisfies the needs of defect analysis. The steps are: color images are converted to CMYK mode; the Y gray image of the printed matter is preprocessed by Zernike moments (ZMs); and convolutional neural networks (CNNs) are then used for image recognition.

  2. To overcome the complexity of the factorial calculation in the ZM radial polynomials, a fast recurrence algorithm, the Kintner method for Zernike radial polynomials, is derived in this paper. To further improve classification accuracy, an improved CNN is proposed: on the basis of the classic CNN, a parallel CNN for local feature extraction and an auxiliary classification scheme for the classification layer are introduced.

  3. The Zernike method is compared with the Sobel, Laplacian of Gaussian (LoG), Smallest Univalue Segment Assimilating Nucleus (SUSAN), Finite Impulse Response (FIR), Multi-scale Morphological Gradient (MMG) and other preprocessing methods. The results show that ZMs are an integral operation, which behaves well under noisy conditions; they can locate edges at sub-pixel level with high accuracy; and they are good candidates for rotated and scaled images, which is very beneficial when detecting the variously deformed images collected from a printing press running at high speed.

  4. The Zernike-CNNs method mainly detects register defects of printed matter, but it also has a certain detection ability for other types of printing defects, which makes the approach more adaptable to printing defect detection.

The rest of this paper is organized as follows: Section 2 introduces the academic research on register control, measurement and computer vision. Section 3 describes the background of the research methods, derives a fast Zernike algorithm, and proposes an improved CNN. Section 4 presents the technical route for register measurement. The experimental results are analyzed and discussed in Section 5. Conclusions and future research directions are given in Section 6.

2 Related works

The academic community has carried out much research to improve automatic register measurement and control. Yoshida [39, 40] established a mathematical model of the interaction between upstream and downstream register control and demonstrated its validity through experiments. Lee [14, 22,23,24] illustrated the influence of heat, register marks, edge detection and algorithms on register control and measurement, and on this basis studied the optimization of register measurement error. Kang [16, 17] proposed prediction models for cross-directional (CD) and machine-directional (MD) register in multilayer printing and verified them experimentally. To improve register performance, Kim [19] proposed a register measurement system and related control method. Sugimoto [33] developed a register controller for the gravure press that responds to the construction of the press and printing conditions, such as the pass length between printing units, the compensator roller displacement velocity and the printing speed, to achieve register control. These studies focused on the prediction and prevention of register error but did not involve its detection; therefore, they cannot promptly and completely correct the causes of defects in the printing process, which can easily produce runs of defective products.

In the printing field, academic papers and results on quality inspection are relatively rare. In fact, printing quality inspection can be regarded as a computer vision problem. Computer vision is widely used in authentication [9, 31], classification [38], inspection and detection [34], measurement and positioning [30], and other fields. There is a large body of research on computer vision, and it has important reference value for the detection of printing quality defects. The research on computer vision mainly focuses on:

  1. Preprocessing of the sample data set, which aims to reduce the amount of computation, suppress noise interference and enhance data features.

Dixit [9] proposed a forgery detection method for digital images based on the stationary wavelet transform (SWT) with singular value decomposition (SVD); however, the method does not handle forgery detection after image rotation, rescaling and reflection. Yasaka [38] studied the analysis accuracy of different CT images with a Laplacian of Gaussian (LoG) spatial band-pass filter; however, when LoG enhances image features, it easily introduces noise along the image edge profile. Verma [34] proposed the smallest univalue segment assimilating nucleus (SUSAN) principle with a bacterial foraging algorithm (BFA) for edge detection. The performance of the SUSAN operator is not affected by template size, and its parameter selection is very simple; compared with other edge detection algorithms, it was shown to be more effective by a t-test. But SUSAN is mainly used for corner detection and has poor noise resistance, so it is not suitable for images with smooth edges. Müller [30] used the Sobel method to process deep optical images of satellite galaxies, but this method is still a differential filter, which easily amplifies noise and produces false contours. Lupek [28] used a finite impulse response (FIR) filter to remove spectral noise and background broadband distortion in Raman spectra; the experiments show that the FIR filter is effective for processing whole Raman spectra, but the method is computationally expensive. Diop [8] proposed using image edge functions as the local weights of inhomogeneous Hamiltonians to obtain multi-scale morphological operators; the method performs well in image processing, but the algorithm cannot completely avoid over-segmentation.
Firuzi [12] used the Histogram of Oriented Gradients (HOG), and Kumar [21] used the Scale-Invariant Feature Transform (SIFT). These algorithms extract image features using hand-designed image descriptors, so they can only be used for specific objects and are not universal.

Moments are a common method in image processing; the most commonly used are Hu moments and Zernike moments (ZMs). Fernando [11] regarded the Hu moment as the best image processing method. Li [26] proposed sub-pixel edge location based on Zernike moments with an error-function edge model; the method has good robustness and location accuracy, but the improved algorithm increases the amount of computation, and the edge model must be changed for different objects. Compared with Hu moments, ZMs have the advantages that arbitrary high-order moments are easy to construct and that they are insensitive to noise.

  2. Algorithm research on data feature recognition and classification, where the commonly used approaches are classical methods and deep-learning-based methods [7].

The support vector machine (SVM) is a common classifier. Vidi [35] applied SVM to predict whether breast tumors are benign or malignant and to classify breast cancer subtypes, achieving good results. In fingerprint identification, Kaur [18] combined ZMs with SVM, using ZMs to extract the features of the fingerprint image and a weighted SVM to train and test on the evaluated features; experimental results show that the performance of this method is significantly better than that of existing techniques. SVM is better suited to binary classification, and its applicability to multi-class problems remains to be seen; parameter selection is also a main constraint on its effectiveness. There are other classification algorithms as well. Chen [4] applied the k-nearest neighbor (KNN) algorithm and the naïve Bayes classifier (NBC) to behavior classification and achieved good results. However, KNN has long processing times, since each sample to be classified must compute its distance to all known samples to find its k nearest neighbors, and NBC requires the assumption that features are conditionally independent, otherwise classification accuracy is reduced.

Deep learning has excellent performance in image classification. Leng [25] proposed a threshold segmentation scheme to preprocess images of biological samples and used a lightweight shallow CNN to classify them; this method improves the computational efficiency of image detection and classification. At present, neural networks represented by deep learning have become the main method of image classification.

In summary, register measurement and control belongs to computer vision, which requires image preprocessing and classification. In the preprocessing stage, register detection must extract subtle edges from the image, so wavelet and other texture feature extraction algorithms are not applicable. Since the image edges related to register are generally smooth, HOG, SIFT and other local feature extraction algorithms are also unsuitable. Furthermore, because the sheets run through the printing press at high speed, the collected images are affected by light, angle, paper tension and other factors and contain considerable noise, under which differential filters perform poorly. ZMs are an integral operation with invariance and strong noise resistance, and they can extract edges at sub-pixel level, so ZMs are adopted as the preprocessing method for register detection. In this paper, the integral operation represented by Zernike moments is compared with the differential filter represented by Sobel, the second-order differential represented by LoG, the circular template represented by SUSAN, the wavelet-type filter represented by FIR and the nonlinear filter represented by MMG to verify their performance. In the classification stage, the printed matter must not only be separated into good and defective products but also classified by defect type, so this is a multi-class problem. Binary SVM classification involves a large amount of computation and does not classify well here; therefore CNNs, representative of deep learning, are adopted in this paper to classify the printed images.

3 Approach

3.1 CMYK mode

Four-color offset printing uses cyan (C), magenta (M), yellow (Y) and black (K) as the basic colors and realizes color printing according to the subtractive process. The printed image obtained by the detection unit is generally in RGB space, so the image data need to be converted into the CMYK space by color separation [10, 15]. The C, M and Y gray values are complementary to the gray values of red (R), green (G) and blue (B). Since the RGB gray values range over [0, 255], the extraction of the C, M and Y gray values is given by

$$ \Big\{{\displaystyle \begin{array}{l}{C}_{ij}=255-{R}_{ij}\\ {}{M}_{ij}=255-{G}_{ij}\\ {}{Y}_{ij}=255-{B}_{ij}\end{array}} $$
(1)

where Cij, Mij, Yij and Rij, Gij, Bij are the gray values of the pixels in the different color separations. The black (K) component is then extracted from the C, M and Y components, and the C, M and Y components are corrected. The correction process is given by

$$ {\displaystyle \begin{array}{l}{K}_{ij}=\min \left({C}_{ij},{M}_{ij},{Y}_{ij}\right)\\ {}\mathrm{if}\kern1em {K}_{ij}\ne {C}_{\mathrm{K}}\\ {}\Big\{\begin{array}{l}{C}_{ij}=\alpha \times \left({C}_{ij}-{K}_{ij}\right)/\left({C}_{\mathrm{K}}-{K}_{ij}\right)\\ {}{M}_{ij}=\alpha \times \left({M}_{ij}-{K}_{ij}\right)/\left({C}_{\mathrm{K}}-{K}_{ij}\right)\\ {}{Y}_{ij}=\alpha \times \left({Y}_{ij}-{K}_{ij}\right)/\left({C}_{\mathrm{K}}-{K}_{ij}\right)\end{array}\\ {}\mathrm{else}\\ {}{C}_{ij}={M}_{ij}={Y}_{ij}=0\end{array}} $$
(2)

where CK and α are empirical values; in this paper, CK = 100 and α = 1. According to Eqs. (1 ~ 2), the gray value extraction in CMYK mode is completed.
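As a minimal sketch (not the authors' production code), Eqs. (1 ~ 2) can be implemented directly. The function name and NumPy-based layout are illustrative assumptions; CK = 100 and α = 1 as stated above.

```python
import numpy as np

C_K, ALPHA = 100.0, 1.0  # empirical values used in this paper

def rgb_to_cmyk(rgb):
    """Color separation per Eqs. (1)-(2): H x W x 3 RGB -> C, M, Y, K gray planes."""
    r, g, b = (rgb[..., i].astype(float) for i in range(3))
    c, m, y = 255.0 - r, 255.0 - g, 255.0 - b           # Eq. (1)
    k = np.minimum(np.minimum(c, m), y)                 # Eq. (2): K_ij = min(C, M, Y)
    c_out, m_out, y_out = np.zeros_like(c), np.zeros_like(m), np.zeros_like(y)
    keep = k != C_K                                     # the "if K_ij != C_K" branch
    for dst, src in ((c_out, c), (m_out, m), (y_out, y)):
        dst[keep] = ALPHA * (src[keep] - k[keep]) / (C_K - k[keep])
    return c_out, m_out, y_out, k                       # else branch leaves C = M = Y = 0
```

For a pure red pixel (R, G, B) = (255, 0, 0), Eq. (1) gives (C, M, Y) = (0, 255, 255) and K = 0, after which the correction of Eq. (2) rescales the chromatic planes by (CK − K).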

3.2 Zernike moments

Edge detection of the candidate image effectively enhances image features, supports pattern recognition, and reduces the amount of computation. Theoretical research shows that ZMs take an integral form, which gives good noise tolerance, and that orthogonal moments of any order can be constructed on the unit circle to convolve with the image [2]. ZMs have translation, rotation and scale invariance [3], so they are a common edge detection tool [13]. The Zernike moment (ZM) is defined as

$$ {\displaystyle \begin{array}{l}{Z}_{nm}=\frac{n+1}{\pi}\underset{x^2+{y}^2\le 1}{\iint }f\left(x,y\right){V}_{nm}^{\ast}\left(x,y\right) dxdy\\ {}\kern2.5em =\frac{n+1}{\pi }{\int}_0^{2\pi }{\int}_0^1f\left(\rho, \theta \right){V}_{nm}^{\ast}\left(\rho, \theta \right)\rho d\rho d\theta \end{array}} $$
(3)

where Znm is the n-th order Zernike moment with m-fold repetition, n − |m| is even, and n ≥ |m|. f(x, y) is the pixel value of the image at pixel (x, y), and Vnm* is the conjugate of the Zernike polynomial; Znm is obtained by convolving f(x, y) with Vnm*. For an image with N × N pixels, the Zernike polynomial Vnm is defined as

$$ {\displaystyle \begin{array}{l}{V}_{nm}\left(x,y\right)={R}_{nm}\left(\rho \right)\exp \left( jm\theta \right)\\ {}\Big\{\begin{array}{l}{R}_{nm}\left(\rho \right)=\sum \limits_{s=0}^{\left(n-|m|\right)/2}\frac{{\left(-1\right)}^s\left(n-s\right)!{\rho}^{n-2s}}{s!\left(\frac{n+\mid m\mid }{2}-s\right)!\left(\frac{n-\mid m\mid }{2}-s\right)!}\\ {}\rho =\frac{\sqrt{{\left(2x-N+1\right)}^2+{\left(N-1-2y\right)}^2}}{N}\\ {}\theta ={\tan}^{-1}\left(\frac{N-1-2y}{2x-N+1}\right)\end{array}\end{array}} $$
(4)

where Rnm(ρ) is the Zernike radial polynomial, ρ is the distance from the reference point to the pixel, and θ is the counterclockwise angle of the pixel relative to the positive x-axis. The relationship between the ZMs before and after rotation is

$$ {Z}_{nm}^{\hbox{'}}={Z}_{nm}{e}^{- jm\varphi} $$
(5)

where Z'nm is the ZM after the image is rotated by an angle φ. Eq. (5) establishes the rotational invariance of the ZMs: when the image is rotated around a reference point, the modulus of each moment remains unchanged and only the phase angle changes. Some important information about the image can be determined from the ZMs before and after rotation, expressed as

$$ \Big\{{\displaystyle \begin{array}{l}\varphi ={\tan}^{-1}\frac{\operatorname{Im}\left[{Z}_{11}\right]}{\operatorname{Re}\left[{Z}_{11}\right]}\\ {}l=\frac{Z_{20}}{Z_{11}^{\hbox{'}}}\\ {}k=\frac{3{Z}_{11}\exp \left(- j\varphi \right)}{2{\left(1-{l}^2\right)}^{3/2}}\\ {}h=\frac{Z_{00}-\frac{k\pi}{2}+k{\sin}^{-1}(l)+ kl\sqrt{1-{l}^2}}{\pi}\end{array}} $$
(6)

where φ is the phase angle of the edge, l is the distance from the reference point to the image contour, k is the gray-level step height across the contour, and h is the background gray level. If a pixel satisfies k ≥ kt and l ≤ lt, where kt and lt are threshold values, then the pixel (x, y) is judged to lie on a contour of the image. The choice of thresholds has a great impact on edge location: lt is generally set below 0.5, and kt must be set experimentally or empirically.
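To make Eq. (6) concrete, the following sketch recovers the edge parameters from precomputed moments Z00, Z11 and Z20, and applies the threshold test above. The function names and the synthetic ideal-step values in the usage note are illustrative assumptions, not from the paper.

```python
import numpy as np

def zernike_edge_params(Z00, Z11, Z20):
    """Recover (phi, l, k, h) from Z00 (real), Z11 (complex), Z20 (real) per Eq. (6)."""
    phi = np.arctan2(Z11.imag, Z11.real)               # rotation phase angle
    Z11r = (Z11 * np.exp(-1j * phi)).real              # Z'_11, the rotated moment of Eq. (5)
    l = Z20 / Z11r                                     # distance from reference point to edge
    k = 3.0 * Z11r / (2.0 * (1.0 - l**2) ** 1.5)       # gray-level step height
    h = (Z00 - k * np.pi / 2 + k * np.arcsin(l)
         + k * l * np.sqrt(1.0 - l**2)) / np.pi        # background gray level
    return phi, l, k, h

def is_edge_pixel(Z00, Z11, Z20, k_t, l_t):
    """Threshold test k >= k_t and l <= l_t described in the text."""
    _, l, k, _ = zernike_edge_params(Z00, Z11, Z20)
    return (k >= k_t) and (abs(l) <= l_t)
```

As a sanity check, for an ideal step of height k = 2 through the disk center (l = 0) on background h = 0.5, the forward model gives Z11 = 2k/3, Z20 = 0 and Z00 = hπ + kπ/2, and the sketch recovers k and h exactly.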

According to Eq. (4), the factorial terms of the radial polynomial Rnm(ρ) make ZMs computationally complex, which hinders their real-time application [6]. To reduce the burden of these factorial terms, Wee [36] and Mukundan [29] each introduced fast computations of the ZMs, such as the Kintner and Prata methods, but the derivations of these fast computations are not given in those papers. Chen [5] proposed applying a Riemann sum to approximate the Zernike radial function.

Among these, the fast algorithm for Zernike radial polynomials known as the Kintner method is well known, but no published article appears to derive it. In this paper, the Kintner method is derived.

Zernike radial polynomials can be regarded as a special case of Jacobi polynomials. There is a corresponding relationship between Zernike radial polynomials and Jacobi polynomials as follows:

$$ {R}_{nm}\left(\rho \right)={\rho}^m{P}_{\frac{1}{2}\left(n-m\right)}^{\left(0,m\right)}\left(2{\rho}^2-1\right) $$
(7)

where \( {P}_{\frac{1}{2}\left(n-m\right)}^{\left(0,m\right)}\left(2{\rho}^2-1\right) \) are Jacobi polynomials. Jacobi polynomials satisfy the recurrence relation

$$ {\displaystyle \begin{array}{l}2{n}^{\ast}\left(\alpha +\beta +{n}^{\ast}\right)\left(\alpha +\beta +2{n}^{\ast }-2\right){P}_{n^{\ast}}^{\left(\alpha, \beta \right)}(x)=\\ {}\left(\alpha +\beta +2{n}^{\ast }-1\right)\left[{\alpha}^2-{\beta}^2+x\left(\alpha +\beta +2{n}^{\ast}\right)\left(\alpha +\beta +2{n}^{\ast }-2\right)\right]{P}_{n^{\ast }-1}^{\left(\alpha, \beta \right)}(x)\\ {}-2\left(\alpha +{n}^{\ast }-1\right)\left(\beta +{n}^{\ast }-1\right)\left(\alpha +\beta +2{n}^{\ast}\right){P}_{n^{\ast }-2}^{\left(\alpha, \beta \right)}(x)\end{array}} $$
(8)

Let \( {n}^{\ast }=\frac{1}{2}\left(n-m\right) \), α = 0, β = m. Using Eqs. (7 ~ 8) gives the following relation:

$$ {\displaystyle \begin{array}{l}\frac{1}{2}\left(n-m\right)\left(n+m\right)\left(n-2\right){P}_{\frac{1}{2}\left(n-m\right)}^{\left(0,m\right)}(x)=\\ {}\left(n-1\right)\left[-{m}^2+\left(2{\rho}^2-1\right)n\left(n-2\right)\right]{P}_{\frac{1}{2}\left(n-m\right)-1}^{\left(0,m\right)}(x)-n\left(n-m-2\right)\left(\frac{1}{2}n+\frac{1}{2}m-1\right){P}_{\frac{1}{2}\left(n-m\right)-2}^{\left(0,m\right)}(x)\end{array}} $$
(9)

In Eq. (9), let x = 2ρ2 − 1; then, by Eq. (7), the Jacobi polynomials can be rewritten as Eq. (10):

$$ \Big\{{\displaystyle \begin{array}{l}{P}_{\frac{1}{2}\left(n-m\right)}^{\left(0,m\right)}(x)={P}_{\frac{1}{2}\left(n-m\right)}^{\left(0,m\right)}\left(2{\rho}^2-1\right)={\rho}^{-m}{R}_{n,m}\left(\rho \right)\\ {}{P}_{\frac{1}{2}\left(n-m\right)-1}^{\left(0,m\right)}(x)={P}_{\frac{1}{2}\left(n-2-m\right)}^{\left(0,m\right)}\left(2{\rho}^2-1\right)={\rho}^{-m}{R}_{n-2,m}\left(\rho \right)\\ {}{P}_{\frac{1}{2}\left(n-m\right)-2}^{\left(0,m\right)}(x)={P}_{\frac{1}{2}\left(n-4-m\right)}^{\left(0,m\right)}\left(2{\rho}^2-1\right)={\rho}^{-m}{R}_{n-4,m}\left(\rho \right)\end{array}} $$
(10)

The fast recurrence relation for the ZM radial polynomials is obtained from Eqs. (9 ~ 10) and is given in Eq. (11):

$$ {\displaystyle \begin{array}{l}\frac{1}{2}\left(n-m\right)\left(n+m\right)\left(n-2\right){R}_{n,m}\left(\rho \right)=\\ {}\left(n-1\right)\left[-{m}^2+\left(2{\rho}^2-1\right)n\left(n-2\right)\right]{R}_{n-2,m}\left(\rho \right)-n\left(n-m-2\right)\left(\frac{1}{2}n+\frac{1}{2}m-1\right){R}_{n-4,m}\left(\rho \right)\end{array}} $$
(11)

However, the relation in Eq. (11) is not applicable to the cases n = m and n − m = 2. For these cases, the following two relations are used:

$$ {\displaystyle \begin{array}{l}{R}_{n,n}\left(\rho \right)={\rho}^n\\ {}{R}_{n,n-2}\left(\rho \right)=n{R}_{n,n}\left(\rho \right)-\left(n-1\right){R}_{n-2,n-2}\left(\rho \right)\end{array}} $$
(12)

The direct method of Eq. (4) can thus be replaced by the fast computation of Eqs. (11 ~ 12) for the Zernike radial polynomials; Eqs. (11 ~ 12) constitute the Kintner method.
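The recurrence can be sketched as follows, checked against the direct factorial form of Eq. (4). The function names are ours; this is an illustration of Eqs. (11 ~ 12), not the authors' code.

```python
import math
import numpy as np

def radial_direct(n, m, rho):
    """Direct factorial form of R_nm(rho) from Eq. (4); reference implementation."""
    total = np.zeros_like(rho, dtype=float)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * math.factorial(n - s)
             / (math.factorial(s)
                * math.factorial((n + m) // 2 - s)
                * math.factorial((n - m) // 2 - s)))
        total += c * rho ** (n - 2 * s)
    return total

def radial_kintner(n, m, rho):
    """R_nm(rho) via the Kintner recurrence, Eqs. (11)-(12)."""
    if n == m:
        return rho ** n                                # Eq. (12), first relation
    if n - m == 2:                                     # Eq. (12), second relation
        return n * rho ** n - (n - 1) * rho ** (n - 2)
    # Eq. (11): three-term recurrence in n with m fixed
    r2 = radial_kintner(n - 2, m, rho)
    r4 = radial_kintner(n - 4, m, rho)
    lhs = 0.5 * (n - m) * (n + m) * (n - 2)
    t1 = (n - 1) * (-m ** 2 + (2 * rho ** 2 - 1) * n * (n - 2)) * r2
    t2 = n * (n - m - 2) * (0.5 * n + 0.5 * m - 1) * r4
    return (t1 - t2) / lhs
```

For example, radial_kintner(6, 2, ρ) reproduces R6,2(ρ) = 15ρ⁶ − 20ρ⁴ + 6ρ² to machine precision, with no factorials evaluated.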

When the Zernike radial polynomials are computed from order 0 up to a specified order, the efficiency of the fast computation method compared with the direct method is shown in Table 1.

Table 1 Comparison between fast computation method and direct method

As Table 1 shows, the Kintner method takes less time than the direct method of Eq. (4), and the higher the order, the greater the advantage of the fast computation.
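Putting Eqs. (3 ~ 4) together, Znm of an N × N image can be approximated by a discrete sum over the pixels mapped into the unit disk. This sketch uses the direct factorial form of the radial polynomial for clarity (the Kintner recurrence could be substituted); the function names and the (2/N)² pixel-area factor in the discretization are our assumptions.

```python
import math
import numpy as np

def radial_poly(n, m, rho):
    """Zernike radial polynomial R_nm (direct form of Eq. (4))."""
    m = abs(m)
    total = np.zeros_like(rho, dtype=float)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * math.factorial(n - s)
             / (math.factorial(s)
                * math.factorial((n + m) // 2 - s)
                * math.factorial((n - m) // 2 - s)))
        total += c * rho ** (n - 2 * s)
    return total

def zernike_moment(img, n, m):
    """Discrete approximation of Z_nm (Eq. (3)) for an N x N image."""
    N = img.shape[0]
    idx = np.arange(N)
    X, Y = np.meshgrid(idx, idx, indexing="ij")
    # pixel-to-unit-disk mapping of Eq. (4)
    rho = np.sqrt((2 * X - N + 1) ** 2 + (N - 1 - 2 * Y) ** 2) / N
    theta = np.arctan2(N - 1 - 2 * Y, 2 * X - N + 1)
    inside = rho <= 1.0
    V_conj = radial_poly(n, m, rho) * np.exp(-1j * m * theta)  # V*_nm
    # each pixel covers an area of (2/N)^2 on the unit square
    return (n + 1) / np.pi * np.sum(img[inside] * V_conj[inside]) * (2.0 / N) ** 2
```

As a sanity check, for a constant image f ≡ 1 the true Z00 is 1, and the discrete sum approaches this value as N grows; for m ≠ 0, Znm of a constant image is approximately 0 by orthogonality.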

3.3 Convolution neural network

The convolutional neural network (CNN) is an efficient deep learning algorithm. The network extracts object features from local data, has a certain degree of translation, rotation and scale invariance, and has been used for feature recognition [1] and classification [20]. Such a network can not only perform register measurement of printed matter but also detect other defects, so it has broad application prospects in printing defect detection.

This paper builds on the LeNet-5 CNN, whose architecture is shown in Fig. 3:

Fig. 3
figure 3

Architecture of convolution neural network

A CNN is composed of several convolutional layers (C1, C2, …, Cn), several pooling layers (S1, S2, …, Sn), a fully connected layer and a classification layer. In general, convolutional and pooling layers appear alternately in pairs. The basic ideas and operations of CNNs are given in Ref. [32].

One of the challenges with CNNs is that a particular network cannot be applied to continually changing detection environments: the parameters and structure of the network must be adapted to the characteristics of the research object, or combined with other algorithms, to improve its effectiveness. This paper focuses on improving the classification accuracy of the network and proposes an improved neural network, whose structure is shown in Fig. 4.

Fig. 4
figure 4

Architecture of improved CNNs

The improved CNN consists of three parts: the original part (OP), the parallel part (PP) and the auxiliary classification part (ACP).

The original part (OP) is a classic CNN. To shorten the training time, the CNN in this paper keeps two convolutional layers but removes one pooling layer; its architecture is: convolutional layer C1, pooling layer S1, convolutional layer C2, fully connected layer and classification layer. In the classic CNN, Softmax is used as the classification function, and the mean squared error (MSE) between the network output y and the expected output serves as the loss function. The MSE is defined as:

$$ E={d}_1\left({y}_m,\tilde{y}_{m}\right)=\frac{1}{2m}\sum \limits_1^m{\left({y}_m-\tilde{y}_{m}\right)}^2 $$
(13)

where m is the number of categories; the gradient of the classification loss is computed and used to update the weights by gradient descent. In the improved CNN, however, the loss function of the OP adopts a new form different from Eq. (13).

Based on the original structure, the improved CNN adds two parts: parallel structure part and auxiliary classification part.

The parallel part (PP) has the same structure as the OP; the difference is that the PP's input is a local image, while the OP's input is the global image. The relationship between OP and PP is established through a similarity function on the output features of the convolutional layers. The similarity function and the output loss function together constitute the training objective of the OP:

$$ {\displaystyle \begin{array}{c}{E}_{op}=\alpha \times {d}_1\left({y}_m,{\tilde{y}}_m\right)+\beta \sum \limits_{k=1}^n{d}_2\left({C}_{op,k},{C}_{pp,k}\right)\\ {}=\frac{\alpha }{2m}\sum \limits_1^m{\left({y}_m-{\tilde{y}}_m\right)}^2+\frac{\beta }{n}\sum \limits_{k=1}^n\sqrt{\sum \limits_{p,q}{\left({z}_{ij}-{\tilde{z}}_{ij}\right)}_k^2}\\ {}={E}_1+{E}_2\end{array}} $$
(14)

where the first term on the right-hand side is the output classification loss and the second term is the similarity function on the convolution outputs; Cop,k and Cpp,k denote the outputs of the OP and PP in the k-th (k = 1, …, n) convolutional layer, and zij, \( {\tilde{z}}_{ij} \) denote the values of Cop,k and Cpp,k at row i and column j, respectively. α and β are the weights of the two terms. The sum of the output loss and the similarity function is used as the loss function of the OP, and the OP weights are updated by gradient descent. The update formulas for the weights and thresholds follow from the chain rule, both for Eq. (13) and for the first term of Eq. (14); the chain rule can likewise be applied to the second term of Eq. (14). The weight and threshold updates can be written as follows:

$$ {\displaystyle \begin{array}{l}\frac{\partial {E}_{op}}{\partial {\omega}^k}= rot180\left({C}_{op,k-1}\right)\ast \left(\frac{\partial {E}_1}{\partial {C}_{op,k}}+\frac{\partial {E}_{2,1}}{\partial {C}_{op,k}}+...+\frac{\partial {E}_{2,n}}{\partial {C}_{op,k}}\right)\\ {}\frac{\partial {E}_{op}}{\partial {b}^k}=\sum \limits_{p,q}\left(\frac{\partial {E}_1}{\partial {C}_{op,k}}+\frac{\partial {E}_{2,1}}{\partial {C}_{op,k}}+...+\frac{\partial {E}_{2,n}}{\partial {C}_{op,k}}\right)\\ {}\mathrm{where}\kern1em \frac{\partial {E}_{2,n}}{\partial {C}_{op,k}}=2\left({C}_{op,k}-{C}_{pp,k}\right)\end{array}} $$
(15)

This structure makes the CNN pay more attention to the region of interest (ROI) of the image, improving its ability to recognize and classify images.
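The combined loss of Eq. (14) can be sketched in a few lines of NumPy. The function name, the list-based representation of the feature maps, and the default α and β are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def e_op(y, y_hat, C_op, C_pp, alpha=1.0, beta=1.0):
    """Training loss of the OP per Eq. (14): E_op = E_1 + E_2.

    y, y_hat   : length-m network output and expected output
    C_op, C_pp : lists of n paired 2-D feature maps from the OP and PP
                 convolutional layers
    """
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    m = len(y)
    e1 = alpha / (2 * m) * np.sum((y - y_hat) ** 2)      # classification loss E_1
    n = len(C_op)
    e2 = beta / n * sum(                                 # feature-map similarity E_2
        np.sqrt(np.sum((zo - zp) ** 2)) for zo, zp in zip(C_op, C_pp))
    return e1 + e2
```

When the OP and PP feature maps coincide and the output matches the target, the loss is zero; during training the E2 term pulls the global-image features toward the local (ROI) features extracted by the PP.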

To improve the classification ability of the CNN, weights on the classification layer are introduced as the auxiliary classification part. For some specific image recognition and classification tasks, other methods are stronger than CNNs, and such classifiers can be attached to a CNN as an auxiliary. In this paper, the auxiliary classification algorithm classifies the candidate images, and its results are loaded into the classification layer of the CNN as weights, strengthening correct classifications and weakening wrong ones.

In the next section, the detailed use of the improved CNNs in register detection is described.

4 Realization of register measurement

4.1 Inspection system of register measurement

The machine used for automatic inspection and classification in this paper is shown in Fig. 5.

Fig. 5
figure 5

Hardware Design of the Inspection System for Register Measurement

The system consists of four important units. The LED light source provides continuous and stable illumination for the inspection system, suppressing noise and reducing the deviation of detection results caused by fluctuations in the surrounding illumination. The CCD sensor collects printed matter images and inputs them to the high-speed industrial computer. The high-speed industrial computer processes and classifies the collected images and coordinates the operation of the components of the detection system. The PLC performs the action control of the detection system and exchanges data with the industrial computer.

4.2 Experimental study

In this paper, the images to be checked by the inspection machine are shown in Fig. 6.

Fig. 6
figure 6

Printed Images to be Checked: (a) is the standard image without defects or flaws; (b)~(g) are defective printed images, corresponding respectively to upward shift, downward shift, leftward shift, rightward shift, leakage, and font overlap

Figure 6(a)~(g) thus define seven categories. In this paper, 500 sheets of each type are collected, giving a total of 3500 samples as the raw data.

According to Section 1.1, color separation of the images to be inspected is carried out in CMYK mode, and the gray scales are extracted as shown in Fig. 7.

Fig. 7
figure 7

CMYK Gray Scale Images of Printed Matter: (a) the gray scale images from the ideal print after CMYK separation; (b) the gray scale images from a defective print after CMYK separation

Considering that the Y gray scale image contains only black and white pixels, the Y gray scale images are used in this paper for feature extraction and classification.

In order to test the noise resistance of the algorithm, salt-and-pepper noise is added to the images, and edge detection is carried out with ZMs. The calculation of ZMs follows Eqs. (3)~(6) and Eqs. (11)~(12), where the edge weight lt is set to 0.11 and KT is determined to be 24. The Sobel, LoG, SUSAN, FIR and MMG edge detection methods are used for comparison. The edge detection results of each method are shown in Fig. 8.
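The noise-injection step described above can be sketched as follows (a minimal NumPy sketch; the noise density `amount` and the function name are illustrative assumptions, as the paper does not state them):

```python
import numpy as np

def add_salt_pepper(image, amount=0.05, seed=None):
    """Corrupt a grayscale image with salt-and-pepper noise.

    A fraction `amount` of the pixels is replaced: half with the
    maximum gray value (salt), half with the minimum (pepper).
    """
    rng = np.random.default_rng(seed)
    noisy = image.copy()
    n_noisy = int(amount * image.size)
    # Flat indices of pixels to corrupt, drawn without replacement
    idx = rng.choice(image.size, size=n_noisy, replace=False)
    salt, pepper = idx[: n_noisy // 2], idx[n_noisy // 2:]
    noisy.flat[salt] = 255
    noisy.flat[pepper] = 0
    return noisy
```

The noisy images are then fed to each edge detector, so that all methods are compared on identical corrupted inputs.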

Fig. 8
figure 8

Edge detection of printed images: from left to right, the feature extraction effect of the different edge detection methods; from top to bottom, the several types of defective prints

It can be seen from Fig. 8 that: 1) the Zernike, Sobel and LoG methods are robust to the added noise. However, the edges produced by the Sobel method show obvious discontinuities, indicating greater loss of edge information, while LoG is over-sensitive and its edge features contain too much redundant information. Correspondingly, the edge features extracted by the Zernike method have higher accuracy and less information loss; theoretical studies show that Zernike moments locate image edges more accurately than Sobel; 2) FIR and MMG can extract image edge features, but their robustness against noise is not ideal; moreover, the accuracy of the edges extracted by MMG is low and the image edges overlap, making them difficult to distinguish; 3) SUSAN fails to extract the edge features of the images effectively.

After data preprocessing, the images input to the networks have obvious contour features, which provides a good basis for classification by the CNNs. Next, the images are classified by the improved CNNs.

Initial setup of OP-CNN: convolutional layer C1 has 20 channels with 5 × 5 kernels and a stride of 1, and the activation function is the sigmoid function; the pooling layer size is 2 × 2 with a stride of 1, using mean pooling; convolutional layer C2 has 100 channels with 12 × 12 kernels and a stride of 1; the fully connected layer has 100 neurons, and the classification layer has 7 outputs, corresponding to the ideal product and the six kinds of defective prints in Fig. 6. Softmax is used as the classification function, but the loss function of OP takes the form of Eq. (14), with weights α = 0.9 and β = 0.1, classification number m = 7, and number of convolution kernels n = 120. The loss function is minimized by the gradient descent method to update the weights, with a network learning rate of 0.001; the weights and thresholds are updated once for every sample learned. From the seven groups of images, 400 images are extracted from each group to form 7 × 400 training samples, and the remaining 7 × 100 images are used as test samples. The images are resized to 28 × 28 before being input into OP-CNN. The number of training steps is an important parameter for CNNs: with too few steps the network is undertrained, while with too many it is prone to overfitting and incurs excessive computation cost. After many experiments, 25 training steps were found to achieve good performance.
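Under the stated settings (α = 0.9, β = 0.1), the OP-CNN loss of Eq. (14) can be sketched as follows, with the classification loss E1 passed in precomputed (a hedged NumPy sketch; the function names are illustrative, not from the paper):

```python
import numpy as np

def similarity_loss(c_op, c_pp):
    """Sum of squared differences between corresponding
    convolution outputs of OP-CNN and PP-CNN
    (one summand of the second term of Eq. (14))."""
    return float(np.sum((c_op - c_pp) ** 2))

def op_loss(e1, op_feats, pp_feats, alpha=0.9, beta=0.1):
    """Combined OP-CNN loss: alpha * classification loss plus
    beta * similarity summed over all convolution kernels."""
    e2 = sum(similarity_loss(co, cp) for co, cp in zip(op_feats, pp_feats))
    return alpha * e1 + beta * e2
```

Here `op_feats` and `pp_feats` are lists of per-kernel feature maps; the second term pulls the OP convolution outputs toward those of the pretrained PP network.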

Initial setup of PP-CNN: PP-CNN has the same structure as OP-CNN and essentially the same parameter settings, but the MSE between the network output y and the expected output is used as the loss function in the form of Eq. (13), and the input is a local image. In practice, register inspection of printed images is not concerned with the whole picture but with the ROI of the images. Taking the research objects of this paper as an example, the ROI are shown in Fig. 9.

Fig. 9
figure 9

ROI of printed images for register: the seven pictures correspond to the seven classification forms in this paper. In real applications, register detection focuses on the blue box areas, which are the ROI of the images

Within the blue boxes, by concentrating on the relative positions between the patterns and the contours, the types of register defects can be determined without involving the patterns outside these regions. Inspired by this practical observation, taking the ROI as the input of PP-CNN and ignoring the other regions reduces the network computation and helps to extract the key features.

Auxiliary classification part of the improved CNNs: as explained in the introduction, cross lines are a common practical means of register detection. Although cross lines cannot identify other printing defects, they are effective for identifying register errors. In this paper, cross lines are used as an important means to optimize the CNN-based register classification. The specific process is as follows:

The cross lines corresponding to the images to be inspected are extracted, and the Harris method is employed to detect the corners of the cross lines, as shown in Fig. 10.

Fig. 10
figure 10

Corner dots extracted from cross lines by the Harris method: the red dots are the corner dots identified by the Harris method. The corner dots in zone 1 are valid, while those in zone 2 are invalid

The Harris method identifies corner dots in zone 1; by calculating the relative positions of the corner dots along the X-axis and Y-axis, it can be judged whether the register deviation is along the X-axis or the Y-axis. The Harris method also identifies some invalid corner dots, as in zone 2. If the relative positions between the corner dots in zone 2 and the other corner dots were calculated, they would cause misjudgment of whether there is a deviation and of its direction. The abnormal dots are eliminated and the direction of register deviation is determined as follows:

$$ {\displaystyle \begin{array}{l}\mathrm{if}\kern0.34em \beta >\max \left(|{x}_i-{x}_j|,|{y}_i-{y}_j|\right)>\alpha, \kern0.5em i=1,2,...,N-1;\kern0.34em j=2,3,...,N:\kern0.5em \mathrm{there}\ \mathrm{is}\ \mathrm{register}\ \mathrm{error}\\ {}\max \left(|{x}_i-{x}_j|\right)>\max \left(|{y}_i-{y}_j|\right):\kern0.5em \mathrm{there}\ \mathrm{is}\ \mathrm{X}\hbox{-} \mathrm{axis}\ \left(\mathrm{left}\ \mathrm{or}\ \mathrm{right}\right)\ \mathrm{deviation}\ \mathrm{in}\ \mathrm{register}\\ {}\max \left(|{y}_i-{y}_j|\right)>\max \left(|{x}_i-{x}_j|\right):\kern0.5em \mathrm{there}\ \mathrm{is}\ \mathrm{Y}\hbox{-} \mathrm{axis}\ \left(\mathrm{up}\ \mathrm{or}\ \mathrm{down}\right)\ \mathrm{deviation}\ \mathrm{in}\ \mathrm{register}\end{array}} $$
(16)

xi and xj are the X-axis coordinates, and yi and yj are the Y-axis coordinates. α is the minimum deviation allowed in register, and β is used to eliminate the abnormal dots: in reality, register deviation cannot be very large, so if a calculated distance exceeds β it can be regarded as a distortion. After experiments, α is set to 4 and β to 30. Cross lines can judge whether there is an X-axis or a Y-axis deviation, but they cannot determine the exact type of register error (up or down, left or right); computer vision is needed to determine the final type.
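The elimination of abnormal dots and the deviation judgment of Eq. (16), with α = 4 and β = 30 as determined above, can be sketched in plain Python (a minimal sketch; the tie-breaking when the X- and Y-ranges are equal is an assumption not specified in the paper):

```python
from itertools import combinations

def register_deviation(points, alpha=4, beta=30):
    """Judge register deviation from Harris corner dots, Eq. (16).

    Pairwise distances of beta or more are treated as distortions
    from invalid corner dots and discarded; a maximum distance of
    alpha or less is within the allowed register tolerance.
    Returns 'none', 'x' (left/right) or 'y' (up/down).
    """
    dx_max = dy_max = 0
    for (xi, yi), (xj, yj) in combinations(points, 2):
        dx, dy = abs(xi - xj), abs(yi - yj)
        if max(dx, dy) >= beta:        # abnormal dot pair, ignore
            continue
        dx_max, dy_max = max(dx_max, dx), max(dy_max, dy)
    if max(dx_max, dy_max) <= alpha:
        return 'none'                  # register within tolerance
    return 'x' if dx_max > dy_max else 'y'
```

A point such as the invalid corner in zone 2 produces distances above β against every valid corner, so all of its pairs are skipped and it cannot cause a misjudgment.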

The results of Eq. (16) are sent to the classification layer of OP-CNN as weights; according to the cross-line result, different weights are set in the classification layer. The weight rules are shown in Fig. 11.

Fig. 11
figure 11

Weight rules of the classification layer in OP-CNN: the 1st-7th outputs correspond to the seven classifications; the 1st, 6th and 7th share weight ω1, the 2nd and 3rd share weight ω2, and the 4th and 5th share weight ω3

When the cross lines indicate overlap, the register-error classifications are locked, that is, the CNN's probability judgments for register defects are ignored and only the non-register defects are classified; when the cross lines indicate deviation, the non-register-defect classifications are locked, that is, the CNN's probability judgments for non-register defects are ignored and only the register defects are classified. The weight values used in this paper are based on full trust in the cross-line classification. Where an auxiliary classification is less trusted, a conservative weight assignment is recommended, i.e., ω1, ω2, ω3 are given specific weights (0 < ω < 1) instead of 0 or 1.
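The locking rule can be sketched as follows; the class order and the mapping of ω2 to the up/down classes and ω3 to the left/right classes are assumptions for illustration, following Fig. 11:

```python
def classification_weights(cross_result):
    """Classification-layer weights from the cross-line result
    (Fig. 11). Assumed class order: [ideal, up, down, left,
    right, leakage, font overlap]."""
    if cross_result == 'none':        # cross lines overlap
        w1, w2, w3 = 1, 0, 0          # lock register-error classes
    elif cross_result == 'y':         # up/down deviation
        w1, w2, w3 = 0, 1, 0          # lock non-register classes
    else:                             # 'x': left/right deviation
        w1, w2, w3 = 0, 0, 1
    return [w1, w2, w2, w3, w3, w1, w1]

def reweight(probs, weights):
    """Apply the weights to the softmax outputs and renormalize."""
    scaled = [p * w for p, w in zip(probs, weights)]
    s = sum(scaled)
    return [v / s for v in scaled]
```

With the conservative assignment, the 0/1 values in `classification_weights` would simply be replaced by fractional weights between 0 and 1.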

The specific process of improved CNNs in register detection is shown in Fig. 12.

Fig. 12
figure 12

Flow of the improved CNNs in register detection: in the 1st step, PP-CNN is trained in advance and the outputs of its convolution layers are determined; in the 2nd step, OP-CNN is trained, with the loss function constructed from Eq. (14), to update the convolution layer weights and thresholds of OP-CNN; in the 3rd step, the cross lines are classified by the Harris method and Eq. (16), updating the classification layer weights of OP-CNN; finally, the classification results are obtained by OP-CNN

5 Results

The different preprocessing methods for edge detection described above, combined with the classification methods, are employed to verify the register detection performance. The correct classification rate (CCR) and the MSE of the loss function during CNN training are used as indicators. The comparison plots are shown in Fig. 13.

Fig. 13
figure 13

Performance Comparison: (a)~(b) are plots of the classification precision of the CNNs under the different preprocessing methods; (c)~(d) are plots of the MSE of the CNN outputs under the different preprocessing methods. For comparison with the classic CNN, the MSE of the improved CNNs comes from the first term on the right-hand side of Eq. (14) and does not involve the second term

It can be seen from the comparison plots that the images preprocessed by Zernike, Sobel and LoG form the better-performing group, while those preprocessed by SUSAN, FIR and MMG form the poorer group. After 25 training steps, the CCR of the training samples processed by Zernike is always the highest, in both the classic CNN and the improved CNNs, and the minimum MSE is also obtained after training on 7 (types) × 400 (sheets) × 25 (steps) samples, which shows the excellent processing ability of Zernike. The images processed by Sobel and LoG also achieve good CCR, but not as good as Zernike. The images processed by SUSAN, FIR and MMG do not yield effective features, which makes the classification accuracy of the CNNs unsatisfactory; in particular, the CCR of the MMG-processed training samples fluctuates greatly during training. These three methods can hardly be employed in register detection. The data comparison of the methods is shown in Table 2.

According to Table 2, the same conclusion can be drawn as from Fig. 13. It is particularly noteworthy that the classification performance of the improved CNNs is better than that of the classic CNN. In terms of the time index, because the images processed by Zernike have good noise tolerance, the CNN classification training needs less data processing, so the training time is also shorter. Figure 13 and Table 2 demonstrate that the performance of the Zernike-CNNs surpasses the other methods in all aspects.

Table 2 Performance comparison of several edge detection methods in CNNs

The printed images processed by Zernike are input to CNNs, and the process in CNNs is shown in Fig. 14.

Fig. 14
figure 14

Processing of printed images in the CNNs, from top to bottom corresponding to the ideal printed image and the six types of mis-register: (a) convolution layer C1 outputs; (b) pooling layer S1 outputs; (c) convolution layer C2 outputs

Figure 14(a)~(c) respectively show the processing results of the printed matter in convolution layer C1, pooling layer S1 and convolution layer C2 of the CNNs. The classification results of the test samples by the Zernike-CNNs are shown in Table 3.

Table 3 Classification of printing defects with Zernike-CNNs

In Table 3, 700 test sheets are classified and a total of 41 sheets are wrongly classified, for a total error rate of 5.86%. In practice, printing enterprises can tolerate misclassification among the defect types of defective products, so only 14 of the 700 misjudged sheets are intolerable. 4 of the 100 ideal sheets are wrongly classified as defective, a false detection rate of 4%; and 10 of the 600 defective sheets are wrongly classified as ideal, a missing rate of 1.67%. As mentioned in the introduction, the false error rates of a certain type of quality inspection machine fluctuate from 6.8% to 21.4% for every 100,000 sheets, and a missing rate of no more than 2% is required by non-special printing enterprises. The false detection rate and missing rate of the Zernike-CNNs therefore meet the requirements of such enterprises. The experimental results verify that the improved CNNs combined with ZMs can be applied to register detection. In particular, the proposed approach avoids misclassification between good prints and register defects: none of the four types of register errors (up, down, left, right) is mistakenly classified as a good product, and no good product is mistakenly classified as one of the four register errors, which is the biggest advantage of this method.
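The rates reported above follow directly from the counts in Table 3, as a quick arithmetic check:

```python
# Counts reported in Table 3 (700 test sheets in total)
total_sheets, total_wrong = 700, 41
ideal_sheets, false_detections = 100, 4   # ideal classified as defective
defect_sheets, missed = 600, 10           # defective classified as ideal

total_error_rate = total_wrong / total_sheets            # ~5.86 %
false_detection_rate = false_detections / ideal_sheets   # 4 %
missing_rate = missed / defect_sheets                    # ~1.67 %
```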

6 Conclusions and outlook

In this paper, aiming at register detection in the printing industry, a detection approach based on Zernike-CNNs is proposed. Zernike moments are exploited to extract the edge features of printed images, and CNNs are employed to classify the defects. In order to reduce the computational complexity of ZMs, a fast method for the Zernike radial polynomials is derived. Improved CNNs are investigated to improve the classification accuracy: based on the classic CNN, the improved CNNs adopt a parallel CNN to enhance local features and an auxiliary classification part to modify the classification layer weights. The experimental results show that the MSE of the training samples reaches 0.0143, and the detection accuracies of the training samples and test samples reach 91.43% and 94.85%, respectively. Zernike is compared with Sobel, LoG, SUSAN, FIR, MMG and other preprocessing methods, and the improved CNNs are compared with the classic CNN; the approach shows advantages in processing speed and extraction accuracy, so it is effective for register detection.

In this paper, the structure and main parameters of the CNNs (e.g. the size, number and stride of the convolution kernels, the number of training steps, and the number of network layers) were obtained by experiment and experience, without theoretical guidance at present; under-fitting or over-fitting of the CNNs also occurred during the experiments, so there is room for further optimization. In addition, this approach addresses the detection of printing register defects, but there are many other printing troubles, such as ink spots, print-through and color variation, which are not covered by this method. These problems remain academically difficult and are directions for future research.