1 Introduction

Chua [1] was the first to propose the memristor hypothesis. After nearly 40 years of purely theoretical study, HP Labs [2] fabricated a physical memristor in 2008, which attracted a great deal of attention, and various kinds of memristors based on different materials have emerged since. The memristance can be varied by applying different voltages or currents, and the value is retained after the voltage or current source is withdrawn, making memristors popular candidates for synapses. Besides nonvolatility, the memristor is characterized by nanoscale size and power efficiency, which makes it possible to employ memristors in neural networks [3,4,5,6], neuromorphic computing [7, 8], approximate computing [9] and memories [10]. Many researchers have applied memristors in different architectures, such as learning architectures [3], computing architectures [11] and Computation-in-Memory architectures [12], to obtain better performance. Moreover, different circuits based on memristors have been proposed. Adhikari et al. [3] presented the memristor bridge synapse used to implement the random weight change algorithm in multilayer neural networks, while many scholars have utilized memristor-based crossbars to realize matrix-vector operations [8, 13], neuromorphic character recognition [14] and gradient-descent-based learning algorithms [15, 16]. In addition, there are many other studies on memristors [17,18,19,20,21] in the literature.

Nowadays, the study of neural networks is developing rapidly. Recently, Wen et al. [22] used a neuroadaptive control approach to solve the distributed consensus tracking problem for a class of multiagent systems with unmodeled dynamics and unknown disturbances, and many studies of neural networks and their applications have followed [23,24,25,26]. The convolutional neural network (CNN), a kind of deep learning neural network, was originally inspired by studies in neuroscience, and a classical CNN architecture was first proposed by Lecun et al. [27]. Compared with traditional neural networks, CNNs benefit from weight sharing, which reduces the number of parameters that need to be trained. In addition, they are good at recognizing images under displacement, zoom, rotation and other forms of distortion. Therefore, CNNs are very popular for pattern recognition and classification, such as human face recognition [28], traffic sign recognition [29] and object recognition [30]. Krizhevsky et al. [31] used deep convolutional neural networks to classify more than a million images into 1000 different classes, achieving a new state of the art in classification. Building on deep convolutional neural networks, Szegedy et al. [32] increased the depth and width of the network while keeping the computational budget constant to improve its recognition and detection ability.

Many reports on convolutional neural networks have been published, and almost all of them rely on software simulation. Software runs serially, whereas hardware computes inherently in parallel. However, realizing convolutional neural learning in fast, compact and reliable hardware is a difficult task. A critical problem is that conventional hardware components cannot store weights in a nonvolatile way. In addition, convolution operations are complex for hardware to execute, containing many multiplication and addition operations. Since memristors are nonvolatile and nanoscale, it is natural to apply them in convolutional neural networks to speed up the computation. In this paper, the bipolar memristor with threshold put forward by Yuriy et al. [33] is applied. Like other kinds of memristors, this device is nonvolatile, nanoscale and power efficient, but its memristance keeps its previous value as long as the applied voltage is below the threshold.

Considering the traits of the convolution operation and the bipolar memristor with threshold, an architecture with memristors is designed to realize the convolution operation. The remainder of the article is structured as follows. Section 2 describes the memristor used in the design, as well as the modified convolution computation. Section 3 proposes the computation architecture, details the computation procedure and builds the computation circuits. Section 4 demonstrates the simulation results.

2 Background

2.1 Memristor model

Since researchers at HP Labs fabricated the first physical memristors, interest in memristors has surged. Many memristors based on different materials with diverse electrical properties have been discussed. Yuriy et al. [33] put forward the bipolar memristor with threshold. The memristor model is defined as follows:

$$ I = x^{-1} V_{\mathrm{m}}, $$
(1)
$$ \frac{\mathrm{d}x}{\mathrm{d}t} = f(V_{\mathrm{m}})\, W(x, V_{\mathrm{m}}), $$
(2)

where x is the internal state variable, representing the memristance R, \(f(V_{\mathrm{m}})\) is a function modeling the device threshold property, and \(W(x, V_{\mathrm{m}})\) is a window function:

$$ f(V_{\mathrm{m}}) = \beta \left( V_{\mathrm{m}} - 0.5 \left( \left| V_{\mathrm{m}} + V_{\mathrm{t}} \right| - \left| V_{\mathrm{m}} - V_{\mathrm{t}} \right| \right) \right), $$
(3)
$$ W(x, V_{\mathrm{m}}) = \theta (V_{\mathrm{m}})\, \theta (R_{\mathrm{off}} - x) + \theta (-V_{\mathrm{m}})\, \theta (x - R_{\mathrm{on}}), $$
(4)

where \(\theta (x)\) is the step function, \(\beta \) is a positive parameter characterizing the rate of memristance change when \(\left| V_{\mathrm{m}} \right| > V_{\mathrm{t}} \), and \(V_{\mathrm{t}}\) is the threshold voltage. \(R_{\mathrm{on}}\) and \(R_{\mathrm{off}}\) are the limiting values of the memristance R. In Eq. (4), the role of the step functions is to restrict the memristance change to the interval between \(R_{\mathrm{on}}\) and \(R_{\mathrm{off}}\). In order to avoid convergence problems, the step function is modified as:

$$ \theta_{\mathrm{s}}(x) = \frac{1}{1 + \exp \left( -\frac{x}{b} \right)}, $$
(5)

where b is a constant parameter. The absolute-value function is adapted similarly:

$$ \mathrm{abs}_{\mathrm{s}}(x) = x \left[ \theta_{\mathrm{s}}(x) - \theta_{\mathrm{s}}(-x) \right]. $$
(6)
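To make the model concrete, the following minimal Python sketch integrates Eqs. (1)–(6) with a forward-Euler step. All parameter values (BETA, V_T, R_ON, R_OFF, B, the drive amplitude and the time step) are placeholders chosen only for illustration, not values from the original design.

```python
import numpy as np
from scipy.special import expit

# Illustrative placeholder parameters, NOT values from the paper's design
BETA = 1e5               # rate of memristance change (ohm / (V s)) above threshold
V_T = 4.0                # threshold voltage (V)
R_ON, R_OFF = 1e3, 1e4   # limiting memristance values (ohm)
B = 1e-3                 # smoothing constant b in Eq. (5)

def theta_s(x):
    """Smoothed step function, Eq. (5)."""
    return expit(x / B)

def abs_s(x):
    """Smoothed absolute-value function, Eq. (6)."""
    return x * (theta_s(x) - theta_s(-x))

def f(v):
    """Threshold function, Eq. (3): drift is zero while |v| < V_T."""
    return BETA * (v - 0.5 * (abs_s(v + V_T) - abs_s(v - V_T)))

def window(x, v):
    """Window function, Eq. (4): confines x to [R_ON, R_OFF]."""
    return theta_s(v) * theta_s(R_OFF - x) + theta_s(-v) * theta_s(x - R_ON)

def simulate(v_of_t, t, x0=5e3):
    """Forward-Euler integration of Eq. (2); returns the state trace x(t)."""
    x = np.empty_like(t)
    x[0] = x0
    for k in range(1, len(t)):
        v = v_of_t(t[k - 1])
        x[k] = x[k - 1] + (t[k] - t[k - 1]) * f(v) * window(x[k - 1], v)
    return x

# Sinusoidal drive as in Fig. 1, with amplitude above V_T so the state moves
t = np.linspace(0.0, 2.0, 20001)
x = simulate(lambda tt: 5.0 * np.sin(2.0 * np.pi * tt), t)
```

Plotting x against t reproduces the qualitative behavior discussed below: the state stays constant while \(\left| V_{\mathrm{m}} \right| < V_{\mathrm{t}}\) and drifts between \(R_{\mathrm{on}}\) and \(R_{\mathrm{off}}\) otherwise.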

When a sinusoidal voltage source, as shown in Fig. 1, is applied to the device, the change of the state x(t) is shown in Fig. 2.

Fig. 1 The sinusoidal voltage source

Fig. 2 The change of the state variable x(t)

By studying these figures, we can conclude that the memristance stays constant when the applied voltage is below the threshold, and varies between \(R_{\mathrm{on}}\) and \(R_{\mathrm{off}}\) otherwise. Therefore, the threshold memristor model is adopted in the design and simulations.

2.2 Image convolution computation

Image convolution computation aims to extract information from input images. Addison et al. [34] studied the feature-extraction performance of several kinds of neural networks. Image convolution is an important part of CNNs. An image can be considered as a matrix, so the image convolution operation is the same as the matrix convolution operation. Let A be an \(r_1 \times c_1\) matrix and B be an \(r_2 \times c_2\) matrix. Generally, a 2-D convolution is defined as:

$$ g(s,t) = \sum \limits_{r = 1}^{r_1 + r_2 - 1} \sum \limits_{c = 1}^{c_1 + c_2 - 1} f(r,c)\, h(s - r + 1, t - c + 1). $$
(7)

As Eq. (7) shows, each output element requires \((r_1 + r_2 - 1) \times (c_1 + c_2 - 1)\) multiplication and addition operations, which is very costly, and multiplications in particular are time consuming to execute in a software program. To compute an image convolution in a convolutional neural network efficiently, the computation is altered as:

$$ Y_{11} = R \left( \sum \limits_{i = 1}^{M} \sum \limits_{j = 1}^{N} W_{ij} x_{ij} \right), $$
(8)

where \(Y_{11}\) is the first element of the output matrix Y, i.e., the output of the first convolution step, when the convolution kernel overlaps the top-left corner of the input image. The complete output has size \((I - M + 1) \times (J - N + 1)\), where I and J denote the numbers of rows and columns of the input matrix A, respectively. \(W_{ij}\) is a weight of the kernel, \(x_{ij}\) is the input converted from the input image data or from the subsampling layer, and R is a constant.
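For a single output element, Eq. (8) reduces to an R-scaled inner product between the kernel and the overlapped input patch. A minimal Python sketch, with R = 1 as an arbitrary illustrative default rather than a value from the paper:

```python
import numpy as np

def conv_element(W, X_patch, R=1.0):
    """Eq. (8): one output element as an R-scaled weighted sum.

    W       -- M x N kernel weights
    X_patch -- M x N input patch currently overlapped by the kernel
    """
    return R * np.sum(W * X_patch)
```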

3 The computation architecture

3.1 Design of the architecture

As Eq. (1) shows, the current flowing through a memristor is the product of the voltage and the conductance (\(G = x^{-1}\)), so the multiplication operation can be carried out by a memristor. Kirchhoff's Current Law (KCL) states that at any node (junction) in an electrical circuit, the sum of the currents flowing into the node equals the sum of the currents flowing out of it. Based on KCL, a novel computation architecture implementing Eq. (8) is proposed, as shown in Fig. 3.

Fig. 3 The computation architecture

Collecting the currents from all branch circuits, the circle in Fig. 3 outputs their summation, which is multiplied by the resistance to yield the output. Here \(W_{ij}\) represents the conductance of a memristor, and R is a resistor that transforms the current into a voltage, which is convenient for the subsampling stage. The architecture presented can calculate only one element of the output matrix.
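As a sanity check of the current-summing idea, the sketch below (with made-up voltages and conductances, not values from the paper) computes the branch currents \(i_{ij} = G_{ij} v_{ij}\) via Eq. (1), sums them at the node as KCL dictates, and converts the total current into a voltage through R:

```python
import numpy as np

# Hypothetical 3 x 3 example: input voltages (V) and memristor conductances (S)
V = np.array([[0.1, 0.2, 0.0],
              [0.3, 0.1, 0.2],
              [0.0, 0.1, 0.1]])
G = np.full((3, 3), 1e-3)   # every memristor programmed to 1 kOhm

I_branch = G * V            # Ohm's law per branch, Eq. (1)
I_total = I_branch.sum()    # KCL: all branch currents merge at the summing node
Y11 = 1e3 * I_total         # output resistor R = 1 kOhm converts current to voltage
```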

In order to complete the whole computation, the calculation procedure is described in detail in Algorithm 1, where t is a temporary variable. The numbers of rows and columns of the kernel are M and N, respectively, while those of the input image are I and J. Running Algorithm 1 once yields one feature map; to extract different features, different kernels are required.

Algorithm 1 The convolution computation procedure
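Since Algorithm 1 is reproduced here only as a figure, the following Python sketch gives one plausible reading of the sliding-window procedure it describes; the loop bounds follow the \((I - M + 1) \times (J - N + 1)\) output size from Sect. 2.2:

```python
import numpy as np

def feature_map(X, W, R=1.0):
    """Slide the M x N kernel W over the I x J input X, per our reading of Algorithm 1."""
    I, J = X.shape
    M, N = W.shape
    Y = np.zeros((I - M + 1, J - N + 1))
    for s in range(I - M + 1):
        for t in range(J - N + 1):   # t doubles as the temporary loop variable
            Y[s, t] = R * np.sum(W * X[s:s + M, t:t + N])   # Eq. (8) per element
    return Y
```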

3.2 Building circuits

The circuit using electrical elements to implement the function shown in Fig. 3 is proposed in Fig. 4, where \(i_{11}, i_{12}, \ldots, i_{mn}\) are the currents flowing through the different memristors. I represents a current-controlled current source (CCCS) whose value is the summation of \(i_{11}, i_{12}, \ldots, i_{mn}\). \(Y_{11}\) is the voltage across the resistor, and it is one element of the feature map. To acquire the whole feature map, two alternative methods can be adopted: copy the circuit so that all elements are computed simultaneously, or reuse a single circuit and wait until each calculation completes. Obviously, the first method saves time but requires more circuit elements in return, so a trade-off between speed and cost needs to be handled.

Fig. 4 Circuits to implement the convolution operation

4 Simulation and analysis

HSPICE is compatible with most SPICE variants and offers advantages in convergence, accurate modeling, etc. Memristors are nanoscale devices and sensitive to their environment, so in order to obtain accurate simulation results, HSPICE is used for the simulations. Using the kernel

$$ \left( \begin{array}{rrr} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{array} \right), $$

we can extract the edge information from the input images. The input image is shown in Fig. 5.

Fig. 5 Binary image used for testing

Since the input image is a \(5 \times 5\) matrix and the kernel is a \(3 \times 3\) matrix, performing the convolution as Algorithm 1 shows requires nine convolution operations. Figure 6a–i shows the area of the input image overlapped by the convolution kernel at each step. For example, at the first step, Fig. 6a is the input; after it is convolved with the convolution kernel, the result is the first negative pulse shown in Fig. 7. Similarly, the result of Fig. 6b convolved with the kernel is the second pulse in Fig. 7, and so on. The complete output simulated by HSPICE is shown in Fig. 7.
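The exact pixel layout of Fig. 5 is not reproduced here, so the sketch below uses a made-up \(5 \times 5\) binary image (white = 1, black = 0) whose four black pixels are placed so that, as in the text, exactly four of the nine outputs come out negative; it illustrates the arithmetic only and is not the paper's test image:

```python
import numpy as np

# Hypothetical stand-in for Fig. 5: white = 1, black = 0, four black pixels
X = np.array([[1, 1, 1, 1, 1],
              [1, 0, 1, 0, 1],
              [1, 1, 1, 1, 1],
              [1, 0, 1, 0, 1],
              [1, 1, 1, 1, 1]])

K = np.array([[-1, -1, -1],
              [-1,  8, -1],
              [-1, -1, -1]])   # edge-extraction kernel from the text

# Nine sliding-window steps of Algorithm 1, one Eq. (8) evaluation each
Y = np.array([[np.sum(K * X[s:s + 3, t:t + 3]) for t in range(3)]
              for s in range(3)])
print(Y)   # four -8 entries (the black squares) and five positive entries
```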

Fig. 6 Area of the input image overlapped by the convolution kernel in each convolution operation; a–i are the inputs at the first through ninth steps, respectively

Fig. 7 Output after convolution, plotted in MATLAB

Fig. 8 Output after convolution shown as a binary picture

There are four negative pulses in Fig. 7, matching the number of black squares in Fig. 5. Taking the characteristics of the image convolution operation into account, the feature map is a \(3 \times 3\) matrix, so the white pixels of the feature map correspond to the remaining pixels of the input image, and their number equals that of the positive pulses. If the negative pulses are taken as black pixels and the positive pulses as white pixels, we can draw the picture shown in Fig. 8. The simulation results thus verify the proposed design.

5 Conclusion

Recently, CNNs have taken important roles in computer vision, artificial intelligence and other areas. Software implementations of convolutional neural networks can no longer meet today's speed requirements, so it is urgent to develop hardware implementations. The effectiveness of the proposed design is verified by HSPICE simulations. In future work, the architecture should be optimized and extended to realize different kinds of image processing.