
1 Introduction

Handwritten text analysis and recognition [20] has long been an important field of OCR [9] and has been a focus of research over the past decade [14]. From early rule-based methods to current deep learning methods, recognition accuracy has improved continuously. Depending on the representation, handwritten text recognition is divided into online and offline recognition. Offline characters are represented by two-dimensional static images, while online characters are represented by continuous coordinate sequences that also capture the trajectory, speed, and angle of the pen during writing. As a result, online handwriting recognition is usually more accurate than offline recognition. However, offline text is easier to collect, better suited to practical application scenarios, and more widely applicable. If the dynamic information of the text can be recovered from the two-dimensional static image, static and dynamic information can be combined to further improve recognition accuracy. Moreover, handwriting reconstruction is widely used in smart writing and handwriting identification [7].

Fig. 1. The framework of the handwriting reconstruction method.

Current character handwriting reconstruction methods include graph search, template matching, writing rules, and deep learning based methods [6, 20]. Graph search methods [17] find the path with the least cost according to a minimum energy criterion; they are only suitable for recovering the writing order of digits and letters. Template matching methods [11] build a stroke template library and restore the trajectory by comparing the input image against the templates; they apply more widely and achieve higher accuracy, but computing the best path during matching is too expensive. Methods based on writing rules [2] use the structural characteristics of characters to express the relationships between strokes and then apply rules to restore their order; their disadvantage is that they cannot adapt to changes in writing style or handle broken strokes. The deep learning method of [19] applies a series of preprocessing steps to the image and then predicts the ordering relation of each pixel through a network; it adapts poorly to complicated text with many strokes. Other deep learning methods such as [6] extract a feature sequence from the two-dimensional static image and generate the handwriting sequence through an RNN and a fully connected network; they adapt poorly to samples with complex fonts and a wide range of stroke counts.

When a person writes, visual attention moves with the handwriting. In machine vision, we express this as the response probability of each position at different times, which should be concentrated in a certain area or point. Therefore, this paper proposes a handwriting reconstruction method based on a spatial-temporal encoder-decoder network, which simulates the movement of human visual attention [18] by predicting the probability of each point on the image at different times.

The rest of this article is organized as follows. Section 2 introduces the proposed Spatial-Temporal Encoder-Decoder Network in detail. Section 3 explains the proposed reconstruction constraints. Section 4 describes the composition of the loss function. Section 5 presents the experiments and results, and the last section concludes.

2 Spatial-Temporal Encoder-Decoder Network

In this section, we describe in detail how the proposed method generates online handwriting sequences from offline images. As mentioned earlier, we do not output coordinates directly; instead, we output the absolute position of the maximum-probability point at each temporal step. The Spatial-Temporal Encoder-Decoder Network consists of three modules that together generate the handwriting sequence: a key point detector module, a spatial encoder module, and a temporal decoder module. The spatial encoder module is the backbone of the model; it is essentially a variant FCN and outputs spatial features for each position of the offline image (Fig. 3 shows its structure). The key point detector module is a branch of the backbone network that outputs and classifies all key points of the character. A recurrent neural network, GRU [3], and a multi-layer perceptron (MLP) form the temporal decoder, which combines the spatial features to output the heat map sequence. The overall framework is shown in Fig. 1, and a minimal code sketch follows below.
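For concreteness, here is a minimal runnable PyTorch sketch of how the three modules could be wired together. The layer choices and names are our own assumptions, not the authors' released code; only the tensor shapes follow Sect. 5.2 (512x512 input, 64x64 output maps, d = 128, a 64-dimensional GRU state).

```python
import torch
import torch.nn as nn

class STEncoderDecoder(nn.Module):
    """Sketch of the three-module pipeline (hypothetical layer choices)."""

    def __init__(self, d=128, hidden=64):
        super().__init__()
        # Stand-in spatial encoder: strided convs reducing 512 -> 64.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU(),
            nn.Conv2d(64, d, 3, 2, 1), nn.ReLU())
        # Key point branch: one heatmap per class (end / connection points).
        self.kp_head = nn.Conv2d(d, 2, 1)
        # Temporal decoder: GRU plus an MLP scoring every spatial position.
        self.gru = nn.GRUCell(d, hidden)
        self.score = nn.Sequential(nn.Linear(hidden + d, 64), nn.Tanh(),
                                   nn.Linear(64, 1))

    def forward(self, image, steps):
        feat = self.encoder(image)                    # (B, d, 64, 64)
        keypoints = self.kp_head(feat).sigmoid()      # end/connection heatmaps
        B, d, H, W = feat.shape
        a = feat.flatten(2).transpose(1, 2)           # (B, L, d) spatial features
        h = feat.new_zeros(B, self.gru.hidden_size)
        maps = []
        for _ in range(steps):
            e = self.score(torch.cat([h.unsqueeze(1).expand(-1, H * W, -1), a], -1))
            p = torch.softmax(e.squeeze(-1), dim=1)   # response over positions
            c = a[torch.arange(B), p.argmax(dim=1)]   # context = best point feature
            h = self.gru(c, h)                        # advance the hidden state
            maps.append(p.view(B, H, W))
        return keypoints, torch.stack(maps, dim=1)
```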

2.1 Key Point Detector

The key point detector module [1, 5, 16] regresses the position of each candidate point through an FCN [13]. Fully convolutional networks have better spatial generalization capabilities than fully connected networks, and they are also more stable for position regression, which is why they are increasingly used for key point detection [1]. This module detects all key points, divides them into two categories, end points and connection points, and provides this information to the reconstruction constraint module. A fully convolutional network contains only convolutional, pooling, and activation layers; specific parts are selected through the information connections between them, which is valuable for detection tasks (Fig. 2).

Fig. 2. The structure diagram of the key point detector network.

The detection network can identify the overall frame of the character and filter out its key parts. It is sensitive to the turning points of line segments: even on curves with small curvature it can identify subtle turning points, which are reflected in the output heat map. While detecting the key points, the detection network also determines the length of the output coordinate sequence, enabling variable-length sequence generation. A sketch of how discrete key points could be extracted from the predicted heatmaps follows.
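As an illustration, discrete key points can be pulled out of such heatmaps with a simple max-pooling non-maximum suppression; this snippet is our sketch of the idea (the threshold is illustrative), not the paper's exact procedure.

```python
import torch.nn.functional as F

def extract_keypoints(heatmap, thresh=0.5):
    """Extract key points from a (2, H, W) heatmap by NMS via max pooling.
    Channel 0 holds end points, channel 1 holds connection points."""
    pooled = F.max_pool2d(heatmap.unsqueeze(0), 3, stride=1, padding=1)[0]
    peaks = (heatmap == pooled) & (heatmap > thresh)   # local maxima above threshold
    end_pts = peaks[0].nonzero()                       # (k, 2) rows of (y, x)
    conn_pts = peaks[1].nonzero()
    return end_pts, conn_pts
```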

2.2 Spatial Encoder Network

The key point detector network can find the key points of the character, but it only extracts position information and cannot analyze the relationships between points: its feature map has the same size as the output image, its receptive field is limited, and deeper sequential features are not extracted. An FCN cannot extract deeper features well without changing the size of the feature map, so this work is done by the spatial encoder network. The spatial encoder network is a special fully convolutional network designed to extract the deep-level visual features of the image and obtain a larger receptive field while keeping the feature map at a fixed size. It is therefore composed of an FCN [13], briefly introduced above, and a U-Net [12], which is an FCN with a special structure. U-Net consists of two parts: a contracting path and an expanding path. The contracting path captures contextual information at different scales, and the expanding path supplements some of the deep-level information of the image. Because this supplement is necessarily incomplete, skip connections are needed to merge in the higher-resolution feature maps from the contracting path. Since the proposed method generates heat maps of handwriting points through the FCN, the output size must respect the size ratio of handwriting points in the original image. The U-Net is added to maintain a feature map of a fixed scale while extracting deeper features, and its enlarged receptive field also captures the stroke texture information of the image.

Fig. 3. The structure of the spatial encoder network.

The specific structure of the spatial encoder network is shown in Fig. 3. The spatial encoder network encodes the image as a tensor of size \(d\times H'\times W'\). We denote these coding features as in Eq. 1,

$$\begin{aligned} a=\{a_1,a_2,a_3,...,a_L\} ,a_i\in R^d ,L=H'\times W' \end{aligned}$$
(1)

where d is the dimension of \(a_i\) and L is the number of spatial positions. A minimal sketch of this flattening follows.
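In code this is just a reshape; a minimal sketch with the shapes of Sect. 5.2 (d = 128, H' = W' = 64):

```python
import torch

feat = torch.randn(1, 128, 64, 64)      # encoder output of size d x H' x W'
a = feat.flatten(2).transpose(1, 2)     # (1, L, d): the sequence {a_1, ..., a_L}
print(a.shape)                          # torch.Size([1, 4096, 128]), L = 64 * 64
```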

2.3 Temporal Decoder Network

The Temporal Decoder Network is essentially a candidate determiner composed of an MLP and a GRU [3]. To link the offline image with the variable-length output sequence, the paper [18] computes an intermediate vector that provides a regional feature filter for subsequent recognition and classification; we instead use this intermediate vector to output the coordinate points we need in the image heat map. Figure 4 shows the workflow of the temporal decoder network, where the MLP is a multi-layer perceptron composed of several fully connected layers and serves as the output layer. The GRU [3] is an improved variant of the recurrent neural network (RNN) that mitigates vanishing and exploding gradients during training and has a much smaller memory footprint than LSTM [4] while achieving comparable results. The hidden state of the GRU is computed by Eqs. 2-5.

$$\begin{aligned} z_t = \sigma \left( W_{hz} H_{t-1}+U_{cz}C_t+b_z \right) \end{aligned}$$
(2)
$$\begin{aligned} r_t = \sigma \left( W_{hr} H_{t-1}+U_{cr}C_t+b_r \right) \end{aligned}$$
(3)
$$\begin{aligned} \widetilde{H_t} = \tanh \left( W_h\left( H_{t-1} \otimes r_t\right) +U_h C_t+b_h \right) \end{aligned}$$
(4)
$$\begin{aligned} H_t = \left( 1-z_t\right) \otimes H_{t-1} + z_t \otimes \widetilde{H_t} \end{aligned}$$
(5)

Here \(\sigma \left( \cdot \right) \) is the sigmoid function, and \(z_t\), \(r_t\), and \(\widetilde{H_t}\) are the update gate, the reset gate, and the candidate state, respectively. When the temporal decoder network predicts the position of the handwriting point at each time step, it outputs a probability for each area, so the output only needs to maximize the probability of the candidate area. The temporal decoder network combines the spatial features \(a_i\) with the current hidden state \(H_{t-1}\) of the GRU to compute the maximum-probability position of the handwriting point at the current moment (see Eqs. 6-7),

Fig. 4. The calculation process of the temporal decoder network.

$$\begin{aligned} e_{ti} = v^T_a \tanh \left( W_a H_{t-1} +U_a a_i\right) \end{aligned}$$
(6)
$$\begin{aligned} p_{ti} =\frac{\exp \left( e_{ti}\right) }{\sum _{i=0}^L \exp \left( e_{ti}\right) } \end{aligned}$$
(7)

where \(v_a \in R^n\), \(W_a \in R^{n'\times n}\), and \(U_a \in R^{n\times d}\). We then take the feature of the most probable point as the context \(C_t\) to strengthen the relationship between points, as in Eq. 8.

$$\begin{aligned} C_t = a\left[ \max \left( p\right) \right] \end{aligned}$$
(8)
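Eqs. 2-8 together describe one decoder step. As a hedged sketch, `nn.GRUCell` realizes the gated update of Eqs. 2-5, and three linear layers realize the additive attention of Eq. 6; the module layout and names are our assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class DecoderStep(nn.Module):
    """One temporal-decoder step: attention over spatial features using the
    previous hidden state H_{t-1}, context pick C_t, then the GRU update."""

    def __init__(self, d=128, n=64):
        super().__init__()
        self.gru = nn.GRUCell(d, n)            # Eqs. 2-5: H_t from C_t, H_{t-1}
        self.W_a = nn.Linear(n, n, bias=False)
        self.U_a = nn.Linear(d, n, bias=False)
        self.v_a = nn.Linear(n, 1, bias=False)

    def forward(self, a, h):
        # a: (B, L, d) spatial features; h: (B, n) hidden state H_{t-1}
        e = self.v_a(torch.tanh(self.W_a(h).unsqueeze(1) + self.U_a(a)))  # Eq. 6
        p = torch.softmax(e.squeeze(-1), dim=1)                           # Eq. 7
        c = a[torch.arange(a.size(0)), p.argmax(dim=1)]                   # Eq. 8: C_t
        return self.gru(c, h), p                                          # H_t, p_t
```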

To strengthen the path information, this article proposes a handwriting trend feature. The trend feature starts from a blank map \(\left( \beta \right) \) at the output scale, on which the predicted position is marked at each time step. Trend features are then extracted by convolution and fed to the MLP. The full calculation is given in Eqs. 9-12.

$$\begin{aligned} \beta = \left( 0 \right) \in F^{H' \times W'} \end{aligned}$$
(9)
$$\begin{aligned} \beta _t = \left( 1_{ij} \right) \in F^{H' \times W'}, i \in \left( 0,W' \right) ,j \in \left( 0,H' \right) \end{aligned}$$
(10)
$$\begin{aligned} F _t = f \left( \beta _t \right) \end{aligned}$$
(11)
$$\begin{aligned} e_{ti} = v^T_a \tanh \left( W_a H_{t-1} +U_a a_i + U_f f_t\right) ,f_t \in F_t \end{aligned}$$
(12)

where \(\beta _t\) is the trajectory map at time step t and \(f\left( \cdot \right) \) is a convolution module. The output \(e_{ti}\), which represents the response of each position at time step t, then enters the softmax of Eq. 7. A small sketch of the trend feature follows.
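A small sketch under our reading of Eqs. 9-11; the convolution module f is an illustrative choice, since the paper does not specify its layers.

```python
import torch
import torch.nn as nn

# Hypothetical convolution module f of Eq. 11.
trend_conv = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                           nn.Conv2d(8, 1, 3, padding=1))

beta = torch.zeros(1, 1, 64, 64)     # Eq. 9: blank trend map at the output scale

def update_trend(beta, index, W=64):
    i, j = index % W, index // W     # recover (x, y) from the flat argmax index
    beta[0, 0, j, i] = 1.0           # Eq. 10: mark the predicted position
    return trend_conv(beta)          # Eq. 11: F_t, fed into Eq. 12 via U_f
```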

3 Handwriting Reconstruction Constraints

Although the Spatial-Temporal Encoder-Decoder Network adapts reasonably well to reconstructing handwritten Chinese characters with complex fonts and broken strokes, handwriting point probabilities computed over the whole image become chaotic on such samples. To constrain this chaos, we designed connection rules based on the different types of handwriting points, as shown in Fig. 5.

In the key point detection module, we divide all points into connection points and end points. Accordingly, we define two rules:

Rule 1

The starting point of a stroke must be an end point.

Rule 2

The line segment connecting two consecutive points must pass through a solid stroke.

In practical applications, we select candidate points based on these rules and the output of the model (see Eqs. 13-15),

$$\begin{aligned} p_{ti} = \left\{ \begin{array}{lll} \frac{\exp \left( e_{ti} \right) \times d_i}{\sum _{i=0}^L\exp \left( e_{ti} \right) \times d_i }, &{} if~last~point \in endpoints,\\ \frac{\exp \left( e_{ti} \right) \times k_i}{\sum _{i=0}^L\exp \left( e_{ti} \right) \times k_i}, &{} if~last~point \in connection~points. \end{array} \right. \end{aligned}$$
(13)
$$\begin{aligned} l = \frac{1}{n}\phi \left( last~point,candidate~point \right) \end{aligned}$$
(14)
$$\begin{aligned} P_{ti} = \left\{ \begin{array}{lll} \max \left( p_{ti} \right) ,&{} if~last~point \in endpoints,\\ \max \left( p_{ti} \times l \right) , &{} if~last~point \in connection~points. \end{array} \right. \end{aligned}$$
(15)

where \(\phi \left( l,c\right) \) in Eq. 14 denotes interpolated sampling between the two points in the original image and n is the number of samples. In addition, \(k_i \in key~point~map\) and \(d_i \in end~point~map\). \(P_{ti}\) is the final predicted value. A sketch of this constrained selection follows.
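The following sketch shows how these constraints could be applied at inference time; the sampling scheme in `line_ink_ratio` and all function names are our assumptions about Eqs. 13-15, not the authors' implementation.

```python
import torch

def line_ink_ratio(img, p0, p1, n=16):
    """Eq. 14: mean intensity sampled along the segment p0 -> p1 in the
    original image (simple linear interpolation with n samples)."""
    ts = torch.linspace(0, 1, n)
    xs = (p0[0] + ts * (p1[0] - p0[0])).long().clamp(0, img.size(1) - 1)
    ys = (p0[1] + ts * (p1[1] - p0[1])).long().clamp(0, img.size(0) - 1)
    return img[ys, xs].mean()

def constrained_pick(e, last_is_end, end_map, key_map, img, last_pt, coords):
    """Eq. 13: reweight the responses with d_i (after an end point) or k_i
    (after a connection point). Eq. 15: after a connection point, also weight
    each candidate by the ink ratio l before taking the maximum."""
    w = end_map if last_is_end else key_map            # flattened (L,) maps
    p = torch.exp(e) * w
    p = p / p.sum()                                    # Eq. 13
    if not last_is_end:
        l = torch.stack([line_ink_ratio(img, last_pt, c) for c in coords])
        p = p * l                                      # Eq. 15, second branch
    return p.argmax()
```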

Fig. 5. Example of situations arising during the reconstruction process.

4 Loss Function

Since the Spatial-Temporal Encoder-Decoder Network must learn two different tasks, key point detection and key point ordering, we define the final loss function as in Eq. 16, following [8], where \(L_{det}\) is the loss of the key point detection task and \(L_{sq}\) is the loss of the key point ordering task.

$$\begin{aligned} L = L_{det} + L_{sq} \end{aligned}$$
(16)

To measure the gap between the predicted map and the label, and to balance the strong class imbalance between key points and background, the focal loss [10] is used as the detection loss, as shown in Eq. 17.

$$\begin{aligned} L_{det} = \frac{-1}{N}\sum _{c=1}^C \sum _{h=1}^H \sum _{w=1}^W \left\{ \begin{array}{ll} \beta \left( 1-p_{cij}\right) ^{\alpha }\log \left( p_{cij}\right) ,&{}if~y_{cij}=1,\\ \left( 1-\beta \right) p_{cij}^{\alpha } \log \left( 1-p_{cij} \right) ,&{} otherwise. \end{array} \right. \end{aligned}$$
(17)

Unlike traditional ranking losses, our ordering task is to maximize the probability of the labelled point at each time step, so we directly adopt the cross-entropy loss (\(L_{sq}\), see Eq. 18). A sketch of both losses follows Eq. 18.

$$\begin{aligned} L_{sq} = \frac{-1}{N} \sum _{t=1}^N \log \left( p_{t,label\left[ t\right] }~\right) \end{aligned}$$
(18)
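As a sketch, both terms could be implemented as follows; the focal-loss hyper-parameters \(\alpha \) and \(\beta \) are illustrative values, since the paper does not specify them.

```python
import torch

def detection_focal_loss(pred, target, alpha=2.0, beta_w=0.25, eps=1e-6):
    """Eq. 17: focal loss over the key point heatmaps, normalized by the
    number of positive (key point) pixels."""
    pos = target.eq(1).float()
    loss = -(beta_w * pos * (1 - pred).pow(alpha) * (pred + eps).log()
             + (1 - beta_w) * (1 - pos) * pred.pow(alpha) * (1 - pred + eps).log())
    return loss.sum() / pos.sum().clamp(min=1)

def sequence_ce_loss(probs, labels, eps=1e-6):
    """Eq. 18: cross entropy maximizing the probability of the labelled point
    index at each time step; probs is (T, L), labels is (T,)."""
    return -(probs[torch.arange(labels.numel()), labels] + eps).log().mean()
```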

5 Experiment

To verify the effectiveness of the proposed method for handwriting reconstruction, this section presents ablation and comparative experiments.

5.1 Dataset Processing

OLHWDB1.1 and the Tamil dataset are used in the experiments. OLHWDB1.1 includes 3755 classes of Chinese characters, each written separately, with the pen-tip stroke coordinates recorded. The Tamil dataset, used in paper [6] and originating from an HP competition, lets us explore the reconstruction effect on another language. Since both datasets store characters as trajectory sequences, we convert them to offline form.

Unlike offline handwritten characters, which are saved as static images, online handwritten characters retain richer dynamic writing information in the form of point sequences. We store the original data in the form of Eq. 19 [20],

$$\begin{aligned} \left[ \left[ x_1,y_1,s_1\right] ,\left[ x_2,y_2,s_2\right] ,...,\left[ x_n,y_n,s_n\right] \right] \end{aligned}$$
(19)

where \(x_i\) and \(y_i\) are the coordinates and \(s_i\) is the pen state. We then convert the dataset into a specific form for training.

We regard points that are too densely spaced, as well as intermediate points lying on the same straight line, as redundant. To filter them out, we set two conditions [20], shown in Eq. 20 and Eq. 21,

$$\begin{aligned} \sqrt{\left( x_i-x_{i-1}\right) ^2 + \left( y_i-y_{i-1}\right) ^2} \le T \end{aligned}$$
(20)
$$\begin{aligned} \frac{\varDelta x_{i-1} \varDelta x_i + \varDelta y_{i-1}\varDelta y_i}{\sqrt{\left( \varDelta x_{i-1}^2 + \varDelta y_{i-1}^2 \right) \cdot \left( \varDelta x_i^2+\varDelta y_i^2 \right) }}\ge C \end{aligned}$$
(21)

where T is the threshold for filtering out points that are too densely spaced, and C is the threshold for filtering out intermediate points on the same straight line. To protect the start and end points of strokes from being removed, the filtering is only applied when \(s_{i-1}=s_i=s_{i+1}\). A sketch of this filter follows.
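A minimal sketch of the filter under stated assumptions: the direction of the inequality in Eq. 21 depends on the orientation of the difference vectors, and our reading takes both vectors outward from the middle point, so that a collinear middle point gives a cosine near \(-1 \le C = -0.9\) and is dropped.

```python
import math

def filter_points(pts, T=25.6, C=-0.9):
    """Redundancy filter (Eqs. 20-21). pts is a list of (x, y, s) tuples;
    T = 0.05 * max(H, W) = 25.6 for 512x512 images (Sect. 5.2). Points are
    only considered for removal when s_{i-1} = s_i = s_{i+1}."""
    out = [pts[0]]
    for i in range(1, len(pts) - 1):
        (x0, y0, s0), (x1, y1, s1), (x2, y2, s2) = out[-1], pts[i], pts[i + 1]
        if not (s0 == s1 == s2):                    # protect stroke endpoints
            out.append(pts[i])
            continue
        if math.hypot(x1 - x0, y1 - y0) <= T:       # Eq. 20: too dense
            continue
        ax, ay = x0 - x1, y0 - y1                   # vector to previous point
        bx, by = x2 - x1, y2 - y1                   # vector to next point
        cos = (ax * bx + ay * by) / (
            math.hypot(ax, ay) * math.hypot(bx, by) + 1e-6)
        if cos <= C:                                # Eq. 21: nearly collinear
            continue
        out.append(pts[i])
    out.append(pts[-1])
    return out
```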

Offline Character Generation. To make the key point heat maps and labels correspond to the offline image, we map the preprocessed handwriting point coordinates to an image of size \(H' \times W'\) and then resize it to \(H \times W\) (see (a) in Fig. 6). According to the corresponding label, a key point heat map is generated on the image from a Gaussian distribution [15] (see (b) in Fig. 6).

Fig. 6. Offline characters (a) and corresponding heatmap labels (b).
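A common way to render such Gaussian key point labels, as a sketch (the paper follows [15] but does not state \(\sigma \), so the value here is illustrative):

```python
import torch

def gaussian_heatmap(cx, cy, H=64, W=64, sigma=1.5):
    """Render one key point label as a 2D Gaussian centred on (cx, cy)."""
    ys = torch.arange(H).view(-1, 1).float()
    xs = torch.arange(W).view(1, -1).float()
    return torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
```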

5.2 Implementation Details

The network model in this article is built with the PyTorch framework and runs on a 64-bit Linux system with an NVIDIA 1080Ti GPU. In data preprocessing, the parameter T in Eq. 20 is \(0.05 \times \max \left( H,W \right) \) and the parameter C in Eq. 21 is \(-0.9\). The image size is \(H \times W =512 \times 512\); the heat map and output scale are \(H' \times W' =64 \times 64\). The dimension of the spatial encoder output \(a_i\) is \(d = 128\), and the hidden state \(H_t\) of the GRU is a 64-dimensional tensor. Finally, we use the Adam optimizer with initial learning rate \(lr=0.001\), decayed by a factor of 0.1 every 10 epochs, as sketched below.
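This optimizer setup corresponds to a few lines of PyTorch (assuming "rounds" means epochs; the placeholder model stands in for the full network):

```python
import torch

model = torch.nn.Linear(4, 4)  # placeholder for the actual network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Decay the learning rate by a factor of 0.1 every 10 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
```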

5.3 Evaluation Metrics

At present there is no unified standard for evaluating online handwriting generation, as papers [6, 20] illustrate; this is largely due to the large differences between handwriting generation methods.

Owing to the particularity of the proposed method, we use the average probability at the corresponding position of each handwriting point of a character as the criterion of model quality (see Eq. 22).

$$\begin{aligned} mean P = \frac{1}{K} \sum _{t=1}^K p_{t,indice} \end{aligned}$$
(22)

where K is the number of trajectory points. Although meanP cannot fully represent how well a character's handwriting is recovered, it reflects how strongly the model responds at the handwriting points. A sketch of this metric follows.
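Computing meanP from a predicted probability sequence is direct; a minimal sketch (function and argument names are ours):

```python
import torch

def mean_p(prob_seq, indices):
    """Eq. 22: average predicted probability at the ground-truth position of
    each of the K trajectory points; prob_seq is (K, L), indices is (K,)."""
    K = indices.numel()
    return prob_seq[torch.arange(K), indices].mean()
```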

In addition, to facilitate comparison with paper [6], we also adopt their evaluation metrics (see Eqs. 23-24),

$$\begin{aligned} Starting~Point~ Accuracy = \frac{Number~of~correct~SP}{Total~number~of~test~images} \end{aligned}$$
(23)
$$\begin{aligned} Junction~Point~ Accuracy = \frac{Number~of~correct~JP}{Total~number~JP~points~in~test~data} \end{aligned}$$
(24)

When the complete trajectory \(\left( CT \right) \) of an offline character image is perfectly retrieved along with the correct starting point, we count it as a positive result.

Table 1. The meanP of each method combination

5.4 Experiment and Result Analysis

To verify the necessity of the handwriting trend characteristics \(\left( TC\right) \) of Eq. 11 and the reconstruction constraints \(\left( RC\right) \) in the proposed model, we conducted ablation experiments with meanP as the evaluation metric, randomly selecting 5000 samples from the test set (see Table 1). The results in Table 1 are as expected: the trend characteristics provide the model with the features formed by the handwriting points of all previous moments, giving instructive information for the next moment, while the reconstruction constraints correct errors in time and provide more accurate information for the next step.

Table 2. Stroke recovery accuracy
Fig. 7. Examples of trajectories recovered from offline characters on the OLHWDB dataset.

Fig. 8. Examples of trajectories recovered from offline characters on the Tamil dataset.

We also conducted a comparative experiment against paper [6] on the Tamil dataset and OLHWDB1.1; the results are from 1,000 randomly selected samples (see Table 2).

We compared the accuracy of our proposed method with the method from [6], selected because it is the most recent method in this field, and implemented it on our datasets. The results show that our model is more suitable for the trajectory reconstruction of Chinese characters. A few qualitative results are shown in Fig. 7 and Fig. 8.

6 Conclusion

This paper proposes a method that regresses trajectory sequences by generating heat maps with a Spatial-Temporal Encoder-Decoder Network. Its reconstruction results are better than those of method [6] on OLHWDB. However, the coordinates generated this way cannot be trained directly in the network: selecting the maximum-probability point is non-differentiable, so supervision must pass indirectly through the generated heat map labels. In future work we will focus on this problem and combine the model with a GAN to generate more complete trajectory sequences. In addition, whether the model can fully recover the handwriting of characters unseen during training will also be a future research direction.