Research on image text recognition based on canny edge detection algorithm and k-means algorithm

Wu, Fangsheng; Zhu, Changan; Xu, Jinxiu; Bhatt, Mohammed Wasim; Sharma, Ashutosh

doi:10.1007/s13198-021-01262-0

Original article
Published: 22 August 2021

Volume 13, pages 72–80, (2022)

International Journal of System Assurance Engineering and Management

Fangsheng Wu¹,
Changan Zhu¹,
Jinxiu Xu²,
Mohammed Wasim Bhatt ORCID: orcid.org/0000-0003-0542-2790³ &
…
Ashutosh Sharma⁴

Abstract

Recent research on image character recognition has produced a range of techniques aimed at improving recognition rate and precision. The technology underpins character recognition, business card recognition, document recognition, vehicle license plate recognition and similar applications in smart city planning, so its effectiveness matters. To improve the accuracy of image text recognition, this article uses the Canny algorithm for edge detection of text and the k-means algorithm for clustering pixels. This combination, together with maximally stable extremal regions (MSER) and optimization of stroke width for image text, yields better results in terms of recognition rate, recall, precision, F-score and accuracy. The results show a precision of 88.3% and a recall of 72.4%, with an accuracy of 90.5% for the proposed method. The algorithm achieves a high image text recognition rate, can recognize text in images taken in complex environments, and removes noise well, which makes it well suited to image text recognition.

1 Introduction

Image character recognition is one of the most important topics in computing, and detection and recognition algorithms for image characters have been extensively explored and improved. Character detection and recognition appears throughout daily life, for example in character recognition systems, business card recognition systems, document recognition systems, vehicle license plate recognition systems and container code recognition in traffic management. These applications facilitate text entry, document entry and similar work, and they have spread further with the extensive use of imaging equipment such as smartphones. With the rapid progress of science and technology, character detection and recognition systems are no longer limited to office applications. The demand for optical character recognition systems that record and search information is increasing in ordinary people's daily life (Yang et al. 2017; Feng and Wang 2020; Li et al. 2021; Liu et al. 2021).

The main aim of text detection is to locate the text regions in an image using image processing methods (Zhaojun and Jun 2019; Xu et al. 2021). First, the text is detected and distinguished from the rest of the image using the bounding box technique: a bounding box is drawn around each text object in the image. Text localization follows, determining the location of the text in the image from the drawn bounding box. Although the bounding box accurately gives the text location, enhancement and segmentation are still required. Text image enhancement removes background noise and makes the recognition process easier. The text recognition phase then extracts the useful text information from the enhanced region and outputs the text (Pang et al. 2021; Wang et al. 2018). The entire text recognition process is depicted in Fig. 1.

Since the start of the information age, multimedia information in the form of digital images and video has gradually penetrated all aspects of human society. Effective recognition, retrieval and control of multimedia information is a major concern today. As a product of human abstract thinking, text strongly generalizes and represents the content of images and videos (Sharma et al. 2017; Dogra et al. 2020). Research on image text recognition based on the Canny edge detection algorithm and the k-means algorithm is therefore of great significance.

This article contributes to image text recognition by combining the Canny edge detection algorithm with the k-means algorithm. Its contributions are as follows:

It modifies the Canny edge detection and k-means algorithms, and the combination is further optimized to obtain the best text recognition output.
The proposed methodology includes preprocessing steps together with optimization of stroke width for image text, which improves performance.
The outcomes of the proposed method are assessed in terms of recognition rate, error and average recognition time, as well as recall, precision, F-score and accuracy.
The evaluation parameters are compared with those of other state-of-the-art methodologies to assess the proposed technique.
The proposed methodology outperforms the other current methodologies in this domain and can be used in smart city planning applications such as character recognition, business card recognition, document recognition and vehicle license plate recognition.

The rest of this article is organized as follows. A literature review of traditional methods used for image text recognition is provided in Sect. 2. Section 3 presents the research methodology, including the principles of the Canny edge detection algorithm and the k-means algorithm along with the optimization approach. The results are presented in Sect. 4, followed by concluding remarks in Sect. 5.

2 Literature review

Digital images are among the largest sources of information in the digital age, and edges are the most essential feature of an image. Although edge pixels occupy only a small part of an image, they carry most of its information, and these contours play an important role in describing or identifying images. Most classical edge detection algorithms are based on differential operations and are therefore differential algorithms (Wang et al. 2019a). Their main advantages are convenience, reliability and ease of operation; they are mature, can be called directly in MATLAB, and have been widely used in many fields with good results. For this reason, improving classical algorithms has become a research direction. Some conventional text detection and recognition methodologies are compared in Table 1.

Table 1 Comparative analysis of conventional text detection and recognition approaches

The text recognition methodologies adopted by some of the latest techniques from 2019–2020 are explained in detail below.

Zheng et al. obtained a gray prediction image with enhanced edges by replacing each pixel value in the original image with the maximum gray prediction value. Subtracting the original image from the gray prediction image gives a gray prediction subtraction image in which edge points are separated from non-edge points. The best separation threshold in the gray prediction subtraction image is determined by a global adaptive threshold selection method. A neighborhood search method is then applied to remove stray points and burrs after the target is separated from the background, producing the final edge image. Computer simulation experiments showed that the proposed method is better than several competing methods in both subjective visual effect and objective evaluation criteria. The proposed edge detection algorithm shows excellent edge detection capability and is highly robust to noise (Zheng et al. 2020).

Zhang et al. proposed an improved adaptive Canny edge detection algorithm. Bilateral filtering is used instead of Gaussian filtering to eliminate noise and sharpen image edges. The gradient magnitude of the image is then calculated using gradient templates in four directions: horizontal, vertical, 45° and 135°. The traditional Otsu threshold segmentation algorithm is also improved: the threshold is taken where the class variance measure reaches its maximum, and narrowing the threshold search range reduces computation and enables fast segmentation. Verification on road marking images shows that the improved Canny algorithm marks locations better, reduces edge disconnection and false edges, and shortens processing time (Zhang et al. 2020).

Shi et al. proposed a new edge detection method that combines the Canny operator with an improved ant colony optimization algorithm. First, the traditional Canny operator extracts the edges of the image. The endpoints of the edges are then taken as the initial positions of the ants. A fuzzy triangular membership function is introduced based on the gray values of the neighborhood, and the fuzzy membership values of the pixels between edge endpoints are calculated as the heuristic matrix of the ant colony. The heuristic matrix prompts the ants to search along real edges so as to detect continuous and complete edge lines. Experimental results show that this method effectively improves the accuracy of contour extraction of target objects in images and that the edge information it extracts is clearer (Shi et al. 2019).

The innovation of this paper is a combination of the Canny text edge detection algorithm and the k-means algorithm to identify text. The main steps include image grayscale processing, image binarization using maximally stable extremal regions (MSER), Canny edge detection, pixel clustering, text segmentation and OCR recognition. The image text of the ICDAR2013 data set is recognized with this algorithm, and the results are compared with other text recognition methods.

3 Research methodology

3.1 Principle of canny edge detection algorithm

This algorithm is a detection algorithm based on edge features. It preserves the structural attributes of the text while reducing the amount of text image data to be processed. Many edge detection algorithms exist; this paper uses the Canny algorithm, which handles the edge detection problem well and thereby benefits text recognition. The Canny algorithm seeks the optimal edge detector according to the following criteria.

(1) Best detection: the detector should extract as many features of the text edge as possible while keeping the probability of missed detections as small as possible.
(2) Edge localization: the detected edge points should lie close to the actual text edge points; that is, the deviation between the detected edge position and the actual edge position must be small, which improves recognition accuracy.
(3) One-to-one correspondence: each detected edge point should correspond to one actual text edge point.

3.2 Principle of k-means algorithm

The k-means algorithm is one of the classical and most widely used methods in cluster analysis; because of its efficiency, it is widely applied to clustering large-scale data. The k-means algorithm clusters data by dividing samples into k classes of equal variance (Liu and Zou 2020). The number of classes must be specified in advance. The algorithm divides the n samples of a data set N into k disjoint classes, denoted c_1, ..., c_k; each sample is denoted x_j, and each class c_i has its own center u_i. K-means is an iterative optimization algorithm that minimizes the following within-class sum of squared errors:

$$\min \sum\limits_{i = 1}^{k} {\sum\limits_{{x_{j} \in c_{i} }} {\left\| {x_{j} - u_{i} } \right\|^{2} } }$$

(1)
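The iteration behind Eq. (1) can be illustrated with a minimal sketch of Lloyd's algorithm, the usual way k-means is implemented. This is not the authors' code: the function names `kmeans` and `objective`, the choice of the first k points as initial centers, and the iteration cap are assumptions made for illustration only.

```python
import math

def kmeans(points, k, iters=100):
    """Lloyd's algorithm on a list of equal-length tuples."""
    # Deterministic initialisation (an assumption): take the first k points.
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each sample x_j joins the class with the nearest center u_i.
        labels = [
            min(range(k), key=lambda i: math.dist(p, centers[i]))
            for p in points
        ]
        # Update step: each center moves to the mean of its class.
        new_centers = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[i])  # keep an empty class where it was
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers, labels

def objective(points, centers, labels):
    """Sum of squared distances to the assigned centers, as in Eq. (1)."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
```

For example, `kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], 2)` separates the two lower points from the two upper points. In practice the result depends on the initial centers, so several random restarts are commonly used.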

3.3 Image preprocessing

Image grayscale conversion. Grayscale transformation changes the gray value of each pixel in the original image point by point, according to a transformation relation chosen for a given target condition. The purpose is to improve image quality, remove noise, make the image display clearer, and simplify the subsequent text segmentation and extraction. Noise in image text recognition arises mainly when external illumination, color and other factors interfere with the normal information of the image text; noise removal is therefore a very important step. We mainly use filtering to remove image noise, including neighborhood average filtering and median filtering (Pei et al. 2008; Kumar et al. 2021; Dhawan et al. 2021; Fan et al. 2020).
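As a rough illustration of the preprocessing described above, the following sketch converts an RGB image to gray values and applies 3x3 neighborhood average and median filters. The luma weights (0.299, 0.587, 0.114), the 3x3 window and the border handling are assumptions; the source does not specify them.

```python
from statistics import median

def to_gray(rgb):
    """Convert rows of (R, G, B) tuples to gray values using common luma weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row] for row in rgb]

def filter3x3(img, reduce_fn):
    """Apply reduce_fn to each 3x3 neighbourhood; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = reduce_fn(window)
    return out

def mean_filter(img):
    """Neighbourhood average filtering: replace each pixel by the window mean."""
    return filter3x3(img, lambda win: sum(win) / len(win))

def median_filter(img):
    """Median filtering: replace each pixel by the window median."""
    return filter3x3(img, median)
```

On an image with a single bright outlier pixel, the median filter removes the outlier entirely, whereas the average filter only spreads it out, which is why median filtering is often preferred for impulse noise.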

$$v\left( i \right) = \frac{{\left| {Q_{{i + {\Delta }}} - Q_{{i - {\Delta }}} } \right|}}{{\left| {Q_{i} } \right|}}$$

(2)

In Eq. (2), following the standard MSER notation, Q_i denotes an extremal region obtained at gray threshold i, Δ is a small change in threshold, and v(i) is the relative change in region area. A region for which v(i) reaches a local minimum is taken to be a maximally stable extremal region (MSER). However, Eq. (2) alone does not always give good results: in some special cases it fails to yield an MSER, so the region goes undetected and further processing is needed. A common practice is to run the detection again after inverting the region, so that in about half of the cases a more accurate MSER can be obtained.
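A minimal sketch of how Eq. (2) can be evaluated is given below. It assumes that the areas of one nested sequence of regions have already been measured at consecutive thresholds, so that the size of the difference between two regions equals the difference of their areas; the function names and the input format are hypothetical.

```python
def stability(areas, delta):
    """v(i) = |Q_{i+delta} - Q_{i-delta}| / |Q_i| for each valid threshold index i.

    areas[i] is the pixel count |Q_i| of one nested region at threshold i.
    """
    return {
        i: abs(areas[i + delta] - areas[i - delta]) / areas[i]
        for i in range(delta, len(areas) - delta)
    }

def most_stable(areas, delta):
    """Return the threshold indices where v(i) is a strict local minimum."""
    v = stability(areas, delta)
    idx = sorted(v)
    return [i for i in idx[1:-1] if v[i] < v[i - 1] and v[i] < v[i + 1]]
```

For a region whose area barely changes over a range of thresholds and then grows rapidly, the local minimum of v(i) falls inside the flat range, which is the threshold at which the region would be reported as maximally stable.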

Text edge detection based on the Canny algorithm. The binarized image is divided into feature regions, and text blocks are separated from the image background by detecting differences in gray level, color and texture in the digital image. The main steps of the Canny edge detection algorithm are as follows: smoothing the image with Gaussian filtering to remove noise; finding the intensity gradient of the image; applying non-maximum suppression to eliminate false edge responses; applying a double threshold to determine potential edges; and tracking edges by hysteresis.

Image smoothing: The Gaussian filter can be realized with two one-dimensional Gaussian kernels, that is, a one-dimensional convolution in the X direction followed by a one-dimensional convolution in the Y direction. It can also be realized directly with a single two-dimensional Gaussian kernel.
Non-maximum suppression: The purpose of non-maximum suppression is to thin the boundaries of the image after gradient computation, that is, to retain only the local maximum at each edge. Each pixel is processed as follows: (i) the gradient direction at the pixel is determined; (ii) the gradient magnitude of the pixel is compared with those of its two neighbors along the positive and negative gradient directions; (iii) if the gradient magnitude of the pixel is the largest of the three, the pixel is retained; otherwise it is suppressed.
Double threshold processing: Some noise still remains after the steps above, so a double threshold is applied. An upper and a lower threshold are set, and each pixel falls into one of three cases: above the upper threshold, below the lower threshold, or between the two.
Edge detection: The final edge positions are determined from the pixels and the two thresholds by hysteresis. Pixels above the upper threshold are accepted as edges, pixels below the lower threshold are rejected, and pixels between the two thresholds are kept only if they are connected to an accepted edge pixel.
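The steps listed above can be sketched in plain Python as follows. This is an illustrative simplification and not the authors' MATLAB implementation: central differences stand in for the gradient operator, the gradient direction is rounded to the nearest of eight neighbors, and all function names, the kernel radius and the thresholds are assumptions.

```python
import math

def gaussian_kernel(sigma=1.0, radius=2):
    """Normalised 1-D Gaussian kernel of length 2 * radius + 1."""
    k = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def convolve_rows(img, kernel):
    """1-D convolution along x, replicating the border pixels."""
    r, w = len(kernel) // 2, len(img[0])
    return [[sum(kernel[j + r] * row[min(max(x + j, 0), w - 1)] for j in range(-r, r + 1))
             for x in range(w)] for row in img]

def transpose(img):
    return [list(col) for col in zip(*img)]

def smooth(img, sigma=1.0):
    """Separable Gaussian: X-direction pass, then Y-direction pass."""
    k = gaussian_kernel(sigma)
    return transpose(convolve_rows(transpose(convolve_rows(img, k)), k))

def gradient(img):
    """Central-difference gradient magnitude and direction (interior pixels only)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.atan2(gy, gx)
    return mag, ang

def non_max_suppression(mag, ang):
    """Keep a pixel only if it is not smaller than both neighbours along the gradient."""
    h, w = len(mag), len(mag[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dx, dy = round(math.cos(ang[y][x])), round(math.sin(ang[y][x]))
            m = mag[y][x]
            if m > 0 and m >= mag[y + dy][x + dx] and m >= mag[y - dy][x - dx]:
                out[y][x] = m
    return out

def hysteresis(mag, low, high):
    """Double threshold plus tracking: grow edges from strong pixels through weak ones."""
    h, w = len(mag), len(mag[0])
    edges = [[False] * w for _ in range(h)]
    stack = [(y, x) for y in range(h) for x in range(w) if mag[y][x] >= high]
    while stack:
        y, x = stack.pop()
        if edges[y][x]:
            continue
        edges[y][x] = True
        for ny in range(max(y - 1, 0), min(y + 2, h)):
            for nx in range(max(x - 1, 0), min(x + 2, w)):
                if not edges[ny][nx] and mag[ny][nx] >= low:
                    stack.append((ny, nx))
    return edges

def canny(img, low, high, sigma=1.0):
    """Chain the steps: smooth, gradient, non-maximum suppression, hysteresis."""
    mag, ang = gradient(smooth(img, sigma))
    return hysteresis(non_max_suppression(mag, ang), low, high)
```

Because the comparison in `non_max_suppression` uses `>=`, a perfectly symmetric step edge may be kept as a two-pixel-wide line; production implementations typically interpolate along the gradient direction to avoid this.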

3.4 Enhanced processing of stroke width in image text

After the image preprocessing stage, the stroke width of the text needs to be optimized. Stroke width processing again relies on the Canny algorithm: based on the edge pixels and their gradients, the analysis proceeds as follows.

(1) When a pixel p lies on the edge of the text, its gradient direction is perpendicular (at 90 degrees) to the stroke direction. Search along the gradient direction for the nearest other edge pixel q whose gradient direction is roughly opposite to that of p. When a pair p and q satisfying this condition is found, the length of the ray between them is calculated as the Euclidean distance, which gives the width of the stroke in this direction.
(2) Repeat step (1) to calculate the stroke width attribute of the pixels on all rays that are not discarded, after which the algorithm ends.
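Steps (1) and (2) resemble the ray-casting stage of the stroke width transform. The sketch below follows one ray from an edge pixel p and is offered under several assumptions: edge pixels and their unit gradient vectors are given as a set and a dictionary, the "roughly opposite" test uses a tolerance of π/6, and the function name and `max_steps` limit are hypothetical.

```python
import math

def stroke_width(edges, grad, p, max_steps=50):
    """Follow the gradient direction from edge pixel p until another edge pixel q is hit.

    edges: set of (x, y) edge pixels.
    grad:  dict mapping each edge pixel to its unit gradient vector (gx, gy).
    Returns the Euclidean distance between p and q, or None if the ray is discarded.
    """
    gx, gy = grad[p]
    for step in range(1, max_steps + 1):
        q = (round(p[0] + step * gx), round(p[1] + step * gy))
        if q == p or q not in edges:
            continue  # still inside the stroke (or not moved yet)
        qx, qy = grad[q]
        # Accept q only if its gradient is roughly opposite to that of p
        # (angle between the two gradients within pi/6 of 180 degrees; an assumption).
        if gx * qx + gy * qy < -math.cos(math.pi / 6):
            return math.dist(p, q)
        return None  # first edge hit is not opposite: discard the ray
    return None  # no edge pixel found within max_steps: discard the ray
```

Calling `stroke_width` for every edge pixel and keeping the non-`None` results gives the stroke width attribute values described in step (2).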

3.5 Optimization of stroke width for image text

After image preprocessing and text stroke width optimization, the target image can be extracted by k-means clustering algorithm. By clustering the pixels in the target image, the text target area to be recognized can be extracted from the image. The operation flow is shown in Fig. 2.

4 Results and discussion

4.1 Canny operator recognition effect before and after improvement

To verify the effect of the improved Canny operator on human behavior recognition, ICanny-RBF and Canny-RBF are compared; the experimental results are shown in Fig. 3. Figure 3 shows that, compared with Canny-RBF, ICanny-RBF improves the average recognition accuracy for the five human behaviors (walking, running, jumping, side and skipping) by 1.52%, 2.38%, 0.23%, 3.89% and 2.40%, respectively.

The comparison shows that the improved Canny operator can detect the contours in human behavior images and yields a more complete foreground contour. This benefits the subsequent feature extraction and the classification and recognition of human behavior by the RBF neural network.

4.2 Comparison of RBF and BP neural networks

The ICanny-RBF neural network is assessed by comparing it with the ICanny-BP neural network. The experimental settings are a network training target error of 0.001 and a learning rate of 0.005. The samples are split 70%/30% into training and test sets.

To analyze the advantages of the ICanny-RBF neural network, it is compared in simulation with the ICanny-BP neural network, and their average recognition accuracies are shown in Fig. 4. Figure 4 shows that the recognition performance of the ICanny-RBF neural network is better than that of the ICanny-BP neural network, which indicates that a good recognition effect can be achieved with the RBF neural network. With the network training target error set to 0.001, the training processes and training errors of the BP and RBF neural networks are shown in Fig. 5.

According to these results, the RBF neural network reaches the preset precision and stops at iteration 340, while the BP neural network reaches the preset precision after 180 iterations. The comparison shows that, compared with the BP neural network, the RBF neural network has a faster learning speed and improves the efficiency of human behavior recognition.

4.3 Identification of recognition speed

To assess the recognition speed for human behavior, the average recognition time on the test set is used as the evaluation index; the results are shown in Table 2. Table 2 shows that the recognition time of the RBF neural network is short and below the average recognition time of the BP neural network, which suits the real-time requirements of human behavior recognition.

Table 2 Comparison of Average Recognition Time for Different Models

This paper combines the Canny text edge detection algorithm with the k-means clustering algorithm to identify image text. The preprocessing stage consists of gray level conversion of the image, binarization of the image and edge detection of the image text. Measuring the width of the text strokes narrows the scope of text recognition; the k-means algorithm is then used to cluster and merge the recognized image pixels. The text area is delimited, and the text within it is segmented by optimizing this merging. Finally, the text in the picture is recognized through an OCR recognition interface (Kaur et al. 2021; Bhuyan et al. 2021; Wang et al. 2019b; Zhang et al. 2018). To analyze the results of this experiment, we use the ICDAR2013 data set, which contains many kinds of text content; images without text and some with fuzzy text are removed. With this algorithm, 233 samples from the ICDAR2013 data set are studied using MATLAB (Mahajan et al. 2021). The recall rate is 72.4% and the precision is 88.3%. The average F-score is 75.9%, and an accuracy of 90.5% is observed for the proposed method. Table 3 and Fig. 6 show the performance results of different methods on the ICDAR2013 data set (Sharma et al. 2021a).
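For reference, the standard F-score (F1) is the harmonic mean of precision and recall. The sketch below applies that definition to the aggregate figures quoted above. The source does not state how its F-score was computed, so this is only a point of comparison; the function name is hypothetical.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    return 2 * precision * recall / (precision + recall)

# Aggregate precision and recall reported for the proposed method.
f1 = f_score(0.883, 0.724)
```

Applied directly to a precision of 88.3% and a recall of 72.4%, this definition gives roughly 79.6%. The reported value of 75.9% is described as an average, so it may have been computed per image and then averaged; that reading is an assumption and is not confirmed by the source.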

Table 3 ICDAR2013 Text recognition results

The data in Table 3 and Fig. 6 show that, in text recognition on the ICDAR2013 data set, the recall rate and the correct rate are improved because fuzzy text and images without text were removed from the data set. Compared with other algorithms, this algorithm achieves better recognition accuracy on the ICDAR2013 data set (Sharma et al. 2021b).

5 Conclusion

This paper uses maximally stable extremal regions (MSER) for binarization of the image, the Canny algorithm for edge detection of the text, and the k-means algorithm for clustering the pixels. The standard ICDAR2013 data set is used for the study, and the text recognition output is obtained by combining the preprocessing steps with the optimization of stroke width for image text. Recall, precision, F-score and accuracy are evaluated and compared with those of other state-of-the-art methodologies to assess the proposed technique. The proposed approach yields a recall of 72.4%, a precision of 88.3%, an F-score of 75.9% and an accuracy of 90.5%. The method proposed in this article was observed to outperform the other current methodologies compared in this domain. This work can be used in smart city planning applications such as character recognition, business card recognition, document recognition and vehicle license plate recognition. Future research on this algorithm can address the location and extraction of text areas against complex digital image backgrounds, so as to obtain image text that can be recognized by Optical Character Recognition (OCR).

References

Bhuyan H, Chakraborty C, Pani S, Ravi V (2021) Feature and subfeature selection for classification using correlation coefficient and fuzzy model. IEEE Trans Eng Manag. https://doi.org/10.1109/tem.2021.3065699
Cherian A, Sebastian S (2016) Automatic localization and recognition of perspectively distorted text in natural scene images. In: 2016 international conference on emerging trends in engineering, technology and science (ICETETS). IEEE, pp 1–6
Dhawan S, Chakraborty C, Frnda J, Gupta R, Rana AK, & Pani SK (2021) SSII: Secured and high-quality Steganography using Intelligent hybrid optimization algorithms for IoT. IEEE Access
Dogra J, Jain S, Sharma A, Kumar R, Sood M (2020) Brain tumor detection from MR images employing fuzzy graph cut technique. Recent Adv Comput Sci Commun Formerly Recent Patents Comput Sci 13(3):362–369
Fan D, Guo X, Lu X, Liu X, Sun B (2020) Image defogging algorithm based on sparse representation. Complexity 2020:1–8
Feng, J., & Wang, X. (2020) Non-motor vehicle illegal behavior discrimination and license plate detection based on real-time video. In Journal of Physics: Conference Series (Vol. 1544, No. 1, p. 012105). IOP Publishing
Jaderberg M, Simonyan K, Vedaldi A, Zisserman A (2016) Reading text in the wild with convolutional neural networks. Int J Comput Vision 116(1):1–20
Kaur M, Khan MZ, Gupta S, Noorwali A, Chakraborty C, Pani SK (2021) MBCP: Performance analysis of large scale mainstream blockchain consensus protocols. IEEE Access. 9:80931
Kumar A, Abhishek K, Chakraborty C, Kryvinska N (2021) Deep learning and internet of things based lung ailment recognition through coughing spectrograms. IEEE Access 9:95938
Li G, Liu F, Sharma A, Khalaf OI, Alotaibi Y, Alsufyani A, Alghamdi S (2021) Research on the natural language recognition method based on cluster analysis using neural network. Mathematical Problems in Engineering, 2021
Yao L, Zhao Y, Fan J, Liu M, Jiang J, Wan Y (2019) Research and application of license plate recognition technology based on deep learning. In Journal of Physics: Conference Series (Vol. 1237, No. 2, p. 022155). IOP Publishing
Liu S, Zou Y (2020) An improved hybrid clustering algorithm based on particle swarm optimization and K-means. In: IOP conference series: materials science and engineering, vol 750, no 1. IOP Publishing, p 012152
Liu L, Wang L, Xu D, Zhang H, Sharma A, Tiwari S, Kaur M, Khurana M, Shah MA (2021) CT image segmentation method of liver tumor based on artificial intelligence enabled medical imaging. Math Probl Eng 2021:1–8. https://doi.org/10.1155/2021/9919507
Mahajan K, Garg U, Shabaz M (2021) CPIDM: A clustering-based profound iterating deep learning model for HSI segmentation. Wirel Commun Mob Comput 2021:1–12. https://doi.org/10.1155/2021/7279260
Neumann L, Matas J (2015). Efficient scene text localization and recognition with local character refinement. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR) (pp. 746–750). IEEE
Pang H, Zheng Z, Zhen T, Sharma A (2021) Smart farming: An approach for disease detection implementing IoT and image processing. Int J Agric Environ Inf Syst(IJAEIS) 12(1):55–67
Pariwat T, & Seresangtakul P (2017). Thai finger-spelling sign language recognition using global and local features with SVM. In: 2017 9th international conference on knowledge and smart technology (KST) (pp. 116–120). IEEE
Pei Z, Hua X, & Han J (2008). The clustering algorithm based on particle swarm optimization algorithm. In: 2008 International conference on intelligent computation technology and automation (ICICTA) (Vol. 1, pp. 148–151). IEEE
Ramachandran P, Liu PJ, Le QV (2016) Unsupervised pretraining for sequence to sequence learning. arXiv preprint arXiv: 1611.02683.
Sharma C, Bagga A, Singh BK, Shabaz M (2021a) A novel optimized graph-based transform watermarking technique to address security issues in real-time application. Math Probl Eng 2021:1–27. https://doi.org/10.1155/2021/5580098
Sharma C, Amandeep B, Sobti R, Lohani TK, Shabaz M (2021b) A secured frame selection based video watermarking technique to address quality loss of data: combining graph based transform, singular valued decomposition, and hyperchaotic encryption. Security Commun Netw 2021:1–19. https://doi.org/10.1155/2021/5536170
Sharma A, Ansari MD, & Kumar R (2017) A comparative study of edge detectors in digital image processing. In 2017 4th International Conference on Signal Processing, Computing and Control (ISPCC) (pp. 246–250). IEEE
Shi B, Bai X, Yao C (2016a) An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans Pattern Anal Mach Intell 39(11):2298–2304
Shi B, Wang X, Lyu P, Yao C, & Bai X (2016). Robust scene text recognition with automatic rectification. In: Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4168–4176)
Shi Q, An J, Gagnon KK, Cao R, & Xie H (2019) Image edge detection based on the Canny edge and the ant colony optimization algorithm. In: 2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI) (pp. 1–6). IEEE
Singh RV, Randhawa N (2014) Automobile number plate recognition and extraction using optical character recognition. Int J Sci Technol Res 3(10):37–39
Tian S, Bhattacharya U, Lu S, Su B, Wang Q, Wei X, Tan CL (2016) Multilingual scene character recognition with co-occurrence of histogram of oriented gradients. Pattern Recogn 51:125–134
Wang J, Huang H, Qian X, Cao J, Dai Y (2018) Sequence recognition of Chinese license plates. Neurocomputing 317:149–158
Wang Y, Lian Z, Tang Y, Xiao J (2019a) Boosting scene character recognition by learning canonical forms of glyphs. Int J Document Anal Recog (IJDAR) 22(3):209–219
Wang H, Liu C, Yu L, and Zhao J (2019). Research on target detection and recognition algorithm based on deep learning. In: 2019 Chinese Control Conference (CCC) (pp. 8483–8487). IEEE
Xu X, Li L, Sharma A (2021) Controlling messy errors in virtual reconstruction of random sports image capture points for complex systems. Int J Syst. Assurance Eng Manag, 1–8
Yang H, Wang C, Bartz C, & Meinel C (2016). SceneTextReg: a real-time video OCR system. In Proceedings of the 24th ACM international conference on Multimedia (pp. 698–700)
Yang A, Jiang W, Chen L (2017) An adaptive edge detection algorithm based on improved canny. In Advanced Computational Methods in Life System Modeling and Simulation (pp. 566–575). Springer, Singapore
Yildirim G, Achanta R, & Süsstrunk S (2013). Text recognition in natural images using multiclass hough forests. In Proceedings of the 8th International Conference on Computer Vision Theory and Applications (Vol. 1, No. CONF, pp. 737–741)
Zhang X, Gao X, Tian C (2018) Text detection in natural scene images based on color prior guided MSER. Neurocomputing 307:61–71
Zhang H, Liang J, Jiang H, Cai Y, Xu X (2020) Lane line recognition based on improved 2D-gamma function and variable threshold Canny algorithm under complex environment. Measure Control 53(9–10):1694–1708. https://doi.org/10.1177/0020294020952477
Zhaojun L, Jun Z (2019) Adaptive Canny algorithm improvement based on Otsu algorithm and histogram analysis. Modern Electron Tech 42(11):54–58
Zheng Z, Zha B, Yuan H, Xuchen Y, Gao Y, Zhang H (2020) Adaptive edge detection algorithm based on improved grey prediction model. IEEE Access 8:102165–102176

Acknowledgements

Key Projects of Natural Science Research in Colleges and Universities in Anhui Province (KJ2019A1166).

Funding

This research work is self-funded.

Author information

Authors and Affiliations

Institute of Information Engineering, Anhui Business Vocational College, Hefei, 231131, Anhui, China
Fangsheng Wu & Changan Zhu
University of Science and Technology of China, Hefei, 230026, Anhui, China
Jinxiu Xu
Central University of Punjab, Bathinda, India
Mohammed Wasim Bhatt
Institute of Computer Technology and Information Security, Southern Federal University, Southern Federal District, Russia
Ashutosh Sharma

Corresponding author

Correspondence to Mohammed Wasim Bhatt.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest and all ethical issues including human or animal participation has been done. No such consent is applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wu, F., Zhu, C., Xu, J. et al. Research on image text recognition based on canny edge detection algorithm and k-means algorithm. Int J Syst Assur Eng Manag 13 (Suppl 1), 72–80 (2022). https://doi.org/10.1007/s13198-021-01262-0

Received: 05 July 2021
Revised: 30 July 2021
Accepted: 09 August 2021
Published: 22 August 2021
Issue Date: March 2022
DOI: https://doi.org/10.1007/s13198-021-01262-0
