Abstract
This study implements a method for automating anomaly detection in engineering diagrams: graphs are first recognized from a piping and instrumentation diagram (P&ID), and patterns are then extracted from those graphs. The framework consists of three parts: graph generation, subgraph extraction, and graph classification. Graphs are generated through symbol recognition and line recognition, and subgraphs are extracted using a frequent subgraph mining algorithm. The classification targets are divided into two categories according to how frequently the main equipment of an extracted subgraph occurs. Low-frequency subgraphs are classified according to whether they contain a user-defined subgraph; high-frequency subgraphs are embedded as vectors and used to train a support vector machine (SVM) classification model. K-fold cross-validation is also applied to increase classification accuracy. With cross-validation, the proposed framework achieves 85% accuracy on a given test drawing. These outcomes contribute to the field of engineering diagram analysis and have potential applications in plant industries.
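The two-branch classification step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The data layout (a dict with `id`, `equipment`, and `edges`), the function names, the `min_count` threshold, and the convention that containing the user-defined pattern means "normal" are all assumptions made for illustration. The pattern check is a simple labelled-edge containment test, a simplification of true subgraph matching, and the SVM training itself is omitted; only the fold splitting used for K-fold cross-validation is shown.

```python
from collections import Counter


def equipment_frequency(subgraphs):
    """Count how often each main-equipment label occurs across subgraphs."""
    return Counter(sg["equipment"] for sg in subgraphs)


def contains_pattern(edges, pattern):
    """Return True if every labelled edge of the pattern appears in the subgraph.

    Simplification (assumption): edges are compared by their label pairs only,
    which is weaker than a full subgraph isomorphism test.
    """
    return set(pattern) <= set(edges)


def route(subgraphs, rules, min_count):
    """Split subgraphs into a rule-based branch and a learning branch.

    Low-frequency equipment is labelled by the user-defined pattern check;
    high-frequency equipment is returned as candidates for embedding and SVM
    training (not implemented here).
    """
    freq = equipment_frequency(subgraphs)
    rule_results, ml_candidates = [], []
    for sg in subgraphs:
        if freq[sg["equipment"]] < min_count:
            # Assumption: an equipment type with no rule has an empty pattern,
            # so it is always labelled "normal".
            pattern = rules.get(sg["equipment"], [])
            ok = contains_pattern(sg["edges"], pattern)
            rule_results.append((sg["id"], "normal" if ok else "anomaly"))
        else:
            ml_candidates.append(sg["id"])
    return rule_results, ml_candidates


def k_fold_indices(n, k):
    """Return k (train_indices, test_indices) pairs over range(n)."""
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        splits.append((train, folds[i]))
    return splits


# Hypothetical toy data: labels such as "vessel" and "PSV" are illustrative.
subgraphs = [
    {"id": "s1", "equipment": "vessel", "edges": [("vessel", "PSV"), ("vessel", "CV")]},
    {"id": "s2", "equipment": "pump", "edges": [("pump", "CV")]},
    {"id": "s3", "equipment": "pump", "edges": [("pump", "CV")]},
    {"id": "s4", "equipment": "tank", "edges": [("tank", "CV")]},
]
rules = {"vessel": [("vessel", "PSV")], "tank": [("tank", "PSV")]}

rule_results, ml_candidates = route(subgraphs, rules, min_count=2)
splits = k_fold_indices(5, 2)
```

In this toy run the vessel and tank subgraphs fall into the rule-based branch, while the two pump subgraphs would go on to vector embedding and SVM training. In practice the learning branch would be handled by an embedding method and an SVM library rather than by code like this.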
Abbreviations
- D: the graph dataset
- G: the graph dataset
- n: DFS code
- Nc: the number of target class of nodes in the graph
- NG: the number of whole class of nodes in the graph
- NF: the number of anomaly data
- NT: the number of normal data
- s: a subgraph of graph G
- S: the subgraph dataset of graph
- CC&R: combination contour & Ramer-Douglas-Peucker
- CNN: convolutional neural network
- CV: control valve
- DFS: depth-first search
- DGCNN: dynamic graph convolutional neural networks
- ED: engineering diagram
- FCN: fully convolutional neural network
- FEED: front-end engineering design
- FFSM: fast frequent subgraph mining
- FSM: frequent subgraph mining
- FN: false negative
- FP: false positive
- GSpan: graph-based substructure pattern
- ILSVRC: ImageNet large scale visual recognition challenge
- MoFa: molecule fragment miner
- OCR: optical character recognition
- OLE: object linking and embedding
- OPC: OLE for process control
- P&ID: piping and instrumentation diagram
- PFD: process flow diagram
- PSV: pressure safety valve
- R-CNN: region-CNN
- SSD: single shot detector
- SVM: support vector machine
- TN: true negative
- TP: true positive
Acknowledgements
This research was supported by the Chung-Ang University Graduate Research Scholarship in 2021 and by H2KOREA, funded by the Ministry of Education. It was also supported by the Human Resources Development (No. 20214000000280) of the Korea Institute of Energy Technology Evaluation and Planning (KETEP) grant funded by the Korea government Ministry of Trade, Industry and Energy.
Cite this article
Shin, HJ., Lee, GY. & Lee, CJ. Automatic anomaly detection in engineering diagrams using machine learning. Korean J. Chem. Eng. 40, 2612–2623 (2023). https://doi.org/10.1007/s11814-023-1518-8
DOI: https://doi.org/10.1007/s11814-023-1518-8