1 Introduction

Piping and instrumentation diagrams (P&IDs) are technical drawings that provide a graphical representation of a process system, including the piping, vessels, control valves, instrumentation, and other process components and equipment in a given system. Such diagrams are the primary schematic drawings used for laying out a process control system’s installation and are most commonly used in engineering, for instance when designing a manufacturing process in a processing plant. These facilities usually involve complex chemical or mechanical configurations that are represented through symbols and connections on a P&ID. P&IDs can be used to streamline a process, keep track of different pieces of equipment, or guide the design of a new facility.

Standard symbols are used to represent the components in P&IDs; this representation is not drawn to scale, which adds to the complexity of the task. Most of these symbols are also associated with words, letters, or numbers that further identify and specify the components being represented. Even though standard symbols are used, their depiction can vary across P&IDs from different origins and industries.

In recent years, several attempts have been made at automating the reading of such diagrams. Given the schematic nature of P&IDs, the goal of this work is to identify the different symbols, their associated text, and the connections between components.

The rest of the paper is structured as follows: Sect. 2 reviews state-of-the-art approaches to P&ID digitization. Section 3 details the proposed pipeline and the experiments carried out so far. Section 4 concludes the paper and discusses possible future directions.

2 State of the Art

Digital engineering drawings have received considerable attention across industries, and analyzing and processing them is becoming important for decision making. In this context, several efforts have been made in different application domains such as planning, construction, oil, and gas. In this section, we detail works related to P&ID data extraction and automatic digitization.

Yun et al. [8] propose an object detection approach to recognize the different symbols in a P&ID. Three steps are defined: i) region proposal using sequential image processing, ii) dataset annotation, and iii) region classification, where a convolutional network is trained to classify the proposed regions and extract the information related to the symbols. In [7], the authors introduce a text and symbol recognition method for P&IDs based on local binary patterns (LBP) and spatial pyramid matching (SPM) for image feature extraction. Arroyo et al. [1] aim to convert P&ID documents into an object-oriented plant description by applying optical recognition and semantic analysis.

Ghadekar et al. [3] introduce an end-to-end GPU algorithm. It starts by extracting the metadata of each instance related to the instruments, valves, and equipment present in the P&ID, and then detects the lines and text. The authors combine computer vision methods and deep learning models (RetinaNet with a ResNet-101 backbone) to classify symbols, semantically analyze text, and detect connections between pieces of equipment. In [5], the authors use template matching for symbol recognition, with templates extracted and registered automatically in a database; for line and text recognition, a sliding window method and aspect ratio calculation are used. A recent paper extends these works by applying graph search to detect the connections between symbols in P&IDs [6]. It builds on state-of-the-art methods to detect symbols with a convolutional neural network, recognize and interpret text, and detect connections between symbols along lines. The proposed method can be applied in different applications such as diagram search, equipment-to-sensor mapping, and asset hierarchy creation.

Fig. 1. Preliminary line and text detection results

3 Methodology

In this paper, we propose a pipeline for automatically extracting key information from P&IDs. The pipeline is based on computer vision methods for detecting lines, text, and symbols.

3.1 Line Detection Using Kernels

Since the lines encountered in a P&ID are either vertical or horizontal and always straight, convolutional kernels are well suited to their detection. We tested a combination of a vertical and a horizontal kernel to perform the line detection. Since both thicker and thinner lines connect the different elements, we account for this in our kernels in order to capture as much line-width variation as possible. An example of detected lines is shown in Fig. 1a.
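To make the kernel-based idea concrete, the snippet below is a minimal sketch that extracts horizontal and vertical lines from a binarised page using long, line-shaped structuring elements in OpenCV. The kernel length, the 1–3 px width range, and the file path are illustrative assumptions rather than the exact values used in our experiments.

```python
import cv2
import numpy as np

def detect_lines(image_path, min_length=40):
    """Extract horizontal and vertical line pixels from a binarised P&ID page.

    `min_length` (illustrative value) sets the shortest line run the
    kernels respond to; the actual kernel sizes depend on the drawing resolution.
    """
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Invert and binarise so that dark ink becomes foreground.
    _, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    masks = []
    # Widths of 1-3 px approximate the thin/thick line variation mentioned above.
    for width in (1, 2, 3):
        horiz_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (min_length, width))
        vert_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (width, min_length))
        masks.append(cv2.morphologyEx(binary, cv2.MORPH_OPEN, horiz_kernel))
        masks.append(cv2.morphologyEx(binary, cv2.MORPH_OPEN, vert_kernel))

    # The union of all responses gives the final line map.
    line_map = np.zeros_like(binary)
    for m in masks:
        line_map = cv2.bitwise_or(line_map, m)
    return line_map
```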

As can be seen, the line detection still needs improvement. For future work we aim at removing spurious line detections; one possible approach is to treat areas containing text as non-line regions, so as not to confuse the detection of text and lines. This could also be extended to regions with detected symbols. In addition, improving the detection of dotted lines would yield an easier-to-read result.

3.2 Text Detection Using Character Region Awareness for Text Detection (CRAFT)

Several algorithms exist in the literature to perform this task; we chose the CRAFT algorithm [2] and applied it to scanned P&ID images. Without any pre-processing of the image, the results show good detection performance. An example of detected text bounding boxes is shown in Fig. 1b.
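As a rough illustration, the snippet below uses the keras-ocr package, which ships a pretrained CRAFT-based detector, to obtain text boxes on a scanned page. The package choice and the file name are assumptions made for illustration; the original CRAFT implementation [2] could be used in the same way, and the exact API should be checked against the installed version.

```python
import keras_ocr  # pip install keras-ocr; bundles a CRAFT-based text detector

# Load a scanned P&ID page (the path is illustrative).
image = keras_ocr.tools.read("pid_page_01.png")

# The detector downloads pretrained CRAFT weights on first use.
detector = keras_ocr.detection.Detector()

# Returns one array of quadrilateral boxes (four corner points each) per image.
boxes = detector.detect(images=[image])[0]

for box in boxes:
    print(box)  # 4x2 array of (x, y) corners for one detected text region
```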

3.3 Symbol Detection

For symbol detection we consider two possible approaches: one for when only a small amount of data is available, and one for when a larger dataset is accessible. Given the very few examples in our dataset and the lack of annotated images, we first opted for a morphological approach using the circle Hough transform. As P&IDs come in different sizes and resolutions, symbols may also appear at different sizes. To try this approach, we focused on the detection of primary device and control symbols. To obtain a robust detection, we varied the radius of the circles searched for in order to detect all of them. An example of detected primary device and control symbols is shown in Fig. 2a.
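The sketch below shows how such a detection could be set up with OpenCV's cv2.HoughCircles, sweeping a range of radii. All numeric parameters (radius bounds, accumulator thresholds, file path) are illustrative assumptions that would need tuning to the drawing's resolution.

```python
import cv2
import numpy as np

img = cv2.imread("pid_page_01.png", cv2.IMREAD_GRAYSCALE)  # illustrative path
blurred = cv2.medianBlur(img, 5)  # reduce noise before the transform

# Radius bounds are assumptions; in practice they must be adapted to the
# drawing's resolution, since symbol sizes vary between P&IDs.
circles = cv2.HoughCircles(
    blurred,
    cv2.HOUGH_GRADIENT,
    dp=1,            # accumulator resolution equal to the image resolution
    minDist=30,      # minimum distance between detected centres
    param1=100,      # upper Canny threshold used internally
    param2=40,       # accumulator threshold: lower -> more (possibly false) circles
    minRadius=10,
    maxRadius=60,
)

if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        cv2.circle(img, (x, y), r, 255, 2)  # mark candidate circular symbols
```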

When a dataset contains a large number of examples, other approaches can be considered, such as deep neural networks. To illustrate this, we annotated the data and trained a YOLOv5 network [4] to detect gate valves. We only had access to 8 P&IDs with varying numbers of gate valves, so to obtain enough examples for training we generated random crops around the objects of interest. The detected valves are shown in Fig. 2b.
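The following is a minimal sketch of the random-crop idea: for each annotated object, sample crop windows that keep the object fully inside the crop and translate its bounding box into crop coordinates. The crop size, crop count, and function name are illustrative assumptions rather than the exact settings used in our experiments.

```python
import random
import cv2

def random_crops_around_box(image, box, crop_size=640, n_crops=10):
    """Generate crops that keep the annotated object fully visible.

    `box` is (x_min, y_min, x_max, y_max) in pixels; crop size and count
    are illustrative, not the values used in the paper.
    """
    h, w = image.shape[:2]
    x_min, y_min, x_max, y_max = box
    crops = []
    for _ in range(n_crops):
        # Choose a top-left corner so the crop still contains the whole box
        # and stays inside the image.
        left_low, left_high = max(0, x_max - crop_size), min(x_min, w - crop_size)
        top_low, top_high = max(0, y_max - crop_size), min(y_min, h - crop_size)
        if left_high < left_low or top_high < top_low:
            break  # object or image incompatible with this crop size
        left = random.randint(left_low, left_high)
        top = random.randint(top_low, top_high)
        crop = image[top:top + crop_size, left:left + crop_size]
        # Shift the box into crop coordinates (YOLO labels are derived from this).
        new_box = (x_min - left, y_min - top, x_max - left, y_max - top)
        crops.append((crop, new_box))
    return crops
```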

Object detection networks also provide an adaptable tool for detecting different types of symbols, given access to a sufficiently large dataset.

Fig. 2. Preliminary symbol detection results

4 Conclusion

In this paper we proposed a pipeline for automatically extracting key information, such as text, symbols, and connections, from P&IDs. The pipeline is based on computer vision and deep learning methods inspired by state-of-the-art approaches. The proposed symbol detection offers high flexibility and can easily be extended to new classes, given either a large volume of annotated data or a suitable morphological approach.

Despite recent advances in automatic feature extraction from P&IDs using computer vision and machine learning approaches, it remains a challenging task. The lack of a benchmark dataset and of open-access P&IDs further increases this challenge. As text and symbol detection methods improve, the focus should shift to finding the links between different symbols.

As future work, our proposed pipeline can be improved by combining recent models for better performance. It is also possible to develop a new deep learning method for symbol detection and text association. Experiments on different real-world datasets will also be conducted.