1 Introduction

Building detection and feature detection are important research areas in computer vision. The Indian subcontinent has numerous old and ancient building sites, such as the Taj Mahal (Mughal era) and the Sixty Dome Mosque (Sultanate era). Archaeologists can generally identify the construction period of an old building from its architectural characteristics or features. From this point of view, this research establishes a computational technique for recognizing the construction period of old structures by differentiating their architecture.

In previous years, several studies have applied computer vision to archaeology [1, 3]. An artificial neural network based feature recognition technology has been used to identify the features of ancient structures [4]. A CNN method has focused on visualizing primitive Maya hieroglyphs [5]. Deep learning has been used to recognize ancient Roman coins [6]. China’s ancient terracotta warriors have been visualized by computer vision [7], which is also effective for 3D modeling [8]. A photogrammetric method has enabled image analysis of a Turkish ancient heritage site [9]. Moreover, some studies have used machine learning for period identification [10, 11].

However, no technique is available for recognizing the construction period of old architectural structures such as old buildings, mosques, and temples. This research therefore proposes a technique that helps archaeologists recognize the construction period by detecting the architecture of old structures.

To build the CNN, a deep learning model has been developed in which four feature detection methods are applied: Canny Edge Detector [12], Hough Line Transform [13], Find Contours [14], and Harris Corner Detector [15]. Using these methods, three architectural features are classified, namely Dome, Minaret, and Front, because different old structures contain different forms of these three features. A deep feed-forward neural network [16] model has been developed in which the three features are used to identify the era. This research identifies three ruling periods: the Mughal period (1526–1857), the Sultanate period (1206–1526), and the British period (1858–1947).

Recently, a deep learning model was presented [17] for identifying the era of ancient buildings. It used only the Canny Edge Detector method and datasets from two eras (Mughal and Sultanate). The present research develops a more customized neural network that also uses the remaining methods (Hough Line Transform, Find Contours, and Harris Corner Detector). Moreover, a British dataset is used in addition to the Mughal and Sultanate datasets.

The contributions of this manuscript are in three areas: (1) identifying the construction era based on the Dome, Minaret, and Front features of buildings from the Mughal (1526–1857), Sultanate (1206–1526), and British (1858–1947) eras; (2) using Edge, Line, Contour, and Corner elements to identify the different features (Dome, Minaret, and Front) of different heritage buildings; (3) developing a Deep Neural Network (DNN) and applying it in a CNN, in which the three features (Dome, Minaret, and Front) obtained from four different methods (Canny Edge Detector, Hough Line Transform, Find Contours, Harris Corner Detector) are used for classifying old periods.

2 Era Identification Process

This research presents a computational archaeological model that describes how a program can identify the construction era of an old building. First, a photo is passed to the Canny Edge Detector, Hough Line Transform, Find Contours, and Harris Corner Detector functions. These techniques are used to collect the Dome, Minaret, and Front features from the old building image. The architecture and process of the era identification model are illustrated in Fig. 1.

Fig. 1. Process and steps of the era identification for Indian subcontinent old buildings

3 Experimental Methods

3.1 Canny Edge Detection

Edge recognition covers a variety of mathematical processes that aim at identifying the points in an image at which the brightness changes sharply. In this experiment, the Canny edge detection method has been used to detect the edges in a photo. First, the image is filtered in the horizontal direction (Gx) and the vertical direction (Gy) to find its intensity gradient. The gradient direction is always perpendicular to the edges, and it is rounded to one of four angles representing the vertical, horizontal, and two diagonal directions. The edge gradient and direction [18] for each pixel are found as follows:

$$ \text{Edge\_Gradient}\,(G) = \sqrt{G_{x}^{2} + G_{y}^{2}} $$
(1)
$$ {\text{Angle}}\,(\theta ) = \tan^{ - 1} \left( {\frac{{G_{y} }}{{G_{x} }}} \right) $$
(2)
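Eqs. (1) and (2) can be illustrated with a minimal sketch. The 3x3 Sobel kernels, the function name gradient_at, and the toy image patch are assumptions made for illustration; the text does not specify which filter was used to obtain Gx and Gy. The sketch uses atan2 instead of a plain arctangent so that Gx = 0 does not cause a division by zero, and it folds the angle into the four directions mentioned above.

```python
import math

# 3x3 Sobel kernels (an assumption; the text does not name the filter used).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def gradient_at(image, row, col):
    """Return (magnitude, quantized direction in degrees) at an interior pixel."""
    gx = gy = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            pixel = image[row + dr][col + dc]
            gx += SOBEL_X[dr + 1][dc + 1] * pixel
            gy += SOBEL_Y[dr + 1][dc + 1] * pixel
    magnitude = math.hypot(gx, gy)                    # Eq. (1)
    angle = math.degrees(math.atan2(gy, gx)) % 180    # Eq. (2), folded into [0, 180)
    quantized = (round(angle / 45) % 4) * 45          # 0, 45, 90, or 135 degrees
    return magnitude, quantized


# Hypothetical 5x5 patch: dark columns on the left, bright columns on the right,
# which forms a vertical edge.
patch = [[0, 0, 255, 255, 255] for _ in range(5)]
mag, direction = gradient_at(patch, 2, 1)
print(mag, direction)
```

For this vertical edge the gradient points horizontally (direction 0), which is perpendicular to the edge, as the text describes.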

3.2 Hough Line Transform

Hough line transform is a feature extraction technique used to identify lines in a picture. In this technique, a line is described by the parameters m, b [19] in the Cartesian coordinate system and by the parameters r, θ in the polar coordinate system [20]. These representations were used to identify the lines of ancient buildings. In this research, a line is represented as y = mx + b or, in parametric form, as r = x cos θ + y sin θ. Solving the parametric form for y (with sin θ ≠ 0) gives the line equation for an image:

$$ y = \left( { - \frac{\cos \theta }{\sin \theta }} \right)x + \left( {\frac{r}{\sin \theta }} \right) $$
(3)
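As a small check of Eq. (3), the following sketch converts between the polar parameters (r, θ) and the slope-intercept parameters (m, b). The function names and the sample points are hypothetical and are not taken from the original experiment.

```python
import math


def polar_to_slope_intercept(r, theta):
    """Convert (r, theta) to (m, b) using Eq. (3); theta is in radians."""
    sin_t = math.sin(theta)
    if abs(sin_t) < 1e-12:
        # Eq. (3) divides by sin(theta), so vertical lines have no (m, b) form.
        raise ValueError("vertical line: slope-intercept form is undefined")
    return -math.cos(theta) / sin_t, r / sin_t


def polar_from_points(p, q):
    """Return (r, theta) of the line through two distinct points p and q."""
    (x1, y1), (x2, y2) = p, q
    # The line's normal is perpendicular to the direction vector from p to q.
    theta = math.atan2(x2 - x1, -(y2 - y1))
    r = x1 * math.cos(theta) + y1 * math.sin(theta)
    return r, theta


# Hypothetical points on the line y = 2x + 1.
r, theta = polar_from_points((0, 1), (2, 5))
m, b = polar_to_slope_intercept(r, theta)
```

The round trip recovers m = 2 and b = 1 up to floating-point error, which shows that both parameterizations describe the same line.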

3.3 Find Contours

Contours can be described simply as a curve joining all the continuous points along a boundary, and they are a useful tool for shape or object detection. The image moment technique has been used for finding the contours of an image. The spatial moment of an image is denoted mij, where i and j are the orders of the moment (the exponents of y and x, respectively, in Eq. (4)). Image moments were used to detect different features of the ancient buildings by matching different shapes. The image moment [21] is computed as:

$$ m_{ij} = \sum\limits_{x,y} \left( {\text{array}}(x,y) \cdot x^{j} \cdot y^{i} \right) $$
(4)
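A minimal sketch of Eq. (4), following the index convention exactly as written there (j is the exponent of x, i is the exponent of y). The function name raw_moment and the binary mask are hypothetical. The zeroth-order moment gives the area of the shape, and the first-order moments divided by the area give its centroid.

```python
def raw_moment(array, i, j):
    """m_ij as written in Eq. (4): sum of array(x, y) * x**j * y**i."""
    total = 0
    for y, row in enumerate(array):        # y indexes rows
        for x, value in enumerate(row):    # x indexes columns
            total += value * (x ** j) * (y ** i)
    return total


# Hypothetical binary mask: a 2x2 block of ones in columns 2-3, rows 1-2.
mask = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

area = raw_moment(mask, 0, 0)
centroid_x = raw_moment(mask, 0, 1) / area   # j = 1 weights each pixel by x
centroid_y = raw_moment(mask, 1, 0) / area   # i = 1 weights each pixel by y
```

Here the area is 4 and the centroid is (2.5, 1.5), the center of the block.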

3.4 Harris Corner Detection

The Harris corner detection method extracts the corners and infers the features of an image. It searches for the change in image intensity for a displacement of (u, v). In this method, the window function is a Gaussian window that gives weights to the image pixels underneath it. In Eq. (5) [22], E is the difference between the original and the shifted window, and I is the image intensity. The window’s displacement is u in the x direction and v in the y direction. The window w(x, y) is located at position (x, y). The term I(x + u, y + v) is the intensity of the shifted window, and the term I(x, y) is the original intensity.

$$ E(u,v) = \sum\limits_{x,y} {w(x,y)[I(x + u,y + v) - I(x,y)]^{2} } $$
(5)
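The behavior of Eq. (5) can be sketched on a toy image. For simplicity, the sketch defaults to a uniform window (w = 1 everywhere), which is an assumption that replaces the Gaussian window described above; a weight table can be passed in instead. The function name harris_energy and the test image are hypothetical.

```python
def harris_energy(image, top, left, size, u, v, weight=None):
    """E(u, v) from Eq. (5) for a size x size window with corner (left, top)."""
    total = 0.0
    for dy in range(size):
        for dx in range(size):
            x, y = left + dx, top + dy
            # Uniform weights unless a (for example Gaussian) table is supplied.
            w = 1.0 if weight is None else weight[dy][dx]
            diff = image[y + v][x + u] - image[y][x]
            total += w * diff * diff
    return total


# Hypothetical 6x6 image: a bright square filling the lower-right quadrant.
image = [[1 if x >= 3 and y >= 3 else 0 for x in range(6)] for y in range(6)]

flat = [harris_energy(image, 0, 0, 2, u, v) for u, v in ((1, 0), (0, 1))]
edge = [harris_energy(image, 3, 2, 2, u, v) for u, v in ((1, 0), (0, 1))]
corner = [harris_energy(image, 2, 2, 2, u, v) for u, v in ((1, 0), (0, 1))]
```

In the flat region E is zero for both shifts, on the vertical edge E is non-zero only for the shift across the edge, and at the corner E is non-zero for both shifts. This difference is what the detector relies on to separate corners from edges and flat areas.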

3.5 Training Dataset and Classification

To recognize the construction period or era, three classes were created for the Sultanate, Mughal, and British periods. After applying the feature detection techniques, a Decision Tree [23] was created based on the output of those techniques. A decision tree builds a classification in the form of a tree structure. It develops a set of “if-then” rules that are mutually exclusive. These rules are learned sequentially from the training data, one at a time. Table 1 shows the data classification of the training dataset. Here, the data are classified into three periods (Mughal era, Sultanate era, and British era). Every era contains three different features (Dome, Minaret, and Front) obtained from four different methods (Canny Edge Detector, Hough Line Transform, Find Contours, and Harris Corner Detector).

Table 1. Training dataset and classification of Mughal, Sultanate and British eras

4 Deep Neural Network (DNN) Model

In this experiment, a DNN approach has been developed. The input layer of the DNN has five nodes (x1, x2, x3, x4, and a bias unit). The inputs of the input layer are shown in Table 2 and Figs. 1 and 2. The mathematical structure [24, 25] of a node in the neural network is illustrated in Fig. 2. Here, a is the activation, b is the bias, and W is the weight of the input layer. A bias unit allows the activation to be shifted to the left or right, which helps successful learning.

Table 2. Input and inputs of DNN
Fig. 2. Deep neural network (DNN) for construction era identification

From Fig. 2, the equation for each activation node (a) is as follows:

$$ {\text{For hidden layer 1:}}\quad a_{i}^{(L)} = W_{i}^{(L - 1)} x_{i} + b_{i}^{(L - 1)} $$
(6)
$$ {\text{After hidden layer 1:}}\quad a_{i}^{(L)} = W_{i}^{(L - 1)} a_{i}^{(L - 1)} + b_{i}^{(L - 1)} $$
(7)

Here, Index = i; Activation = a; Current Layer = L; Previous Layer = L − 1; Input node = x; Bias Unit = b. In the equations below, f denotes the activation function. The computational algorithm of the developed DNN is represented as follows:

Layer, L = 2 (Hidden Layer 1):

$$ a_{1}^{(2)} = f(W_{1}^{(1)} x_{1} + W_{4}^{(1)} x_{2} + W_{7}^{(1)} x_{3} + W_{10}^{(1)} x_{4} + b_{1}^{(1)} ) $$
(8)
$$ a_{2}^{(2)} = f(W_{2}^{(1)} x_{1} + W_{5}^{(1)} x_{2} + W_{8}^{(1)} x_{3} + W_{11}^{(1)} x_{4} + b_{2}^{(1)} ) $$
(9)
$$ a_{3}^{(2)} = f(W_{3}^{(1)} x_{1} + W_{6}^{(1)} x_{2} + W_{9}^{(1)} x_{3} + W_{12}^{(1)} x_{4} + b_{3}^{(1)} ) $$
(10)

Layer, L = 3 (Hidden Layer 2):

$$ a_{1}^{(3)} = f(W_{1}^{(2)} a_{1}^{(2)} + W_{4}^{(2)} a_{2}^{(2)} + W_{7}^{(2)} a_{3}^{(2)} + b_{1}^{(2)} ) $$
(11)
$$ a_{2}^{(3)} = f(W_{2}^{(2)} a_{1}^{(2)} + W_{5}^{(2)} a_{2}^{(2)} + W_{8}^{(2)} a_{3}^{(2)} + b_{2}^{(2)} ) $$
(12)
$$ a_{3}^{(3)} = f(W_{3}^{(2)} a_{1}^{(2)} + W_{6}^{(2)} a_{2}^{(2)} + W_{9}^{(2)} a_{3}^{(2)} + b_{3}^{(2)} ) $$
(13)

Layer, L = 4 (Hidden Layer 3):

$$ a_{1}^{(4)} = f(W_{1}^{(3)} a_{1}^{(3)} + W_{4}^{(3)} a_{2}^{(3)} + W_{7}^{(3)} a_{3}^{(3)} + b_{1}^{(3)} ) $$
(14)
$$ a_{2}^{(4)} = f(W_{2}^{(3)} a_{1}^{(3)} + W_{5}^{(3)} a_{2}^{(3)} + W_{8}^{(3)} a_{3}^{(3)} + b_{2}^{(3)} ) $$
(15)
$$ a_{3}^{(4)} = f(W_{3}^{(3)} a_{1}^{(3)} + W_{6}^{(3)} a_{2}^{(3)} + W_{9}^{(3)} a_{3}^{(3)} + b_{3}^{(3)} ) $$
(16)

Layer, L = 5 (Output Layer):

$$ h_{W,b} (x) = a_{1}^{(5)} = f(W_{1}^{(4)} a_{1}^{(4)} + W_{2}^{(4)} a_{2}^{(4)} + W_{3}^{(4)} a_{3}^{(4)} + b_{1}^{(4)} ) $$
(17)

In Fig. 2, nodes are also used to denote the inputs to the network. The nodes labeled “+1” are bias units, corresponding to the intercept term. We denote by n_i the number of nodes (not counting the bias unit) in the neural network. The weight W_i^(L−1) denotes the parameter associated with the connection into unit i in layer L, coming from the previous layer L − 1. Bias units have no inputs or connections going into them, and they always output the value +1. We denote by a_i^(L) the activation of unit i in layer L. For L = 1, we set a_i^(1) = x_i to denote the ith input. The parameters W, b define the hypothesis h_{W,b}(x), which outputs a real number.
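The forward pass of Eqs. (8)–(17) can be sketched as follows for the 4-3-3-3-1 layout (four inputs, three hidden layers of three nodes, one output). Several things here are assumptions: the activation f is taken to be a sigmoid, since the text does not name it; the weights are stored as one row per node rather than with the flat W1…W12 numbering of the equations; and all parameter values are placeholders, because real values would come from training.

```python
import math


def sigmoid(z):
    """Assumed activation function f; the text does not specify it."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, f=sigmoid):
    """One layer: a_j = f(sum_k weights[j][k] * inputs[k] + biases[j])."""
    return [
        f(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Apply each (weights, biases) pair in turn; return the last activations."""
    a = x
    for weights, biases in layers:
        a = dense(a, weights, biases)
    return a


# Placeholder parameters for the 4-3-3-3-1 layout (not trained values).
layers = [
    ([[0.1] * 4 for _ in range(3)], [0.0] * 3),   # Eqs. (8)-(10)
    ([[0.2] * 3 for _ in range(3)], [0.0] * 3),   # Eqs. (11)-(13)
    ([[0.3] * 3 for _ in range(3)], [0.0] * 3),   # Eqs. (14)-(16)
    ([[0.4] * 3], [0.0]),                          # Eq. (17)
]
h = forward([1.0, 0.0, 1.0, 0.0], layers)[0]
```

With a sigmoid output, h lies strictly between 0 and 1, so it is a real-valued hypothesis in the sense described above.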

5 Convolution Neural Network (CNN) Model

A CNN has been created based on the developed DNN model. Generally, a CNN consists of multiple hidden layers, including convolution and pooling layers. Here, the CNN has been built with three convolution layers, two max-pooling layers, two fully connected layers, and a dropout layer (Fig. 3). After applying the feature detection methods, two datasets were obtained: a training set and a test set. The neural network model then provides the prediction of the construction period.

Fig. 3. CNN model for era identification of the old or ancient building

6 Results and Analysis

The output of the developed model is the identified construction era, where a program provides a probable output by learning the features of ancient buildings. This work has shown how a computer program learns several features of old buildings, such as Dome, Minaret, and Front. To evaluate the performance of the system, the data in the confusion matrix have been used. The CNN model has been trained with the modified dataset, and its accuracy has been calculated. Figure 4 shows the results of the CNN model, where the process successfully predicted the period from the picture of an ancient or old heritage building. Accuracy is used as a statistical measure of the test results. The formula for calculating accuracy is as follows:

Fig. 4. Results of construction era identification of old buildings by using CNN model

$$ {\text{Accuracy}} = \frac{{\text{TP}} + {\text{TN}}}{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}} \times 100\% $$
(18)

where, TP = True Positive; FP = False Positive; TN = True Negative; FN = False Negative.

In this research, a total of 500 test images were used: 270 from the Sultanate era, 130 from the Mughal era, and 100 from the British era. We obtained TP = 254, TN = 227, FP = 3, and FN = 16. Applying Eq. (18) to these raw counts gives (254 + 227)/500 = 96.20% accuracy.
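The arithmetic of Eq. (18) can be verified with a few lines of code using the counts reported above. The function name accuracy is chosen here for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy in percent, following Eq. (18)."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


# Counts reported for the 500 test images.
result = accuracy(tp=254, tn=227, fp=3, fn=16)
print(f"{result:.2f}%")  # prints "96.20%"
```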

7 Conclusion

This study has presented a model that demonstrates how an intelligent program can identify the construction era of an ancient or old heritage building. The research focuses mainly on the construction period and features of heritage buildings by using an artificial neural network and feature detection techniques. It achieved better accuracy (96.20%) than the previous method by using three periods (Mughal, Sultanate, and British eras) and four feature detection methods (Canny Edge Detector, Hough Line Transform, Find Contours, and Harris Corner Detector).

Some limitations remain in this study, and further issues are to be resolved. In particular, if the model is tested on a low-resolution picture, it cannot determine the target result. Addressing this drawback is left to future work, which aims to make the proposed model more robust and better able to recognize precise objects in the image.