1 Introduction

Railroad tracks are a prime focus in the railroad environment as they largely benefit the community through passenger comfort, effective railroad operations, and train speed modulation. A railroad track comprises two rail lines, the track space between these lines, and the lateral area (to the right and left of the rail lines) comprising the ballast, as shown in Fig. 1a. Railroad track monitoring is essential to ensure the effective maintenance of operational rail lines in the safety-critical railroad infrastructure. Railroad track extraction is an important prerequisite step in the track monitoring module, as it rapidly segments the railroad track (target area). This extraction of the track area facilitates tasks such as inspection and detection of rail surface defects [1, 7], automated fastener classification and defect detection [4], train driver assistance and obstacle identification [2, 12], provision of autonomous train control systems [3], computation of asset-sighting distance [13], and vegetation condition monitoring [14] in a convenient and cost-effective manner.

Fig. 1
figure 1

a Study Image 1 b Study Image 2 c Study Image 3 d Study Image 4 with their corresponding image histogram in RGB

There has been an advent of various railroad track image acquisition systems (IAS) in conjunction with research on models using image processing and computer vision techniques for track detection and extraction. Dynamic programming is used for the extraction of railroad track space from video sequences captured by a camera placed in front of a train engine [2]. A combination of image processing algorithms and a line segment detector (LSD) is used for track detection in images captured through cameras hung below a train coach [4]. With the camera fixed behind the windshield, track extraction has been carried out using techniques such as a priori shape model and gradient information [5, 6]. Semantic segmentation with U-Net is computed on images collected by a visual-based track inspection system (VITS) in order to extract rail tracks and locate ROIs [7]. Several other methods exist, including HOG feature extraction for track detection in images captured through a camera placed in a vehicle [8], histogram-based track extraction (HBTE) where the IAS is mounted under a test train [9], geometry constraints-based track extraction in images captured from a camera installed at the front of a train [10], and track extraction based on projection profile (TEBP) in rail track images captured by a camera fixed on a train [11]. Target tracks have also been segmented by feature analysis in the hue channel [1]; however, this method uses a portable track defect vision inspection prototype for image acquisition. An angle alignment measure is computed for real-time railroad extraction in images captured using a mobile phone [12]. Rail track detection is performed on image data captured from a rail-mounted vehicle camera using cubic Bezier curves [13]. HOG and mean-shift clustering are used for the detection of rails in images captured from a DSLR camera placed within a trolley [14].

However, these aforementioned IAS have varying inevitable drawbacks, which include high cost, limited detection range, closure of the railroad to normal train traffic during operations, and inaccessibility of remote geographical locations. Therefore, researchers are exploring drone-based image acquisition, which may come to the rescue in such scenarios. Usage of drones or unmanned vehicles for railroad infrastructure monitoring is the latest trend and, currently, camera sensors are the richest data sources [15, 17]. Drones are lightweight UAVs (unmanned aerial vehicles) which provide various advantages such as cost-effectiveness, efficient acquisition of track images without blocking railroad traffic, ease of control, and flexibility when targeting areas inaccessible to human inspectors, inspection trains, or rail-mounted vehicles.

Although the drone-based image acquisition module offers various advantages, still research on drone-based railroad track extraction faces the following challenges:

  1. Due to different sunlight conditions (sunny, partially sunny/cloudy), there might be illumination inconsistency. Also, partial occlusion due to infrastructure (such as overhead catenaries and their support structures), rail reflectance properties, shaking of the drone, and other environmental aspects lead to uneven and low contrast and brightness of the captured drone images (DI). This necessitates an adaptive method for track extraction under uneven illumination.

  2. Rail line positions vary in drone images. The angle of the HD camera installed on the drone is sensitive to various environmental factors (such as turbulence and wind) as well as to the operator. Although a drone can balance itself in GPS flight mode, the positions of rail lines are still extremely variable in drone images. These variances in rail line positions lead to complex track extraction scenarios.

  3. Flying the drone at various flight heights may capture different views of the same area or different areas. Also, poles lying beside the rail lines, oil lines between the rail lines, track areas under grass, and trains running over the railroad tracks are some of the complex railroad environments in the captured DI which make track extraction a difficult task.

Due to these challenges, it is difficult to apply the various IAS-based track detection and extraction models discussed above to drone railroad track images. This motivates us to overcome these shortcomings and propose a novel railroad track extraction framework that can locate the rail lines and extract the railroad track from drone images acquired under uneven sunlight intensity, at varying flight heights, and with different rail line positions.

1.1 Contributions

The main contributions of the proposed work are summarized as follows:

  1. We propose a novel and adaptive railroad track extraction method for drone images, termed DroneRTEF. The framework computes a colour feature-based adaptive image enhancement of DI to locate rail lines and a novel Hough parameter space-based image analysis method for track extraction. To the best of our knowledge, this is the first attempt to examine a hybrid approach of colour features and Hough parameter space analysis for railroad track extraction in drone images captured at varying flight heights, with different rail line orientations, and in uneven illumination.

  2. Due to the unavailability of a standard railroad track dataset comprising drone images captured in diverse railroad environments, we also perform image acquisition to build datasets of track DI for this research.

  3. The performance of the proposed rail line detection module is evaluated and validated through different experiments using three metrics, and is compared with another rail line detection algorithm.

The organization of the article is as follows: In Sect. 2, algorithms related to railroad track detection and extraction are discussed. Image acquisition and dataset description are presented in Sect. 3. The theoretical background is presented in Sect. 4. The proposed novel adaptive railroad track extraction framework is discussed in Sect. 5. In Sect. 6, we present experimental results and discussions, including the evaluation and validation of the proposed method. Finally, the paper is concluded along with future work in Sect. 7.

2 Related work

A typical approach for railroad track extraction can be carried out using image processing algorithms [18]. These algorithms can be generalized into two steps: image enhancement and image analysis. The image enhancement step pre-processes the track drone image and then segments it in order to highlight the rail line feature information. This is advantageous as it helps to eliminate or weaken the interference of background information and identify the rail lines. The image analysis step extracts the features (shape/geometric features) of the identified rail lines. Several studies have previously been reported in the literature on railroad track image enhancement and analysis for track detection and subsequent extraction in DI.

The local Weber-like contrast (LWLC) algorithm has been used for the enhancement of rail images [19]. However, this method is performed on a local window after the rail track extraction step; it therefore gives little insight into whether LWLC is an effective global image enhancement step for track extraction in high-altitude drone images. Colour pattern template matching is computed to constrain rail search locations by detecting the railroad profile using an ROI technique [20], although these results are only possible if a template of the railroad of interest is available. Histogram equalization (HE) is also a popular method for image enhancement under global lighting conditions [19, 20, 21]. However, it poses disadvantages such as loss of detail information from the image and noise amplification [19]. The mask bank technique has been found advantageous for locating railroad tracks in aerial images [22]. However, this mask bank contains masks that run only from top to bottom. Also, this method works in two phases, training and detection, where further testing is required to prove functionality. Nonetheless, masks prove to be versatile and provide a quick way to access information. Colour is also a distinctive and powerful feature for identification in the railroad environment [1, 23, 24]. It has been observed that colour spaces in conjunction with masks are quite helpful for rail line identification in DI when working in complex railroad environments [22, 25]. Through this critical analysis, it is observed that colour space-based masking seems an effective image enhancement method for the segmentation and identification of rail lines in railroad track DI captured under uneven illumination.

In earlier studies, a range of image processing techniques have been developed for image analysis of rail lines in DI. A linear regression model is used for detecting valid rail lines [26]; however, regression methods are less robust to noisy data [27]. Several other rail line detection algorithms have also been discussed, such as correlation matching, Touzi filters, and CNN-based semantic segmentation [28]. All these methods hint at using a prior infrastructure ground coverage mask for improved performance [28]. A Canny detector and a k-NN mean Euclidean distance classifier have also been used for the detection of rail tracks [29]. In that work, the k-NN classifier faces challenges as it is slow, consumes a lot of memory, and requires rail edges for training. Deep drone vision uses a two-step CNN for rail track detection [30]. However, this method is computationally intensive as it requires training, and the dataset is limited by its inability to accommodate varying degrees of illumination at different times of the day. The Hough transform (HT) is based on the duality of points and lines and can be used for railroad track extraction [19, 31]. The Hough transform and the cumulative grey value of each pixel column have been implemented for track extraction in drone images captured only at 30 m [19]. Several variants of this transform have already been discussed in the literature [17, 31, 32], including the probabilistic Hough transform (PHT) [31]. However, PHT cannot replace HT owing to its computational complexity [33]. The Hough transform (an under-constrained line fitting algorithm) is advantageous over other line detection algorithms such as RANSAC (constraint-bound) [6, 17] and least-squares fitting (over-constrained). Also, the Hough transform can handle a higher percentage of outlier points and is best suited to highly noisy data [34].
Therefore, inspired by these successes, HT is a suitable solution for the geometrical representation, in Hough parameter space, of rail lines oriented at varying angles. Nonetheless, the detection of rail lines in drone images captured at different altitudes is another essential step for track extraction at different flight heights. To fulfil this aim, the ground sample distance (GSD) method [35] is useful, as changing the altitude of the drone alters the GSD of the image, which in turn helps to detect valid rail lines (rail line pairs) at different flight heights and subsequently extract the railroad track. The other advantages of the method include: global image transformation of drone images, which provides adequate insight into track extraction; no requirement for a template of the railroad of interest; no requirement for ground truth at any stage of the framework; and no training, which makes the algorithm less computationally intensive.

Therefore, our aim is railroad track extraction in DI captured under uneven illumination, with varying rail line orientations, and at different flight heights. To achieve this aim, we propose the novel DroneRTEF framework in this paper as follows:

  1. Present a global image enhancement algorithm termed adaptive colour space-based masking (ACSM). The algorithm is adaptive in that it computes a colour space transformation to enhance DI captured under uneven natural light and then creates a track mask to segment these images and identify the rail lines in them.

  2. Propose a novel Hough parameter space analysis-based Hough transform-ground sample distance (HT-GSD) algorithm for rail line detection and track extraction. This image analysis method first detects valid identified rail lines at varying orientations and flight heights, which then facilitates segmentation of the required railroad track area.

3 Image acquisition and dataset description

3.1 Study area and data sets used

In this work, image acquisition has been carried out using a DJI Phantom quadcopter. This drone is equipped with a high-definition 4K-resolution RGB colour camera with an integrated GPS unit, and captures geotagged standardized RGB (sRGB) images, each of size 3000 × 4000 (\(I_{h} \times I_{w}\)) pixels. The acquired track datasets contain about 415 track DI, captured during various flights over railroad tracks at different track locations near Roorkee, Haridwar district, India. These DI are field-acquired frame by frame at varying flight heights and rail line orientations, at different locations and times, and capture a gamut of variations in complex railroad environments under different sunlight illumination.

Some of these railroad track images have been selected as study images, as shown in Fig. 1. These images are taken from the field-captured track datasets for evaluating the proposed framework. The details of these study images, listed in Table 1, include their data ID, date of acquisition (DOA), flight height (\(F_{h}\)) (in metres (m)), central latitude and longitude, GSD, illumination conditions, and the total number of images in the dataset from which the corresponding study image has been selected. The railroad environment scenarios in which these study images were captured are described as follows:

Table 1 Dataset description

Scenario 1: Study image 1 (D1), shown in Fig. 1a, is captured at 4:48 p.m. at a flight height of 22 m on a sunny day with little occlusion, and contains poles beside the tracks and oil lines.

Scenario 2: Study image 2 (D2), shown in Fig. 1b, is captured on a bright sunny day at 6:04 p.m. at a 25 m flight height and showcases illumination irregularity with partial occlusion on the track. Other components include a train running over the railroad track, catenary wires, and dense bushes along the track.

Scenario 3: Study image 3 (D3), shown in Fig. 1c, is captured at 2:50 p.m. at 25 m. Although the images in Fig. 1b and c are captured at the same height, the image in Fig. 1c shows a low-brightness (dark) environment. It also comprises assets along the track and overhanging catenary wires.

Scenario 4: Study image 4 (D4), shown in Fig. 1d, is captured at 11:53 a.m. at 11 m on a brighter day and showcases varying rail line orientations, along with track areas covered with vegetation and shadowed by catenary wires.

3.2 Challenges faced while processing DI

As discussed in the Introduction (Sect. 1), DI show variations in terms of railroad tracks captured under different sunlight intensities, with varying rail line orientations, at different flight heights, and in complex railroad environment scenarios. These variations make it challenging to process DI for rail line segmentation and identification. In such scenarios, the following observations have been made:

  1. Image characteristics vary under challenging railroad imaging conditions. Each of the study image (D1, D2, D3, D4) plots in Fig. 1a–d shows the pixel intensity count for the red, green, and blue channels of the corresponding DI in the form of red, green, and blue histograms, respectively. It can be observed that different DI exhibit different histogram behaviours when viewed in RGB colour space.

  2. Rail lines are a constant feature in all the DI. However, the histogram analysis shows no uniform histogram pattern indicating the presence of rail lines among the colour channel(s) of different images. This is because the rail line area is very small and the external effects (such as illumination) are quite inconsistent. Therefore, it is difficult to accurately segment and identify the rail lines from the background using a histogram-based thresholding method, which would otherwise be useful for creating a railroad track mask comprising only the rail lines.

  3. The aim is to segment and identify only rail lines in DI. For this purpose, various thresholding algorithms such as Otsu's method [36] and entropy-based thresholding [37] have also been attempted. However, it has been observed that it is difficult to apply a common thresholding algorithm to all images in order to obtain a track mask.

Therefore, this makes it a challenging task to devise an algorithm to create a track mask in order to segment and identify the rail lines in DI captured under varied image acquisition scenarios in the railroad environments.
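The per-channel histogram analysis described above can be sketched as follows. This is a minimal, standard-library illustration: the function name and the tiny synthetic pixel list are ours, standing in for a full 3000 × 4000 DI.

```python
def channel_histograms(pixels):
    """256-bin intensity histogram per colour channel for a list of
    RGB pixel triples, as plotted per channel in Fig. 1."""
    hists = [[0] * 256 for _ in range(3)]
    for pixel in pixels:
        for c, value in enumerate(pixel):
            hists[c][value] += 1
    return hists

# Tiny synthetic stand-in for a drone image's pixel list.
pixels = [(200, 0, 0)] * 16
hists = channel_histograms(pixels)
# all 16 red-channel values fall in bin 200
```

Comparing such histograms across D1–D4 is what reveals that no uniform pattern marks the rail lines, motivating the colour space analysis of Sect. 4.1.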

4 Theoretical background

In view of the discussion in Sect. 2, it is essential to address non-uniform illumination, rail line orientations, and changes in flight height in order to detect rail lines and extract railroad tracks from DI. As discussed in Sect. 3, there is no uniform histogram pattern with which to threshold only the rail lines (rail line area) in the images, and even other thresholding algorithms do not yield good results. Therefore, colour space transformations are essential for track mask creation to identify only the rail lines in DI with uneven illumination. Also, analysis of the GSD is essential as it facilitates track extraction at various flight heights and rail line orientations. To fulfil these aims, we have critically analysed the following parameters.

4.1 Colour space transformations

Colour is a fundamental descriptive property of objects in various object identification and segmentation frameworks [38,39,40]. However, perceptual subjectivity and application-specific requirements make it uncertain which colour space best fits a given application during the segmentation process. In our work, no prior preference has been given to any specific colour model (colour space). The human eye is considered more sensitive to colour changes than to grey-scale variations. Hence, we have concentrated on the following colour models: RGB, HSV, L*a*b*, and YCbCr [40]. The colour space transformation functions are described in Table 2.

Table 2 Colour space transformations

The RGB colour model is widely used for the storage of digital data in railways [44]. The colour space is additive in nature and is defined by the sum of three primary colours, represented by the red (R), green (G), and blue (B) chromaticities. The aim is to obtain a railroad track mask which comprises only the rail lines, while the rest of the background (vegetation, sleepers, ballast, poles lying beside the rail lines) in DI is removed. However, the RGB colour space has disadvantages: (1) it is device-dependent; (2) it suffers from the inherent problem of mixing luminance and chrominance data (present in all three channels) [41, 42].

Hue, saturation, and value (HSV) is a cylindrical transformation [23] in which the H, S, and V components denote the dominant wavelength, purity of the colour, and intensity value, respectively. H and S are invariant to shading and shadows [40]. Also, the hue (H) component is found to be independent of illumination conditions [1] and specularities [40]. The equations for the RGB to HSV space transformation function T are stated in Table 2. Although the HSV colour space is device-dependent, it offers the advantage that the colour information is intact in only one channel (H). It also separates luminance from chrominance information [42]. Furthermore, the colour space makes it easy to select a desired hue, which can be slightly modified by adjusting the saturation and value [42, 45]. HSV histograms for the images are shown in Appendix A.

The luminance-based L*a*b* (CIELAB or Lab) colour space stems from the CIE colour model and is considered a perceptually uniform colour space [24, 40]. The L*a*b* colour space is device-independent, and the colour information is encoded in only two colour channels (a* and b*). The RGB to L*a*b* conversion is carried out using the respective transformation function T stated in Table 2.

The YCbCr colour space is considered approximately perceptually uniform [40]. The YCbCr colour space is device-dependent, and the colour information is encoded in Cb and Cr (only two colour channels). Additionally, it is advantageous as it separates RGB into chrominance and luminance information [42]. The RGB to YCbCr conversion formula T is stated in Table 2.

4.2 Edge characterization and geometrical representation

The prerequisite to the rail line detection process is the creation of an edge map (discussed in Sect. 5.2). This edge map comprises the edges detected from the railroad track mask using an edge detection algorithm. Such an algorithm should provide localization and detect the right number of rail line edge responses. Localization corresponds to detecting rail line edges at the right location.

The accurate shape and geometrical representation of edge map in Hough parameter space are necessary for rail line detection as many objects in the railroad environment are characterized by straight lines. It is essential to determine which edge belongs to which line and record all possible lines upon which edge points (of the edge map) lie (discussed in Sect. 5.3).

4.3 GSD calculation

GSD is defined as the distance between the pixel centres of the camera sensor as measured on the ground, and it indicates how big each pixel is on the ground. As the flight height changes, the corresponding pixel size (GSD, in cm) changes, as shown in Table 1. This change in pixel size leads to a corresponding increase or decrease in the number of pixels between the two valid rail lines of a rail line pair. Therefore, the calculated number of pixels is essential during rail line pair selection (discussed in Sect. 5.3.3). Consequently, GSD helps in railroad track extraction from DI at any given flight height \(F_{h}\) and makes the algorithm adaptive. The calculated distance, \(GSD\), is a function of the flight height, sensor dimensions, and image measurements (in pixels). For any pixel, the \(GSD\) mapping on the ground is represented by the pair GSD length (\(GSD_{h}\)) and GSD width (\(GSD_{w}\)), computed as in (1), (2) and (3) [35]:

$${\text{GSD}}_{h} = \frac{{F_{h} \times S_{h} }}{{{\text{Flen}} \times {\text{I}}_{h} }}$$
(1)
$${\text{GSD}}_{w} = \frac{{F_{h} \times S_{w} }}{{{\text{Flen}} \times {\text{I}}_{w} }}$$
(2)
$${\text{GSD}} = {\text{GSD}}_{h} \times {\text{GSD}}_{w}$$
(3)

Here, \(F_{h}\) denotes the flight height above ground, as given in Table 1, \({\text{Flen}}\) represents the focal length of the camera sensor, \(S_{w}\) and \(S_{h}\) are the sensor width and height, respectively, and \(I_{h}\) and \(I_{w}\) are the image height and width, respectively (in pixels).
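As a concrete sketch of Eqs. (1)–(3), the following computes the GSD pair for a given flight height. The sensor and lens values in the example are illustrative assumptions for a small quadcopter camera, not the exact specifications of the camera used in this work.

```python
def compute_gsd(f_h, s_h_mm, s_w_mm, flen_mm, i_h_px, i_w_px):
    """GSD per Eqs. (1)-(3): flight height F_h (m), sensor height/width
    (mm), focal length Flen (mm), image height/width (px).
    The millimetres cancel, so GSD_h and GSD_w are in metres per pixel."""
    gsd_h = (f_h * s_h_mm) / (flen_mm * i_h_px)   # Eq. (1)
    gsd_w = (f_h * s_w_mm) / (flen_mm * i_w_px)   # Eq. (2)
    return gsd_h, gsd_w, gsd_h * gsd_w            # Eq. (3)

# Illustrative small-sensor values at a 25 m flight height,
# with the 3000 x 4000 (I_h x I_w) image size used in this work.
gsd_h, gsd_w, gsd = compute_gsd(25, 4.55, 6.17, 3.61, 3000, 4000)
```

Doubling \(F_{h}\) doubles both \(GSD_{h}\) and \(GSD_{w}\), which is why the pixel distance between the two rails of a pair depends on flight height (Sect. 5.3.3).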

5 The proposed algorithm

The primary objective of the proposed DroneRTEF algorithm is to provide a framework for railroad track extraction from DI captured under uneven sunlight intensity, with different rail line orientations, and at varying flight heights in the railroad environment. This aim is facilitated by the capability of the algorithm to characterize the colour, shape, and geometric features of rail lines. The proposed DroneRTEF framework is divided into two stages:

  1. The ACSM algorithm is developed to identify rail lines in DI captured in uneven illumination. To achieve this, an adequate colour space is selected and thresholding is performed to create a railroad track mask. This mask segments the DI to identify the rail lines in them.

  2. The HT-GSD algorithm is a novel method for Hough parameter space-based characterization of the shape and geometric features of rail lines. It uses the Hough parameters \(\rho\) and \(\theta\) to detect the identified rail lines and then performs rail line pair selection. Coordinate transformation is then computed to extract the railroad track from DI captured at different orientations and flight heights.

The complete flow diagram of the proposed algorithm is shown in Fig. 2, and the steps are discussed in Sects. 5.1, 5.2 and 5.3.

Fig. 2
figure 2

DroneRTEF framework

5.1 Adaptive Colour Space-based masking (ACSM)

To create a railroad track mask that identifies rail lines in DI, we propose a global image enhancement algorithm, adaptive colour space-based masking (ACSM), described in Algorithm 1 in Fig. 3. As mentioned in Sect. 4.1, four colour spaces (RGB, HSV, L*a*b*, YCbCr) have been considered for railroad track mask creation in DI. The visual analysis of the track masks is discussed in Sect. 5.1.1.

Fig. 3
figure 3

ACSM algorithm

The track mask creation method is devised to perform rail line identification under uneven sunlight illumination in DI. This task is challenging because:

  1. Rail lines occupy small areas and are difficult to locate uniformly in DI.

  2. A greater variation in the range of the grey values has been observed in the global scope. This may be due to the reflectance properties of the railroad surfaces and uneven illumination.

The ACSM method is devised to overcome these challenges and create a track mask. Each step is given in Algorithm 1 and is performed as follows:

  1. Global image transformation: In this step, we evaluate different colour spaces based on their discriminative power and their ability to provide photometric invariance in DI. Different colour space transformations \(T\) (Sect. 4.1) are applied using Eqs. (4), (5) and (6), with the aim of enhancing the DI \(D_{i}\).

    $$o_{j} = t_{j} (l_{1} ,l_{2} ,l_{3} , \ldots ,l_{n} )\quad j = 1,2,3, \ldots n$$
    (4)
    $$T = \{ t_{1} ,t_{2} ,t_{3} \ldots t_{n} \}$$
    (5)
    $$O{}_{i} = T[D_{i} ]$$
    (6)

    where \(O_{i}\) is the transformed (enhanced) colour output drone image, \(D_{i}\) is the colour input drone image, \(o_{j}\) and \(l_{j}\) denote the colour components of the output image \(O_{i} (x,y)\) and input image \(D_{i} (x,y)\) at any point (x, y), \(t_{j}\) denotes a colour mapping function, \(n\) equals the number of colour components, and the \(n\) functions \(t_{j}\) combine to implement one transformation function \(T\).

  2. Track mask creation: The transformed image \(O_{i}\) is now segmented to identify rail lines using Eq. (7). Colour threshold-based segmentation is performed in each colour space on the corresponding \(o_{j}\) components in order to obtain the mask \(M_{i}\). The thresholds used in Eq. (7) are determined through critical analysis of different lower and upper limits in the range of intensity values of the colour channels. Thus, in this mask only the rail lines may be visible, while the rest of the background in the DI (such as catenaries, assets along the track, track signs, vegetation along the track, ballast, sleepers, and pole lines) may be eliminated.

    $$M_{i} = O_{{i|{\text{thresh}}}}$$
    (7)

    where \(M_{i}\) denotes the track mask and \(O_{{i|{\text{thresh}}}}\) denotes the thresholded image \(O_{i}\) (its \(o_{j}\) components) in the range \([{\text{low\_limit}}_{{{\text{color\_space}}}} ,{\text{upper\_limit}}_{{{\text{color\_space}}}} ]\), i.e. the range of intensity values of the colour channels for enhanced rail lines in the respective colour space.

Therefore, the ability of the ACSM method to identify rail lines in DI captured under uneven sunlight intensity makes it adaptive.
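A minimal per-pixel sketch of the ACSM idea (Eqs. (6) and (7)) using the standard-library HSV conversion: the threshold ranges below are hypothetical placeholders for the critically analysed limits, and `acsm_mask` is our illustrative name, not the paper's implementation.

```python
import colorsys

def acsm_mask(rgb_pixels, h_rng, s_rng, v_rng):
    """Transform RGB pixels to HSV (the selected colour space, Eq. (6))
    and threshold each channel (Eq. (7)) to build a binary track mask."""
    mask = []
    for r, g, b in rgb_pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        inside = (h_rng[0] <= h <= h_rng[1] and
                  s_rng[0] <= s <= s_rng[1] and
                  v_rng[0] <= v <= v_rng[1])
        mask.append(1 if inside else 0)
    return mask

# Bright low-saturation pixel (rail-like) vs. dark green (vegetation).
mask = acsm_mask([(200, 200, 205), (40, 120, 30)],
                 h_rng=(0.0, 1.0), s_rng=(0.0, 0.2), v_rng=(0.6, 1.0))
# mask == [1, 0]: rail-like pixel kept, vegetation rejected
```

In practice the thresholding would run vectorized over the whole \(O_{i}\) image; the point here is only the transform-then-threshold structure of Eqs. (6) and (7).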

5.1.1 Visual Analysis

The main aim of the visual analysis is to choose a colour space in which the respective colour space transformation carried out on the track DI yields a railroad track mask consisting of only rail lines. In this way, the railroad mask is able to eliminate uneven illumination and background in DI. Track masks for the D2 and D3 images in all four colour spaces are shown in Fig. 4.

Fig. 4
figure 4

Railroad track masks for images D2 (above) and D3 (below) in a RGB colour space b HSV colour space c L*a*b* colour space d YCbCr colour space

Upon applying the thresholds, the RGB mask is created. It can be observed in Fig. 4a that, even though the rail lines in the RGB mask of D2 are prominent, they appear along with clutter (ballast between the rail lines and along the left and right rail lines), while the rail lines in the mask for D3 are invisible. Hence, it is not feasible to consider the RGB colour space mask for further edge map creation.

Upon creation of the HSV mask for D2, the rail lines can be definitively traced, as shown in Fig. 4b. Also, the HSV mask for D3 presents quite distinct rail lines with negligible background clutter (ballast, pole lines, etc.), as seen in Fig. 4b. Hence, the HSV colour space has been considered for further evaluation.

Upon computation of the L*a*b* mask in Fig. 4c, we observe that the ballast surrounding the rail lines in the D2 mask is also distinctly visible, while the rail lines in the D3 mask remain invisible. Hence, we discard this colour space due to its inability to yield a proper railroad track mask comprising only rail lines.

The YCbCr mask for D2 in Fig. 4d shows the rail lines distinctly. However, the D3 mask has white blobs and heavy clutter in the railroad background, as shown in Fig. 4d. Hence, the YCbCr colour space mask is not examined further.

Therefore, the HSV colour space is selected for track mask creation. The masks for the study images are shown in Figs. 6b, 7b, 8b, and 9b, and they act as the input for edge map generation.

5.2 Edge Map Generation

The objective of this step is to generate an edge map from the railroad track mask (obtained in Sect. 5.1). For this purpose, a suitable edge detection algorithm needs to be selected, and we have examined the following techniques: Prewitt, Sobel, Roberts, LoG, and Canny [10, 23]. Upon evaluation of the comparison results, Canny is selected as the suitable edge detector [17] for rail lines.

The Canny edge detector offers the following advantages:

  1. The Canny edge detector [46] uses hysteresis thresholding, which takes the connectivity characteristic of edges into consideration.

  2. The first derivative (gradient) of a Gaussian closely approximates the optimal edge operator, which maximizes the product of localization and signal-to-noise ratio.

  3. Canny uses the direction of the gradient to detect the rail edges.

The smoothing of the masked drone image is performed using a Gaussian kernel of spread \(\sigma\). The smoothed image is filtered with a Sobel kernel of size \(Sob_{siz}\) to compute the image gradients in the horizontal (\(M_{i|x}\)) and vertical (\(M_{i|y}\)) directions. The gradient magnitude (\(M_{i|canny}\)) and gradient direction (\(\theta\)) are computed as in Eqs. (8) and (9):

$$M_{i|\text{canny}} = \sqrt{M_{i|x}^{2} + M_{i|y}^{2}}$$
(8)
$$\theta = \tan^{ - 1} \left( {\frac{{M_{i|y} }}{{M_{i|x} }}} \right)$$
(9)

Then, hysteresis thresholding is applied on all the edges, with threshold values minval and maxval, in order to remove small noise pixels (edge detections in the mask apart from rail line edges) and obtain localized rail line edge detections. These detections are vividly outlined for all the study images using the Canny operator, as shown in Figs. 6c, 7c, 8c, and 9c.

5.3 Geometrical Representation of Hough Parameter Space

The goal of this step is to take the generated edge map (computed in Sect. 5.2) as input and perform efficient Hough parameter space evaluation in order to obtain the geometrical representation of the rail lines and detect these lines at varying orientations. Thereafter, GSD-based line pair selection facilitates detection of rail line pairs at different flight heights. Railroad track extraction is then performed through coordinate transformation. This algorithm, termed the HT-GSD method, is described as Algorithm 2 in Fig. 5.

Fig. 5
figure 5

HT-GSD algorithm

5.3.1 Candidate rail line detection

In order to detect the rail lines, their shape representation is obtained first. To accomplish this, we compute the Hough transform [47] on the railroad track edge map to detect rail lines as straight lines running through the DI. A rail line is represented as a straight line in Cartesian coordinates as in Eq. (10):

$$y = gx + c$$
(10)

where (x, y) is a point on the line, g is the gradient and c is the y-intercept. In the Hough transform, a rail line is represented in the polar coordinate system as a function of two parameters, \(\rho\) and \(\theta\), as depicted in Eq. (11):

$$\rho = x\cos \theta + y\sin \theta$$
(11)

where \(\rho\) is the perpendicular distance from the coordinate origin ((0,0), the top-left corner of the image) to the line, and \(\theta\) is the angle this perpendicular forms with the horizontal axis. This geometrical representation of the Hough parameter space in the DI is shown in Figs. 6a, 7a, 8a and 9a. The Hough transform for the study images is computed in Figs. 6d, 7d, 8d, and 9d.

Fig. 6
figure 6

Study Image 1: D1 a Geometrical Representation of Hough Parameter Space b HSV Colour Space Mask c Canny edge detection d Hough Transform e Detected lines based on Hough Parameter Space Evaluation f HT-GSD-based detected lines g Corresponding extracted tracks

Fig. 7
figure 7

Study Image 2: D2 a Geometrical Representation of Hough Parameter Space b HSV Colour Space Mask c Canny edge detection d Hough Transform e Detected lines based on Hough Parameter Space Evaluation f HT-GSD-based detected lines g Corresponding extracted tracks

Fig. 8
figure 8

Study Image 3: D3 a Geometrical Representation of Hough Parameter Space b HSV Colour Space Mask c Canny edge detection d Hough Transform e Detected lines based on Hough Parameter Space Evaluation f HT-GSD-based detected lines g Corresponding extracted tracks

Fig. 9
figure 9

Study Image 4: D4 a Geometrical Representation of Hough Parameter Space b HSV Colour Space Mask c Canny edge detection d Hough Transform e Detected lines based on Hough Parameter Space Evaluation f HT-GSD-based detected lines g Corresponding extracted tracks

In order to hold the values of the two parameters, the Hough parameter space is represented as a 2-D accumulator array, \(Acc(\rho ,\theta )\), which stores the \((\rho ,\theta )\) value pairs of the detected lines. The values of \(\rho\) are a function of the distance resolution \(\rho_{acc}\) (in pixels), and the values of \(\theta\) are governed by the angle resolution \(\theta_{acc}\) (in radians). Here, \(\rho_{acc}\) and \(\theta_{acc}\) denote the minimum difference between any two \(\rho\) and \(\theta\) values, respectively. The generated rail lines cover a range of angles \(\theta \in [0,180]\), with the angle then measured in radians. If the angle resolution \(\theta_{acc}\) is set to a larger value, fewer \((\rho ,\theta )\) pairs are recorded; however, this may decrease the orientation precision of the rail lines. All drone images are of the same size, but the lengths of the rail lines may differ as they are oriented at different angles (as seen in Fig. 1). The parameter values \((\rho_{t}, \theta_{t})\) of the Hough space whose number of votes is equal to or greater than the threshold \(H_{the}\), which indicates the minimum line length or the minimum number of intersections required to detect a line, are recorded in \(houg(\rho_{t}, \theta_{t})\), and the corresponding lines are selected as rail line candidates.

5.3.2 Parameter space evaluation

The candidate rail lines in the DI (computed in Sect. 5.3.1) are aligned in the horizontal or vertical direction at different orientations. The analysis of this rail line alignment (direction) is important so that all rail lines located horizontally or vertically at varying orientations are detected. In the geometry of rail lines, the Hough parameters \(\rho\) and \(\theta\) play a fundamental role. Therefore, it is essential to extract the parameter space values \(\rho_{i}\) and \(\theta_{i}\) from \(houg(\rho_{t}, \theta_{t})\) which, upon Cartesian coordinate transformation, produce lines that map approximately to the rail line locations in the correct direction and at proper orientations in the DI. It is observed that \(\rho_{first}\), the first \(\rho\) value of \(houg(\rho_{t}, \theta_{t})\), is indicative of the direction of the rail lines in the DI and is examined to determine whether the rail lines are horizontally or vertically oriented. For this, the following relationships are evaluated:

$$\text{rhoLines}(\rho_{i},\theta_{i}) = \begin{cases} \rho_{i} = \rho_{t},\;\; \theta_{i} = \theta_{t} & \text{if } \rho_{first} > 0,\, \rho_{t} > 0 \;\text{ or }\; \rho_{first} < 0,\, \rho_{t} < 0 \\ t = t + 1 & \text{otherwise} \end{cases}$$
(12)

where \(\rho_{t}\) is the value of \(\rho\) in every iteration over the Hough parameter space \(houg(\rho_{t}, \theta_{t})\), \(\theta_{t}\) is the value of \(\theta\) corresponding to \(\rho_{t}\), \(\rho_{i}\) is the value of \(\rho_{t}\) greater or less than zero based on the conditions evaluated in Eq. (12), and \(\theta_{i}\) is the value of \(\theta\) corresponding to \(\rho_{i}\). These parameter values \((\rho_{i}, \theta_{i})\) are recorded and updated in another array, \(rhoLines(\rho_{i}, \theta_{i})\). The conditions stated in Eq. (12) hold in both cases, i.e. whether the orientation of all the rail lines is the same or differs.

After examination of the aforementioned conditions, \(rhoLines(\rho_{i}, \theta_{i})\) comprises the Hough parameter values \(\rho_{i}\) and \(\theta_{i}\) corresponding to the rail lines, matching the rail line orientations and directions at their respective locations in the DI. It is now important to determine another parameter, \(lin\), the number of rail lines to be detected in the DI. Setting a very small value for \(lin\) may not output all required rail lines, while setting a very large value may produce spurious output in the form of more lines than the actual number of rail lines present in the DI. As much as it is essential to select the correct value of \(lin\), it is equally important to remove the extra line detections apart from the rail lines (if any). To achieve this, the sensitivity range for \(lin\) is set to \(lin_{sen}\) pixels.

The detections within the sensitivity range are performed as follows:

$$senLines(\rho_{s},\theta_{s}) = \begin{cases} i = i + 1 & \text{if } \rho_{i} - lin_{sen} \le \rho_{r} \le \rho_{i} + lin_{sen} \\ \rho_{s} = \rho_{i},\;\; \theta_{s} = \theta_{i},\;\; \rho_{s+1} = \rho_{r},\;\; \theta_{s+1} = \theta_{r} & \text{otherwise} \end{cases}$$
(13)

where \(\rho_{i}, \rho_{r}\) denote any pair of \(\rho\) parameter values in \(rhoLines(\rho_{i}, \theta_{i})\). If a value \(\rho_{r}\) falls within the range given in Eq. (13), the pair \(\rho_{i}, \rho_{r}\) lies within the \(lin_{sen}\) range and hence \(\rho_{r}\) is discarded. Otherwise, both \(\rho_{i}\) and \(\rho_{r}\) are retained in the parameter space, and the corresponding \(\theta_{i}, \theta_{r}\) are the orientation values of the rail lines recorded in \(senLines(\rho_{s}, \theta_{s})\). The detections are shown in Figs. 6e, 7e, 8e and 9e.
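The two filtering steps, the direction (sign) test of Eq. (12) and the sensitivity-range test of Eq. (13), can be sketched in plain Python; the \((\rho, \theta)\) values below are hypothetical:

```python
def filter_by_direction(houg, rho_first):
    """Eq. (12): keep only (rho, theta) pairs whose rho has the
    same sign as rho_first (rhoLines)."""
    return [(r, t) for r, t in houg if (rho_first > 0) == (r > 0)]

def filter_by_sensitivity(rho_lines, lin_sen):
    """Eq. (13): discard any rho that falls within +/- lin_sen
    pixels of an already-retained rho (senLines)."""
    kept = []
    for r, t in rho_lines:
        if all(abs(r - rk) > lin_sen for rk, _ in kept):
            kept.append((r, t))
    return kept

# Hypothetical Hough output: four detections, one a spurious
# near-duplicate of a genuine rail line, one oppositely signed.
houg = [(120.0, 1.57), (123.0, 1.57), (420.0, 1.57), (-90.0, 1.60)]
rho_lines = filter_by_direction(houg, rho_first=120.0)
sen_lines = filter_by_sensitivity(rho_lines, lin_sen=80)
# sen_lines -> [(120.0, 1.57), (420.0, 1.57)]
```

The duplicate at \(\rho = 123\) falls inside the \(\pm 80\)-pixel window around \(\rho = 120\) and is dropped, mirroring the removal of spurious detections described above.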

5.3.3 GSD-based rail line pair selection

In Sect. 5.3.2, even though we have attempted to remove the spurious rail line detections, such detections may still remain outside the range defined in Eq. (13). Consequently, the GSD-based rail line pair selection method is used for the selection of valid rail lines (of the rail line pair) and the subsequent extraction of the railroad track. \(GSD\) is calculated in Sect. 4.3. The broad gauge length is represented by \(track_{gauge}\). Consequently, the distance in pixels between the rail line pair, \(N_{r}\), can be calculated as:

$$N_{r} = {\text{track}}_{{{\text{gauge}}}} /{\text{GSD}}$$
(14)

In this method, if the distance between two rail lines equals \(N_{r}\), they are parallel and belong to one rail line pair. To perform this computation, the condition in Eq. (15) is evaluated between any pair of \(\rho\) values \(\rho_{s}\), \(\rho_{s+1}\) in \(senLines(\rho_{s}, \theta_{s})\):

$$Lines(\rho_{l},\theta_{l}) = \begin{cases} \rho_{l} = \rho_{s},\;\; \theta_{l} = \theta_{s},\;\; \rho_{l+1} = \rho_{s+1},\;\; \theta_{l+1} = \theta_{s+1} & \text{if } N_{r} - pix_{\max} \le \left| \rho_{s+1} \right| - \left| \rho_{s} \right| \le N_{r} + pix_{\max} \\ s = s + 1 & \text{otherwise} \end{cases}$$
(15)

where \(pix_{\max}\) denotes the maximum number of pixels around a rail line within which a detection is considered one of the rail lines of the rail line pair. The Hough space \(Lines(\rho_{l}, \theta_{l})\) gives the parameter values of the selected rail line pairs which satisfy the distance condition of Eq. (15), with \(N_{r}\) calculated from Eq. (14); these line pairs map to the exact rail line locations in the DI. The HT-GSD-based line pair selection is depicted visually in Figs. 6f, 7f, 8f and 9f for the different study images.
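Under the assumption of a hypothetical GSD value, the gauge-based pairing of Eqs. (14) and (15) can be sketched as follows (\(track_{gauge}\) = 1676 mm as in Sect. 6.3; all \(\rho\) values are illustrative):

```python
def select_rail_pairs(sen_lines, gsd_mm, track_gauge_mm=1676.0, pix_max=10):
    """Eq. (14)-(15): pair up lines whose rho separation matches
    the expected gauge width N_r (in pixels) within +/- pix_max."""
    n_r = track_gauge_mm / gsd_mm          # Eq. (14)
    pairs = []
    for i in range(len(sen_lines) - 1):
        (r1, t1), (r2, t2) = sen_lines[i], sen_lines[i + 1]
        if n_r - pix_max <= abs(r2) - abs(r1) <= n_r + pix_max:
            pairs.append(((r1, t1), (r2, t2)))
    return pairs

# Hypothetical GSD of 5.6 mm/pixel gives N_r of roughly 299 pixels,
# so the lines at rho = 120 and rho = 420 form one rail line pair.
sen_lines = [(120.0, 1.57), (420.0, 1.57), (900.0, 1.57)]
pairs = select_rail_pairs(sen_lines, gsd_mm=5.6)
```

The line at \(\rho = 900\) survives the sensitivity filter but fails the gauge-distance test, illustrating how GSD-based selection rejects spurious detections that Sect. 5.3.2 could not.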

5.3.4 Coordinate system transformation-based railroad track extraction

Once the parameters of the rail line pair have been calculated, as in Sect. 5.3.3, it is now essential to obtain the Cartesian line forms (Eq. 10) of the rail lines of the pair from the corresponding parametric forms (Eq. 11). A rail line pair is represented by two lines \(L_{1}\) and \(L_{2}\), as shown in Figs. 6a, 7a, 8a and 9a. The rail line \(L_{1}\) is described by points \((p1, p2)\) and the other rail line \(L_{2}\) by points \((p3, p4)\). For railroad track extraction, the coordinates of the rail line points need to be shifted by a value of \(lextract\) to the left of the left rail line and to the right of the right rail line. The value of \(lextract\) denotes the number of pixels (along the rail lines) constituting the ballast area along the left and right rail lines.

The coordinates are calculated by evaluating a \((\rho_{l,} \theta_{l} )\) pair under the following conditions:

  1. If \(\rho_{first} > 0\) and \(\theta_{first} \in [1,180]\):

    $$x1 = \rho_{l} /\sin (\theta_{l} )$$
    $$x2 = (\rho_{l} - I_{w} *\cos (\theta_{l} ))/\sin (\theta_{l} )$$
    $$x3 = \rho_{l + 1} /\sin (\theta_{l + 1} )$$
    $$x4 = (\rho_{l + 1} - I_{w} *\cos (\theta_{l + 1} ))/\sin (\theta_{l + 1} )$$

    The coordinates of the two endpoints for the rail line \(L_{1}\), used for track extraction, are \(p1\left( {0,x1 - lextract} \right)\) and \(p2\left( {I_{w} ,x2 - lextract} \right)\). Also, the endpoint coordinates for rail line \(L_{2}\) are \(p3\left( {0,x3 + lextract} \right)\) and \(p4\left( {I_{w} ,x4 + lextract} \right)\).

  2. If \(\rho_{first} < 0\) and \(\theta_{first} \in [0,180]\), or \(\rho_{first} > 0\) and \(\theta_{first} \in [0,1)\):

    $$x1 = \rho_{l} /\cos (\theta_{l} )$$
    $$x2 = (\rho_{l} - I_{h} *\sin (\theta_{l} ))/\cos (\theta_{l} )$$
    $$x3 = \rho_{l + 1} /\cos (\theta_{l + 1} )$$
    $$x4 = (\rho_{l + 1} - I_{h} *\sin (\theta_{l + 1} ))/\cos (\theta_{l + 1} )$$

For track extraction in this scenario, the coordinates of the two endpoints for the line \(L_{1}\) are \(p1\left( {x1 - lextract,0} \right)\) and \(p2\left( {x2 - lextract,I_{h} } \right)\). The endpoint coordinates for line \(L_{2}\) are \(p3\left( {x3 + lextract,0} \right)\) and \(p4\left( {x4 + lextract,I_{h} } \right)\).

The above conditions calculate the coordinates of the \(L_{1}\) and \(L_{2}\) endpoints in both scenarios. The coordinate transformation thus facilitates railroad track extraction, as shown in Figs. 6g, 7g, 8g and 9g.
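The endpoint computation for condition 1 (rail lines intersecting the left and right image borders) can be sketched as below; the \(\rho, \theta\) values are hypothetical, and the `shift` parameter selects whether the \(lextract\) offset is applied outward for the left (−1) or right (+1) rail line:

```python
import math

def rail_endpoints_horizontal(rho, theta, img_w, lextract=100, shift=-1):
    """Condition 1 (rho_first > 0, theta in [1,180] deg): intersect
    the line rho = x*cos(theta) + y*sin(theta) with the image's left
    (x = 0) and right (x = img_w) borders, then shift the
    y-coordinates outward by lextract pixels."""
    y_left = rho / math.sin(theta)                                # x1
    y_right = (rho - img_w * math.cos(theta)) / math.sin(theta)   # x2
    p1 = (0, y_left + shift * lextract)
    p2 = (img_w, y_right + shift * lextract)
    return p1, p2

# Hypothetical horizontal rail line: theta = 90 deg, rho = 1500 px,
# in a 4000-pixel-wide image; both endpoints land near y = 1400.
p1, p2 = rail_endpoints_horizontal(1500.0, math.pi / 2, img_w=4000)
```

Condition 2 follows the same pattern with the roles of \(\sin\) and \(\cos\) (and of the image width \(I_w\) and height \(I_h\)) exchanged.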

6 Experimental results and discussions

6.1 Experimental setup

The track images have been captured by a DJI quadcopter. The sensor parameter values of the drone are \(Flen\) = 3.61 mm, \(S_{w}\) = 6.16 mm, \(S_{h}\) = 4.62 mm. The image dimensions are \(I_{h}\) = 3000 pixels and \(I_{w}\) = 4000 pixels. The proposed algorithm has been implemented as a software model using OpenCV library functions [48] and MATLAB R2016a. For evaluation, we have chosen 4–5 images each (including the study images) from the acquired datasets, representing uneven illumination, different rail line orientations and varied flight heights in different railroad environments.
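The paper derives \(GSD\) in Sect. 4.3 (not reproduced here); a commonly used formula, assumed purely for illustration, relates flight height, sensor width, focal length and image width. With the sensor parameters above and a hypothetical flight height of 20 m:

```python
# Assumed GSD formula (the paper's own derivation is in Sect. 4.3):
# ground size of one pixel from flight height H, sensor width S_w,
# focal length Flen and image width I_w, all consistent units.
def gsd_mm_per_pixel(height_mm, sensor_w_mm, focal_mm, img_w_px):
    return (height_mm * sensor_w_mm) / (focal_mm * img_w_px)

# Sensor parameters from Sect. 6.1; flight height 20 m is hypothetical.
gsd = gsd_mm_per_pixel(20000.0, 6.16, 3.61, 4000)
# gsd is roughly 8.5 mm per pixel at this height
```

Larger flight heights give larger GSD values, which in turn shrink the expected rail-pair distance \(N_{r}\) of Eq. (14); this is what makes the line pair selection adaptive to flight height.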

6.2 Analysis of colour spaces for mask creation

An extensive evaluation of the colour space thresholds is performed to obtain a railroad track mask such that only rail lines are visible.

In order to obtain the railroad track mask in the RGB colour space, an extensive evaluation of thresholds has been carried out on the two study images D2 and D3, as shown in Fig. 4a. The range for all three channels in RGB is [0, 255]. For thresholding, the \(low\_limit_{color\_space}\) values for channel 1, channel 2 and channel 3 are set to 110, 14 and 119, respectively, and the \(upper\_limit_{color\_space}\) values are set to 158, 119 and 174, respectively.

In order to obtain the mask in the HSV colour space, the hue range \(\in\) [0, 179], saturation range \(\in\) [0, 255] and value range \(\in\) [0, 255] are marked in the study images, as shown in Fig. 4b. For this purpose, the \(low\_limit_{color\_space}\) and \(upper\_limit_{color\_space}\) mask thresholds for the three channels are set to [100, 40, 40] and [130, 255, 255], respectively.

L*a*b* expresses colour as follows: L* for lightness, which scales from 0 (black) to 100 (white); the a* axis and b* axis, both in the range [−100, 100], represent the green (−) to red (+) and blue (−) to yellow (+) components, respectively. In order to obtain the masks shown in Fig. 4c, the \(low\_limit_{color\_space}\) and \(upper\_limit_{color\_space}\) values for the three channels have been set to (43.419, 0.751, −31.492) and (63.950, 39.603, −5.247), respectively.

In YCbCr, Y represents the luminance (intensity) component and ranges from 16 to 235. The Cb and Cr channels are the chrominance (colour-related) components, both within the range [16, 240]. The threshold values for the Y, Cb and Cr channels range from \(low\_limit_{color\_space}\) = (40, 130, 92) to \(upper\_limit_{color\_space}\) = (82, 194, 137) to obtain the masks shown in Fig. 4d.

6.3 Steps for Rail line detection and track extraction

  1. In the ACSM method, it has been observed that rail lines are distinctly identified upon segmentation only in the HSV colour space railroad track mask, as seen in Fig. 4b and discussed in Sect. 5.1. This colour space handles identification of rail lines well under non-uniform sunlight illumination.

  2. Subsequently, edge detection is performed on the HSV mask. After an exhaustive analysis of different sets of parameter values, it is observed that the following Canny edge parameter values project sharp rail line edges, as depicted in Figs. 6c, 7c, 8c and 9c: Gaussian kernel of size \(5 \times 5\), \(Sob_{siz}\) = 3, minval = 50, maxval = 200. These edges are necessary as input for the shape and geometric representations.

  3. For computing shape representations of rail lines from the edge map using the Hough transform, the Hough parameter space resolutions are set to \(\rho_{acc}\) = 2 pixels and \(\theta_{acc}\) = 0.017453 radians (\(1^{\circ}\)). The value of \(H_{the}\) is set to 100 pixels.

  4. The values for the number of lines, \(lin\) = 10, 18, 20, 25, 30, have been chosen such that \(lin\) is able to detect all the rail lines in all DI, which may also include spurious detections.

  5. The value for \(lin_{sen}\) is set to \(\pm\) 80 pixels. This range is suitable for removing spurious rail line detections and is generally selected to be less than \(N_{r}\). This reduces the computational overhead of the next step by reducing the number of comparisons for line pair selection.

  6. The \(GSD\) is calculated at any given flight height (Sect. 4.3) and determines the pixel size on the ground, as seen in Table 1. The broad gauge of the tracks is \(track_{gauge}\) = 1676 mm (\(\approx\) 168 cm). The value of \(N_{r}\) is calculated for any given flight height using Eq. (14), which facilitates rail line pair selection. Two rail lines are classified as a rail line pair when the distance (in pixels) between them satisfies Eq. (15). The value for \(pix_{\max} \in [5, 14]\). The \(lextract\) value has been set to 100 pixels (the width in pixels) in order to include the left and right sides of the rail lines in the extracted railroad track.

The aforementioned empirically selected parameter values lead to correctly matched rail lines (in red). This facilitates the respective automated railroad track extractions despite multiple rail lines and complex backgrounds, as shown in Figs. 6g, 7g, 8g and 9g.

6.4 Evaluation and validation of the proposed method

In order to validate our proposed method, the Central Latitude 29°51′13.407399″N and Central Longitude 77°52′11.9761″E of study image D3 are extracted and marked over the Google Earth image of that area, as seen in Fig. 10a. The study image, with the detected rail lines in red, is then overlaid on the respective Google Earth image, as seen in Fig. 10b. These detections match well even at locations with difficult terrain.

Fig. 10
figure 10

a Google Earth Imagery for D3 b Rail lines detected in red in image overlapped on Google Earth

The efficiency of our proposed algorithm DroneRTEF (ACSM + HT-GSD) for drone images is evaluated using the following standard performance metrics: Precision, Recall and Accuracy (ACC) [49]. Before discussing these metrics, we introduce some notation. Suppose we classify our detections into two classes, A (a rail line) and B (not a rail line); then true positives (TP) denote correct matches (detections) which belong to A and are correctly identified as A, false negatives (FN) represent matches not correctly detected, false positives (FP) denote proposed matches that are incorrect, and true negatives (TN) represent non-matches that are correctly rejected.

Precision, or positive predictive value, is the proportion of detected instances that are relevant. It is the ratio of TP to the sum of TP and FP (Eq. 16).

$${\text{Precision}} = \frac{{{\text{TP}}}}{{\text{TP + FP}}}$$
(16)

Recall, also called sensitivity, is the proportion of actual positives that are detected correctly. It is the ratio of TP to the sum of TP and FN (Eq. 17).

$${\text{Recall}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}}$$
(17)

Accuracy (ACC) is the proportion of correct detections made by the model over the total number of detections (Eq. 18).

$${\text{Accuracy (ACC)}} = \frac{{\text{TP}} + {\text{TN}}}{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}}$$
(18)
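Eqs. (16)–(18) translate directly into code; the TP/TN/FP/FN counts below are hypothetical:

```python
def precision(tp, fp):
    """Eq. (16): proportion of detected lines that are real rails."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Eq. (17): proportion of real rails that were detected."""
    return tp / (tp + fn)

def accuracy(tp, tn, fp, fn):
    """Eq. (18): correct decisions over all decisions."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts for one study image:
# precision = 8/10 = 0.8, recall = 8/9, accuracy = 23/26.
tp, tn, fp, fn = 8, 15, 2, 1
p, r, a = precision(tp, fp), recall(tp, fn), accuracy(tp, tn, fp, fn)
```

Note how removing a spurious detection raises precision (smaller FP) but, if a genuine rail line is eliminated with it, lowers recall (larger FN), the trade-off observed in Sect. 6.5.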

In order to assess the efficiency of our proposed approach, we have compared it with ACSM + HT on our newly acquired dataset. The metric analysis is performed for both algorithms in order to quantify the difference in detections and show the effectiveness of our framework. In ACSM + HT, a detection is counted after computing the HT on the image; in our proposed approach, a detection is counted after parameter space evaluation. The metrics are evaluated with \(lin\) = 10, 18, 20, 25, 30 lines. The results are shown in Tables 3 and 4 for the proposed approach and the ACSM + HT approach, respectively.

Table 3 Performance measures for DroneRTEF (proposed approach)
Table 4 Performance measures for ACSM + HT

A comparative analysis against three other methods, K-NN mean Euclidean distances, Inception V4 and ACSM-HT, is shown in Table 5. The results have been obtained on our dataset. The highest precision and accuracy values for the proposed framework DroneRTEF indicate the robustness of the model compared with the other methods, while the recall values are comparable to the K-NN and ACSM-HT methods. The adaptive behaviour of the method can be inferred from these results.

Table 5 Comparative Analysis

6.5 Discussion

The comparative analysis of the Precision, Recall and Accuracy metrics for the two approaches is shown in Fig. 11a, b and c, respectively. These values have been computed for \(lin\) = 10, 18, 20, 25, 30 lines. In Fig. 11a, the Precision values for all \(lin\) values are higher than for the other method; hence, the percentage of detected relevant lines is higher for the proposed approach. In Fig. 11b, the Recall values are higher for the other method. Recall denotes the proportion of actual positives identified correctly; in the proposed approach, for each value of \(lin\), some of the actual detected lines are eliminated during the parameter space evaluation while removing the extra detections, so the recall percentage is slightly lower for our approach. The Accuracy values are seen in Fig. 11c: the accuracy percentage is much higher for DroneRTEF. Therefore, it can be inferred that our proposed approach is more efficient than the other algorithm for all the \(lin\) values.

Fig. 11
figure 11

DroneRTEF versus ACSM + HT evaluation with a Precision b Recall c Accuracy

7 Conclusion and Future Work

In this paper, we have proposed DroneRTEF, a novel adaptive railroad track extraction framework for DI. The framework is divided into two stages. The identification of the rail lines is performed by ACSM, a global image enhancement method in which colour space-based segmentation creates a railroad track mask for the DI comprising only the rail lines. This approach facilitates rail line identification in DI with uneven illumination. The second stage is the novel HT-GSD algorithm based on Hough parameter space analysis, in which the Hough parameters are evaluated to compute the shape and geometrical representation of lines in the DI. This parameter space evaluation helps detect the identified rail lines at varying orientations, in different directions (horizontal or vertical) and at different flight heights, thereby facilitating railroad track extraction. Our proposed framework is validated on large datasets and has achieved an accuracy of 90%. The framework provides the following advantages:

  1. The drone is introduced as an IAS, providing advantages such as image acquisition at inaccessible locations, operability during various railroad operations and cost-effectiveness.

  2. A new dataset has been developed to take various railroad environment scenarios into consideration.

  3. It is inferred from the experiments that the proposed method is adaptive to uneven illumination, varying rail line orientations and changes in flight height during track extraction in DI.

  4. The algorithm has been tested and evaluated on large datasets in order to cover a wide range of railroad environment conditions for track extraction, helping ensure the safety of people travelling by railroad under any circumstances.

  5. The effectiveness of the framework makes it suitable for fast railroad inspections, obstacle identification and driver assistance systems.

In the future, a technique to improve the slightly lower recall values may be investigated; this is of major importance, as complete detection of the rail lines is paramount. Further improvement of the HT-GSD method is also planned.