Introduction

Terrestrial laser scanning (TLS) is a surveying and imaging technology that is becoming increasingly important for obtaining fast and accurate 3D coordinate information of natural and man-made surfaces such as buildings (OuYang and Feng 2005). A feature can be defined as a set of laser points reflected by an object. Features are primarily classified into two groups:

  1.

    Man-made features.

  2.

    Natural features.

This definition is based on the fact that man-made features exhibit regular geometric patterns (linear, planar, curved, etc.), whereas natural features do not; their points are mostly scattered. Under this definition, trees and plants can be classified as natural features, while buildings, walking paths, and road utilities can be classified as man-made features. 3D surface data can be used in different fields, such as industrial design, civil engineering, cultural heritage, and complex surveying projects. 3D data are also needed for the documentation and preservation of historical architectural assets.

Unstructured 3D point clouds are captured by both terrestrial and airborne laser scanning systems. Airborne laser scanning (ALS) has been used to derive Digital Terrain Models (DTMs) (Kraus and Pfeifer 1998). TLS is used for close-range, high-accuracy applications, such as bridge and dam monitoring and architectural restoration. Its resolution, accuracy, and acquisition rate continue to increase, which has made TLS an important tool for obtaining richly detailed point cloud data of scanned objects. The market offers many laser scanners with different system specifications. They are classified in different ways:

  • Scanning from air (Airborne Laser Scanning).

  • Scanning from the ground (TLS).

  • Hand-held scanners.

The second type, TLS, is considered in this study. Surveyors and many others are interested in laser scanning systems because the technology offers many advantages over traditional surveying and photogrammetric methods (Kern 2003). TLS systems generate point clouds containing 3D coordinates (x, y, z) and an intensity value (I) for scanned objects. RGB (red, green, blue) information can be obtained by an internal or external digital camera. Compared with the photogrammetric method, TLS gives explicit 3D information that enables rapid and accurate capture of the geometry of complex scanned objects such as historical buildings or façades.

Data segmentation and feature extraction are fundamental problems in the point cloud processing step. Segmentation means dividing the scanned object into homogeneous regions (regions having the same properties). Feature extraction is a very important step in point cloud processing, as it enables object modelling and the derivation of a final 3D model from point cloud data (Janowski and Rapinski 2013). 3D models are important in many applications, such as urban planning, virtual reality, and facility management. TLS is also well suited to highway and road construction and to earthwork volume calculation (Slattery et al. 2012). The aim of this study is to extract features by improving edge-based segmentation techniques for scan data from a TLS system.

Problem Statement

Laser scanners provide unstructured point cloud data. The biggest challenge in the data processing step is the huge file size that must be handled by the computer. Therefore, scanned data must be reduced significantly to make the computations tractable and to lower the storage requirement. To overcome this challenge, researchers emphasize data segmentation, modelling, and memory optimization strategies for both ALS and TLS data. For ALS data reduction, Błaszczak-Bak et al. (2011) and Błaszczak-Bak and Sobieraj (2013) presented a modified initial data processing step using an adaptive triangulated irregular network (TIN) model. Other examples are as follows: Sithole and Vosselman (2004) present a comparison of the most common ground filtering algorithms, and Kraus and Pfeifer (1998) use an iterative linear prediction scheme. On the other hand, filtering or data reduction is successful for TLS when redundant points and outliers are removed from the original data. Examples of point cloud simplification algorithms can be found in Carsten and Neil (2003), Stal et al. (2012) and Zainuddin et al. (2009). Data segmentation and feature extraction are our concern in this work. Data segmentation is the process of partitioning point cloud data into regions or segments with homogeneous geometric or radiometric characteristics. The segmentation process is used to isolate features of interest (windows, doors, roofs) in point cloud data; it can also be considered the first step in recognising 3D objects from scan data. Segmentation of TLS data can be helpful in localization, classification, and feature extraction. Segmentation is a very important step in the interpretation of massive point clouds of large sites, as it allows further object modelling (Brenner 2005). The ideal segmentation method must be able to deal with noisy data and with features such as planar or curved surfaces. In addition, it must not depend on the data viewing direction. Unfortunately, no existing method has all of these capabilities.

Previous Work

Segmentation is one of the main research areas for both terrestrial and airborne laser scanner data, not only because some level of organization must be introduced into the data before effective information can be extracted, but also because it is one of the main processing steps that is far from being solved, even for planar features (Filin and Pfeifer 2005; Hoover et al. 1995). For ALS data, most researchers deal with segmentation either to extract 3D features such as buildings (Rottensteiner and Briese 2003; Matikainen et al. 2004; Filin and Pfeifer 2005) and bridges (Sithole and Vosselman 2006), or to reconstruct 3D city models (Haala and Brenner 1999; Takase et al. 2003). Rabbani et al. (2006) reviewed some segmentation problems in processing industrial point clouds, such as handling curved objects or finding the best segmentation parameters, especially when these are large in number. Segmentation algorithms are provided by Hoover et al. (1995) and Woo et al. (2002). Algorithms based on edge detection seek out edges in a point cloud and then identify segments by further seeking edge loops (cyclic edges) (Sithole and Mapurisa 2012). Points that lie within the same edge loop are determined to be in the same segment. Edges are discriminated using geometric properties, such as principal curvature, facet normals, and gradients (Yang and Lee 1999). Edge extraction based on surface curvature as a geometric property, used to detect and segment identical features in adjacent scans, has been applied in the registration or matching step for TLS data (Elkhrachy 2008). Edge-based segmentation techniques remain an interesting and challenging topic for many researchers.

Overview of Different Segmentation Methods

Segmentation techniques for point cloud data can be generalized into two main categories: region-based segmentation (Hafiz et al. 2011), where points are classified into regions, and edge-based segmentation (Rabbani et al. 2006), where the region boundaries are detected. The most relevant methods investigated for the purpose of detecting and extracting features from point cloud data are divided into four main types:

  1.

    Segmentation based on clustering of features. Filin (2002) defined a seven-dimensional vector for each point, consisting of the coordinate position, the parameters of the plane fitted to the neighbourhood, and the relative height differences between the point and its neighbours. The feature space is then clustered using an unsupervised technique to identify surface classes. After the surface classes are extracted, the points are grouped in object space using a spatial proximity measure.

  2.

    Segmentation based on surface growing. In this process, the algorithm starts at a chosen point and grows around the neighbouring points based on certain similarity criteria. For façade segmentation of TLS data, the region growing algorithm is used to extract planar surfaces. More information can be found in Gorte and Pfeifer (2004), Vosselman et al. (2004), Tóvári and Pfeifer (2005), Rabbani et al. (2006) and Biosca and Lerma (2008).

  3.

    Segmentation by model fitting. Data can be decomposed into geometric primitives such as planes, cylinders, or spheres. This process tries to fit primitive shapes to the point cloud data in order to describe the form of building façades. Its principle is explained in detail in Besl and Jain (1998), Pu and Vosselman (2006) and Awwad et al. (2010).

  4.

    Hybrid segmentation technique. More than one method is combined for planar feature segmentation; the hybrid method is a combination of the region-based and edge-based techniques. Edge-based techniques obtain step edges and building roof edges by locating points with depth discontinuities and normal inconsistencies, respectively, and then link them to divide the data into different regions. Region-based techniques select a set of seed regions and iteratively grow these regions by merging neighbouring points with similar properties; a minimal sketch of this growing step is given after this list.
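The region-growing step mentioned above can be made concrete with a short C++ sketch. It rests on simple assumptions: points and their unit normals are precomputed, neighbours come from a brute-force fixed-radius search, and two points are considered similar when the angle between their normals is below a threshold. All names (Point3, growRegion, and so on) are illustrative and not taken from any of the cited implementations.

```cpp
#include <algorithm>
#include <cmath>
#include <queue>
#include <vector>

struct Point3 { double x, y, z; };
struct Vec3 { double x, y, z; };  // unit surface normal

// Brute-force fixed-radius neighbour search; a spatial index (k-d tree,
// octree) would be used for large clouds.
std::vector<int> radiusNeighbours(const std::vector<Point3>& pts,
                                  int i, double r) {
    std::vector<int> nb;
    for (int j = 0; j < static_cast<int>(pts.size()); ++j) {
        if (j == i) continue;
        double dx = pts[j].x - pts[i].x;
        double dy = pts[j].y - pts[i].y;
        double dz = pts[j].z - pts[i].z;
        if (std::sqrt(dx * dx + dy * dy + dz * dz) <= r) nb.push_back(j);
    }
    return nb;
}

// Grow one region from a seed point: breadth-first expansion over
// neighbours whose normals deviate by less than angleThresh (radians).
std::vector<int> growRegion(const std::vector<Point3>& pts,
                            const std::vector<Vec3>& nrm,
                            std::vector<bool>& used,
                            int seed, double r, double angleThresh) {
    std::vector<int> region;
    std::queue<int> frontier;
    frontier.push(seed);
    used[seed] = true;
    while (!frontier.empty()) {
        int i = frontier.front();
        frontier.pop();
        region.push_back(i);
        for (int j : radiusNeighbours(pts, i, r)) {
            if (used[j]) continue;
            double dot = nrm[i].x * nrm[j].x + nrm[i].y * nrm[j].y
                       + nrm[i].z * nrm[j].z;
            dot = std::max(-1.0, std::min(1.0, dot));  // clamp rounding noise
            if (std::acos(dot) < angleThresh) {
                used[j] = true;
                frontier.push(j);
            }
        }
    }
    return region;
}
```

A full segmentation would repeatedly pick an unused point as a new seed and call growRegion until every point is assigned to a segment.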

Presented Approach

Segmentation algorithms have their origin in digital image analysis. They connect regions within the image that share a specific property, such as colour or intensity, or a relationship between pixels. To use the same idea for raw 3D data acquired by laser scanning (x, y, z), the data would have to be transformed to a raster, and such a transformation usually causes loss of information. Therefore, the current research aims to improve algorithms that work directly on raw 3D laser scanning data. The presented approach depends on surface properties, such as the surface normal vector. Edges in 3D TLS data are defined by the points where the local surface properties change by more than a given threshold value. The local surface properties mostly include surface normals, gradients, principal curvatures, or higher-order derivatives. The edge-based segmentation technique will be studied and improved; consequently, the region boundaries in TLS data will be detected based on variation in the surface normal.

The point cloud data within any scan can be represented by P = {p_1, p_2, …, p_n}, where P is a set of n points. This structure can be exploited in a search for nearest neighbours. Nearest neighbour search (Arya et al. 1994), also known as closest point search, is an optimization problem of finding the closest points in 3D point data. The fixed-radius near neighbours approach is used to define the neighbours of a point. In the fixed-radius near neighbours problem, a set of points in a 3D point cloud and a fixed distance k are given as input; the problem has long been studied (Bentley 1975). The fixed distance k refers to the radius of the sphere within which neighbouring points must lie around the query point. The point cloud resolution of the data used here is approximately 3 cm, so a fixed distance k of 3 cm is used to obtain point neighbours in the proposed algorithm. Figure 1 shows scan data and the nearest neighbours of a point p_i, which lie within 3 cm of that point. For each point p_i, three attributes (n_i, α_i, d_i) are calculated, where n_i is the average unit normal vector of the tangent plane over the neighbours of p_i, α_i is the angle between adjacent tangent planes, and d_i is the perpendicular distance from the origin to the tangent plane (see Fig. 2). The angle α_i is commonly used to measure local variation between adjacent planar regions. If the variation of α_i is small, the neighbouring points lie on a planar surface, while a higher value of α_i indicates that an edge has been detected. Two values of the angle α_i have been used for data segmentation in two real scanning examples. The process of edge detection can be summarized by the following steps:

Fig. 1

Noisy scan profiles from the Cyra scan system. Surface scan points are depicted in blue. For a query point p_i (in red), the nearest neighbours are searched under the condition r ≤ threshold value, where r is the radius of a sphere centred at point p_i. In this figure, only 5 true nearest neighbours (in green) are found and considered in the algorithm

Fig. 2

a The angle α_i between two adjacent tangent planes A and B, where n_A and n_B are their normal vectors. b d_i is the perpendicular distance from the origin to the tangent plane. Three attributes (n_i, α_i, d_i) are calculated for data segmentation. Sometimes the normal vectors are parallel or skew; in this case the angle α_i will fail

  • Step 1 Read the raw point cloud (x, y, z) coordinates.

  • Step 2 Take a fixed number of nearest neighbours of point p_i (ten points are used in the experimental results); then calculate the average normal vector at point p_i, and so on until the last point.

  • Step 3 Search the neighbours of each point p_i that lie within a sphere of radius r = 3 cm. At least one neighbour must be found in order to decide whether point p_i is an edge point or not.

  • Step 4 Compare the surface smoothness between point p_i and its neighbours using the threshold value α_th.

  • Step 5 If the angle α between p_i and any of its neighbours is greater than the threshold value, store p_i as an edge point.

  • Step 6 Otherwise, repeat steps 2 to 4 until all n points are processed. Finally, separate all points that do not lie on a planar surface from the total scan data and display them in a different colour as segmented edge points (see the code sketch below).
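A minimal C++ sketch of steps 3 to 5 is given below. It assumes that the average unit normal at each point (step 2) has already been estimated, it uses a brute-force scan instead of a spatial index, and all names are illustrative rather than taken from the original implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Point3 { double x, y, z; };
struct Vec3 { double x, y, z; };  // average unit surface normal

const double kRadToDeg = 180.0 / 3.14159265358979323846;

// Angle between two unit normals, in degrees.
double normalAngleDeg(const Vec3& a, const Vec3& b) {
    double dot = a.x * b.x + a.y * b.y + a.z * b.z;
    dot = std::max(-1.0, std::min(1.0, dot));  // clamp rounding noise
    return std::acos(dot) * kRadToDeg;
}

// Steps 3-5: p_i is an edge point if the angle between its average normal
// and that of any neighbour within radius r exceeds alphaTh (degrees).
// Brute-force O(n^2) neighbour search, kept simple for illustration.
std::vector<bool> detectEdges(const std::vector<Point3>& pts,
                              const std::vector<Vec3>& normals,
                              double r /* metres */,
                              double alphaTh /* degrees */) {
    std::vector<bool> isEdge(pts.size(), false);
    for (std::size_t i = 0; i < pts.size(); ++i) {
        for (std::size_t j = 0; j < pts.size(); ++j) {
            if (i == j) continue;
            double dx = pts[j].x - pts[i].x;
            double dy = pts[j].y - pts[i].y;
            double dz = pts[j].z - pts[i].z;
            if (std::sqrt(dx * dx + dy * dy + dz * dz) > r) continue;  // step 3
            if (normalAngleDeg(normals[i], normals[j]) > alphaTh) {    // step 4
                isEdge[i] = true;  // step 5: store p_i as an edge point
                break;
            }
        }
    }
    return isEdge;
}
```

For the data used here, the call would be detectEdges(points, normals, 0.03, 10.0) or detectEdges(points, normals, 0.03, 20.0), matching the 3 cm radius and the two thresholds tested.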

Figure 3 shows the main steps and flow chart of the segmentation algorithm.

Fig. 3

The procedure of edge extraction based on variation in surface normal values of adjacent points

Normal Vector Calculation

The calculation of normal vectors from a point cloud plays a vital role in 3D laser scanning data processing. The normal vector at a point p can be calculated once its neighbouring points are identified (Klasing et al. 2009). Other researchers have estimated the normal vectors of discrete points by fitting smooth parametric surfaces (Hoffman and Jain 1987; Yang and Lee 1999) or by generating polygonal surface models (Milroy et al. 1997; Huang and Menq 2001; Woo et al. 2002). In 3D surface coordinates, a normal vector n can be computed from any triangle of points (v_0, v_1, v_2) as the cross product n = u × v, where u = v_1 − v_0 and v = v_2 − v_0. Several normal vectors n are calculated for a single point i. It is often useful to have a unit normal vector, which simplifies some formulas; the unit normal vector is obtained by dividing n by ‖n‖, as in Eq. 1.

$$ \mathbf{n}=\frac{\mathbf{u}\times \mathbf{v}}{\left\Vert \mathbf{u}\times \mathbf{v}\right\Vert } $$
(1)

where n is the unit normal vector at point v_0, and u = v_1 − v_0 and v = v_2 − v_0 are the vector differences to the nearby points v_1 and v_2.
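As a direct transcription of Eq. 1, the following short C++ sketch computes the unit normal of a triangle (v_0, v_1, v_2); the struct and function names are illustrative.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 sub(const Vec3& a, const Vec3& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}

// Cross product u x v.
Vec3 cross(const Vec3& u, const Vec3& v) {
    return {u.y * v.z - u.z * v.y,
            u.z * v.x - u.x * v.z,
            u.x * v.y - u.y * v.x};
}

// Eq. 1: n = (u x v) / ||u x v|| for the triangle (v0, v1, v2).
Vec3 unitNormal(const Vec3& v0, const Vec3& v1, const Vec3& v2) {
    Vec3 n = cross(sub(v1, v0), sub(v2, v0));
    double len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    if (len == 0.0) return {0.0, 0.0, 0.0};  // degenerate (collinear) triangle
    return {n.x / len, n.y / len, n.z / len};
}
```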

The distance of that plane from the origin is computed from the standard equation of a plane in 3D space. If the normal vector is normalized, then the constant term d of the plane equation becomes the perpendicular distance from the origin to the tangent plane:

$$ d_i = n_1 x_i + n_2 y_i + n_3 z_i $$
(2)

where d_i is the perpendicular distance from the origin to the tangent plane, n_1, n_2, n_3 are the components of the unit normal vector at point p_i, and x_i, y_i, z_i are the surface coordinates of point p_i.

The mean normal value for each point of the cloud was computed from its 10 nearest neighbours.
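A short C++ sketch of Eq. 2 and of this averaging step follows; it assumes unit normals as input, and the names are illustrative rather than taken from the original code.

```cpp
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };

// Eq. 2: with a unit normal (n1, n2, n3) at p_i, the constant term of the
// plane equation is the perpendicular distance from the origin.
double planeDistance(const Vec3& n, const Vec3& p) {
    return n.x * p.x + n.y * p.y + n.z * p.z;
}

// Mean of several unit normals, renormalized; used here as a stand-in for
// averaging the triangle normals over the 10 nearest neighbours.
Vec3 meanUnitNormal(const std::vector<Vec3>& normals) {
    Vec3 s{0.0, 0.0, 0.0};
    for (const Vec3& n : normals) { s.x += n.x; s.y += n.y; s.z += n.z; }
    double len = std::sqrt(s.x * s.x + s.y * s.y + s.z * s.z);
    if (len == 0.0) return s;  // normals cancelled out; no mean defined
    return {s.x / len, s.y / len, s.z / len};
}
```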

Data Acquisition

The TLS used was a Cyra 2500, a second-generation laser scanner based on the time-of-flight (TOF) principle. It has a single-point range accuracy of ±4 mm, angular accuracies of ±60 microradians, and a beam spot size of only 6 mm over the 0-50 m range. The field of view (FOV) extends to 40 deg. vertically and 40 deg. horizontally. Two real scan data examples are used, as follows:

  • Example 1 The first object was the Brunswick Lion in Braunschweig, Germany, which is approximately 6 m long, 4 m wide, and 5 m high. The obtained point clouds are composed of 3D points, given in Cartesian coordinates, along with intensity information. The laser survey was carried out at a distance of 10 m from the statue. The object was captured from four locations (stations). Figure 4a and b show a photo and the original laser-scanned point cloud of the Brunswick Lion acquired with the TLS system.

    Fig. 4

    a Photo of the Brunswick Lion statue. b Raw point cloud. Photo and raw point cloud data for the Brunswick Lion statue acquired by the Cyra 2500

  • Example 2 The second object was a part of the Altgebäude façade, TU Braunschweig, acquired with the same Cyra 2500 scanning system. The scanning was carried out at a distance of 25 m from the façade. The density of points was about 2 cm spacing. Figure 5a and b show a photo and the raw point cloud of the scanned façade.

    Fig. 5

    a Photo of a part of the Altgebäude (old building) façade, TU Braunschweig. b Raw point cloud of the same façade part acquired by TLS

In order to test the proposed algorithm and obtain the results, Cyclone 7.4 (Geosystems 2012), a commercial software package, was used for laser scanner data processing and export. The program was used to export the laser data to ASCII format (x, y, z). Using self-developed C++ code, the three attributes (n_i, α_i, d_i) were calculated for each surface point. The Visualization Toolkit (VTK 2012), after some additions and edits, was used to visualize the point cloud data and segmentation results. The processor used was an Intel P 2.2 GHz (processing the second field test takes 1 min and 49 s). The edges are detected and can be verified visually.
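As an illustration of this pipeline, the following hypothetical snippet loads an exported ASCII file and would hand the points to the edge detection routine sketched earlier; the file name and structures are assumptions, not part of the original program.

```cpp
#include <fstream>
#include <iostream>
#include <vector>

struct Point3 { double x, y, z; };

int main() {
    std::ifstream in("facade.xyz");  // one "x y z" triplet per line
    std::vector<Point3> pts;
    Point3 p;
    while (in >> p.x >> p.y >> p.z) pts.push_back(p);
    std::cout << "loaded " << pts.size() << " points\n";
    // ...estimate normals, then call detectEdges(pts, normals, 0.03, 10.0)
    return 0;
}
```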

Results and Discussion

In the two experimental tests presented, edges were successfully extracted as features based on variation in the normal vector values. Segmenting a TLS point cloud using smaller angles between neighbouring surface normal vectors leads to the extraction of small features such as doors, windows, and walls. The algorithm is efficient in detecting edges of different shapes. The major problems occur when the density of the point cloud varies or is too low; small occlusions on the building façade also lead to errors in the segmentation processing, causing irregular edges on the features extracted from the laser data. From Figs. 6 and 7 it can be seen that several edges (in red) are found successfully. The remaining points (in green) lie on planar surfaces. Once the normal vector of the tangent plane is known for each point, the perpendicular distance from the origin is computed using Eq. 2, as mentioned before. The data size is reduced, and noisy points are filtered out (see Table 1 for more information). Figures 8 and 9 show histograms of the perpendicular distance d for the two examples before segmentation, whereas Figs. 10 and 11 show histograms of the same distance d for the same data after segmentation.

Fig. 6

a Edge results based on segmentation with the condition α_th ≤ 10 deg. b Edge results based on segmentation with the condition α_th ≤ 20 deg. Segmentation results for the Brunswick Lion; green represents planar surface points while red represents edge points

Fig. 7

a Edge results based on segmentation with the condition α_th ≤ 10 deg. b Edge results based on segmentation with the condition α_th ≤ 20 deg. Segmentation results for a part of the Altgebäude façade; green represents planar surface points while red represents edge points

Table 1 Number of extracted edge points for two field tests
Fig. 8

Frequency distribution of plane distance d for Brunswick Lion before segmentation

Fig. 9

Frequency distribution of plane distance d for Altgebäude façade before segmentation

Fig. 10

a After segmentation based on α_th ≤ 10 deg. b After segmentation based on α_th ≤ 20 deg. Frequency distribution of plane distance d for the Brunswick Lion after segmentation

Fig. 11

a After segmentation based on α_th ≤ 10 deg. b After segmentation based on α_th ≤ 20 deg. Frequency distribution of plane distance d for the Altgebäude façade after segmentation

The segmentation results contained many noisy points, and some boundaries were relatively distorted. By increasing the allowed difference between neighbouring normal angles, the edges of the segmented patches were improved.

Surface slope and aspect are used to compare and evaluate the segmentation results. Slope is the steepness or degree of incline of a surface, while aspect is the orientation of the slope, measured clockwise in degrees from 0 to 360. The slope at a surface point p_i is computed as the maximum rate of change of elevation between that location and its neighbours. A raster surface was created from the (x, y, z) data in ArcGIS 10.1 software and used to compute the slope and aspect of the Altgebäude façade. Figure 12 shows the slope and aspect values for the Altgebäude façade. The edges segmented by the proposed algorithm and by the slope/aspect method are identical. The only difference is that the slope image is clearer, because ArcGIS uses a range of values to represent slope, whereas the proposed method uses only two values for edge representation.
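For comparison purposes, slope and aspect can also be derived directly from a unit surface normal (n_x, n_y, n_z) with n_z > 0, as in the following rough C++ sketch; this is a stand-in for the raster computation performed in ArcGIS, not the software's actual algorithm.

```cpp
#include <cmath>

const double kRadToDeg = 180.0 / 3.14159265358979323846;

// Slope: angle between the normal and the vertical axis, in degrees
// (0 deg = flat, 90 deg = vertical face).
double slopeDeg(double nz) { return std::acos(nz) * kRadToDeg; }

// Aspect: compass bearing of the downslope direction (the horizontal
// component of the normal), clockwise from north (+y), in [0, 360);
// undefined for perfectly flat surfaces.
double aspectDeg(double nx, double ny) {
    double a = std::atan2(nx, ny) * kRadToDeg;
    return a < 0.0 ? a + 360.0 : a;
}
```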

Fig. 12

a Surface slope. b Surface aspect. Surface slope and aspect for a part of the Altgebäude façade using ArcGIS 10.1 software

Conclusions

This work presents a method to detect and extract edge features of historic buildings from data provided by TLS systems. The approach is valid for detecting linear features on monuments or buildings with planar surfaces. The methodology uses geometric properties of the scanned objects, such as the surface normal. The final results show that it is possible to extract the main features of scanned objects from surface information.

The proposed algorithm is implemented in C++ and applied to real scan data. Three main steps are used to obtain the results. First, the surface normal is calculated from the nearest neighbour points of each point. Second, the variation in adjacent normal values is tested against a threshold angle α_th; two angle values (10 and 20 deg.) have been used for segmentation and edge detection. Last, based on the second step, the edge points are identified. Experiments on two real data sets have been used to test the performance of the proposed algorithm. Linear edges and building boundaries are detected, and the data are filtered and reduced.

The segmentation results have been compared with surface slope and aspect from the ArcGIS program. The main difficulties of edge-based segmentation techniques are as follows: when the data contain noise, the effectiveness of the methods is greatly reduced, and when the detected edge points are not continuous, it is difficult to link these discontinuous points. A drawback of the proposed method appears when it is applied near glass windows: this may lead to overlapping edges, especially at glass windows and at the rim of a point cloud.