
1 Introduction

The significant advantages brought by robotics in smart factories have led to increasing productivity. As shown by the research activities of the past years, improvements in technology and the development of new approaches to problem solving, i.e., new algorithms for control systems, have set high quality standards without neglecting human safety [1, 2]. Although manufacturing and quality control are almost mature at the current state of the art, logistics in warehouse management is of increasing interest. Mobile autonomous robots and vehicles able to carry pallets along defined paths are the basis of warehouse automation. In this context, the accurate and fast localization of the robot within a structured environment is fundamental.

Many works have been proposed over the past few years to solve the general problem of robot self-localization in mapped environments [3, 4]. Odometry provides good accuracy for short-term measurements, but it is heavily affected by drift errors that grow over time. In particular, complex environments, rough floors, high robot speeds, and wheel slippage often induce large localization errors, thus preventing its applicability in many contexts. Gyroscopes and accelerometers can also be used for relative position measurements, but this approach typically leads to poor accuracy because noisy measurements must be integrated over time. Other systems, such as the Global Positioning System (GPS), are only suitable for outdoor environments, where the line of sight between the sensor and the GPS satellites is not occluded.

The best way to overcome the limits of the aforementioned techniques is the fusion of data from different sensors [5], including vision systems [6]. In this case, the images captured by cameras mounted on the robot are processed to extract distinctive features, such as recognizable landmarks, in both two and three dimensions; these features are then referred to a single reference system in order to localize the robot [7]. Laser scanners may also be used to measure the bearings of reflective beacons in order to estimate the position of Automated Guided Vehicles (AGVs) [8]. Accuracies for this approach have been reported in the range of 0.25–1.72 mm, one order of magnitude more precise than vision-based estimators [9]. Commercial devices have also been proposed [10] to guide vehicles through reflection marks, whose positions have to be topographically measured with submillimeter accuracy. However, low measurement rates, between 6 and 18 Hz, and high costs limit their eligibility for real-time applications in modern industries.

Laser range finders and RFID systems are also often used for robot localization and navigation. The former can be employed in many applications [11], since they provide accurate measurements, although with some limitations related to reflections and to the kind of surface. The latter have recently been applied to the mobile robot localization problem [12]; for example, an active RFID tag can be placed on a mobile robot and, by means of a tag reader, the robot position can be estimated. However, this method does not provide very accurate results.

In this paper we propose a cost-effective method for the accurate and fast localization of vehicles in a smart warehouse, where the use of classical odometry is strictly limited by continuous wheel slippage and vehicle bumps. The system exploits laser profilometry to obtain information about the heading of the vehicle and its relative position within an allowed path (the rail). The prototype has been placed on a Smoov ASRV platform [13], which constitutes the basis of an autonomous storage system. The main advantage of the proposed setup is the small cycle time required by the sensor to detect the robot pose, comparable with the one required by inductive sensors, but with much higher measurement resolution. Moreover, the capability of the system to detect crossing holes along the rail opens the possibility of accurately estimating the vehicle speed. The paper is organized as follows: in Sect. 2, the experimental setup is described together with the mathematical background and the image processing phases; Sect. 3 is devoted to the analysis of results, whereas final conclusions and remarks are reported in Sect. 4.

2 Methodology

As briefly described, the proposed system consists of a laser source placed in front of an AGV, which projects a line-shaped beam on the border of the rail. Under these conditions, the analytical properties of the laser line within the image plane of the camera are related to the actual position of the vehicle with respect to the guiding rail. Moreover, the detection of holes within the projected line can be linked to the position of the AGV along the direction of the trajectory. In the following subsections, a brief explanation of the mathematics that transforms pixel coordinates into metric ones is provided. Then the components of the acquisition system are described together with the image processing steps.

2.1 Mathematical Framework

The distance between the mobile robot and the rail profile can be computed following the principle of laser triangulation. A laser beam impinging on a target can be detected by a camera at a given position in the image. When the target approaches the triangulation system, the detected laser spot undergoes a corresponding shift in the image plane. When these concepts are extended to the case of a laser line, further information on the system orientation can be extracted by inspecting the line slope in the camera plane.

Since the laser line can be decomposed into multiple laser dots, the analysis can be developed in two dimensions for each point of the line, i.e., for each pixel of the camera. Figure 1 displays a two-dimensional sketch of the triangulation system, where \(\alpha \) and \(\beta \) represent the tilt angles of the laser and the camera with reference to the y-axis, respectively. Here, each distance between the vehicle and the rail border is computed along the y-axis, whereas the x-axis represents the direction of motion of the robot. Moreover, the camera optical axis passes orthogonally through the center of the image plane.

All formulations are derived under the hypothesis that the optical system can be effectively approximated to a single thin lens having center point C, so that:

$$\begin{aligned} \frac{1}{f}=\frac{1}{F_0 }+\frac{1}{S} \end{aligned}$$
(1)

where f is the focal length, while \(F_{0}\) is the distance between the point C and the intersection P of the laser beam with the camera optical axis. Finally, S denotes the distance between C and the image plane. It is worth noticing that we have implemented the thin lens model rather than the pinhole model because the former provides more accurate measurements than the latter.
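For reference, Eq. (1) can be rearranged to express S in terms of the two quantities f and \(F_{0}\):

$$\begin{aligned} S=\frac{f\,F_0 }{F_0 -f} \end{aligned}$$

which shows that S is fully determined once f and \(F_{0}\) are known.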

Since the goal of the technique is the estimation of the vehicle distance d, the problem reduces to the identification of the two components \(y_{{ OFF}}\) and \(\varDelta y\). The former is a constant term depending on \(F_{0}\) and \(\beta \), and has to be determined during the calibration of the setup. The latter, on the contrary, is dynamically related to the actual position of the vehicle. Both parameters can be derived through the term a, which represents the pixel displacement of the detected laser spot within the image plane following a shift of the target or, equivalently, of the laser-camera set. The term a can be easily related to the corresponding metric displacement A by exploiting the knowledge of \(F_{0}\) and S. After some simple algebra, the displacement \(\varDelta y\) is obtained as

$$\begin{aligned} \Delta y=\frac{a\left( {F_0 -f} \right) \cos \alpha }{\cos \left( {\alpha -\beta } \right) \cdot a+\sin \left( {\alpha -\beta } \right) \cdot f} \end{aligned}$$
(2)

where a is first multiplied by the camera pixel size in order to convert it into metric units.

Fig. 1 Geometry of the triangulation system

The second step concerns the definition of the relative displacement of the detected laser spot along the x-axis. Assuming that the laser spot is detected at coordinates (a, b) in the camera reference system, whose origin lies at the center of the image plane, the actual x-displacement \(\varDelta x\) with reference to the optical axis is derived as

$$\begin{aligned} \Delta x=\frac{b\left( {F_0 -f} \right) \sin \left( {\alpha -\beta } \right) }{\cos \left( {\alpha -\beta } \right) \cdot a+\sin \left( {\alpha -\beta } \right) \cdot f} \end{aligned}$$
(3)

where, once more, b is multiplied by the camera pixel size before being used. In summary, Eqs. (2) and (3) allow the estimation of the depth distance and heading of the vehicle with reference to the rail border, once the coordinates of each point of the laser line detected on the camera plane are known.
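As a purely illustrative aid, the short Python sketch below applies Eqs. (2) and (3) to a detected laser point; the numeric values of \(\alpha \), \(\beta \), \(F_{0}\), f, and the example pixel coordinates are placeholders, not the calibrated parameters of the actual setup (only the pixel size matches the camera datasheet value quoted in Sect. 2.2).

```python
import numpy as np

# Placeholder calibration values (the real ones come from the calibration
# procedure of Sect. 3.1).
ALPHA = np.deg2rad(30.0)   # laser tilt with respect to the y-axis
BETA = np.deg2rad(10.0)    # camera tilt with respect to the y-axis
F0 = 200.0                 # distance between C and P along the optical axis [mm]
F = 12.0                   # focal length [mm]
PIXEL_SIZE = 14e-3         # camera pixel size [mm]

def pixel_to_metric(a_px, b_px):
    """Convert pixel coordinates (a, b), referred to the center of the image
    plane, into the metric displacements (dx, dy) of Eqs. (2) and (3)."""
    a = a_px * PIXEL_SIZE
    b = b_px * PIXEL_SIZE
    den = np.cos(ALPHA - BETA) * a + np.sin(ALPHA - BETA) * F
    dy = a * (F0 - F) * np.cos(ALPHA) / den          # Eq. (2)
    dx = b * (F0 - F) * np.sin(ALPHA - BETA) / den   # Eq. (3)
    return dx, dy

# Heading of the rail border from two points taken on the fitted laser line
# (illustrative pixel coordinates):
dx1, dy1 = pixel_to_metric(-20.0, -400.0)
dx2, dy2 = pixel_to_metric(15.0, 350.0)
heading_deg = np.degrees(np.arctan2(dy2 - dy1, dx2 - dx1))
```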

2.2 Experimental Setup and Image Processing

As stated above, the acquisition system consists of a laser and a camera mounted on a metallic structure fastened to the front of the mobile vehicle. Figure 2a shows a picture of the prototype used at the ISSIA-CNR laboratories for the calibration and validation of the system, whereas Fig. 2b displays the actual experimental setup mounted on the Smoov ASRV platform.

In this case the laser line is generated by the Lasiris SNF 701L by Coherent [14], which produces an output beam with a power of 200 mW, clearly visible under any ambient light condition. The laser is equipped with a cylindrical lens that spreads the beam into a fan with an aperture of \(30^{\circ }\). Consequently, the length of the laser line is equal to 150 mm, i.e., longer than the maximum size of the holes and bends that have to be reconstructed to control the robot position along the x-axis.

Since the optical system is aimed at the fast localization of a moving vehicle, whose maximum speed is 1.3 m/s, a camera with a high frame rate is required. Here the EoSens CL camera by Mikrotron [15] is used, which achieves 120 fps with the base Camera Link interface at full resolution. Nevertheless, the image plane has been downsized to a resolution of 1280 \(\times \) 96 pixels, thus reaching a frame rate of 250 fps. Furthermore, the pixel size of the camera is equal to \(14\,{\upmu \mathrm{m}}\). All these parameters allow the estimation of a maximum depth \(\varDelta y_{\text {max}}\) equal to about 16 mm, which is high enough to cover all the spatial oscillations that can affect the vehicle movement.
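As a quick sanity check on these figures, at the maximum vehicle speed of 1.3 m/s the displacement of the rail between two consecutive frames is

$$\begin{aligned} \frac{1.3\ \mathrm{m/s}}{250\ \mathrm{fps}}=5.2\ \mathrm{mm}, \end{aligned}$$

i.e., only a small fraction of the 150 mm laser line, so a hole edge entering the field of view remains visible across many consecutive frames.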

Fig. 2 a Experimental setup for calibration and validation steps (see the aperture in the rail profile) and b triangulation system fastened on the Smoov ASRV platform

The detected line is first processed to compensate for the distortion effects due to the curvature of the camera lens, following the well-known model introduced by Heikkilä [16]. The laser line can then be scanned column by column in order to find the position of the maximum of the laser intensity. Many algorithms able to determine the peak position with subpixel precision have been presented in [17]. Since the final intention of the proposed methodology is the fast estimation of the laser peak position, each column that includes at least one pixel with intensity higher than a threshold value is scanned along its extension until the slope of the laser intensity changes sign. The last row coordinate is then stored in memory and labeled as the peak position of the column under investigation. It is worth noting that in this way secondary reflections are filtered out, since the geometry of the problem always places second-order peaks below the laser line. The resulting peak estimation provides a depth resolution of 0.17 mm, which is adequate for the vehicle control system. Moreover, a simple least squares fit is used to extract the line that best matches the samples in the camera plane. The vehicle pose is then derived by considering only two points taken from the fitted straight line and applying the formulations in Eqs. (2) and (3).
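A minimal Python sketch of the column-wise peak search and the subsequent least-squares line fit might look as follows; the intensity threshold and the top-to-bottom scan direction are illustrative assumptions rather than details of the actual implementation.

```python
import numpy as np

INTENSITY_THRESHOLD = 100   # assumed 8-bit threshold, chosen for illustration

def detect_laser_peaks(image):
    """For each column containing at least one pixel above the threshold, scan
    from the top until the intensity slope changes sign and store the last row
    as the peak; deeper (secondary) reflections are thus never reached."""
    peaks = {}
    rows, cols = image.shape
    for c in range(cols):
        col = image[:, c].astype(np.int32)
        if col.max() < INTENSITY_THRESHOLD:
            continue                                     # no laser line here
        r = int(np.argmax(col >= INTENSITY_THRESHOLD))   # first bright pixel
        while r + 1 < rows and col[r + 1] >= col[r]:     # follow the rising slope
            r += 1
        peaks[c] = r                                     # last row before the slope flips
    return peaks

def fit_laser_line(peaks):
    """Least-squares straight line row = m * col + q through the detected peaks."""
    cols = np.array(list(peaks.keys()), dtype=float)
    rows = np.array(list(peaks.values()), dtype=float)
    m, q = np.polyfit(cols, rows, 1)
    return m, q
```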

As stated previously, the presence of arrays of holes on the rail borders makes it possible to refer the vehicle position to fixed structural landmarks. This goal is achieved by looking for the hole edges. With reference to Fig. 3, three different working conditions can be identified:

  1. the line is continuous and its length is completely comprised in the camera plane: no holes are detected (Fig. 3a);

  2. the line is continuous but its length is lower than its expected value: the vehicle is entering or leaving a hole (Fig. 3b);

  3. two separate segments are detected in the image plane: the vision system is crossing over a hole, which is fully comprised in the image (Fig. 3c).

The evaluation of the edge corners is performed by counting the number of spacing columns, i.e., columns that do not contain meaningful laser peaks. If this number is higher than a threshold, the two laser margins are labeled as the hole margins. Finally, the speed of the vehicle can be estimated by processing a set of consecutive frames: knowing the frame rate of the camera, namely 250 fps, the edge displacement can be tracked over time, thus reconstructing the motion properties, as sketched in the example below. Figure 4 reports an example of subsequent frames extracted from a single acquisition. Here the detected corners are not aligned horizontally since the vehicle trajectory is not completely straight.
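Assuming that the metric x-coordinate of one hole edge has already been obtained through Eq. (3) for each frame, a minimal sketch of the speed estimation by finite differences could be:

```python
FRAME_PERIOD = 1.0 / 250.0   # 250 fps -> 4 ms between consecutive frames

def estimate_speed(edge_x_mm):
    """Estimate the vehicle speed (m/s) from the metric x-position (mm) of the
    same hole edge tracked over consecutive frames, using finite differences."""
    return [(curr - prev) / 1000.0 / FRAME_PERIOD
            for prev, curr in zip(edge_x_mm, edge_x_mm[1:])]

# Example: an edge receding by about 5.2 mm per frame corresponds to ~1.3 m/s.
speeds = estimate_speed([0.0, 5.2, 10.4, 15.5, 20.8])
```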

3 Experimental Results

The designed prototype has been used for actual measurements. In the following subsections results will be presented and discussed in order to show the capability of the system to aid the control of smart vehicles.

Fig. 3 Evaluation of the three working states: a no holes are detected, b the starting edge of a hole is found, and c the hole is comprised in the image. Markers identify the hole edges

Fig. 4 Corner displacement on consecutive frames. Note that the detected corners are not aligned horizontally since the AGV trajectory is not completely straight with respect to the rail

Fig. 5 Relative errors in the estimation of (a) distance and (b) rotation of the rail border with reference to the vision system

3.1 Preliminary Analyses for Setup Validation

Before going through the actual validation, the setup is preliminarily calibrated by exploiting the knowledge of the size of a simple triangular object placed in front of the vision system and scanned at different positions. Once the points are extracted in the camera plane, the relative distances between the target edges are used for the inverse computation of the calibration parameters \(\alpha \), \(\beta \), and \(F_{0}\) of Eqs. (2) and (3).
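As a purely illustrative sketch of this inverse computation, the parameters \(\alpha \), \(\beta \), and \(F_{0}\) can be fitted by nonlinear least squares so that the distances predicted by Eqs. (2) and (3) match the known distances on the target; the focal length, pixel coordinates, initial guesses, and the use of SciPy below are placeholders and assumptions, not the routine of the actual setup.

```python
import numpy as np
from scipy.optimize import least_squares

PIXEL_SIZE = 14e-3   # camera pixel size [mm]
F = 12.0             # focal length [mm], assumed value

def predicted_distance(p1, p2, alpha, beta, f0):
    """Metric distance between two detected target points via Eqs. (2)-(3)."""
    pts = []
    for a_px, b_px in (p1, p2):
        a, b = a_px * PIXEL_SIZE, b_px * PIXEL_SIZE
        den = np.cos(alpha - beta) * a + np.sin(alpha - beta) * F
        pts.append((b * (f0 - F) * np.sin(alpha - beta) / den,
                    a * (f0 - F) * np.cos(alpha) / den))
    (x1, y1), (x2, y2) = pts
    return float(np.hypot(x2 - x1, y2 - y1))

def residuals(params, pixel_pairs, known_distances):
    """Mismatch between predicted and known edge-to-edge distances."""
    alpha, beta, f0 = params
    return [predicted_distance(p1, p2, alpha, beta, f0) - d
            for (p1, p2), d in zip(pixel_pairs, known_distances)]

# pixel_pairs: target edge coordinates detected in several scans (placeholders);
# known_distances: corresponding true distances on the triangular target [mm].
pixel_pairs = [((-120, -300), (80, 250)), ((-60, -150), (40, 120)), ((-200, -500), (150, 420))]
known_distances = [60.0, 30.0, 95.0]
sol = least_squares(residuals, x0=[0.5, 0.2, 200.0],
                    args=(pixel_pairs, known_distances))
alpha_cal, beta_cal, f0_cal = sol.x
```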

First experiments were then run to compare the estimated distances and headings of the vision system with a sample rail placed on a roto-translational micrometric stage at known positions (see Fig. 2a). As a first example, Fig. 5 reports the evaluation of the relative errors in actual measurements. Figure 5a shows the effect of shifting the rail border away from the vision system by known distances, while keeping the rail tilt fixed. Conversely, Fig. 5b displays the relative errors in the tilt measurements without changing the rail distance. The analysis of the results demonstrates that the relative errors in computing distances are always below 2 %. When the sample rail is tilted by \(10^{\circ }\), the absolute error is found to be equal to \(0.321^{\circ }\), which corresponds to a small absolute error in the estimation of depth differences of only \(4.2\,\times 10^{-2}\,{\text{ mm }}\).

Finally, Table 1 reports the estimated widths of the rail aperture, whose actual extension is equal to 95.5 mm, obtained by changing the relative position of the laser-camera set with reference to the target. Further experiments have been performed, yielding an average accuracy of about 0.3 % and a \({\sim }99\,\%\) confidence interval of about 0.32 mm.

Table 1 Estimated widths of rail aperture

3.2 Vehicle Implementation

Once the results have been validated, the setup is used for the actual dynamic measurements of the vehicle pose. Here, two different acquisitions with inverted trajectories are reported: in the first acquisition the robot travels a distance of about 2.5 m, whereas in the second it travels for about 4 m.

The first results are reported in Fig. 6, where the vehicle poses with reference to the rail boundary are displayed as a function of the vehicle movement. In this case it is possible to observe the effect of the motion control system, which compensates the vehicle trajectory for wheel slippage, inducing controlled oscillations. Moreover, the final spikes in the robot positions are due to vehicle collisions with the rail border at the end of both measurements. Furthermore, it is worth noting that the missing points in Fig. 6a are due to the presence of the apertures on the rail border.

Finally, the estimated speed of the mobile robot when it crosses the aperture is reported in Fig. 7, which shows how the vehicle reaches the target speed of 1.3 m/s and how this value fluctuates during the full-speed regime. As noticed previously, speed surges and dips are mainly due to wheel slippage and the corresponding control actions. Furthermore, it is important to observe that the speed values are discretized as a consequence of the technique adopted for speed estimation: the hole corners are detected at discrete pixel positions, which are then converted into metric units. Since the sample step is equal to 4 ms, the estimated speed values show leaps of about \(4\,\times 10^{-2}\) m/s. Nevertheless, this aspect does not introduce problems in the efficient control of the vehicle trajectory.
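Assuming that the corner position is resolved in steps comparable with the 0.17 mm resolution quoted in Sect. 2.2, the expected speed quantum is

$$\begin{aligned} \Delta v\approx \frac{0.17\ \mathrm{mm}}{4\ \mathrm{ms}}\approx 4.2\times 10^{-2}\ \mathrm{m/s}, \end{aligned}$$

which is consistent with the leaps observed in Fig. 7.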

Fig. 6 Distance (a) and heading (b) of the AGV computed with reference to the rail boundary for the first and second acquisitions (red and blue dots, respectively), as a function of the vehicle position along the x-axis

Fig. 7 Estimated speed from the first and second acquisitions (red and blue lines, respectively)

4 Conclusions

In this paper we have presented an accurate system to determine the position of a mobile vehicle moving in a structured smart warehouse. A laser triangulation system has been fastened on an autonomous vehicle (Smoov ASRV) in order to provide feedback as fast as inductive sensors and classical odometry, but with higher accuracy. A quantitative validation of the proposed system has been carried out by means of controlled acquisitions, with results in terms of depth, tilt, and aperture width estimation in good agreement with the nominal values. Then, the proposed system has been used for the dynamic pose measurements of a moving vehicle, together with the evaluation of its speed. Results show small errors, lower than 2 % and 3.2 % in distance and tilt estimation, respectively, with a single processing time of 4 ms. Although the proposed vision system has been developed to control a specific vehicle, its application can be extended to any mobile robot in a structured environment. Further activities will lead to the implementation of the algorithm on a single-board computer, and to the use of a more compact laser source and an on-chip camera.