
1 Introduction

The design and development of systems for Ambient Intelligence (AmI) is an active research field, with many potential applications, including safety, security, and health-care assistance [8, 13]. One of the main research challenges in the AmI domain is activity monitoring, which, in turn, depends on accurate and robust people tracking. This can be achieved by means of distributed sensor networks. As a basic principle, in distributed estimation each node of the network locally estimates the state of a dynamical process using information provided by its local sensor and by a subset of nodes of the network, called neighbors [17]. Several approaches to distributed estimation in sensor networks can be found in the literature; their common characteristic is an agreement step that aims at minimizing the discrepancy among sensory nodes [2, 3, 6]. Among the various sensors, cameras have been especially investigated as an effective solution for environmental monitoring. In [14], data fusion and tracking methods for decentralized and distributed camera networks are discussed. A review of distributed algorithms for several computer vision applications can also be found in [15], which emphasizes the advantages of distributed approaches over centralized ones.

The use of multiple sensors increases reliability and effectiveness in large environments. As a drawback, it requires modifications to the infrastructure that can be invasive and expensive. A smart and innovative solution to this problem is the exploitation of flexible moving sensors mounted on semi- or fully-autonomous vehicles. These vehicles represent the mobile nodes of the distributed network. They can be employed as individual agents or organized in teams to provide intelligent distributed monitoring of broad areas. Mobile sensors may significantly expand the potential of AmI technologies beyond the traditional passive role of event detection and alarm triggering from a static point of view. Mobile robots can actively interact with the environment, with humans, or with other robots to accomplish more complex cooperative actions [1, 7, 16]. Nevertheless, mobile surveillance devices based on autonomous vehicles are still in an early stage of development and many issues are currently under investigation [4, 5, 10].

Fig. 1. Conceptual representation of the proposed architecture for ambient intelligence.

In this paper, a Distributed Cooperative Architecture (DCA) is presented. It integrates fixed and mobile heterogeneous sensors to intelligently monitor large environments and track people. Figure 1 shows a conceptual representation of the system, which includes fixed calibrated cameras and a team of autonomous mobile robots equipped with different sensors. The system is being developed as part of the Italian National Research Program PON-BAITAH, “Methodology and Instruments of Building Automation and Information Technology for pervasive models of treatment and Aids for domestic Healthcare”, which aims to develop ICT AmI technologies to support fragile people in their domestic environments.

In the proposed system, mobile sensors provide two main functionalities: (1) they supply information about the observed human target in areas not covered by the fixed cameras; (2) they can move close to the target to increase the precision and reliability of scene analysis whenever the fixed sensors are unable to provide robust estimates.

The main contribution of this work is related to the design of the DCA, which involves different challenges. The major one is the integration of high-level decision-making with simple primitive behaviors for different operative scenarios. This aim requires a modular and reconfigurable system, capable of simultaneously addressing low-level reactive control, general-purpose monitoring tasks, and high-level control algorithms in a distributed fashion. The details of the DCA and its main components are described in Sect. 2. Preliminary real-world experiments using the proposed DCA system are presented in Sect. 3. Finally, conclusions are drawn in Sect. 4.

2 Distributed Cooperative Architecture

The Distributed Cooperative Architecture (DCA) is a control architecture for heterogeneous networks of sensors and robots, hereinafter referred to as agents. All the agents form a peer-to-peer network and differ in their sensing capabilities. Specifically, every agent is able to detect an event (e.g., to perceive moving people or objects) and to localize an event (e.g., tracking the position of a person) in the environment using one or more sensor devices; in addition, mobile agents are able to execute tasks through their actuators. Both the static cameras and the mobile agents take advantage of people detection modules, which provide the input to a distributed target tracking algorithm, namely the Consensus-Based Distributed Target Tracking (CDTT) algorithm, previously proposed by some of the authors in [12]. The CDTT module constitutes the main layer of the DCA. In the following, the CDTT algorithm is first recalled; then, a detailed description of the fixed and mobile agents is provided.
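
For the sake of illustration, the agent abstraction can be summarized by the following minimal Python sketch; the class and method names are our own and do not refer to the actual implementation.

```python
from abc import ABC, abstractmethod
from typing import Optional, Tuple

class Agent(ABC):
    """Common interface of every network node (fixed camera or mobile robot)."""

    @abstractmethod
    def detect_event(self) -> bool:
        """Return True if a moving person or object is perceived."""

    @abstractmethod
    def localize_event(self) -> Optional[Tuple[float, float]]:
        """Return the (x, y) world position of the tracked target, if any."""

class MobileAgent(Agent):
    """Mobile agents additionally execute tasks through their actuators."""

    @abstractmethod
    def execute_task(self, goal: Tuple[float, float]) -> None:
        """E.g., navigate toward a goal position to improve the view of the target."""
```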

2.1 Distributed Target Tracking Algorithm

The core functionality of the DCA system is people tracking, based on the fully distributed Consensus-based Distributed Target Tracking (CDTT) algorithm. This algorithm aims to enhance the performance of the people tracking task in a heterogeneous sensor network. It consists of a two-phase iterative procedure: an estimation step and a consensus step. In the estimation phase, each node of the network produces an estimate of the target position. If the node can directly take a measurement, it estimates the target position by means of a Kalman filter. Otherwise, the node predicts the target position according to the linear motion model embedded in the Kalman filter. In the consensus phase, all the estimates in the network converge to a common value via a max-consensus protocol, performed on a measurement accuracy metric called the perception confidence value. This approach was proved to provide good performance in heterogeneous sensor networks composed of nodes with limited sensing capabilities [6]. The CDTT approach is totally distributed, as it does not involve any form of centralization. Moreover, it guarantees the agreement of the network nodes on the target position. The reader is referred to [12] for further details.
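
As an illustrative sketch, one CDTT iteration at a single node might be structured as follows. The Kalman filter object, the local_perception_confidence function, and the synchronous exchange of neighbor estimates are simplifying assumptions of ours; the reader is referred to [12] for the actual algorithm.

```python
def cdtt_step(kf, measurement, neighbor_estimates):
    """One CDTT iteration at a single node (illustrative sketch).

    kf                 -- Kalman filter embedding a linear target-motion model
    measurement        -- local target observation, or None if the target is unseen
    neighbor_estimates -- list of (confidence, position) pairs received from neighbors
    """
    # Estimation phase: correct with the local measurement when available,
    # otherwise propagate the prediction of the embedded motion model.
    kf.predict()
    if measurement is not None:
        kf.update(measurement)
        confidence = local_perception_confidence(measurement)  # hypothetical accuracy metric
    else:
        confidence = 0.0  # pure prediction: lowest perception confidence

    # Consensus phase: max-consensus on the perception confidence value, so
    # that every node eventually adopts the estimate of the most confident one.
    candidates = neighbor_estimates + [(confidence, kf.state_estimate())]
    best_confidence, best_position = max(candidates, key=lambda c: c[0])
    return best_confidence, best_position
```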

2.2 Fixed Agents

Fig. 2. Schematic representation of interconnections among modules composing the Fixed Agent node.

The Fixed Agents run on a workstation linked to each camera through the network infrastructure. The schematic representation of the interconnections among the components of the Fixed Agent module is shown in Fig. 2. The Fixed Agents are implemented in the Robot Operating System (ROS)Footnote 1 framework. In particular, each Fixed Agent is a ROS node in which two main components run: the perception module and the distributed target tracking module.

The perception module includes the set of functionalities illustrated in Fig. 2 (i.e., motion detection, shadow removal, object tracking, and 3D moving object localization) aimed at robustly detecting and localizing people with respect to the cameras. Details about the developed algorithms can be found in [11].
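
By way of illustration only, the first stage of such a pipeline (motion detection with shadow removal) could be sketched with OpenCV as follows; the actual algorithms, described in [11], differ from this minimal example.

```python
import cv2

# Background subtractor with shadow detection enabled; shadows are marked
# with the value 127 in the foreground mask and discarded below.
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

def moving_object_boxes(frame):
    """Return bounding boxes of moving objects in a camera frame."""
    mask = subtractor.apply(frame)
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)  # drop shadow pixels (127)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)       # remove small noise blobs
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 500]
```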

The distributed target tracking module implements the CDTT algorithm described in the previous subsection, together with the communication procedures that allow agents to cooperate via the UDP protocol. UDP is used for inter-agent communication because, with the multicast technique, it avoids the need for a centralized hub.
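
A minimal sketch of such multicast messaging is given below; the group address, port, and message format are placeholders chosen for illustration.

```python
import json
import socket
import struct

MCAST_GROUP, MCAST_PORT = "239.0.0.1", 5007  # placeholder multicast endpoint

def open_multicast_socket():
    """Join the multicast group so that every agent hears every other agent."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", MCAST_PORT))
    membership = struct.pack("4sl", socket.inet_aton(MCAST_GROUP), socket.INADDR_ANY)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)
    return sock

def broadcast_estimate(sock, agent_id, confidence, position):
    """Send the local (confidence, position) pair to all agents at once."""
    payload = json.dumps({"id": agent_id, "conf": confidence, "pos": position})
    sock.sendto(payload.encode(), (MCAST_GROUP, MCAST_PORT))
```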

2.3 Mobile Agents

Fig. 3. Schematic representation of interconnections among modules composing the Mobile Agent node.

The Mobile Agents of the network consist of mobile robots. Each mobile agent is equipped with sensory devices to interact with the environment. Every mobile agent is able to localize itself in the environment and to navigate safely, avoiding static and dynamic obstacles. It is also able to identify and track the position of a target in the environment. ROS has been adopted as the framework for communication management, sensor acquisition, and actuator control. ROS provides a Navigation Stack, which enables the robot to navigate in a known environment while avoiding obstacles, as well as sensor management packages [9]. The structure of the ROS navigation stack was modified to develop a customized monitoring architecture enriched with new functionalities with respect to the native ROS framework. Specifically, surveillance capabilities were added to the mobile nodes, and a coordinate transformation from local to global coordinates was introduced for the people tracking task. A schematic representation of a mobile agent is shown in Fig. 3. All ROS nodes run on the on-board laptop, except for sicktoolbox_wrapper and p2os_driver, which run on the embedded PC of the robot. As can be seen, the Navigation Stack of ROS produces robot position estimates, as well as information about obstacles, on the basis of laser measurements. The motion_control ROS node, implemented by our research team, sends velocity references to the p2os_driver ROS node, which is responsible for robot guidance. The people_tracker node estimates the position of people relative to the robot. It is based on the openni_tracker node, which uses input from an on-board Kinect camera. The relative coordinates of detected people, transformed into the world reference frame, provide the input data to the distributed target tracking algorithm.
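
The local-to-global transformation mentioned above amounts to a planar rigid-body transform: a detection expressed in the robot frame is rotated by the robot heading and translated by the robot position, both provided by the Navigation Stack. A minimal sketch (function and variable names are illustrative):

```python
import math

def robot_to_world(robot_pose, person_local):
    """Transform a person detection from the robot frame to the map frame.

    robot_pose   -- (x_r, y_r, theta): robot position and heading in the map frame
    person_local -- (x_l, y_l): person position relative to the robot
    """
    x_r, y_r, theta = robot_pose
    x_l, y_l = person_local
    # 2D rigid-body transform: rotate by the robot heading, then translate.
    x_w = x_r + x_l * math.cos(theta) - y_l * math.sin(theta)
    y_w = y_r + x_l * math.sin(theta) + y_l * math.cos(theta)
    return x_w, y_w
```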

3 Experimental Results

In [11], the proposed system was validated through numerical simulation campaigns, showing that adding a mobile node to the fixed agents improves the tracking accuracy. Here, experimental tests conducted in a real-world scenario are presented. First, the experimental setup is described; then, the results of the experimental tests are reported.

3.1 Environment Setup

Fig. 4. Map of one corridor of the office, with the positions of three static cameras (red circles) and one mobile agent (green triangle) overlaid (Color figure online).

The experimental setup is shown in Fig. 4. The picture displays the map of a corridor of the ISSIA-CNR building, as obtained by the ROS gmapping node using laser data acquired by a mobile robot during a complete exploration of the environment. In this experiment, three fixed cameras and one mobile robot were employed. The positions of the fixed cameras (\(C_1\), \(C_2\), \(C_3\)) and the initial position of the mobile robot (\(R_1\)) are overlaid on the map. Using its on-board sensors, the mobile agent is able to localize itself in the environment and to carry out surveillance tasks, such as people detection and tracking. The cameras are calibrated; therefore, events detected in the image plane can be located in the real world and their positions communicated to the mobile agent. The mobile robot can explore areas that are unobservable by the fixed cameras, improving the accuracy in detecting events by reaching suitable positions in the environment. Hence, the proposed system could be useful to reduce the number of fixed sensors, or to monitor areas (e.g., cluttered environments) in which the field of view of the fixed cameras can be temporarily and dynamically reduced.

3.2 Sensor Network Setup

Fig. 5. The nodes of the network. On the left, the mobile agent PeopleBot, equipped with a SICK LMS200 laser range-finder and a Kinect. On the right, two different AXIS cameras: on the top, a megapixel Axis IP color camera with \(1280\times 1024\) pixel resolution; on the bottom, an Axis IP color camera with \(640\times 480\) pixel resolution.

Table 1. Average MSE and variance for the tracking of a person moving in the laboratory by means of a network of \(4\) nodes (\(3\) fixed and \(1\) mobile).

The fixed nodes consist of three wireless IP cameras (\(C_1\), \(C_2\), \(C_3\)) located in different places in the environment (see the map in Fig. 4). \(C_2\) and \(C_3\) are Axis IP color cameras with \(640\times 480\) pixel resolution and an acquisition rate of 10 frames per second. \(C_1\) is a megapixel Axis IP color camera with \(1280\times 1024\) pixel resolution and a full-frame acquisition rate of 8 frames per second (see Fig. 5, on the right). A calibration step to estimate the intrinsic and extrinsic parameters was performed for each camera using the Matlab Calibration ToolboxFootnote 2. This allows camera coordinates to be mapped to the map reference frame.
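
Under a ground-plane assumption, this mapping can be sketched as follows: an image point is back-projected through the inverse intrinsic matrix, expressed in the map frame through the extrinsic parameters, and intersected with the floor plane \(z = 0\). The sketch assumes camera-to-world extrinsics \((R, t)\); the conventions of the actual calibration output may differ.

```python
import numpy as np

def pixel_to_ground(K, R, t, pixel):
    """Map an image detection to map coordinates on the floor plane z = 0.

    K -- 3x3 intrinsic matrix; R, t -- camera-to-world rotation and translation
    (so that t is the camera position in the map frame); pixel -- (u, v).
    """
    u, v = pixel
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray, camera frame
    ray_world = R @ ray_cam                             # viewing ray, map frame
    cam_center = np.asarray(t, dtype=float)             # camera position, map frame
    s = -cam_center[2] / ray_world[2]                   # intersection with z = 0
    point = cam_center + s * ray_world
    return point[:2]                                    # (x, y) on the map
```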

The mobile agent (denoted as \(R_1\) in Fig. 4) is a PeopleBot mobile robot platform equipped with a laser range-finder, a Kinect, and an on-board laptop (see Fig. 5, on the left). The SICK laser is connected to the embedded robot control unit. The Kinect camera and the PeopleBot control unit are connected to the laptop via a USB cable and a crossover cable, respectively. The laser range-finder is used to build a map of the environment and to localize the vehicle. The Kinect is used both for navigation (e.g., obstacle avoidance) and for high-level tasks, such as people detection and tracking.

Fig. 6. Trajectory 1. The measurements of the target position carried out by each sensor of the network (a), and the CDTT trajectory recovered online and in a distributed fashion by the network (b) (Color figure online).

Fig. 7. Trajectory 2. The measurements of the target position carried out by each sensor of the network (a), and the CDTT trajectory recovered online and in a distributed fashion by the network (b) (Color figure online).

3.3 Results of Experiments

A network of three fixed cameras and one robot, deployed in our lab, was used to verify the performance of the DCA system in a real application: tracking a person moving in the environment. The two trajectories followed by the target during the experiments are shown in Figs. 6 and 7. In both Fig. 6(a) and Fig. 7(a), the red line represents the real trajectory of the target, while the different markers represent the sequence of target positions estimated by each single node and by the monitoring network as a whole using the DCA architecture. In particular, Fig. 6(b) and Fig. 7(b) compare the target trajectory (red line) with the trajectory estimated by the CDTT algorithm (blue dots). After convergence of the consensus step, the CDTT algorithm requires all the network nodes to share the same information about the target location; therefore, the estimated target position is the same for every node of the network. To quantify the tracking performance, the target is assumed to move at constant velocity, which allows the ground-truth trajectory, and thus the MSE, to be computed. The collected results, shown in Table 1, exhibit a mean square error of \(1.15\) m and \(0.75\) m for Trajectory 1 and Trajectory 2, respectively.
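
For reference, the error computation under the constant-velocity assumption can be sketched as follows; variable names and the sampling model are our own assumptions.

```python
import numpy as np

def tracking_mse(estimates, start, velocity, dt):
    """Mean squared position error against a constant-velocity ground truth.

    estimates -- (N, 2) array of CDTT target position estimates, one per step
    start     -- (x, y) initial target position; velocity -- (vx, vy) in m/s
    dt        -- sampling interval in seconds
    """
    times = np.arange(len(estimates))[:, None] * dt      # (N, 1) time stamps
    ground_truth = np.asarray(start) + times * np.asarray(velocity)
    errors = np.sum((np.asarray(estimates) - ground_truth) ** 2, axis=1)
    return errors.mean(), errors.var()                   # average MSE and variance
```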

4 Conclusions

This paper introduced a distributed cooperative architecture for ambient intelligence applications. Its main contribution is a monitoring network composed of fixed and mobile nodes. The use of mobile nodes produces a twofold advantage: it allows the complete coverage of large environments with a smaller number of sensors (with respect to the use of fixed nodes only), and it increases the accuracy of measurements by deploying the sensors in the most favorable positions to observe the current target. The global control architecture used by the system was presented, and the software agents developed for both fixed and mobile nodes were described. The feasibility and effectiveness of the proposed system were shown by preliminary experimental results obtained using a monitoring network realized in our lab environment.