1 Introduction

Robots are becoming essential tools for helping humans in daily life in urban environments and are therefore required to exhibit acceptable, human-like behaviors. In the field of social robot navigation, it is crucial that robots develop the abilities to navigate socially among humans in uncontrolled environments. It is also crucial that those navigation skills be flexible and applicable to various situations, including navigation [1], accompaniment of one [2] or two people at first [3, 4] and then more people in the future (shown in Fig. 1), approaching people in the environment [5], and displaying a combination of more than one behavior at the same time, such as approaching one person while accompanying another [6]. In addition, these methods must have the potential to be customized by specific users in the future, which would enable society to have personal robots.

Fig. 1

Real-life experiments. Top: Three group formations using the ASP-VG method. Bottom: Three group formations using the ASP-SG method. Left: One volunteer is accompanied by our robot, named Tibi, using a Side-by-Side accompaniment. Center: Two volunteers are accompanied by Tibi positioned at the side of each formation. Right: Two volunteers are accompanied by Tibi positioned in the center of each formation

It is not an easy endeavor to develop human-like abilities and apply them to robots. In the field of urban robot navigation, and focusing on group accompaniment, robots need to perform a large number of parallel tasks to achieve a natural and human-like interaction with people. For instance, the robot must predict human motion; infer the most likely destination of the person; deal with perception problems, such as momentary or large occlusions; infer the path that the companions of the group prefer to reach their common destination; avoid other pedestrians and static obstacles in the environment while navigating; behave understandably, either by displaying human-like behavior or by making the robot’s behavior understandable through speech; and use social distances or other types of social behaviors that allow people to experience pleasant interactions with the robot. Moreover, robots need to deal with humans and dynamic environments that are sometimes unpredictable and complex. All the tasks mentioned in this paragraph are accomplished by our robotic system, which includes the ASP as the core.

One of the main problems to be solved when accompanying groups of people is ensuring human comfort. Some studies on pedestrian models suggest that a side-by-side formation is the most comfortable formation for groups of two people and a V-formation for groups of more than two people [7,8,9,10]. Therefore, robots should be capable of using these two formations. Further, we are interested in analyzing whether these formations are perceived differently using the comfortableness criteria. Additionally, our robot system includes the theory of social distances defined by Hall [11] and other works that study the most comfortable behaviors for robots during navigation [12,13,14]. The theory of social distances is also called the theory of proxemic rules.

In this paper, we present the ASP, a general planning methodology that can be used to perform different types of Human-Robot Collaborative Navigation (HRCN). It is flexible and can be adapted to various situations (navigation [1], accompaniment [2,3,4], approaching [5], accompaniment and approaching at the same time [6, 15], or other behaviors [16]) and different robots (humanoids or drones [17, 18]). In addition, the method can be improved in the future by including mechanisms that will allow specific users to customize the ASP forces and costs to include their preferences. In this paper, we show how the ASP can be customized to develop different robot group accompaniment behaviors with (i) a robot’s Adaptive Social Planner using a V-formation model to accompany Groups of people (ASP-VG) and (ii) a robot’s Adaptive Social Planner using a Side-by-side formation model to accompany Groups of people (ASP-SG). First approximations of the ASP-VG and ASP-SG methods were previously introduced in [3, 4]. We developed these two group accompaniments because exploring human preferences while being accompanied by robots requires different types of accompaniment to compare. These planning methods use the Bayesian Human Motion Intentionality Prediction (BHMIP) [19], the Rapidly-exploring Random Tree (RRT*), and several versions of the Extended Social Force Model (ESFM). The ESFM derives from the Social Force Model developed by Helbing [20]. The novelties of the present paper are included in Sect. 7.

In the remainder of the paper, the related work is presented in Sect. 2. Section 3 explains the system that allows a robot to perform different types of collaborative navigation, whose core is the Adaptive Social Planner of Sect. 3.2; the ASP-VG and ASP-SG are presented in Sect. 3.3. The performance metrics used to evaluate the social behavior of the robot for both methods are described in Sect. 4. Section 5 presents the results of the synthetic experiments of both methods. In Sect. 6, we provide the guidelines for performing real-life experiments with volunteers who are not experts in robotics and evaluate the real-life experiments. These experiments include five user studies to analyze the acceptability of these two methods and the preferences of nonexpert people. In addition, we include discussions in Sect. 7. Finally, conclusions are presented in Sect. 8.

2 Related Work

Robots designed to share urban spaces with people and assist them are required to have the ability to navigate autonomously and socially. Thus, many works and various surveys have been developed in the field of research on autonomous social navigation [21,22,23,24]. Several works, such as [22, 23], view social robot navigation as a cooperative activity with humans avoiding each other simultaneously, and other articles [24] combine different types of communication to achieve a more natural Human-Robot Interaction (HRI).

Some papers revealed that robots that move predictably and socially can increase people’s accessibility and satisfaction [25,26,27] with them, as well as people’s trust in [28] and comfort with [29] the behavior of the robot and the perception of safety in a robot’s presence [30]. If researchers do not consider these social conventions, this may result in low success rates in societal applications [31]. Therefore, it is important to include the human social conventions in the behavior of the robot to ensure that it can be accepted as a partner by inexpert people; our current work does this by including the theory of social distances defined by Hall [11] and other related works [12,13,14].

There are previous works in the field that did not implement equal and collaborative accompaniment in terms of human guidance [32,33,34,35,36] and following [37,38,39,40]. Recently, in the field of autonomous social navigation, a few of them have begun to consider robots as partners in a one-person side-by-side accompaniment [41,42,43,44,45] and in group accompaniment using other formations [13, 36, 46, 47]. Nevertheless, these group accompaniments do not promote a more natural human-robot interaction among the group members during the accompaniment, which we try to achieve in our current work.

In robot side-by-side formation accompaniment, some studies include a prediction to anticipate the behavior of the partners and to navigate more intelligently. In [48], the authors developed a side-by-side method to infer the final goal of the person, which has its basis in previous works [6, 43, 49]. In this paper, robots perform a reactive companion task by considering that two goals must be fulfilled: achieving a position of 90 degrees to the person and moving towards the goal of the person. Another side-by-side accompaniment incorporates learning techniques [44]. The authors present a method that applies reinforcement learning to teach a teleoperated robot how to navigate autonomously with a human in a cooperative way while avoiding obstacle collisions. In [50], a robot accompanies a person using the predicted trajectory and remains in a desired position relative to the human.

Several approaches have been developed to accompany and follow one person with several robots [43, 49, 51]. Nevertheless, very few of these considered more than one person being accompanied by a robot [13, 52,53,54]. Additionally, the works on group accompaniment tend to see the robot as a guider rather than as a companion or coworker who is part of the group. The approaches that consider more than one person and more than one robot can be found in works such as those by Saez et al. [52], Urcola et al. [53] and Garrell et al. [13]. In these works, researchers implement group strategies that use different robots to maintain the cohesion of the group by using attraction forces between the members of the group and repulsion forces to avoid obstacles. In the field of robot guides, Diaz et al. [55] developed an exploratory study on group interaction with a robot guide, which provided fruitful insight into understanding the relationship between robot positioning and efficient communication, the use of motion cues and collaborative walking together behavior. Triebel et al. [36] implemented a social guide robot in airports that considers social behaviors and moves in a dynamic environment.

The most complex approaches that have been developed to date for one-person or group-of-people accompaniment have been designed for use with wheelchairs [56], a social necessity. Prassler et al. [57] implemented a method of accompaniment for a wheelchair, designing a collision avoidance model based on velocity that incorporates a linear prediction of collision velocities. Kobayashi et al. [58] used a visual-laser tracking technique to carry out a side-by-side companion task between a wheelchair and a caregiver, demonstrating the same effect within the context of visiting a museum. Finally, Suzuki et al. [59] proposed a wheelchair system that navigates in a formation that renders a more natural communication between the user and the caregiver.

Here, we also explain the differences between the state-of-the-art methods and our methods (the ASP and the derived methods for group accompaniment). Our methods are capable of accompanying more than one person, and they go beyond merely maintaining the group cohesion or a fixed formation (side-by-side or other). Our algorithms allow a more dynamic positioning around the human partners in order to avoid obstacles, while using the people-robot formation that best supports the communicative interaction among the group members. This dynamic positioning enables the group to remain fully involved in their social interaction for a longer time, and it makes the robot capable of adapting to the environment. Furthermore, our ASP method can render a real-time prediction of the partners’ dynamic movements, as well as those of other people, over a time horizon. This type of prediction allows the robot to anticipate human navigation and react accordingly, and to facilitate the navigation of all pedestrians, especially that of its companions during the accompaniment.

Another difference compared with the state-of-the-art methods is that the presented methods include people’s social rules, which allow for more comfortable robot behavior. These rules are extracted from several state-of-the-art works focused on proxemics and comfortableness in HRI [11,12,13,14]. For example, Hall [11] studied human behavior and developed a definition of social distances depending on the situation or the relationship between individuals. If a person you just met does not keep a distance according to that relationship, in other words, comes closer than necessary and thereby signals a more intimate relationship with you, that proximity may make you uncomfortable in certain situations. Using these state-of-the-art studies, we allow our robot to use distances and velocities that are more comfortable for all people (bystanders and people interacting with the robot), especially for people who interact with a robot for the first time. Additionally, we include other robot behaviors, such as not making sudden movements so as not to scare people, using comfortable and smooth velocities for these people during accompaniment, and not approaching very quickly. In general, we adapt the movements of Tibi to human behavior during its navigation to make people feel comfortable while interacting with the robot.

Another difference between the state-of-the-art and our algorithms is that we use several subcost functions to evaluate the planned paths and to select the best one according to different criteria (Sect. 3.2.2). These criteria are minimizing the navigation effort of the group to arrive at its destination, maximizing the comfort of the group in terms of maintaining their communicative interaction during the accompaniment, and minimizing the navigation effort that the bystanders of the environment need to avoid the robot. Normally, humans try to select the optimal path regarding the same criteria that we are using to select the best path. This behavior allows the robot to anticipate which path will be selected by the accompanied people, taking into consideration their own objectives, such as shorter distance and fewer changes in orientation to arrive at the destination, avoiding or getting closer to several objects of the environment, maintaining the formation of the group as long as possible to speak properly, and using different social distances depending on the relation between people. Examples of this behavior may include maintaining the social distance defined by Hall for the people of the group and maintaining a greater distance from other people in order to not interact with them. In addition, other examples may include approaching a person by considering the social distances and a formation that people use if more than one person is involved in the interaction.

The last difference with respect to previous works is that our methods include four new skills to deal with random movements and uncertainties of human behavior, which previous methods do not consider. These skills are described in Sect. 3.3.3. Additionally, these skills allow us to obtain a better group accompaniment by always maintaining the “exact” side-by-side or V-form using skills one to three and adapting the behavior of the robot to the velocity of its companions using skill four. Furthermore, skill one can be used in any other situation that implies laser occlusions between people. Skill two can be used in other situations that imply the rearrangement of people inside groups. Skill three can be used in methods that use punctual destinations in urban environments to solve the problems that arise when the real destinations are not punctual ones (stairs, entrances and exits of streets or squares), or when these environments include objects and people to avoid. Skill four can be used to adapt to the velocity of people in situations of joint navigation.

Fig. 2

Structure of the system that includes the ASP as a core. The diagram includes the inputs of the system and the methods that use the ASP. The ASP method has two parts. The first one is the ESFM combined with the RRT*, which computes all possible paths for robot social navigation. The second one is the gradient descent optimization of these paths, which allows the method to select the best path to obtain the adaptive robot motion in dynamic environments. Also, the ASP-SG and ASP-VG are sub-methods of the ASP that implement the robot group accompaniment. In addition, other methods to implement robot alone navigation [1] and robot approaching people [5, 6] are sub-methods of the ASP

3 System for Human Robot Joint Navigation

3.1 Methods that Use the ASP

The ASP local planner uses information extracted by other methods to implement complex robot collaborative navigation. This modular implementation uses different Robot Operating System (ROS) nodes for the different parts. Simplifying the task implemented by each part is essential for the complete system to function well, and it makes errors easier to solve. We therefore present here the parts external to the ASP that are also important to obtain the final robot behavior.

The ASP local planner needs four inputs: all the obstacles inside its time window of 5 seconds (block: Environment Obstacle Detection), the robot localization inside a map (block: Adaptive Monte Carlo Localization (AMCL) Map Localization), the current position of all people and their future paths in the environment (block: ESFM to Predict Future Paths of Pedestrians/Objects), and the predicted destinations of all people (block: BHMIP to Estimate the Most Feasible Destinations). In addition, the ASP needs the abilities implemented by the four robot skills to deal with random human behaviors (Sect. 3.3.3), which are included for simplicity as an input in Fig. 2 (block: Methods to Solve Navigation Issues of Groups). In turn, the BHMIP, the person prediction, and the robotic skills need the people detection and tracking methods as input. The complete structure of the system is shown in Fig. 2, which includes all inputs and outputs. We will explain the parts of this system in the following sections.

Environment Obstacle Detection:

We use two Hokuyo UTM-30LX 2D scanning laser rangefinders for obstacle detection, one at the front of the robot and one at the rear. Together, these two laser scans cover \(360^{\circ }\) around the robot. The lasers are mounted 40 cm above the ground. They allow the robot to detect the legs of people, the environment for localization, and obstacles in the environment, as explained in the current section. To compute the obstacles, we place cylinders every 0.2 meters along all the laser detections that are not considered a person. For computational reasons, we only detect obstacles inside the window of the local planner that surrounds the robot. Our 360-degree coverage is made up of two 190-degree laser scans because we need an overlap at the sides of the robot to detect people accompanying it in side-by-side formations. More sophisticated object detection could also be used, but this was outside the scope of our research.
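The obstacle construction described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation used on the robot: the function name `scan_to_obstacles`, the greedy thinning rule, and the `window_radius` default are hypothetical; only the 0.2 m spacing and the exclusion of person detections come from the text.

```python
import math

def scan_to_obstacles(ranges, angle_min, angle_step, is_person,
                      spacing=0.2, window_radius=6.0):
    """Convert non-person laser returns into obstacle-cylinder centers.

    ranges: range readings in meters, expressed in the robot frame.
    is_person: booleans, True if the return belongs to a detected person.
    spacing: minimum separation between kept cylinders (0.2 m in the text).
    window_radius: hypothetical radius of the local planner window (m).
    """
    obstacles = []
    for i, r in enumerate(ranges):
        # Skip person returns, invalid readings, and points outside the window.
        if is_person[i] or not math.isfinite(r) or r > window_radius:
            continue
        angle = angle_min + i * angle_step
        x, y = r * math.cos(angle), r * math.sin(angle)
        # Greedy thinning: keep a point only if it is at least `spacing`
        # away from the last kept cylinder along the scan.
        if not obstacles or math.hypot(x - obstacles[-1][0],
                                       y - obstacles[-1][1]) >= spacing:
            obstacles.append((x, y))
    return obstacles
```

The thinning keeps the number of cylinders bounded, which matters because every cylinder later contributes a repulsive interaction to the planner.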

AMCL Map Localization:

For the robot localization inside a map, we use the AMCL implemented by the ROS developers (Footnote 1), since it is available as a well-documented node that integrates easily into our structure and works well for us. This block takes as input the laser scans, the odometry of the robot, and the map of the environment.

People’s Detection and Tracking:

To detect all people, we use an algorithm that detects the legs of people [60], which is based on [61]. The people detector defines a set of geometric features related to a cluster pattern of the legs detected by the laser. This detector uses a boosting method to determine whether that set of laser points corresponds to a human being. The legs pattern comprises two semicircles positioned relatively close to each other, which are near in terms of distance during short intervals. We chose a laser-based detector due to its position accuracy, faster detection rate, and larger detection area.

Our tracking algorithm follows a similar approach to the work presented in [62] and some of the contributions in [63]. Our particular tracker implementation was published in [64], where it was used for DATMO systems. Our tracking algorithm uses a Kalman filter to propagate the pedestrian trajectories, and it combines the different detections with the existing tracks to calculate the most likely association hypothesis. This tracker uses a hypothesis based on probabilities to confirm, hold, associate, and delete the tracks. Only time-consistent detections repeated multiple times become confirmed tracks, hence, starting the tracking procedure.
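The propagation and confirmation steps of such a tracker can be illustrated with a short sketch. It is not the tracker of [64]: the constant-velocity motion model, the noise values, the initial covariance, and the `confirm_after` threshold are all assumptions chosen for illustration, and the probabilistic data association is omitted (each track is simply handed its own detection).

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one coordinate of a track."""

    def __init__(self, z0, r=0.01, q=0.1):
        self.p, self.v = z0, 0.0            # position and velocity estimates
        self.P = [[1.0, 0.0], [0.0, 1.0]]   # state covariance (assumed)
        self.r, self.q = r, q               # measurement / process noise (assumed)

    def predict(self, dt):
        (a, b), (c, d) = self.P
        self.p += self.v * dt
        # P <- F P F^T + Q, with F = [[1, dt], [0, 1]] and Q = diag(0, q * dt)
        self.P = [[a + dt * (b + c) + dt * dt * d, b + dt * d],
                  [c + dt * d, d + self.q * dt]]

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r                      # innovation covariance
        k0, k1 = a / s, c / s               # Kalman gain for H = [1, 0]
        y = z - self.p                      # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


class Track:
    """A pedestrian track that is confirmed only after repeated detections."""

    def __init__(self, x, y, confirm_after=3):
        self.kx, self.ky = AxisKalman(x), AxisKalman(y)
        self.hits, self.confirm_after = 1, confirm_after

    def step(self, dt, detection=None):
        self.kx.predict(dt)
        self.ky.predict(dt)
        if detection is not None:           # associated detection, if any
            self.kx.update(detection[0])
            self.ky.update(detection[1])
            self.hits += 1

    @property
    def confirmed(self):
        return self.hits >= self.confirm_after
```

When no detection is associated in a cycle (for example during an occlusion), `step` only predicts, so the track keeps moving with its last estimated velocity.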

BHMIP to Estimate the Most Feasible Destinations:

We use the BHMIP implemented in [19] to estimate all people’s destinations. The BHMIP method takes two inputs: the output of the people detection and tracking blocks, from which it obtains a window of previous positions for all the people, and a set of predefined destinations (\({{\textbf {D}}}=\{{{\textbf {D}}}_1,{{\textbf {D}}}_2,\ldots ,{{\textbf {D}}}_n,\ldots , {{\textbf {D}}}_m \}\)) inside the environment. These destinations are physical places where people can go, for example, doors, streets or square entrances, and street furniture such as benches or vending machines. These destinations can be predefined on any map by researchers who know where these places are located. Alternatively, they could be found with learning methods that use the features defining, for example, doors; other ways of including these destinations are also possible. From all possible destinations of the environment, the method searches for the goal that each person most probably wants to reach.

The output of the BHMIP is the set containing the most probable destination of each person, \({{\textbf {D}}}=\{{{\textbf {D}}}_{n} (p_1),{{\textbf {D}}}_{n} (p_2) ,\ldots , {{\textbf {D}}}_{n} (p_n)\}\). The subscript of p matches the identification number that our people tracker assigns to each person. For implementation details about the BHMIP, the reader should refer to [19].
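To make the idea of inferring a destination from a window of previous positions concrete, the following sketch computes a Bayesian posterior over the predefined destinations. It is not the BHMIP formulation of [19]: the Gaussian likelihood on the heading error, its `sigma`, the uniform prior, and the function name are assumptions made only for illustration.

```python
import math

def most_probable_destination(track, destinations, sigma=0.5):
    """Return (index, posterior) for the most probable destination.

    track: list of (x, y) past positions of one person, oldest first.
    destinations: list of (x, y) predefined destinations D_1..D_m.
    sigma: assumed standard deviation (rad) of the heading-error likelihood.
    """
    log_post = [0.0] * len(destinations)    # uniform prior (assumed)
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        heading = math.atan2(y1 - y0, x1 - x0)
        for j, (dx, dy) in enumerate(destinations):
            bearing = math.atan2(dy - y0, dx - x0)
            # Heading error wrapped to [-pi, pi].
            err = math.atan2(math.sin(heading - bearing),
                             math.cos(heading - bearing))
            log_post[j] += -0.5 * (err / sigma) ** 2
    # Normalize in a numerically stable way.
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    posterior = [w / total for w in weights]
    best = max(range(len(destinations)), key=posterior.__getitem__)
    return best, posterior
```

For a person walking along the x axis, a destination straight ahead receives almost all of the posterior mass, while a destination off to the side is quickly ruled out as more positions accumulate.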

Then, to anticipate the path that all the people in the environment will take, we use the output destinations of the BHMIP combined with the ESFM (Sect. 3.2.1.1). These paths allow the robot to anticipate interactions with all people in the environment. The ASP uses the BHMIP most feasible destination for the accompanied people to obtain the paths of these people, and also the behavior of the robot. Additionally, all the methods derived from the ASP do the same (ASP-VG and ASP-SG). Moreover, our framework dynamically modifies the destination of the group, \({\textbf {D}}_{\textbf {n}}={{\textbf {D}}}_{n} (p_{c_1})={{\textbf {D}}}_{n} (p_{c_2})\), to obtain a more realistic destination that is dynamic, \({{{\textbf {D}}}_{n_d}}\), by including the direction of movement of the accompanied people (Sect. 3.3.3.3). Finally, to keep the most appropriate formation, it is advisable to compute the destination of all group members, considering their position inside the formation of the group. Then, we obtain one destination for each member, \(\{ {{\textbf {D}}}_{n_{d}} (r)\), \({{\textbf {D}}}_{n_{d}}({p_{c_1}})\), \({{\textbf {D}}}_{n_{d}} ({p_{c_2}})\}\), where r, \(c_1\) and \(c_2\) denote the positions of the robot and of the accompanied people, respectively.

ESFM to Predict Future Paths of Pedestrians/Objects:

The ESFM to predict future paths of pedestrians (or moving objects) needs as input the inferred destinations of the BHMIP, the tracks of all the people, the obstacles of the environment, and the planned paths of the robot. In the same way that the robot needs to consider people to plan its movements, the predictions of the movements of all people need to consider the interactions between people and the robot. Therefore, the people predictions are computed simultaneously with the ASP. This method predicts the future path of all pedestrians in a window of 5 seconds, the same window as the ASP local planner, using the ESFM of Sect. 3.2.1.1. We present this method as a subpart of the ESFM of the ASP because the general formulation of the ASP local planner must be introduced first: part of its forces are also used to predict the paths of all people. This method is based on [65].

Methods to Solve Navigation Issues of Groups:

In Sect. 3.3.3, we explain the new abilities of the robot to deal with the uncertainties and randomness of human movements. These abilities are included in both methods of group accompaniment. We have implemented four abilities. The first allows the robot to deal with the occlusion problems of one of the group members. The second permits the robot to deal with changes in the position of the accompanied people inside the formation of the group. The third enables the robot to handle changes in the velocity of the people accompanying it. The fourth allows the robot to cope with differences between the current direction of movement of the group and the estimated destination of the group due to obstacle avoidance or destinations that are not an exact point of the environment, for example, stairs. These methods are explained later, in Sect. 3.3.3, because they need prior information on group accompaniment to be easily understood. This information is included in the sections explaining our two group accompaniment methods, Sect. 3.3.

3.2 Adaptive Social Planner

The ASP method is the core of our system that allows the robot to perform an HRCN, which includes different behaviors such as robot’s navigation, accompaniment, approaching, etc. This method has two main parts. The first one is a combination of the RRT* with the new formulation of the ESFM, included in Sect. 3.2.1. This part shows multiple paths that the robot can use to perform an HRCN. These paths include the social interactions of the robot and its environment using the ESFM. These interactions can be attractive or repulsive depending on the objectives of the task. For example, to go out of one room, we have an attractive force towards the door of this room. Also, when we interact with one person, we have an attractive force to interact with but a repulsive force to maintain one of the social distances defined by Hall [11] depending on our relationship with this person. The second part of the ASP is a new reformulation of the gradient descent optimization of the planned paths in Sect. 3.2.2. This part selects the best path to be used by the robot using a gradient descent optimization of a multi-cost function. The multi-cost function evaluates all paths using geometrical constraints, the work of the robot due to interactions, and human preferences. It includes, for example, the preferences of people in selecting the path that allows them to speak most of the time properly.

Finally, the ASP returns the best path that the robot should pursue and the immediate best motion for the robot to be able to follow this path. This best path may change in the next iterations due to dynamic environments. The basis of the ASP is in the Anticipative Kino-dynamic Planner (AKP) developed by Ferrer [1]. The AKP is enlarged in the ASP method to generalize the AKP method to perform not only single robot navigation but also any HRCN. In addition, we have included other forces that we do not use currently in any of our previously implemented methods. Therefore, the ASP is more than a general methodology for our methods. It can be applied to other human-robot collaborative navigation that we have not developed yet.

3.2.1 RRT* & ESFM to Compute Multiple Paths

The first part of the ASP algorithm uses an RRT*-planner, which propagates all the subject positions using the new formulation of the ESFM to explore all possible ways to arrive at their final objective. The ASP is a local planner embedded as a plugin in the global planner of ROS. We only use the global planner to obtain the projection of the final goal inside our local window because our local planner sends to the global planner the goal where the robot should go. How we select this goal is explained in the BHMIP of Sect. 3.1. This local window is computed from the current position of the robot until 5 seconds into the future, and the planner uses a maximum number of 500 nodes. These values have been selected experimentally to obtain a trade-off between a good anticipation behavior of the robot and the computational cost of the algorithm to allow the robot to behave in real-time. We do not have an exact number of paths because we use a maximum amount of tree nodes.

To calculate the paths, this time of 5 seconds becomes a circular area surrounding the robot, \(C_{area}=h \cdot v_{max}\), where \(h=0.2\) seconds and \(v_{max}\) is the maximum velocity of the robot (here, \(v_{max}=1.2\) m/s). Before starting to compute all paths, the single final destination (\({{{\textbf {D}}}_d}\)= \({{{\textbf {D}}}_n}\) (static) or \({{\textbf {D}}}_{n_{d}}\) (dynamic)) is translated to the circular region of exploration around the subject (person or robot). This translated destination is then converted into several random local goals, which we obtain by random sampling from a Gaussian distribution centered on the final destination translated to the exploration region \(C_{area}\). The covariance of this Gaussian distribution increases as the number of obstacles increases, introducing more randomness in the local goal computation. These random goals introduce a random factor that keeps the planner from falling into local minima. We obtain the subject paths with an RRT* that propagates each node using the ESFM towards these random local goals over the time window.
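The translation of the destination and the sampling of local goals can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the isotropic Gaussian, and the values `base_sigma` and `sigma_gain` (which make the spread grow linearly with the number of obstacles) are hypothetical; the text only states that the covariance increases with the number of obstacles.

```python
import math
import random

def project_to_region(subject, destination, radius):
    """Translate a destination onto the circular exploration region."""
    dx, dy = destination[0] - subject[0], destination[1] - subject[1]
    dist = math.hypot(dx, dy)
    if dist <= radius:
        return destination                  # already inside the region
    scale = radius / dist
    return (subject[0] + dx * scale, subject[1] + dy * scale)

def sample_local_goals(subject, destination, radius, n_obstacles,
                       n_goals=10, base_sigma=0.2, sigma_gain=0.05,
                       rng=random):
    """Sample random local goals around the translated destination.

    base_sigma and sigma_gain are assumed values: the standard deviation
    grows with the number of obstacles, as described in the text.
    """
    center = project_to_region(subject, destination, radius)
    sigma = base_sigma + sigma_gain * n_obstacles
    return [(rng.gauss(center[0], sigma), rng.gauss(center[1], sigma))
            for _ in range(n_goals)]
```

With the values given in the text, the region would be built as `radius = 0.2 * 1.2`, and each sampled goal would then seed one exploration direction of the tree.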

The ESFM makes it possible to include in the subject behavior real attractive and repulsive interactions between the subject and other elements of the environment (people, animals, places, objects, and robots), which are modeled using virtual forces. The first version of the ESFM that we use as a basis appeared in [49], and the SFM first appeared in [20].

We will start presenting all the individual forces that compose the resultant force that we use to propagate the movement of the subjects. Finally, we will present the general equation of the resultant force of the ESFM to implement the ASP.

First, in an environment, the subject can interact with the target places of this environment, where people should go, for example, doors of shops/rooms, workstations, benches, entrances and exits of streets/squares. These interactions can be attractive, to approach these places (Eq. 1), or repulsive, to avoid passing through them (Eq. 2).

Then, the sum of the attractive forces concerning more than one destination of the environment where the subject wants to arrive is defined next:

$$\begin{aligned} {{\textbf {F}}}_{s,D_a}^{ att}({{{\textbf {D}}}_{D_a} (s)})= & {} \sum _{d\in D_a } {{\textbf {f}}}_{s,d}^{ att} ({{{\textbf {D}}}_{d} (s)}) \nonumber \\= & {} \sum _{d\in D_a } k ({{\textbf {v}}}^0_{s}({{{\textbf {D}}}_{d}}(s))- {{\textbf {v}}}_{s}). \end{aligned}$$
(1)

This force assumes that the subject adapts its velocity with a relaxation time \(k^{-1}\) to reach each destination at which it wants to arrive. The subscript \(s\in \{p,r\}\) refers to the subject (person or robot), the subscript \(D_a\) denotes the set of destinations to which the subject is attracted, the subscript d denotes a concrete destination inside this set, and the superscript att denotes an attractive force, because this force attracts the subject towards one destination. \({{{\textbf {D}}}_{D_a} (s)}\) is the set of physical positions in the environment of all the target destinations for this subject. The current velocity of the subject is \({{\textbf {v}}}_{s}\), and \({{{\textbf {v}}}}^0_{s}({{{\textbf {D}}}_{d}}(s))\) is the desired velocity of the subject to arrive at one concrete destination \({{{\textbf {D}}}_{{d}}}(s)\). This destination can be dynamic, \({{{\textbf {D}}}_{n_{d}}}(s)\), or static, \({{\textbf {D}}}_{n}(s)\). To compute a dynamic destination in the case of group accompaniment, we use the static destination of the environment and the group orientation of movement, computed in Sect. 3.3.3.3. In all our works, the dynamic destination is used only for the subjects of the group. We use the static destination of the environment, \({{\textbf {D}}}_{n}(s)\), directly to predict the movements of all other people. This dynamic destination could also be used for all the other people by including some modifications in the algorithm.
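Equation 1 can be evaluated with a few lines of code. The sketch below is illustrative only: it assumes that the desired velocity points straight from the subject to the destination with a given `desired_speed`, which is the usual social-force convention but is not spelled out in the text, and the function names are hypothetical.

```python
import math

def attractive_force(pos, vel, dest, k, desired_speed):
    """Single term of Eq. 1: f = k * (v0 - v).

    v0 is assumed to point from the subject to the destination with
    magnitude `desired_speed`; k is the inverse of the relaxation time.
    """
    dx, dy = dest[0] - pos[0], dest[1] - pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        v0 = (0.0, 0.0)                     # already at the destination
    else:
        v0 = (desired_speed * dx / dist, desired_speed * dy / dist)
    return (k * (v0[0] - vel[0]), k * (v0[1] - vel[1]))

def total_attractive_force(pos, vel, dests, k, desired_speed):
    """Sum of Eq. 1 over the set of attracting destinations D_a."""
    fx = fy = 0.0
    for dest in dests:
        f = attractive_force(pos, vel, dest, k, desired_speed)
        fx, fy = fx + f[0], fy + f[1]
    return (fx, fy)
```

The force vanishes once the subject already moves at the desired velocity, so it only acts as a correction towards the destination.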

\({{\textbf {F}}}_{s,D}^{ rep}\) is the sum of the repulsive forces with respect to different destinations of the environment, defined next:

$$\begin{aligned} {{\textbf {F}}}_{s,D}^{ rep} = \sum _{d\in D } {{\textbf {f}}}^{rep}_{s,d} =\sum _{d\in D } A_{sd} e ^{(d_{sd}-d_{s,d})/B_{sd}} w ( \varphi _{s,d} \hbox {,}\lambda _{sd}) \end{aligned}$$
(2)

Here, each repulsive force is represented as a circular repulsion between the subject and each destination, with an anisotropic factor that accounts for the field of view of the subject. The super-index rep means repulsive force and the subscript D is the set of all destinations from which the subject is repulsed. \(A_{sd}\), \(B_{sd}\), \(\lambda _{sd}\) and \(d_{sd}\) are the parameters of the repulsive interaction between the subject and the destination. \(A_{sd}\) and \(B_{sd}\) denote the strength and range of the interaction force, respectively. \(d_{sd}\) is the sum of the radii of the subject and the destination, which is the minimum distance of proximity between the subject and this destination. \( d_{s,d} = r_d - r_s \) is the real distance between subject and destination. We do not currently use repulsion from destinations, so we have not learned the parameters of this type of repulsive force. Nevertheless, in previous works these parameters were learned for other repulsive forces, namely with respect to people [66], obstacles [67], robots [49] and accompanied people [6]. The repulsion from accompanied people is weaker than the repulsion from people that the robot tries to avoid. Therefore, the parameters must be learned before this force can be used; [6, 49] explain how we learned them. Using these repulsive forces in combination with the RRT*, the subjects can anticipate the interactions with the environment and avoid destinations in advance. Here, the subjects can be robots, people, or animals. They can also avoid other people, robots, and obstacles using analogous repulsive forces.
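A single term of Eq. 2 can be sketched in Python as follows. The form of the anisotropic factor w is not given in the text above, so the sketch assumes the expression commonly used in social force models, \(w = \lambda + (1-\lambda )(1+\cos \varphi )/2\), and it applies the scalar magnitude along the unit vector pointing from the source to the subject. All parameter values and function names are hypothetical.

```python
import math


def anisotropy(cos_phi, lam):
    """Assumed field-of-view factor: 1 in front of the subject, lam behind it."""
    return lam + (1.0 - lam) * (1.0 + cos_phi) / 2.0


def repulsive_force(subject_pos, subject_heading, source_pos, A, B, d_min, lam):
    """One repulsive term A * exp((d_min - dist) / B) * w, pushing away from the source."""
    rx = subject_pos[0] - source_pos[0]
    ry = subject_pos[1] - source_pos[1]
    dist = math.hypot(rx, ry)
    if dist == 0.0:
        return (0.0, 0.0)
    nx, ny = rx / dist, ry / dist  # unit vector from source to subject
    heading_norm = math.hypot(subject_heading[0], subject_heading[1])
    if heading_norm == 0.0:
        cos_phi = 1.0  # no heading available: treat the interaction as isotropic
    else:
        # angle between the heading and the direction towards the source
        cos_phi = (subject_heading[0] * -nx + subject_heading[1] * -ny) / heading_norm
    magnitude = A * math.exp((d_min - dist) / B) * anisotropy(cos_phi, lam)
    return (magnitude * nx, magnitude * ny)


# Hypothetical parameters; the same source placed in front of and behind the subject.
front = repulsive_force((0.0, 0.0), (1.0, 0.0), (1.0, 0.0),
                        A=2.0, B=0.5, d_min=1.0, lam=0.25)
behind = repulsive_force((0.0, 0.0), (1.0, 0.0), (-1.0, 0.0),
                         A=2.0, B=0.5, d_min=1.0, lam=0.25)
```

Under this assumed factor, a source behind the subject produces a weaker push than the same source in front of it, which is the role of the field-of-view term.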

Now, we include all people in the environment, or in a more general way, any animal (dogs, etc.), because we can have similar attractive or repulsive social interactions with animals as well as with people. Here, the force that models the sum of attractive interactions with respect to people, \({{\textbf {F}}}_{s,P_y}^{ att} (D_{s,P_{y}})\), appears in Eq. 3, and the force that models the sum of repulsive interactions with respect to people, \( {{\textbf {F}}}_{s,P}^{ rep}\), appears in Eq. 4. The attractive forces allow the robot to interact with these people in different ways, for example, accompanying and approaching them. The repulsive forces ensure the robot does not invade the social space of any person. \(P_y\) is the set of people who interact with the subject and its subscript y shows the type of interaction. For example, for the group accompaniment, it is \(y=c_i\), which represents the companion people. However, in other works this index can denote an approached person, \(y=ap_i\) [6], or other possible types of interactions. \(D_{s,P_{y}}\) is the set of physical places at which to arrive to interact with these people. Both forces have the same form as the previous ones, but now the interaction is with people and not with destinations. Thus, all their components have the same meaning, except that they refer to people. In Sect. 3.3, we include two concrete examples of customization of Eq. 3 for cases of group accompaniment.

$$\begin{aligned} {{\textbf {F}}}_{s,{P_{y}}}^{ att}({{{{\textbf {D}}}_{s,P_{y}}}})= & {} \sum _{p_y\in P_y } {{\textbf {f}}}_{s,p_{y}}^{ att}({{{\textbf {D}}}_{s,p_{y}}}) \nonumber \\= & {} \sum _{p_y \in P_y } k ({{\textbf {v}}}^0_{s}({{{\textbf {D}}}_{p_y}}(s)) - {{\textbf {v}}}_{s}) \end{aligned}$$
(3)
$$\begin{aligned} {{\textbf {F}}}_{s,P}^{ rep}= & {} \sum _{p\in P } {{\textbf {f}}}^{rep}_{s,p} \nonumber \\= & {} \sum _{p\in P } A_{sp} e ^{(d_{sp}-d_{s,p})/B_{sp}} w ( \varphi _{s,p} \hbox {,}\lambda _{sp}) \end{aligned}$$
(4)

Now, we can include objects in the environment, which act as obstacles if we want to avoid them. Nevertheless, the subject can also be attracted to approach certain objects, such as a table to get a pen. Here the force to model attractive interactions with respect to objects appears, \({{\textbf {F}}}_{s,O_z}^{ att}\), which has a similar form to Eq. 3 but substituting objects for people. Now, \({{{\textbf {D}}}_{s,O_{z}}}\) is the set of ground positions where the subject can interact with different objects. The subscript \(O_z\) represents the set of objects to interact with, and its subscript z represents the type of interaction with respect to these objects. The force to model the sum of repulsive interactions with respect to objects, in order to avoid them, is \( {{\textbf {F}}}_{s,O}^{ rep}\), which has a similar form to Eq. 4 with its corresponding parameters [67] to model this repulsion. These obstacles can be static or dynamic, like bicycles.

Now, we include the robot in the environment to obtain the social interactions of any subject of the environment concerning the robot. Using the same principle as in the previous interactions, we can have a sum of attractive forces with respect to the robots using \({{\textbf {F}}}_{s,R_l}^{ att}\) or repulsive forces using \({{\textbf {F}}}_{s,R}^{ rep}\). The attractive forces are included in the behavior of people that the robot accompanies. The repulsive forces are included in the behavior of people that do not want to interact with the robot. Here, \(R_l\) represents the set of all robots to interact with, and its subscript l represents the type of interaction with respect to the robot, for example, accompaniment or approach. Note that there can be more than one robot, so robot-robot interactions are possible in addition to people-robot interactions.

Finally, we arrive at the general equation of the ESFM used to implement the robot’s ASP, which has the following form:

$$\begin{aligned} \left. \begin{aligned} {{\textbf {F}}}_{s}&= \alpha {{\textbf {F}}}_{s,D_a}^{ att}({{{\textbf {D}}}_{D_a} (s)}) + \zeta {{\textbf {F}}}_{s,D}^{ rep} + \beta {{\textbf {F}}}_{s,P_y}^{ att} (D_{s,P_{y}}) \\&\quad + \gamma {{\textbf {F}}}_{s,P}^{ rep} + \epsilon {{\textbf {F}}}_{s,O_z}^{ att} (D_{s,O_z} ) \\&\quad + \delta {{\textbf {F}}}_{s,O}^{ rep} + \iota {{\textbf {F}}}_{s,R_l}^{ att} (D_{s,R_{l}}) + \nu {{\textbf {F}}}_{s,R}^{ rep} \end{aligned}\right. \end{aligned}$$
(5)

The set \(\{\alpha , \beta , \gamma , \delta , \epsilon , \zeta , \iota , \nu \}\) represents the corresponding weights of the forces. In previous works we have learned only the weights \(\{\alpha , \beta , \gamma , \delta , \nu \}\), because these are the only interactions we have performed with our robot. Our weights and how to learn them are included in [6, 49]. We have only used the attractive force towards destinations, the attractive and repulsive forces with respect to people, the repulsive force for obstacles, and the repulsive force concerning the robot for all the people in the environment. Thus, we have not previously used the attractive forces concerning objects or robots, the attractive and repulsive forces for animals other than people, or the repulsive forces concerning places of the environment. Therefore, the general formulation can support other types of HRCN that we do not consider in our examples. In Sect. 3.3, the reader will find an example of the customization of this part of the ASP to implement two methods of group accompaniment.
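The weighted combination of Eq. 5 can be sketched as below. The sketch only illustrates how force terms that a given behavior does not use drop out when they have no weight; the term names, weights, and force values are hypothetical and are not the learned values of [6, 49].

```python
def combine_forces(weights, forces):
    """Weighted sum of 2D force terms in the spirit of Eq. 5.

    Terms without a weight are treated as unused (weight 0).
    """
    total_x, total_y = 0.0, 0.0
    for name, (fx, fy) in forces.items():
        w = weights.get(name, 0.0)
        total_x += w * fx
        total_y += w * fy
    return (total_x, total_y)


# Hypothetical weights for three of the eight terms.
weights = {"att_dest": 1.0, "rep_people": 2.0, "rep_obstacles": 0.5}

# Hypothetical force values; "att_objects" has no weight, so it is ignored.
forces = {
    "att_dest": (1.0, 0.0),
    "rep_people": (-0.2, 0.1),
    "rep_obstacles": (0.0, -0.4),
    "att_objects": (5.0, 5.0),
}

resultant = combine_forces(weights, forces)
```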

Fig. 3

All forces that we envision to be combined with the RRT*. Forces concerning obstacles are included in gray for attractive and black for repulsive. Forces with respect to people are included in light green for attractive and dark green for repulsive. Forces concerning robots are drawn in light purple for attractive and dark purple for repulsive. Forces with respect to destinations are in light blue for attractive and dark blue for repulsive. All also include the resultant force for robots and people in red. The time window of the local planner for the robot is a dashed black circle, and the possible paths for the robot are in orange and for people in blue; where the best path is included in red for the robot and in dark blue for people. The context of the three situations is included in the text that references this image

An example of all the forces that we envision is included in Fig. 3. Figure 3-Left includes a robot behavior to approach a glass of water, with an attractive gray force towards the ground position from which to interact with the glass and two black repulsive forces: one to avoid the chair and the other to avoid colliding with the table. Figure 3-Center includes a robot behavior to approach an animal, with a light green attractive force towards the cat and two repulsive forces. The first repulsive force, in dark green, maintains a social distance with respect to the cat, and the second, in dark blue, avoids a place of the environment, namely a hole. Figure 3-Right shows a group accompaniment behavior of the robot, while one person is approaching the robot to take some pictures and another person is trying to avoid the group, both with their respective attractive and repulsive forces. These other people have different forces, for example, a light purple force to approach the robot, a dark purple force to prevent the robot from colliding with them, and a light blue force to arrive at an environment destination. The robot, in turn, has a light green attractive force and a dark green repulsive force with respect to its companions. The attractive forces serve to accompany them, and therefore they are directed towards the future paths of those people. Also, the robot has an attraction to the group destination, which is shown as a blue arrow in the image. However, in our ASP-SG method, the attraction of the robot to the group destination is included inside the attractive forces towards the paths of the companions, because the paths of the accompanied people already include a light blue attractive force to the group destination. Finally, there are repulsive forces between all the people in the environment. It is essential to notice that we only show the forces that propagate the first step of the best path, but these forces exist for all the paths and all their steps. We include different possible paths in the images to give a more realistic representation of the method, but the real number of paths can differ from this representation. A more realistic image is included in Fig. 9 for our simulation experiments, although it only shows some of the best paths, not all the paths computed by the algorithm.

The reader should notice that we cannot merge all forces into a single attractive force and a single repulsive force, because the parameters and even the formula for the forces can differ. This is because people do not interact with objects in the same way as with people, animals, places, or robots.

3.2.1.1 ESFM to Predict Future Paths of Pedestrians/Objects

To predict people/object movements using the ESFM, we only need to know the virtual interaction forces that these people have. Our method uses only the ESFM of Eq. 6 to predict the path of all other pedestrians and the accompanied people in the ASP-VG method. However, in the ASP-SG method, Eq. 6 is combined with the RRT*, similar to the robot case.

$$\begin{aligned} \left. \begin{aligned} {{\textbf {F}}}_{p_{i}}&= \alpha {{\textbf {f}}}_{p_{i},d}^{ att}({{{\textbf {D}}}_{d} ({p_{i}})}) + (\gamma {{\textbf {F}}}_{p_{i},P}^{ rep} + \delta {{\textbf {F}}}_{p_{i},O}^{ rep} + \nu {{\textbf {F}}}_{p_{i},R}^{ rep}) + \\&\quad + {{\textbf {F}}}_{p_{c_t},R_l}^{ att} ({{{\textbf {D}}}_{p_{c_t},R_l}}), \end{aligned}\right. \end{aligned}$$
(6)

where we use \({{\textbf {f}}}_{p_{i},d}^{ att}({{{\textbf {D}}}_{d}({p_{i}}))}\), because we expect only one attractive destination for all people. In the case of companions this destination is dynamic, \({{{\textbf {D}}}_{n_d}({p_{c_t}})}\), and for all other people it is static, \({{{\textbf {D}}}_{n}({p_{p}})}\). The subscript i refers to all people, \(i \in \{p, {c_t}\}\), where \({c_t}\) refers to all companions; for the concrete case of two companions, \(c_t \in \{{c_1}, {c_2}\}\). \( {{\textbf {F}}}_{p_i,P}^{ rep}\) are the repulsive forces to avoid all other people in the environment. \( {{\textbf {F}}}_{p_i,O}^{ rep}\) are the repulsive forces to avoid obstacles. \({{\textbf {F}}}_{p_i,R}^{ rep}\) differs between people who want to avoid the robot and people who interact with it. In the case of the companions, to simplify, we can include in Eq. 6 the additional attractive term \(\iota {{\textbf {F}}}_{p_{c_t},R_l}^{ att}\) by diminishing the repulsion with respect to the robot for people that interact with it. In our case, we have only one robot.
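To make the prediction step concrete, the sketch below propagates a pedestrian forward in time from a resultant force such as that of Eq. 6. The integration scheme is an assumption, since it is not specified above: explicit Euler steps, unit mass, and a cap on the speed. The function names, the time step, and the example force (only the attraction to one static destination) are hypothetical.

```python
import math


def propagate(position, velocity, force_fn, dt, steps, max_speed):
    """Predict a path by explicit Euler integration of a force (unit mass assumed)."""
    x, y = position
    vx, vy = velocity
    path = [(x, y)]
    for _ in range(steps):
        fx, fy = force_fn((x, y), (vx, vy))
        vx += fx * dt
        vy += fy * dt
        speed = math.hypot(vx, vy)
        if speed > max_speed:  # keep the predicted speed physically plausible
            vx *= max_speed / speed
            vy *= max_speed / speed
        x += vx * dt
        y += vy * dt
        path.append((x, y))
    return path


def goal_force(position, velocity, goal=(5.0, 0.0), k=2.0, desired_speed=1.0):
    """Hypothetical resultant force: only the attraction to a static destination."""
    dx, dy = goal[0] - position[0], goal[1] - position[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (-k * velocity[0], -k * velocity[1])
    return (k * (desired_speed * dx / dist - velocity[0]),
            k * (desired_speed * dy / dist - velocity[1]))


# Predict 2 s of motion for a pedestrian starting at rest at the origin.
path = propagate((0.0, 0.0), (0.0, 0.0), goal_force, dt=0.1, steps=20, max_speed=1.5)
```

In a full predictor, `goal_force` would be replaced by the sum of the attractive and repulsive terms of Eq. 6 evaluated at each step.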

3.2.2 Gradient Descent Optimization of Planned Paths

There are different feasible paths to perform the collaborative navigation, but we must select only one of them. This second part of the algorithm explains how we evaluate all possible paths and select the best one to obtain the general collaborative navigation of the robot within the ASP. To evaluate all paths, the ASP performs a Gradient Descent Optimization of a multi-cost function, which optimizes all our criteria included in Eq. 7. With this optimization, we obtain the path with minimum cost regarding criteria related to navigation OBjectives \(J_{OB}\), People Interactions \(J_{PI}\), People’s Preferences \(J_{PP}\), Object Interactions \(J_{OI}\), and Robot Interactions \(J_{RI}\). Although not all robot behaviors (solo navigation, one-person or group accompaniment, approaching, etc.) need to include all these criteria to select a path, we provide a general formulation here. For example, the robot’s group accompaniment includes a cost to select the path that allows the group to maintain the side-by-side or V-formation most of the time, which allows them to speak while walking. However, the accompaniment cost is unnecessary when the robot approaches a human without accompanying anyone.

$$\begin{aligned} {{\textbf {J}}} (S, s_{goal},U)= [ \textit{J}_{OB}, \textit{J}_{PI}, \textit{J}_{PP}, \textit{J}_{OI}, \textit{J}_{RI}]. \end{aligned}$$
(7)

Then, \(\textit{J}_{OB}=\sum _{i \in OB } \textit{J}_{ob_{i}} \) is a weighted sum of costs that evaluate different navigation OBjectives. For example, in our planners, it is \(\sum _{i \in OB } \textit{J}_{ob_{i}}=\textit{J}_{d} + \textit{J}_{or} + \textit{J}_{r}\). The function has this form because we want the minimum walking distance (\(\textit{J}_{d}\)), the minimum changes in orientation (\(\textit{J}_{or}\)), and the minimum effort to control the robot (\(\textit{J}_{r}\)), which is related to the force of attraction to reach the destination. The subscript d indicates destinations, the subscript or orientations, and r robot for the cost to control the robot. In this section, all these sub-indices are related to the name of the cost. In the sum, the subscript i iterates over the set of navigation objectives. The subscripts of the sums that follow play the same role.

Moreover, \(\textit{J}_{PI}=\sum _{j \in PI } \textit{J}_{pi_{j}}\) is a weighted sum of costs related to all interactions with people inside the dynamic environments. For the group accompaniment, it is \(\sum _{j \in PI } \textit{J}_{{pi}_{j}}=\textit{J}_{p}+\textit{J}_{p_c}\), which includes the cost to avoid people that the robot does not want to interact with (\(\textit{J}_{p}\)), and the cost to not invade the personal space of the accompanied people (\(\textit{J}_{p_c}\)).

Next, \(\textit{J}_{PP}=\sum _{k \in PP } \textit{J}_{pp_{k}}\) is a weighted sum of costs related to all the preferences of people interacting with the robot. For the accompaniment, it can be \(\textit{J}_{PP}= \textit{J}_{c}\) to select the best path, which includes the preferences of people to maintain the most comfortable formation to have a communicative interaction among the group members. This cost of the accompaniment can be considered a preference of the people since it refers to the choice of the path that allows them to maintain a specific formation to better communicate with each other. However, in other situations, they may prefer to break this formation to arrive at their destination faster, for example, to reach the boarding gate at an airport.

Furthermore, \(\textit{J}_{OI}=\sum _{l \in OI } \textit{J}_{oi_{l}}\) is a weighted sum of costs related to all interactions with objects of the environment. In our works, we explore the repulsive interactions to avoid obstacles (\(\sum _{l \in OI } \textit{J}_{oi_{l}}=\textit{J}_{o}\)). However, other attractive interactions can also exist regarding different objects of the environment that may have a cost similar to the previously presented cost of \(J_c\) for the accompanied people.

Finally, \(\textit{J}_{RI} = \sum _{m \in RI } \textit{J}_{{ri}_{m}}\) is a weighted sum of costs related to all interactions with the robots of the environment. In all our works, all the people in the environment have a repulsive force with respect to the robot, and the companions also have an attractive force with respect to the robot in order to interact with it.

Now, if we customize the general formula with only the previously introduced costs as examples, we get Eq. 8, which applies only to one-person accompaniment. The first five costs were introduced in [1], and the companion cost was defined in [6]. The companion cost was first defined for the Adaptive Social Planner using a Side-by-side model to accompany an Individual person (ASP-SI). Furthermore, we explain the customization of the costs for the two methods of group accompaniment in Sect. 3.3.1.2 for the ASP-VG and in Sect. 3.3.2.2 for the ASP-SG, because these methods include other sub-costs of the ASP that should not be taken into account in the accompaniment of a single person. Therefore, we have not explained them previously.

$$\begin{aligned} {{\textbf {J}}} (S, s_{goal},U)= [ \textit{J}_{d}, \textit{J}_{or}, \textit{J}_{r}, \textit{J}_{p} , \textit{J}_{o}, \textit{J}_{c_t}] \end{aligned}$$
(8)

Finally, the computation of all the costs of the paths is done in three steps. First, the robot computes each individual cost in each step of the path. Second, to avoid the weighted sum method’s scaling effect, each cost function is normalized between \((-1,1)\) using the mean and variance of an erf function, which are calculated after the computation of all the paths. Third, a projection via weighted sum \(J:{\mathbb {R}}^{n} \rightarrow {\mathbb {R}}\) is obtained giving the weighted cost formula. After the computation of each cost for each path, we perform a Gradient Descent Optimization to obtain the path with minimum cost as the best one to do the HRCN. For an extended explanation of this cost computation, see [1].
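The three steps above can be sketched as follows. The exact normalization is detailed in [1]; the sketch assumes the form \(\hbox {erf}((x-\mu )/(\sigma \sqrt{2}))\), with \(\mu \) and \(\sigma ^2\) the mean and variance of each cost over all computed paths, and it replaces the optimization step by a plain minimum over the discrete set of candidate paths. The cost names, weights, and values are hypothetical.

```python
import math


def normalize_costs(values):
    """Map raw costs into (-1, 1) with an erf centred on the mean of all paths."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    if variance == 0.0:
        return [0.0] * n  # every path has the same cost: no preference
    scale = math.sqrt(2.0 * variance)
    return [math.erf((v - mean) / scale) for v in values]


def select_best_path(path_costs, weights):
    """Normalize each cost over all paths, project by weighted sum, pick the minimum."""
    names = list(weights)
    normalized = {
        name: normalize_costs([costs[name] for costs in path_costs])
        for name in names
    }
    totals = [
        sum(weights[name] * normalized[name][i] for name in names)
        for i in range(len(path_costs))
    ]
    best_index = min(range(len(totals)), key=totals.__getitem__)
    return best_index, totals


# Hypothetical accumulated costs for three candidate paths.
path_costs = [
    {"J_d": 10.0, "J_or": 2.0, "J_p": 0.5},
    {"J_d": 8.0, "J_or": 1.0, "J_p": 0.1},
    {"J_d": 12.0, "J_or": 3.0, "J_p": 0.9},
]
weights = {"J_d": 1.0, "J_or": 0.5, "J_p": 2.0}  # hypothetical weights

best_index, totals = select_best_path(path_costs, weights)
```

Because each cost is normalized before weighting, a cost with a large numeric range, such as a distance in meters, does not dominate the weighted sum by scale alone.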

3.3 ASP Customization for Group Accompaniment

We have customized the ASP to obtain two methods for group accompaniment, the ASP-VG and the ASP-SG. The implementation of both methods was necessary in order to know human preferences regarding group accompaniment. In our cases of group accompaniment, we have reduced the ASP forces of the behavior of the robot to only the forces related to attraction to destinations of the environment (\({{\textbf {F}}}_{s,D_a}^{ att}({{{\textbf {D}}}_{D_a} (s)})\)), repulsive (\( {{\textbf {F}}}_{s,P}^{ rep}\)) and attractive (\({{\textbf {F}}}_{s,P_y}^{ att} (D_{s,P_{y}})\)) forces with respect to people, and repulsive forces with respect to obstacles (\( {{\textbf {F}}}_{s,O}^{ rep}\)). For people predictions, we include the forces related to robot interactions: attractive (\( {{\textbf {F}}}_{p_{c_t},R_l}^{ att} ({{{\textbf {D}}}_{p_{c_t},R_l}})\)) and repulsive (\( {{\textbf {F}}}_{s,R}^{ rep}\)). In other methods of robot accompaniment, the reader can also include repulsive forces with respect to destinations, or other forces that model other types of interactions. Regarding the costs of the paths of the robot, we only use the costs related to navigation objectives (\(\textit{J}_{OB}\)), interactions with people (\(\textit{J}_{PI}\)), people’s preferences to select the best path for maintaining the formation of the group (\(\textit{J}_{PP}\)), and object interactions, but only the repulsive ones (\(\textit{J}_{OI}\)). Readers who implement methods of people simulation that evaluate different paths will also need to include the costs related to the robot interactions (\(\textit{J}_{RI}\)).

3.3.1 ASP-VG Method

The ASP-VG method allows the robot to accompany a group of people using a V-formation. It is a sub-method of the ASP that uses only the required forces and costs involved in a V-formation group accompaniment. Thus, the method uses the general structure of the ASP method customized for the V-formation group accompaniment. The modifications in the ESFM to accomplish this group accompaniment are included in Sect. 3.3.1.1 and the customization of the costs to evaluate the paths for the V-formation accompaniment in Sect. 3.3.1.2. Furthermore, this ASP-VG method has been presented before in [3], which incorporates the work of Zanlungo et al. [68]. Also, a block diagram of the structure for this particular method is included on the current website of the paper\(^2\).

3.3.1.1 ESFM of ASP-VG

This section explains the customization of the ESFM from the general one of the ASP, to compute all the planning paths that allow the robot to accompany groups of people using a V-formation (Eq. 9). The basis of this method is the same as the basis of the ASP [1, 6], but with a few improvements, one of which is the inclusion of the V-form pedestrian model [68]. Then, the robot plans all the possible paths to accompany the group using a combination of the RRT* with the ESFM, as explained in Sect. 3.2.1, but changing the final ESFM formula to Eq. 9, which includes only the forces needed to implement this group accompaniment.

$$\begin{aligned} \begin{aligned} {{\textbf {F}}}_{r}&= \alpha {{\textbf {f}}}_{r,d}^{ att} + \beta {{\textbf {F}}}_{r,p_{c_t}}^{ att} + \\&\quad + \gamma ( {{\textbf {F}}}_{r,P}^{ rep} +{{\textbf {f}}}^{rep}_{r,p_{c_1}} +{{\textbf {f}}}^{rep}_{r,p_{c_2}}) + \delta {{\textbf {F}}}_{r,O}^{ rep} \end{aligned} \end{aligned}$$
(9)

As stated previously, the parameters \(\alpha \), \(\beta \), \(\gamma \) and \(\delta \) were learned as described in [49]. For the ASP-VG, we use only the following subset of the ASP forces. First, the attractive force towards the group destination, which is a dynamic destination, \( {{\textbf {f}}}_{r,d}^{ att}({{{\textbf {D}}}_{n_{d}}})\). Second, the repulsive forces with respect to people (\({{\textbf {F}}}_{r,P}^{ rep} \)) and obstacles (\({{\textbf {F}}}_{r,O}^{ rep}\)), which are computed analogously to the repulsive forces presented previously for destinations and people in Eqs. 2 and 4. Third, the repulsive forces with respect to the accompanied people, \({{\textbf {f}}}^{rep}_{r,p_{c_1}}\) and \({{\textbf {f}}}^{rep}_{r,p_{c_2}}\), which are defined like the previous ones and whose parameters are learned in [6]. These repulsive forces need to be different from the previous repulsive forces with respect to other people because the robot is interacting with the accompanied people and not avoiding them. Finally, we use the attractive forces with respect to the accompanied people, which include the V-form method, Eq. 10. These forces do not include an attractive destination because they are based on potential fields and do not have a concrete physical destination towards which the robot is attracted.

$$\begin{aligned} {{\textbf {F}}}_{r,{\mathcal {P}}_{c_t}}^{ att}= \sum _{k \in {\mathcal {P}}_{c_t}} {{\textbf {f}}}_{r,k}^{ att}, \end{aligned}$$
(10)

where r denotes the robot, p all pedestrians except companions, and \({\mathcal {P}}_{c_t} = \{ p_{c_1}, p_{c_2}\}\) the two people accompanied by the robot. The force \({{\textbf {f}}}_{r,k}^{ att}\) can be \({{\textbf {f}}}_{r,p_{c_1}}^{ att} = {{\textbf {f}}}^{first}_{ij}\) and \({{\textbf {f}}}_{r,p_{c_2}}^{ att} = {{\textbf {f}}}^{first}_{ij}\) or \({{\textbf {f}}}^{second}_{ij}\), depending on the accompanied person to whom it refers and on the position of the robot inside the formation of the group. When the robot is at the lateral of the formation of the group, \(c_1\) refers to the nearest person, and \(c_2\) to the furthest one. These forces are given in Eqs. 12 and 13.

To compute these attractive forces, we include the (discomfort) potential\(^3\) introduced in [68], which describes the dynamics of socially interacting pedestrian groups. These pedestrians feel some discomfort when they are not located in the optimal position for social interaction, which is modelled with the (discomfort) potential of Eq. 11.

$$\begin{aligned} \begin{aligned} U^\eta (r_{ij},\theta _{ij})&= R(r_{ij}) + {\varTheta }^\eta (\theta _{ij}),\\ R(r)&=C_r \left( \frac{r}{r_0}+\frac{r_0}{r}\right) ,\\ {\varTheta }^\eta (\theta )&=C_\theta \left( (1+\eta ) \theta ^2\right. +\left. (1-\eta ) (\theta -\hbox {sign}(\theta ) \pi )^2\right) , \end{aligned} \end{aligned}$$
(11)

where the relative position between two socially interacting pedestrians i and j is \({{\textbf {r}}}_{ij}\equiv {{\textbf {r}}}_i-{{\textbf {r}}}_j=(r_{ij},\theta _{ij})\), and where \(\theta =0\) gives the direction to the pedestrian’s goal. \(r_0\) is the most comfortable interaction distance (with our particular Tibi-robot, it was 1.5 m at the beginning of the experiments, and 1 m in the final experiments). \(C_r\) and \(C_\theta \) are two weights: \(C_r\) weights the discomfort potential with respect to the distances between the members of the group, and \(C_\theta \) weights it with respect to the orientation between the group members. \(-1 \le \eta < 0\) is a normalization parameter related to the intensity of social interaction. \(C_r\) and \(C_\theta \) are also related to the curvature of the potential at its minimum: \(C_r\) to the curvature in the r direction, and \(C_\theta \) to the curvature in the \(\theta \) direction. For example, if \(C_\theta = \frac{C_r}{2}\) the potential has a circular shape near its minimum. For more information, see [68]. Regarding this potential, the radial term R ensures that the pedestrians keep a separation close to \(r_0\), while the angular potential \({\varTheta }^{\eta }\) allows them to keep both their interaction partner and their walking goal in sight (the more negative \(\eta \) is, the more pedestrians will try to have interaction partners in their vision field). Also, in [10], the nearest people are defined to interact through the force of Eq. 12, while the furthest people interact taking into account only distance, using Eq. 13.
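For reference, Eq. 11 can be evaluated directly as in the Python sketch below. The value \(r_0 = 1.5\) m is the one quoted above for the first experiments; the values of \(C_r\), \(C_\theta \) and \(\eta \), as well as the function names, are hypothetical. The sign function returns 0 at \(\theta = 0\), which is an implementation choice of this sketch.

```python
import math


def sign(x):
    """Sign of x: -1, 0 or 1 (0 at x == 0, a choice made for this sketch)."""
    return (x > 0) - (x < 0)


def radial_potential(r, r0, c_r):
    """R(r) = C_r * (r / r0 + r0 / r), minimal at r = r0."""
    return c_r * (r / r0 + r0 / r)


def angular_potential(theta, eta, c_theta):
    """Theta^eta(theta) of Eq. 11."""
    return c_theta * ((1.0 + eta) * theta ** 2
                      + (1.0 - eta) * (theta - sign(theta) * math.pi) ** 2)


def discomfort_potential(r, theta, r0, c_r, c_theta, eta):
    """U^eta(r, theta) = R(r) + Theta^eta(theta)."""
    return radial_potential(r, r0, c_r) + angular_potential(theta, eta, c_theta)


# Hypothetical weights; for theta > 0 the angular term as written in Eq. 11
# has zero derivative at theta = (1 - eta) * pi / 2.
R0, C_R, C_THETA, ETA = 1.5, 1.0, 0.5, -0.5
theta_min = (1.0 - ETA) * math.pi / 2.0
u_min = discomfort_potential(R0, theta_min, R0, C_R, C_THETA, ETA)
```

Evaluating the potential away from \(r_0\) or from `theta_min` gives a larger value, which is what the companion cost of the ASP-VG exploits.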

In this work, the robot is accompanying two people. However, they can be located at short or long distance with respect to the robot, depending on the position of the robot inside the formation of the group. Then, the attractive force to accompany \(p_{c_1}\), which is always nearest to the robot, is described by:

$$\begin{aligned} {{\textbf {f}}}^{att}_{r,p_{c_1}}={{\textbf {f}}}^{first}_{r,p_{c_1}}=-{\nabla }_i U^\eta ({{\textbf {r}}}_{r,p_{c_1}}), \end{aligned}$$
(12)

and the attractive force to accompany \(p_{c_2}\) is described next:

$$\begin{aligned} {{\textbf {f}}}^{att}_{r,p_{c_2}}= \left\{ \begin{array}{lcc} {{\textbf {f}}}^{first}_{r,p_{c_2}}=-{\nabla }_i U^\eta ({{\textbf {r}}}_{r,p_{c_2}}) \hbox {; robot in center} \\ {{\textbf {f}}}^{second}_{r,p_{c_2}}=-\frac{1}{2} {\nabla }_i R(r_{r,p_{c_2}}) \hbox {; robot at lateral} \end{array} \right. \end{aligned}$$
(13)
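The forces of Eqs. 12 and 13 are negative gradients of the potential with respect to the robot position. A minimal Python sketch follows, which approximates the gradient by central finite differences instead of deriving it analytically; this numerical shortcut is our simplification for illustration. The angle \(\theta \) is measured from an assumed goal direction given as an angle in the world frame, and all parameter values except \(r_0\) are hypothetical.

```python
import math

R0, C_R, C_THETA, ETA = 1.5, 1.0, 0.5, -0.5  # C_R, C_THETA, ETA are hypothetical


def radial(r):
    return C_R * (r / R0 + R0 / r)


def angular(theta):
    s = (theta > 0) - (theta < 0)
    return C_THETA * ((1.0 + ETA) * theta ** 2
                      + (1.0 - ETA) * (theta - s * math.pi) ** 2)


def polar(robot, person, goal_angle):
    """Relative position r_i - r_j in polar form, theta = 0 along the goal direction."""
    dx, dy = robot[0] - person[0], robot[1] - person[1]
    theta = math.atan2(dy, dx) - goal_angle
    theta = math.atan2(math.sin(theta), math.cos(theta))  # wrap to (-pi, pi]
    return math.hypot(dx, dy), theta


def negative_gradient(fn, point, h=1e-6):
    """-grad fn at point, by central finite differences."""
    x, y = point
    gx = (fn((x + h, y)) - fn((x - h, y))) / (2.0 * h)
    gy = (fn((x, y + h)) - fn((x, y - h))) / (2.0 * h)
    return (-gx, -gy)


def first_force(robot, person, goal_angle):
    """f^first = -grad U: distance and angle both matter (Eq. 12)."""
    def potential(pos):
        r, theta = polar(pos, person, goal_angle)
        return radial(r) + angular(theta)
    return negative_gradient(potential, robot)


def second_force(robot, person):
    """f^second = -(1/2) grad R: distance only (Eq. 13, robot at lateral)."""
    def half_radial(pos):
        return 0.5 * radial(math.hypot(pos[0] - person[0], pos[1] - person[1]))
    return negative_gradient(half_radial, robot)


# Robot too far from the far companion: the force pulls it closer.
pull = second_force((3.0, 0.0), (0.0, 0.0))
# Robot too close: the force pushes it away.
push = second_force((0.75, 0.0), (0.0, 0.0))
# Robot at the minimum of the potential: the force vanishes.
theta_min = (1.0 - ETA) * math.pi / 2.0
rest = first_force((R0 * math.cos(theta_min), R0 * math.sin(theta_min)),
                   (0.0, 0.0), goal_angle=0.0)
```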

3.3.1.2 Optimization of the Planned Paths of ASP-VG

This section explains the customization of the costs to evaluate all paths and select the best to obtain the best V-form group accompaniment inside dynamic environments. The ASP-VG method only changes the costs related to people’s preferences \(J_{PP}\). That is to say, the companion cost \(J_{c_t}\) for both accompanied people, included in Eq. 14. To compute this cost, we use the discomfort potential [68]. This potential can be used because it evaluates the cost of breaking the accompaniment formation.

$$\begin{aligned} \begin{aligned} J_{c_t} = U^\eta (r_{ij},\theta _{ij})=&R(r_{ij}) + {\varTheta }^\eta (\theta _{ij}). \end{aligned} \end{aligned}$$
(14)

3.3.2 ASP-SG Method

The ASP-SG method allows the robot to accompany a group of people using a Side-by-Side formation. It is a sub-method of the ASP that uses only the required forces and costs involved in a Side-by-Side group accompaniment. Thus, the method uses the general structure of the ASP method customized for the Side-by-Side group accompaniment. The customization of the ESFM to perform this side-by-side group accompaniment is included in Sect. 3.3.2.1 and the customization of the costs to evaluate all the paths for the side-by-side accompaniment is shown in Sect. 3.3.2.2. Furthermore, this ASP-SG method has been presented before in [4]. In addition, a block diagram of the structure for the ASP-SG method is included on the website of the paper\(^2\).

3.3.2.1 ESFM of ASP-SG

This section explains the customization of the ESFM from the general one of the ASP to compute all the planning paths that allow the robot to accompany groups of people using a Side-by-Side formation, that is, the ASP-SG method.

The ASP-SG method has two possible positions of the robot inside the group. In the first one, the robot is located at the lateral of the formation. Here, the robot uses the ASP-SI [6] to accompany the group because we consider that the robot only interacts with the central person and we expect that the lateral person maintains the formation. In the second position, the robot is located in the center of the formation of the group, where it can interact with both people. This section will focus on the method used to accompany a group when the robot is located in the center.

To obtain the best robot accompaniment, we need to know all possible paths that both accompanied people may follow inside the time window of the planner. These paths are obtained by combining the RRT* with the ESFM of Eq. 6, which is the ESFM to propagate the people. Once we have all the paths for the companions of the robot, the robot can use them to compute all of its possible paths for escorting the group by using the RRT* combined with the ESFM of Eq. 15. Also, the method computes all the paths for the accompanied people and all the paths for the robot simultaneously to reduce the computational cost of the algorithm. Fig. 3-Right shows an example of the paths of people in blue and the paths of the robot in orange. The best path for people is drawn in dark blue and for the robot in red.

$$\begin{aligned} \begin{aligned} {{\textbf {F}}}_{r}&= \alpha {{\textbf {F}}}_{r,{p_{c_t}}}^{ att}({{{\textbf {D}}}_{r,p_{c_t}}}) + \gamma ({{\textbf {F}}}_{r,P}^{ rep}+ {{\textbf {f}}}^{rep}_{r,p_{c_1}} + \\&\quad +{{\textbf {f}}}^{rep}_{r,p_{c_2}} + {{\textbf {F}}}_{p_{c_t},P}^{ rep}) + \delta ({{\textbf {F}}}_{r,O}^{ rep} + {{\textbf {F}}}_{p_{c_t},O}^{ rep}) \end{aligned} \end{aligned}$$
(15)

where \(\alpha \), \(\gamma \) and \(\delta \) are the same as in Eq. 6. The repulsive forces with respect to people and obstacles are analogous to the repulsive forces explained in Eqs. 2 and 4, but now applied to this case. Also, \({{\textbf {f}}}^{rep}_{r,p_{c_1}}\) and \({{\textbf {f}}}^{rep}_{r,p_{c_2}}\) are the repulsive forces between the robot and its companions, which are defined in the same way as the other repulsive forces, but with the different parameters given in [6].

This method combines two types of robot attractions using the force of Eq. 16. The first attractive force is to accompany the group, and the second is to arrive at the final destination. So we are combining these two attractive forces because the robot uses the steps of all the planned paths for the accompanied people as attractive goals for the forces of the accompaniment, and the paths of the people use the group destination as an attractive force to create their paths.

$$\begin{aligned} {{\textbf {F}}}_{r,{p_{c_t}}}^{ att}({{{{\textbf {D}}}_{r,p_{c_t}}}}) = {{\textbf {f}}}_{r,p_{c_1}}^{ att}({{{\textbf {D}}}_{r,p_{c_1}}}) + {{\textbf {f}}}_{r,p_{c_2}}^{ att}({{{\textbf {D}}}_{r,p_{c_2}}}), \end{aligned}$$
(16)

where \({{\textbf {f}}}_{r,p_{c_t}}^{ att}({{{\textbf {D}}}_{r,p_{c_t}}}) \), \(t=1,2\), are the two attractive forces towards the next step of the planned positions of each accompanied person, \(P_{c_1}\) and \(P_{c_2}\). These attractive forces have a form analogous to the force defined in Eq. 3.

We want to obtain an “intelligent” and anticipatory robotic behavior that facilitates the navigation of the people accompanying it. Therefore, we include in the behavior of the robot the repulsive interaction forces of the companions with respect to other people or obstacles of the environment by using the forces described in Eq. 17. These forces allow the robot to increase the personal space for the companions in order to avoid people and obstacles. These forces are omitted in the example of Fig. 3 to simplify the comprehension of the general formulation for the forces.

$$\begin{aligned} {{\textbf {F}}}_{p_{c_t},P}^{ rep}= \sum _{j\in P } \sum _{k\in {\mathcal {P}}_{c_t}} {{\textbf {f}}}^{rep}_{k,j} , \hbox { } \hbox { } \hbox { } {{\textbf {F}}}_{p_{c_t},O}^{ rep}= \sum _{o\in O } \sum _{k\in {\mathcal {P}}_{c_t}} {{\textbf {f}}}^{rep}_{k,o} \end{aligned}$$
(17)

\({\mathcal {P}}_{c_t} \in \{ p_{c_1}, p_{c_2}\}\) contains all the accompanied people of the formation and \({\mathcal {P}}\) is the set that contains all people except companions. \({{\textbf {f}}}^{rep}_{k,j}\) and \({{\textbf {f}}}^{rep}_{k,o}\) are the repulsive interaction forces between the accompanied people, and other pedestrians and obstacles in the environment.
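The way Eqs. 15 to 17 assemble the robot force can be illustrated with a minimal Python sketch. It only shows the weighted summation: the individual attractive and repulsive forces are assumed to be already computed (by Eqs. 1, 3 and 4, which are not reproduced here), and the function names, argument names and weight values are hypothetical.

```python
def vec_sum(vectors):
    """Sum a sequence of 2-D vectors given as (x, y) tuples."""
    sx, sy = 0.0, 0.0
    for x, y in vectors:
        sx += x
        sy += y
    return (sx, sy)


def companions_repulsion(pairwise_forces):
    """Eq. 17: pairwise_forces[k][j] is the repulsive force f_{k,j} acting on
    companion k due to pedestrian (or obstacle) j; all of them are summed."""
    return vec_sum(f for row in pairwise_forces for f in row)


def robot_force(att_c1, att_c2, rep_robot_people, rep_c1, rep_c2,
                rep_comp_people, rep_robot_obst, rep_comp_obst,
                alpha, gamma, delta):
    """Eq. 15: weighted sum of attraction, people repulsion and obstacle repulsion."""
    att = vec_sum([att_c1, att_c2])                                   # Eq. 16
    people = vec_sum([rep_robot_people, rep_c1, rep_c2, rep_comp_people])
    obst = vec_sum([rep_robot_obst, rep_comp_obst])
    return (alpha * att[0] + gamma * people[0] + delta * obst[0],
            alpha * att[1] + gamma * people[1] + delta * obst[1])
```

Passing the companions' repulsion terms explicitly makes the anticipatory behavior visible: a pedestrian that pushes a companion also pushes the robot, even when the pedestrian is far from the robot itself.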

3.3.2.2 Optimization of the Planned Paths for ASP-SG

This section explains the customization of the costs to evaluate all the paths and select the best one to obtain the best Side-by-Side group accompaniment inside dynamic environments.

In the ASP-SG, we can use the multi-cost function of Eq. 8 for one-person accompaniment. However, we need to reformulate the costs related to the preferences of people for accompaniment (the accompaniment cost: \(\textit{J}_{c_t}\)), one of the costs that are related to navigation objectives (the cost to control the robot: \(\textit{J}_{r}\)), and the costs related to repulsive interactions with respect to people (\(\textit{J}_{p}\)) and with respect to obstacles (\(\textit{J}_{o}\)).

Now, the cost to control the robot, \(\textit{J}_{r}\), includes the cost related to the two attractive forces (\({{\textbf {u}}}_{r-p_{c_1}}(t)\) and \({{\textbf {u}}}_{r-p_{c_2}}(t)\)) for each step of the paths of the accompanied people.

$$\begin{aligned} J_{r}(U) = {{\sum ^{t_{end}}_{ t=t_{ini}}}} ||{{\textbf {u}}}_{r}(t)||^2 = {{\sum ^{t_{end}}_{ t=t_{ini}}}} ||{{\textbf {u}}}_{r-p_{c_1}}(t) + {{\textbf {u}}}_{r-p_{c_2}}(t)||^2 ,\nonumber \\ \end{aligned}$$
(18)

The costs related to the repulsive forces with respect to other people (Eq. 19) and obstacles (Eq. 20) have been modified, as we include in the robot behavior the repulsive forces that the accompanied people have with respect to other people and obstacles of the environment.

$$\begin{aligned}{} & {} J_{p}(U) = {{\sum ^{t_{end}}_{ t=t_{ini}}}} {\sum _{j=1}^{P}} ||{{\textbf {u}}}_{r-{p_j}}(t)||^2 + {{\sum ^{t_{end}}_{ t=t_{ini}}}} {\sum _{j=1}^{P}} {\sum _{i=1}^{{\mathcal {P}}_{c_t}}} ||{{\textbf {u}}}_{{p_{c_i}}-p_j}(t)||^2 \nonumber \\ \end{aligned}$$
(19)
$$\begin{aligned}{} & {} J_{o}(U) = {{\sum ^{t_{end}}_{ t=t_{ini}}}} {\sum _{j=1}^{O}} ||{{\textbf {u}}}_{ r -{ o_j}}(t)||^2 + {{\sum ^{t_{end}}_{ t=t_{ini}}}} {\sum _{j=1}^{O}} {\sum _{i=1}^{{\mathcal {P}}_{c_t}}} ||{{\textbf {u}}}_{ {p_{c_i}} -{ o_j}}(t)||^2 ,\nonumber \\ \end{aligned}$$
(20)

where the costs for people and obstacle interactions have two parts. The first part is related to the repulsive forces of the direct interactions between the robot (\({{\textbf {u}}}_{r}\)) and other people or obstacles. The second part includes the repulsive forces between the accompanied people and other people or obstacles (\({{\textbf {u}}}_{{\mathcal {P}}_{c_i}}\), where \({\mathcal {P}}_{c_i} \in \{p_{c_1}, p_{c_2}\}\)).
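As an illustration of Eqs. 18 to 20, the sketch below accumulates the squared norms of the control inputs along one candidate path. The nested-list data layout and the function names are assumptions made for the example; the inputs themselves would come from the ESFM propagation.

```python
def sq_norm(v):
    """Squared Euclidean norm of a 2-D vector (x, y)."""
    return v[0] ** 2 + v[1] ** 2


def cost_robot(u_c1, u_c2):
    """Eq. 18: u_c1[t] and u_c2[t] are the attractive inputs of the robot
    towards each companion at time step t."""
    return sum(sq_norm((a[0] + b[0], a[1] + b[1])) for a, b in zip(u_c1, u_c2))


def cost_repulsion(u_robot, u_companions):
    """Eqs. 19 and 20 (same structure for people and for obstacles).

    u_robot[t][j]         : input on the robot due to entity j at step t.
    u_companions[t][j][i] : input on companion i due to entity j at step t.
    """
    direct = sum(sq_norm(u) for step in u_robot for u in step)
    indirect = sum(sq_norm(u)
                   for step in u_companions
                   for per_entity in step
                   for u in per_entity)
    return direct + indirect
```

The same `cost_repulsion` function serves for \(J_p\) and \(J_o\); only the set of entities (other pedestrians or obstacles) changes.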

In the cost related to people’s preferences, we need to add the accompaniment cost of the second person to obtain the accompaniment cost of the whole group, \(J_{c_t}(U) = J_{c_1}(U) + J_{c_2}(U)\). With this cost, the robot selects the path that allows the group to navigate for a longer time in side-by-side formation. This cost has been developed in [6] for one-person accompaniment and in [4] for a group of two people. If the reader wants to know the exact implementation of this cost, please refer to the citations [4, 6].

3.3.3 Robot’s Skills to Solve Navigation Issues in Group Accompaniment

3.3.3.1 Solving Occlusion Problems in the Group In both ASP-VG and ASP-SG, we deal with occlusion problems between group members. There are two possible types of formations concerning the position of the robot inside the group: the robot at the center of the group, or the robot on either side. When the robot accompanies the group from the side of the formation, the laser cannot detect the lateral person because the central one occludes them. To deal with these large occlusions, the robot generates the track of the second accompanied person at the projected position where \(P_{c_2}\) should be, as shown in Fig. 4. The track of \(P_{c_2}\) is created using Eq. 21, and its velocity is set equal to that of the track of \(P_{c_1}\) (\({{\textbf {v}}}_{p_{c_2}}^{F} = {{\textbf {v}}}_{p_{c_1}}\)), because both people walk in parallel when one occludes the other for a considerable period of time. This velocity is needed to predict the movement of \(P_{c_2}\).

Fig. 4
figure 4

Dealing with self-group occlusions inside the group formation. We show the prediction of the position of \(P_{c_2}\) (in red), which is occluded by \(P_{c_1}\)

$$\begin{aligned} \left. \begin{aligned}&x_{p_{c_2}}^{F} = x_{p_{c_1}} + d_{r,p_{c_1}} \cos \left( \theta _{p_{c_1}} - {{\,\textrm{sgn}\,}}\left( \theta _{p_{c_1}} - \theta _{r,p_{c_1}}\right) {\theta _{r,p_{c_1}}}\right) \\&y_{p_{c_2}}^{F}= y_{p_{c_1}} + d_{r,p_{c_1}} \sin (\theta _{p_{c_1}} - {{\,\textrm{sgn}\,}}(\theta _{p_{c_1}} - \theta _{r,p_{c_1}}){\theta _{r,p_{c_1}}}) \end{aligned}\right. \end{aligned}$$
(21)

where \((x_{p_{c_2}}^{F}, y_{p_{c_2}}^{F})\) is the inferred position of person \(P_{c_2}\). The detected movement orientation and position of person \(P_{c_1}\) are \(\theta _{p_{c_1}}\) and \((x_{p_{c_1}}, y_{p_{c_1}})\). \(\theta _{r,p_{c_1}}\) is the accompaniment orientation between the robot and the \(P_{c_1}\) nearest accompanied person. \(d_{r,p_{c_1}}=2 \cdot R_i= 1.5\) m should be the ideal distance between each of the members of the group.
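Eq. 21 translates directly into a few lines of code. The following is a minimal sketch, assuming angles in radians and a conventional sign function with \(\textrm{sgn}(0)=0\); the function and argument names are hypothetical.

```python
import math


def sgn(x):
    """Sign function returning -1, 0 or 1."""
    return (x > 0) - (x < 0)


def infer_occluded_companion(x_c1, y_c1, theta_c1, theta_r_c1, d=1.5):
    """Eq. 21: inferred position of the occluded person P_c2.

    (x_c1, y_c1) : detected position of P_c1
    theta_c1     : detected movement orientation of P_c1 (radians)
    theta_r_c1   : accompaniment orientation between robot and P_c1 (radians)
    d            : ideal distance between group members (1.5 m in the text)
    """
    angle = theta_c1 - sgn(theta_c1 - theta_r_c1) * theta_r_c1
    return (x_c1 + d * math.cos(angle), y_c1 + d * math.sin(angle))
```

The velocity of the inferred track is not computed here; following the text, it would simply be copied from the track of \(P_{c_1}\).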

3.3.3.2 Solving Changes in Position of the Companions To allow the robot to deal with changes in position inside the group, we need to know all group members’ positions with respect to a reference frame located at the center of the group. To do this, we transform the group members’ positions to this reference frame, which uses the direction of the group velocity as its y-axis. This coordinate change is shown in Eq. 22 for the robot; the equations for the two accompanied people are analogous.

Next, we order each position of the members inside the group from a more negative to less negative x component. For example, we can obtain \(x_{p_{c_1}}<x_{r}<x_{p_{c_2}}\). Then, the two accompanied people are at the sides of the formation, and the robot is in the middle. Knowing the position of each group member within the group, the robot can use the corresponding equations to accompany them depending on the position of the robot inside the formation of the group. Fig. 5 shows an example of rearrangement, where the robot changes its position from the side to the center.

Fig. 5
figure 5

Adaptive formation of the group. The robot adapts its position inside the group depending on the behavior of the people to avoid together a static obstacle similar to a door

$$\begin{aligned} \begin{array}{lcc} \left( \begin{matrix} {x_r}^g \\ {y_r}^g \\ \end{matrix} \right) &{} = &{} \left( \begin{matrix} \left| MG \right| \cos (\varphi ) &{} \left| MG \right| \sin (\varphi ) \\ - \left| MG \right| \sin (\varphi ) &{} \left| MG \right| \cos (\varphi ) \\ \end{matrix} \right) \left( \begin{matrix} x_r \\ y_r \\ \end{matrix} \right) \end{array} \end{aligned}$$
(22)
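The change of frame and the ordering of the group members can be sketched as follows. This is an illustrative simplification rather than the exact implementation: the scale factor \(\left| MG \right|\) is taken as 1, the origin is placed at the centroid of the members, and \(\varphi \) is chosen so that the group velocity points along the y-axis of the new frame. All names are hypothetical.

```python
import math


def to_group_frame(point, center, group_velocity):
    """Rotate a world-frame point into a frame centred on the group whose
    y-axis is aligned with the group velocity (rotation of Eq. 22, unit scale)."""
    phi = math.atan2(group_velocity[1], group_velocity[0]) - math.pi / 2
    dx, dy = point[0] - center[0], point[1] - center[1]
    c, s = math.cos(phi), math.sin(phi)
    return (c * dx + s * dy, -s * dx + c * dy)


def formation_order(members, group_velocity):
    """members maps a name to its (x, y) world position.
    Returns the names sorted from most negative to most positive x in the group frame."""
    n = len(members)
    center = (sum(p[0] for p in members.values()) / n,
              sum(p[1] for p in members.values()) / n)
    local = {name: to_group_frame(p, center, group_velocity)
             for name, p in members.items()}
    return sorted(local, key=lambda name: local[name][0])
```

If the returned order places the robot between the two companions, the robot is in the central position; otherwise it is in a lateral one, and the corresponding accompaniment equations can be selected.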

3.3.3.3 Solving Direction Changes Until the Destination

Both the ASP-VG and the ASP-SG methods need to deal with changes in the direction of the movement of the group with respect to the direction to arrive at their final destination. In dynamic environments, these destinations should therefore also be dynamic. We compute the dynamic goal, \({{\textbf {D}}}_{n_{d}}=(x_{n_d},y_{n_d})\), from the static goal of the environment, \({{\textbf {D}}}_{n}=(x_{n},y_{n})\). This computation is required to handle situations in which people do not go directly to an environment destination because they are avoiding obstacles, and situations in which the destinations are not exact, such as the entrance of a square or a street (Fig. 6). For example, in a side-by-side accompaniment, if the robot expects the group to move in a direction different from the real one, it does not position itself exactly at the side of the group; it will be ahead of or behind this position, depending on the difference with respect to the real direction of the group. To compute this dynamic destination, we use the projection of the line perpendicular to the static destination over the orientation of the movement of the group, as shown below:

$$\begin{aligned}{} & {} m = \hbox {tan}\frac{{{v_{y_g}}}}{{{v_{x_g}}}} \end{aligned}$$
(23)
$$\begin{aligned}{} & {} \left\{ \begin{array}{ll} y_r = m x_r + b_1\\ y_n = - \frac{1}{m} x_n + b_2 \end{array} \right. \left\{ \begin{array}{ll} x_{n_d} = \frac{b_2 - b_1}{m + \frac{1}{m} }\\ y_{n_d} = m x_{n_d} + b_1 \end{array} \right. \end{aligned}$$
(24)
Fig. 6
figure 6

Deal with direction changes until the group’s destination. This behavior of the robot allows performing a better formation considering the direction of the movement of the group. Left: The group is going to a street but needs to avoid a moving person in the environment, another person is going to a bench. Center: The group continues going to the street but now needs to avoid a bench to arrive. Right: Finally, the group arrives at the entrance of the street. Also, the other people arrived at their destinations

Here, \(({{v_{x_g}}},{{v_{y_g}}})\) is the average of the observed velocities of the companions of the robot, m is the slope of the straight line in the direction of the movement of the group, \((x_{r},y_{r})\) is the position of the robot, and \(b_1\) and \(b_2\) are the intercepts of the two straight lines, which are computed using Eqs. 23 and 24.
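The intersection of the two lines in Eq. 24 can be sketched in Python as below. The sketch takes m as the ratio \(v_{y_g}/v_{x_g}\), i.e., the slope of the motion direction described in the text; this reading of Eq. 23 is an assumption. The handling of purely horizontal or vertical motion, where one of the slopes is undefined, is also an addition for the example, as are the names.

```python
def dynamic_goal(robot_xy, static_goal_xy, group_velocity):
    """Project the static goal onto the line through the robot along the group motion.

    Line 1: y = m x + b1 passes through the robot with the slope of the group motion.
    Line 2: y = -(1/m) x + b2 passes through the static goal, perpendicular to line 1.
    Their intersection is the dynamic goal (Eq. 24).
    """
    x_r, y_r = robot_xy
    x_n, y_n = static_goal_xy
    vx, vy = group_velocity
    if vx == 0 and vy == 0:
        raise ValueError("group velocity must be non-zero")
    if vx == 0:            # vertical motion: line 1 is x = x_r
        return (x_r, y_n)
    if vy == 0:            # horizontal motion: line 1 is y = y_r
        return (x_n, y_r)
    m = vy / vx
    b1 = y_r - m * x_r
    b2 = y_n + x_n / m
    x_nd = (b2 - b1) / (m + 1.0 / m)
    y_nd = m * x_nd + b1
    return (x_nd, y_nd)
```

For a robot at the origin, a group moving along the diagonal and a static goal at (4, 0), the dynamic goal is the orthogonal projection of the static goal on the diagonal.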

3.3.3.4 Robot’s Adaptation to the Group’s Velocity

We include an adaptation of the velocity of the robot to the velocity of its companions in both methods, ASP-VG and ASP-SG. For example, Fig. 7 shows the velocity adaptation of the robot using the ASP-SG when the robot accompanies two people using both group formations. In this case, the initial acceleration of the people is faster than the initial acceleration of the robot due to the robot initialization period. Also, the initial velocity of the robot is greater than the initial velocity of the people to allow the robot to reach its ideal position in the group.

For the ASP-VG, we have defined the \(v^{group}\) as the average of the detected velocities of the accompanied people. With these velocities, we compute the preferred velocity \(v^p\) of the robot using Eq. 25. This robot’s preferred velocity will be used as the desired velocity in the ESFM, which controls the movements of the robot.

$$\begin{aligned} v^p = v^{group}-\frac{8 C_\theta \eta \pi }{3 r_0 \kappa }. \end{aligned}$$
(25)

For the ASP-SG, the robot’s preferred velocity is computed using Eq. 26. Using this formula, we correct the error in the position of the robot with respect to the preferred position of the robot in the formation of the group for the previous iteration. This is done by transforming this error into a velocity that we sum to the group’s velocity, always taking into account the limit of maximum velocity of the robot, \(v_{max}=1.2\) m/s. Using this adaptation of the velocity of the robot to the velocity of its companions, we reduce the error in the robot accompaniment. With this velocity adaptation, we obtain smoother and more exact robot accompaniment behavior.

$$\begin{aligned} v_{lim} = v_g + (v_{max} - v_{g}) \cdot \frac{d_{er} - d_{min,er}}{d_{max} - d_{min,er}} \end{aligned}$$
(26)

where \(v_{lim}\) is the actual maximum velocity of the robot to accompany the group, which must always stay below the limit of the maximum velocity of the robot, \(v_{lim} \le v_{max}=1.2\) m/s. \((v_{max} - v_{g})\) is the maximum allowed increment of velocity, computed from the maximum velocity of the robot, \(v_{max}\), and the average of the velocities of the people in the group, \(v_g\). \((d_{max} - d_{min,er})\) is the maximum increment of error that we want to correct, where \(d_{min,er}\) is the minimum error that we allow in the robot’s group positioning, and \(d_{max}\) is the distance from which we start to correct this error in the position of the robot.
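A short sketch of Eq. 26 is given below. It reads \(d_{er}\) as the current positioning error of the robot, and it clamps the interpolation factor to [0, 1] so that the condition \(v_{lim} \le v_{max}\) always holds; both points are assumptions of the example, and the threshold values used in the usage lines are hypothetical.

```python
def velocity_limit(v_g, d_er, d_min_er, d_max, v_max=1.2):
    """Eq. 26: raise the robot velocity above the group velocity v_g in
    proportion to the positioning error d_er, never exceeding v_max."""
    ratio = (d_er - d_min_er) / (d_max - d_min_er)
    ratio = min(max(ratio, 0.0), 1.0)      # assumption: saturate outside the band
    return min(v_g + (v_max - v_g) * ratio, v_max)


# Hypothetical thresholds: errors below 0.1 m are tolerated, 0.9 m saturates.
print(velocity_limit(v_g=0.8, d_er=0.5, d_min_er=0.1, d_max=0.9))
```

With these example values, an error halfway through the correction band adds half of the available velocity margin to the group velocity.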

4 Performance Metrics

We have included in the behavior of the robot, using the ESFM, our accompaniment formations in combination with the proxemic rules defined by Hall [11], also called social distances, and other navigation behaviors focused on people’s comfort, extracted from the methods in [12,13,14]. In earlier HRI studies with Tibi [13, 14], we investigated people’s preferences regarding the distances and velocities between them and the robot. From these works, we have extracted our ideal companion distance of [1–1.5] m and our maximum velocity of the robot, which is [1–1.2] m/s.

This section briefly describes the two sets of metrics developed, one for each formation of the group, to evaluate how comfortable the social behavior of the robot is while accompanying a group of people. The metrics evaluate different aspects of the robot’s accompaniment. First, both sets of performance metrics evaluate whether the robot is able to perform the best group formation (side-by-side or V-formation), which allows better communication among the group members. The ideal side-by-side or V-formation can only be held when there are no people or obstacles to avoid. Second, both sets of performance metrics consider keeping a comfortable distance among the members of the group, based on the proxemic distances defined by Hall [11] and other methods focused on people’s comfort [12,13,14]. Third, only for the ASP-SG, the set of metrics evaluates whether the robot performs the appropriate dynamic formation to avoid people or obstacles, which facilitates the navigation of all people in the environment, including the companions. In the ASP-SG, we can therefore also evaluate the formation in obstacle avoidance cases, because this method is adaptive and includes different configurations to avoid obstacles, not only the “ideal” side-by-side formation. In the case of the ASP-VG, the metrics do not consider obstacles because the V-form potential only evaluates whether the formation of the group is the ideal one, which is only possible in cases without obstacles.

The development of these metrics is crucial to assess the behavior of the robot in an objective way since, as with all HRI methods, it is difficult to know whether the robot is doing the task correctly. The metrics for the ASP-VG are included in Sect. 4.1, which uses the V-formation’s discomfort potential. The ones for the ASP-SG are included in Sect. 4.2. We have defined these performance metrics with values in the interval [0, 1], where 0 represents the worst value of performance, and 1 is the best value of performance. The robot obtains the best performance value when it perfectly follows the best formation at the current instant of time for each method.

Fig. 7
figure 7

Robot’s velocity adaptation. The graph shows how the robot adapts its velocity to the mean of the two velocities of its people companions in the case of the ASP-SG when the robot is at the central and lateral position of the formation of the group

4.1 Robot’s V-Formation

To obtain the performance metric for the ASP-VG, we have created a potential grid [3]. For more details, please refer to the website of the paper\(^2\) or previous works [3]. This potential grid obtains similar performances to Fig. 8 for the side-by-side case, but the V-form does not consider the obstacles in its performance metrics. It only takes into account the ideal V-formation. However, the V-formation’s performance includes the velocity of the person (\(v_{c}\)) and the position of the robot with respect to the person and its speed (\(v_r\)). Previous works include images of this V-form metric.

We obtain the best performance for the robot (i.e., a performance value of 1) when it is placed in the position that corresponds to the “perfect” V-form for the two possible positions of the robot inside the group, Eq. 27. When the robot gets dangerously close to the pedestrians, or adopts any other “not comfortable” configuration that breaks the V-form, it gets a performance value of 0. These potentials are computed using the potential equation of the V-form, Eq. 11, and taking into account the current distance between the group members, \(r_{ij}\), and the current angle between them, \(\theta _{ij}\), for the current potential (\(U_{current}\)). The maximum potential (\(U_{max}\)) is computed with the best distance and angle of the formation. The minimum potential (\(U_{min}\)) is obtained by evaluating the potential at each point of a grid around the center of the group and finding the minimum potential value.

$$\begin{aligned} {P_{erf}}^V=\left\{ \begin{array}{@{}ll@{}} 0, &{} \hbox {if } U_{current}>U_{max} \\ {\displaystyle \frac{U_{max}-U_{current}}{U_{max}-U_{min}}}, &{} \hbox {otherwise} \end{array}\right. \end{aligned}$$
(27)
Fig. 8
figure 8

Performance metrics of area for the ASP-SG. Top: Performances when the robot is located in the middle of the group’s formation. Bottom: Performances when the robot is located at the lateral of the group’s formation. In both rows, left: when there are no obstacles in the environment, center: when the group walks, for example, in a corridor or “narrow” street, and right: when the group needs to pass through a door, for example
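The normalization of Eq. 27 and the grid search used to obtain \(U_{min}\) can be sketched as follows. The V-form potential of Eq. 11 is not reproduced here, so it is passed in as an arbitrary callable; the grid parameters and the function names are hypothetical.

```python
def grid_minimum(potential, center, half_width, n):
    """Evaluate potential(x, y) on an (n+1) x (n+1) grid around center
    and return the smallest value found (used as U_min)."""
    cx, cy = center
    best = float("inf")
    for i in range(n + 1):
        for j in range(n + 1):
            x = cx - half_width + 2.0 * half_width * i / n
            y = cy - half_width + 2.0 * half_width * j / n
            best = min(best, potential(x, y))
    return best


def v_form_performance(u_current, u_max, u_min):
    """Eq. 27: 0 when the current potential exceeds U_max, otherwise the
    normalized distance of U_current from U_max."""
    if u_current > u_max:
        return 0.0
    return (u_max - u_current) / (u_max - u_min)
```

As written in Eq. 27, the performance equals 1 when the current potential coincides with \(U_{min}\) and decreases linearly to 0 as it approaches \(U_{max}\).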

4.2 Robot’s Side-by-Side Formation

In the ASP-SG, we have different performances for each one of the two possible positions of the robot (central and lateral). The area metric is described in Fig. 8 and Eq. 28, and the distance and angle metrics are described in [4]; the website of the paper\(^2\) also includes all the information about these metrics.

With these three performance metrics, we evaluate whether the robot is correctly performing the current formation of the adaptive accompaniment with respect to both companions, which includes rearrangements to avoid obstacles. Fig. 8 and Eq. 28 show how the robot obtains its performance values depending on its position with respect to its companions. The best performance value of 1, represented in red in the image, is obtained when the robot is placed in the perfect formation with respect to the companions at the current instant of time, which accounts for cases of obstacle avoidance. When the robot is inside the yellow area, it obtains intermediate performance values because it only keeps its position inside the proxemic area of social distances, where people can still perceive that the robot is related to them, but it does not maintain a formation that promotes group communication. Finally, if the robot is located outside the proxemic area of social distances or inside collision distances with respect to its companions, it obtains the worst performance of 0, drawn in blue, because these behaviors should be avoided. Note that the real robot will never actually collide with a person because we use a safety distance of 0.3 m, which makes the robot stop if it detects something with the laser.

Next, we include the equation of the area performances:

$$\begin{aligned} {P_{erf}}^S= \frac{1}{2| {\mathcal {R}}|}\int _{\mathcal {B'}(p_{c_i}) \cap {\mathcal {R}}} dx + \frac{1}{2| {\mathcal {R}}|}\int _{\mathcal {A'}(p_{c_i}) \cap {\mathcal {R}}} dx \in [0,1]\nonumber \\ \end{aligned}$$
(28)

where \(\mathcal {A'}(p_{c_i})\) represents the proxemic area of social distances for each accompanied person, \(i \in \{1,2\}\). Note that when the robot is placed at the side of the formation, this area of social distances with respect to the lateral person needs to be multiplied by two, because the lateral person still notices that the robot is part of the group even though there is a central person between them. \(\mathcal {B'}(p_{c_i})\) represents the area around the best position of the robot for the current formation, which includes obstacle avoidance. Without obstacles, this position is Side-by-Side; with obstacles, it varies around the accompanied people depending on the collisions with obstacles. Therefore, the limit of this best positioning area needs to be customized to the robot size. In our case, the robot has a radius of 0.5 m and can be represented as a circle of 1 m in diameter centered at the position of the robot, whose area is \(|{\mathcal {R}}|=\frac{\pi }{4}\).
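One simple way to approximate the two integrals of Eq. 28 numerically is to sample the robot disc \({\mathcal {R}}\) on a regular grid and count the fraction of samples that fall inside each area. The sketch below follows that idea for a single companion; the shapes of \(\mathcal {A'}\) and \(\mathcal {B'}\) are supplied as membership functions, which are hypothetical placeholders, and the doubling of the social area for the lateral person is not modeled.

```python
def area_performance(robot_xy, in_best_area, in_social_area, radius=0.5, n=41):
    """Approximate Eq. 28: half of the score comes from the overlap of the
    robot disc with the best-position area B', half from the overlap with
    the social-distance area A'. Both overlaps are normalized by |R|."""
    rx, ry = robot_xy
    total = best = social = 0
    for i in range(n):
        for j in range(n):
            x = rx - radius + 2.0 * radius * i / (n - 1)
            y = ry - radius + 2.0 * radius * j / (n - 1)
            if (x - rx) ** 2 + (y - ry) ** 2 > radius ** 2:
                continue                     # sample lies outside the robot disc
            total += 1
            best += bool(in_best_area(x, y))
            social += bool(in_social_area(x, y))
    return 0.5 * best / total + 0.5 * social / total
```

A robot entirely inside both areas scores 1, a robot inside only the social area scores 0.5, and a robot outside both scores 0, which matches the red, yellow and blue regions described for Fig. 8.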

5 Simulation Experiments

5.1 Synthetic Environments

All methods were tested using two complex simulation environments developed in previous works: an iterative synthetic environment and another based on Gazebo. Sample videos are available on the website of the paper\(^{2}\). The results reported here were obtained from the iterative synthetic environment; additionally, we tested the methods in the Gazebo simulator.

This synthetic environment included our robot, which used one of our methods, the ASP-VG or the ASP-SG, to perform group accompaniment. Furthermore, this environment simulated two accompanied people who used the AKP to navigate [1]. The AKP is used to obtain a more realistic behavior for these accompanied people, allowing them to avoid other pedestrians and static obstacles in advance and turn in a more human-like manner. In Figs. 4 and 9, the prediction of the future path of the companions of the robot is represented in blue when the robot detected them and in red when the robot did not see the second person.

Table 1 Performance results of the simulation experiments of the ASP-VG

Moreover, the environment included static obstacles and other pedestrians that used the ESFM to move randomly from one destination to another while avoiding other people and obstacles. These other people were represented as green cylinders with identification numbers over them. In the images of the simulated environment, the obstacles detected by the robot were represented by gray cylinders. We show three steps of a situation of obstacle avoidance during the simulated experiments for both methods in Fig. 9. Furthermore, these images had other elements, such as a dark-blue cylinder representing the dynamic final destination, a black dashed circle around the robot depicting the limit of the local planner, a path in dark blue illustrating the global plan, a path in red describing the best local plan, and several paths in orange symbolizing the subset of potentially good local paths.

We initially tested the robot behavior with a maximum velocity of 1 m/s, and later increased it to 1.2 m/s to deal with accompanied people’s speeds within the interval [0, 1] m/s. We could increase this velocity because we improved the robot acceleration and deceleration behaviors and, in general, reduced the abrupt changes in acceleration, allowing a smoother behavior of the robot (Sect. 3.3.3.4). The velocities of the other humans were randomly selected in the interval [0, 1] m/s. The ideal distance between the centers of the group members’ positions was 1.5 m. There was 0.7 m of free space between them. We want to remark that during all our experiments, all people’s speeds varied within the interval [0, 1] m/s, and all human paths used different random directions inside the environment. During our simulations, we extensively tested all the situations planned to be evaluated in real-life experiments, including more randomness. All performance results were obtained using the performance metrics of Sect. 4, and the performance values lie in the interval [0, 1], where 1 is the best value. The values between brackets are the standard errors of each mean value (\(\sigma \)).

Fig. 9
figure 9

Synthetic experiments of the ASP-VG and ASP-SG. We include three steps of a static obstacle avoidance situation for the two group accompaniments. Also, the environment includes other pedestrians that can interfere with the accompaniment. Left: The group walks in a V-form or side-by-side formation with the robot at the lateral of the formation. Center: The robot goes behind the people to overpass the simulated door together. Right: The robot continues accompanying the group using a V-form or Side-by-Side formation. In the case of the ASP-SG, the accompanied people make space for the robot in the center of the formation of the group to allow the robot to accompany the group in the middle. This behavior shows the adaptive rearrangement of the group

5.2 Robot’s V-formation

We performed more than 1,900 simulations to test and validate the ASP-VG model. Firstly, we tested the robot’s ASP-VG without any environmental obstacles. Here, we observed the behavior of the robot when it was allowed to fulfill the “perfect” formation. For both behaviors, one-person accompaniment and two-people accompaniment, performance results are summarized in Table 1. Moreover, we also used different goals inside the environment as final destinations to make the experimentation more complex. The robot used the orientation of the movement of the group to recalculate each time the best position of the final destination in order to obtain better performances of the robot’s accompaniment (see Fig. 7). In this group of simulations, we could see how the robot could arrive at its best position inside the formation of the group and adapt its velocity to the velocity of its companions.

Secondly, when the robot was placed at the side of the group, the central person could occlude the lateral one. We also included a group of simulation experiments for these situations in Table 1 with label: V-Form, \(P_{c_2}\) Occluded.

Thirdly, we simulated situations where the group needed to avoid static obstacles and pedestrians walking towards different, randomly selected destinations. In these experiments, we could observe how the robot compressed or dynamically changed its position in the formation to facilitate the navigation of the accompanied people while they avoided obstacles together. The top row of Fig. 9 shows this behavior of the robot in simulation. Furthermore, if other pedestrians blocked the entire group or some of its members, the robot waited in its position until these other pedestrians moved out of the way. Then, the group could continue walking. Also, when the group avoided obstacles, the accompanied people could change their formation with respect to the position of the robot, and we could see how the robot modified its position within the group accordingly. The results of these experiments are included in Table 1 under the label: obstacle type, static and dynamic.

In Table 1, the best performance values were obtained for the robot’s accompaniment of one person, doing side-by-side, which was easier for the robot than accompanying a group of people. In the specific case where we simulated the track of \(P_{c_2}\), because it was occluded by \(P_{c_1}\), the performance was lower than in the case where the real detection of all members carrying out the V-form existed. However, the creation of the track of \(P_{c_2}\) allowed the robot to keep its performance near the real case, obtaining only a difference value of 0.0915.

Finally, if obstacles were included, the performance decreased, as the V-formation did not consider the obstacles. It was different from the performances for the side-by-side accompaniment in Sect. 4.2 that consider obstacles. Still, the ASP-SG was also affected by obstacles. It reduced its performance in the central accompaniment, as the robot needed to deal with the unexpected behaviors of two accompanied people that could introduce contradictory situations.

5.3 Robot’s Side-by-Side Formation

We performed more than 3,400 simulations to test and evaluate the robot’s ASP-SG. The robot accompanied a single person or a group of people, starting at the central or lateral position. The robot started at a particular position of the formation but might change it during the accompaniment because we allowed the dynamic positioning of the group members.

Firstly, the group was accompanied by the robot at the lateral or central position, without environmental obstacles. Secondly, the robot accompanied the group while other pedestrians had to be avoided. These other people walked randomly towards different destinations while crossing the walking path of the group from different directions. Also, other people might pass through the group, demonstrating that the robot could cope with small occlusions of any group member. Thirdly, the group navigated in an environment with different static obstacles. Fourthly, the group was accompanied by the robot in a scenario that included both other random people walking around and static obstacles, as shown in the bottom row of Fig. 9. In these three bottom images, we can also see how the robot rearranged its position within the group because the accompanied people changed their location after passing the static obstacle. Fifthly, we included situations where the central person occluded the lateral one to test in simulation the creation of the track of \(P_{c_2}\) that allows the robot to accompany the whole group correctly.

Table 2 Performance results of the simulation experiments of the ASP-SG

The performance results are reported in Table 2. As before, the performance values for the one-person side-by-side accompaniment were better than the others because the robot only needed to deal with the unexpected movements of one person, and it was easiest for the robot to adapt its behavior to only one person. For the one-person accompaniment, the best results were obtained without obstacles, with a performance value near 0.9. The worst performance values were obtained with people as dynamic obstacles because the robot needed to deal with the movements of different people simultaneously, and it may have been blocked by many people at some point.

All performance values were over 0.7, except the angle performance for the group accompaniment with dynamic obstacles when the robot was in the central position. When the robot was in this position, it was more challenging to avoid the obstacles of the environment properly. Sometimes, other people in the environment momentarily blocked only the robot, and the simulated people left it behind, since the simulator did not incorporate any waiting behavior for the accompanied people. Nevertheless, it would be logical to expect a waiting behavior from real people. This situation happened randomly. With static obstacles, the obstacles could also block the accompanied people without blocking the path of the robot; in that case, the robot waited for the accompanied people to walk again, but not the reverse. We can probably obtain more realistic simulations and better results by including a waiting behavior for the accompanied people when the robot is stuck.

In the central case, obtaining a good performance value for the robot was difficult because it had to follow two forces, which could be contradictory when the two companions did not agree on which way to go. To allow the robot to deal with complex issues such as group breakage, we allowed it to focus on the closest accompanied person. Such problematic cases arose when one group member moved more than 6 meters away from the group or when one group member stopped. This robot behavior could cause poor performance values when any of these cases happened. However, it was better than moving away from both people simply because they were momentarily separated.
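The fallback described above can be sketched in a few lines. This is only an illustrative sketch and not the ASP implementation: the `Companion` class, the function names, and the speed threshold used to decide that a person has stopped are assumptions introduced here; only the 6-meter limit comes from the text.

```python
import math
from dataclasses import dataclass

BREAK_DISTANCE_M = 6.0     # separation limit stated in the text
STOPPED_SPEED_MPS = 0.05   # assumed threshold below which a person counts as stopped


@dataclass
class Companion:
    name: str
    x: float      # position in meters
    y: float
    speed: float  # walking speed in m/s


def group_broken(a: Companion, b: Companion) -> bool:
    """Return True when the two companions no longer behave as one group."""
    separated = math.hypot(a.x - b.x, a.y - b.y) > BREAK_DISTANCE_M
    # Assumption: the group is broken when exactly one member has stopped.
    one_stopped = (a.speed < STOPPED_SPEED_MPS) != (b.speed < STOPPED_SPEED_MPS)
    return separated or one_stopped


def companions_to_follow(robot_xy, a: Companion, b: Companion):
    """Follow both companions, or only the closest one if the group is broken."""
    if not group_broken(a, b):
        return [a, b]
    rx, ry = robot_xy
    closest = min((a, b), key=lambda c: math.hypot(c.x - rx, c.y - ry))
    return [closest]
```

With both companions walking close together, the function returns both of them; once one of them is more than 6 m from the other, only the companion nearest to the robot is returned.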

Finally, regarding those cases without obstacles, lateral and central group formations had similar performance values and results for the whole group accompaniment. We could, then, conclude that for the robot’s group accompaniment in the lateral position, it was enough to take into account the nearest person to perform a good side-by-side accompaniment of the whole group.

6 Real-Life Experiments

6.1 Guidelines for Experiments with Volunteers

Nowadays, people find information through many different social networks. Therefore, we added new ways of recruiting volunteers through announcements, which somewhat eased the arduous task of finding people willing to participate in our experiments. These announcements ranged from posters placed around the University campus to posts on our social networks and dissemination of the experiments through University groups (such as student associations). We also recruited some volunteers during the experiments by asking people who were passing through the university campus.

Volunteers were given a consent document before participating in the experiment. The consent document informed the participants about the following aspects: why and how the study would be done, the benefits of participating in the study, and the minimal risks involved in their participation. The risks were minimal due to all the safety systems of our robot. This document also requested consent to record the anonymous data needed to extract results from the experiments (rosbags and questionnaires). We also asked for permission to record video during the experiments; not all the volunteers agreed, and in those cases we skipped the video recording. In addition, the document allowed the participants to withdraw their consent to participate in the study at any time. They were also told that they could ask any question at any time. The consent document and all the documents we use are included on the website of the paper\(^2\).

Moreover, we asked the participants to read the experiment instructions, which contained the minimum explanation needed to be accompanied by our robot. These instructions can be included in the consent document if they are short and can be explained well without images. However, our experience led us to prepare more complex and detailed instructions. They covered the destination toward which the robot would accompany the volunteers, the interaction phrases that the robot would utter to indicate what to do, and the phrases used to play the animal-guessing game with the robot, described in Sect. 6.2. The instructions also included information about the initialization period, safety distance, and maximum speed of the robot. This basic knowledge about the behavior of the robot was necessary to allow a slight adaptation by people unfamiliar with robots. The volunteers were also told that they could position themselves wherever they felt comfortable, since we did not want to coerce the interaction under study, which includes their preferences regarding the physical formation of the group accompaniment.

Then, the actual HRI started: the volunteers were accompanied by the robot while playing the animal-guessing game. Finally, to conclude the experiments, we asked the volunteers to fill out a survey about the robot’s accompaniment during their interaction. These questionnaires evaluated the “intelligence”, comfortableness, and sociability of the robot. In the last three user studies, we included questions that assessed the interaction between the two people, to determine whether the position of the robot interfered with the communication between them. We created and validated these questionnaires ourselves so that the questions fit our robot and our interaction; however, they are based on the evaluation of social acceptance in the USUS Evaluation Framework [69]. We validated them using the test-retest method and checked their reliability using Cronbach’s alpha.

Additionally, we have designed the experiments to be performed in 10 minutes to obtain a larger number of volunteers and to encourage a pleasant experience with the robot because we do not pay the volunteers.

In two of our five user studies of the robot’s group accompaniment, we compared our methods with an expert teleoperating the robot. The expert used a PS3 controller to drive the robot, like a simulated avatar. For this purpose, we used the PS3 Joystick Teleop provided by ROS developers\(^4\). In this paper, the mission was to accompany a group of people from a starting position to a goal, avoiding static obstacles and pedestrians while using a specific accompaniment formation (V-form or Side-by-Side).

6.2 Human-Robot Interaction to Facilitate Group’s Relation

Previous experiments with Tibi demonstrated that most volunteers did not know how to behave naturally with the robot, in the sense of creating a relationship that leads to a mutual human-robot accompaniment. They tried to arrive at the destination faster, forgetting the mutual accompaniment with the robot. This happened in around 30 experiments spread across our previous papers on accompaniment and on accompaniment plus approaching. It forced us either to remove these experiments from the study or to explain to people that the most important thing was not to reach the destination, or the person to be approached, quickly. Because of this, our work on accompanying only one person [2] does not have enough participants to achieve significant survey results. We therefore created a new spoken interaction for the robot that helps people in three ways: creating a relationship with the robot through a game, interpreting the behavior of the robot, and remembering the different steps to be performed during the experiments.

First, we explain the part of the robot’s spoken interaction that allows people to create a relationship with the robot, so that mutual accompaniment arises naturally. To this end, the robot played a game inspired by the game “I see I see” but even simpler, as we noticed that it was challenging for the volunteers to find objects in the environment while walking with the robot. We therefore defined a new game that consists of guessing an animal. Tibi started the interaction with a phrase that indicated the letter with which the animal’s name begins. Then, Tibi repeated the letter so that it was better understood. We did not include all the letters of the alphabet. Instead, we chose the letters for which people can come up with more animal names and thus not get frustrated. The website of the paper\(^2\) shows the selected letters for the animal names in Spanish, since most of the experiments were carried out in this language. However, the phrases are given in English so that the reader can understand them well. In addition, the participants were told that they had to agree on the selected animal, because we wanted to encourage the interaction between the two people in order to study whether the position of the robot interfered with their communication. After that, Tibi told them whether the animal was correct or not.

Second, the robot’s spoken interaction included automatic speech during the experiments to help the volunteers remember the steps to perform during the interaction. In this way, we minimized the interaction of the researchers with the volunteers during the experiments. First of all, Tibi reminded the volunteers that they needed to walk at a strolling pace with the phrase: “Walk slowly, as if you were walking quietly.” After that, Tibi indicated that it was ready to start the accompaniment experiment with the phrase: “You can start walking all.” The word “all” in the sentence served to indicate that Tibi was accompanying two people. We needed to add the word “all” (which is not correct in English) to know whether the robot had correctly selected both people at the beginning of the experiment. After that, we started the game previously described. When they arrived at the first destination in the environment, Tibi reminded the accompanied people that they needed to return, with the phrase: “Stop please and position yourself to return to the initial position.” Afterward, the group continued with the game until they arrived at the starting position. Then, Tibi told them: “Now, you can fill out the questionnaire. Thank you.” This last instruction alerted the volunteers that the experiment had ended and reminded them that they must complete a survey before leaving.

Third, the implemented interaction had other automatic phrases that Tibi said so that we or the volunteers could interpret different behaviors of the robot. It is important to include this part of the communication between the robot and people so that they understand the behavior of the robot at all times. If it had lost the two volunteers, Tibi said: “I lost you. Come closer, please.” On the one hand, this informed us that Tibi had lost the people’s tracks and that we needed to activate the action that allows the robot to select the closest tracks as the people it accompanies. On the other hand, it informed the volunteers that Tibi needed them to come closer to detect them again. If Tibi created a new track for person \(P_{c_2}\) because this person was occluded by \(P_{c_1}\), it said only once: “No person two. Select id, please”. This informed us about the situation, so that if the track of the second person appeared again, the robot could select it again. If Tibi had no possible path because many obstacles were surrounding it, it said: “I can not move, sorry. I can not find a path, obstacle very close.” This allowed people to understand why the robot had stopped and let them react by moving away from it or by informing us that Tibi had stopped near obstacles. When people exceeded the maximum velocity of Tibi, it said: “Walk slowly, please. I can not follow you.” This allowed the volunteers to react and slow down so that the robot could accompany them. Finally, if people did not slow down as the robot asked and moved more than 3 meters away from the robot, it uttered: “Wait for me please.” In this last case, people were already outside the area of social distances from the robot and needed to stop and wait for it if they wanted to continue the HRI. All these phrases of Tibi are included on the website of the paper\(^2\), in addition to an image and videos that show the robot’s spoken interaction.
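The mapping from internal robot states to spoken phrases can be summarized with a short sketch. The phrases, the 3-meter limit, and the 1.2 m/s maximum velocity (reported in Sect. 6.3) come from the text; the `RobotState` fields, the `PhraseSelector` class, and the priority order among the conditions are assumptions made for illustration and do not describe the software that runs on Tibi.

```python
from dataclasses import dataclass
from typing import Optional

MAX_ROBOT_SPEED_MPS = 1.2  # maximum velocity of the robot (Sect. 6.3)
MAX_SEPARATION_M = 3.0     # distance beyond which the robot asks people to wait


@dataclass
class RobotState:
    tracks_lost: bool = False            # both companions' tracks were lost
    virtual_track_created: bool = False  # a track for the occluded P_c2 was created
    path_blocked: bool = False           # no feasible path among nearby obstacles
    group_speed: float = 0.0             # walking speed of the companions (m/s)
    group_distance: float = 0.0          # distance from robot to companions (m)


class PhraseSelector:
    """Choose which phrase to utter; the check order is an assumed priority."""

    def __init__(self) -> None:
        self._virtual_track_announced = False

    def select(self, state: RobotState) -> Optional[str]:
        if state.tracks_lost:
            return "I lost you. Come closer, please."
        if state.virtual_track_created and not self._virtual_track_announced:
            # According to the text, this phrase is said only once.
            self._virtual_track_announced = True
            return "No person two. Select id, please"
        if state.path_blocked:
            return "I can not move, sorry. I can not find a path, obstacle very close."
        if state.group_distance > MAX_SEPARATION_M:
            return "Wait for me please."
        if state.group_speed > MAX_ROBOT_SPEED_MPS:
            return "Walk slowly, please. I can not follow you."
        return None  # nothing to report
```

A single selector object would be kept for a whole experiment, so that the occlusion phrase is not repeated on later calls.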

All these speech phrases, which communicate internal robot states, together with the interaction through a game, facilitated the development of the experiments and the interaction of the robot with people. Furthermore, we included non-verbal communication in the robot interaction. For example, Tibi moved its mouth while talking, and when it was not speaking, it smiled. While talking, Tibi moved its head in the direction of the people that it accompanied. In the central case, it moved its head to both sides, and in the lateral case, it moved its head towards the side where the companions were positioned. Tibi’s facial expressions were implemented in [70].

6.3 Results of User Studies

Real-life experiments were done with Tibi in two different locations: Facultat de Matemàtiques i Estadística (FME) and the Barcelona Robot Lab (BRL), located in the Campus Nord of the Universitat Politècnica de Catalunya (UPC). We tested both methods, the ASP-VG and the ASP-SG, including the three possible group formations of each one (one-person accompaniment or two-people accompaniment with the robot positioned at the center or side of the formation). Furthermore, we followed the experimental procedure of Sect. 6.1, and the robot accompanied the volunteers from one destination to another while interacting with them (using the human-robot speech interaction of Sect. 6.2). We have divided our results into objective measurements, which use the performance metrics to evaluate the robot’s group accompaniment (Sect. 6.3.1), and subjective measurements, where we use questionnaires to extract people’s preferences regarding group accompaniment (Sect. 6.3.2). All performance results are expressed on a scale between 0 and 1, and the value in brackets corresponds to the standard error of each mean value.
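As a reminder of how a reported value of the form "mean (standard error)" is obtained, the following minimal sketch computes both quantities from a list of per-experiment performance scores. The function names and the two-decimal formatting are assumptions for illustration, and the standard error is taken as the sample standard deviation divided by the square root of the number of samples.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return the mean and the standard error of the mean of the scores."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two samples are needed for a standard error")
    mean = statistics.fmean(values)
    # Sample standard deviation (n - 1 in the denominator) over sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


def format_result(values):
    """Format scores in [0, 1] as 'mean (standard error)'."""
    mean, sem = mean_and_standard_error(values)
    return f"{mean:.2f} ({sem:.2f})"
```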

The maximum velocity of the robot was 1.2 m/s, which is close to the average of the velocity of people when they walk [71]. The ideal distance of accompaniment was 1.5 m initially, but in the final experiments, we reduced it to 1 m, since people got closer to the robot. Furthermore, the values of distances and velocities that we use during the accompaniment were obtained from a previous work [13], which determined the personal space and velocities desired by people when they interact with our Tibi-robot. The participants walked as they preferred regarding position and velocity, and changed their positions inside the group if needed. Also, we have included examples of spontaneous behaviors of people during these experiments in Fig. 10 and videos in the website of the paper\(^2\). These behaviors are challenging for robots because they do not follow the rules of a conventional accompaniment, which helps us demonstrate that our robot behavior can adapt to real-life situations.

Fig. 10

People’s random behaviors in real-life experiments. We include five illustrative images to show the behavior of the robot while accompanying a group of two people when people exhibit strange walking behaviors

In the experiments of the present paper, we focused on testing the robot’s group accompaniment methods. Therefore, most of the results in this section are for the accompaniment of groups, and we only treat the accompaniment of a single person in the results obtained from the objective measurements, because most of the one-person accompaniment experiments were included in [6].

The volunteers were mainly students and workers of the Campus Nord. In all performed experiments, Tibi accompanied the participants in one of the three possible formations (side-by-side or group accompaniment at the central or lateral position) while the group walked between different places. Furthermore, other bystanders were walking around the campus and sometimes interfered with the path of the group, giving rise to obstacle avoidance situations. No instructions were given to the volunteers regarding their exact positioning with respect to the robot during the accompaniment, so that we did not coerce the behavior we wanted to study. Participants could also change their positions inside the formation, and the robot continued to accompany them correctly. We could see how Tibi turned at the end of the square while accompanying them, and also how Tibi accompanied them along any possible direction to the goal, not only along a straight line.

During all our experiments, we used the leg detection and tracking algorithm of Sect. 3 to select the people accompanied by the robot, using the identification numbers of the tracker. At the beginning of each experiment, we chose the two people nearest to the robot. Our tracking algorithm was able to keep track of both accompanied people during their interactions, even in unexpected cases where other pedestrians moved across the group and momentarily occluded one volunteer. The skill that generates a simulated person, Sect. 3.3.3.1, was used only when the occlusion of the second accompanied person persisted over time because the other companion occluded that person.
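The initial selection of companions can be illustrated as follows. This is a minimal sketch that assumes the tracker exposes a mapping from track identification numbers to planar positions; the function name and the data layout are hypothetical and are not taken from the tracker of Sect. 3.

```python
import math


def select_nearest_tracks(robot_xy, tracks, k=2):
    """Return the ids of the k tracks closest to the robot.

    tracks: dict mapping a tracker id to an (x, y) position in meters
    (an assumed layout, used here only for illustration).
    """
    rx, ry = robot_xy
    ranked = sorted(
        tracks,
        key=lambda tid: math.hypot(tracks[tid][0] - rx, tracks[tid][1] - ry),
    )
    return ranked[:k]


# Example with made-up positions: the tracks with ids 3 and 7 are the nearest.
example_tracks = {7: (1.0, 0.5), 12: (4.0, 4.0), 3: (-0.8, 0.2), 21: (2.5, -1.0)}
companions = select_nearest_tracks((0.0, 0.0), example_tracks)
```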

6.3.1 Results of the Objective Measurements Using Performance Metrics

Robot’s V-Formation

More than 70 people participated in the experiments of single-person or group accompaniment to evaluate the ASP-VG using objective measurements. Fig. 11-up shows three moments of a group accompaniment using the V-formation, and the results of all the experiments for this method are shown in Table 3.

Table 3 Performance results of the real-life experiments of the ASP-VG

Comparing the performance results obtained in both real environments (FME and BRL), the difference in performance values was due to the difference in available free space in each scenario. In the FME case, we had a square area of 15 × 15 m, whereas in the BRL case the area was three times larger. In the FME case, this smaller walking space meant that, because of the robot initialization period, the robot was not in the ideal position of the formation during most of the interaction time. This significantly affected the performance of the robot. The initialization period also affected the one-person accompaniment, but less than the group accompaniment. We also reduced this initialization period as much as possible in the final experiments at the BRL, where we compared both methods, because the ASP-SG had a shorter initialization time than the ASP-VG.

Fig. 11

Real-life experiments of side-by-side. We include six illustrative images to show the behavior of the robot while accompanying a group of two people

In the BRL location, the robot obtained performance values similar to those of the simulations for all cases (one-person accompaniment, and group accompaniment at the lateral and central positions) because the initialization period had little effect. However, the performance of the group accompaniment was lower than that of the one-person accompaniment, since it was easier for the robot to adapt to the behavior of one person than to two people simultaneously. Overall, all performance values are over 0.64, and when the initialization period has little effect, they are over 0.77. Videos of the ASP-VG are included on the website of the paper\(^2\).

Robot’s Side-by-Side Formation

As in the previous section, we compared the results obtained in the FME with the ones obtained in the BRL. For the ASP-SG, we also observed a reduction in performance due to the initialization time, but smaller than in the case of the ASP-VG method. Therefore, we decided to test the ASP-SG with non-expert people directly in the BRL location to obtain a more realistic behavior of the robot.

We tested the robot’s ASP-SG during 74 real-life accompaniment experiments in the BRL, Fig. 11-down. Table 4 shows the robot’s ASP-SG performances for the different cases of real-life experiments. All performances had a score over 0.6593, and the lowest performance score was similar to the one obtained in simulations with dynamic obstacles in terms of angle performance. Notice that this lowest value is for the angle of accompaniment, which is not evaluated in the V-form. This low angle performance was obtained with the robot positioned at the center of the group formation. In this situation, people move closer to interact with each other, which causes the robot to stay slightly behind so as not to collide with them. The group then forms a very slight V-formation instead of the “perfect” side-by-side, and the angle is not exactly 90 degrees. If we do not consider this lowest case, all the other performance values are over 0.77, as in the V-formation case. Videos of the robot’s ASP-SG experiments are included on the website of the paper\(^2\).

Table 4 Performance results of the ASP-SG real-life experiments

Discussion of Results Obtained Using Objective Measurements for Both Methods

If we compare the ASP-VG and the ASP-SG performances, we can only compare the area performances. We observe that all performances fall in similar ranges for all cases. The small differences between the results of both accompaniments could be due to the different situations in which the group needed to avoid obstacles or people, because with non-expert people and in a dynamic environment it was impossible to reproduce exactly the same situations for both methods at different instants of time.

Also, in both methods, we have similar performance values for the group accompaniment when the robot is at the side or in the center of the formation. We wanted to evaluate in depth the side-by-side case with the robot at the side of the formation, because in that case we use only the one-person accompaniment with the central person. In these real-life experiments, the performance values of the robot at the side of the group were similar to the simulation scores and higher than the performances of the robot in the center of the group. Therefore, when the robot was at the lateral position, using only the side-by-side accompaniment of one person was enough to obtain good results for the accompaniment of the whole group.

Fig. 12

User’s study results. Left: Comparison between the teleoperation and ASP-VG. Right: Comparison between the teleoperation and ASP-SG

Some results can be extracted from the comparison of the velocities of the robot and people during the accompaniment, Fig. 7. We observe that the Human-Robot Interaction using the game reduces the walking velocity of people, so that the robot’s maximum velocity of 1.2 m/s is now sufficient for a good accompaniment. The difference in the walking speed of the group between the central and lateral robot accompaniment (0.1 m/s less in the central case) is due to people moving closer to interact with each other, which makes the robot stay slightly behind the group because of the repulsive forces from its companions and causes people to slow down even more to interact well with the robot. If the robot is placed on one side of the group, people can hold a conversation at a closer distance without interfering with the robot accompaniment, and they can therefore walk faster than in the central case. These velocities are for the side-by-side accompaniment, but we obtained very similar graphical results in the case of the V-formation.

The work would have been much stronger if we had compared it with other existing approaches. Nevertheless, at this moment, there are no state-of-the-art approaches that can be compared with our work. The state-of-the-art approaches do not consider the robot as an equal partner for the group accompaniment of more than one person. They only accompany one person, or they only maintain group cohesion. They do not maintain a formation that facilitates the communication of the group members, where people can consider the robot as one more active member of the group. Nevertheless, this fact allowed us to develop two group accompaniment methods and to perform an extensive and comprehensive user study to extract conclusions from non-expert participants about their preferences regarding the robot’s group accompaniment. Also, the differences in the robots’ characteristics and the limited availability of code for state-of-the-art approaches make a comparison between state-of-the-art methods difficult.

6.3.2 Results of Subjective Measurements Using Surveys

We have developed five different survey studies to determine the acceptability of the methods and to study the preferences of people who are not experts in robotics. We used several questionnaires, included on the website\(^2\) of the paper, to convert the subjective opinions of the volunteers about the behavior of our robot into quantitative data that can be analyzed to extract conclusions. First, we compared each of the methods with the teleoperation of the robot by an expert to determine whether each model enhances the robot’s companion behavior. Then, we conducted three more studies: one about the differences between the two group accompaniment methods, and two about the differences, within each method, between the two formations regarding the position of the robot. These last three studies aim to ascertain human preferences regarding the type of group accompaniment and the position of the robot inside the group formation of both methods. The sections for each user study are listed next.

  • Sect. 6.3.2.1 includes the comparison between a human teleoperating the robot and the ASP-VG. Results are included in S1) of Fig. 12

  • Sect. 6.3.2.2 includes the comparison between a human teleoperating the robot and the ASP-SG. Results are included in S2) of Fig. 12

  • Sect. 6.3.2.3 includes a comparison between the two methods of group accompaniment (ASP-VG and ASP-SG). Results are included in S3) of Fig. 13

  • Sect. 6.3.2.4 includes a comparison between the V-form group accompaniment (ASP-VG), when the robot is at the side or when the robot is in the center. Results are included in S4) of Fig. 13

  • Sect. 6.3.2.5 includes a comparison between the side-by-side group accompaniment (ASP-SG), when the robot is at the side or when the robot is in the center. Results are included in S5) of Fig. 13

Fig. 13

User’s study results. Left: Comparisons between the ASP-VG and ASP-SG. Center: Comparisons of the ASP-VG with the robot located at the lateral or at the center of the formation. Right: Comparisons of the ASP-SG with the robot at the lateral or center of the group

For all studies, the independent variable is the way the robot is positioned with respect to the people (teleoperated, using the side-by-side group accompaniment, or using the V-formation group accompaniment). The dependent variables are the “intelligence”, the comfortableness, and the sociability of the behavior of the robot as perceived by the volunteers during the HRI. In the three new studies, we also include the interaction between the two people of the group as a dependent variable, to determine whether the position of the robot interferes with it. In each study, we focus on the null hypothesis that people will perceive the two compared robot accompaniment behaviors as equal.

Questions of the user studies were rated on a seven-point scale from “Not at all” to “Very much”. To analyze the results, we grouped the questions into four topics related to our dependent variables: people’s interaction, robot’s sociability, intelligence, and comfortableness. The questionnaires are included on the website of the paper\(^2\). Regarding the robotics knowledge of the participants, we obtained a mean of 3 with a standard deviation of 1.5. Therefore, most of the participants were non-experts in robotics and, consequently, potential users. Participants were mostly students and a few workers of the University, aged between 11 and 58 years. Furthermore, around 70% of the participants were men.

To determine the reliability level of each scale, we used Cronbach’s alpha analysis. Each scale response was computed by averaging the results of the survey questions comprising the scale. These scales surpassed the commonly used 0.7 level of reliability (Cronbach’s alpha)\(^5\). We ran ANOVA tests on each scale to highlight differences between the two behaviors of the robot compared in each user study. Beforehand, we used the Shapiro-Wilk test to test the null hypothesis that the data were drawn from a normal distribution. The test did not reject this hypothesis, so we were able to run the ANOVA tests. Furthermore, we include an extended discussion of people’s preferences during the robot’s accompaniment, extracted from the experiments of the user studies, in Sect. 7.
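For readers unfamiliar with the reliability measure, the following sketch shows the standard formula for Cronbach’s alpha, \(\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i \sigma_i^2}{\sigma_T^2}\right)\), where \(k\) is the number of questions in the scale, \(\sigma_i^2\) is the variance of question \(i\), and \(\sigma_T^2\) is the variance of the summed scores. The function names and the example ratings are made up for illustration and are not data from our studies.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: list of rows, one per participant; each row holds that
    participant's ratings for the k questions of the scale.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two questions and two participants")
    columns = list(zip(*responses))                      # ratings per question
    item_variances = [statistics.variance(col) for col in columns]
    total_variance = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def scale_scores(responses):
    """Scale response per participant: the average of the scale's questions."""
    return [statistics.fmean(row) for row in responses]


# Made-up seven-point ratings from five participants on a three-question scale.
example = [[6, 5, 6], [4, 4, 5], [7, 6, 7], [3, 4, 3], [5, 5, 6]]
alpha = cronbach_alpha(example)
```

A scale would then be considered reliable when the computed value exceeds the 0.7 level mentioned above.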

6.3.2.1 Robot’s ASP-VG vs Robot’s Teleoperation

We performed 174 real-life experiments in the FME and North Campus of UPC with the Tibi robot: 87 using the ASP-VG method and 87 controlling the robot by teleoperation. Here, we compare the behavior of one of our methods, the ASP-VG, against the robot’s teleoperation by a human. We compared both using the three possible formations regarding the number of members of the group (two or three) and the position of the robot within the formation of the group (side or center).

Social Scales: We obtained a 0.71 level of reliability, Cronbach’s alpha, for both scales of the robot’s sociability and comfortableness felt by the volunteers. ANOVA tests were run on each scale, the robot’s sociability and comfortableness. The mean and standard deviation scores are shown in Fig. 12-S1). Pairwise comparison with Bonferroni’s technique showed no statistically significant difference between the two navigation approaches (\(p>0.05\)). Concretely, we obtained \(p=0.5\) for the robot’s sociability and \(p=0.2\) for the robot’s comfortableness. Therefore, there is no statistically significant difference between the proposed ASP-VG method and teleoperation, and our null hypothesis that both methods are perceived as equal is not rejected.

6.3.2.2 Robot’s ASP-SG vs Robot’s Teleoperation

The robot accompanied 148 people at BRL. To compare the two behaviors, it was randomly selected whether the robot was teleoperated or used our ASP-SG method. Each person filled out a survey about their feelings regarding the accompaniment experience. Then, we compared both robot behaviors using the three possible formations: one-person or two-people side-by-side group accompaniment, where the robot can position itself at the side or in the center of the formation.

Social Scales: We obtained a 0.75 level of reliability, Cronbach’s alpha, for both scales of the robot’s sociability and comfortableness felt by the volunteers. ANOVA tests were run on each scale, the robot’s sociability and comfortableness. The mean and standard deviation scores are shown in Fig. 12-S2). Pairwise comparison using Bonferroni’s technique did not show a statistically significant difference (\(p>0.05\)). Concretely, we obtained \(p=0.9\) for the robot’s sociability and \(p=0.2\) for the robot’s comfortableness. Therefore, there is no statistically significant difference between the ASP-SG method and the teleoperation behavior, and our null hypothesis that both methods are perceived as equal is not rejected.

6.3.2.3 ASP-VG vs ASP-SG

We performed 120 experiments in the BRL, alternating both methods, the ASP-SG and the ASP-VG, to accompany groups of people. We compared the same formation for both methods; that is to say, the volunteers walked to the first goal using the ASP-SG with the robot in the center of the group, and they returned using the ASP-VG with the robot in the central position. The same procedure was applied when the robot was situated at the lateral position of the group’s formation. During the comparison, we reduced the accompaniment distance to 1 m, because we wanted the same conditions in both methods and, in practice, the side-by-side approach in the central position allowed a variable distance.

Social Scales: We obtained a 0.7 level of reliability, Cronbach’s alpha, for all four scales (robot’s sociability, comfortableness, and intelligence, and people’s interaction). We included the person’s interaction to remark any perceived difference regarding the interaction of the two persons while the robot was accompanying them in a concrete formation, at lateral or in the center of the group’s formation, since the position of the robot can interfere with the interaction between both people. For example, when the robot is in the center, it can interfere in the conversation between the two people participating in the experiments; or if the robot is at the side, it can be uncomfortable for the central person to turn around every time he/she wants to interact with the robot.

We performed an ANOVA test for each scale to highlight similarities or differences between the robot’s two operation modes: the ASP-SG and ASP-VG methods. The results are included in Fig. 13-S3. Pairwise comparison using Bonferroni’s technique shows no statistical difference, \(p>0.05\), for the cases of: person’s sociability (\(p=0.54\)), robot’s sociability (\(p=0.2\)), and robot’s intelligence (\(p=0.2\)). In contrast, we obtained a statistical difference, \(p<0.05\), for the robot’s comfortableness (\(p=0.02\)).
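The Bonferroni technique mentioned above can be sketched as follows: each raw p-value is multiplied by the number of comparisons (capped at 1) before it is compared with the 0.05 threshold. This is only a minimal illustration of the generic correction. The raw p-values below are hypothetical, and the number of comparisons actually corrected for in the user studies is not stated in this section.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-value, significant?) for each raw p-value.

    The adjusted value is min(1, p * m), where m is the number of
    comparisons made.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]


# Hypothetical raw p-values for four scales.
# These are NOT the values reported in the paper.
raw = {
    "sociability": 0.30,
    "comfortableness": 0.005,
    "intelligence": 0.12,
    "interaction": 0.40,
}
results = dict(zip(raw, bonferroni(list(raw.values()))))
for scale, (p_adj, significant) in results.items():
    print(f"{scale}: adjusted p = {p_adj:.3f}, significant = {significant}")
```

With these made-up inputs, only the comfortableness scale stays below the threshold after the correction, which mirrors the kind of outcome described in the text.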

The mean value of the robot’s comfortableness was higher for the ASP-SG. This result may be related to comments that some participants made to us, which are included in the discussion section. They might have considered it more comfortable to see the robot at all times (the behavior of the ASP-SG) rather than to see it ahead of or behind them (the behavior of the ASP-VG). Therefore, it seems that inexperienced people accept both methods equally, except for the comfortableness factor, because people preferred to see the robot closer, and this behavior is best achieved with the ASP-SG method. Then, our null hypothesis that both methods are perceived as equal is confirmed, except for the comfortableness. In addition, if differences between the methods were found, we expected the V-formation to be considered the best, since studies of accompaniment between people [8,9,10, 72, 73] have found that it is the formation that arises naturally in groups of people. However, for our robotic platform and our group accompaniment methods, this was not the case. Therefore, we used our volunteers’ comments to understand why they preferred the side-by-side formation over the V-formation even though, in theory, the V-formation is more natural for groups of people.

6.3.2.4 ASP-VG: Robot’s Lateral vs Central Positions

We performed a user study with 37 volunteers to compare the two possible ASP-VG formations, with the robot at the lateral or the central position. Participants answered a set of questions comparing both V-formations.

Social Scales: All scales obtained a reliability level of 0.82 (Cronbach’s alpha). In the ANOVA test of each scale, a pairwise comparison using Bonferroni’s technique showed no statistical difference, \(p>0.05\), for all the cases: person’s sociability (\(p=0.81\)), robot’s sociability (\(p=0.2\)), robot’s intelligence (\(p=0.85\)), and robot’s comfortableness (\(p=0.2\)). These results are included in Fig. 13-S4. Therefore, people did not show any preference regarding the position of the robot in the V-formation during the interaction with it, and our null hypothesis that people will be indifferent to the position of the robot during the accompaniment is confirmed.

6.3.2.5 ASP-SG: Robot’s Lateral vs Central Positions

We also performed another study in which 50 people answered a set of questions about the two possible side-by-side formations, with the robot at the central or the lateral position.

Social Scales: We obtained a reliability level of 0.82 (Cronbach’s alpha) for all scales. In the ANOVA tests of the ASP-SG user study, pairwise comparison using Bonferroni’s technique showed no statistical difference, \(p>0.05\), for all the cases except the person’s interaction, where we found \(p<0.05\), with a higher mean for the robot positioned at the lateral. Concretely, we obtained \(p=0.41\) for the robot’s sociability, \(p=0.91\) for the robot’s comfortableness, \(p=0.8\) for the robot’s intelligence, and \(p=0.01\) for the person’s interaction. These results are included in Fig. 13-S5.

Comparing the two possible formations concerning the position of the robot in the side-by-side accompaniment, non-expert people prefer that the robot accompany them at the lateral position of the formation, so that they are next to the other person in the group and can interact with him/her. Then, our null hypothesis that people will be indifferent to the position of the robot during the accompaniment is confirmed, except for the person’s interaction. We expected that the robot at the side of the formation would be more comfortable for people, since a robot in the center of the side-by-side formation can make it difficult for the two people to communicate with each other. This expectation was confirmed by the results of the surveys and by people’s comments about why they preferred the robot at the side of the formation. Nevertheless, we must also consider that this effect did not appear when comparing both methods. Therefore, a possible future work would be to implement an intermediate formation between the side-by-side and the V-formation to ensure that the volunteers always feel comfortable with the group formation while interacting. This new method should then be compared with the two previous methods.

7 Discussion

There are different real-life roles and functionalities where robots can assist people using collaborative navigation [40]. Robots can be used as museum or city tour guides [34, 70, 74], shopping assistants [75], social companions for the elderly [45, 76], or wheelchair autonomous systems that can navigate alongside their caregivers [57,58,59].

7.1 Novelties of the Current Paper

The novelties of the present paper with respect to the previous ones and the state-of-the-art are as follows:

(1) The proposed Adaptive Social Planner provides a general methodology to implement HRCN. This general formulation has not been presented in any of our previous works or in any state-of-the-art work. Some of the applications that can result directly from the ASP are robot navigation [1], a robot accompanying a person [2] or multiple people [3, 4], or a robot approaching people without any companion or with an accompanied person [5, 6]. Furthermore, other robot behaviors previously implemented had the ESFM as their core, which is part of the ASP. These methods combine the ESFM with learning algorithms or use the ESFM to achieve human-drone interaction [16,17,18]. Therefore, these robot behaviors can use the ASP method to include more functionalities.

These functionalities can be extended because the ASP includes at least three improvements for these methods. First, the ASP includes a planning algorithm that allows the robot to anticipate the movements of people and not only react to them. Second, it includes other interaction forces that model robot interactions with the environment, for example, the interaction between the robot and objects. Third, the ASP includes a path evaluation to select the path considered the best. For instance, we can include some preferences of people in the selection of the path. Additionally, we have not studied all the interaction forces that the generic ASP method includes. Therefore, future works will possibly include other types of collaborative robot social navigation with humans.

One difference between our method and the state-of-the-art works, which we have not mentioned previously, is that they apply their formulation, concretely the SFM, to only one application. This does not show the complete potential of the SFM methodology. Additionally, most of them do not combine any planning method with the SFM, which allows only a reactive robot behavior that cannot anticipate its actions in the environment. As a result, they obtain suboptimal results in navigation with humans. In contrast, in the ASP, where we combine the RRT* with the ESFM, we obtain a complex behavior that allows the robot to navigate in uncontrolled environments with people. Moreover, we provide a general ESFM formulation that includes attractive and repulsive forces with respect to all the elements included in urban areas: places, objects, people, animals and robots. Finally, we include a general cost formulation to evaluate the planned paths related to all these force interactions, the geometric properties of the path, and the preferences of people to select the best path.

(2) The experimentation procedure, developed and refined over the years at the Institut de Robòtica i Informàtica Industrial (IRI), can be used by other researchers as a step-by-step guide to successfully complete real-life experiments with people inexperienced in robotics. Our experimental procedure has evolved over more than 600 experiments with potential users, resulting in a robust procedure for conducting experiments. This procedure enables users to fully understand the interaction with the robot and feel safe by explaining all the robot safety protocols while allowing researchers to know their tasks during the experiments. Furthermore, to these experiment guidelines, we add a robot speech interaction that allows the members of a group to form a relationship among them and to better understand the robot behaviors, as described in Sec. 6.2. The speech interaction allows the group to perform a mutual social and natural human-robot accompaniment.

Moreover, concerning the state-of-the-art, most works omit an explanation of the protocols that they followed during their experiments. In addition, many of them do not include the documents used. This means that many beginning researchers have to start from scratch when carrying out their experiments, especially in the case of potential users who are not experts in robotics. Therefore, we consider it a contribution to include our entire procedure for carrying out the experiments and the documents we used. We believe that it is an important contribution to the state-of-the-art, considering our experience in uncontrolled urban environments with volunteers who are not trained in robotics. In addition, this procedure enables us to provide a pleasant robot interaction for people who do not usually interact with robots. Encouraging an enjoyable interaction with the robot is crucial to ensure that robots are accepted in our societies.

(3) We have developed a complete evaluation of the implemented methods for the accompaniment of groups by including additional results and evaluations of all simulations and real-life experiments and by including three user studies that have not been published previously. These three user studies are focused on the comparison between the robot’s group accompaniments and the comparison between each method concerning the two possible formations regarding the position of the robot inside the group, in the center or at the lateral. With these three new user studies, in combination with the two previous ones (that compare the two robot behaviors with the teleoperation by a person), we provide better insight into the accompaniment preferences of people in groups of three.

As far as we know, concerning the state-of-the-art, we are the first to develop two methods that allow the robot to accompany more than one person in uncontrolled urban environments while promoting communication among all group members. Our social interaction refers to the ability of group members to see each other’s faces in order to communicate through speech or gestures. Our methods attempt to maintain, as much as possible, these two formations that promote social interaction, only breaking them in cases where the only possible path for the group includes avoiding obstacles or other people in these uncontrolled environments. For this reason, we cannot compare ourselves with other methods. However, this fact has allowed us to develop two types of group accompanying methods, and thus, to perform a better user study of the preferences of people when they are accompanied by a robot which is an active part of the group as one more “coworker/friend.”

7.2 Preferences of People for Robot’s Accompaniment

The first two user studies demonstrate that people accept the two group accompaniments: V-formation and Side-by-Side formation. Regarding the comparison between both formations, the V-formation and the Side-by-Side formation, people prefer the ASP-SG in terms of comfortableness because the robot was within their field of view. We extracted some conclusions from the comments after the experimentation process.

First, inexperienced people preferred to see the robot at all times and to feel that the robot was as close as possible. Second, to be accepted, the central V-form needed to be as small as possible so that people felt that the robot was close to them. In that case, they did not think that the robot was behind them (sometimes, with the V-form at the central position, they thought that the robot was not able to accompany them, and they reduced their walking velocity), nor did they think that the robot was ahead of them (sometimes, with the V-form at the side position, they thought that the robot went “alone” to the goal and did not wait for them). Third, if the group could perform a very small V-form, it would be considered very similar to the side-by-side formation. In that case, unskilled people would not have been able to differentiate between the two formations.

Regarding the preferences with respect to the position of the robot inside each formation, V-form and Side-by-side, we identify different conclusions. In the ASP-VG, we did not find any preference with respect to the position of the robot. However, in the case of the ASP-SG, we observed that people preferred the robot at the lateral position to interact easily with the human partner. During the experiments, several volunteers told us that speaking with the other volunteer was difficult if the robot was in the middle position. Additionally, this difference was not seen in the V-form. However, in the user study where we compared the ASP-VG with the ASP-SG in Sect. 6.3.2.3, people felt more comfortable with the robot when it was within their field of view as in the side-by-side formation, even in the case where the robot is in the center.

Finally, most volunteers said they preferred the accompaniment when the speech interaction was included; that is, people appreciated the verbal exchange with the robot. However, when we attempted to compare the cases with and without the robot speech interaction, the surveys did not show any statistical difference. This may be because these surveys were customized to highlight differences in terms of accompaniment and not human-robot spoken interaction.

7.3 Limitations

Due to occlusions, it may be challenging to track all members of a large group, and communicative interaction among all members would be difficult. Thus, it is natural to focus on a specific implementation limited to groups of 2 or 3 members, as larger groups tend to split into subgroups of two or three people [8, 9, 77]. We would like to emphasize that both methods implemented in this paper can be easily extended to accompany groups of more than 3 members. The mathematical model of the ASP-VG is a general N-pedestrian model. The ASP-SG could also be easily extended to accompany groups of more than 3 members because it is constructed from independent forces for each group member. Furthermore, we could add more forces between other accompanied people, only considering their expected positions inside the formation of the group.

Moreover, we extracted some conclusions from all the real-life experiments. For instance, people attempted to behave as naturally as possible with the robot. However, due to the size of our robot, it was complicated to reduce the distance for safety reasons. Additionally, we believe that a final distance of 1 m between the robot and the person is a reasonable distance that provides security and comfort. In addition, it is difficult to increase the velocity due to the mass of Tibi, but studies of walking behavior of people [71] demonstrate that people use similar velocities when walking around.

Finally, we must address some cultural and spatial limitations of both methods. The ideal distances of accompaniment and the maximum velocities are customized for our robot and for European people. Therefore, if these methods are applied in other cultures, the parameters must be adjusted. Our two methods of group accompaniment can also deal with passageways. However, the ideal formations can be obtained only in vast spaces, such as museums, airports, malls, or urban areas.

7.4 System Modularity

In addition, we would like to highlight that the reader can use other methods as input for the ASP by only respecting the data that it needs at its input: all the actual and future positions of people, all the obstacles of the environment inside its navigation window, the localization of the robot inside the map, and one destination for each person of the environment.
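The inputs listed above can be collected in a small container such as the following sketch. The class and field names (`PersonTrack`, `PlannerInput`, and so on) are hypothetical and chosen only for illustration; they are not the interface of our implementation. Any perception or prediction module that fills such a structure could, in principle, feed the planner.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) in map coordinates, in meters


@dataclass
class PersonTrack:
    position: Point          # current position of the person
    predicted: List[Point]   # future positions from any prediction method
    destination: Point       # one destination per person


@dataclass
class PlannerInput:
    people: Dict[int, PersonTrack]          # keyed by a tracker id
    obstacles: List[Point]                  # obstacles inside the navigation window
    robot_pose: Tuple[float, float, float]  # (x, y, heading) of the robot in the map

    def validate(self) -> None:
        """Check that every person comes with at least one predicted position."""
        for person_id, person in self.people.items():
            if not person.predicted:
                raise ValueError(f"person {person_id} has no predicted positions")


# Hypothetical usage with one tracked person and one obstacle.
sample = PlannerInput(
    people={1: PersonTrack((0.0, 0.0), [(0.5, 0.0), (1.0, 0.0)], (10.0, 0.0))},
    obstacles=[(3.0, 1.5)],
    robot_pose=(0.0, 1.0, 0.0),
)
sample.validate()
```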

For example, the ASP method uses the localization of the robot inside the map to compute all the distances between the robot and the elements of its environment (people and obstacles). If the map is removed, this information must be provided in another way. In addition, the ASP does not strictly need a real destination in the environment: it can work without knowing the map, using only a destination projected 5 meters ahead from the group movement. However, it is more realistic to use the places in the environment where people should go. Furthermore, the reader can also replace the RRT* with other planning algorithms, but this will require redefining the method because this planner is integrated inside it. We selected the RRT* because it allows us to obtain, in real time, multiple paths whose origin is the current position of the robot, which makes it easier to integrate the ESFM at every step of the way.
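The map-free destination mentioned above can be sketched as a point placed 5 meters ahead of the group along its current direction of motion. The following minimal example assumes that the group movement is summarized by the centroid of the members and their mean velocity; this is an assumption made for illustration, and the function name is hypothetical.

```python
import math


def projected_destination(positions, velocities, lookahead=5.0):
    """Project a goal `lookahead` meters ahead of the group.

    positions, velocities: lists of (x, y) tuples, one per group member.
    Returns None when the group is (almost) standing still, because
    there is no heading to project along.
    """
    n = len(positions)
    cx = sum(p[0] for p in positions) / n
    cy = sum(p[1] for p in positions) / n
    vx = sum(v[0] for v in velocities) / n
    vy = sum(v[1] for v in velocities) / n
    speed = math.hypot(vx, vy)
    if speed < 1e-6:
        return None
    # Unit heading of the group times the look-ahead distance.
    return (cx + lookahead * vx / speed, cy + lookahead * vy / speed)


# Two people walking side by side along the x axis at 1 m/s.
goal = projected_destination([(0.0, 0.0), (0.0, 1.0)], [(1.0, 0.0), (1.0, 0.0)])
print(goal)  # (5.0, 0.5)
```

Returning no destination for a stationary group is one possible design choice; a real system could instead keep the last valid projection.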

7.5 Future Work from the ASP

Future work can be extracted from this paper to develop the interactions between the robot and the environment that we have not been able to study. For example, we have not explored how to model the repulsive forces regarding one destination and its corresponding cost to avoid paths near this destination. It would be very interesting, for instance, for the robot to know that the person is in a wheelchair and to adapt its behavior by using repulsive forces with respect to destinations, so as to avoid stairs that the person cannot use. Also, we have not included attractive forces concerning objects of the environment to combine the planning with other interactions, such as grasping a glass. In the ASP, these forces include a part related to the costs of selecting these paths, which should be modified at the same time.

Furthermore, other types of HRCN can have different forms or parameters for the forces and the cost that we already studied. For example, dancing with a robot should include repulsive and attractive forces with respect to the person dancing with the robot. However, these forces are different from those for accompaniment or approach. Finally, regarding costs, we have not explored the enormous possibilities that can arise when using the preferences of people in path selection, which can lead us to obtain customized robots. Additionally, these preferences could be set directly by the person through an interface designed for potential users who are not related to robotics. For example, if we are tired and do not want to climb stairs, the robot should select paths without stairs.
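The stairs example above can be sketched as a preference term added to a path cost. The following minimal example uses a plain weighted sum and picks the cheapest candidate; it is not the cost formulation of the ASP, and the class, field names, and weights are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CandidatePath:
    name: str
    length: float       # geometric length of the path, in meters
    social_cost: float  # hypothetical aggregate of the interaction costs
    has_stairs: bool


def path_cost(path, weights, avoid_stairs):
    """Weighted sum of cost terms plus an optional user-preference penalty."""
    cost = weights["length"] * path.length + weights["social"] * path.social_cost
    if avoid_stairs and path.has_stairs:
        cost += weights["stairs_penalty"]
    return cost


def select_path(paths, weights, avoid_stairs=False):
    """Return the candidate path with the minimum cost."""
    return min(paths, key=lambda p: path_cost(p, weights, avoid_stairs))


# Hypothetical candidates and weights.
paths = [
    CandidatePath("stairs", length=10.0, social_cost=1.0, has_stairs=True),
    CandidatePath("ramp", length=14.0, social_cost=1.0, has_stairs=False),
]
weights = {"length": 1.0, "social": 2.0, "stairs_penalty": 100.0}

print(select_path(paths, weights).name)                     # stairs
print(select_path(paths, weights, avoid_stairs=True).name)  # ramp
```

Without the preference, the shorter path with stairs is selected; once the user asks to avoid stairs, the penalty makes the longer ramp the cheapest option.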

8 Conclusions

This work presents an entire system that can perform different behaviors of HRCN, whose core is the Adaptive Social Planner (ASP). This method is one of the main contributions of the paper. The ASP combines an RRT* path planner with the new Extended Social Force Model and a new formulation of path costs to select the path with a minimum cost using a gradient descent optimization. The output of the ASP is the best robot behavior for accomplishing human-robot collaborative navigation.

In previous works, we demonstrated that the ASP can be customized to perform other types of HRCN, such as robot navigation [1], robot accompaniment of a person [2], robot group accompaniment of two people [3, 4], a robot approaching people [5], or a combination of accompaniment and approaching people [6].

Furthermore, there are other methods that include only the ESFM part of the ASP [16,17,18], where the ESFM customization allows a humanoid robot to perform navigation tasks that combine the ESFM with learning or enables the interaction of people with other types of robots, such as a drone. These methods can be improved by including the ASP or at least part of its characteristics. Therefore, the ASP method can be applied to accomplish these types of tasks using different types of robots.

In this paper, this navigation framework has been customized to perform two different robot group accompaniment methods, ASP-VG [3] which uses a V-formation and ASP-SG [4] which uses a side-by-side formation. These two formations allow the group to communicate among themselves most of the accompaniment time, only breaking this formation to facilitate future interactions with other pedestrians and obstacles.

Moreover, the ASP and its two derived methods include social distances and other findings on human-robot comfortableness to allow more social and natural robot behavior. To evaluate these aspects and the performances of the two robot formations, we developed two sets of performance metrics, one set for each method.

Another contribution of the presented paper is the development of a complete evaluation of the group accompaniment methods. We tested both methods in synthetic experiments (more than 5,300 simulations) and real-life experiments (322 experiments with untrained volunteers) in two outdoor environments, obtaining promising results. The real-life experiments include five different user studies. The results of these studies show that nonexperts in robotics accept both accompaniment methods. However, they prefer the side-by-side over the V-form because they consider that the robot is closer to the group, relates more to them, and makes them feel more comfortable with this behavior of the robot. In addition, in the side-by-side accompaniment, they prefer that the robot accompanies them on the side to communicate better with the other person in the group as the robot does not interfere with their field of view.

The final contribution of this work is the description of the methodology that we developed to perform real-life experiments with volunteers who are not experts in robotics. Most state-of-the-art approaches do not include their methodologies, much less the documents they use to carry out their user studies, such as people’s consent, experiment instructions, and survey questionnaires.

Furthermore, we include the robot’s spoken communicative interaction in the new methodology. This speech interaction has three advantages: it allows people to create a relationship among all the members by using a game; it allows people to interact only with the robot, which uses automatic phrases to remind them to do some actions (for example, filling in the survey when the HRI ends); and it allows people to better understand the behavior of the robot by informing them of internal robot states. Moreover, we expect that our robot’s speech interaction can be an inspiration for new researchers who encounter similar problems, such as the need to create a relationship among the group that facilitates HRI. Additionally, we expect that our example will allow researchers to develop complete interactions between people and robots in the future, not only including the spatial interaction during accompaniments or other tasks. Volunteers highly appreciated the new spoken communicative interaction of the robot. We are “social animals” who communicate not only through actions or gestures; a large part of our communication is spoken. This makes us prefer robot behaviors that include speech rather than just actions or gestures.