Introduction

The improvement of productivity is a key feature to solve the problem of labor shortages in an aging society. Industrial robots have contributed to this issue by automating the repetitive tasks in structured environments such as factories, but there are still numerous tasks that are difficult for them and human workers have to work on. Humanoid robots are expected to be promising solutions to take over these tasks because they have human-like body structures to work in unstructured environments designed for humans. In these scenarios, humanoid robots need to traverse challenging environments and perform tasks in a human workspace while interacting with the environment with both its arms and legs, which we call multi-contact activities, as shown in Fig. 1. For example, disaster response in a plant is one of the scenarios where humanoid robots are needed because the environment is originally designed for human workers and cannot be adjusted for robots. In that case, humanoid robots are required to perform multi-contact activities such as crawling to expand their support area for safety and climbing up a vertical ladder [2,3,4]. Large-scale manufacturing such as house construction [5] and airplane manufacturing [6] is another promising application for humanoid robots. Since it is difficult to introduce a production line for the large structures, the humanoid robots are expected to traverse the inside of the target structure using stairs with handrails [7•] or go into a confined space [8] interacting with the environment.

Fig. 1
figure 1

Example of multi-contact activities by the humanoid robot HRP-5P [1]. Left: crawling by the quadrupedal locomotion, Center: getting over a fence, Right: reaching an object while grasping handrails in a scaffold

This review provides a brief overview of the recent research, mainly on model-based approaches, to solve the problems of planning and controlling multi-contact activities for humanoid robots. We clarify the difficulties to achieve multi-contact activities and decompose the entire problem into hierarchical subproblems to make it tractable. Then, we investigate the technological advancements in each subproblem and discuss the open problems to improve the capability of a humanoid robot in real-world applications. We note that there is a book chapter on this topic written by Bouyarmane et al. [9]. Although it mainly focused on the methodologies to formulate multi-contact activities as optimization problems, we cover not only optimization-based motion generation but also search-based contact planning and review recent research updates in this field.

Fig. 2
figure 2

A general structure of a robot system to achieve multi-contact activities by humanoids. Based on the environmental measurements, a contact planner finds a feasible contact sequence and a motion planner generates the kinematically and dynamically consistent motion of the robot. Given the desired motion, the real-time controller sends commands to each joint of a robot while generating a CoM trajectory and contact forces to stabilize it under disturbances

Problem Statement

The notable characteristics of multi-contact activities by humanoids are the non-coplanar contact and acyclicity of the contact transitions [10]. They expand not only the capability of humanoid robots but also the complexity of the problem formulation compared to the classical bipedal locomotion framework. The search space of candidate contact transitions increases because we need to consider a large set of potential contact between the robot and the environment in the 3D space. The acyclic contact pattern, which may change the number of contacting limbs by making or breaking contact, also increases possible contact transitions to be considered. To guarantee the feasibility of the resulting motion, we need to explicitly consider the whole-body kinematics and centroidal dynamics of the robot with a high degree of freedom (DoF). The constraints of contact with the environment make the multi-contact motion planning problem in configuration space more difficult than traditional path planning because contacting limbs should reach the environment while other parts of the robot need to avoid collision with it.

Since it is computationally expensive to find feasible motion considering the above complex features, a robot system for multi-contact activities generally consists of the offline planning to generate globally optimal motion of the robot and a real-time controller that achieves the desired motion while compensating disturbances utilizing the sensor information as shown in Fig. 2. In the later part of this review, we focus on the planning and control of environmental interaction to achieve multi-contact activities assuming that environmental information has already been obtained. In terms of the offline planner, its major difficulty is the coupled problem of contact and motion planning. The whole-body motion of the robot is governed by the joint torques and contact forces through contact of the limbs, but the sequence of contact should be selected so that there is a continuous motion of the robot which satisfies its kinematic and dynamic constraints. Some holistic approaches consider contact in the optimization problem for whole-body motion planning [11, 12, 13•, 14]. However, it is difficult for these optimization-based approaches to globally search both the optimal contact sequence and motion of the robot because they can find only local optimum around its initial guess in principle. Therefore, decomposing the entire problem into contact and whole-body motion planning sequentially as shown in Fig. 2 is commonly adopted though the decomposed problems are not strictly equivalent to the original one.

Contact Planning

The target of contact planning for multi-contact activities is finding a globally optimal contact sequence to navigate the robot from the current to the desired state with a feasible motion in the given environment. Traditionally this problem was solved by the hierarchical method that first searches a sequence of discrete sets of simultaneous contact, which are called stances, and then plans continuous motion to reach them. This method is called the “stance-before-motion” approach, which is originally investigated in the field of climbing robots [15] and then introduced to humaonids [16]. The problem is that there are infinite sets of possible contact between the robot and environment in mult-contact activities. A sampling-based approach such as a probabilistic roadmap (PRM) is a classical solution to this problem [17], but the required number of stances to cover the possible contact transitions is extremely large. The major issue of contact planning is how to efficiently reduce the search space of the candidate contacts by focusing on its promising part.

Search-Based Contact Planners

There are several contact planners which utilize search-based methods such as A* [18]. The sequence of contact transitions can be modeled as the graph structure, and graph-search algorithms can efficiently find an optimal path in it without the exhaustive search if we can properly design the costs and heuristics. Dornbush et al. [19] proposed the single search algorithm to plan multi-modal humanoid locomotion such as walking, crawling, and climbing in an adaptively-dimensional search space with multiple heuristics. However, it is not easy to design appropriate costs and heuristics for graph-search algorithms in contact planning because we need to estimate the cost of the target contact transition in the future. In recent works, these features which are not easy to be estimated are represented by neural networks instead of explicit functions. Lin et al [20] trained neural networks to estimate the robustness in centroidal dynamics of the contact transition and consider it in the cost function for Anytime Non-parametric A* (ANA*) search to find a dynamically consistent contact sequence. Noda et al. [21] used neural networks to estimate whether keyposes can be expected to the goal in multi-contact locomotion planning.

Improve Efficiency of Global Contact Planning

If the motion of the robot can be roughly estimated in advance, it is reasonable to search for candidate contacts around it. Escande et al. [22] introduced a potential field to guide the best-first search for contact transitions. This potential field is generated from the small number of key postures for guidance. These can be automatically computed by projecting a collision-free posture to the contact manifold in the configuration space with the statics constraint [23], but it is computationally expensive to solve inverse kinematics in each projection to confirm its validity. Tonneau et al. [24] tackled this problem by focusing on the reachability of the robot. They constructed the reachable workspace of the limb, which is the convex volume of the end-effector poses for randomly sampled configurations of the target limb, and search the trajectory of the root link such that the trunk of the robot is collision-free while reachable workspaces of limbs can interact the environment. Then, they deterministically find the contact sequence along the obtained path by generating the sequence of whole-body configurations with the static equilibrium while retrieving the sampled limb configurations. This idea of approximating the contact manifold by geometrical condition improved the computational efficiency of contact planning. Kumagai et al. [25] extended this approach by narrowing down the possible range of feasible contacts focusing on the sustainability of contact. Lin et al. [26] evaluated the guiding path for the contact planner based on the traversability of the environment, which represented its difficulty of contact planning by learned regressors. Since we decomposed the multi-contact motion planning problem into contact planning and motion planning, it is important in contact planning to guarantee that the motion planner can generate a kinematically and dynamically consistent trajectory to follow the resulting contact sequence. Fernbach et al. [27] proposed the convex resolution of centroidal dynamic trajectories(CROC), which represents a conservative CoM trajectory for the contact transition by a Bezier curve. This method was extended to the continuous formulation and integrated with a contact planner to provide a conservative feasibility criterion of the target contact transition because it can be efficeintly computed by solving linear program (LP) [28••].

Motion Planning

Motion planning is responsible for generating the whole-body joint trajectory for the robot to follow the contact sequence found by global contact planning discussed in Section “Contact Planning”. To explicitly consider the kinematics and dynamics of the robot, optimization-based motion planning approaches are widely adopted to solve the whole-body motion planning problem for a humanoid robot.

Optimization Problem for Motion Planning

In general, an optimization problem for whole-body motion planning is formulated as Eq. 1, where \(\varvec{x}\) is the decision variable [7•, 29].

$$\begin{aligned} \min _{\varvec{x}}&F(\varvec{x}) \end{aligned}$$
(1a)
$$\begin{aligned} \text {s.t.}&A(\varvec{x}) = \varvec{b} \end{aligned}$$
(1b)
$$\begin{aligned}&C(\varvec{x}) \le \varvec{d} \end{aligned}$$
(1c)

In multi-contact motion planning, the target of the optimization is finding an optimal \(\varvec{x}\) which satisfies desired task function \(\varvec{e}(\varvec{x}) = \varvec{0}\) under the kinematics and dynamics conditions represented as the equality constraints in Eq. 1b and inequality constraints in Eq. 1c. In practice, equality constraints may be integrated into the task function with large weights in favor of solvability over strictness. Therefore, we can represent the optimization problem of multi-contact motion planning as Eq. 2.

$$\begin{aligned} \min _{\varvec{x}} F(\varvec{x}) \quad \text {s.t.} \quad C(\varvec{x}) \le \varvec{d} \end{aligned}$$
(2a)
$$\begin{aligned} \text {where} \quad F(\varvec{x}) \equiv \frac{1}{2} \Vert \varvec{e}(\varvec{x})\Vert ^2 \end{aligned}$$
(2b)

Equation 2 is a non-linear optimization problem, but it can be solved by sequential quadratic programming (SQP) [30]. SQP solves Eq. 2 by solving the quadratic program Eq. 3 for \(\Delta \varvec{x}^i\) and iteratively updating the decision variable as \(\varvec{x}^{i+1} = \varvec{x}^i + \Delta \varvec{x}^i\).

$$\begin{aligned} \min _{\Delta \varvec{x}^i}&{F(\varvec{x}^i) + \nabla F(\varvec{x}^i)^T \Delta \varvec{x}^i + \frac{1}{2} \Delta \varvec{x}^i \nabla ^2 F(\varvec{x}^i) \Delta \varvec{x}^i} \end{aligned}$$
(3a)
$$\begin{aligned} \text {s.t.}&{\nabla C(\varvec{x}^i)^T \Delta \varvec{x}^i \le \varvec{d} - C_k(\varvec{x}^i) } \end{aligned}$$
(3b)
$$\begin{aligned} \text {where}&{\nabla F(\varvec{x}^i) = \left( \frac{\partial \varvec{e}(\varvec{x}^i)}{\partial \varvec{x}^i} \right) ^T \varvec{e}(\varvec{x}^i)} \end{aligned}$$
(3c)
$$\begin{aligned}&{\nabla ^2 F(\varvec{x}^i) \approx \left( \frac{\partial \varvec{e}(\varvec{x}^i)}{\partial \varvec{x}^i} \right) ^T \frac{\partial \varvec{e}(\varvec{x}^i)}{\partial \varvec{x}^i}} \end{aligned}$$
(3d)

In Eq. 3, \(\nabla C(\varvec{x}^i)^T\) is a Jacobian matrix and we approximate the Hessian of the Lagrangian by \(\nabla ^2 F(\varvec{x}^i)\).

Feasibility Constraints for Multi-contact Motion

We define task function and the inequality constraints to obtain feasible multi-contact motion of a humanoid robot to follow the desired contact sequence. First, we consider the kinematic constraints of the robot. The contact constraints, which require the pose of the end-effector \(\varvec{p}_e \in SE(3)\) to reach the target contact pose, are imposed as equality constraints in the task function. We may allow the contact poses to be locally modified by the motion planner to satisfy other mandatory constraints [25] if they have unconstrained DoF. We can apply some additional tasks such as reference joint angles, reference pose of the base link, reference trajectory of a contact-free limb, and reference CoM position as the task functions. As the inequality constraints, the joint limits and collision avoidance constraints are imposed. For collision avoidance, the velocity damper [31] is commonly used. The distances between the objects and their gradient can be efficiently obtained if they have strictly convex shapes [32], but the velocity damper still can be applied when they are not strictly convex [33]. Using the configuration of the robot \(\varvec{q}\) and its difference \(\Delta \varvec{q}\), the above kinematic conditions can be represented as task functions and inequalities which can be imposed in Eq. 3 [7•, 29]. Note that \(\varvec{q}\) consists of the joint angles and base pose of the robot.

Fig. 3
figure 3

The variables for representing the centroidal dynamics of the robot. In this review, we assume that the robot makes surface contact with the environment

We also need to consider the dynamics of the robot to make the resulting motion feasible. Considering the full-body dynamics of a humanoid robot [34] can fully leverage the capability of the human-like body structure. However, since it is computationally expensive and has many local optimums, the centroidal dynamics of the robot are widely used in multi-contact motion planning [11]. Let \(\varvec{f}_j\) and \(\varvec{p}_j\) represent a contact force and its point of application as shown in Fig. 3. The Newton-Euler Equations can be formulated as Eq. 4 where \(\varvec{c}\) is the CoM, m is the mass of the robot, \(\varvec{g}\) is the gravitational acceleration, \(N_f\) is the number of contact points, and \(\varvec{L}\) is the angular momentum.

$$\begin{aligned} m \ddot{\varvec{c}}= & {} \sum _{j=1}^{N_f} \varvec{f}_j + m \varvec{g} \end{aligned}$$
(4a)
$$\begin{aligned} \dot{\varvec{L}}= & {} \sum _{j=1}^{N_f} \left\{ \left( \varvec{p}_j - \varvec{c} \right) \times \varvec{f}_j \right\} \end{aligned}$$
(4b)

Equation 4b includes the cross-product, which makes the centroidal dynamics nonlinear. Ponton et al. [13•] proposed the convex relaxation of the centroidal dynamics to make the momentum dynamics tractable in an optimization problem. Tazaki [14] restricted the direction of contact forces to obtain the closed-form solution of Eq. 4. In multi-contact motion planning, the quasi-static assumption is also widely approved when the robot is expected not to perform dynamic motion [24, 35], where we can assume \(\ddot{\varvec{c}} = \varvec{0}\) and \(\dot{\varvec{L}} = \varvec{0}\). Then, we can summarize Eq. 4 by the linear equation as Eq. 5, where \(\varvec{f} = \left[ \varvec{f}_1 \dots \varvec{f}_{N_f}\right] ^T\) and \(\hat{\varvec{p}}_j\) is the skew-symmetric matrix of \(\varvec{p}_j\).

$$\begin{aligned} \begin{bmatrix} \varvec{I}_3 &{} \dots &{} \varvec{I}_3 \\ \hat{\varvec{p}}_1 &{} \dots &{} \hat{\varvec{p}}_{N_f} \end{bmatrix} \varvec{f} = \begin{bmatrix} \varvec{O}_{3 \times 3} \\ m \hat{\varvec{g}} \end{bmatrix} \varvec{c} + \begin{bmatrix} -m \varvec{g} \\ \varvec{O}_{3 \times 3} \end{bmatrix} \end{aligned}$$
(5)

The contact wrench \(\varvec{w}\), which is the left side of Eq. 5, should also satisfy the inequality constraint as Eq. 6.

$$\begin{aligned} \varvec{C}_S \varvec{w} \le \varvec{w}^\pm _l \end{aligned}$$
(6)

In Eq. 6, \(\varvec{C}_S\) is the selection matrix and \(\varvec{w}^\pm _l\) is the wrench limits of each end-effector. In addition to the Newton-Euler equations, we need to describe the non-slipping condition of each contact point. According to Coulomb’s law, we can represent it as Eq. 7 using the normal vector \(\varvec{n}_j\) of the contact surface and the friction coefficient \(\mu \).

$$\begin{aligned} \varvec{n}_j^T \varvec{f}_j\ge & {} 0 \end{aligned}$$
(7a)
$$\begin{aligned} \Vert \varvec{f}_j - \left( \varvec{n}_j^T \varvec{f}_j\right) \varvec{n}_j \Vert\le & {} \mu \varvec{n}_j^T \varvec{f}_j \end{aligned}$$
(7b)

Equation 7a is the unilateral constraint of the contact force. Equation 7b is the friction cone, which is a second-order cone constraint [36]. We can obtain its inner approximation using a polyhedron. For example, let \(\varvec{\gamma }_{j1}\) and \(\varvec{\gamma }_{j2}\) represent tangent vectors at the j-th contact point. We can describe a matrix \(V_j\) which represents spans of the friction pyramid in Fig. 3 as Eq. 8 [24].

$$\begin{aligned} \varvec{V}_j \!=\! \left[ \varvec{n}_j \!+\! \mu \varvec{\gamma }_{j1} \hspace{3mm} \varvec{n}_j \!-\! \mu \varvec{\gamma }_{j1} \hspace{3mm} \varvec{n}_j \!+\! \mu \varvec{\gamma }_{j2} \hspace{3mm} \varvec{n}_j \!-\! \mu \varvec{\gamma }_{j2}\right] ^T \end{aligned}$$
(8)

Then, the condition for the contact forces \(\varvec{f}\) to be inside of their corresponding friction pyramids can be represented as Eq. 9 using coefficient vector \(\varvec{\beta }_j\).

$$\begin{aligned} \exists \varvec{\beta }_j \in \mathbb {R}^{4}, \quad \varvec{\beta }_j \ge 0 \quad \text {and} \quad \varvec{f}_j = \varvec{V}_j \varvec{\beta }_j \end{aligned}$$
(9)

According to the polyhedral convex cone theory [23], we can obtain the set of inequality constraints equivalent to Eq. 9 as Eq. 10, where \(\varvec{a}_1 \dots \varvec{a}_4\) are the normal vector of faces in the friction pyramid [37].

$$\begin{aligned} \forall k \in \left\{ 1, \dots , 4 \right\} \quad \varvec{a}_k^T \varvec{f}_j \le \varvec{0} \end{aligned}$$
(10)

Del Prete et al. [38] proposed the computationally efficient algorithm to test the static equilibrium with these non-slipping conditions.

Coupled Kinematics and Dynamics in Motion Planning

The major difficulty in multi-contact motion planning is the problem of coupled kinematics and dynamics. As Eqs. 4, 6, and 7, the centroidal dynamics of the robot is managed by the CoM of the robot and contact forces. On the other hand, the CoM is constrained by the body structure of the robot, and points of application for contact forces can be modified according to kinematic constraints such as collision avoidance. The optimization problem formulated in Eq. 2 can solve kinematics and statics simultaneously [29, 39] by considering variables for statics such as contact forces and joint torques in \(\varvec{x}\) while introducing Eqs. 5, 6, and 10 to the constraints of the optimization. However, these approaches take a long computational time because the size of the decision variable increases, especially in multi-contact activities where the number of contact points is large. One possible approach is planning a centroidal trajectory according to the centroidal dynamics first, and then using its result as the constraint of whole-body kinematics [40]. Ruscelli et al. [35] computed optimal force distributions, CoM, and contact positions which can satisfy centroidal statics by non-linear programming. Then, they searched for the feasible motion to connect obtained stances by a sample-based algorithm. This approach can decompose these two problems, but it cannot guarantee the existence of valid whole-body configurations that satisfy the pre-planned CoM trajectory.

Geometrical Projection of Stability Constraints

There are several approaches that compute the range of CoM where the robot can satisfy the stability constraints [36, 41]. This range can be used as the inequality constraint of the CoM in the optimization problem, which can project the stability conditions of the robot to kinematic constraints without increasing the size of the decision variable [42]. Nozawa et al. [43] proposed the algorithm which approximately computes the CoM Feasible Region (CFR), where the CoM can satisfy the centroidal dynamics and non-slipping conditions, by iterative linear programming. Audren et al. [44] showed the method to compute the 3D static stability polyhedron. Orsolino et al. [45] proposed the feasible region, which guarantees static equilibrium, non-slipping conditions, and existence of the joint torques to support the robot’s body weight. Although these methods assume unilateral contact due to Eq. 7a, Kumagai et al. [7•] extend CFR with a static equilibrium by approximating bilateral contact as a pair of surface contacts, which enabled a humanoid robot to climb up steep stairs while grasping handrails.

Online Motion Control

After planning the reference multi-contact motion in a structured environment, we use online motion control to adapt it to a real environment with disturbances. In position-controlled humanoid robots, motion control takes charge of generating the CoM trajectory according to the actual contact points and deciding appropriate contact force distributions to compensate for the error of the CoM. In torque-controlled humanoid robots, passivity-based controllers and inverse dynamics-based controllers are developed to achieve multi-contact stabilization under the model uncertainties and disturbances.

Computational Efficiency for Real-Time Computation

One of the major issues in it is computational efficiency to provide the reference CoM and compute contact wrenches within a control cycle of the robot. The model predictive control (MPC) is a commonly adopted method of CoM trajectory generation because it can guarantee the existence of dynamically feasible contact forces over the future [43, 46,47,48,49]. However, the computational cost of MPC tends to be large because it imposes constraints on all the state variables in the predicted horizon to guarantee their dynamic feasibility. Since multi-contact motion requires a large number of contact points, the dimensionality of the constraints in MPC makes it difficult to be solved in the high frequency. Morisawa et al. [50] proposed the fast CoM trajectory generation method for multi-contact locomotion. They sequentially generate the CoM trajectory based on the centroidal dynamics formulated as a linear time-varying system (LTVS) by introducing the force distribution ratio, which can consider variable motion parameters such as the height of the CoM and adjusted contact poses while largely reducing computational cost compared to QP in MPC. They also designed the multi-contact stabilization framework which can estimate CoM position and compute optimal contact wrench based on the force distribution ratio and achieved multi-contact locomotion on non-coplanar surfaces [51]. Murooka et al. [52] proposed the CoM trajectory generation and stabilization control method based on preview control with centroidal feedback and wrench distribution. Although the proposed preview control did not explicitly consider constraints of dynamics and assumed rough approximations in angular momentum dynamics, it largely reduced the computational cost of the CoM trajectory generation while satisfying contact constraints by the wrench projection.

Compensating Unexpected Disturbances

It is also important to compensate for unexpected disturbances when stabilizing the robot in the real world. The passivity-based controller enabled a torque-controlled humanoid robot to stabilize it with contact points not only on its end-effectors but also distributed across the entire body such as its knee [53]. Abi-Farraj et al. [54] extended that approach to adapt to high forces by introducing Gravito-Inertial Wrench Cone (GIWC) and achieved multi-contact stabilization in pushing a desk and kicking an extinguisher without the knowledge of their weights. Although the above passivity-based controllers assume torque feedback, Cisneros et al. [55] proposed the QP-based stabilizing framework with inverse dynamics-based feedforward torques and kinematic feedback to achieve passivity-based Lyapunov-stable control. This framework enabled a humanoid robot to achieve bipedal and multi-contact locomotion in a narrow construction site despite disturbances caused by unexpected collisions. Hiraoka et al. [56] proposed online motion generation and control framework based on prioritized inverse kinematics with Position-Wrench-Torque (PWT) Jacobian matrix, which represents the relationship between joint angle commands, actual joint positions, contact wrench, and joint torque. It enabled a position-controlled humanoid robot to perform quasi-static multi-contact motion with links without force-torque sensors while estimating the contact wrench on it and reducing the temperature of joints.

Conclusion and Future Directions

In this review, we investigated the general framework to achieve multi-contact activities for humanoid robots and provided an overview of the recent research. Since it is computationally expensive to handle this problem by a holistic formulation due to the complexity caused by non-coplanar contact and acyclic contact transitions, the entire framework is decomposed into contact planning, motion planning, and motion control. In contact planning, recent research investigates how to efficiently focus on the promising part of a large number of candidate contacts in 3D space. After globally planning the contact sequence, the optimization-based approaches are widely used to generate the whole-body joint trajectory while explicitly considering the kinematics and dynamics of the robot as its constraints. To compensate for unexpected disturbances and model errors within the cycle of the real-time control, efficient CoM trajectory generation, and wrench distribution methods are developed. By combining these components as an integrated framework, humanoid robots have achieved complex multi-contact activities in the real-world [3, 7•, 56].

The major problems in each component have been tackled and solved by recent research, but we still have a large gap between structured experiments and unstructured real-world environments. The adaptability to unknown environments is one of the remaining issues to achieve multi-contact activities in real-world scenarios. We think that improvement of the reactiveness in multi-contact planning, especially integrated with the perceptive information [57], is important to achieve it. A human can immediately decide where to make contact or grasp when looking at the environment. To introduce this ability to a humanoid robot, it needs to decide the next contact according to the semantic information of the environment and physical constraints of the robot in a short time. Recently, Chemin et al. [58•] introduced reinforcement learning to provide the root trajectory which makes it easier for the contact planner to find a feasible contact sequence. Although it is still challenging for end-to-end learning to generate precise motion [59], we consider that learning-based approaches are promising to improve the capability of multi-contact activities by humanoids.