1 Introduction

Humans and animals undergo morphological development processes from infancy to adulthood that have been shown to facilitate learning [1, 2]. Some of the developmental principles observed in nature have been applied to different robot morphologies, with the main goal of improving their learning abilities. The implementation of these principles has led to different results, showing that the development of the morphology while learning can be positive [3,4,5,6], irrelevant [7, 8] or even detrimental [8, 9] to the learning process. Although the mechanisms through which morphological development may influence learning are still not well understood, several studies provide some indications of why and how learning may be influenced by morphological development. For example, Bongard and Buckingham [5, 10] relate task complexity to the influence of morphological development, indicating that morphological development does not provide any advantage for a simple problem. In another study [9], Bongard also presents instances where morphological development is detrimental to learning, due to the abrupt changes in the controller that occur during the development phase in that particular experiment. Also, Ivanchenko and Jacobs [8] show that morphological development may be beneficial for learning if a suitable development sequence is followed, and that an inadequate sequence may even produce results that are worse than those of learning without morphological development. Vujovic et al. [6] point out the need to find not only a morphological development strategy that is suitable for the particular problem being addressed, but also an adequate developmental sequence for it. In their article, they find that a suitable developmental sequence may improve learning, while an unsuitable one is irrelevant in their case.

With the aim of complementing the insights gained by those authors, Naya-Varela et al. [11, 12] have carried out a series of experiments analyzing the performance of different morphological development strategies over different morphologies. First, in a study that analyzes the performance of morphological development based on growth and on the variation of the Range of Motion (ROM) of the limbs [11], they find that growth-based morphological development improves performance in a quadruped morphology, while ROM is irrelevant. They hypothesize that the success of the growth strategy has two main causes: (1) Starting the learning process with an initially smaller morphology lowers the center of gravity, thus increasing the initial stability of the morphology and allowing it to maintain an upright position without falling for more behaviors than in the larger adult morphology. (2) This increase in stability encourages exploratory behavior at the beginning of learning, avoiding the stagnation of solutions in local optima that is observed in the no-development case (learning directly with the adult morphology). Regarding the ROM strategy, they argue that its lack of effectiveness for learning in their case is due to the intrinsic characteristics of the strategy and its poor match with the morphology of the quadruped. Based on these results, they study the influence of the growth strategy in two additional morphologies: a hexapod and an octopod [12]. In this study, they find that the influence of growth decreases with the number of limbs of the morphology, being relevant for the quadruped and irrelevant for the hexapod and octopod. The authors hypothesize that this decrease in the relevance of growth is due to the reduction in task complexity with the number of limbs: as the number of limbs increases, the stability of the morphology increases, and thus the problem becomes easier to learn.

The results and conclusions obtained in all of these studies, as formalized in [13], can be condensed into a series of insights or hypotheses that a problem and a morphology must fulfill to be susceptible to the influence of morphological development. Section 2 describes them in detail. These insights also lead to a series of considerations that should be taken into account, and steps that should be followed, in order to appropriately design morphological development processes.

The objective of this paper is to provide some experimental results on the application of the insights mentioned in the previous paragraph. To this end, we have addressed a problem that has been classified as quite difficult in the literature: Learning to walk on two legs, with the aim of designing an appropriate morphological development sequence to make learning the task easier and more efficient.

Although there are numerous examples of bipedal legged robots in the literature [14,15,16], learning to walk in bipedal robots is still a complex task. In fact, most walkers are programmed to walk; they do not learn to walk. There are few examples that study how to learn to walk in bipedal robots from a developmental perspective. In this line, Lungarella and Berthouze [17, 18] analyze the influence of morphological development by freezing and freeing Degrees of Freedom (DOF) of a bipedal robot sustained by a harness attached to its shoulders. More recently, two articles address the problem of learning to walk in bipedal robots using a growth-based morphological development strategy [19, 20]. On the one hand, Hardman et al. [19] compare the performance of an annealing optimization algorithm when learning with and without morphological changes. They show that their selected morphological development strategy outperforms learning without it. These results are presented for two different morphological changes: developing the length of the foot, and increasing the mass and inertia of the body of the robot. Furthermore, they also find that their methodology reduces the number of catastrophic failures, defined as behaviors in which the robot falls or collides with its own body. On the other hand, Zhu et al. [20] show that suitably constraining the morphology of a bipedal robot that learns to walk by means of a genetic algorithm improves on the learning performance of the unconstrained system. However, they also point out the need for a suitable match between the task and the selected restrictions, because they report worse results when different restrictions are applied.

In summary, morphological development has been studied by analyzing different developmental strategies and comparing them to the no-development case. In addition, it has been shown that the morphological development process needs to be carefully chosen, as a poor choice can lead to completely inadequate results. Based on these results, different authors have extracted some basic knowledge and hypotheses about when it could make sense to apply morphological development. However, to the best of our knowledge, no work addresses the opposite problem: given a morphology and a task, find an appropriate morphological developmental process that makes learning easier.

Thus, this paper presents a first experiment to design or synthesize a morphological developmental process, and we show that it is possible to improve the learning abilities of a selected morphology by taking inspiration from the morphological changes that happen in nature. The paper is structured as follows: In Sect. 2, we describe the requirements for designing an application case for morphological development in bipeds. Section 3 presents the experimental setup used in this experiment. The results of the application case of morphological development, followed by a discussion, are presented in Sect. 4. Finally, we provide some conclusions and future lines of work in Sect. 5.

2 Designing a Morphological Development Strategy

This section is devoted to presenting the design process of a morphological development strategy to improve the learning performance of a bipedal robot [21]. To provide a guide of the aspects that should be taken into account in order to design a morphological development sequence for a particular problem, we have resorted to previous work in the literature. In particular, Naya-Varela et al. [13], after a thorough analysis of the fitness landscapes obtained for several morphological development processes, have suggested a series of general considerations that should be followed. As a summary, they have indicated that:

  • As other authors have already mentioned before, the learning problem must be complex enough to justify morphological development. Otherwise, it may not have any impact during the learning phase.

  • Learning with the initial morphology must be simpler than learning with the final morphology. This simplifies the problem at the early stages of development, allowing us to gradually increase the complexity of the learning task through the different developmental stages. Thus, the maximum complexity of the problem will be achieved with the final morphology.

  • It is necessary to have an adequate synergy between the morphology, the control system, and the selected developmental strategy. Especially, the development of the morphology must be in accordance with the capacity of the controller to adapt to the morphological changes.

  • To avoid misleading learning, we consider that optimal solutions must be available from the beginning of the learning process. Reducing the solution search space could imply that an optimal solution would not be available until the final morphology is reached, limiting the capability of the learning algorithm to find optimal solutions and allowing for deceptive paths.

Of course, these are general principles that are mostly related to the evolution of the sequence of fitness landscapes defined by the sequence of morphological changes when viewing the learning strategy as an optimization process. Consequently, they need to be translated into specific features that we want to see in our bipedal robot learning to walk. These features are:

  • Problem complexity: In order to have a complex enough problem for learning, we have selected the task of bipedal walking. Learning to walk is a complex task due to the intrinsic difficulties associated with the instability and dynamics [22]. Concretely, we have selected the task of learning to walk in a NAO robot.

  • Learning simplification: With the aim of starting to learn with a morphology that is simpler than the final one, we have selected growth as the morphological development strategy. The reason is that we consider that a lower center of gravity may simplify learning, thanks to an increase in the stability of the morphology.

  • Morphology, control system, and development synergy: To maintain a synergy between morphology, control system, and developmental strategy, we have selected a progressive and continuous developmental process, avoiding abrupt changes in the controller or morphology that may distort the relationship between them.

  • Availability of solutions: A growth-based developmental strategy does not imply any limitation or constraint on the movement of the motor system. Thus, with this developmental strategy, the space of possible solutions is invariant from the initial morphology to the final one.

Fig. 1. Different versions of the NAO robot. Right: NAO model without any modification. Left: NAO model with the upper leg and lower leg modified to allow growing.

Thus, to improve NAO’s learning performance through growth, a series of design characteristics have been implemented: (1) Leg extension. To allow the NAO robot to grow, we have modified its legs so that they can be extended. Although morphological development could involve more and different parts of the NAO (like the body or arms), as a first approximation, we have decided to apply it only to the legs. We consider that modifications of the length of the legs have the highest impact in terms of the stability of the robot (Fig. 1); (2) Symmetric growth. We have considered a symmetric growth of the upper leg and the lower leg. This preserves the initial stability and keeps the center of gravity as close as possible to its initial position in the xy-plane (bearing in mind that as the morphology grows, it will move upwards along the z-axis), avoiding the static and dynamic imbalances that may arise if one part of the leg grows more than the other; (3) Progressive growth. With the aim of avoiding drastic changes in the morphology and control system, a progressive and linear growth sequence has been selected, rather than an abrupt one; (4) Reduction of the maximum ROM available. Finally, we have reduced the maximum available ROM of each joint with respect to the values given in the NAO documentation in order to reduce the search space.

3 Experimental Setup

To test the application of the morphological development strategy presented in the previous section, we have created the following experimental setup. As indicated before, we will make use of a NAO platform as the base robotic structure. For convenience, the NAO will be simulated using the CoppeliaSim simulator [23] and the PyRep extension [24]. To apply morphological development based on growth as indicated above, a series of modifications to the legs and feet of the NAO model in CoppeliaSim have been made with the objective of allowing leg growth and increasing stability:

  • The upper part of the legs was changed to two links joined by a prismatic joint. Each link is 8 × 8 × 7.2 cm and has a mass of 458.7 g. The prismatic joint has a maximum force of 50 N. The maximum extension of the prismatic joint is 3.5 cm. This group of two links and the prismatic joint will be considered as a single link that is able to grow.

  • The lower part of the legs was also changed to two links joined by a prismatic joint. The upper link is 8 × 8 × 3 cm and has a mass of 192 g, and the lower link is 9 × 8 × 3 cm and has a mass of 215.8 g. Both links are equal but present different orientations. The prismatic joint has a maximum force of 50 N. The maximum extension of the prismatic joint is 3.5 cm. Again, this group will be considered as a single link that can grow.

  • The size of the feet has also changed from the original NAO foot size, increasing it up to 16 × 8.5 × 1.5 cm in dimension and 204 g in weight.

  • The different leg parts, as well as the feet, have been modified to represent, in a simplified manner, the same dimensions as the original NAO. The mass and inertia of the legs and feet are automatically adjusted by the simulator.

The controller of the robot is a neural network whose weights and structure are learnt using NEAT [25], specifically the MultiNEAT implementation [26]. It has 3 inputs plus one bias and 14 outputs, each controlling the actuation of one joint. The inputs are sinusoidal functions of amplitude 2.0 rad and frequency 1.0 rad/s. The phase offsets of the sinusoidal inputs are 0, 3.0 and 5.0 rad respectively.
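As an illustration of the input signals described above, the following is a minimal sketch that only uses the amplitude, frequency and phase offsets given in the text. The function name `controller_inputs` and the constant bias value of 1.0 are assumptions made for this example; the network itself (evolved with MultiNEAT) is not reproduced here.

```python
import math

AMPLITUDE = 2.0            # rad, as stated in the text
FREQUENCY = 1.0            # rad/s, as stated in the text (used as angular frequency)
PHASES = (0.0, 3.0, 5.0)   # rad, as stated in the text
BIAS = 1.0                 # assumption: the bias input is a constant 1.0


def controller_inputs(t):
    """Return the three sinusoidal inputs followed by the bias at time t (seconds)."""
    signals = [AMPLITUDE * math.sin(FREQUENCY * t + phase) for phase in PHASES]
    return signals + [BIAS]


# Sample the inputs over one 5 s evaluation at the 50 ms simulation time step.
samples = [controller_inputs(step * 0.05) for step in range(100)]
```

Each entry of `samples` would be fed to the network at one simulation step, and the 14 network outputs would then drive the joints.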

A series of learning experiments using NEAT have been run over different implementations of the robot and environment using the CoppeliaSim simulator with the ODE physics engine [27] on the CESGA [28] computer cluster. Each NEAT learning run evolves a population of 150 individuals for 300 generations. A total of 40 independent runs have been carried out for each experiment with the objective of gathering relevant statistical data. Each individual is tested for 5 s with a simulation time step of 50 ms and a physics engine time step of 5 ms.

As the controller is obtained using NEAT, the learning strategy is based on a neuroevolutionary process, where the fitness depends on the distance travelled by the head of the robot in a straight line and on whether or not the robot falls during the evaluation. If the NAO does not fall, the fitness value is the distance travelled in a straight line in meters. However, if the NAO falls, the simulation is stopped and the fitness value is the distance travelled 16 simulation time steps before the moment the NAO fell. We consider that the NAO has fallen when its head is below 0.3 m. We have selected 16 time steps because, 16 time steps before falling, the NAO is still in a stable position.
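The fitness rule above can be sketched as follows. This is only an illustration under stated assumptions: the helper name `fitness`, the representation of the trajectory as a list of head positions (one per simulation step), and the reading of "distance in a straight line" as horizontal displacement from the starting point are all choices made for this example, not details taken from the original implementation.

```python
import math

FALL_HEIGHT = 0.3     # m, head height below which the robot counts as fallen (from the text)
ROLLBACK_STEPS = 16   # simulation steps to go back after a fall (from the text)


def fitness(head_positions):
    """Compute the fitness from a list of (x, y, z) head positions, one per step.

    Assumption: distance is the horizontal displacement from the first sample.
    """
    x0, y0, _ = head_positions[0]

    def distance_at(index):
        x, y, _ = head_positions[index]
        return math.hypot(x - x0, y - y0)

    for step, (_, _, z) in enumerate(head_positions):
        if z < FALL_HEIGHT:
            # The robot fell: score the position reached 16 steps earlier.
            return distance_at(max(0, step - ROLLBACK_STEPS))
    # No fall: score the final position.
    return distance_at(len(head_positions) - 1)
```

With a 5 s evaluation and a 50 ms simulation step, a complete trajectory contains about 100 samples, so a fall discards the last 0.8 s of motion before the head crossed the threshold.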

In order to evaluate the developmental strategy that was designed, we have performed two different types of experiments:

  • Reference Experiment. This experiment is run with a fixed morphology (the same as the final morphology for the rest of the experiments) from the beginning to the end. The robot starts at generation 0 with the maximum length of the legs and the neuro-evolutionary algorithm seeks a neural network-based controller to achieve maximum displacement.

  • Growth Design Experiments. The robot morphology starts with the shorter version of the legs. That is, at the beginning of learning, the prismatic joints are fully contracted and their extension is 0 cm. The length of the upper leg is 14.334 cm and the length of the lower leg is 11 cm. The legs grow linearly over a number of generations until the upper leg reaches a length of 17.0834 cm and the lower leg reaches 14.5 cm. This growth takes place over a set number of generations for each experiment. That is, the final morphology is reached at generation 40, 60, 80, 100 or 120, depending on the experiment.

This makes it possible to identify the best growth ratio for the selected morphology and control system, and to evaluate the relevance of the growth rate with regard to performance.
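The linear growth schedule used in the two types of experiments can be sketched as a function of the generation number. The function name `joint_extension`, the per-generation update, and the convention that a growth period of 0 denotes the reference experiment are assumptions for this example; only the 3.5 cm maximum extension of the prismatic joints and the growth periods come from the text.

```python
MAX_EXTENSION_CM = 3.5   # maximum extension of each prismatic joint (from the text)


def joint_extension(generation, growth_generations):
    """Extension in cm applied to every prismatic joint at a given generation.

    growth_generations is the generation at which the adult morphology is
    reached (40, 60, 80, 100 or 120 in the text). Assumption: a value of 0
    denotes the reference experiment, with the adult morphology throughout.
    """
    if growth_generations <= 0:
        return MAX_EXTENSION_CM
    fraction = min(generation / growth_generations, 1.0)
    return MAX_EXTENSION_CM * fraction


# Extension at generation 30 for each growth period considered in the text.
for period in (40, 60, 80, 100, 120):
    print(period, round(joint_extension(30, period), 3))
```

Because the upper and lower leg joints share the same maximum extension, applying the same value to both reproduces the symmetric growth described in Sect. 2. After the growth period, the extension stays at its maximum for the remaining generations.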

4 Results and Discussion

The results of the morphological development process designed for the NAO robot can be observed in Fig. 2 and Fig. 3. Figure 2 displays the results obtained after the learning process through neuroevolution in the case of no development and in the design case of growth up to generation 120, as it is the one which results in the highest median. It displays the median of the best fitness obtained for the 40 independent runs at each generation for each configuration. The shaded areas in the graph represent the range between the 25th and 75th percentiles for each case. Figure 3 displays the statistical results at the end of the learning process for the different growth rates and the no-development case. Each boxplot represents the median and the 25th and 75th percentiles in the last generation for 40 independent runs of each of the different types of experiments. The whiskers extend to 1.5 times the interquartile range (IQR). Single points represent values that lie beyond the whiskers. All developmental samples are compared to the no-development case. The statistical analysis has been carried out using the two-tailed Mann-Whitney U test. We want to test whether the performance of the different design cases is similar or not to that of the reference case. We consider a significance level of 0.05 for accepting or rejecting the null hypothesis. All the p-values have been adjusted using the Bonferroni [29] correction. The results show that the design case based on the morphological learning sequence we have designed offers better results than learning without morphological development. Only the growth up to generation 40 case (p-value of 0.37133) does not offer better results than the reference case, while the least significant of the remaining results, growth up to generation 60 and growth up to generation 100, have p-values of 0.04128 and 0.02037 respectively, both under the reference value of 0.05. Furthermore, two cases offer a notable improvement with respect to the no-development case. These are growth up to generation 80 (p-value of 0.00189) and growth up to generation 120 (p-value of 0.0021).
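To make the statistical procedure concrete, the following sketch implements a two-tailed Mann-Whitney U test and a Bonferroni adjustment using only the Python standard library. It is an illustration, not the analysis code used for the paper: it relies on the normal approximation with tie and continuity corrections (a reasonable choice for samples of 40 runs), and the function names are assumptions. In practice, a library routine such as `scipy.stats.mannwhitneyu` would normally be used instead.

```python
import math


def mann_whitney_u(a, b):
    """Two-tailed Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic of sample a, two-sided p-value).
    """
    n1, n2 = len(a), len(b)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0          # mean of ranks i+1 .. j
        ties = j - i
        tie_term += ties ** 3 - ties
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_a = rank_sum_a - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_a, 1.0
    z = max((abs(u_a - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    return u_a, math.erfc(z / math.sqrt(2.0))


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

Under this scheme, the final-generation fitness values of each growth experiment would be compared against those of the reference experiment, the resulting p-values would be passed together to `bonferroni`, and each adjusted value would be compared with the 0.05 significance level.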

Fig. 2. Results obtained after 40 independent executions for the selected design process, considering different growth speeds and the case without morphological development. For the sake of clarity, we only show the comparative results of the learning process for the case of no development (black) and the design case of growing until generation 120 (blue), which presented the best results.

Fig. 3. Statistical representation of the performance obtained from the 40 independent experiments at the end of the neuroevolutionary process. The p-values of the Mann-Whitney test adjusted by the Bonferroni correction have been replaced by asterisks in order to make the figure clearer. “Growth up to generation” is abbreviated as G.

Table 1. Median fitness for each growth ratio and the no-development case at the end of the experiment

Analyzing Fig. 2 and Fig. 3, it can be observed that:

  • In the morphological development case, the curve representing the median of the fitness value is noisy during the developmental period. This is caused by the adaptation of the controller to the morphological changes that happen during development: the best solution in a specific generation may not be the best solution in the next generation, because the morphology has changed. This is not observed in the curve that represents the median of the no-development experiment. In this case, the fitness value progresses gradually without oscillations, as the morphology does not change.

  • The selected morphological development strategy improves the learning ability of the algorithm not only at the end of the optimization process: it already surpasses the performance of the no-development case from the beginning.

  • In Fig. 3, it can be observed how the median of the fitness value increases as the growth speed decreases (Table 1). Although this increment is small in absolute values, it shows a clear tendency. That is, the selected design strategy improves the learning capacity of the NAO, but the control system needs time to adapt to the changes in the morphology. Especially relevant is the comparison between the fitness value of the no-development case and that of the growth up to generation 120 case. The latter more than doubles the fitness value of the no-development case.

Analyzing these results in more detail and relating them to our initial design conditions about how to design a specific morphological development strategy based on the insights extracted from the literature, we can say that:

  • Learning to walk is a complex task for the NAO robot. This assumption is supported by the high number of falls encountered in the first evaluation of the individuals. Considering the fitness value of all of the individuals in the first generation, we obtain a total of 44 individuals whose value is over 0, which is only 0.733% of the total number of individuals (150 individuals in each independent run, with 40 independent runs, makes a total of 6000 individuals).

  • The rapid increase in fitness during the first generations supports our hypothesis that starting the learning process with a smaller morphology than the final one may help to improve learning efficiency. This rapid increase may be due to the fact that an initially smaller morphology increases the stability of the NAO, compared to the larger final one. This increase in stability may help to find initial behaviors that allow the NAO to start walking and avoid falls, behaviors that could be rejected in the adult morphology. Thus, the number of optimal behaviors increases, as does the exploration capacity of the learning algorithm. This hypothesis is supported by Fig. 2, where the fitness value of the growth experiments improves on the fitness achieved by the no-development case from the beginning of learning. This shows that robots with shorter legs are able to walk further than robots with longer legs (final morphology).

  • Furthermore, we consider that our selected design condition of progressive and gradual development is supported by the results of Fig. 3. On the one hand, the highest median fitness values are obtained in the experiments with slower growth speeds, that is, in the cases with more gradual development. On the other hand, only growth up to generation 40 has not improved learning, and this growth ratio is the one that causes the most abrupt changes in the morphology.

  • It seems that the assumption of a symmetrical development in both parts of the legs (in the upper leg and in the lower leg) based on the idea of maximizing stability was also a good choice. However, we do not know if this is the optimal solution and further experimentation analyzing the influence of an asymmetric development strategy would be interesting.

Finally, it is important to mention that during the evaluation time, the morphology of the robot is fixed. What helps to improve learning to walk in the adult morphology is the sequence of developmental stages the morphology follows from the initial morphology to the final one. This is different from cases where the morphology changes during the evaluation time to increase its adaptation to the environmental conditions, without considering initial or final stages of development, such as in Ahmad et al. [30].

5 Conclusion and Future Work

This paper deals with the design of a morphological development strategy with the aim of improving the learning ability of a bipedal robot when learning to walk. The design of the morphological development process was based on design considerations extracted from previous studies and analyses in the field, which can be summarized as: (1) The task must be complex enough to warrant morphological development; (2) Learning must start with a morphology that makes the task simpler. In our experiment, this means a more stable initial morphology; (3) Development must be progressive and gradual. To avoid abrupt changes in the control-morphology relationship, we have selected a developmental strategy based on the linear growth of the legs; (4) Finally, we have selected a morphological development strategy that does not omit optimal solutions during the developmental phase, to avoid the learning algorithm getting stuck in suboptimal behaviors while the morphology grows.

The results of applying this morphological development process support the design decisions we have made as morphological development clearly improves the learning performance in the majority of the cases considered. In fact, in the best cases, it doubles the performance of no development. However, much work is needed to provide robust engineering indications about the design considerations we have made. In this sense, further analysis and research about them and their implications should be carried out. For example, it would be interesting to produce a design implementation with slower growth rates than those presented in this paper in order to verify whether the relationship between growth ratio and fitness is consistent or not. Furthermore, it would also be interesting to see whether an asymmetrical growth of the legs could improve the results. Finally, to conclusively state that the selected design considerations are suitable tools for improving morphological development-based learning in general, further research should be carried out using different algorithms and morphologies.