1 Introduction

Strategy development for a team of agents or robots with decentralized decision making is a challenging task in an uncertain and adversarial domain. An agent playing soccer not only has to score goals against the opponent team but also has to defend its own goal. Team strategy thus becomes a collective outcome of individual agents' strategies. To evaluate a team strategy, a number of variables have to be considered. For instance, if a team is on defense, the list of variables may include the alignment of defenders with the goal post, the distance maintained between the ball and the defenders, the alignment of supporting players with the ball and the distance from the opponents. In contrast, when a team is on the attack, the number of players involved in the attack and their attacking pattern of moving ahead or staying behind at a certain distance become important for devising a strategy. In situations like these, it becomes difficult for an expert to design a strategy that works well in every scenario. In addition, manual evaluation of a strategy over all possible scenarios is even more time-consuming. To overcome these challenges, this paper presents a parallel evolutionary computation-based framework for producing an optimized strategy. Parallelization is introduced to reduce the time a single machine would take to evaluate all candidate strategies: offspring in the evolutionary algorithm are produced by a server, while the workload of strategy evaluation is distributed among clients.

It is important to note that, in the context of this paper, a strategy denotes the critical areas that should be occupied by the defending players. The alignment of players with the goal and the ball is evolved to obtain optimized strategic points. These optimized points are then executed in real time for team performance evaluation. The fitness of a solution takes into consideration the goal difference between the home and opponent teams, the average ball possession of the home team and certain regions occupied by home team agents.

For experimental purposes, the RoboCup Soccer Simulation Server 3D has been chosen as a test bed. The environment provides a promising platform that is truly dynamic and partially observable in nature. A RoboCup Soccer Simulation 3D League (Darab and Ebrahimi 2007) team, namely Karachi Koalas (Haider et al. 2014), which was developed as part of a larger project, has been taken as the home team in this research. This team has participated frequently in RoboCup competitions, with its best result being a fifth-place finish in 2013. The base code of this team serves as a benchmark, as matches of both the base-code team and the evolved team are executed against the same opponent. It is also worth mentioning that much work in the literature applies techniques other than evolutionary approaches, such as reinforcement learning (Salustowicz et al. 1998; Riedmiller et al. 2009), handcrafted structuring of code (Kazakov and Kudenko 2001) and layered learning (Stone and Veloso 2000; Cherubini et al. 2007; Urieli et al. 2011). In this paper, we compare such approaches with the evolved approach by executing matches to demonstrate that the evolved strategy works well. The rest of the paper is organized as follows. Section 2 provides an overview of the existing literature along with a subsection presenting the novelty of FEASO. Section 3 briefly describes the technical background of the RoboCup Soccer domain as well as evolutionary algorithms, and Sect. 4 gives a comprehensive overview of FEASO with its implementation details. Experimental setup and results are discussed in Sect. 5. Section 6 compares FEASO with the most relevant evolutionary approaches proposed previously, while Sect. 7 concludes the paper and provides future research directions.

2 Related work

Various techniques have been reported in the literature for incorporating strategies in robot soccer, including reinforcement learning, decision trees, handcrafted rule-based techniques, coordination graphs, layered learning approaches and evolutionary algorithms. The notion of strategy varies across these techniques. This literature review focuses on learning team strategy; thus, techniques that employ strategy optimization are discussed in detail in the following subsections.

2.1 Reinforcement learning techniques

In reinforcement learning, strategy denotes learning a good policy for a team of agents, and the concept of collective reward and punishment was introduced. The earliest approach was proposed by Salustowicz et al. (1998), who compared several reinforcement learning algorithms in multi-agent learning scenarios. Riedmiller et al. (2009) suggested learning strategy at a tactical level, such as moves to intercept the ball, wait at a position, pass the ball to a teammate or shoot at the goal. Many variants of reinforcement learning, such as observational reinforcement learning, team-partitioned opaque-transition reinforcement learning and Q-learning, have been proposed in the domain of RoboCup Soccer Simulation 2D, but in 3D, current reinforcement learning research faces several challenges. Firstly, with the addition of each new agent, it becomes memory expensive. Secondly, similar behaviors recur in robot soccer, yet everything has to be learned all over again. Lastly, due to the limited perception of an agent, it is often impossible to fully determine the current state, and this results in a loss of performance of the algorithm. As this study considers an optimized strategy for a team of agents, reinforcement learning has not been used.

2.2 Rule-based techniques

In contrast to reinforcement learning, rule-based and inductive logic programming (ILP)-based approaches have also been proposed by different researchers. The inductive learning agent (Kazakov and Kudenko 2001) uses first-order formalism and ILP to acquire rules to predict failures. The agent gathers instances of actions and classifies them. Next, prediction rules are formed that assist soccer agents in deciding whether each action taken in the game is good or bad. The limitation of this approach is that its focus lies on verification and validation of a knowledge-based system, with no new knowledge acquisition. Consequently, agents cannot adapt their own behavior using rules or knowledge acquired by ILP. Murray et al. (2000) developed a multi-agent team based on a scripting language, where procedural aspects were specified by state charts and declarative aspects by decision trees. The rules have to be designed by an expert, and it becomes a tedious job to think of all possible situations. To overcome this limitation, Kok et al. (2005) presented a strategy that was devised using a few rules provided by experts, while the rest of the coordination rules were computed online using coordination graphs. Bezek (2005) and Wang et al. (2009) proposed the use of coordination graphs in the assignment of roles and actions to the players. Svatoň et al. (2014) proposed a methodology to improve the description of strategy by creating sub-strategies, thus ensuring a smooth implementation of the actions defined by each strategy. The main focus of coordination graphs is to aid humans in strategy building: a human modeler analyzes the various patterns in the graph and designs hand-coded agents to act under certain circumstances. The focus of this study is not on rule or graph creation but on the generation of an optimized strategy in an automated fashion. Thus, no handcrafting of rules is performed, and the evolution process has the advantage of yielding multiple strategies that are equally good in a specific situation.

2.3 Layered learning technique

Layered learning has also been applied for flexible team formation in the RoboCup Soccer Simulation Leagues. Stone and Veloso (2000) pioneered the architecture, in which a problem is broken into a bottom-up hierarchy of subproblems; each subproblem is solved, and its solution serves as input to the next layer. Their work used Genetic Programming for the selection of an action by a team of agents. Urieli et al. (2011) presented an approach for optimizing interdependent skills in the 3D domain. This layered architecture has the benefit that primitive locomotion skills can be learned precisely and multiple skills can be learned in conjunction with each other. The issue lies in learning strategy, because it is a high-level behavior exhibited by a team of agents and would require numerous learning efforts if performed via a layered learning approach.

2.4 Evolutionary algorithm-based techniques

Evolutionary algorithms have also been applied to evolve strategies in robot soccer. Nakashima et al. (2004, 2006) proposed the use of Genetic Algorithms to learn a team strategy in the domain of the 2D simulation league. They divided the field into 48 subregions, and the 10 players of each team, excluding the goalkeeper, could exist in any of the regions. The chromosome comprised the actions to be executed in certain situations. The fitness of a strategy was computed by running simulated matches and taking the average goal difference. One limitation of the approach was that a single opponent was fixed, so the strategy would not generalize. Cultural Algorithms (CA) were applied (Salhieh et al. 2012) to generate a team strategy for playing robot soccer. A chromosome contained regions and the corresponding player numbers. This was a centralized approach in which a coach agent sent messages to all the agents and only the intended agent executed the action; goal difference was used as the evaluation function. Recent advances in team strategy were contributed by Ali et al. (2014), who investigated the use of a simplified and adaptive version of CA to develop defensive and offensive plays and cooperative strategies in robot soccer. This work is very relevant to the presented approach, and a detailed comparative discussion is provided later in Sect. 6. Lekavy (2011) applied an evolutionary approach for the evolution of pass execution in standard situations, such as the kickoff, in the 2D simulation league. Okada et al. (2011) utilized particle swarm optimization to evolve team formation. The Cartesian coordinates of the ten players and 15 possible positions of the ball were modeled as elements of an individual solution, while the initial population was generated randomly. They reported how well formations for various team characteristics (e.g., offensive, defensive, balanced) could be obtained automatically. Ant intelligence (Ramani et al. 2008) was utilized for generating player strategies. A multi-group ant colony optimization algorithm (Chen et al. 2016) was proposed that used an ant pheromone evaporation mechanism to learn offensive strategies in the 2D soccer simulation league; a comparative analysis of this approach is presented in Sect. 6. Luh et al. (2006) used an artificial immune system to develop cooperative strategies in robot soccer. The system selected a behavior for the player, such as shoot, pass, kick, chase, track and guard, and the approach was validated on the SimuroSot Middle League. It is important to note that all of the above-mentioned evolutionary computation techniques face the challenge of the time needed to evolve a solution, as a large number of fitness evaluations are required before an acceptable solution can be found. In light of this requirement, distributed evolutionary approaches have been proposed (Gong et al. 2015), and this research also contributes such a distributed approach.

2.5 Novelty

The FEASO framework is a distributed approach for the evolution of a defensive strategy, which has not been addressed by other techniques proposed in the literature. The major contribution of this framework is that it facilitates the automatic generation of an optimized strategy against a specific opponent team and then generalizes it across various similar opponents. This paper focuses on the strategic placement of all the agents in the field, while most of the above-mentioned evolutionary techniques consider either a subfield or only two to three players to demonstrate the required behavior. Moreover, the domain of RoboCup Soccer Simulation 2D, in which the above-mentioned approaches were tested, is less complex, and knowledge about the action to be executed is available to the players, which makes it easier to exhibit a cooperative strategy. The domain of RoboCup Soccer Simulation 3D, by contrast, is highly dynamic and unpredictable: only the positions of the agents are known, and actions have to be inferred from observational data. To the best of the authors' knowledge, no technique for strategy evolution has been reported in the RoboCup 3D domain. The proposed framework is both robust and time inexpensive in real-time strategy execution, and it can also be applied to similar competitive domains.

3 Technical background

This section presents an overview of the domain of RoboCup and also discusses some basics of evolutionary algorithms.

3.1 Domain description

RoboCup Soccer (Kitano et al. 1998) is a scientific venture that provides an exciting platform for the advancement of research in artificial intelligence and robotics. The competition has the goal that, by the middle of the twenty-first century, a team of robots will defeat a team of humans in soccer. The competition is held every year, and several leagues have been designed to cater for different problems and challenges in robot coordination, locomotion, etc. The main leagues are the middle-size league, the small-size league, the simulation 2D league, the simulation 3D league, the standard platform league and the humanoid league. In contrast to the other leagues, the simulated leagues do not involve physical robots; instead, simulated robots play soccer on a virtual field. In the simulation 2D league, the ball and the players are represented by circles on the plane of the field, while in the simulation 3D league, the players are represented as articulated rigid bodies having 22 hinges. In the 2D league, commands such as move, dash, turn and kick are available; in 3D, these commands do not exist, and locomotion itself is a major challenge. The 3D game comprises two teams of eleven robots on a 21 m × 30 m field, as shown in Fig. 1.

Fig. 1 Two teams playing RoboCup Soccer Simulation 3D

This research utilizes RoboCup Soccer Simulation 3D as a test environment for the development of opponent-specific strategies. A match is divided into two halves of 5 min each. Continuous time is approximated with discrete cycles, with each half being 300 cycles and 1 min corresponding to 60 cycles. Each robot has its own local view of the field and is controlled by a separate autonomous program. All robots can move and act independently as long as they comply with the league rules. They can freely communicate with each other via message broadcasting, but their visual and hearing perception is distance limited. Due to humanoid locomotion, robots may fall on the field or collide with other robots, in which case the server can penalize them by moving them out of the field. The domain is a complex multi-agent system and is easily understandable by humans due to its soccer-related content.

3.2 Evolutionary algorithms

Many real-world problems are hard to solve, such as the traveling salesman problem, time series prediction, bankruptcy prediction and credit scoring. Traditional algorithms for such problems tend to be either highly specialized or too general to be efficient. To approach such hard problems, evolutionary algorithms (EA) were introduced a few decades ago. Evolutionary algorithms (Bäck 1996) are stochastic search and optimization heuristics derived from classical evolution theory. They follow Darwinian evolution, often summarized as survival of the fittest. EA methods only need the target (fitness) function of a given problem, which is to be optimized.

A typical evolutionary algorithm passes through the following phases, as shown in Fig. 2. At first, the population is initialized randomly. The fitness of each solution is computed, parents are selected via a selection procedure and offspring are created by variation operators. Next, the fitness of the new offspring is computed. Members of the population are removed via survival selection, and the process continues until a termination criterion is met.

Fig. 2 Typical evolutionary algorithm (reproduced with permission from Bäck 1996)
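As a minimal Python sketch (not the authors' implementation), the loop of Fig. 2 can be written as follows; `fitness`, `select_parents`, `vary` and `survive` are placeholder callables standing for the phases described above.

```python
def evolve(init_population, fitness, select_parents, vary, survive,
           max_generations=25):
    """Generic EA loop per Fig. 2: initialize -> evaluate -> select ->
    vary -> evaluate offspring -> survival selection -> repeat."""
    population = [(ind, fitness(ind)) for ind in init_population]
    best_so_far = []
    for _ in range(max_generations):
        parents = select_parents(population)
        offspring = [(child, fitness(child)) for child in vary(parents)]
        population = survive(population, offspring)
        # Track the best-so-far (BSF) indicator discussed in Sect. 3.2.4.
        best_so_far.append(max(f for _, f in population))
    return population, best_so_far
```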

3.2.1 Parent selection mechanism

Parent selection is used to distinguish among individuals based on their quality, allowing better individuals to become parents of the next generation. There are various selection schemas (Bäck 1996), namely fitness proportional selection (FPS), rank-based selection (RBS) and tournament selection (TS). In FPS, a probability distribution proportional to the fitness is computed, and individuals are selected by sampling this distribution. This schema can exert high selection pressure. Rank-based selection attempts to remove the problems of FPS by computing selection probabilities on relative fitness rather than absolute fitness: the RBS mechanism ranks the population according to fitness and assigns selection probabilities on the basis of rank. This schema imposes a sorting overhead on the algorithm, but this is usually negligible compared to the fitness evaluation time. The advantage of this schema is that the selection is independent of the actual fitness values and it preserves a constant selection pressure.
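As an illustration, a linear rank-based selection might look like the sketch below; the linear weighting is an assumption, since the exact ranking function is not specified here.

```python
import random

def rank_based_select(population, k):
    """Select k parents with probability proportional to fitness rank
    (worst gets rank 1, best gets rank n), independent of raw fitness."""
    ranked = sorted(population, key=lambda pair: pair[1])  # ascending fitness
    weights = range(1, len(ranked) + 1)                    # linear rank weights
    return random.choices(ranked, weights=weights, k=k)
```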

3.2.2 Variation operators

The role of variation operators is to create new individuals from old ones. There are two types of variation operators: mutation and crossover. Mutation is a unary variation operator and is always stochastic. Crossover is a binary variation operator that merges information from two parents to generate two offspring. The two fundamental design concepts in evolutionary algorithms are exploration and exploitation. Exploration means discovering promising areas in the entire search space, while exploitation deals with optimizing within a promising area. Crossover is explorative, as it makes a big jump to an area between the parents. Mutation is exploitative, as it creates small random diversions while staying near the parent.
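For a real-valued tuple such as Xi, one-point crossover and Gaussian mutation could be sketched as below; the sigma and the per-gene mutation probability are illustrative assumptions.

```python
import random

def one_point_crossover(p1, p2):
    """Cut both parents at one random point and swap the tails."""
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def gaussian_mutate(ind, sigma=0.5, rate=0.3):
    """Perturb each gene with probability `rate` by small Gaussian noise,
    staying near the parent (exploitation)."""
    return [g + random.gauss(0.0, sigma) if random.random() < rate else g
            for g in ind]
```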

3.2.3 Survival selection mechanism

The role of survival selection is to distinguish among individuals based on their quality and decide which ones remain in the population. A survival selection scheme can be categorized as generational (the current population is replaced by its offspring) or steady state (only a few members are replaced by offspring). Truncation is a survival selection schema in which, out of a 2n pool of parents and offspring, the top n survive to the next generation.
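A sketch of truncation survival as just described: from the combined pool of parents and offspring, only the best n individuals survive.

```python
def truncation_survive(parents, offspring):
    """Keep the top n of the combined pool of (individual, fitness) pairs,
    where n is the original population size."""
    pool = parents + offspring
    pool.sort(key=lambda pair: pair[1], reverse=True)  # best fitness first
    return pool[:len(parents)]
```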

3.2.4 Performance indicators

Following are some of the performance indicators for evolutionary algorithms:

  • Best-so-far (BSF): the best solution found by the algorithm up to and including each generation.

  • Average-so-far (ASF): the average solution found by the algorithm in each generation.

4 A framework based on evolutionary algorithm for strategy optimization (FEASO)

This section provides a comprehensive overview of the entire system architecture of the proposed framework. The framework focuses on devising an optimized strategy in robot soccer by discovering the key points where the agents should place themselves under various scenarios.

4.1 Optimization parameters

To develop an understanding of the parameters being optimized, an example from robot soccer is presented. Figure 3 depicts an instance of a simulated soccer field in which agents of both teams (red and blue) are placed. The red circles with 'O' represent the opposing team, while the blue circles represent the home team. A small white circle denotes the ball. The scenario shown is an instance of an attack being executed by the opponent team. Different roles are associated with the home team players: goalkeeper (G), main attacker (MA), left forward (LF), right forward (RF), left supporter (LS), right supporter (RS), left midfielder (LM), right midfielder (RM) and three defenders, namely left defender (LD), right defender (RD) and center defender (CD). Each of these roles occupies certain positions in the field. These positions are in turn determined by factors such as the distance of a player from the ball, the distance from the goal posts and the distances to other teammates. The red arrows highlight the variables (d1, d2, …, d9) that need to be optimized. Table 1 summarizes the relationship of these variables with the roles.

Fig. 3 Simulated soccer field with arrows denoting optimization variables (color figure online)

Table 1 Solution representation

The variables d1, d2, d3 and d4 represent the distances of LM, RM, the x coordinate of LD and the x coordinate of RD with respect to the goal post. The variables d5 and d6 represent the distances maintained between the CD player and the y coordinates of the left and right defense players. CD also maintains a distance d7 with the ball. The variables d8 and d9 denote the distances between the ball and the left and right supporters. These nine variables form a single tuple Xi. Attackers, goalkeepers and forward players are not considered in this representation.

The initial value for Xi that forms the seed is populated from the code of our home team. The rationale behind the selection of this seed is that it is a viable solution already available. If Xi is initialized randomly, then it is possible that we get distances that are not feasible for a player according to its role.
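Concretely, a candidate solution can be stored as a nine-element list and the population seeded by Gaussian perturbation of the hand-tuned distances; the numeric seed values below are hypothetical placeholders, not the team's actual parameters.

```python
import random

# d1..d9 of Table 1; the values here are illustrative placeholders only.
X_SEED = [3.0, 3.0, 1.5, 1.5, 2.0, 2.0, 1.0, 2.5, 2.5]

def seeded_population(size=10, sigma=0.5):
    """Build the initial population by Gaussian mutation of X_seed,
    so every starting solution stays close to a known-viable strategy."""
    return [[d + random.gauss(0.0, sigma) for d in X_SEED]
            for _ in range(size)]
```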

4.2 System architecture

FEASO comprises three basic modules, namely the Evolutionary Algorithm Execution Module (EAEM), the Parallel Fitness Evaluation Module (PFEM) and the Fitness Computation Module (FCM), as shown in Fig. 4.

Fig. 4 Proposed modules of FEASO

The input to EAEM is a single tuple Xi, and its output is a set of newly generated tuples formed by applying the evolutionary algorithm. These tuples are then passed to the PFEM, which replicates each one, in turn, across multiple machines. Each machine then executes FCM and sends a fitness value back to PFEM. The PFEM applies a more complex function to all the fitness values to produce the final fitness, which is returned to the EAEM, and the cycle continues. Figure 5 gives the high-level system architecture depicting all the modules.

Fig. 5 System architecture of FEASO

4.2.1 Module I: evolutionary algorithm execution module (EAEM)

This module is a vital part of the entire architecture, as it runs on the server machine and is responsible for the generation of offspring. The process starts with the generation of an initial population, represented by P. It contains a set of tuples Xi that are produced by Gaussian random mutation of the Xseed value. The fitness computation is performed by PFEM and FCM, as shown in the gray box in Fig. 6. The newly generated population is then sent to the parent selection phase, which selects a set of parents using the rank-based selection scheme. Crossover and mutation of the parents are performed, and the resultant population along with the new offspring is sent to the survival selection phase. The proposed algorithm introduces a random offspring generation phase to increase diversity in the population. This phase makes the algorithm more exploratory in nature and distinguishes it from a traditional EA.

Fig. 6 The process flow of EAEM

The algorithm for the initial generation of P is exhibited in Table 2, and the algorithm for EAEM is depicted in Table 3.

Table 2 Algorithm for initial population
Table 3 Algorithm for EAEM
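Putting the earlier sketches together, one generation of EAEM might look as follows (reusing `rank_based_select`, `one_point_crossover`, `gaussian_mutate`, `truncation_survive` and `X_SEED` from the sketches above); the 20% injection ratio is the value reported in Sect. 5, while the wide-sigma injection is an assumption.

```python
import random

def eaem_generation(population, evaluate, inject_ratio=0.2):
    """One EAEM cycle: rank-based parent selection, one-point crossover,
    Gaussian mutation, random-offspring injection, truncation survival."""
    parents = rank_based_select(population, k=len(population))
    offspring = []
    for i in range(0, len(parents) - 1, 2):
        c1, c2 = one_point_crossover(parents[i][0], parents[i + 1][0])
        offspring += [gaussian_mutate(c1), gaussian_mutate(c2)]
    # Random offspring injection: the exploratory phase that distinguishes
    # EAEM from a traditional EA.
    n_random = int(inject_ratio * len(population))
    offspring += [[d + random.gauss(0.0, 2.0) for d in X_SEED]
                  for _ in range(n_random)]
    scored = [(child, evaluate(child)) for child in offspring]
    return truncation_survive(population, scored)
```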

4.2.2 Module II: parallel fitness evaluation module (PFEM)

This module is responsible for the generation of multiple instances which are executed in parallel. Each instance Ei receives a single tuple Xi and is responsible for its replication across p workstations. After distribution, this module waits synchronously to receive fitness values from FCM. Since FCM is executed on multiple machines in parallel, different fitness values fi are received for the same Xi. The PFEM is therefore responsible for aggregating all fi values into a single value that accounts for the noise and variation in the values received. The consolidated fitness value fi is then returned to EAEM. The algorithm in Table 4 explains the working of this module.

Table 4 Algorithm for PFEM
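A minimal sketch of PFEM, assuming each of p workers plays one match with the same tuple and the per-match fitness values are averaged; the paper describes "a more complex function" for the aggregation, so the mean here is a stand-in.

```python
from concurrent.futures import ProcessPoolExecutor
from statistics import mean

def pfem_evaluate(x_i, run_match, p=3):
    """Replicate x_i across p workers in parallel, collect the noisy
    per-match fitness values f_i and aggregate them into one value."""
    with ProcessPoolExecutor(max_workers=p) as pool:
        f_values = list(pool.map(run_match, [x_i] * p))
    return mean(f_values)  # stand-in for the paper's aggregation function
```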

4.2.3 Module III: fitness computation module (FCM)

FCM is responsible for the computation of fitness corresponding to each Xi. This module is deployed on every client machine and performs the following sequence of steps, also depicted in Fig. 7:

Fig. 7 Phases of fitness computation module

  1. A code adapter configures the variables of Xi into the developed code of the RoboCup Soccer Simulation 3D League team, namely Karachi Koalas (Haider et al. 2014).

  2. The Simspark (Xu and Vatankhah 2013) server is used to execute a simulation match between two teams, T1 and T2, and to generate a log file.

  3. A parser application extracts the required game statistics from the recorded log file; the computed fitness is returned to the PFEM.

The algorithm for Fitness Computation Module is shown in Table 5.

Table 5 Algorithm for FCM
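A hypothetical sketch of the three FCM steps on a client machine; the helper callables and the `run_match.sh` script name are invented for illustration, since the paper specifies only the sequence of steps. A candidate `compute_fitness` is sketched in Sect. 4.3 below.

```python
import subprocess

def fcm(x_i, write_strategy_config, parse_log, compute_fitness,
        log_path="match.log"):
    """FCM on one client, mirroring Fig. 7:
    1) the code adapter writes X_i into the team's configuration,
    2) a Simspark match between T1 and T2 is executed, producing a log,
    3) the parser extracts game statistics and the fitness is computed."""
    write_strategy_config(x_i)                      # step 1 (hypothetical helper)
    subprocess.run(["./run_match.sh"], check=True)  # step 2 (hypothetical wrapper script)
    return compute_fitness(parse_log(log_path))     # step 3, cf. Table 6 and Eq. 4
```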

4.3 Fitness function

This section describes the factors that contribute to the fitness function of a solution.

4.3.1 Goal scored against defending team

The defending team should be able to reduce the number of goals scored by the opponent. Equation 1 describes the goal difference, where Scoreh denotes the number of goals scored by the home team, while Scoreo denotes the goals scored by the opponent.

$$ \text{GoalScored} = \text{Score}_{h} - \text{Score}_{o} $$
(1)

4.3.2 Ball possession by home team players

The defending team should maintain possession of the ball. For this research, it is assumed that the ball possessor is the player having the minimum distance from the ball. Equation 2 describes ball possession, where Cph denotes the number of cycles the ball is with a home team player and Tc denotes the total number of simulation cycles.

$$ \text{BallPos} = b_{p} = \text{Cp}_{h} / T_{c} $$
(2)

4.3.3 Average time ball out of danger zone

The defending team should be able to keep the ball out of the danger zone most of the time. For this purpose, the field has been divided into four regions, namely LeftRegion1, LeftRegion2, RightRegion1 and RightRegion2, as shown in Fig. 8. The home team in this scenario is initialized on the left-hand side of the field and the opponent on the right-hand side. The danger zone for the home team is LeftRegion1. Equation 3 describes the average time the ball is out of danger, where CLeftRegion2 denotes the number of cycles the ball is in LeftRegion2 and CLeftRegion1 denotes the number of cycles the ball is in LeftRegion1.

$$ \text{Avg}\,C_{\text{outofdanger}} = C_{\text{LeftRegion2}} / \left( C_{\text{LeftRegion1}} + C_{\text{LeftRegion2}} \right) $$
(3)
Fig. 8 Regions in a simulated soccer field

To evolve a good defensive strategy, we focus on instances where the ball is in LeftRegion1 and LeftRegion2. However, at times a situation may arise in which the ball remains mostly in the right regions. Thus, a solution is penalized if the ball spends more than δ cycles in the opponent half. The fitness function that needs to be maximized is shown in Eq. 4.

$$ f(x) = \text{GoalScored} + b_{p} + \text{Avg}\,C_{\text{outofdanger}} - \delta_{\text{cycles}} $$
(4)

The range of the fitness function comes out to be [−2, +2]. To visually verify the fitness values, the logs of a few matches were manually replayed.
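Under Eqs. 1-4, the fitness of one evaluated match can be assembled as below; the exact trigger for the δ penalty is not stated in the paper, so the threshold and penalty magnitude are assumptions.

```python
def compute_fitness(score_h, score_o, cycles_home_possession, total_cycles,
                    c_left1, c_left2, cycles_in_right,
                    delta_threshold=150, penalty=1.0):
    """Fitness per Eq. 4: goal difference + ball possession
    + average time out of danger - delta-cycles penalty."""
    goal_scored = score_h - score_o                                    # Eq. 1
    ball_pos = cycles_home_possession / total_cycles                   # Eq. 2
    left_cycles = c_left1 + c_left2
    avg_out_of_danger = c_left2 / left_cycles if left_cycles else 1.0  # Eq. 3
    # Assumed penalty rule: ball spent too long in the opponent half.
    delta_pen = penalty if cycles_in_right > delta_threshold else 0.0
    return goal_scored + ball_pos + avg_out_of_danger - delta_pen
```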

4.4 Parser application

A parser application that extracts the three factors mentioned above was developed earlier (Larik and Haider 2012); its algorithm is shown in Table 6.

Table 6 Algorithm for parser application

5 Experimental setup and results

To demonstrate the effectiveness of FEASO, a series of experiments were designed and executed to investigate:

  • the parameters of the evolutionary algorithm,

  • the evolved strategy, and

  • the performance of the evolved team against some benchmark scenarios.

The above-mentioned aspects are discussed in subsequent subsections.

5.1 Design and analysis of results for parameters of evolutionary algorithm

Initially, experiments were conducted with population sizes varying from 5 to 15. It was observed that with a small population there was little diversity in the solutions, while with a larger size, the learning process became slow. Next, between one and five fitness evaluations per solution were tried to determine the number that works best. It was found that, due to the noisy environment, a single fitness evaluation produced erratic results, while using more than three evaluations was very time-consuming. Thus, the population size was set to 10, while a solution was evaluated three times (that is, three matches were played with the same solution parameters) to get its final fitness. The hardware and software specifications are shown in Table 7.

Table 7 Specifications for FEASO

The teams that participate in the RoboCup event release their working executables, known as binaries, and after completion of the tournament, these binaries are made public. For the experiments, the binary of team RoboCanes, a leading RoboCup Soccer 3D team from the University of Miami that has been participating since the beginning of the league, was chosen. The reason for its selection was its strong attacking capability. The code of the home team, Karachi Koalas, was deployed on all the workstations, each capable of executing 20 consecutive matches of 10-min duration. Extensive shell scripting was performed for the distributed evolutionary algorithm to minimize the processing time. Karachi Koalas was initialized on the left-hand side and RoboCanes on the right-hand side of the simulated field. During the experiments, the evolutionary algorithm was run for 25 generations. The initial population was generated by applying Gaussian random mutation to the seed solution. For parent selection, the RBS scheme was implemented. The one-point crossover technique was employed because of its simplicity and because no obvious advantage was observed in initial experiments with higher-order n-point crossover. The mutation rate was varied between 0.1 and 0.5, and 0.3 was selected as the final rate. For survival selection, the truncation scheme with the steady-state model was used. For injecting random solutions, a ratio of 20% of the population size was chosen intuitively. The final features of the EA are shown in Table 8.

Table 8 Features of evolutionary algorithms
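For reference, the final configuration described above (and listed in Table 8) can be gathered into a single structure, e.g.:

```python
EA_CONFIG = {
    "population_size": 10,
    "generations": 25,
    "evaluations_per_solution": 3,          # matches played per candidate
    "parent_selection": "rank-based (RBS)",
    "crossover": "one-point",
    "mutation_rate": 0.3,
    "survival_selection": "truncation, steady-state",
    "random_injection_ratio": 0.2,          # 20% of population size
    "initialization": "Gaussian mutation of the seed solution",
}
```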

The input parameters for execution of EAEM, as discussed in Sect. 4, are presented in Table 9.

Table 9 Input parameters for EAEM

To compute the fitness of a solution, a match of 300 cycles, corresponding to 5 min, was simulated on different machines. It took almost 40 min to complete the evolution of a single generation. Initially, the evaluation function considered only the goal difference, but that was not viable, as many matches ended without a goal. Next, the average time the ball was out of the danger region was added to the fitness function, but then higher fitness was assigned to solutions in which the ball mostly moved toward the opponent region. Thus, a penalty was imposed to cut down such solutions. Figure 9 depicts the average-so-far and best-so-far curves, where the blue line represents the best-so-far graph, while the red line signifies the average achieved in each of the 25 generations. As evident from the graph, the best fitness value of 1.2 was achieved during the first five generations, and the same value dominates toward the end. This signifies that most of the improvement was made in the initial generations.

Fig. 9 Average-so-far and best-so-far curves

The average-so-far graph, on the other hand, started with a negative fitness value, but due to the truncation-based survival mechanism, the average fitness of the population had improved by the end of the experiments. Next, the best individuals of the fifth generation were compared with the best individuals of the twenty-fifth generation to see whether there is significant diversity in the population. The results shown in Fig. 10 demonstrate the proof of concept, where the blue line represents the fifth generation and the red line the twenty-fifth generation. Although the tenth individual of both populations achieved the best fitness value of 1.2, there is diversity among the rest of the individuals.

Fig. 10 Best individuals of the fifth generation versus best individuals of the twenty-fifth generation

5.2 Analysis of the evolved strategy

For analyzing the evolved strategy throughout the generations, the factors contributing to the overall fitness of a solution were given due consideration. At first, only the goals scored against the home team were considered as the fitness criterion, under the assumption that if the number of goals conceded is reduced, then the learned strategic positions are better. However, if matches ended without a goal, it was difficult to infer anything about the overall team strategy. To overcome this issue, ball possession by the home team was considered a metric for good defense. Moreover, the region where the ball resides also influences the positioning of the agents; thus, the instances in which the defenders were able to keep the ball away were also incorporated in the fitness computation. Since the log files of the matches were available, the files of the best-fitness solutions were replayed to manually verify whether there was a significant difference between the initial and final generations. The matches started with the opponent team kicking the ball into the defensive region of the home team. In the initial phases of strategy evolution, as shown in Fig. 11, if this kick from the opponent team entered the danger zone and five of the opponent agents rushed toward the defensive half, the home team defenders were able to stop the goal.

Fig. 11 Initial phases of strategy evolution

Later, as the generations evolved, it was observed that the defenders considered their distances to the ball and the goal post and aligned themselves in such a way that they were able to block the path between the ball and the goal, as depicted in Fig. 12. The role of the defender at the center was of great importance because, whether the ball came from the left or the right, it should be able to clear the ball toward the opponent half.

Fig. 12 Final phases of strategy evolution

In the evolved strategy, the defending agents were able to keep the ball out of their danger zone. It was noted that the placements of both defenders and midfielders when the goal was under attack were also appropriate in the evolved generations. Figure 13 shows a bar chart comparing the original and the evolved strategic points. The blue bars represent the original points, while the orange bars represent the evolved points. A yellow line shows the difference between the two values. As evident from the figure, there are two peaks, and there is a substantial difference between the values of the optimized distances d2 and d7.

Fig. 13 Bar chart with a line showing the difference between original and evolved strategic points

The distance d2 corresponds to the right midfielder's distance from the goal post, and d7 to the center defender's distance from the ball. Thus, these two play a vital role in the strategic placement of the team.

5.3 Benchmark scenarios

This section reports a series of matches analyzing the impact of evolution on team performance. A set of benchmark scenarios was created to test the applicability of FEASO. For this purpose, three variants of the Karachi Koalas team were devised as follows:

  • KKRandom: a team that decides strategic points randomly,

  • KKBase: a team that uses the heuristic seed values that formed our viable working strategy in the RoboCup events, and

  • KKEvolved: a team that uses the optimized variables obtained via the FEASO approach.

The selected opponent team is the same RoboCanes team that was used for offline strategy evolution.

5.3.1 Competition among the KKRandom, KKBase and KKEvolved

In this experiment, approximately 120 matches were executed involving the three team variants. Each match lasted 220 cycles, amounting to a 5-min duration, under the same time constraints and against the same opponent used in the learning phase. Table 10 shows the summarized results obtained from 40 matches with each of the teams.

Table 10 Summarized results of 40 matches executed between variants of KK

The results demonstrate that the evolved team is able to defend more goals compared to the KKRandom and KKBase teams. Furthermore, it was noted that the opponent team normally scored a goal against both KKRandom and KKBase within around 120 cycles, but against KKEvolved it was either unable to score at all or, when it did score, the goal came after 180 cycles of the match. The percentage of ball possession against each of the teams was also observed and provides further proof of concept that the evolved team's defenders are able to retain ball possession better than those of the other two teams.

Table 11 examines the statistical significance of the differences between the KKRandom, KKBase and KKEvolved teams on the basis of the goals-scored-against attribute. The p value indicates that the KKEvolved team is significantly better than the other two teams.

Table 11 Statistical test on the significance of results obtained

5.3.2 Competition with other teams participating in RoboCup tournament

This section compares the strategy obtained via evolutionary algorithms with the other approaches discussed in the literature review. Experiments were conducted with three teams, namely A, B and C. Team A uses a reinforcement learning strategy, team B uses a handcrafted strategy, and team C uses a layered learning approach. The entire focus of this paper is on learning a defensive strategy; thus, the objective is not to win against other strategies but to show that the evolved KKEvolved team performs better than the benchmarked KKBase team. The purpose of this comparison is to demonstrate that the evolved strategy can be generalized to other opponent teams.

Ten matches each were executed for KKEvolved versus team A, KKEvolved versus team B and KKEvolved versus team C, each lasting 5 min. Similarly, ten matches each were executed for KKBase versus team A, KKBase versus team B and KKBase versus team C under the same time constraint. Tables 12, 13 and 14 exhibit the summarized results.

Table 12 Summarized results of 20 matches executed between team A, KKBase and KKEvolved
Table 13 Summarized results of 20 matches executed between team B, KKBase, and KKEvolved
Table 14 Summarized results of 20 matches executed between team C, KKBase, and KKEvolved

It should be mentioned that all three teams A, B and C are strong offensive teams, with team C being the champion team of the RoboCup event. The results show a marked performance improvement for the KKEvolved team, indicating that the method is effective: the strategy learned against a single opponent generalized to other, similar opponents.

6 Comparison with most relevant evolutionary approaches

This section discusses the two evolutionary approaches that are similar to the presented FEASO framework.

As highlighted in Sect. 2, the work of Chen et al. (2016) and Ali et al. (2014) is the most recent and relevant in terms of the evolution of strategy in robot soccer. A comparison among these approaches and FEASO is drawn here to signify the novelty of the proposed framework.

6.1 FEASO versus MACO

A multi-group ant colony optimization (MACO) algorithm was proposed (Chen et al. 2016) that used ant intelligence to learn offensive strategies in the 2D soccer simulation league. In that work, three models were learned, namely a shooting model, a passing model and a dribbling model, based on the foraging behavior of ants as an effective cooperative strategy. The field was divided into regions, and the focus was on deciding the best model to pick based on the pheromone evaporation preference value. The attack environment was simulated with two attackers, two defenders and one goalkeeper in the field. The positions of all five players were set randomly, and training was conducted for up to 15,000 iterations in a single match. For evaluation of the proposed strategy, matches were conducted between the baseline team and the evolved team, concluding that the evolved MACO team enjoyed a 100% winning advantage. The approach is similar to FEASO in that we have also conducted the same kind of experiments with varying opponent teams for performance evaluation. It differs from FEASO, however, in that FEASO utilizes a distributed evolutionary algorithm for learning defense, with a focus on minimizing the number of goals conceded. Secondly, ACO falls under swarm intelligence, in which a leader dictates the behavior of the whole swarm; in the described approach, the leader is the attacker that takes the decisions. In the FEASO approach, there is no leader, and the team's strategic positioning is governed by the optimization algorithm. The MACO approach considers only a reduced version of the team for learning strategy, whereas this paper uses the complete team, which makes the optimization process much more complex. A known problem with ACO is the long convergence time required by the algorithm, whereas the FEASO approach reaches a good solution within a few generations. Finally, the MACO approach has been tested in a simple 2D environment with five players in the field; as described earlier, the 3D domain is much more complex due to its humanoid locomotion, and FEASO provides fruitful results while working in the dynamic, uncertain environment of simulation 3D.

6.2 FEASO versus adaptive CA

Ali et al. (2014) contributed the use of a simplified and adaptive version of CA to develop defensive and offensive plays in simulated robot soccer. The agents were able to develop the most suitable team formation by utilizing a set of finite state machines in the field. The number of goals scored was used as the fitness. For each individual in the population, a set of actions and a set of regions were defined. At the start of the CA, all individuals were initialized randomly. The goal was to train the team to choose the best states depending on the scenario faced. The belief space comprised general behaviors associated with the overall plan of the team. Experiments were conducted to test the effectiveness of the approach over 150 generations. After every 15 generations, statistics were computed that indicated the best regions and the best states. For performance evaluation, matches were executed against multiple opponent teams that played either offensively or defensively, and the team that used CA played better in terms of both defense and offense. The CA-based approach is very relevant, as this paper also conducts matches against multiple opponent teams and uses goals scored for fitness evaluation. However, FEASO differs in that, besides goals scored, it also considers other parameters for fitness evaluation. Secondly, the authors of the CA-based study stated that the simulator provides noisy data that affects the performance of the CA algorithm; to overcome this limitation, they developed a simplified version of the simulator in which the actions of all agents are known. In the FEASO approach, the evolutionary algorithm is robust to noise, and the approach learns strategy in an environment that contains the interactions of 22 agents at any given time instance. In addition, CA-based approaches require some knowledge in the form of a belief space, whereas this paper starts with a simple heuristic-based seed value and requires no prior knowledge at the time of evolution. Lastly, FEASO is scalable and less time expensive due to its distributed nature, a property missing in the CA-based approach. Table 15 highlights the comparison between the proposed FEASO and the two evolutionary approaches discussed in this section.

Table 15 Comparison of FEASO with most relevant approaches

7 Conclusion

This paper proposes an evolutionary algorithm-based framework for strategy optimization. RoboCup Soccer Simulation 3D was selected as the test bed for the experiments. The team evolved using the proposed EA showed better defensive capability than the original team. The simulation results show improvements in the solutions over the generations, which is encouraging and provides an initial proof of concept. The performance of the evolved team was compared against other approaches, and it was found that the evolved team defended better in terms of goals conceded. Future work involves testing and validating this optimization process on several other teams. Furthermore, a similar approach will be applied to learn offensive strategies. Finally, a multi-objective evaluation function will be formulated to handle both offensive and defensive scenarios.