1 Introduction

The primary objective of The Exoplanet Characterisation Observatory (EChO) [9, 27] is to study a representative sample of exoplanets around nearby stars, with masses ranging in sizes from Jupiter to a few Earths by using the differential technique of transit spectroscopy. Temporal variations in the observed signal from spatially unresolved observations of an exoplanet in orbit around its parent star, at different points in its orbit, will be used to determine the spectrum of the planetary atmosphere. This can be achieved using high-precision spectrophotometric observations of two types of events: (1) primary eclipse and (2) secondary eclipse, which can be also referred as transit and occultation, respectively. In order to yield measurements of sufficient Signal-to-Noise Ratio to fulfil the mission objectives, the events of each exoplanet may need to be observed several times. EChO can only examine one exoplanet event at a time, so the observations cannot be done simultaneously. In addition, several criteria have to be considered to carry out each observation: (1) exoplanet visibility, (2) time and duration of events, (3) number of events to be observed, and (4) exoplanet priority. The observations of the selected targets are only valid when the events required to reach the target Signal-to-Noise Ratio are observed. Finally, other tasks have to be considered along the mission for calibrating the telescope or transferring data from the spacecraft to stations on Earth.

A suitable long-term mission plan for all the mission time is expected to increase the efficiency of telescope operation, which will represent an important benefit in terms of scientific return and operational costs. However, this process becomes unaffordable for human planners due to the complexity in computing the huge number of possible combinations in search for an optimum solution. This class of optimization problems is considered NP-hard, and there are many mathematical tools to solve the planning/scheduling issue: from simple heuristics to more complex Artificial Intelligence (AI) approaches [8].

Although several tools for automatic computation of long-term plans have been previously developed [3, 12], they have been designed in the context of specific missions (e.g., constraints, scientific goals, proposal definition) and thus are unsuitable to the needs of the EChO mission. In this contribution we present a standalone Long Term Mission Planning Tool (LT-MPT) for EChO based on Genetic Algorithms (GAs) [15], which are an AI approach focused on emulating natural evolution by means of combining potential solutions using selection, combination and mutation operators [10]. This kind of algorithms attempt to generate solutions to optimization problems by exploring a large amount of potential solutions, including the most efficient ones [13]. A solution is considered efficient when it highly optimizes the objectives defined in the problem that, in our case, correspond to maximize the time spent on scientific observations and the scientific return, measured in terms of the coverage of the mission sample. Taking into account that new exoplanets that may be of interest to EChO will be discovered until 2022, it is necessary to have a LT-MPT able to obtain optimal planning solutions in any exoplanet sample. For this reason, the capabilities of the LT-MPT have been tested with the sample of known exoplanets and with different hypothetical scenarios of 2022.

The paper is organized as follows. Section 2 briefly summarizes the related work on automatic mission planning. Section 3 describes the planning tool in the operational design of the mission. Section 4 details the EChO mission optimization problem. Section 5 presents the LT-MPT and how GAs are applied in the optimization process. Section 6 describes the experimentation done and discuss the results. Finally, Section Section 7 ends with conclusions and further work.

2 Related work

In the past few years, the ESA has increasingly focused its attention on the use of Automated Planning and Scheduling solutions to support space operations [20]. Several ESA tools have been developed for computing a long-term mission plan for each mission, such as AIMS [24] and Local Search Solution [19] for INTEGRAL mission or the scheduler of XMM-Newton mission [2]. Outside ESA we can find SPIKE, a well-known toolkit used to promote scientific research through the effective and efficient use of ground- and space-based astronomical observatories, which is used in the long-term planning of the Hubble Space Telescope [17]. All these tools use local search algorithms and specific heuristics for finding a long-term schedule of the mission, so they may find a solution in a local optimum that is far of the global optimum [26].

Global search algorithms, such as GAs, are powerful techniques for obtaining optimal solutions to a specific problem. The successful application of these optimization techniques has been properly demonstrated in scheduling problems [4, 28] and in other domains [7, 18, 22, 23, 25]. Other tools based on applying GAs in long-term scheduling of space missions are MrSPOCK for MARS EXPRESS mission [3] of ESA, the evolution of SPIKE that will be used in James Webb Space Telescope [12] and the scheduler of SOFIA mission [5]. To complement these examples, a summary of other planning and scheduling tools used in astronomical observatories can be found in [6].

3 Long term mission planning tool in the operational design

Regardless of the algorithm used to plan the observations, the scheduling time-cycle is a critical ingredient in space missions to determine the best design approach. Figure 1 illustrates the interaction of the LT-MPT with the other control modules in the ground segment data flow of EChO. It indicates a high level of interaction between the LT-MPT, the Science Operation Centre (SOC), the Mission Operation Centre (MOC), the Instrument Operations and Science Data Centre (IOSDC), and the EChO Archive. It can be observed that SOC updates the EChO Archive after transferring data from the spacecraft to the station on Earth (downlink process), and manages the information making it available for the IOSDC. The EChO Archive sets as resolved all correct observations, so they will no longer be considered by the LT-MPT. Therefore, the LT-MPT is able to plan the remaining part of the mission when necessary, in order to add new objects to the set of exoplanets or to repeat some exoplanet observations.

Fig. 1
figure 1

LT-MPT interaction with the other elements that communicate with it, which are Science Operation Centre (SOC), Mission Operation Centre (MOC), Instrument Operations and Science Data Centre (IOSDC) and EChO Archive. The origin of the arrow indicates the element that prompts the action

Two types of interactions can be identified in the operational design between MOC and LT-MPT. The first one happens every six months and aimed at on building a new optimal mission plan according to the remaining time of the mission, and delivering the first six months to SOC. The second type of interaction is designed to respond to an unexpected problem in the observation of the planned events. In this case, the LT-MPT tries to reschedule the wrong observations in the current mission plan by placing them in the gaps between observations or replacing lower priority targets. Also, when LT-MPT delivers to SOC the first six months of the mission plan, SOC can propose some modifications (usually related to the scheduling of the downlink communications).

4 EChO mission optimization problem

The aim of this section is to present the main aspects of the mission that have to be considered for the LT-MPT design.

4.1 Operation tasks

The EChO mission will have to deal with a variety of observation patterns that are described as follows:

  • Science observations are the observations of target events. A target event is defined as a time interval when the exoplanet is transiting its host star or is hidden by it.

  • Downlink communication is used for transferring data from the spacecraft to stations on Earth. The design solution for nominal communications during science operations phase is a fixed antenna, requiring special communication attitudes during downlink. A downlink communication has to be planned every 3.5 days, with some flexibility, and has a duration of 2 hours.

  • Station keeping operations are defined to keep the spacecraft in the assigned orbit. They are initially defined to be carried out every 28 days, with some flexibility, and with a duration of 8 hours.

  • Calibration tasks based on several calibration items (e.g., instrument noise, instrument absolute wavelength or instrument pointing) will be done. One hour of calibration tasks should be planned each day if possible.

Several operation tasks have to be done in fixed slots of time and they involve a temporary stop of the scientific operations. Thus, any possible collision between them must be solved.

4.2 Constraints

A scheduling process can be considered a constraint satisfaction problem, which is a mathematical problem defined as a set of objects whose state must satisfy a number of constraints or limitations. Two kinds of constraints are identified: hard constraints and soft constraints. The first ones have to be necessarily satisfied, and the other ones express a preference of some solutions over other ones. Thus, the final scheduling solution must fulfil all the hard constraints and it should optimize the soft ones.

4.2.1 Hard constraints

The hard constraints identified in the EChO mission are:

  • Orbital Constraint: The satellite orbit constraints the visibility of exoplanets (see Fig. 2). Thus, it must be considered by the LT-MPT when computing the suitability for the observation of a target event.

  • Transit Constraint: The strategy to execute the science observations depends on exoplanet transit or occultation events and their duration. The exact occurrence of an event (T c ) can be calculated in advance. The duration of an event T o b s , see (1) results from T14, which is the time between first and fourth eclipse contacts (i.e., time between the beginning and the end of an eclipse), and an additional time (T b ) devoted to determine the flux baseline. T b is split in two time intervals: T b e f o r e , time allocated before the beginning of the eclipse, and T a f t e r , time interval after the end of an eclipse. We consider the time window of an exoplanet as the duration of an event that is visible from the telescope (i.e., when the orbital constraint of the exoplanet is achieved). Figure 3 shows a typical target event.

  • Target Completeness Constraint: This constraint is related to the science observations. In terms of scientific interest, only the observations of complete targets are useful. A target is complete when it is observed a minimum number of times. Presently, this value is adopted as 80% of the required number of event observations of each target.

  • Slewing Constraint: Pointing to a particular target and acquiring data requires a specific configuration. Thus, time to transfer to a new configuration must be taken into account when computing the mission planning. This reconfiguration time mainly depends on the slewing speed of the satellite.

  • Overlapping Constraint: A key constraint in the EChO mission is that the telescope can only do one task at a time. Thus, the LT-MPT must plan the operation tasks avoiding overlapping between them (see Fig. 4), including the time to reconfigure the system (i.e., slew time).

  • Contamination Constraint: To avoid contamination of the targets, these should be observed when the brightness of the Moon or light from other objects does not affect significantly the measurements.

$$\begin{array}{@{}rcl@{}} \begin{array}{llllll} &&T_{b} = T_{before} + T_{after} \\ &&T_{before} = \frac{T14}{2} & \quad T_{obs} = T14 + T_{b} \\ &&T_{after} = \frac{T14}{2} & \quad\phantom{T_{obs}} = 2\cdot T14 \\ \end{array} \end{array} $$
(1)
Fig. 2
figure 2

Overall sky visibility for a roll angle of 36

Fig. 3
figure 3

Transit light curve of an exoplanet with the total observation time of its event marked with a solid line

Fig. 4
figure 4

Transit light curves of different exoplanets. The best target at a given time has to be selected to avoid overlapping. In this case, the planned targets are marked with P and the unplanned ones with U

4.2.2 Soft constraints

In addition to the hard constraints identified, some soft constraints related to promote the exploitation of the resources and the scientific return can be defined. Each soft constraint is explained in the next points:

  • Promotion of the time that the telescope is observing target events throughout the mission.

  • Promotion of the reduction of the time spent in slewing for pointing to the targets in the entire mission.

  • Promotion of the number of priority targets observed during the mission. The computation of the priority of each exoplanet is detailed in the next section.

  • Promotion of the number of targets completed (observed, at least, 80% of its required number of events) at the end of the mission, because the observations of events related to incomplete targets will not be considered in the execution of the mission plan.

The optimization of these soft constraints is a key factor for obtaining a suitable mission plan because they are focused on obtaining a mission plan with an adequate exploitation of the resources and with a high scientific return. Thus, the LT-MPT proposed in the next section has the aim of obtaining a mission plan that optimizes these soft constraints and respects all the hard constraints.

4.2.3 Target priority

In EChO, the targets are classified in classes in order to guaranty that each class has some targets observed by promoting the most difficult ones to be planned. In particular, the priority of the targets is defined in two stages: (1) targets of classes with more criticality, which are the classes that are more difficult to plan in terms of number and duration of the events of their targets; and (2) within targets of the same class, the less demanding targets in terms of number of events and duration. Targets that belong to the most critical (i.e., most difficult) classes are the higher priority ones and, if several targets belong to the same class, the less critical targets (i.e., easier to observe) are the higher priority ones inside that class. A priority level is assigned to each target, this level is an increasing value starting at 1, where the lower values indicate the higher priority targets (i.e., a target with level 1 has higher priority than a target with level 2).

Finally, in terms of scientific criteria, some targets can be considered more important than other ones and should be promoted in the mission plan. This scientific priority is not considered in the current model but, if necessary, the priority level can be modified (e.g., adding new priority stages) without affecting to the process of the LT-MPT.

Note that other ad-hoc methodologies to determine target priorities can be defined. If, e.g., a specific object or a target class is deemed of great importance, individual priorities or class priority multipliers could be easily defined and accounted for. These ad-hoc target priorities are not considered in the present exercise.

5 Long term mission planning tool based on artificial intelligence

Planning of astronomical observations is an example of the classical task allocation problem known job-shop problem, where several tasks are assigned to identical resources while minimizing the total execution time. In most cases there is no single best approach to solve the planning system and, therefore, AI techniques can allow us to find a solution near to the optimal one in a reasonable time. In this section we present the LT-MPT approach for the EChO mission based on GAs [10], in spite of the standard process to be followed for building an automatic LT-MPT is independent of the AI technique used. This means that the global optimization process used can be based on other paradigms such as simulated annealing or swarm intelligence, which may perform similarly.

5.1 Long term mission planning tool process

The LT-MPT process is designed to minimize conflicts between high priority targets, downlink communications and station keeping operations, and to optimize the planning of science observations according to the constraints and objectives mentioned in Section 4. Note that the proposed process is independent from the operational design and it can be adapted to follow various approaches (i.e., order of observation and downlink placements, etc.). We can identify two main aspects based on the problem conditions described in Section 4: (1) the optimization of the positioning of downlink communications and station keeping operations focused on restricting lower priority targets, and (2) the optimization of the observation scheduling of each target event respecting the hard constraints and optimizing specific objectives. At the end of this process, a feasible and optimal Long Term Mission Plan (LTMP) is obtained. The LTMP is represented by a timeline for all the planned days (i.e., the remaining time of the mission) that indicates when each operation task will be done and its duration (see Fig. 5). Note that the LT-MPT sends to SOC only the first six months of the LTMP. Taking into account these considerations, we propose the LT-MPT process shown in Algorithm 1.

figure a
Fig. 5
figure 5

Inputs needed by the LT-MPT to obtain the LTMP that indicates in a timeline the tasks to be done at each specific time

The process specified in Algorithm 1 has several steps that are explained below:

  • Step 0, calculate the time windows of target events: This is an initial phase based on obtaining the windows of time where the target events can be scheduled. This is obtained by using the target ephemeris together with the restrictions coming from the spacecraft sky visibility data. The resulting information is a list of time windows for each target. Each time window will correspond to the duration of an event of its corresponding target t, so it will have a duration of 2⋅T14 t .

  • Step 1, clean-up impossible targets: The aim of this step is to remove targets that are unschedulable. These are targets that, according to the time available for the entire mission, cannot be observed at least m % of the requested number of events because of visibility limitations.

  • Step 2, observation planning optimization: This step places the observations of target events obtaining a LTMP. This is an optimization problem that aims at planning the observations respecting the hard constraints and optimizing some specific objectives related to promote the soft constraints. The only hard constraint that can be violated in this step is the target completeness constraint.

  • Step 3, drop observations of incomplete targets: This step responds to the target completeness constraint. Each of the incomplete targets (i.e., targets observed less than an m % of the requested number of events) of the LTMP obtained at Step 3 is attempted for completion, and if the target cannot be completed it is removed from the LTMP. At the end of this step, a feasible LTMP for the EChO mission is obtained.

  • Step 4, downlink and station keeping optimization: This part of the process is based on planning downlink communications and station keeping operations in the gaps between the observations by respecting their periodicity (with some flexibility). SOC can request modifications related to this operations according to the ground station conditions.

  • Step 5, fill gaps with calibrations: This phase is aimed at placing calibrations in the gaps of time that remain between the scheduled tasks in the LTMP obtained at Step 4. Moreover, other operation tasks or external observations can be planned in the gaps. The first six months from the resulting observation plan produced from this step are sent to SOC.

The optimization phase of Step 2 can be done by using several mathematical tools such as simple heuristics or AI techniques. We propose to apply GAs, which are able to solve optimization problems and have the ability to be adapted to new environments and constraints.

5.2 Observation planning optimization

The aim of the Observation Planning Optimization (OPO), which is the Step 2 in Algorithm 1, is to plan the science observations finding the optimal time windows for their execution, and avoiding any overlap with other operation tasks. Moreover, the LTMP has to promote the soft constraints defined in the mission. Figure 6 describes the cycle followed by the OPO algorithm. In the next points, we present the individual representation, the genetic operators, how to repair an unfeasible individual, the objective functions more suitable to this optimization problem, and how to improve a final LTMP.

Fig. 6
figure 6

Cycle followed in the OPO algorithm based on Genetic Algorithms

5.2.1 Individual representation

In a Pittsburgh-style GA, each potential solution in the genetic process is referred as individual and its representation is based on the definition of genotype, which is a set of genes that can have different values (alleles) [1, 15]. The proposed individual genotype is made up of integer numbers which represent the time windows where the targets are planned. Each individual consists of \({\sum }_{t\in T} \bar {\bar {E}}_{t}\) genes \(\{o_{1,1}, ... , o_{1,\bar {\bar {E}}_{1}}, ... , o_{\bar {\bar {T}},1}, ... , o_{\bar {\bar {T}},\bar {\bar {E}}_{\bar {\bar {T}}}}\}\), where \(\bar {\bar {E}}_{t}\) is the number of events of target t, \(\bar {\bar {T}}\) is the cardinality of the set of existing targets T, and o t,i corresponds to observation i of target t. Moreover, the o t,i value has to be between the range [-1, \(\bar {\bar {W}}_{t} - 1\)], where -1 indicates that the corresponding event does not have a window assigned (i.e., it is not planned), and \(\bar {\bar {W}}_{t}\) is the number of potential time windows of target t. The order of the targets in the genotype does not indicate a temporal sequence, it is only the order of the targets in the input file. The temporal sequence of targets is defined by the alleles, because they indicate the time window assigned to each target observation. Moreover, the individual represents all the observations required for each target but also there is not a temporal sequence in them. For instance, an observation in the position i of the observations of a specific target can be planned in a time window previous to the time window of observation i−1, and also observation i−2 can be unassigned. An example is shown in Fig. 7, where the first part of the image describes three targets to be planned (the number of events to be observed, and the time windows where they can be observed) and the second part of the image presents a possible fragment of an LTMP (note that it is not the optimal one). In particular, the LTMP is depicted with the genotype of the individual and its interpretation, where the first two genes are referred to the first target (it has two events), and the third and fourth genes to the second and third target, respectively. The allele of each gene indicates the window assigned to each observation from the potential time windows of the corresponding target (the matching between alleles and windows of each target is shown in grey colour). Note that the third target is not planned.

Fig. 7
figure 7

The example provides some targets to be planned and a potential LTMP using the proposed individual representation. The targets to be planned are described with the number of events to be observed and the time windows where they can be observed. Each gene of the individual genotype refers to a target event, and its value (allele) indicates the window assigned to each observation from the potential time windows of the corresponding target (the grey colour indicates the matching between alleles and target windows)

Individuals consist in planning the observations of the events by assigning to each potential observation a time window where an event can be observed. However, the slew time between two consecutive observations is not considered. Therefore, the slew time must be added to the time window assigned to each observation, as described in Section 5.2.3.

The initial population is built by creating N I new individuals assigning to each allele a value between the range [-1, \(\bar {\bar {W}}_{t} - 1\)]. The process to build each individual is based on placing the observations of the targets, selected in a random order, and avoiding overlap. In case of overlapping, the event is dropped.

5.2.2 Genetic operators

The GA process is roughly based on applying selection, reproduction (crossover) and mutation [10, 13] operators for several iterations.

A tournament selection strategy has been chosen for this problem because it is one of the most widely used selection strategies in GAs and it works efficiently for a wide range of problems [10], and our approach has been designed to use elitism in the population. The crossover and mutation operators are described as follows:

  • Crossover: In our case, two new individuals are obtained from two previously selected parents by using uniform crossover, which is based on assigning for each gene of the first child the allele of the same gene of the first parent or the second parent with a probability of 0.5. The alleles of the genes of each parent not assigned to the first child are copied in the corresponding genes of the second child. Parents are crossed with a specific p c probability, which means that there are some situations where parents are not crossed and the two offspring are the parents themselves.

  • Mutation: The mutation is applied to each gene of every new individual with a probability of p μ , which means that some genes are not mutated. Usually, p μ is a low value because only few genes have to be mutated in order to make minor changes to the individual, which is the key of diversity. In our approach, a mutated gene g is obtained by changing its allele with one of the potential time windows of its corresponding target. Thus, a mutated gene g changes its allele with a random value μ between the range [0, \(\bar {\bar {W}}_{t} - 1\)] (i.e., g =μ).

Note that the crossover of two feasible individuals can generate unfeasible offspring due to overlapping, and the mutation of a feasible individual can also generate an unfeasible solution. This is solved by a repairing procedure devoted to obtain new individuals with no overlapping, as the next point explains.

5.2.3 Repair of the individual

An individual represents the time windows assigned to target observations, but it does not consider the slew time between two observations. Thus, this aspect has to be considered for obtaining the final planning codified by each individual. This modification can produce an unfeasible individual because it can have conflicting observations (i.e., presence of overlaps in the observations). There are two ways for obtaining an unfeasible individual that requires repair during the GA process: (1) the individual has overlapping time windows between two or more observations, and (2) there is overlapping between two or more observations when slew time is added to each observation. We may find that it is necessary to repair the individuals after the mutation process in order to obtain feasible ones. So, the main idea of the repair operator is to solve all overlaps in the individual by changing the window assigned to the conflicting observations. The process is based on a hill climbing strategy that follows the steps of Algorithm 2. The hill climbing strategy is applied to each conflictive event, and it is considered that an individual is improved when: (1) a new window with no overlapping can be assigned to the conflictive event, the plan is improved because the conflict is solved and the event is still planned, and (2) the conflictive event is unassigned, the plan is improved because the conflict is solved. The first improvement is preferred and, for this reason, all possible windows are analysed before unassigning the event.

figure b

5.2.4 Objective functions

Two objectives to be optimized are proposed in order to promote the soft constraints of the mission. These two objectives are summarized in a single fitness function F(see 2) that evaluates each individual I. The first objective (F G ) computes the time that the telescope is not observing target events and the second one (F T ) computes the number of incomplete targets weighted with their priority (having more weight the incomplete targets with higher priority). F G is defined in (3), where o is an observation from O I , which contains all the target observations planned by individual I. T a r g e t(o) returns the target associated to the observation o, T14 i is the duration of the target event in time units of target i, and T is the collection of existing targets. The denominator of F G indicates the total input time (i.e., the required time for observing all the required events). F T is described in (4), where P r i o r i t y(t) indicates the priority of target t, and \(\bar {\bar {T}}\) is the cardinality of the set of existing targets. It can be noticed that F, F G and F T have values between 0 and 1, and they are optimized when they are minimized. So, the individual of the population that minimizes F will be the best one. When an individual is evaluated with the fitness function, it represents a feasible plan that respects all the hard constraints except the target completeness constraint. The next two sections present how to obtain the final LTMP by removing the incomplete targets and filling the gaps between observations with downlink communications, station keeping operations, calibration tasks, and additional scientific observations.

$$\begin{array}{@{}rcl@{}} && F(I) = \displaystyle \frac{F_{G}(I) + F_{T}(I)}{2} \end{array} $$
(2)
$$\begin{array}{@{}rcl@{}} && F_{G}(I) = \displaystyle 1 - \frac{{\sum}_{o\in O_{I}}~2\cdot T14_{Target(o)}}{{\sum}_{t\in T}~2\cdot T14_{t}} \end{array} $$
(3)
$$\begin{array}{@{}rcl@{}} && F_{T}(I) = \displaystyle 1 - \frac{{\sum}_{t\in T}~if~t~is~incomplete~in~O_{I},~then~\bar{\bar{T}}-Priority(t)}{{\sum}_{t\in T}~Priority(t)}~~~~~ \end{array} $$
(4)

5.3 Improvement of the long term mission plan

The final individual obtained with the GA is the individual that best optimizes the defined fitness function in the final population. However, although the number of complete targets is promoted, some targets can be incomplete (not observed at least m % of their required number of events). For this reason, in order to obtain a feasible LTMP, the incomplete targets have to be entirely removed from the plan, increasing the number of gaps. Thus, the feasible LTMP can still be further improved by filling these gaps with additional tasks. The filling process attempts to add observations of targets with high priority. This can result in an overallocation of events of a particular target. Therefore, we have limited the times that a target can be planned and it is calculated as a percentage (M %) according to their required number of events. This process is divided in three parts:

  • The first one tries to place in the gaps between observations the downlink communications and the station keeping operations required with the specified periodicity and some flexibility (defined with the parameter f, which indicates a percentage of their periodicity). If these tasks cannot be placed in gaps according to their specific periodicity, they replace the observations planned in the conflictive time ranges.

  • The second one tries to place in the remaining gaps the calibrations (one hour each day).

  • The third one tries to complete each incomplete target and removes it if the target is still incomplete. After this, the gaps are filled with the observations of targets already included in the LTMP, starting with events of those targets with the highest priority.

After this procedure, a feasible and optimal LTMP with the planning of all the required operation tasks is obtained.

6 Experiments, results and discussion

The main goal of this section is to empirically analyse the performance of the proposed LT-MPT. The results obtained with different experiments are used to prove the suitability of the LT-MPT in EChO and, at the same time, to demonstrate the feasibility of the mission. The experiments have been carried out in several mock scenarios and in a real sample of known exoplanets.

6.1 Experimental methodology

This section describes the test samples used in the experimentation, the configuration of the LT-MPT and the comparison metrics applied in the evaluation.

6.1.1 Test samples and constraints

One of the key aspects of the EChO mission is the need to cover a broad area in the parameter space in terms of exoplanet and host star configurations. A statistical analysis has been carried out to estimate the future available parameter space for EChO together with the number of transiting planets expected in the year 2022. The resulting hypothetical so-called Mission Reference Sample (MRS) [9] covers the full range of exoplanetary host systems that EChO can potentially observe according to current Signal-to-Noise Ratio requirements and conservative assumptions on instrument performance.

The experiments have been analysed in ten distinct artificial scenarios in order to test if the algorithm is able to obtain similar performance with different target lists. Each scenario corresponds to a realization of the MRS and each one characterizes 238 targets, which results in around 6200 events and requires about 27000 hours of observation. MRS realizations are calculated by randomizing several parameters according to astrophysically-sound assumptions.

Although the generated scenarios are created artificially, they are realistic in astrophysical terms and more complete than a list of currently known exoplanets. However, an additional scenario based on the known exoplanets (192 targets that result in 4065 target events and 22241 hours of observation) has been included in the experimentation with the aim of testing the proposed tool in the current time.

The EChO mission has a nominal duration of four years and a goal of six years. In the presented experiments, the scenarios are planned in five years (2022 – 2026) in order to allow extrapolating the results to four or six years. From these five years, the first six months are for commissioning, so the LT-MPT has to plan the scientific observations in the remaining four years and a half.

6.1.2 General test bench configuration

Some global assumptions for the problem constraints have to be specified when defining the test bench configuration:

  • Tests cover 5 years (2022 – 2026), with the first 6 months for commissioning.

  • 520 downlink communications with a duration of 2 hours and a periodicity of 3.5 days are considered.

  • 65 station keeping operations with a duration of 8 hours and a periodicity of 28 days are assumed.

  • 1825 calibrations of one hour every day.

  • Slew time between observations of different targets is taken into account using a slew speed of 45 degrees per 10 minutes plus a flat 5-minute overhead.

6.1.3 Algorithm configuration

In terms of parameterization, the LT-MPT has several parameters related to the OPO algorithm and the improvement processes. Table 1 summarizes the parameter configuration used in the experiments done. Note that downlink communications and station keeping operations have a flexibility to their default position of 10% (8.4 hours and 2.8 days respectively), which is a reasonable flexibility according to the ground station conditions.

Table 1 Parameter configuration of the LT-MPT

6.1.4 Comparison metrics

Four metrics have been defined in order to evaluate and compare the performance of the obtained LTMPs in the aforementioned experiments. The first two metrics are related to the use of resources:

  • Event Observation Time: It is computed as a percentage of the total input time (i.e., the required time for observing all the required events), which is the time covered by the whole sample.

  • Slew Time: It is calculated as a percentage of the whole mission time (5 years).

The other two metrics are related to the scientific return:

  • Events Planned: It is the percentage of events observed from the overall number of required events.

  • Targets Completed: It is calculated as the percentage of complete targets from the overall sample of targets. Note that the LTMP only considers observations of complete targets, which are the targets observed at least m% (in this case, m is 80) of the required number of events.

6.2 Analysis of the experimentation results

This section presents the results obtained with the aforementioned configurations of the LT-MPT, which has been run several times using 25 different random seeds for each scenario. Each of these executions is referred hereafter as trial.

In Table 2, the results obtained for the MRS sample realizations (MRS_rand_#) and the real case (Real_sample) are described in terms of the four defined metrics (event observation time, slew time, events planned and targets completed). The mean and deviation values of each metric for the 25 trials are given. Moreover, the average of each metric for the sample realizations is shown (MRS_rand Avg.).

Table 2 Results obtained (mean and deviation) with the LT-MPT with DSKO and OPO in the defined scenarios after 25 trials

Table 2 shows that the results obtained in each scenario have a small deviation, so the GA is able to explore similar regions of the search space in each trial. Moreover, results show that there are slight differences between the results obtained in the sample realization scenarios because MRS_rand Average presents a small deviation in the four analysed metrics. Therefore, the LT-MPT is able to obtain solutions of similar quality in different scenarios.

It is important to compare the performance of the LT-MPT based on GAs with a LT-MPT based on heuristics. The proposed heuristic is similar to the way in which an operator plans the observations. The main idea is to place the observations of the targets in priority order avoiding overlaps. A target observation that cannot be planned without overlap is dropped. Finally, the obtained LTMP is improved by filling the gaps and removing the targets that do not have at least 80% of the required events observed. The obtained results show that the LT-MPT based on GAs improves the Event Observation Time around 15%. This implies that the spacecraft is collecting photons around 3500 hours more in the GA approach than in the heuristic approach. Moreover, the GA approach observes 30% more events, so the observations are more efficiently planned. Therefore, the GA approach completes nearly 40% more of targets than the heuristic approach and, consequently, the result based on GA has significantly more interest from a scientific point of view. The next two sections analyse in more detail the results of the LT-MPT based on GAs.

6.3 Use of resources analysis

In terms of the event observation time metric, the telescope is observing between 80% and 95% of the total input time (i.e., the required time for observing all the required events) for each sample realization scenario and around a 99% of the time in the real sample. The duration of the available 4.5 years of the mission is 39420 hours, but one of the requirements of the mission is to allow around 20% of the available time for doing tasks of missions unrelated with the EChO goal. Thus, 31536 hours are free for doing all the operation tasks required by EChO. The overall time that EChO will be working (observations, slews, downlink communications, station keeping operations and calibrations) is, in average, 29662 hours in the artificial scenarios (24300 hours of target event observations, 1643 hours of calibrations, 1408 hours of downlink communications and station keeping operations, and 2312 hours of slewing). These working time is equal to the 94% of the free time of the mission (29662 hours worked of the 31536 hours available). Thus, due to the fact that these scenarios describe the hypothetical list of exoplanets to be observed in 2022, we can conclude that the optimization of the use of resources will be achieved.

Moreover, an interesting aspect to analyse is the number of gaps between operation tasks and their duration in the final LTMP (i.e., when all the operation tasks are planned). Figure 8 shows the number of gaps for several gap slots sizes in minutes for the artificial scenarios. It can be observed that there are a small number of gaps shorter than 10 minutes and longer than 300 minutes (5 hours). Thus, it seems that the gaps between observations are somehow commensurate to the visibility of each target event. This is supported by the fact that the algorithm obtains similar gap distribution for each scenario. Figure 9 shows the total duration of the gaps of each gap slot size in the artificial scenarios. It can be observed that the gap duration can slightly change in each scenario but they follow a similar pattern. As result of this analysis, it can be concluded that the resulting LTMP has around 9000 hours of gaps of considerable size that can be used for doing tasks unrelated with the EChO goal.

Fig. 8
figure 8

Average number of gaps for each gap slot in the artificial scenarios

Fig. 9
figure 9

Average duration in hours of the gaps of each gap slot in the artificial scenarios

6.4 Scientific return analysis

Table 6.4 shows that in the sample realization scenarios around a 93% of the required observations are planned and about a 98% of the targets are completed. In the case of the real sample, almost 99% of the events are planned and 100% of the targets are completed. These results show that the majority of the observations are fulfilled and almost all the required targets are completed (all these targets are observed for at least 80% of their required events). Therefore, it is clearly shown that the optimization of the scientific return is also achieved due to the high number of target events observed and of targets completed.

Furthermore, an important issue to be analysed from a scientific point of view is the completeness of each target class, which is the number of targets completed for each class, and it is related with the priority of the targets. Figure 10 shows the average number of targets that belong to each class (first bar, in black colour), and the average number of targets planned for each one (second bar, in grey colour) in the artificial scenarios with the 25 trials executed. It can be observed that all classes have a high class completeness rate, even those that contain more targets.

Fig. 10
figure 10

Average of the target class completeness in the artificial scenarios. For each class, the first bar (in black colour) indicates the number of targets that belong to it, and the second bar (in grey colour) indicates the average number of targets finally planned

6.5 Computational cost

The LT-MPT has been executed in a CPU Intel Core 2 Duo Processor E6600 2.40 GHz with 6GB of RAM, and the planning results of one trial are obtained in approximately 45 minutes for the sample realization scenarios and in 15 minutes for the real sample scenario.

The computational cost of the LT-MPT depends on the number of generations, the number of individuals evaluated in each generation (in our case, two individuals) and the cost of the evaluation of the objective functions and the repair of the individual. According to the individual representation described in Section 5.2.1, the total number of events required (i.e., sum of events to be observed for each target) is the aspect that affects the computational cost of the algorithm because the size of the individual depends on it. The complexity of the evaluation of each individual described in terms of Big-O notation [21] is O(2⋅e), where e is the total number of events required for all the targets; and the complexity of the repair algorithm of each individual is O(e 2). In consequence, the cost expression of the LT-MPT is g⋅2⋅(e 2+2⋅e), where g is the number of generations, and 2 refers to the number of individuals evaluated in each generation. Considering a high number of events, this results in a complexity of O(e 2), which means that the LT-MPT has a quadratic cost.

On the other hand, to guarantee that the optimal plan is found, it is necessary to use a brute-force search algorithm. This algorithm will repair and evaluate each possible combination of the observation of events, and the cost expression will be w e⋅(e 2+2⋅e), where w is the average number of windows where each event can be observed. Considering a high number of events, this results in a complexity of O(w e), which is an exponential cost and the computation of the algorithm will be unachievable.

In brief, the LT-MPT based on GAs finds a good solution in terms of use of resources and scientific return in a reasonable computing time.

7 Conclusions and further work

The allocation of tasks, by optimizing different objectives and respecting some constraints, is a key point for the success of astronomical space missions. A proper planning of the observations for EChO can represent a big difference in its efficiency and ultimate scientific performance. However, an enormous amount of possible combinations exists and a manual search for an optimal solution becomes unrealistic. Hence, it will be necessary to increase the efficiency of telescope operation by means of automatic processes that can find solutions close to optimal. Artificial Intelligence techniques can be applied in such optimization problems with high mathematical and computational complexity. In the first part of the document we have identified the role of the LT-MPT into the EChO operations design and we have discussed the main considerations of the EChO mission to be taken into account by the planning tool. In the second part of the document we have proposed a process to be followed for an automatic planning of the mission. The steps of this process are mainly based on (1) calculating the windows of time where each target event is visible, (2) cleaning up targets that cannot be observed the required number of times, (3) planning the target observations by respecting the hard constraints and maximizing the soft constraints, (4) removing observations of the targets that are not planned at least m% of their required events, and (5) filling gaps with downlink communications, station keeping operations and calibration tasks. The key step of this process is the observation planning, which is considered an optimization problem. The optimization algorithm proposed is called OPO and it defines a process with a hybrid optimization step [14] based on (1) a global search carried out with Pittsburgh-style GAs focused on planning target observations, and (2) a local search for guarantee the feasibility of the potential solutions. Moreover, in GAs the definition of the individual (i.e., knowledge representation) and the genetic operators are a key factor for obtaining suitable results [11, 16]. For this reason, the proposed algorithm makes special emphasis in these two aspects in order to properly guide the exploration of the search space to the region where the optimal solutions are.

A detailed experimentation of the LT-MPT has been presented and discussed. Specifically, the LTMPs obtained for five year mission in ten different artificial samples (sample realizations) and in a real exoplanet sample have been analysed. The results are very promising and show that the proposed LT-MPT is useful to solve this problem at a high level of optimization. In particular, the experimentation has shown that the proposed LT-MPT is able to plan observations of almost all the targets. Moreover, it has been shown that the process is able to explore similar regions of the search space in each trial and it obtains the results in a reasonable computing time. It obtains solutions of similar quality in different scenarios, so our proposal is a robust LT-MPT whose performance does not depend on the exoplanet sample. This is an important property because it is expected that the ongoing and future transit search experiments will continue discovering new targets of EChO interest, adding to the available list of exoplanets. Besides establishing a proper design of the LTMP, the experiments that we have conducted, and described here, have been used to size the EChO mission, with the aim of guaranteeing its feasibility (duration of survey, planning efficiency, slew statistics).

Finally, the proposed LT-MPT methodology is applicable to other ground and space missions because the optimization process does not use any heuristic that can limit the search of the plan, so it is only necessary to adapt the representation of the individual and the genetic operators to the problem to be solved.