1 Introduction

The ground-breaking discovery of cell reprogramming has overturned the conventional thinking that cell differentiation was irreversible. With cell reprogramming techniques, it is possible to reprogram cell fates in many different ways, such as trans-differentiation, de-differentiation, retro-differentiation, etc. This has opened up a great opportunity for regenerative medicine to treat the most devastating diseases, such as Parkinson’s disease, Alzheimer’s disease, etc.

A big obstacle for the application of in vivo cell reprogramming is the effective identification of target genes. Numerical-experimental methods are infeasible [13], due to the combinatorial complexity of target genes and high experimental costs. This gives rise to the need of computational modelling of gene regulatory networks (GRNs), which makes it possible to analyse GRNs with formal reasoning and tools. Among various modelling frameworks, Boolean networks (BNs) have distinct advantages: being simple and yet able to capture important properties of nonlinear dynamical GRNs [2]. In BNs, genes are modelled as binary nodes, being either ‘expressed’ or ‘not expressed’. Activation/inhibition regulations between genes are described by Boolean functions. The updating of nodes can be either synchronous or asynchronous. The steady states of GRNs are described as attractors, to one of which the system eventually settles down.

Asynchronous BNs are considered more realistic than synchronous BNs, because only the asynchronous updating allows for the biological processes occurring on different time-scales [7]. Owing to the non-determinism of asynchronous BNs, the control methods for synchronous BNs [3, 15] are not applicable to asynchronous BNs. Recently, a stable motif based method was developed to control asynchronous BNs [14]. However, this method does not guarantee the minimality of the control sets, which may result in unnecessary experimental costs.

The limitations of the existing methods motivate us to develop scalable, efficient and practical methods to control asynchronous BNs.

2 Control Problems in Boolean Networks

Attractors of a BN characterise cell phenotypes, which are biologically observable states [2]. Only the control of attractors is meaningful. Thus, the control objective for GRNs, in the context of asynchronous BNs, can be described as: finding a subset of nodes, called driver nodes, such that the control of these nodes can drive the network (from a source attractor) to a target attractor. If the source state is known, we call it source-target control; otherwise, we call it target control.

The control (perturbation) of a node means to change the expression of the node to either ‘1’ or ‘0’. Based on the amount of time that the control is applied, we distinguish the following three types of controls:

  1. (a)

    instantaneous control – the control is applied instantaneously;

  2. (b)

    temporary control – the control is applied for finite steps and then released;

  3. (c)

    permanent control – the control is applied for all the following time steps.

Thanks to the rapid advances in cell reprogramming, these three controls can be realised in biological experiments with different bimolecular tools. In particular, for the source-target control, based on the number of control steps, we also have:

  1. (a)

    one-step control – perturbations are applied simultaneously for one step;

  2. (b)

    sequential control – perturbations are applied in a sequence of steps.

3 Results

The major challenge in the control of asynchronous BNs lies in the infamous state space explosion problem: the state space grows exponentially with respect to the number of nodes of a BN. It prohibits the efficiency, scalability and minimality of the control methods. To cope with this problem, we employ the ‘divide and conquer’ strategy to explore both the structural and dynamical properties of a BN. As shown in Table 1, we have developed efficient methods to solve the minimal one-step source-target control with instantaneous, temporary and permanent perturbations [8, 9, 11], the minimal sequential source-target control with instantaneous perturbations [4, 5], as well as the target control with instantaneous perturbations [1]. Among these methods, sequential source-target control identifies a sequence of intermediate states and the associated perturbations. At each control step, we apply a set of perturbations, wait until the network reaches the intermediate state and then apply another set of perturbations. Based on the status of intermediate states, we have developed a general sequential control [5], where any state (transient states or steady states) can act as intermediate states, and an attractor-based sequential control [4], where only steady states can play the role of intermediate states.

Our methods are implemented as part of the software ASSA-PBN [6]. We have evaluated our methods on a variety of real-life biological networks.Footnote 1 The results show that our methods are quite efficient in terms of the computation time and scale well for large networks. Moreover, our methods identify the minimal control sets for different strategies, which can reduce the experimental costs to a great extent. For the source-target control, both sequential control methods [4, 5] require less perturbations than the one-step control [8, 9]. The attractor-based sequential control [4] is considered more practical than the general sequential control [5], since it uses biologically observable states as intermediate states and thus only requires partial observability of the system. Considering different types of perturbations, the temporary perturbation is preferable than the instantaneous and permanent perturbations due to: (1) temporary control requires the least number of perturbations, which not only leads to less experimental costs, but also makes the experiments easier to conduct; and (2) temporary perturbations are eventually released and thus less invasive. In this way, we can avoid some unknown side effects compared to permanent perturbations.

Table 1. Different control strategies for asynchronous BNs.

Furthermore, in order to make the results more practical, experimental constraints need to be incorporated. We have made the following improvements to integrate practical and experimental concerns.

  • Our methods can avoid certain perturbations (perturbing a gene from ‘expressed’ to ‘not expressed’ and/or the reverse direction) during the computation. In this case, we can avoid perturbing genes that are essential for cell survival and genes that are hard or expensive to perturb.

  • For attractor-based sequential source-target control, our method can avoid undesired attractors as intermediate states, such as apoptosis.

  • Considering the minimal solution may not be the best solution, an upper bound of the number of perturbations can be set as a prerequisite. Then our methods can compute all the solutions within that upper bound.

4 Discussion and Future Work

Due to the diversity of GRNs, it is less likely to find one strategy that suits different networks. Furthermore, the cost and success rate of different cell reprogramming techniques vary a lot. Taking these into consideration, we can compute a bunch of reprogramming paths with different control methods. Afterwards, biologists can make a choice from the provided solutions based on specific biological systems and experimental settings.

Given the strengths of temporary perturbations, currently we are working on sequential source-target control and target control with temporary perturbations. In future, we plan to extend our work to the control of PBNs [10, 12].