1 Introduction

Gene expression analysis is used to investigate the functions of gene products (RNA and protein), to improve our understanding of cellular function and disease, and to facilitate drug development [1]. Expression analysis has already revealed key regulators of various cell differentiation processes, which may help scientists establish novel cells [2]. However, little is known about the regulatory mechanisms of dynamic gene expression. Although many biological processes, such as transcription factor binding, chromatin remodeling and the cell cycle, have been reported as important factors, a systematic understanding of unidirectional cell differentiation remains to be acquired. Systems biology now offers a methodology for systematically understanding complex intracellular dynamics [3].

Modern techniques at single-cell and single-molecule resolution reveal that transcription and translation are stochastic in time and that a clonal population of cells displays heterogeneity in the abundance of a given RNA and protein per cell [4–7]. Thus, expression analysis based on probability and statistics has become an indispensable tool [8], which may shed new light on classical biological knowledge [9]. In the present article, we mathematically investigate a variety of expression patterns by analyzing a simple model of transcription. We then discuss a reduction of the model equation, which is a key step in constructing gene regulatory networks [10].

2 Mathematical Model

Based on previous papers [11], we consider the following mathematical model for a single gene expression induced by a transcription factor:

$$\begin{aligned} \mathrm{G} \ \xrightarrow []{\displaystyle \ a } \ \mathrm{G}^{*} , \qquad \qquad&\mathrm{G}^{*} \ \xrightarrow []{\displaystyle \ b } \ \mathrm{G}, \nonumber \\ \mathrm{G}^{*} \ \xrightarrow []{\displaystyle \ c } \ \mathrm{G}^{*} + \mathrm{mRNA}, \qquad&\mathrm{mRNA} \ \xrightarrow []{\displaystyle \ d } \ \phi , \end{aligned}$$
(1)

where \(\mathrm{G}\) and \(\mathrm{G}^{*}\) denote the gene in the ‘off’ and ‘on’ states, respectively, and \(\phi \) denotes degraded mRNA. Here, we assume that transcription can occur only in the ‘on’ state [12]. The parameters a and b are the probabilities per unit time of the promoter switching from inactive to active and from active to inactive, respectively, and c and d are the probabilities per unit time of transcription and mRNA degradation, respectively.

We assume that the time evolution of the mRNA copy number is modeled by a simple continuous-time Markov process whose state space is defined as

$$\begin{aligned} S = \{ (i, n) \ | \ i \in \{ 0, 1 \}, n \in {\mathbb Z}_{\ge 0} \}, \end{aligned}$$

where \(i = 0\) and \(i = 1\) correspond to the ‘off’ and ‘on’ states, respectively, and n denotes the mRNA copy number in the system. Let \(P^{(i)}_{n} (t)\) be the probability of being in state \((i, n)\) at time t, which obeys the following master equation:

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d} t} \begin{bmatrix} P^{(0)}_n \\ P^{(1)}_n \end{bmatrix} = {{\varvec{A}}} \begin{bmatrix} P^{(0)}_n \\ P^{(1)}_n \end{bmatrix} + {{\varvec{C}}} \begin{bmatrix} P^{(0)}_{n-1} - P^{(0)}_n \\ P^{(1)}_{n-1} - P^{(1)}_n \end{bmatrix} + {{\varvec{D}}} \begin{bmatrix} (n+1)P^{(0)}_{n+1} - n P^{(0)}_n \\ (n+1)P^{(1)}_{n+1} - n P^{(1)}_n \end{bmatrix}, \end{aligned}$$
(2)

where \({{\varvec{A}}} = \begin{bmatrix} -a&b \\ a&-b \end{bmatrix}\), \({{\varvec{C}}} = \begin{bmatrix} 0&0 \\ 0&c \end{bmatrix}\) and \({{\varvec{D}}} = \begin{bmatrix} d&0 \\ 0&d \end{bmatrix}\). The initial condition is

$$\begin{aligned} (P^{(0)}_{n} (0), P^{(1)}_{n} (0)) = (\delta _{0, n}, 0), \end{aligned}$$
(3)

where \(\delta \) is the Kronecker delta.
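The reactions (1) with the initial condition (3) can be simulated directly. The following is a minimal sketch that uses the Gillespie stochastic simulation algorithm, which is an assumption made here for illustration and not necessarily the simulation method used in this article. The function name `simulate_telegraph` and its arguments are hypothetical; the rates a and b are assumed to be positive.

```python
import random


def simulate_telegraph(a, b, c, d, t_end, seed=0):
    """Gillespie simulation of model (1), started from condition (3).

    Returns a dict mapping mRNA copy number n to the fraction of
    simulated time spent with n molecules (gene state marginalized).
    Assumes a > 0 and b > 0 so that the total rate is never zero.
    """
    rng = random.Random(seed)
    t, gene_on, n = 0.0, False, 0  # gene 'off', no mRNA, as in (3)
    occupancy = {}
    while True:
        decay = d * n                      # mRNA -> phi
        synth = c if gene_on else 0.0      # G* -> G* + mRNA
        switch = b if gene_on else a       # G <-> G*
        total = decay + synth + switch
        dt = rng.expovariate(total)        # waiting time to next event
        if t + dt >= t_end:
            occupancy[n] = occupancy.get(n, 0.0) + (t_end - t)
            break
        occupancy[n] = occupancy.get(n, 0.0) + dt
        t += dt
        u = rng.random() * total           # choose which reaction fires
        if u < decay:
            n -= 1
        elif u < decay + synth:
            n += 1
        else:
            gene_on = not gene_on
    return {k: v / t_end for k, v in occupancy.items()}


# Illustrative rates only (not taken from the article).
dist = simulate_telegraph(a=1.0, b=1.0, c=10.0, d=1.0, t_end=5000.0, seed=1)
mean_mrna = sum(k * v for k, v in dist.items())
print("time-averaged mean mRNA copy number:", mean_mrna)
```

For a long simulation time the time-averaged occupancy approximates the limiting distribution of the master equation (2), so the sketch can serve as an independent check on analytical results.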

3 Analysis

The limiting distribution \(\overline{P}_n\) of the system (2) and (3) is given by

$$\begin{aligned} \overline{P}_n = \frac{\gamma ^n}{n!} \mathrm{e}^{- \gamma } \frac{( \alpha )_{n}}{( \alpha + \beta )_{n}} \ _1 F_1 ( \beta , \alpha + \beta + n; \gamma ), \end{aligned}$$
(4)

where we define \(\alpha = a / d\), \(\beta = b / d\) and \(\gamma = c / d\). Here, \(( )_{n}\) is the Pochhammer symbol and \(_1 F_1\) is the Kummer function. As indicated in an earlier paper [13], (4) can be rewritten in the following integral form:

$$\begin{aligned} \overline{P}_n = \int _0^1 \frac{(\gamma x)^n}{n!} \mathrm{e}^{- \gamma x} \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{B(\alpha , \beta )} \mathrm{d} x, \end{aligned}$$
(5)

where B is the beta function. The distribution (5), which is called a Poisson-beta distribution, shows that the transcription rate can be regarded as \(\gamma x\), where x follows the beta distribution with parameters \(\alpha \) and \(\beta \). Hence, in the long-time limit, the model (1) can be approximated by the following scheme:

$$\begin{aligned} \mathrm{G}^{*} \ \xrightarrow []{\displaystyle \ c X } \ \mathrm{G}^{*} + \mathrm{mRNA}, \qquad \mathrm{mRNA} \ \xrightarrow []{\displaystyle \ d } \ \phi , \end{aligned}$$
(6)

where the stochastic variable X follows the beta distribution \(B(x; \alpha , \beta )\).
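The equivalence of (4) and (5) can be checked in a few lines. The sketch below is a standard argument and is not reproduced from [13]; it assumes the Euler integral representation of the Kummer function, the identity \(B(\alpha + n, \beta ) / B(\alpha , \beta ) = (\alpha )_{n} / (\alpha + \beta )_{n}\), and Kummer's transformation \(_1 F_1 (a, b; z) = \mathrm{e}^{z} \ _1 F_1 (b - a, b; -z)\). Starting from (5),

$$\begin{aligned} \overline{P}_n&= \frac{\gamma ^n}{n!} \frac{1}{B(\alpha , \beta )} \int _0^1 x^{\alpha + n - 1} (1 - x)^{\beta - 1} \mathrm{e}^{- \gamma x} \mathrm{d} x \\&= \frac{\gamma ^n}{n!} \frac{B(\alpha + n, \beta )}{B(\alpha , \beta )} \ _1 F_1 ( \alpha + n, \alpha + \beta + n; - \gamma ) \\&= \frac{\gamma ^n}{n!} \mathrm{e}^{- \gamma } \frac{( \alpha )_{n}}{( \alpha + \beta )_{n}} \ _1 F_1 ( \beta , \alpha + \beta + n; \gamma ), \end{aligned}$$

which is (4).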

Fig. 1 The limiting distribution of the mRNA copy number obtained from (2) and (3). The exact solutions (bold lines) are obtained from (4), and the numerical solutions (filled bars) from Monte-Carlo simulation with \(\Delta t = 0.1\). The parameters \((\alpha , \beta , \gamma )\) for each panel are: (a) (50, 50, 1), (b) (1, 10, 10), (c) (0.1, 0.1, 10), (d) (1, 1, 50)

Figure 1a–d shows the limiting distribution (4) for various parameter sets. As Fig. 1 shows, the expression patterns vary widely with the parameters \(\alpha \), \(\beta \) and \(\gamma \). From the analytical result (5), we find that the beta distribution produces the variation and the Poisson distribution guarantees the discreteness of mRNA molecules.
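The limiting distribution (4) can be evaluated numerically for the four parameter sets of Fig. 1. The following is a minimal sketch using only the Python standard library; the function names `hyp1f1` and `limiting_pmf` are hypothetical, and the Kummer function is summed by its power series, which is assumed to be adequate for the positive arguments used here. As a consistency check, the mean of (5) should equal \(\gamma \alpha / (\alpha + \beta )\), since the Poisson mean is \(\gamma x\) and the beta mean is \(\alpha / (\alpha + \beta )\).

```python
import math


def hyp1f1(a, b, z, tol=1e-16, max_terms=10000):
    """Kummer function 1F1(a, b; z) by direct summation of its power series.

    A simple sketch intended for a, b, z > 0, where all terms are positive.
    """
    term, total = 1.0, 1.0
    for k in range(max_terms):
        term *= (a + k) / (b + k) * z / (k + 1)
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def limiting_pmf(n, alpha, beta, gamma):
    """Evaluate Eq. (4) for a single mRNA copy number n."""
    # Poisson factor gamma^n e^{-gamma} / n!, computed in log space
    log_poisson = n * math.log(gamma) - math.lgamma(n + 1) - gamma
    # Pochhammer ratio (alpha)_n / (alpha + beta)_n
    ratio = 1.0
    for k in range(n):
        ratio *= (alpha + k) / (alpha + beta + k)
    return math.exp(log_poisson) * ratio * hyp1f1(beta, alpha + beta + n, gamma)


# Parameter sets (alpha, beta, gamma) from the caption of Fig. 1.
fig1_params = {"a": (50, 50, 1), "b": (1, 10, 10),
               "c": (0.1, 0.1, 10), "d": (1, 1, 50)}
for label, (al, be, ga) in fig1_params.items():
    pmf = [limiting_pmf(n, al, be, ga) for n in range(300)]
    mean = sum(n * p for n, p in enumerate(pmf))
    print(label, "sum:", sum(pmf), "mean:", mean,
          "expected mean:", ga * al / (al + be))
```

Plotting `pmf` against n for each parameter set gives one way to reproduce the qualitative shapes discussed above.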

4 Conclusion

Mathematical models of gene regulation have been studied since the 1960s [14–16]. However, classical deterministic approaches based on population-wide averages, such as statistical procedures and modeling with ordinary differential equations, are not sufficient to understand cell-to-cell variability. To understand the mechanisms of cell-to-cell variation in gene expression, we should consider intrinsic and extrinsic noise (‘biological noise’) when constructing a mathematical model [4]. In the present article, we considered a simple model of transcription with only two gene states (‘on’ and ‘off’) and investigated the probability distribution of the mRNA copy number. We found that the limiting distribution can be described by the Poisson-beta distribution, which represents four different types of expression patterns (Fig. 1). Thus, the classical model (1) can be approximated by the scheme (6) in the long-time limit.