1 Introduction

With the emergence of systems biology and synthetic biology, concepts and methods from mathematics, physics and engineering are increasingly used in the life sciences [1, 2, 14]. In particular, two central goals of this field are to predict the dynamics of gene expression based on mathematical descriptions of the genetic networks of a cell and to design genetic circuitry based on well-characterized regulatory elements [6, 17, 20, 46]. The progress of this research program has, however, also highlighted a number of generic complications that arise from the fact that all genetic circuits function in a cellular chassis that is itself dynamic and adapts to external conditions, which can have unexpected effects on circuit function [24, 26, 28, 39, 42]. This observation raises the question of which mathematical description is appropriate for genetic circuitry in a dynamic cell. In particular, even if the external conditions are constant and the cells exhibit ‘balanced growth’ (a steady state of all cellular parameters except for the overall exponential growth of the culture), each individual cell grows and divides and, while doing so, doubles its content of all cellular components. Some of these components will clearly affect the function of any gene circuit, the most important example being the duplication of the circuit genes themselves. In mathematical models of genetic circuits, these effects are often not treated explicitly; instead, they are described by an average gene copy number and an effective degradation of the protein that mimics the dilution of a protein concentration due to cell growth in the absence of its synthesis. In this article, we therefore ask how strongly cell growth within the division cycle affects gene expression and whether models that do not describe growth and division explicitly introduce large errors through that approximation.

Another facet of the question which mathematical description to use is the question whether such a description should be deterministic or stochastic. It has been realized in recent years that often the relevant molecules are present in the cell in low copy numbers, giving rise to large fluctuations and thus requiring a stochastic description of gene expression [15, 18, 21, 29, 36]. The foundations for this view have been laid long ago [3, 40], but the progress in single-molecule and single-cell technology now allows the direct observation of these effects and the quantification of fluctuations from time series or from cell-to-cell variability [15, 16, 43, 48]. Stochasticity in gene expression has been studied extensively from a theoretical point of view, see e.g. [3, 4, 10, 19, 27, 34, 35, 38, 41, 44]. Here we ask about the sources of stochasticity, as noise can be generated at many points in the process of protein synthesis and by the partitioning during cell division. Many of the noise sources have been studied before, but we are interested in a systematic comparison of their impact. Specifically, we ask whether there is a dominant source of noise, and whether the noise predicted from models with explicit cell growth and division differs from what is obtained from implicit cell division models.

It turns out that the question of stochasticity and the dependence of gene expression on the growth and division cycle are closely related: The variation of a protein concentration during the division cycle is observed as a cell-to-cell variation in that concentration in snapshots of cell cultures (where the division cycles of different cells are typically not synchronized, i.e. different cells divide at different times). We therefore also determine the effective ‘noise’ that arises from the dependence on the division cycle (which in fact is a deterministic component of the observed ‘noise’ and is seen as part of the so-called ‘extrinsic noise’ that is common to different genes [15, 41]).

The paper is organized as follows: We start with deterministic descriptions of gene expression in Sect. 2, where we discuss the effects of the division cycle and approximations that ‘average out’ the division cycle. In Sects. 3 and 4 we discuss several simple models that describe various processes of gene expression stochastically to address the question of the relative importance of various sources of stochasticity. We derive analytical results for some key characteristics of the noise. Here we focus on intrinsic noise, i.e. noise inherent in the synthesis and division process and specific to one gene. Extrinsic noise is discussed in Sect. 5, where we come back to the dependence of protein concentrations on the division cycle and show that the effective ‘noise’ resulting from this dependence is small (Sect. 5.1). In addition, we also include a discussion of fluctuations of the growth rate (Sect. 5.2). We end with some general conclusions in Sect. 6, where we summarize the relative importance of various sources of noise and cell-to-cell variations and discuss the minimal ingredients to arrive at realistic descriptions of gene expression.

2 Deterministic Descriptions of Gene Expression

2.1 Basic Model

We start by discussing a simple deterministic model of protein synthesis that accounts for the effects of the cell division cycle, specifically cell division itself and gene duplication, on protein synthesis. Living cells grow and divide, while proteins are continuously synthesized inside the cell. We determine the amount of protein synthesized within a cell cycle and the corresponding concentrations for both exponential and linear cell growth.

The number of copies of a specific protein in a cell, P(t), is described by the following dynamics:

$$\dot{P}(t)=\alpha g-\beta P(t),$$
(1)

where α is the protein synthesis rate, g is the gene copy number and β is the protein degradation rate (typical parameter values are summarized in Appendix A.1). Throughout this work, we will assume that the proteins are stable (β≈0), as is typically the case for bacterial proteins [31].

While the proteins are synthesized, the cell also grows and divides. Divisions take place at integer multiples of the doubling time T. Here we treat cell division as a deterministic process that occurs instantaneously. At the time of division, the amount of our protein of interest is divided equally among two daughter cells, so that its amount per cell is simply divided by 2. The same partitioning applies to all other contents of the cell, and therefore, in a steady state of growth, all content of the cell has to be doubled between divisions. Specifically, we are interested in the doubling of the gene that encodes our protein of interest. This gene, which we assume to be present as a single copy in the genome of the cell, is doubled at a time \(t_x\) after the last division (and, of course, divided by 2 at the time of division). Therefore, the gene copy number g that enters Eq. (1) is given by g=1 for times \(0\le t<t_x\) after division and by g=2 for times \(t_x\le t<T\). Another important characteristic of the cells that has to double over the doubling time is the total cell mass or the cell volume. We will come back to this point below, when we discuss the concentration of the protein.

Now we consider our gene to be in a ‘steady state’ of protein synthesis, in the sense that the protein level only depends on the time in the division cycle, but is the same if corresponding time points in different cycles are compared. In that case, the protein copy number at the end of the cycle is exactly twice that at its beginning, i.e. \(P(t=T)=2P_0=2P(t=0)\) (here and in the following, we measure time with respect to the time of division, i.e. assume that divisions take place at integer multiples of T). This condition, which can be considered as a singular boundary condition for Eq. (1) with times restricted to the interval [0,T], determines the time course of the copy number of our protein of interest per cell,

$$P(t) = \begin{cases}\alpha(t + 2T - t_x) & \mbox{for }0\le t \le t_x \\[3pt]2 \alpha(t + T - t_x) & \mbox{for } t_x < t \le T.\end{cases}$$
(2)

Immediately after division, there are \(P_0=\alpha(2T-t_x)\) copies of the protein in the cell, and the same number is synthesized over the doubling time T (Fig. 1). This synthesis occurs in two phases, from one or two copies of the gene, respectively. One can define an effective synthesis rate \(\alpha_{\mathrm{eff}}=\alpha(2-t_x/T)\); then, the number of proteins synthesized over the division cycle has the intuitive form \(\alpha_{\mathrm{eff}}T\).
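As a quick check of Eq. (2), the following minimal Python sketch evaluates the copy number at the start and end of a cycle for the parameter values quoted in the caption of Fig. 1. The function name `protein_number` is an illustrative choice and not part of the original model description.

```python
def protein_number(t, alpha, T, t_x):
    """Steady-state protein copy number P(t) of Eq. (2) for 0 <= t <= T."""
    if not 0.0 <= t <= T:
        raise ValueError("t must lie within one division cycle [0, T]")
    if t <= t_x:
        return alpha * (t + 2.0 * T - t_x)   # one gene copy
    return 2.0 * alpha * (t + T - t_x)       # two gene copies


T, t_x = 60.0, 30.0        # minutes (values of Fig. 1)
alpha = 5000.0 / T         # proteins per minute

P0 = protein_number(0.0, alpha, T, t_x)   # copy number right after division
PT = protein_number(T, alpha, T, t_x)     # copy number right before division
alpha_eff = alpha * (2.0 - t_x / T)       # effective synthesis rate

print(P0, PT, alpha_eff * T)
```

With these values the copy number doubles from 7500 to 15000 over the cycle, and the number synthesized per cycle equals the effective rate times T.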

Fig. 1

Variation of the protein number P(t) (a) and concentrations \(p_{\mathrm{lin}}(t)\) and \(p_{\mathrm{exp}}(t)\) (b) over the cell division cycle. (a) The protein copy number increases from \(P_0=7500\) to \(2P_0=15000\) during a cell division cycle. Note that the protein synthesis rate doubles at time \(t_x\) after each cell division, when the gene is replicated. (b) The corresponding protein concentration decreases transiently during the division cycle. This effect is more pronounced for linear volume growth (solid blue line) than for exponential volume growth (green dashed line) within the division cycle. The parameters are α=5000/T, T=60 min, \(t_x=30\) min, \(V_0=0.5\ \mu\mathrm{m}^3\) (Color figure online)

We now turn to the corresponding concentration of the protein. This will be denoted by p and is given by p=P/V, the number of protein molecules per cell divided by the cell volume V. It therefore also depends on the time course of the cell volume over the division cycle. The functional form of that time dependence has been debated for a long time, see for example a recent discussion in Ref. [13]. Here we use two models that have been proposed, namely linear and exponential growth of the cell volume, which we indicate by the subscripts ‘lin’ and ‘exp’, respectively.

We denote the cell volume at the beginning of a cycle by \(V(t=0)\equiv V_0\). In a steady state of growth, this volume must have doubled at the end of the cell cycle, such that \(V(t=T)=2V_0\). Using this constraint, the cell volume V(t) is given by

$$V_{\mathrm{lin}}(t)=V_0 (1+t/T)$$
(3)

for linear and by

$$V_{\mathrm{exp}}(t)=V_0\, 2^{t/T}$$
(4)

for exponential growth. As a consequence, the concentration of our protein at the beginning and at the end of a division cycle is equal, \(p(t=0)=p(t=T)=P_0/V_0\). However, it decreases between divisions as the protein copy number initially grows more slowly than the volume. When the gene is duplicated, the protein copy number growth speeds up and becomes faster than volume growth, and the concentration increases for times \(t_x<t<T\) such that it returns to its initial value. This temporary decrease of the protein concentration is more pronounced for linear than for exponential volume growth, as can be seen in Fig. 1(b). The extent of this decrease depends on the timing of gene duplication (which is dependent on the position of the gene with respect to the origin of DNA replication [7, 12]). For example, in the extreme case, where the gene is duplicated immediately after or before cell division, the protein content increases approximately linearly, and thus, for linear volume growth, the concentration is almost constant over the division cycle. We will come back to this point in Sect. 5.1, when we discuss the contribution of division cycle effects to the observed ‘noise’ in the protein content.
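The size of the transient dip can be illustrated numerically. The sketch below assumes the two volume laws described above (volume doubling linearly or exponentially from \(V_0\) to \(2V_0\) over one cycle) and the parameter values of Fig. 1; the function names and the time grid are illustrative choices.

```python
T, t_x, V0 = 60.0, 30.0, 0.5     # min, min, um^3 (values of Fig. 1)
alpha = 5000.0 / T               # proteins per minute


def protein_number(t):
    """Copy number P(t) of Eq. (2) within one cycle, 0 <= t <= T."""
    return alpha * (t + 2 * T - t_x) if t <= t_x else 2 * alpha * (t + T - t_x)


def volume_lin(t):
    """Linear volume growth from V0 to 2*V0 over one cycle."""
    return V0 * (1.0 + t / T)


def volume_exp(t):
    """Exponential volume growth from V0 to 2*V0 over one cycle."""
    return V0 * 2.0 ** (t / T)


times = [i * T / 600 for i in range(601)]   # grid that contains t_x exactly
p_lin = [protein_number(t) / volume_lin(t) for t in times]
p_exp = [protein_number(t) / volume_exp(t) for t in times]

p_start = p_lin[0]                          # equals P0 / V0 for both laws
dip_lin = 1.0 - min(p_lin) / p_start        # largest relative decrease
dip_exp = 1.0 - min(p_exp) / p_start
print(f"relative dip: linear {dip_lin:.3f}, exponential {dip_exp:.3f}")
```

For these parameters the lowest concentration is reached at the time of gene duplication, and the dip is roughly twice as deep for linear as for exponential volume growth.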

2.2 Population Averages

The dynamics described so far is observable in experiments that track the content of specific proteins in single cells. Such experiments have been done (e.g., [11, 43, 48]), although most of these studies were more focused on stochastic effects. In many experiments, however, what is observed is the population average of the protein content per cell. Unless the cell culture is specifically prepared to synchronize the division cycles of these cells, the population will consist of many cells (\(\sim10^{9}\) in a typical bacterial culture) that divide in an asynchronous fashion. Averages of cellular properties over such populations will in general depend not only on the dynamics of the observable over the division cycle, but also on the age distribution in the population, i.e. the distribution of the time points in the division cycle at which these cells are. The latter depends on the experimental setup. We consider two cases, an exponential and a constant age distribution. The exponential age distribution,

$$\phi(t)=\frac{2\ln2}{T}\,2^{-t/T},$$
(5)

applies to asynchronous cultures with an exponentially growing population size, where there are more young cells than old cells. The average age of a cell in such a culture is \(\langle{t}\rangle=\int^{T}_{0}t\phi (t)\,dt=T (1/\ln2-1)\approx0.44 T\).

In addition we consider a constant age distribution, ϕ(t)=1/T, which is obtained if, for example, after each cell division only one of the daughter cells is kept and analyzed. An example of such an experimental setup is the ‘mother machine’ that was described recently [47].

The protein copy number per cell, averaged over such an exponentially growing population, is given by

$$ \langle P\rangle=\int_0^TP(t)\phi(t)\,dt= \frac{\alpha T2^{1-t_x/T} }{\ln2}.$$
(6)

This result can be rewritten as \(\langle P\rangle=\alpha\langle g\rangle/\beta_{\mathrm{eff}}\) with an effective degradation rate of the protein \(\beta_{\mathrm{eff}}=(\ln2)/T\) that describes the loss of proteins due to cell division (with half-life equal to the doubling time of the cells), and the average copy number of the gene \(\langle g \rangle=2^{1-t_{x}/T}\). For comparison, the average protein copy number per cell in a population with constant age distribution is \(\langle P\rangle =\alpha T [3- 2t_{x}/T + \frac{1}{2}(t_{x}/T)^{2}]\). Notice that this is in general not equal to \(3P_0/2\). A numerical comparison with Eq. (6) shows that the average protein number is approximately 4 % larger with the constant age distribution than with the exponential age distribution.
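Both population averages can be checked by direct numerical integration. The sketch below assumes the exponential age distribution \(\phi(t)=(2\ln2/T)\,2^{-t/T}\), which is the form consistent with the mean age quoted above; the parameter values and the helper `age_average` are hypothetical choices for illustration.

```python
import math


def protein_number(t, alpha, T, t_x):
    """Copy number P(t) of Eq. (2) within one cycle, 0 <= t <= T."""
    return alpha * (t + 2 * T - t_x) if t <= t_x else 2 * alpha * (t + T - t_x)


def age_average(f, phi, T, n=20000):
    """Midpoint-rule estimate of the integral of f(t)*phi(t) over [0, T]."""
    h = T / n
    return sum(f((i + 0.5) * h) * phi((i + 0.5) * h) for i in range(n)) * h


alpha, T, t_x = 0.5, 40.0, 10.0   # hypothetical values (1/min, min, min)
x = t_x / T


def phi_exp(t):
    """Exponential age distribution (assumed form, see lead-in)."""
    return 2.0 * math.log(2.0) / T * 2.0 ** (-t / T)


def phi_const(t):
    """Constant age distribution."""
    return 1.0 / T


def P(t):
    return protein_number(t, alpha, T, t_x)


mean_exp = age_average(P, phi_exp, T)
mean_const = age_average(P, phi_const, T)
mean_exp_closed = alpha * T * 2.0 ** (1.0 - x) / math.log(2.0)    # Eq. (6)
mean_const_closed = alpha * T * (3.0 - 2.0 * x + 0.5 * x * x)     # constant ages

print(mean_exp, mean_exp_closed)
print(mean_const, mean_const_closed, mean_const / mean_exp)
```

The numerical averages agree with the closed-form expressions, and the ratio of the two averages stays close to 1.04.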

The average concentration can be calculated in the same way, but is more involved due to the age-dependence of the volume. We give only the result for exponential volume growth and an exponential age distribution. In this case, we obtain

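$$\langle p\rangle=\int_0^T\frac{P(t)}{V_{\mathrm{exp}}(t)}\phi(t)\,dt=\frac{\alpha T}{V_0} \biggl[1-\frac{t_x}{2T}+\frac{1+2^{1-2t_x/T}}{4\ln2} \biggr].$$
(7)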

This can be compared to the ‘mean field’ result \(\langle p\rangle\simeq\langle P\rangle/\langle V\rangle\) that is obtained from the average protein number and the average volume. Using \(\langle V\rangle=(2\ln2)V_0\), that approximation leads to

$$ \langle p\rangle \simeq\frac{\alpha T\, 2^{1-t_x/T} }{2 (\ln 2)^2V_0}\simeq 1.04 \frac{\alpha T \langle g \rangle }{V_0}.$$
(8)

A numerical comparison with the exact result shows that they differ by less than 0.3 % for all values of the replication time \(t_x\). Likewise, we find that the average concentrations for linear and exponential volume growth also differ only by a few percent.
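The comparison can be reproduced with a few lines of Python. In this sketch the exact average is obtained by midpoint integration of \(P(t)\phi(t)/V(t)\), assuming exponential volume growth and the exponential age distribution \(\phi(t)=(2\ln2/T)\,2^{-t/T}\); time is measured in units of T and concentration in units of \(\alpha T/V_0\). The function names are illustrative.

```python
import math


def mean_concentration_exact(x, n=4000):
    """<p> in units of alpha*T/V0 for t_x = x*T, by midpoint integration."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h                                  # age in units of T
        P = (s + 2.0 - x) if s <= x else 2.0 * (s + 1.0 - x)   # Eq. (2)
        volume = 2.0 ** s                                  # exponential growth
        phi = 2.0 * math.log(2.0) * 2.0 ** (-s)            # age distribution
        total += P / volume * phi
    return total * h


def mean_concentration_mean_field(x):
    """<P>/<V> of Eq. (8) in units of alpha*T/V0."""
    return 2.0 ** (1.0 - x) / (2.0 * math.log(2.0) ** 2)


xs = [i / 20 for i in range(21)]                           # t_x / T from 0 to 1
max_dev = max(abs(mean_concentration_mean_field(x) / mean_concentration_exact(x) - 1.0)
              for x in xs)
print(f"largest relative deviation: {max_dev:.2e}")
```

On this grid of replication times the largest relative deviation stays well below the 0.3 % bound stated above.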

2.3 Averaging out the Cell Division Cycle

The observation that the ‘mean field’ approximation for the protein concentration given in Eq. (8) is rather accurate suggests that the dynamics on time scales that are longer than the generation time can actually be described by the following equation

$$ \dot{p}=\frac{\alpha \langle g\rangle }{\langle V \rangle } -\beta_{\mathrm{eff}} p,$$
(9)

with \(\beta_{\mathrm{eff}}=\ln2/T\) as before (or \(\beta_{\mathrm{eff}}=\beta+\ln2/T\), if the protein is unstable). The equation can also be interpreted as describing the dynamics of the average concentration in a population of non-synchronized cells. Through \(\beta_{\mathrm{eff}}\), the equation describes the loss of protein due to growth and division of the cells as an effective degradation. As the protein concentration is actually diluted by volume growth throughout the division cycle (in contrast to the protein number per cell, which is reduced instantaneously by 50 % at division), and thus its variations through the cycle are relatively small (Fig. 1b), this approximation can be expected to be quite good. The same approximation can also be used for the average protein copy number per cell, but there one has to keep in mind that the neglected variations over the division cycle are stronger, as the protein number P varies 2-fold over the cycle.
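As a consistency check (a short worked step, with the symbol \(p^{*}\) for the steady-state concentration introduced here for illustration), setting the time derivative in Eq. (9) to zero and inserting \(\beta_{\mathrm{eff}}=\ln2/T\) and \(\langle V\rangle=(2\ln2)V_0\) recovers the mean-field concentration of Eq. (8). For a constant synthesis term, the concentration relaxes exponentially towards this value:

$$p^{*}=\frac{\alpha\langle g\rangle}{\langle V\rangle\beta_{\mathrm{eff}}}=\frac{\alpha T\langle g\rangle}{2(\ln2)^2V_0},\qquad p(t)=p^{*}+ \bigl[p(0)-p^{*} \bigr]e^{-\beta_{\mathrm{eff}}t}.$$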

2.4 A Remark on Messenger RNA

Protein synthesis is a process that occurs in two steps, transcription and translation. In the first step, the gene sequence is copied into a mRNA, which subsequently serves as a template for protein synthesis. A more complete description of the process thus describes the copy numbers of the protein (P) and of the mRNA (M),

$$\dot{M}=\alpha_m g-\beta_m M,\qquad \dot{P}=\alpha_p M-\beta_p P,$$
(10)

with \(\alpha_m\), \(\alpha_p\) and \(\beta_m\), \(\beta_p\) being the synthesis and degradation rates of mRNA and protein, respectively. In many cases, however, mRNA is rather short-lived and one can approximate the equation for M by its steady state, \(M=\alpha_m g/\beta_m\). In that case, we are back to Eq. (1) with \(\alpha=\alpha_p\alpha_m/\beta_m\).

This approximation is particularly suited for gene expression in bacteria, where mRNA lifetimes are typically of the order of a few minutes [5, 43], while proteins, as mentioned above, are mostly stable [31, 37]. This means that when a gene is turned off and synthesis of the corresponding mRNA and protein is stopped, the mRNA will disappear with a half-life of a few minutes, while the protein is diluted out through cell growth and division and its half-life is given by the doubling time, which is typically of the order of 1 hour (the range for E. coli is from 20 min to many hours).

3 Sources of (Intrinsic) Stochasticity

As mentioned in the introduction, the copy numbers of some proteins can be small, so that fluctuations play an important role, and stochastic descriptions of the dynamics of gene expression are required. In general, all steps in the synthesis pathway of proteins are stochastic processes. The same is true for the degradation of the protein if that protein is unstable. In addition, the partitioning of the copies of that protein during cell division also adds to the noise. We will now consider these different sources of noise separately to characterize the noise arising from different sources in a systematic way (footnote 1). In these considerations, we aim at understanding the relative importance of different sources of stochasticity rather than at accurately capturing the complicated processes that govern protein production in precise biological detail. Specifically we ask which sources contribute to the noise level observed in the protein number and whether there is a dominant source. In this sense, the most realistic model is the one that includes stochastic effects in all processes considered here, but we are interested in whether a reduced model may be sufficient.

We use a bottom-up approach to study the contributions from cell division, protein synthesis, and finally transcription and translation. We start with a stochastic version of the models described in Sect. 2.1, i.e. with models that treat protein synthesis as a simple one-step process. Effects that are due to the two-step nature of protein synthesis (transcription and translation) will be discussed later in Sect. 4. The most basic model thus describes protein synthesis and cell division, and we study three versions of this scenario. First, we take the partitioning of proteins into daughter cells upon division to be stochastic (Sect. 3.1), but describe protein synthesis deterministically. Second, we treat protein synthesis as a stochastic process but partitioning during cell division as deterministic (Sect. 3.2). Finally, both synthesis and cell division are considered as stochastic processes (Sect. 3.3). Our analysis shows that the two noise sources contribute similarly to the overall noise, so neither noise source is dominant.

In Sect. 4, we discuss models that explicitly treat protein synthesis as occurring in two steps, transcription and translation. The resulting noise is then characterized in terms of a parameter termed ‘burst size’, which is the average number of proteins synthesized per mRNA copy. Here, high burstiness leads to a significant increase in the noise, with bursty protein synthesis then being the dominant source of stochasticity. Thus, under conditions of high burstiness, a reduced model that neglects other sources of noise can provide a realistic description of the dynamics.

All the sources of stochasticity we discuss here produce so-called intrinsic noise [15], i.e. the fluctuations are specific to the gene/protein under consideration and the fluctuations in the level of two different proteins are uncorrelated. Sources of extrinsic noise, which affects all genes, will be discussed in Sect. 5.

3.1 Stochastic Partitioning During Cell Division

We first consider the case, where protein synthesis is described by a deterministic process, but where proteins are distributed stochastically into the daughter cells during cell division. Specifically, we consider the case, where each copy of the protein has probability r=1/2 to end in each of the two daughter cells. This means that in every generation a constant number Q=αT of proteins is newly synthesized, but the initial copy number of the protein at the beginning of the division cycle fluctuates due to the stochastic partitioning during cell division. Figure 2(a) shows a time series of such a process as obtained from simulations.

Fig. 2

Stochastic models of protein synthesis: (a)–(c) Trajectories of the protein copy number from stochastic simulations with stochastic synthesis, cell division, or both, all with cell division modeled explicitly. (d) Noise strength \(\eta^2\) as a function of the average protein copy number \(\langle P\rangle\) (varied by varying the synthesis rate α) for the different models (for the models with explicit cell division, averages over cells immediately after division are plotted, i.e. \(\eta_{0}^{2}\) and \(\langle P_0\rangle\)). (e) Trajectory of the protein copy number for a model with implicit cell division, i.e. where cell division is described by an effective degradation rate \(\beta_{\mathrm{eff}}\). The corresponding curve in (d) lies on top of the curve for stochastic synthesis and stochastic division. The parameter values used for these plots are α=0.5/min, T=40 min, and, in (e), β=ln2/T (Color figure online)

For this case, a number of characteristics can be obtained analytically using a method described in Ref. [8], which we summarize briefly in Appendix A.2. For example, the average copy number of the protein directly after cell division is \(\langle P_0\rangle=Q=\alpha T\) and the variance of that number is \(\delta P_{0}^{2}=2Q/3\). Two commonly used characteristics of noise are the noise strength \(\eta^2\) defined as

$$ \eta^2=\frac{\langle (P-\langle P\rangle )^2\rangle }{\langle P \rangle ^2}$$
(11)

and the Fano factor \(F=\eta^2\langle P\rangle\). \(\eta^2\) typically scales as \(\eta^2\sim1/\langle P\rangle\), so the Fano factor characterizes the prefactor of that scaling. In our specific case, we obtain

$$ \eta_0^2=\frac {2}{3\langle P_0\rangle }$$
(12)

or \(F_0=2/3\) (the index ‘0’ in these expressions indicates that we have taken averages over a population of cells immediately after division), as plotted in Fig. 2(d).
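One way to recover these moments, sketched here under the stated assumptions and not necessarily the route taken in Appendix A.2, is a recursion from one generation to the next. Writing \(P_0'\) for the copy number after the next division (notation introduced here), \(P_0'\) is binomially distributed with \(P_0+Q\) trials and probability 1/2, so the laws of total expectation and total variance give

$$\langle P_0'\rangle=\frac{\langle P_0\rangle+Q}{2},\qquad \delta {P_0'}^{2}=\frac{\delta P_0^2}{4}+\frac{\langle P_0\rangle+Q}{4}.$$

In the steady state the primed and unprimed moments coincide, which yields \(\langle P_0\rangle=Q\) and \(\delta P_0^2=2Q/3\), and hence \(F_0=2/3\).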

3.2 Stochastic Protein Synthesis

Next we consider the stochasticity that is inherent in the protein synthesis process itself. To disentangle it from the effects of stochastic partitioning we first describe partitioning deterministically, i.e. we consider the case where each daughter cell inherits exactly one half of the protein molecules (Fig. 2b).

We consider again one lineage of cells. Between two cell divisions proteins are synthesized stochastically with rate α. At the time of cell division (integer multiples of the doubling time T), the protein number is divided by two (if the protein number P is an odd number, we take the number after division to be either (P+1)/2 or (P-1)/2, each with probability 1/2, so strictly speaking, there is a minimal remnant of stochasticity in our deterministic description of division as well). To keep the discussion simple, we assume here that the synthesis rate is constant, i.e. we neglect the fact that the synthesis rate changes upon duplication of the gene. We find

$$\langle P_0\rangle =\alpha T,\qquad\delta P_0^2=\frac {\alpha T}{3}\quad\mbox{and}\quad \eta_0^2=\frac {1}{3\langle P_0\rangle }.$$
(13)

The last result implies that the Fano factor is \(F_0=1/3\), which is just half of what we have seen for stochastic partitioning (Eq. (12)).

3.3 Both Sources Combined

Now let us combine the two sources of stochasticity discussed so far and consider the case where both protein synthesis and partitioning are stochastic (Fig. 2c). Using again the method of Ref. [8], we obtain

$$ \langle P_0\rangle =\alpha T,\qquad\delta P_0^2=\alpha T,\quad\mbox{and}\quad \eta_0^2=1/\langle P_0\rangle .$$
(14)

Two points are noteworthy here: (i) The noise strengths (\(\eta^2\)) of independent noise sources are additive. In our case, the noise in Eq. (14) is the sum of the noise components that arise from stochastic partitioning (\(2/(3\langle P_0\rangle)\)) and from stochastic synthesis (\(1/(3\langle P_0\rangle)\)). (ii) The contributions from both sources of noise are of the same order; there is no dominant source of noise in this simple case.
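The three Fano factors can be checked with a small lineage simulation. The following is a minimal sketch, not the simulation code used for Fig. 2: it assumes a constant synthesis rate (no gene duplication), follows a single lineage, and uses only the Python standard library. All function names, the number of generations and the random seed are illustrative choices.

```python
import random


def binomial_half(n, rng):
    """Sample Binomial(n, 1/2) by counting set bits among n random bits."""
    return bin(rng.getrandbits(n)).count("1") if n > 0 else 0


def synthesized(alpha, T, stochastic, rng):
    """Proteins made in one cycle: events at constant rate alpha, or alpha*T."""
    if not stochastic:
        return round(alpha * T)
    n, t = 0, rng.expovariate(alpha)
    while t < T:
        n, t = n + 1, t + rng.expovariate(alpha)
    return n


def fano_after_division(alpha, T, stoch_synthesis, stoch_partition,
                        generations=40000, seed=1):
    """Follow one lineage; return (mean, Fano factor) of P0 right after division."""
    rng = random.Random(seed)
    P0, samples = round(alpha * T), []
    for _ in range(generations):
        total = P0 + synthesized(alpha, T, stoch_synthesis, rng)
        if stoch_partition:
            P0 = binomial_half(total, rng)
        else:
            # exact halving; an odd molecule goes to either daughter with prob. 1/2
            P0 = total // 2
            if total % 2 and rng.random() < 0.5:
                P0 += 1
        samples.append(P0)
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, var / mean


alpha, T = 0.5, 40.0     # values of Fig. 2
for label, syn, part in [("partitioning only", False, True),
                         ("synthesis only", True, False),
                         ("both", True, True)]:
    mean, fano = fano_after_division(alpha, T, syn, part)
    print(f"{label:18s} <P0> = {mean:6.2f}  F0 = {fano:.3f}")
```

Up to sampling error, the estimates should lie close to 2/3, 1/3 and 1. In the synthesis-only case a small excess over 1/3 is expected at this low copy number, because of the residual randomness in dividing an odd number of molecules.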

For comparison, we also consider the corresponding model with implicit cell division, i.e. a stochastic version of Eq. (9), where the effect of protein dilution through cell growth and division is described by an effective degradation rate \(\beta_{\mathrm{eff}}=\beta+\ln2/T\). In this case, we end up with a simple birth-death process, where the number P of copies of our protein of interest increases with constant rate α and decreases with rate \(\beta_{\mathrm{eff}}P\), described by the following master equation

$$\frac{\partial}{\partial t}\mathcal{P}(P,t)=\alpha \bigl[\mathcal{P}(P-1,t)-\mathcal{P}(P,t) \bigr]+\beta_{\mathrm{eff}} \bigl[(P+1)\mathcal{P}(P+1,t)-P\,\mathcal{P}(P,t) \bigr],$$
(15)

where \(\mathcal{P}(P,t)\) is the probability to have P proteins at time t. The moments \(\langle P^n\rangle\) of that distribution in the steady state can easily be calculated by multiplying the master equation with powers of P and summing over P. For this type of model, the protein copy number does not exhibit the periodic behavior seen in the models with explicit cell division, but rather fluctuates around a constant mean value \(\langle P\rangle=\alpha/\beta_{\mathrm{eff}}\) in the steady state (Fig. 2e). These fluctuations are characterized by \(\eta^2=1/\langle P\rangle\), so the Fano factor is the same as \(F_0\) for the case with explicit cell division discussed before. This indicates that a model with implicit cell division (which by the choice of \(\beta_{\mathrm{eff}}\) is constructed to correctly describe the dynamics of the mean protein number on time scales that are long compared to the generation time T) also provides a good description of the fluctuations in such a system.
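The steady state of this birth-death process can also be computed directly. The sketch below uses the balance of probability flux between neighboring copy numbers, which holds in the steady state of a one-step process; the truncation `n_max` and the function name are illustrative choices, and the parameters are those of Fig. 2 with a stable protein.

```python
import math


def stationary_distribution(alpha, beta_eff, n_max=400):
    """Steady state of the birth-death process (synthesis alpha, loss beta_eff*P).

    In the steady state the probability flux between P and P+1 balances:
    alpha * prob[P] = beta_eff * (P + 1) * prob[P + 1].
    """
    weights = [1.0]
    for P in range(n_max):
        weights.append(weights[-1] * alpha / (beta_eff * (P + 1)))
    norm = sum(weights)
    return [w / norm for w in weights]


alpha, T = 0.5, 40.0                  # values of Fig. 2
beta_eff = math.log(2.0) / T          # stable protein, beta = 0
prob = stationary_distribution(alpha, beta_eff)

mean = sum(P * q for P, q in enumerate(prob))
var = sum((P - mean) ** 2 * q for P, q in enumerate(prob))
print(f"<P> = {mean:.3f}, alpha/beta_eff = {alpha / beta_eff:.3f}, "
      f"Fano = {var / mean:.6f}")
```

The resulting distribution is Poissonian, so the mean equals the ratio of synthesis rate to effective degradation rate and the Fano factor equals 1.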

4 Bursts of Protein Synthesis

As discussed in Sect. 2.4, the two-step nature of protein synthesis can often be neglected as mRNA levels evolve on faster time scales than protein levels, and therefore the dynamics of mRNA can be approximated by its steady state. However, while absorbing the mRNA degrees of freedom into effective protein synthesis results in a correct description of the average protein level, it generally underestimates fluctuations, as it smoothens out the ‘bursty’ nature of protein synthesis resulting from the two-step process. This was realized first by Berg in 1978 [3] and has been studied extensively in recent years, as experimental techniques to count proteins in individual cells were developed [9, 32, 48].

To keep the discussion simple, we start with the stochastic version of Eq. (10), i.e. with a model that describes cell division by an effective protein degradation [44]. The mRNA part of Eq. (10) follows the same dynamics as the protein in Eq. (15) and is thus characterized by the same noise \(\eta_{M}^{2} =1/\langle M \rangle \) with \(\langle M\rangle=\alpha_m/\beta_m\). However, the protein number, P, behaves differently and is characterized by \(\langle P\rangle=\alpha_m\alpha_p/(\beta_m\beta_p)\) and \(\eta_{P}^{2} =(1+b)/\langle P\rangle \) [44], where \(b=\alpha_p/(\beta_p+\beta_m)\approx\alpha_p/\beta_m\) is called the ‘burst size’ and describes the average number of proteins synthesized per copy of the mRNA or the amplification of transcription by translation. Experimentally determined burst sizes range between 1 and 10 [43, 48]. The increase in noise can be interpreted as an additional (independent) source of noise that arises from the stochastic amplification of the transcription output by translation. This additional noise is characterized by a noise strength \(b/\langle P\rangle\) that is added to the noise already present from stochastic protein synthesis and degradation/dilution in the absence of stochastic amplification.
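The expression for the protein noise can be checked from the moment equations of the two-step model, which close exactly because all rates are linear in the copy numbers. The following sketch integrates these equations with a simple Euler scheme until they reach the steady state. It assumes a single gene copy and implicit cell division; the function name, the step size and the integration time are illustrative choices.

```python
def two_step_moments(alpha_m, beta_m, alpha_p, beta_p, t_end=3000.0, dt=0.01):
    """Euler-integrate means and (co)variances of mRNA (M) and protein (P).

    Moment equations of the linear two-step model:
      d<M>/dt  = alpha_m - beta_m <M>
      d<P>/dt  = alpha_p <M> - beta_p <P>
      dC_MM/dt = alpha_m + beta_m <M> - 2 beta_m C_MM
      dC_MP/dt = alpha_p C_MM - (beta_m + beta_p) C_MP
      dC_PP/dt = 2 alpha_p C_MP + alpha_p <M> + beta_p <P> - 2 beta_p C_PP
    """
    M = P = C_MM = C_MP = C_PP = 0.0
    for _ in range(int(t_end / dt)):
        dM = alpha_m - beta_m * M
        dP = alpha_p * M - beta_p * P
        dC_MM = alpha_m + beta_m * M - 2.0 * beta_m * C_MM
        dC_MP = alpha_p * C_MM - (beta_m + beta_p) * C_MP
        dC_PP = 2.0 * alpha_p * C_MP + alpha_p * M + beta_p * P - 2.0 * beta_p * C_PP
        M, P = M + dt * dM, P + dt * dP
        C_MM, C_MP, C_PP = C_MM + dt * dC_MM, C_MP + dt * dC_MP, C_PP + dt * dC_PP
    return M, P, C_MM / M, C_PP / P       # means and Fano factors


# parameters of Fig. 3(c), bursty synthesis; all rates in 1/min
alpha_m, beta_m, alpha_p, beta_p = 0.4, 2.0, 10.0, 0.01
M, P, fano_M, fano_P = two_step_moments(alpha_m, beta_m, alpha_p, beta_p)
b = alpha_p / (beta_p + beta_m)
print(f"<M> = {M:.3f}, <P> = {P:.1f}, F_M = {fano_M:.3f}, "
      f"F_P = {fano_P:.3f}, 1 + b = {1 + b:.3f}")
```

In the steady state the mRNA Fano factor is 1 and the protein Fano factor equals 1+b, in line with the expressions quoted from Ref. [44].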

The bursty nature of these processes is most apparent for low transcription rates: In this case, protein synthesis events are rare (as transcripts are produced infrequently), but multiple copies of the protein are generated in every synthesis event. The increase in fluctuations for the case of bursty synthesis is illustrated in Fig. 3, where we plot trajectories for three cases with the same average protein number. In Fig. 3(a), protein synthesis is described by a single step with rate \(\alpha=\alpha_m\alpha_p/\beta_m\simeq\alpha_m b\); in Fig. 3(b) and (c), protein synthesis is described as a two-step process. However, while in Fig. 3(b) the transcription rate is large and the translation rate is small (b≃0.2), the translation rate is large and the transcription rate is small in Fig. 3(c), resulting in bursty protein synthesis (with b≃5).

Fig. 3

Burstiness of protein synthesis: (a)–(c) Trajectories of the protein copy number from stochastic simulations with (a) a one-step model of protein synthesis, (b) a two-step model (transcription and translation) with low burstiness, and (c) a bursty two-step model. All three cases are for implicit cell division and exhibit the same average protein copy number. (d) Noise strength \(\eta^2\) for bursty protein synthesis with exponential burst size distribution (as in the two-step models) or with constant burst sizes as a function of the average protein copy number (varied by varying \(\alpha_m\)). (e) Fano factor for models with either implicit or explicit cell division as function of the burst size b. The parameter values are (a) α=2/min, \(\beta_{\mathrm{eff}}\)=0.01/min, (b) \(\alpha_p\)=0.4/min, \(\beta_p\)=0.01/min, \(\alpha_m\)=10/min, \(\beta_m\)=2/min, (c) \(\alpha_p\)=10/min, \(\beta_p\)=0.01/min, \(\alpha_m\)=0.4/min, \(\beta_m\)=2/min, (d) \(\beta_p\)=0.01/min, and (e) \(\beta_p\)=0.01/min, \(\alpha_m\)=2/min, \(\beta_m\)=5/min, T=60 min (Color figure online)

It is worth mentioning here that the bursts on the one hand amplify the noise from transcription, but on the other hand also create additional noise as the size of each burst is a stochastic quantity. To disentangle these two effects, we determine the noise strength \(\eta^2\) for a one-step model of protein synthesis in which, however, b copies of the protein are produced in every synthesis event. In that case, the burst size does not fluctuate, but bursts can still amplify the noise from the one-step synthesis process that mimics transcription. This case can be solved using a modified version of the master equation (15) (footnote 2) and leads to a noise strength \(\eta^2=(1+b)/(2\langle P\rangle)\) with \(\langle P\rangle=b\alpha/\beta_{\mathrm{eff}}\). This is exactly half of what we have obtained for exponentially distributed burst sizes in the two-step model (see also Fig. 3(d), where we plot the noise strength for both constant and exponentially distributed burst sizes; footnote 3). This result indicates that the two effects of bursting contribute equally to the increased noise.
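A short moment calculation makes the result for constant burst sizes plausible. As a sketch, we assume that synthesis events occur at rate α and each add b proteins, while loss events occur at rate \(\beta_{\mathrm{eff}}P\) and each remove one protein (the exact form of the modified master equation is not reproduced here). Multiplying the corresponding master equation by P and by \(P^2\) and summing over P gives, in the steady state,

$$0=b\alpha-\beta_{\mathrm{eff}}\langle P\rangle,\qquad 0=\alpha \bigl(2b\langle P\rangle+b^2 \bigr)-2\beta_{\mathrm{eff}}\langle P^2\rangle+\beta_{\mathrm{eff}}\langle P\rangle,$$

so that \(\langle P\rangle=b\alpha/\beta_{\mathrm{eff}}\) and \(\langle P^2\rangle-\langle P\rangle^2=(1+b)\langle P\rangle/2\), i.e. \(\eta^2=(1+b)/(2\langle P\rangle)\).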

The model discussed so far describes cell division implicitly as an effective protein degradation, but models with explicit stochastic cell division exhibit the same burstiness behavior. This is shown in Fig. 3(e), where we plot the Fano factor \(F_{P,0}=\eta_{P,0}^{2}\langle P_{0}\rangle \) for a model where both protein and mRNA are divided stochastically between daughter cells as in Sect. 3.3. \(F_{P,0}\) shows the same dependence on the burst size except for a different prefactor of the linear term (≈ln2), which arises from the fact that averages are taken over slightly different populations (over cells immediately after division vs. over age-less cells representing an average over the division cycle).

Finally, we want to mention that burstiness can also arise from physical processes other than multiple translations of a transcript. For example, bursts have been demonstrated experimentally to occur on the level of transcription [16], which can be interpreted as resulting from the stochastic switching of the gene between two activity states (transcription ‘on’ or ‘off’). The molecular origin of these activity states remains, however, unclear (footnote 4), although several mechanisms have been proposed (e.g. states of chromosome structures, binding/unbinding of transcription factors, etc. [30, 45]). In a genome-wide study, the Fano factors for mRNA were found to range mostly between 1 and 2, larger than what is expected for a single-step (Poisson) synthesis, but not much larger [43].

5 Extrinsic Noise

So far, we have discussed intrinsic noise in gene expression, i.e. noise that is specific to a particular gene or protein and results from the inherent stochasticity of the synthesis and degradation of that protein. As we have seen, a characteristic property of intrinsic noise is its scaling proportional to the inverse of the average protein number in the cell. We now turn to extrinsic noise, i.e. fluctuations of cellular parameters that affect all genes/proteins in a cell. Such noise was first demonstrated by a study of the correlations between the reporter proteins expressed from two copies of the same operon [15]. For highly abundant proteins, intrinsic noise becomes negligible and the extrinsic component of the noise, which does not depend on protein abundance, is dominant, with fluctuations of about 30% in the protein concentration, as shown by a study of a library of fluorescent reporter proteins [43]. There are many possible sources of extrinsic noise, such as fluctuations in the concentrations of essential components of the transcription and translation machinery or of mRNA degradation enzymes (RNA polymerases, ribosomes, RNases). Here we consider two effects that should be present even if such fluctuations are suppressed by feedback mechanisms for the synthesis of these machines: cell-to-cell variations arising from different cell ages in a population (Sect. 5.1) and effects due to fluctuations in the growth rate (Sect. 5.2).

We note that another definition of extrinsic and intrinsic noise has been given in Ref. [33]. There, the distinction between extrinsic and intrinsic noise is not based on distinguishing a specific genetic system and its environment, which affects different genes in the same way, but on the dependence of the noise on the average protein number. One component of the noise exhibits the characteristic 1/〈P〉 behavior and is classified as intrinsic, while the component of the noise that does not exhibit this behavior and depends on the fluctuation of a variable that influences the protein synthesis rate is classified as extrinsic. The two cases we consider here are extrinsic according to both definitions, but based on the definition of Ref. [33], one could, for example, consider the noise from transcription as extrinsic to translation.

5.1 Effects of the Division Cycle

In Sect. 2, we have seen that the protein concentration varies systematically over the course of a division cycle. In a population of non-synchronized cells, this age-dependence of the protein content is observed as a cell-to-cell variation that forms part of the extrinsic noise. To study the effect of age-dependent protein content and to estimate what part of the extrinsic noise can be understood from such deterministic variation, we now determine the distributions of the protein number and concentration over the division cycle. As for the average protein number calculated in Sect. 2, we have to take the age distribution of the experimental culture into account. We consider again the case of a single lineage and of an exponentially growing population, i.e. a constant and an exponential age distribution as given in Eq. (5). We denote the resulting protein copy number distributions by Φ(P) and Ψ(P). They can be calculated by inverting the time dependence of P(t) and using the inverse relation t(P) for a transformation of variable in the age distribution, see Appendix A.3.

The distributions for both types of cell culture are presented in Fig. 4. Panel (a) shows the distributions of protein number and concentration for a single lineage, Φ(P) and Φ(p), respectively. The concentration distribution was determined for both linear and exponential volume growth. The distribution of the protein copy number, Φ(P) (top panel), exhibits two flat plateaus. The probability to find a protein number \(P<P(t=t_{x})\) that is seen prior to the replication time \(t_{x}\) is twice as high as for a protein number that corresponds to larger times, \(P(t>t_{x})\), as the synthesis rate doubles at time \(t_{x}\).

Fig. 4 Distributions for protein number and concentrations as arising from the deterministic variation over the division cycle. (a) Distributions for the protein number Φ(P) and concentration \(\Phi(p_{\mathrm{lin}}(t))\) and \(\Phi(p_{\exp}(t))\) (for the case of linear and exponential volume growth, respectively) for a single lineage. (b) Distributions for the protein number Ψ(P) and concentrations \(\Psi(p_{\mathrm{lin}})\) and \(\Psi(p_{\exp})\) for an exponentially growing cell population with age distribution ϕ(t). The parameters are as in Fig. 1

For the concentration subject to linear volume growth (middle panel), \(\Phi(p_{\mathrm{lin}}(t))\) is almost flat with a minimum for intermediate concentrations. In the case of exponential volume growth (bottom panel), \(\Phi(p_{\exp}(t))\), which is quite flat for small concentrations, rises sharply towards the maximum concentration.

It is worth noting that while the protein copy number exhibits a broad distribution over a two-fold range, defined by the copy numbers directly before and after cell division, the range over which the concentration varies is much smaller: The maximal concentration is only ≈13 % larger than the minimal concentration for linear volume growth and even less (≈6 %) for exponential volume growth.

Figure 4(b) shows the corresponding results for an exponentially growing culture (with an exponential age distribution). The distribution for the protein number (top panel), Ψ(P), still exhibits two plateaus, which are tilted towards smaller values of P as the age distribution gives more weight to younger cells. The distributions of the concentrations (middle and lower panel), \(\Psi(p_{\mathrm{lin}})\) and \(\Psi(p_{\exp})\), are not radically altered by the change in age distribution.
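The change of variables behind these distributions can be illustrated with a minimal sketch. It is not the calculation of Appendix A.3: it assumes a piecewise-linear P(t) (synthesis at rate α per gene copy, gene duplication at the replication time t_x, and P(T)=2P(0)), a uniform age distribution for a single lineage, and the standard exponential age distribution (2 ln2/T)·2^(−t/T) for a growing population. These forms are assumptions standing in for Sect. 2 and Eq. (5), and the function names are hypothetical.

```python
import math


def age_density(t, T=1.0, population=False):
    """Assumed age distributions: uniform (single lineage) or exponential."""
    if population:
        return (2.0 * math.log(2.0) / T) * 2.0 ** (-t / T)
    return 1.0 / T


def number_density(P, t_x, T=1.0, alpha=1.0, population=False):
    """Density of the protein number P via the change of variables t -> P.

    Assumed model: dP/dt = alpha before replication (t < t_x) and 2*alpha
    afterwards, with P(T) = 2 P(0), i.e. P(0) = alpha * (2T - t_x).
    The density is age_density(t(P)) / |dP/dt|.
    """
    p0 = alpha * (2.0 * T - t_x)   # copy number right after division
    p_x = p0 + alpha * t_x         # copy number at the replication time
    if not (p0 <= P <= 2.0 * p0):
        return 0.0
    if P < p_x:
        t, rate = (P - p0) / alpha, alpha
    else:
        t, rate = t_x + (P - p_x) / (2.0 * alpha), 2.0 * alpha
    return age_density(t, T, population) / rate
```

With t_x=T/2 and α=T=1, the single-lineage density is 1 below P(t_x)=2 and 0.5 above it, which reproduces the two plateaus with a 2:1 ratio. Passing `population=True` tilts both plateaus towards smaller P.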

Next, we determine the noise parameter that characterizes the variation over the division cycle, which in analogy to Eq. (11) can be defined as

\[\eta^{2}_{P}=\frac{\delta P}{\langle P\rangle^{2}}\qquad(16)\]

for the protein copy number and likewise, \(\eta ^{2}_{p_{\mathrm {lin}}}=\delta p_{\mathrm {lin}}/\langle p_{\mathrm {lin}}\rangle^{2}\) or \(\eta ^{2}_{p_{\exp }}=\delta p_{\exp }/\langle p_{\exp }\rangle^{2}\), for the concentration (for linear and exponential volume growth, respectively). One parameter that affects the extent of this deterministic cell-to-cell variation is the replication time \(t_{x}\), which depends on the genomic location of the gene of interest relative to the origin of replication. In Fig. 5, we show the noise parameters for the protein copy number and the concentration as functions of \(t_{x}\) in the range \(0\leq t_{x}\leq T\).

Fig. 5 Noise parameter \(\eta^{2}\) arising from deterministic variations over the division cycle for (a) the protein number P and (b) for the protein concentration, \(\eta ^{2}_{p_{\mathrm {lin}}}\) and \(\eta ^{2}_{p_{\exp }}\) (with linear and exponential volume growth) as functions of the replication time \(t_{x}\)

We first note that these noise parameters only depend on T and \(t_{x}\), or, more precisely, on their ratio \(t_{x}/T\). Specifically, they are independent of the protein synthesis rate α and, in the case of the concentrations, of the initial cell volume \(V_{0}\). Therefore, this contribution to the observed noise does not decrease with increasing protein concentration and does not become negligible for abundant proteins. However, the overall contribution of the division cycle to the noise is relatively small. For the protein concentration, the noise \(\eta ^{2}_{p_{\mathrm {lin}}}\) in the case of linear growth is on the order of 0.001 (solid line in Fig. 5b), and for exponential growth \(\eta ^{2}_{p_{\exp }}\leq5\cdot 10^{-4}\) for all values of \(t_{x}\). These values correspond to 2–3 % variation of the concentration and are considerably smaller than the observed extrinsic noise, which is of the order of \(\eta^{2}\simeq0.1\) [43]. For the protein copy number, which varies over a wider range, the effect of the division cycle is more pronounced and varies by about one fourth of its mean over different values of \(t_{x}\). However, even here, the absolute value of \(\eta^{2}\), being on the order of 0.04, remains rather small. We can thus conclude that, while the division cycle contributes to the observed extrinsic noise, other sources of extrinsic noise dominate.
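The orders of magnitude quoted here can be reproduced with a small numerical sketch. The functional forms are assumptions standing in for the model of Sect. 2, which is not restated in this section: P(t) grows at rate α before the replication time t_x and at 2α afterwards with P(T)=2P(0), the volume grows either linearly, V=V0(1+t/T), or exponentially, V=V0·2^(t/T), and the population case uses age weights proportional to 2^(−t/T). All names are illustrative.

```python
def protein_number(t, t_x, T=1.0, alpha=1.0):
    """Deterministic protein copy number at cell age t (assumed model)."""
    p0 = alpha * (2.0 * T - t_x)  # fixed by the condition P(T) = 2 P(0)
    if t < t_x:
        return p0 + alpha * t
    return p0 + alpha * t_x + 2.0 * alpha * (t - t_x)


def volume(t, T=1.0, v0=1.0, mode="lin"):
    """Assumed volume laws: linear or exponential doubling over one cycle."""
    if mode == "lin":
        return v0 * (1.0 + t / T)
    return v0 * 2.0 ** (t / T)


def noise_parameter(values, weights):
    """eta^2 = variance / mean^2 for weighted samples."""
    norm = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / norm
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / norm
    return var / mean ** 2


def cycle_noise(t_x, T=1.0, n=10000, population=False):
    """Noise parameters of P, p_lin and p_exp from the cell-age variation."""
    ages = [(i + 0.5) * T / n for i in range(n)]  # midpoint rule over the cycle
    weights = [2.0 ** (-t / T) if population else 1.0 for t in ages]
    number = [protein_number(t, t_x, T) for t in ages]
    result = {"P": noise_parameter(number, weights)}
    for mode in ("lin", "exp"):
        conc = [P / volume(t, T, mode=mode) for P, t in zip(number, ages)]
        result["p_" + mode] = noise_parameter(conc, weights)
    return result
```

Calling `cycle_noise(0.5)` under these assumptions gives a copy-number noise of a few times 0.01 and concentration noise of order 0.001 or below. Rescaling α or V0 leaves the results unchanged, in line with the dependence on t_x/T alone.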

5.2 Fluctuations of the Growth Rate

The fluorescent reporter protein library study of Taniguchi et al. mentioned above [43] showed that abundant proteins exhibit extrinsic noise that does not display the inverse scaling with the mean protein concentration. The same study also revealed some additional characteristics of that noise: In particular, (i) there are correlations between the noise of different extrinsic proteins, a defining feature of extrinsic noise [15], and (ii) the extrinsic fluctuations are slow, with variations in the protein concentration over timescales longer than the generation time [43]. Moreover, they come together with substantial fluctuations of the generation time. We therefore ask now whether fluctuations in the growth rate may substantially contribute to the observed extrinsic noise.

For an estimate of the effect of a fluctuating growth rate, we make the assumption that while the doubling time fluctuates slowly, the protein synthesis rate per cell volume α/V remains approximately constant. This condition is (approximately) satisfied by the population average of the synthesis rate as a function of growth rate when the growth rate is systematically varied by using different growth media [24]. It basically means that changes of the growth conditions, while affecting the synthesis rate of protein numbers, do not affect the rate of synthesis of protein concentration. Only the effective degradation is changed when the growth rate changes. Under balanced growth, this constancy is the result of the combination of several factors (such as the availability of RNA polymerases and ribosomes, the gene copy number etc. [22, 24]) that do change, but in such a way that their combined effect cancels out (with the exception of conditions of very slow growth) [24]. Obviously, it is not clear that this assumption holds for slowly varying growth rates in individual cells; in principle, all factors that contribute to the growth-rate dependence of protein concentrations could vary in a mutually independent fashion, but we can consider the case where they vary together as one that provides a lower limit for the resulting noise. With a deterministic description of protein synthesis, we obtain \(p=(\alpha/V)\,T/\ln 2\), so fluctuations of T are directly carried over into fluctuations of the protein concentration p. If T fluctuates by some time ΔT (of about 10–25 % of the doubling time), p will fluctuate by \(\Delta p=\alpha\Delta T/(V\ln 2)\) or also about 10–25 %, as Δp/p=ΔT/T. This would correspond to a noise parameter \(\eta^{2}\) of 0.01–0.08. While this simple estimate is certainly not an accurate description of such global noise, it clearly indicates that fluctuations in the growth rate can lead to noise in protein concentrations of the order of the observed extrinsic noise [43].
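The arithmetic of this estimate can be written out as a short sketch. It assumes that the steady-state concentration follows from balancing synthesis per volume against dilution at the effective rate ln2/T, and that the noise parameter is simply the square of the relative fluctuation ΔT/T. The numerical values and function names are hypothetical.

```python
import math


def steady_state_concentration(synthesis_per_volume, doubling_time):
    """p = (alpha/V) * T / ln 2: synthesis balanced by dilution at rate ln2/T."""
    return synthesis_per_volume * doubling_time / math.log(2.0)


def relative_change(synthesis_per_volume, doubling_time, delta_t):
    """Relative shift of p when the doubling time changes by delta_t."""
    p = steady_state_concentration(synthesis_per_volume, doubling_time)
    p_shifted = steady_state_concentration(synthesis_per_volume,
                                           doubling_time + delta_t)
    return (p_shifted - p) / p


def noise_from_doubling_time(relative_fluctuation):
    """eta^2 = (dp/p)^2 = (dT/T)^2, treating dT as a standard deviation."""
    return relative_fluctuation ** 2


# Hypothetical numbers: alpha/V = 3 (arbitrary units), T = 40 min, dT = 4 min.
shift = relative_change(3.0, 40.0, 4.0)    # equals dT/T, independent of alpha/V
eta2_low = noise_from_doubling_time(0.10)
eta2_high = noise_from_doubling_time(0.25)
```

Because p is linear in T, the relative change equals ΔT/T regardless of α/V. Squaring relative fluctuations of 10 % and 25 % gives 0.01 and roughly 0.06, the same order as the range quoted above.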

6 Concluding Remarks

In this article, we have discussed several ways of describing gene expression with deterministic or stochastic models. Deterministic models that explicitly describe cell division, gene duplication, and volume growth provide a detailed description of the dynamics over both short and long time scales (compared to the doubling time). We have shown that the results generally depend on specific details of the model, such as how volume growth is implemented and the age structure of the population over which averages are taken. Fortunately, however, these differences are not dramatic. Moreover, a mean-field-like approximation that describes protein synthesis by an effective rate per volume, given by the average gene copy number and the average cell volume, provides a good approximation that averages over the detailed dynamics within the division cycle. Nevertheless, it is worth keeping these subtle effects in mind and carefully distinguishing different normalizations of protein amounts or synthesis rates, such as per gene (e.g. α), per cell (\(\alpha\langle g\rangle\)) and per volume (\(\alpha\langle g\rangle/\langle V\rangle\)). This is particularly important for studies that address the coupling of gene expression and global cellular physiology, where quantities such as the average gene copy number and the average volume per cell may change [24].

With respect to the fluctuations around this average behavior, we have compared several simple models to disentangle the contributions of different sources of noise. This comparison shows that the noise contributions from sources such as stochastic protein synthesis or degradation and stochastic partitioning during cell division are all of the same order and that there is no single dominant noise source, except when protein synthesis is pronouncedly bursty. The burstiness of protein synthesis is the largest contribution to the noise (with a Fano factor ≈b, while the other noise sources have Fano factors of fractions of 1). If b is large, this contribution is clearly dominant, and one could neglect all other sources of noise. The study of Taniguchi et al. [43], however, indicates that typical values of b for many low-abundance proteins are in the range 1–10, so that burstiness is not necessarily strongly dominant. In many cases, a realistic description of the dynamics of expression of low-abundance proteins will therefore need to include all these sources of noise.

For intermediate-abundance to high-abundance proteins (with 〈P〉>20), the noise is dominated by extrinsic noise [43]. Here we have considered two sources of extrinsic noise: We have shown that the deterministic contribution from systematic variation over the division cycle is rather small (even for the protein copy number, but in particular for the concentration), while fluctuations in the growth rate can be expected to give a larger contribution. These results suggest that a model that incorporates the burstiness of protein synthesis and fluctuations in the growth rate might provide a minimal description of stochastic effects in gene expression that is able to describe both intrinsic and extrinsic components of the noise.