## Abstract

A theory for an non-equilibrium phase transition in a driven biochemical network is presented. The theory is based on the chemical master equation (CME) formulation of mesoscopic biochemical reactions and the mathematical method of large deviations. The large deviations theory provides an analytical tool connecting the macroscopic multi-stability of an open chemical system with the multi-scale dynamics of its mesoscopic counterpart. It shows a corresponding non-equilibrium phase transition among multiple stochastic attractors. As an example, in the canonical phosphorylation–dephosphorylation system with feedback that exhibits bistability, we show that the non-equilibrium steady-state (NESS) phase transition has all the characteristics of classic equilibrium phase transition: Maxwell construction, a discontinuous first-derivative of the ‘free energy function’, Lee–Yang's zero for a generating function and a critical point that matches the cusp in nonlinear bifurcation theory. To the biochemical system, the mathematical analysis suggests three distinct timescales and needed levels of description. They are (i) molecular signalling, (ii) biochemical network nonlinear dynamics, and (iii) cellular evolution. For finite mesoscopic systems such as a cell, motions associated with (i) and (iii) are stochastic while that with (ii) is deterministic. Both (ii) and (iii) are emergent properties of a dynamic biochemical network.

## 1. Introduction

The theory of equilibrium phase transitions is one of the deepest branches of the physics of matters (Goldenfeld 1992). The physics of living matters, however, has to deal with molecular systems under driven non-equilibrium conditions (Qian 2006; Karsenti 2008; Phillips *et al.* 2008). One of the challenging questions in biological physics is whether non-equilibrium phase transitions play an important role in living systems.

In the theory of matters, microscopic, stochastic molecular fluctuations disappear in the thermodynamic limit in which deterministic nonlinear behaviour arises. However, in the mesoscopic world of cellular biology, complex dynamics with multiple timescales makes the meaning of *thermodynamic limit* only relative to one level of description with respect to another. More specifically, we shall show in the present paper that there are three biologically significant timescales, with related levels of mathematical description: (i) stochastic molecular signalling, (ii) deterministic, nonlinear biochemical network dynamics, and finally (iii) stochastic (again!) cellular evolution. In other words, there is stochastic behaviour beyond the deterministic dynamics; all three levels are contained in a mesoscopic living (driven) system. Current cellular molecular biology chiefly focuses on (i) while increasingly interested in (ii); however, it is the (iii), we believe, that is most relevant to major cellular biological issues such as differentiation, apoptosis, cancer immunoediting, and epi-genetics.

Our conclusion is reached through a general mathematical theory, together with an application in terms of a detailed analysis of a simple cellular signalling module: a phosphorylation–dephosphorylation cycle (PdPC) with feedback (Ferrell & Xiong 2001; Qian & Reluga 2005; Qian 2007). The PdPC is one of the most important components in cellular biology. Furthermore, the GTPase together with guanine nucleotide exchange factor and GTPase activating protein, another important cellular system, is kinetically isomorphic to the PdPC.

We formulate our theory starting with the chemical master equation (CME) description of biochemical reaction systems in a spatially homogeneous mesoscopic volume. In recent years, CME has emerged as one of the physiochemical foundations of cellular biochemistry (Samoilov & Arkin 2006; Gillespie 2007; Beard & Qian 2008; Liang & Qian 2010). The theory had begun in 1940 and went through a major development in 1960s and 1970s (Delbrück 1940; McQuarrie 1967; Nicolis & Prigogine 1977). In particular, the Brussels school has used this theory as a mathematical basis of non-equilibrium steady state (NESS), a term first proposed by Gaspard (2004), Lax (1960) and Klein (1955). It is now widely accepted that both concepts of CME and NESS are appropriate for studying isothermal, homeostatic cellular biochemistry (Qian 2006; Karsenti 2008; Phillips *et al.* 2008). The mathematical theory of NESS is an irreversible, but stationary stochastic processes, associated with which the concepts of entropy production and stationary distribution naturally arise (Bergmann & Lebowitz 1955; Luo *et al.* 1984; Jiang & Qian 2004; Kawai *et al.* 2007; Parrondo *et al.* 2009).

It is known that the deterministic nonlinear dynamics, derived from the CME in the limit of the reaction system volume *V* → ∞ according to Kurtz (1972)'s theory, is the macroscopic counterpart, in terms of the Law of Mass Action, of the chemical reaction system (Samoilov & Arkin 2006; Gillespie 2007; Beard & Qian 2008; Liang & Qian 2010). While this is certainly true, here we refine this notion by studying the large-deviation properties of *V* → ∞, i.e. the thermodynamic limit. We shall show that in the case of a nonlinear dynamical system with multiple dynamic attractors, there is a unique macroscopic thermodynamic state; all the other macroscopic attractors are in fact metastable, with an infinitesimal stationary probability ∝e^{−βV} but also exponential small exit rate ∝e^{−αV}(*α*, *β* > 0).

The mathematical theory of large deviations (LDT; Dembo & Zeitouni 1998) is the natural device for understanding the thermodynamic limit of systems with multistability, i.e. phase transition(s) (Ellis 1985; Touchette 2008). Our result based on the LDT rigorously establishes the stochastic dynamics with bi(multi)-modal distribution as the mesoscopic signature of a nonlinear dynamics with bi(multi)-stability. While our specific example below is a simple one-dimensional model, the analysis based on the LDT is general for systems with high dimensions, and without detailed balance.

We have recently re-examined the nonlinear bistability in the context of the biochemical signalling module (Ge & Qian 2009). In the thermodynamic limit when *V* tends to infinity, there is a phase transition associated with the conventional nonlinear dynamic approach based on the Law of Mass Action, which is the macroscopic limit, in some sense, of the CME (Kurtz 1972). A Maxwell-type construction is an integral part of a complete theory of the CME (Ge & Qian 2009). This phase transition cannot be predicted by the deterministic nonlinear dynamics. Stochastic dynamics provides relative significance, i.e. probability, of different deterministic attractors.

In equilibrium phase transition, the Lee–Yang theorem for grand canonical partition function is widely considered to be a deep and elegant result (Zimm 1951; Lee & Yang 1952; Yang & Lee 1952). We shall show non-differentiability of a function *c*(*λ*), the NESS counterpart of the free energy function, is the origin of multi-phase behaviour, and it is because a zero of *G*(*λ*), the NESS counterpart of a partition function, reaches the real axis. Different generalization of the Lee–Yang theory to NESS and to bimodalities can be found in Blythe & Evans (2002), Chomaz & Gulminelli (2003) and Touchette (2005).

While this study focuses on spatially homogeneous systems, the stochastic theory of non-equilibrium transitions in space and time has also been used, quite successfully, to describe spontaneous phase separation in biological systems (Elf & Ehrenberg 2004; Fange & Elf 2006).

## 2. The biochemical system and its NESS phase transition

We shall present the mathematical theory through a concrete example. It should be noted, however, that the mathematical theory can be applied to more complex biochemical systems with the CME. We consider a simple biochemical signalling system, the PdPC with positive feedback, which exhibits nonlinear bistability (Ferrell & Xiong 2001; Qian & Reluga 2005; Qian 2007; Ge & Qian 2009):
2.1
in which *K* and *K** are inactive and active forms of a kinase, *P* is a phosphatase. *E** is the phosphorylated *E*, a signalling molecule. Usually *E** is functionally active, i.e. ‘turned-on’, and the reversible binding is rapid. The dynamics of the fraction of phosphorylated *E*, *x*, thus satisfies
2.2
in which the parameter *θ* represents the ratio of the activity of the kinase to that of the phosphatase; *ɛ* represents the ADP to ATP concentration ratio, and *μ* represents the strength of phosphorolysis. −*k*_{B} ln(*μ**ε*) = Δ*G* represents the ATP hydrolysis energy. In a living cell, both *μ* and *ε* are small; hence *γ* = 1/(*μ**ε*)≫1.

### 2.1. Large deviations theory, Maxwell construction and first-order phase transition in a NESS

The CME for the chemical reaction system in equation (2.1) gives the probability of *N*_{V}, the random variable representing the activated signalling molecule *E**, *V* being the volume of the system (Ge & Qian 2009). Let *p*_{V}(*n*) be its stationary distribution. According to the classic result of LDT (Ellis 1985; Dembo & Zeitouni 1998; Touchette 2008), especially (Dembo & Zeitouni 1998, § 4.5.2), it is concluded that if *N*_{V}/*V* satisfies the LDT with a good ‘rate function’ *ϕ*(*x*), i.e. *p*_{V}(*n*) ∼ e^{−Vϕ(x)}, the concentration *x* = *N*_{V}/*V* ≥ 0, then

— For each

*λ*, the ‘free energy function’*c*(*λ*) = lim_{V→∞}1/*V*ln〈e^{λNV}〉 exists, and it is finite and non-decreasing. Moreover it satisfies 2.3Please see appendix for large deviations theory in equilibrium phase transition. It clearly illustrates the meaning and definition of free energy

*c*(*λ*) and*ϕ*(*x*) introduced in our non-equilibrium studies.— If

*ϕ*(*x*) is convex, then it is the Fenchel–Legendre transform of*c*(*λ*), namely, 2.4— If

*ϕ*(*x*) is not convex, then*c**(*x*) is the affine regularization of*ϕ*(*x*), i.e.*c**(·) ≤*ϕ*(·), and for any convex rate function*f*such that*f*(·) ≤*ϕ*(·) implies*f*(·) ≤*c**(·).

The most important feature of the LDT property *p*_{V}(*n*) ∼ e^{−Vϕ(x)} is that the function *ϕ*(*x*) is independent of *V*, provided that *V* is sufficiently large. Therefore, even though *ϕ*(*x*) could exhibit bi- or multi-minima, when *V* → ∞ only one of the two minima (i.e. stable fixed points) is relevant in the thermodynamic limit, and it is the one with smaller *ϕ*(*x*). A Maxwell-like construction, therefore, is necessary at the critical parameter value when the two minima are equal (Ge & Qian 2009).

Consequently, according to the well-known Gärtner–Ellis theorem (Ellis 1985; Touchette 2008), we know that when *ϕ*(*x*) is bimodal with two local minima, then they are at different heights if and only if the *c*(*λ*) is differentiable at *λ* = 0, and d*c*(0)/d*λ* is simply the position of the lower minimum. This implies that the Maxwell-type construction where the two minima are at the same height (Ge & Qian 2009) corresponds to the function *c*(*λ*) being non-analytic at *λ* = 0. Further, if the rate function *ϕ*(*x*) is analytic, then *c*(*λ*) is continuous and
2.5

In classical equilibrium phase transition theory, a first-order phase transition has a discontinuity in the first derivative of the free energy, and a second-order phase transition has a discontinuity in the second derivative. According to this classification, the present (non-equilibrium) phase transition can be considered as first order. Another classification in engineering has been the ‘soft-’ and ‘hard-mode’ of instability. Transcritical and supercritical Hopf bifurcations are soft since their bifurcation diagrams are continuous; the saddle-node and subcritical Hopf bifurcations are hard since a pair of stable and unstable attractors appear ‘out of the blue’.

Moreover, if a non-convex *ϕ*(*x*) has two local minima of *ϕ*(*x*) with equal height, for sufficiently small *λ* < 0, *c*(*λ*) = *λ**x* − *ϕ*(*x*) for the *x* near the left minima satisfying *ϕ*′(*x*) = *λ*; and when *λ* > 0, also *c*(*λ*) = *λ**x* − *ϕ*(*x*) for the *x* near the right minima satisfying *ϕ*′(*x*) = *λ*. Therefore, the left and right derivatives of the function *c*(*λ*) at *λ* = 0 both exist but are equal to the left and right local minima, respectively.

If *ϕ*(*x*) has two local minima *x*_{1} and *x*_{2} with different heights, one can rewrite *λ* *x* − *ϕ*(*x*) = *λ*′*x* − *ψ*(*x*) where *ψ* (*x*) = *ϕ*(*x*) − *λ***x* such that *ψ*(*x*) has two minima with equal heights and *λ*′ = *λ*− *λ**. Hence, the non-analytic point of *c*(*λ*) moves to *λ**, and also the left and right derivatives at *λ* = *λ** both exist and are equal to the left and right local minima of *ψ*(*x*), respectively. In other words, *λ** is just the slope of the tangent line of *ϕ*(*x*) with exactly two tangent points. More generally, if the non-convex *ϕ*(*x*) has *k* tangent lines with more than one tangent points, then the function *c*(*λ*) has *k* non-analytical points, and vice versa. So it is a ‘higher level’ of convexity!^{1}

The above LDT results are summarized in figure 1. It shows that, as in Lee–Yang's theory (Zimm 1951; Lee & Yang 1952; Yang & Lee 1952), *c*(*λ*) is continuous but non-differentiable at *λ* = *λ**.

Now let us consider another parameter *θ* of the system. Let it be a bifurcation parameter in the nonlinear dynamics according to the Law of Mass Action (Ferrell & Xiong 2001; Qian & Reluga 2005; Qian 2007). It is easy to verify (Ge & Qian 2009) that the stable and unstable fixed points of the nonlinear dynamics correspond precisely with the minima and maxima of the *ϕ*(*x*), and bistability corresponds to double-wells in *ϕ*(*x*), and bimodality of −*ϕ*(*x*).

Here, consider the function (1/*V*)ln〈e^{λNV}〉 = *c*_{V}(*λ*, *θ*). As *V* tends to infinity, the limit *c*(*λ*,*θ*) exists, and it is continuous and a non-decreasing function of *λ*. Furthermore, there is a line in the (*λ*,*θ*) plane at which the *c* is non-differentiable with respect to *λ*. The line passes through (0,*θ**), where *θ** is the critical value of Maxwell construction with which the function *ϕ*(*x*) has two minima with equal heights. Such a ‘singularity line’ in the (*λ* − *θ*) space divides the space into upper left and lower right parts. They represent two phases. See figure 2.

In our theory, the derivative at *λ* = 0 is particularly meaningful: it is the mean concentration of molecules in the system (property of a generating function). In the thermodynamic limit, the mean and the highest peak position of e^{−Vϕ(x)} are the same, the macroscopic value. Thus, we understand that the Maxwell construction implies the mean concentration is not continuous.

The above LDT is not limited to one-dimensional systems, nor systems with detailed balance. Thus, it provides a heuristic derivation of the phase transition behaviour in any stochastic system with non-convex *ϕ*(*x*). Stochastic CME with deterministic multi-stability has non-convex *ϕ*(*x*) (Kubo *et al.* 1973; Graham & T'el 1984; Hu 1986). The present work establishes the correspondence between macroscopic, open chemical systems with multi-stability and the mesoscopic, multi-scale dynamics with non-equilibrium phase transitions (Ge & Qian 2009).

### 2.2. Mean dynamics of the CME with bistability

The thermodynamic, or *V* → ∞ limit of a CME is well understood following the work of (Kurtz 1972; Beard & Qian 2008), together with a Maxwell construction. The significance of the Maxwell construction is that in the thermodynamic limit, there is only one fixed point of a bistable system, not two, except at a critical value of the parameter—the phase transition point. However, it is important to point out that the mean, deterministic dynamics of a CME with finite *V* can be significantly different from that of the thermodynamic limit, with or without Maxwell construction. We shall now discuss this aspect of the CME with bistability.

From the CME, one can see that the locations of the two ‘fixed points’ of a bistable system are independent of the system's size. With increasing *V*, the system's fluctuations diminish because the probability associated with one of the fixed points tends to zero. If it had been otherwise, the systems fluctuations would not become zero in the thermodynamic limit. For a system with finite *V*, the correct mean dynamics has a slow component that is absent in the thermodynamic limit: that of Markovian jumping between the two attractors, ‘back and forth’, and establishing a steady-state distribution between them. When *V* → ∞, this slow component disappears.

Now consider this scenario: let *x*_{0} be in the basin of attraction of the less stable steady state, say *x*_{1}*. Then according to Kurtz's theorem, in the limit of *V* → ∞, the macroscopic *x*(*t*) with *x*(0) = *x*_{0} approaches to the *x*_{1}*. However, consider the mean dynamics for finite *V*: x̄_{V}(*t*) = 〈*X*(*t*)〉/*V* with the same initial condition. Then we have the following manifestation of Keizer's paradox (Ge & Qian 2009):
2.6

To obtain the mean dynamics from the CME, one can write the differential equations for the mean dynamics based on the CME. Except for linear systems, however, the equation is coupled to the variance and higher order moments of the distribution. A widely used treatment at this point is the ‘moment closure’ which either assumes fluctuations to be small, or certain relationships between its high-order moments and the variance (Gómez-Uribe & Verghese 2007; Kim *et al.* 2008). Assuming small fluctuations, however, is an operation on the order of 1/*V*, the same order approximation as linear Gaussian treatment.^{2} To device relationships between high-order moments and the variance requires certain *a priori* assumption on the nature of the distribution. Knowing the number of attractors, combining with Gaussian approximation within each attractor, gives a useful method of attack.

### 2.3. Generalizing Lee–Yang's theory

In equilibrium phase transition, according to Yang & Lee (1952), Lee & Yang (1952) and Zimm (1951), the non-analyticity in the free energy function *c*(*λ*) is due to a zero in the partition function *G*_{V}(*λ*) = 〈e^{λNV}〉 approaching the real axis from the complex plane of *λ*. Is the non-analyticity in our *c*(*λ*) also due to the zero of *G*(*λ*) = lim_{V→∞} *G*_{V} (*λ*)? This is indeed the case.

Our probability distribution for *N*_{V} has a finite support. So the generating function is a finite-order polynomial of *z*, (use *z* = e^{λ}). Then consider a region of the complex plane of *z*, which contains a section of the *z*-axis. According to Yang & Lee (1952), Yang & Lee (1952, theorem 2) and Zimm (1951) which is a pure mathematical result, the zero of the generating function must be ‘pinched’ onto real *z*-axis at the non-analytic point of the free energy function *c*(*λ*) when *V* tends to infinity. Therefore, our theory generalizes the Lee–Yang theory to NESS phase transition.

Several previous works have generalized the Lee–Yang theory in NESS (Blythe & Evans 2002; Chomaz & Gulminelli 2003; Touchette 2005) through specific examples. It has been suggested that the bimodal distribution could imply the Lee–Yang theory, but not vice versa. This is consistent with our result.

### 2.4. Cusp catastrophe and critical point in a PdPC with feedback

For large system's volume *V*, the CME gives the stationary probability *p*^{ness}(*x*) ∝ e^{−Vϕ(x)}. For the system in equation (2.1), the LDT rate function *ϕ*(*x*) satisfies Ge & Qian (2009)
2.7

One sees that the extrema of *ϕ*(*x*) match exactly with the roots of *r*(*x*;*θ*,*ε*) = 0 in equation (2.2).

Equation (2.2) exhibits saddle-node bifurcations. Figure 3*a* shows the steady states of equation (2.2), *x*^{ss}, as a function of *θ* with various *ε*. We see for the range of *ε* ≤ 1.33 the system has three fixed points, i.e. bistability. After introducing the Maxwell construction for each and every curve *x*^{ss}(*θ*), we obtain a set of monotonic *x*^{ss}(*θ*). This corresponds to the ‘*PV*-isotherm’ in the van der Waals theory of phase transition.

Equation (2.2) also exhibits cusp catastrophe. The boundary of the region of bistability in (*θ*,*ε*) space is given by a parametric curve (Ge & Qian 2009):
2.8

According to the cusp catastrophe theory (Zeeman 1976; Thom 1977), one of the most important features of this region is that it has a cusp, at *θ*^{cusp} = (1 + *μ*)^{2}/3 *μ*, *ε*_{cusp} = (1 − 8*μ*)/9*μ*, when *z*_{cusp} = 3*μ*/(1 + *μ*), as shown in figure 3*b*.

For a given *ε*, the critical *θ** at which the Maxwell construction is performed satisfies *θ*_{1} ≤ *θ** ≤ *θ*_{2}. Thus, the critical line *θ**(*ε*) abruptly terminates at the cusp. In equilibrium phase transition, the cusp is also known as critical point (Gaite 1990).

We also note that bistability implies that the *x*^{ss}, as a function of the *θ*, or *ε*, is not monotonic (it is S-shaped as in figure 2*a*). However, after the Maxwell construction, the resulting *x*^{ss}(*θ*) is monotonic in the ‘true’ thermodynamic limit, shown in figures 2*d* and 3*a*. This is precisely the same situation as the *PV* isotherm in gas–liquid phase transition. The word ‘true’ means one has to wait sufficiently long to allow the jumps back and forth between attractors. The biochemical significance of monotonicity remains to be elucidated.

## 3. Discussion

While the CME as a fundamental theory of studying cellular biochemistry remains to be validated experimentally, it is certainly an acceptable mathematical model for studying mesoscopic complexity and emergent organization, as referred to by Laughlin *et al.* (2000) and Qian *et al.* (2009). Chemical reactions are marvellous systems for understanding complexity. The present work shows that while Kurtz's theorem is correct for finite time, the stationary solution with *V* tending to infinity is not the solution to the law of mass action, but rather requires an LDT treatment. From the CME point of view, the LDT treatment we present is a small but significant step beyond the Kurtz (1972) theory towards the macroscopic nonlinear dynamics. Analysing the CME is a much more challenging problem than analysing the partition function since the former offers a dynamic theory.

Furthermore, the present paper shows that many classical concepts from equilibrium phase transition can be applied to the bifurcation problem in nonlinear chemical dynamics that has a mesoscopic stochastic basis in terms of the CME. The celebrated Maxwell construction is a natural consequence of the general theory we propose, and the well-known Lee–Yang theorem is in fact a special case of it. Most importantly, the general theory is applicable to driven systems with NESS.

On the mathematical side, the general theory provides a framework to study nonlinear bifurcations in terms of mathematical non-analyticity of a certain function, a vision long being held by some investigators (Zeeman 1988). The large deviation function *ϕ*(*x*) can be in fact considered as some type of stochastic *landscape* (potential or Lyapunov function in a not rigorous sense) for systems without gradient, or detailed balance (Graham & Haken 1971; Graham & T'el 1984; Hu 1986; Ross *et al.* 2002; Ao 2004), which was first established by Nicolis & Lefever (1977).

### 3.1. Stationary distribution of CME

We shall again emphasize that our theory is general, applicable to any CME with multistability. For the simple example, this type of model of one-variable birth and death process, the master equation can be solved exactly and thus all the results discussed in the paper can be obtained quite straightforwardly from the explicit solution (Nicolis & Turner 1977). And also in the case of systems with several variables, it has been shown with particular examples that the master equation has always a unique stationary solution for any realistic finite size chemical system, despite the fact that the associated deterministic equations may have multiple stationary or non-stationary (such as time periodic or even chaotic) solutions (Van Kampen 1983; Gardiner 1985; Baras *et al.* 1990; Geysermans & Baras 1996).

Furthermore, the fact that the stationary probability density of the intensive variable *x* = *X*/*V* can be expressed in terms of the exponential of ‘*V**ϕ*(*x*)’(*ϕ*(*x*) is also called ‘stochastic potential’ or adaptive landscape) has been considered by Kubo *et al.* (1973), who succeeded in casting the original master equation into a Hamilton–Jacobi formulation based on the Kramer–Moyal expansion for an arbitrary number of variables. The same results could also be rigorously derived from the large deviations theory (Ge & Qian 2010).

The existence of ‘nice’ *ϕ*(*x*) in the asymptotic form of e^{−Vϕ(x)} might not be always true for the CME; note that there is chaotic behaviour involved as well. If one considers a CME whose corresponding ODE is a three-dimensional chaotic dynamics with a strange attractor, what will be the stationary distribution in the limit of *V* → ∞? This problem has been discussed in the past (Graham & T'el 1984; Hu 1986). The general feeling is that *ϕ*(*x*) is not smooth itself. So one does not have a ‘nice’ *ϕ*(*x*). For a very ‘rugged *ϕ*(*x*)’, we believe that its Fenchel–Legendre transform *c*(*λ*) might be a very powerful way to ‘find the key feature’ of the *ϕ*(*x*). The number of non-differentiable points is definitely much smaller than the number of peaks.

### 3.2. Diffusion approximation for the CME

Kurtz (1976, 1978) has proved a second theorem. It basically stated that, in the thermodynamic limit, the solution of the master equation converges with probability one to that of the corresponding Fokker–Planck equation (diffusion process) for some finite time. Shortly after, it was again established that the coexistence line based on the master equation differs from the one based on the diffusion process, even if the noise in the latter is nonlinear. This important result implies that there exists a domain in the parameter space where the most probable state, for the master equation approach, is the least probable one for the diffusion process modelling (Baras & Malek Mansour 1997).

The obvious question is: which state is actually realized in nature? It has then been shown that a sufficient condition for the validity of the Kurtz theorem in the limit of infinitely long time (i.e. stationary state) is the uniqueness of the macroscopic attractor, which thus includes systems exhibiting sustained oscillatory regimes (limit cycle; Malek Mansour *et al.* 1981). Molecular dynamic simulations based on Newtonian hard spheres have shown that it is the master equation that provides the correct answer (Baras *et al.* 1996). We believe that for the dynamics of a bistable chemical systems, this diffusion approximation could only correctly describe it around each steady state, and if we would like to include the stochastic transition between the two stable steady states, this approximation is not at the correct timescale. For instance, it has already been shown that it is impossible to derive from a jump Markov process (master equation) a diffusion process (Fokker–Planck equation) that correctly describes the statistical behaviour of the system in the multiple steady-state regime in the multiplicative noise case (Horsthemke *et al.* 1977).

### 3.3. Keizer's paradox

The disagreement for the deterministic trajectory and the peaks of the stationary distribution of master equations and diffusion processes is also called Keizer's paradox, which was addressed nearly four decades ago. According to Kurtz's theorems, we know that the most probable trajectories for the master equation and its corresponding diffusion processes accord well with the deterministic trajectory, and also it can easily be shown that the most probable states (local probability maximums) of both the master equation and the Fokker–Planck equation coincide very precisely with the solutions of the corresponding macroscopic (deterministic) equation as long as the volume is large.

However, this coincidence is only within the timescale of deterministic trajectory. Is there anything neglected beyond it? The answer is yes; it is in fact the stochastic transitions between the multiple steady states. It will give rise to their relative probabilities and become the origin of Keizer's paradox. Further, it is also the reason why the mean dynamics described above have those inconsistencies.

### 3.4. Beyond deterministic dynamics

It is generally believed that when a system's size increases, the stochastic behaviour at a mesoscopic level averages out, and a deterministic behaviour emerges. However, our present analysis clearly show that the emerging deterministic behaviour in the CME is a metastable system's dynamics. Beyond that timescale, another ‘macroscopic’ *stochastic* behaviour exists! This multi-attractor stochastic system is a true emerging phenomenon that one cannot naively expect from the deterministic dynamics (e.g. based on the relative area of the attractive basins) without detailed stochastic mechanistic modelling. The Maxwell construction is the consequence of the steady state on this ‘beyond-deterministic-infinite’ timescale.

There are three timescales in this mathematical hierarchy of cellular dynamics (see figure 4); a molecular signalling timescale (i.e. the rate constant for molecular interactions), a biochemical network timescale (i.e. the deterministic relaxation times to attractors), and a cellular evolutionary timescale). We believe it is at the last level of stochastic dynamics that is most relevant to major cellular biological issues such as differentiation, apoptosis, cancer immunoediting and epi-genetics.

## Acknowledgement

We thank R. M. O'Malley for many discussions and an anonymous reviewer for helpful comments. We acknowledge the supportive environment for collaborations provided by Fudan University, School of Mathematics, via Shanghai Key Laboratory of Modern Applied Mathematics grant 08FG077. H. Ge also wishes to acknowledge support by the National Natural Science Foundation of China (10901040) and the specialized Research Fund for the Doctoral Program of Higher Education (New Teachers).

## Appendix A. Large deviations theory in equilibrium phase transition and Lee–Yang theorem

The materials in this section are mainly based on the works of R. S. Ellis, Touchette (2008), Ellis (1985) and Huang (1963). The generic equilibrium problem starts with a Hamiltonian *H*_{n}(*ω*), where *n* is the number of identical, interacting particles (atoms, spins, molecules, etc.) in a system. *ω* = (*ω*_{1}, *ω*_{2}, … ,*ω*_{n}) denotes the microstate of the system. One task of classical statistical mechanics is to understand a certain discontinuity in the thermodynamic limit of *n* → ∞, when a phase transition can be rigorously defined. For finite *n*, all the thermodynamic functions are continuous and smooth.

The free energy of the system is the minus logarithm of the partition function A 1

Therefore, the free energy per particle in the thermodynamic limit is A 2

Gibbsian statistical mechanics is related to the theory of probability as follows. With system's configuration *ω* fluctuating, its energy *H*_{n}(*ω*) is a random variable. However, the mean energy, i.e. energy per particle, *h*_{n}(*ω*) = *H*_{n}(*ω*)/*n* tends to a constant h̄ when *n* tends to infinity regardless of *ω*. This is the Law of Large Numbers in the theory of probability. Without loss of generality, we set energy reference h̄ = 0.

The large deviations theory (LDT) computes the rate of this convergence. Note that |*h*_{n}(*ω*)| tends to zero when *n* → ∞. In fact, the convergence is exponentially fast in the sense its probability density function *f*_{hn}(*ξ*) ∼ e^{−nϕ(ξ)}. Hence, one defines the rate of convergence
A 3

Note in statistical mechanics, *ϕ*(*ξ*) is just the microcanonical density of the state, i.e. entropy function. Therefore, a key result in LDT, and in Gibbs' theory, is that *ϕ*(*ξ*) and *c*(*β*) are related via a Legendre transform:
A 4

That is, in the limit of *n* → ∞ according to Laplace's method, also known as the maximum term method,
A 5

Thermodynamic quantities are obtained from derivatives of free energy *c*(*β*). The equilibrium phase transition problem is to show that *c*(*β*) can be non-differentiable at a certain *β* value, thus giving rise to discontinuity in the ∂*c*(*β*)/∂*β*. As shown in the text, one such situation is when *ϕ*(*ξ*) is non-convex, e.g. bimodal. The Lee–Yang theorem indicates that this corresponds to *Z*_{n}(*β*) having a zero, in the complex plane of *β*, that approaches the real axis *β**. Therefore, in the limit of infinite *n*, *c*(*β*) is non-analytic at *β**.

## Footnotes

↵1 We observe that if

*ϕ*(*x*) has three minima, and the middle one is the highest, it seems that the*c*(*λ*) will be insensitive to the middle minimum! It only depends on the two low ones! However, if the middle one is the lowest, then*c*(*λ*) will have two non-analytical points.↵2 For a CME with finite

*V*, its mean dynamics starts at*x*_{0}differs from the deterministic solution to the ODE based on the Law of Mass Action in two respects: one is the stochastic fluctuations, which is defined with respect to the mean, and the difference between the exact mean and the solution to the ODE. Both terms diminish as*O*(*V*^{−1}). There is no obvious reason when one can neglect the former but meaningfully keeps the latter. To illustrate this, consider a simple example: . The CME is d*p*(*n,t*)/d*t*=*k*(*n*+ 2)(*n*+ 1)*p*(*n*+ 2,*t*) −*kn*(*n*− 1)*p*(*n,t*). Then, the mean dynamics d〈*n*〉/d*t*= −2*k*〈*n*(*n*− 1)〉 = −2*k*〈*n*〉^{2}+ 2*k*〈*n*〉 − 2*k*〈(Δ*n*)^{2}〉. We see that the right-hand side has two additional terms than the ODE: a term proportional the fluctuation, and a term having to do with the difference between*n*(*n*− 1) and*n*^{2}. Even when the fluctuation is small, the mean dynamics for finite*V*is not identical to that of thermodynamics limit. If we let mean concentration x̄_{V}= 〈*n*〉/*V*, and rate constant k̂ =*kV*then we have for finite*V*system with small fluctuations: dx̄_{V}/d*t*= −2k̂x̄_{V}^{2}+ 2(k̂/*V*)x̄_{V}. This equation might improve the accuracy of mean dynamics somewhat; but it will not be able to deal with bistability, which is a phenomenon on a completely different timescale.

- Received April 7, 2010.
- Accepted April 23, 2010.

- © 2010 The Royal Society