## Abstract

We study a deterministic mechanistic model for the flow of ribosomes along the mRNA molecule, called the *ribosome flow model with extended objects* (RFMEO). This model encapsulates many realistic features of translation including non-homogeneous transition rates along mRNA, the fact that every ribosome covers several codons, and the fact that ribosomes cannot overtake one another. The RFMEO is a mean-field approximation of an important model from statistical mechanics called the *totally asymmetric simple exclusion process with extended objects* (TASEPEO). We demonstrate that the RFMEO describes biophysical aspects of translation better than previous mean-field approximations, and that its predictions correlate well with those of TASEPEO. However, unlike TASEPEO, the RFMEO is amenable to rigorous analysis using tools from systems and control theory. We show that the ribosome density profile along the mRNA in the RFMEO converges to a unique steady-state density that depends on the length of the mRNA, the transition rates along it, and the number of codons covered by every ribosome, but not on the initial density of ribosomes along the mRNA. In particular, the protein production rate also converges to a unique steady state. Furthermore, if the transition rates along the mRNA are periodic with a common period *T* then the ribosome density along the mRNA and the protein production rate converge to a unique periodic pattern with period *T*, that is, the model entrains to periodic excitations in the transition rates. Analysis and simulations of the RFMEO demonstrate several counterintuitive results. For example, increasing the ribosome footprint may sometimes lead to an increase in the production rate. Also, for large values of the footprint the steady-state density along the mRNA may be quite complex (e.g. with quasi-periodic patterns) even for relatively simple (and non-periodic) transition rates along the mRNA. This implies that inferring the transition rates from the ribosome density may be non-trivial. We believe that the RFMEO could be useful for modelling, understanding and re-engineering translation as well as other important biological processes.

## 1. Introduction

Gene expression is a multi-stage process for converting the information inscribed in DNA to proteins. During the transcription stage, the information in the DNA of a specific gene is copied into messenger RNA (mRNA). In the translation stage, complex macromolecules called ribosomes bind (at the initiation phase) to the mRNA and unidirectionally decode each codon (at the elongation phase) into the corresponding amino acid that is delivered to the awaiting ribosome by transfer RNA (tRNA). Finally, at the termination phase, the ribosome detaches from the mRNA, the amino acid sequence is released, folded and becomes a functional protein (in some cases, post-translation modifications may occur) [1]. The output rate of ribosomes from the mRNA, which is also the rate in which proteins are generated, is called the protein translation rate or production rate.

Translation occurs in all living organisms, and under almost all conditions. Thus, understanding the factors that affect translation has important implications to many scientific disciplines, including medicine, evolutionary biology and synthetic biology. Deriving and analysing mechanistic models of translation is important for developing a better understanding of this complex, dynamical and tightly regulated process. Such models can also aid in integrating and analysing the rapidly increasing experimental findings related to translation (e.g. [2–9]).

Mechanistic models of translation describe the dynamics of ribosome movement along the mRNA molecule, with parameters that encode the various translation factors affecting the codon decoding times along the mRNA molecule. Several such models have been suggested based on different paradigms ranging from Petri nets [10] to probabilistic Boolean networks [11]. For more details, see the survey papers [9,12].

The *totally asymmetric simple exclusion process* (TASEP) [13,14] is a fundamental model in non-equilibrium statistical mechanics that has been used to model numerous natural and artificial processes [9,15], including ribosome flow during mRNA translation. In TASEP, particles stochastically hop between consecutive sites along an ordered lattice of *N* sites. However, a particle cannot hop to an already occupied site. TASEP encapsulates both the unidirectional flow of ribosomes along the mRNA molecule, and the *interaction* between the particles, as a particle in site *i* blocks the movement of a particle in site *i* − 1. This hard exclusion principle models particles that have ‘volume’ and thus cannot overtake one other. In the context of translation, the lattice represents the mRNA molecule, and the particles are the ribosomes. The rate of hopping from site *i* to site (*i* + 1) is denoted by *γ*_{i}. A particle can hop to (from) the first (last) site of the lattice at a rate *α* [*β*]. The flow through the lattice converges to a steady-state value that depends on *N* and the vector of parameters:
1.1The special case where all the internal hopping rates are assumed to be equal and normalized to one, i.e. *γ*_{i} : = 1, *i* = 1, …, *N* − 1, is referred to as the *homogeneous TASEP* (HTASEP).

In TASEP, a particle occupies a single site. However, in translation every ribosome occupies not only the codon it is translating, but also codons after and before it. More precisely, the ribosome footprint is about 10 to 11 codons, and its exit tunnel length is about 31 codons [1,16–19]. In *TASEP with extended objects* (TASEPEO), a particle occupies multiple sites along the lattice [13,20–25]. For TASEPEO with open-boundary conditions (i.e. where the two sides of the lattice are connected to two particle reservoirs, as assumed here), few rigorous analytical results are known [22]. Mean-field approximations, domain-wall arguments and extensive Monte Carlo simulations suggest that the *homogeneous* TASEPEO converges to a steady state and that the model has the same phase diagram as HTASEP, i.e. it contains three phases: low-density, high-density, and maximal current. The phase boundaries depend on the extended object size [24]. TASEPEO with two types of object sizes was studied in [26]. It is important to mention that the extended objects concept is relevant for other intracellular processes, e.g. transcription [27–29].

The *ribosome flow model* (RFM) [30] is a deterministic mathematical model for mRNA translation, obtained via a mean-field approximation of TASEP with open-boundary conditions. As such, it also inherits the property that the particle size is equal to the site size. When the RFM is used to model translation based on real biological data, this issue is handled by coarse-graining the mRNA molecule into sites composed of several consecutive codons. For every site, the translation time of each codon in the site is used to determine the translation time of the site in the RFM (e.g. [30]). It is not clear, however, how to systematically coarse-grain the mRNA in a way that yields the best fidelity between the model structure and parameters and the biological reality.

In this paper, we analyse for the first time a mean-field approximation of TASEPEO. This is a deterministic model that we refer to as the *ribosome flow model with extended objects* (RFMEO). Using the theory of contractive dynamical systems, we rigorously prove that the RFMEO always converges to a steady state. In other words, the density profile of ribosomes along the mRNA molecule always converges to a unique steady state, and thus so does the protein production rate. This shows that the RFMEO is robust in the sense that perturbations (e.g. due to stochastic noise in the biochemical reactions) in the ribosome density and production rate die out with time. This also means that we can reduce the problem of studying the density profile and protein production rate to studying the steady-state profile and production rate. We also prove that the RFMEO entrains (or frequency locks) to periodic excitations. We show using simulations that the RFMEO, unlike the RFM, correlates well with TASEPEO.

The remainder of this paper is organized as follows. The next section briefly reviews the RFM. Section 3 describes the RFMEO. Section 4 describes our main theoretical results on the properties of the RFMEO. Section 5 studies the correlation between RFMEO and TASEPEO. The final section summarizes and describes several directions for further research. To increase the readability of this paper, all the proofs are placed in electronic supplementary material, appendix A. Electronic supplementary material, appendix B, describes how the RFMEO can be derived by a mean-field approximation of TASEPEO. Both appendixes can be found in the electronic supplementary material.

## 2. Ribosome flow model

The RFM [30] is a *deterministic* model for mRNA translation that can be derived by a mean-field approximation of TASEP (e.g. [15, §4.9.7], [31, p. R345]; see also electronic supplementary material, appendix B, in the special case where the extended object size is equal to one site unit). In the RFM, mRNA molecules are coarse-grained into *n* consecutive sites of codons. The state variable , *i* = 1, …, *n*, describes the normalized ribosomal occupancy level at site *i* at time *t*, where *x*_{i}(*t*) = 1 [*x*_{i}(*t*) = 0] indicates that site *i* is completely full (empty) at time *t*. The model includes *n* + 1 positive parameters that describe the maximal possible transition rate between the sites: the initiation rate into the chain λ_{0}, the elongation (or transition) rate from site *i* to site (*i* + 1) λ_{i}, *i* = 1, …, *n* − 1, and the exit rate λ_{n}.

The dynamics of the RFM with *n* sites is given by *n* nonlinear first-order ordinary differential equations:
2.1If we let *x*_{0}(*t*) : = 1 and *x*_{n+1}(*t*) : = 0, then (2.1) can be written more succinctly as
2.2where *h*_{i}(*x*) : = λ_{i}*x*_{i}(1 − *x*_{i+1}). This can be explained as follows. The flow of particles from site *i* to site (*i* + 1) at time *t* is λ_{i}*x*_{i}(1 − *x*_{i+1}). This flow increases with the density at site *i*, and decreases as site (*i* + 1) becomes fuller. This corresponds to a ‘soft’ version of a simple exclusion principle: since the particles have volume, the input rate to site *i* decreases as the number of particles in that site increases. Note that the maximal possible flow from site *i* to site (*i* + 1) is the transition rate λ_{i}. Thus, equation (2.2) simply states that the change in the density at site *i* at time *t* is the input rate to site *i* (from site *i* − 1) minus the output rate (to site *i* + 1) at time *t*.

The ribosome exit rate from site *n* at time *t* is equal to the protein production (or translation) rate at time *t*, and is denoted by *R*(*t*): = λ_{n}*x*_{n}(*t*) (figure 1). Note that *x*_{i} is dimensionless, and that every rate λ_{i} has units of 1/time.

Let *x*(*t*, *a*) denote the solution of (2.1) at time *t* ≥ 0 for the initial condition *x*(0) = *a*. Since the state variables correspond to normalized occupancy levels, we always assume that *a* belongs to the closed *n*-dimensional unit cube:
Let int(*C*^{n}) denote the interior of *C*^{n}, and let ∂*C*^{n} denote the boundary of *C*^{n}. It was shown in [32] that if *a*∈*C*^{n} then *x*(*t*, *a*) ∈ *C*^{n} for all *t* ≥ 0, that is, *C*^{n} is an invariant set of the dynamics. Margaliot & Tuller [32] also showed that the RFM is a *tridiagonal cooperative dynamical system* [33], and that this implies that (2.1) admits a *unique* steady-state point *e* = *e*(λ_{0}, …, λ_{n})∈int (*C*^{n}), that is globally asymptotically stable, that is, lim_{t →∞} *x*(*t*, *a*) = *e*, for all *a* ∈ *C*^{n} (see also [34]). In particular, the production rate converges to the steady-state value *R*: = λ_{n}*e*_{n}.

An important advantage of the RFM (e.g. as compared to TASEP) is that it is amenable to mathematical analysis using various tools from systems and control theory. Furthermore, most of the analysis results hold for the general, non-homogeneous case (i.e. when the transition rates all differ from one another). The RFM has been used to address many important biological problems including the sensitivity of the production rate to small changes in the transition rate, maximizing and minimizing the production rate in an optimal manner, analysis of the effect of competition for shared resources in translation and more [32,34–44].

To account for the fact that each ribosome covers several codons, we analyse here the RFMEO, which is a mean-field approximation of TASEPEO (see electronic supplementary material, appendix B, for more details). An integer ℓ ≥ 1 describes the number of site units covered by each particle. The exclusion principle now implies that the rate of flow from site *i* to site (*i* + 1) is λ_{i}*x*_{i}(1 − *x*_{i+1} − · · · − *x*_{i+ℓ}). Indeed, since the particle covers the next ℓ sites, as the density in any of the ℓ consecutive sites increases the rate of movement slows down. Note that ℓ = 1 yields the RFM, so the RFM is a special case of the RFMEO.

Nevertheless, the RFMEO is a significant generalization of the RFM and its dynamics is quite different from that of the RFM. For example, the RFMEO, unlike the RFM, is *not* a cooperative system; it does *not* satisfy the particle-hole symmetry of the RFM (and of TASEP) [31,44], and unlike the RFM, the RFMEO with ℓ > 1 is *not* a tridiagonal dynamical system.

## 3. Ribosome flow model with extended objects

Being a large complex of molecules, each ribosome typically covers between 10 and 11 codons and the geometry (e.g. length of the exit tunnel) can be longer than 30 codons [1]. A drawback of the RFM and other standard mean field models for translation is that, without additional processing such as coarse-graining, each ribosome (particle) is assumed to cover a single site.

The RFMEO allows modelling the flow of ribosomes where every ribosome covers 1 ≤ ℓ ≤ *n* site units. We assume, without loss of generality, that the ribosome is translating the left-most site it is covering, and refer to this part of the ribosome as the *reader*. A similar assumption is used in TASEPEO (e.g. [13,22–25,45]). Thus, the statement ‘the ribosome is at site *i*’ means that: the reader is located at site *i*; the ribosome is translating site *i*; its corresponding transition rate is λ_{i} and sites *i*, …, *i* + ℓ − 1 are covered by this ribosome. As we will show below, the dynamical equations describing the RFMEO (and thus all the theoretical results in this paper) are the same for any chosen reader location (e.g. choosing the reader at location ℓ/2 results in exactly the same RFMEO equations).

Let *x*_{i}(*t*) denote the (normalized) *reader* occupancy level at site *i* at time *t*, and let *y*_{i}(*t*) denote the (normalized) *coverage* occupancy level at site *i* at time *t*, that is,
3.1

Indeed, since every ribosome covers ℓ sites, any ribosome that is located up to ℓ sites left to site *i* contributes to the total ribosome coverage at site *i*. The term ‘normalized’ here means that each *x*_{i}(*t*) and each *y*_{i}(*t*) takes values in the interval [0, 1] for all *t* ≥ 0. The value zero corresponds to completely empty, and one means completely full. We refer to 1 − *y*_{i}(*t*) as the ‘space’ or ‘vacancy’ level at site *i* at time *t*.

Note that (3.1) implies that *y*(*t*) = *Px*(*t*), where *P* is the lower triangular matrix with all entries zero, except for the entries on the main diagonal and (ℓ − 1) diagonals below the main diagonal that are ones. For example, for *n* = 4 and ℓ = 3:
3.2

The dynamics of the RFMEO with *n* sites is given by *n* nonlinear first-order ordinary differential equations:
3.3Here *q*_{i−1} is the flow into site *i* and *q*_{i} is the flow out of site *i*. The expression for this flow is given by
3.4with *x*_{0}(*t*) ≡ 1 and *y*_{j}(*t*) ≡ 0 for all *j* > *n*.

Equation (3.4) implies that the *reader* flow from site *i* to site (*i* + 1) is proportional to λ_{i}, to the occupancy levels of readers at site *i*, and to the ‘space’ or ‘vacancy’ level at site *i* + ℓ (figure 2). In particular:

— As the number of readers at site

*i*increases, the flow from site*i*increases. This follows the same reasoning as in the RFM.— When a reader located at site

*i*moves to site (*i*+ 1), the coverage occupancies at sites*i*+ 1,*i*+ 2, …,*i*+ ℓ − 1 do not change. However, the reader's tail end will now occupy a new site, which is site (*i*+ ℓ).— The ‘vacancy’ level at site (

*i*+ ℓ) is (1 −*y*_{i+ℓ}), since*y*_{i+ℓ}denotes the total coverage at site (*i*+ ℓ).

To explain (3.3), consider, for example, the equation for the change in the density at site 1 given by
The term λ_{0}(1 − *y*_{ℓ}) represents the entry rate into site 1. Indeed, since the entering ribosome will cover sites 1, 2, …, ℓ, this entry rate decreases with the coverage density *y*_{ℓ} = *x*_{1} + … + *x*_{ℓ}. (In the literature on TASEPEO this is referred to as the ‘complete-entry’ flow [22].) The term λ_{1}*x*_{1}(1 − *y*_{ℓ+1}) is the flow from site 1 to site 2. This increases with the occupancy at site 1 and, similarly, decreases with the coverage occupancy *y*_{ℓ+1}.

### Remark 3.1.

As noted above, the assumption that the ‘reading head’ is located at the left-hand side of the ribosome is arbitrary, but the RFMEO equations do not depend on this assumption. To demonstrate this, consider, for example, the case ℓ = 3 and the three possible locations for the reader: (i) *Left-most site*. In this case *y*_{j} = *x*_{j−2} + *x*_{j−1} + *x*_{j}, so
3.5(ii) *Middle site*. In this case *y*_{j} = *x*_{j−1} + *x*_{j} + *x*_{j+1} and *q*_{i}(*x*) = λ_{i}*x*_{i}(1 − *y*_{i+2}), and this yields (3.5). (iii) *Right-most site*. In this case *y*_{j} = *x*_{j} + *x*_{j+1} + *x*_{j+2} and *q*_{i}(*x*) = λ_{i}*x*_{i}(1 − *y*_{i+1}), again yielding (3.5). Thus, (3.4) is invariant to the reader location.

Consider an index *j* ≥ *n* − ℓ + 2. Then ℓ + *j* − 1 ≥ *n* + 1, so
Thus, the equation describing the flow in these last sites is a linear equation. The same phenomena take place in TASEPEO, as a ribosome ‘reading’ the last ℓ codons must be the last particle on the lattice, with no others to impede its progress. Therefore, it can move without hindrance towards the exit end. The exit rate in this context is referred to as the ‘incremental-exit’ rate [22].

The output rate of ribosomes from the chain, which is the *protein production* (*or translation*) *rate*, is denoted by *R*(*t*): = λ_{n}*x*_{n}.

Note that in the special case ℓ = 1, we have *y*_{i} = *x*_{i} for all *i* = 1, …, *n*, and then (3.3) reduces to the RFM.

### Example 3.2.

Consider a RFMEO with dimension *n* = 4 and particle size ℓ = 2. Then (3.3) yields
3.6In the RFM, the set *C*^{n} is an invariant set of the dynamics. This is no longer true for the RFMEO. For example, for the initial condition
3.7equation (3.6) yields
This implies that *x*_{1}(0^{+}) > 1 for λ_{1} > λ_{0}, so *x*(0^{+})≠*C*^{n}.

### Remark 3.3.

The Jacobian matrix of the dynamics (3.6) is

Note that there are off-diagonal entries here whose sign may change with time (e.g. −λ_{0} + λ_{1}*x*_{1}). This implies that the RFMEO, unlike the RFM, is not a cooperative dynamical system.

It is useful to explicitly write the dynamics of the RFMEO in terms of the coverage state-variables (i.e. the *y* state-vector). Recall that the proofs of all the results are placed in electronic supplementary material, appendix A.

### Proposition 3.4.

*The coverage state variables in the RFMEO satisfy:*
3.8
where ⌈*z*⌉ *denotes the smallest integer that is larger than or equal to* *z*.

### Example 3.5.

Consider the RFMEO with *n* = 4 sites and particle size ℓ = 2. Then (3.8) yields

## 4. Theoretical results

We begin by defining the relevant state-space for the RFMEO. If for some *i*, we have *y*_{i+ℓ} > 1 then *q*_{i}(*x*) : = λ_{i}*x*_{i}(1 − *y*_{i+ℓ}) < 0. This represents a backward flow that according to current knowledge does not take place in ribosome movement. Thus, it is useful to define the state-space as the region where such a backward flow does not take place, i.e. both the *x*_{i}s and the *y*_{i}s are between zero and one. This leads to defining
Note that *H* is a compact and convex set.

### Example 4.1.

Consider the RFMEO with *n* = 3 sites and particle size ℓ = 2. The sets *H* and *C*^{3} are depicted in figure 3.

Note also that for the RFMEO with *n* = 4 and ℓ = 2 the initial condition *x*(0) in (3.7) is *not* in *H* as *y*_{2}(0) = *x*_{1}(0) + *x*_{2}(0) > 1.

From here on we refer to any value *x*∈*H* as a *feasible value*. This represents a state such that every reader density and every coverage density is between zero and one.

### 4.1. Invariance and persistence

The next result shows that the boundary of *H*, denoted ∂*H*, is ‘repelling’ towards the interior of *H*. This means that if the dynamics is initiated with a feasible value that includes a reader/coverage density equal to the extremal value zero (one) then the dynamics will immediately change this to a value larger than zero (smaller than one).

### Proposition 4.2.

*For any* *a*∈∂*H* *the solution of the RFMEO satisfies* *x*(*t*, *a*) ∈ int (*H*) *for all* *t* > 0.

Note that this result implies, in particular, that *H* is an invariant set of the dynamics. In other words, if all the reader and coverage densities are initiated with feasible values (i.e. values between zero and one) at time *t* = 0 then they remain feasible for all time *t* ≥ 0.

The next result shows that the solutions of the RFMEO are ‘persistent’ in the sense that they enter and remain in a set that is uniformly separated from the boundary of *H*. Furthermore, this happens ‘immediately’.

### Proposition 4.3.

*For any* *τ* > 0*, there exists a compact and convex set* *H*_{τ} *that is strictly contained in* *H* *such that for any* *a*∈*H*,

Note that this implies that for any *τ* > 0 there exists such that
for all *i* and all *a*∈*H*. In other words, all the reader and coverage densities ‘immediately’ become and remain uniformly separated from the extreme values zero and one. This is a technical property, but as we will see below it will be useful in the analysis of the asymptotic properties of the RFMEO.

### 4.2. Contraction

Contraction theory is a powerful tool for analysing nonlinear dynamical systems. In a contractive system, trajectories that emanate from different initial conditions approach each other at an exponential rate, that is, the distance between any pair of trajectories, as a function of time, decreases at an exponential rate [46–48].

Consider the time-varying dynamical system 4.1whose trajectories evolve on a compact and convex set .

For *t* ≥ *t*_{0} ≥ 0 and *a*∈*Ω*, let *x*(*t*, *t*_{0}, *a*) denote the solution of (4.1) at time *t* for the initial condition *x*(*t*_{0}) = *a*. Recall that system (4.1) is said to be contractive on *Ω* with respect to (w.r.t.) a norm if there exists *γ* > 0 such that
4.2for all *t*_{2} ≥ *t*_{1} ≥ 0 and all *a*, *b*∈*Ω*. This means that any two trajectories approach each other at an exponential rate *γ*.

To apply contraction theory to the RFMEO, we require the following generalization of contraction w.r.t. a fixed norm that has been introduced in [49]. The time-varying system (4.1) is said to be *contractive after a small overshoot and short transient* (SOST) on *Ω* w.r.t. a norm if for each *ɛ* > 0 and each *τ* > 0 there exists *γ* = *γ*(*τ*, *ɛ*) > 0 such that
for all *t*_{2} ≥ *t*_{1} ≥ 0 and all *a*, *b*∈*Ω*. Comparing this to (4.2), we see that here contraction ‘kicks in’ after an arbitrarily small time transient *τ* and with an arbitrarily small overshoot (1 + *ɛ*).

The next result applies these ideas to the RFMEO. Let denote the *L*_{1} norm, i.e. for , |*z*|_{1} = |*z*_{1}| + … + |*z*_{n}|.

### Proposition 4.4.

*The RFMEO is SOST on* *H* *w.r.t. the* *L*_{1} *norm*, *that is*, *for each* *ɛ* > 0 *and each* *τ* > 0 *there exists* *γ* = *γ*(*τ*, *ɛ*) > 0 *such that*
4.3*for all* *t* ≥ 0 *and all* *a*, *b* ∈ *H*.

Roughly speaking, this means the following. Fix two initial feasible densities in the RFMEO and consider how the two corresponding densities along the mRNA evolve in time. Then these densities become ‘more similar’ to each other at an exponential rate. In particular, the initial density is ‘quickly forgotten’.

Subsections 4.3 and 4.5 below describe important asymptotic properties of the RFMEO that follow from proposition 4.4.

### 4.3. Global asymptotic stability

Write the RFMEO (3.3) as , with . Since the compact and convex set *H* is an invariant set of this dynamical system, it contains a steady-state point *e* = *e*(ℓ, *λ*_{0}, …, *λ*_{n}). In other words, *g*(*e*) = 0_{n}, where 0_{n} denotes a column vector of *n* zeros, and *x*(*t*, *e*) ≡ *e* for all *t* ≥ 0. Proposition 4.2 implies that *e* ∈ int(*H*). Using (4.3) with *b* : = *e* yields the following result.

### Corollary 4.5.

*The RFMEO admits a globally asymptotically stable steady-state point* *e*∈int(*H*), *i.e.*

This means that trajectories corresponding to different initial conditions all converge to the unique steady-state point, that depends on the transition rates λ_{i}s, particle size ℓ, and the length of the chain *n*, but not on the initial condition.

### Example 4.6.

Figure 4 depicts the trajectories of an RFMEO with dimension *n* = 3, particle size ℓ = 2, and rates λ_{0} = 1.0, λ_{1} = 1.2, λ_{2} = 0.8 and λ_{3} = 0.4, for four different initial conditions on the boundary of *H*. It may be seen that each trajectory immediately enters and remains in the interior of *H*, and converges to a unique steady-state point *e* ∈ int(*H*).

The next example demonstrates the contraction property. Let 1_{n} denote the column vector of *n* ones.

### Example 4.7.

Consider the RFMEO with dimension *n* = 7, particle size ℓ = 3, and rates λ_{i} = 1 − *i*/50, *i* = 0, …, 7. In this case, the unique steady-state point is (all numbers are to four digit accuracy):
Figure 5 depicts *r*(*t*) : = |*x*(*t*, *a*) − *e*|_{1} as a function of time for *t*∈[0, 70] for the initial condition . It may be seen that the *L*_{1} distance between the trajectory and *e* monotonically decreases to zero. It may also be seen that the rate of convergence varies with time. This makes sense because we can interpret the RFMEO as an RFM with time-varying transition rates (see the proof of proposition 4.4 in electronic supplementary material, appendix A), and thus a time-varying contraction rate.

Corollary 4.5 implies that the coverage occupancy *y*_{i}(*t*) at site *i* converges to the unique steady-state value:
4.4

Define the *mean reader occupancy* at time *t* by and the *mean coverage occupancy* at time *t* by Note that this implies that lim_{n → ∞}*ρ*^{c}(*t*) = ℓ*ρ*(*t*). Then the mean reader occupancy converges to the unique steady-state value
4.5and the mean coverage occupancy converges to the unique steady-state value
4.6

The next example demonstrates the contraction property using a *Saccharomyces cerevisiae* gene. Let 0_{n} denote the column vector of *n* zeros.

### Example 4.8.

We consider the highly expressed *S. cerevisiae* gene YLR110C that encodes a cell-wall mannoprotein and contains 133 codons (excluding the stop codon). We modelled it using a RFMEO with *n* = 133 and ℓ = 10. The value λ_{0} = 1.33131 (in units of mRNAs per second) was estimated based on the ribosome density per mRNA levels, as this value is expected to be approximately proportional to the initiation rate when initiation is rate limiting [30,35]. The elongation rates λ_{1}, …, λ_{n} were estimated using ribo-seq data for the codon decoding rates [50], normalized so that the median elongation rate of all *S. cerevisiae* mRNAs becomes 6.4 codons s^{−1} [51]. The rates are depicted in figure 6*a* as a function of *i*. To study the rate of contraction, we calculated *e* in the RFMEO (shown in figure 6*b*), and simulated the dynamical system to obtain *r*(*t*) = |*x*(*t*, *a*) − *e*|_{1} with *a* = 0_{133}, as a function of *t* (in seconds). Note that the initial condition *a* = 0_{133} represents an mRNA with no ribosomes. Figure 6*c* depicts the relative *L*_{1} distance in percentage, that is,
4.7as a function of *t* (in seconds). In this case, *ρ* = 0.0534. It may be seen that the relative distance is less than 20% already after about 30 s. We note that typical *S. cerevisiae* mRNA half-lives are of the order of tens of minutes (e.g. [52–54]). This suggests that typically the ribosome density on *S. cerevisiae* mRNAs is ‘very close’ to its steady-state value.

### 4.4. Analysis of the steady state

It is important to understand how the steady-state density *e* depends on the parameters of the RFMEO. To study this, we begin by deriving some equations for *e*. At steady state (i.e. for *x* = *e*), the left-hand side of all the equations in (3.3) is zero (i.e. , *i* = 1, …, *n*), so
where *z*_{j} : = 0, for all *j* > *n*. This yields (see (4.4))
4.8

We can express these in terms of (the generally unknown value) *R* as
4.9and this yields
and

### Example 4.9.

Consider the RFMEO with dimension *n* = 6 and particle size ℓ = 3. Then, the steady-state production rate *R* satisfies
where

It is clear that solving (4.8) is in general non-trivial. Nevertheless, it can be solved in closed-form in some very special cases. The next example demonstrates this.

### Example 4.10.

Consider a RFMEO with *n* sites and with ribosome size ℓ = *n*. Then (4.8) becomes
4.10and this yields
4.11and
4.12where . To understand this, assume in addition that λ_{0} = … = λ_{n} = λ_{c}, i.e. all the rates are equal with λ_{c} denoting the common value. Then (4.11) and (4.12) yield *e*_{i} = 1/(*n* + 1) for all *i*, and
4.13This means that when the ribosome size is equal to the chain size and all the rates are equal then the steady-state density at each site is identical. This makes sense, as every ribosome covers all the sites in the chain.

Another tractable case is when ℓ = *n* − 1 and λ_{0} = · · · = λ_{n} = λ_{c}. Then (4.8) yields
4.14and this admits the solution
and
4.15Note that here *e*_{1} > *e*_{2} = *e*_{3} = · · · = *e*_{n}. This makes sense, because if there is a ribosome with reader at site ≥ 2 then the tail of this (*n* − 1)-sites long ribosome is either at site *n* or already out of the chain, and so there is no hindrance for its movement.

Equation (4.8) can also be used to prove theoretical results. The next result shows that increasing any of the λ_{i}s increases *R*. In other words, increasing any of the transition rates along the mRNA molecule increases the steady-state protein production rate.

### Proposition 4.11.

*Consider the RFMEO with dimension* *n* *and particle size* ℓ. *Then* (∂/∂λ_{i})*R* > 0*, for* *i* = 0, …, *n*.

In the special case where *all* the rates are equal, i.e.
4.16where λ_{q} denotes the common value, we refer to the RFMEO as the *totally homogeneous RFMEO* (THRFMEO). In this case, it is possible to say more about the steady-state occupancies.

### Proposition 4.12.

*Consider the THRFMEO with dimension* *n* *and particle size* ℓ. *Then*
4.17

This means that the steady-state reader occupancies monotonically decrease between sites 1 and (*n* − ℓ + 1) and are equal at the last ℓ sites. This may partially explain the decrease in ribosome density observed along the coding sequences from the 5′ end to the 3′ end (e.g. [19,55]).

### Example 4.13.

The steady-state reader occupancy levels of the RFMEO with dimension *n* = 40 are depicted in figure 7 for three particle sizes: ℓ = 1 (corresponding to the RFM), ℓ = 2 and ℓ = 3. It may be observed that the steady-state reader occupancies monotonically decrease along the chain until the last ℓ densities that are equal. The corresponding steady-state production rates are *R* = 0.2513 for ℓ = 1; *R* = 0.1265 for ℓ = 2; and *R* = 0.0851 for ℓ = 3.

#### 4.4.1. Effect of particle size

It is interesting to analyse how the steady-state occupancies and production rate depend on the particle size ℓ. One might naturally expect the steady-state production rate in the RFMEO to decrease as the particle size ℓ increases. Indeed, roughly speaking one may think of increasing the particle size as replacing small cars travelling along a unidirectional traffic lane with large trucks thus leading to more congestion. This is demonstrated by the next example.

### Example 4.14.

The steady-state reader occupancy levels in the RFMEO with dimension *n* = 60, and rates λ_{0} = · · · = λ_{40} = 1 and , for four different particle sizes ℓ = 1 (i.e. the RFM), ℓ = 2, ℓ = 4 and ℓ = 8 are depicted in figure 8. Note that the steady-state occupancy levels decrease with ℓ. The transition rates here decrease from the value 1 to at site 40. Thus, we expect to see a ‘traffic jam’ of ribosomes before this site. For ℓ = 1 this is indeed what happens. However, for ℓ > 1 much more complicated patterns evolve. The steady-state densities follow a complicated quasi-periodic behaviour, with period ℓ, even though there is no such periodicity in the rates.

### Example 4.15.

Figure 9 depicts the steady-state production rate *R*, the steady-state mean reader occupancy *ρ*, and the steady-state mean coverage occupancy *ρ*^{c} as a function of the particle size ℓ, for a THRFMEO with dimension *n* = 100 and λ_{q} = 1. It can be observed that the steady-state production rate and the mean reader occupancy decrease with ℓ, whereas the steady-state mean coverage occupancy increases with ℓ.

It is interesting to compare these results to the homogeneous TASEPEO. In the thermodynamical limit (i.e. as *N* → ∞), the homogeneous TASEPEO with particle size ℓ, and with *α* = *β* = 1 is in the maximal current phase, where the steady-state output rate is , the mean reader density is and the mean coverage density is [22,24]. Note that this implies that as ℓ goes to infinity the current and the mean reader density go to zero, whereas the mean coverage density goes to one. This is consistent with the results for the THRFMEO depicted in figure 9.

Figure 10 depicts the steady-state production rate *R* as a function of the particle size ℓ for a RFMEO with *n* = 100, λ_{0} = 0.1 and λ_{i} = 1, *i* = 1, …, 100. In this case, λ_{0} is the rate limiting factor, and thus fewer ‘traffic jams’ occur relative to the case λ_{0} = 1. It may be seen that also in this case *R* monotonically decreases with ℓ.

The next result shows that for fixed rates the steady-state production rate in the RFMEO with ℓ > 1 is always smaller than the steady-state production rate in the RFMEO with ℓ = 1 (i.e. the RFM).

### Proposition 4.16.

*Consider an RFMEO with dimension* *n*, *particle size* ℓ > 1, *and rates* λ_{i}, *i* = 0, …, *n*, *admitting a steady-state production rate* *R*. *Consider also an RFM with the same dimension* *n*, *and the same rates* λ_{i}, *i* = 0, …, *n*, *admitting a steady-state production rate* . *Then* .

In many organisms, longer genes have lower protein levels [2,56]. There are many explanations and variables that may contribute to this correlation. However, is it possible that the relations between particle size and translation rate may have a (small) contribution to this correlation? It is possible that longer coding regions are related to longer proteins emerging from the ribosome during translation thus practically increasing the effective ribosome size. This hypothesis may be studied in synthetic systems in the future.

Surprisingly, however, increasing ℓ does not always lead to a reduction in the production rate.

### Example 4.17.

Consider an RFMEO with dimension *n* = 3, and rates
We consider two cases ℓ = 2 and ℓ = 3, and for the sake of clarity we denote the steady-state values in the latter case by overbars. For ℓ = 2,
For ℓ = 3,
Thus, here increasing ℓ from 2 to 3 *increases* the production rate. To explain this, recall that in general increasing ℓ decreases the steady-state reader densities (figures 7 and 8). This is also what happens here. Indeed,
At steady state, the entry rate into the chain is equal to the production rate, so *R* = λ_{0}(1 − *e*_{1} − *e*_{2}) and . Since this is proportional to one minus the sum of densities, . Thus, in this particular case the increase in ℓ yields an *increase* in the production rate.

Similarly, it follows from (4.13) and (4.15) that for any *n* > 3 increasing ℓ from *n* − 1 to *n* in the THRFMEO leads to an increase in the steady-state production rate.

### 4.5. Entrainment

Assume now that some or all of the transition rates λ_{i} are not constants, but time-varying *periodic* functions with a common period *T*. In the context of translation, this corresponds for example, to the case where the tRNA abundances vary in a periodic manner, with a common period *T*. More precisely, we say that a function *f* is *T*-periodic if *f*(*t* + *T*) = *f*(*t*) for all *t*. Assume that the transition rates are time-varying functions satisfying:

(1) There exist such that 0 <

*δ*_{1}≤ λ_{i}(*t*) ≤*δ*_{2}, for all*i*= 0, …,*n*, and all*t*≥ 0.(2) There exists a minimal

*T*> 0 such that every λ_{i}(*t*) is a*T*-periodic function.

We refer to this case as the *periodic RFMEO* (PRFMEO). Note that the PRFMEO includes in particular the case where some of the rates are constant, as a constant function is *T*-periodic for every *T*. However, condition (2) above implies that the case where all the rates are constant is ruled out, as then the minimal *T* is zero. Indeed, this case is just the RFMEO analysed above.

The next result follows from combining the fact that the RFMEO is SOST on *H* with known results on the entrainment of contractive systems to a periodic excitation (e.g. [47]).

### Theorem 4.18.

*The PRFMEO admits a unique function* , *that is* *T-periodic*, *and for any* *a*∈*H the trajectory* *x*(*t*, *a*) *converges to* *ϕ* *as* *t* → ∞.

In other words, the PRFMEO admits a unique periodic solution, with period *T*, and every trajectory of the PRFMEO converges to this periodic solution. This means that the densities along the mRNA, and thus also the production rate entrain to the periodic excitation induced by the transition rates.

As a side note, we point that the RFMEO can also be used to model vehicular traffic. If traffic lights that change periodically produce the transition rates then theorem 4.18 implies that the traffic density converges to a periodic pattern with the same period, i.e. the ‘green wave’ concept (e.g. [57]).

The next example illustrates the dynamical behaviour of the PRFMEO.

### Example 4.19.

Consider a PRFMEO with dimension *n* = 4, particle size ℓ = 2, and transition rates
Note that all the rates here are periodic, with a minimal common period *T* = 8. Figure 11 depicts *x*_{i}(*t*), *i* = 1, …, 4, as a function of *t* for the initial condition *x*(0) = [0.2 0.2 0.2 0.2]′. It may be seen that each state variable converges to a periodic function with period *T* = 8.

### 4.6. Rate limiting steps in the ribosome flow model and the ribosome flow model with extended objects

It has been shown that depending on the biological conditions and the specific organism both initiation and elongation may be rate limiting [8,9,12,58–61]. Since the RFMEO is a better model for biological translation than the RFM, it is interesting to study the rate limiting step in these two models. We now show that the transition from the low density phase, when initiation is rate limiting, to the high density phase, when elongation is rate limiting, is different in the two models: in the RFMEO this transition will take place for a lower initiation rate.

We modelled four *S. cerevisiae* genes: YMR123W, YNL303W, YJR094W-A and YBL094C using both an RFMEO with ℓ = 10 and an RFM, and considered the steady-state production rate and the steady-state mean density as a function of the initiation rate λ_{0}.

As was done in example 4.8, the elongation rate λ_{i} at each site, for both the RFMEO and the RFM, was estimated using ribo-seq data for the codon decoding rates normalized so that the median elongation rate of all *S. cerevisiae* mRNAs becomes 6.4 codons s^{−1}. The site rate is simply the corresponding codon rate. These rates thus depend on various factors including availability of tRNA molecules, amino acids, aminoacyl tRNA synthetase activity and concentration, and local mRNA folding [1,50,58].

Figure 12 depicts the steady-state production rate as a function of λ_{0} for the four *S. cerevisiae* genes for both the RFMEO (figure 12*a*) and the RFM (figure 12*b*). It may be seen that the transition from an initiation rate limiting stage to an elongation rate limiting stage occurs for a lower initiation value in the RFMEO as compared to the RFM.

Figure 13 depicts the steady-state mean density as a function of λ_{0} for the four genes and two models. Again, it can be seen that the transition from an initiation rate limiting stage to the elongation rate limiting stage occurs at lower initiation value in the RFMEO as compared to the RFM. This holds for all four genes.

## 5. High correlation between ribosome flow model with extended objects and totally asymmetric simple exclusion process with extended objects

In this section, we show that the RFMEO correlates better with TASEPEO than the RFM, supporting the modelling of intracellular processes with multi-site biological machines such as translation and transcription using the RFMEO.

The simulations of TASEPEO with dimension *N*, rates *μ* (see (1.1)), and particle size ℓ use a parallel update mode. At each time tick *t*_{k}, the sites along the lattice are scanned from site *N* backwards to site 1. If it is time to hop, and the site that is ℓ sites in front is empty then the reader advances to the consecutive site. If the site that is ℓ sites in front is occupied, the next hopping time, *t*_{k} + *ɛ*_{k}, is generated randomly. For site *i*, *ɛ*_{k} is exponentially distributed with parameter (1/*μ*_{i+1}) (see (1.1)). The occupancy at each site is averaged throughout the simulation, with the first 700 000 cycles discarded in order to obtain the steady-state value. We use to denote the steady-state reader density, *J*: = *β*ϱ_{N} to denote the steady-state current (or output rate), and for the steady-state mean reader density.

In the examples below, we numerically calculated the Pearson correlation coefficients between the steady states of the RFMEO, TASEPEO and RFM.

### Example 5.1.

Consider the RFMEO with dimension *n* = 75, and transition rates λ_{0} = · · · = λ_{75} = 1. Let denote the steady-state density of an RFM with the same dimension and rates. We also simulated TASEPEO with dimension *N* = 75 and rates *μ* = λ. Figure 14 depicts the Pearson correlation coefficient *r*(*e*, ϱ) between the steady-state reader densities of the RFMEO and the TASEPEO, and the Pearson correlation coefficient between the steady-state reader densities of the RFM and TASEPEO, as a function of ℓ∈{1, …, 30}. The corresponding *p*-values were all less than 10^{−50}. It may be seen that *r*(*e*, ϱ) and are somewhat similar for ℓ∈{1, …, 5}, however, for all ℓ > 5, *r*(*e*, ϱ) > 0.94 whereas decreases with ℓ, and is equal to about 0.825 for ℓ = 30. Of course, this makes sense as the RFMEO is a mean field approximation of TASEPO.

The following examples consider the case of non-homogeneous transition rates.

### Example 5.2.

Consider the RFMEO with dimension *n* = 40, particle size ℓ = 15, and rates λ_{i} = 1 + 0.3 sin(2*πi*/41), *i* = 0, …, 40. Figure 15 depicts the RFMEO steady-state reader density *e*, the TASEPEO steady-state reader density ϱ for *μ* = λ, and particle size 15, and the steady-state density in the RFM with the same dimension and rates. It can be seen that *e* provides a far better estimate of ϱ than .

In order to verify that the high correlation between RFMEO and TASEPO holds for a large set of parameters, we also simulated the case where the rates are drawn randomly.

### Example 5.3.

Consider the RFMEO with dimension *n* = 100, particle size ℓ = 10, and rates
5.1where is a random variable uniformly distributed in the interval . We compared the steady-state production rates of this RFMEO with those of the corresponding TASEPO and with two RFMs. One RFM with the same dimension and rates. Another RFM, that we refer to as RFM10, is an approximation of the chain with 10 ‘codons’ per site. Thus, it has dimension , where each site contains 10 consecutive sites of the RFMEO (other than the last site which contains the last 11 consecutive sites of the RFMEO). The rates of RFM10 are , where *T*_{i} = (10(*i* + 1) − 1) if *i* < 9, and otherwise *T*_{i} = 100. Note that since the dimension of this RFM10 is nine, it cannot be used to estimate the entire density profile of the TASEPEO with dimension 100.

We ran 300 tests, where in each test a new set of rates were drawn according to (5.1). Figure 16 depicts the correlation between the steady-state production rates of (i) RFMEO and TASEPEO, (ii) RFM (i.e. RFM with dimension 100 and rates λ_{i}) and TASEPEO, and (iii) RFM10 and TASEPEO, over the 300 tests. It may be seen that the RFMEO provides the best correlation with TASEPEO.

Figure 17 depicts the correlations between the steady-state mean densities for the same three cases. It may be seen that again the correlation between the RFMEO and TASEPEO is high (*r* ≃ 0.927). The correlation between the RFM10 and TASEPEO is slightly better (*r* ≃ 0.944); however, as stated above, RFM10 cannot be used to provide an estimate to the actual (per codon) density profile.

## 6. Discussion

We studied a deterministic mechanistic model for mRNA translation, the RFMEO, that encapsulates many realistic features of this biological process including the fact that every ribosome covers several codons and that ribosomes cannot overtake one another.

The RFMEO is a mean-field approximation of TASEPEO (see electronic supplementary material, appendix B) and, as demonstrated above, its simulation results often correlate well with those of TASEPEO. However, unlike TASEPEO, the RFMEO is amenable to rigorous analysis using tools from systems and control theory.

We proved that the RFMEO converges to a unique steady-state density and steady-state production rate for any set of feasible transition rates. We follow the terminology used in physics, where an equilibrium point (steady state) is characterized by a zero (constant but non-zero) total flow of energy [62]. The convergence to this unique steady state takes place at an exponential rate. In this respect, the RFMEO is robust to the initial conditions.

One may naturally ask whether biological systems are at steady state (that may be more general than the steady state here, e.g. a periodic trajectory). Models with a steady state (or several of steady states) have been found to be useful in numerous studies in systems biology (e.g. [63] and references therein). In practice the state of the art routine biological experiments and their interpretation assume steady state as they are performed in a very specific experimental environment which is kept constant during the entire experiment (e.g. [64–66]).

In particular, the steady state in the RFM has been used to accurately predict several features of gene expression (e.g. [9,30,54]). Here, we used the RFMEO to model a highly expressed *S. cerevisiae* gene. The rates were estimated based on biological data. In the resulting RFMEO the convergence to a state close to the steady state takes approximately 30 s, whereas the mRNA half-life is of the order of tens of minutes. This suggests that at least in this case the steady-state assumption is justified.

An important question is how does the steady state depend on the RFMEO parameters. We proved that increasing any of the RFMEO rates can only increase the steady-state production rate and that in the totally homogeneous case (i.e. when all the rates are equal) the reader ribosomal density monotonically decreases along the mRNA. In addition, we proved that if one or more of the RFMEO rates are time-varying periodic functions, with a common period *T*, then the densities along the mRNA and thus also the production rate converge to a periodic solution with period *T*.

The results reported here can shed light on various biophysical aspects of translation, and can be further studied experimentally. For example, our analysis suggests that higher decoding rates at the last ℓ codons of the coding region can be expected (since in this region no downstream ribosome can block the ribosome movement). This can be validated experimentally for example based on approaches that track the movement of ribosomes at high resolution [67].

In addition, analysis and simulations of the RFMEO demonstrate several surprising and counterintuitive results. For example, increasing the particle size ℓ (i.e. the ribosome footprint) may sometimes lead to an increase in the production rate. Also, for large ℓ the steady-state density along the mRNA may be quite complex (e.g. with quasi-periodic patterns) even for relatively simple (and non-periodic) transition rates. It will be interesting to see if similar patterns are observed experimentally by possibly engineering the codon elongation rates of heterologous or endogenous genes and monitoring translation [19,67].

We believe that the RFMEO could be useful for modelling, understanding and re-engineering translation. Specifically, the advantages of the model mentioned above should make it a better candidate than other alternative models for solving some of the open questions in the field [9].

An important topic for future research is using the RFMEO to model ribosome flow based on biological data. This is a challenging task, as many aspects of translation are still not clear. For example, translation initiation is affected by complex phenomena such as the number of free ribosomes, mRNA folding near the 5′ end of the mRNA, untranslated region length and other features, the nucleotide composition surrounding the start codon and more. In addition, current techniques for measuring ribosome densities provide partial, noisy and biased data (e.g. [68]). Thus, using them to estimate the parameters in a computational model like the RFMEO is a non-trivial challenge.

Another research topic is using the RFMEO (and networks of RFMEOs) to study various phenomena such as competition for resources in mRNA translation [9,43], transcription [27] and evolution of transcripts [9].

## Data accessibility

No experimental data were generated in this study. All the proofs are provided as the electronic supplementary material.

## Authors' contributions

All authors developed the ideas, and contributed to writing and editing of the manuscript.

## Competing interests

We declare we have no competing interests.

## Funding

This research is partially supported by research grants from the Israeli Science Foundation, the Israeli Ministry of Science, Technology & Space, the Edmond J. Safra Center for Bioinformatics at Tel Aviv University and the US-Israel Binational Science Foundation.

## Acknowledgments

We are grateful to the anonymous referees for their comments that greatly helped in improving this paper.

## Footnotes

Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3894937.

- Received February 21, 2017.
- Accepted September 18, 2017.

- © 2017 The Author(s)

Published by the Royal Society. All rights reserved.