## Abstract

Eukaryotic mRNAs usually form a circular structure; thus, ribosomes that terminatae translation at the 3′ end can diffuse with increased probability to the 5′ end of the transcript, initiating another cycle of translation. This phenomenon describes ribosomal flow with positive feedback—an increase in the flow of ribosomes terminating translating the open reading frame increases the ribosomal initiation rate. The aim of this paper is to model and rigorously analyse translation with feedback. We suggest a modified version of the ribosome flow model, called the *ribosome flow model with input and output*. In this model, the input is the initiation rate and the output is the translation rate. We analyse this model after closing the loop with a positive linear feedback. We show that the closed-loop system admits a unique globally asymptotically stable equilibrium point. From a biophysical point of view, this means that there exists a unique steady state of ribosome distributions along the mRNA, and thus a unique steady-state translation rate. The solution from any initial distribution will converge to this steady state. The steady-state distribution demonstrates a decrease in ribosome density along the coding sequence. For the case of constant elongation rates, we obtain expressions relating the model parameters to the equilibrium point. These results may perhaps be used to re-engineer the biological system in order to obtain a desired translation rate.

## 1. Introduction

Translation is one of the stages in the flow of gene expression, i.e. the process in which cells produce proteins based on the information encoded in the DNA and mRNA sequences. During translation, ribosomes ‘read’ the mRNA sequence and translate it into a corresponding amino acid chain. Translation occurs in all organisms, in almost all the cells, and under almost all conditions; thus, developing a better understanding of this complex process has numerous potential applications in medicine, biotechnology, systems biology, functional genomics and more.

The last few years have witnessed a surge of new empirical techniques for studying gene expression and specifically the translation mechanism. *Ribosome profiling* [1] provides a quantitative snapshot of the protein synthesis process by ribosomes, including precise measurements of ribosomal positions. Techniques for measuring absolute protein levels [2–6] and fractionation of mRNA–ribosome complexes (polysomes) allow the number of ribosomes on each mRNA to be estimated [1,7]. Other technological advances allow large-scale measurements of mRNA half-lives [8], protein half-lives [9], protein translation rates and mRNA transcription rates [10].

These methods produce vast amounts of data. While the amount of data increases at an exponential rate, the computational approaches for analysing these data seem to evolve at a slower pace. Mathematical and computational models of translation are becoming increasingly more important because of the need to better organize and understand these data.

A standard mathematical model for translation elongation is the *totally asymmetric simple exclusion process* (TASEP) [11]. TASEP was introduced by MacDonald *et al*. [12], and independently by Spitzer [13]. TASEP is a stochastic model for particle movement along some kind of ‘tracks’ or ‘trails’. A lattice of sites models the tracks, and particles hop, with some probability, from one site to a neighbouring one. An important property of this model is that hops may take place only to a target site that is not already occupied by another particle. Hence the term *simple exclusion*. The term *totally asymmetric* is used when the motion along the lattice is unidirectional.

TASEP models have been used to study a large number of biological systems, ranging from extracellular transport and gene transcription or translation to pedestrian dynamics [14]. Despite its rather simple description, it seems that rigorous analysis of the TASEP is non-trivial.

Reuveni *et al.* [15] recently introduced a deterministic model for translation called the *ribosome flow model* (RFM; figure 1). In the RFM, mRNA molecules are coarse-grained into a unidirectional chain of *n* sites of codons. The value of *n* may depend, for example, on geometrical properties of the ribosome. Ribosomes reach the first site with initiation rate *λ* > 0, but are only able to bind if this site is not already occupied by another ribosome. In practice, the initiation rate is a function of physical features such as the number of available free ribosomes, the folding energy of the 5′ untranslated regions (UTRs), the folding energy at the beginning of the coding sequence, the base pairing potential between the 5′ UTR and the ribosomal rRNA, and the context of the START ATG [15–19]. A ribosome that occupies site *i* moves, with transition rate *λ*_{i} > 0, to the consecutive site, if this site is not already occupied by another ribosome. Thus, the RFM captures both the simple exclusion and totally asymmetric properties of the TASEP. The transition rates *λ*_{i} depend on various features of the transcript [18] (see also the Methods section in [15]).

The simulations in the study of Reuveni *et al*. [15] show that, for suitable parameter values, TASEP and the RFM lead to similar predictions. For example, the correlation between their predictions over the set of endogenous genes of *Saccharomyces cerevisiae* is 0.96.

The RFM with *n* sites is given by
1.1Here, describes how occupied site *i* is, where the value 1 [0] means that the site is completely occupied [completely free]. The rate of ribosome flow into [out of] the system is given by *λ*(1 − *x*_{1}(*t*)) [*λ*_{n}*x*_{n}(*t*)], whereas the rate of ribosome flow from site *i* to site *i* + 1 is given by *λ*_{i}*x*_{i}(*t*)(1 − *x _{i}*

_{+}

_{1}(

*t*)).

Since the state variables correspond to normalized occupation levels, we consider initial conditions *x*(0) in the closed unit cube

In the study of Margaliot & Tuller [20], we showed that there exists a unique equilibrium point , where
is the interior of *C*, and that any solution of the RFM, with , converges to *e*. The biophysical interpretation is that the parameters *λ*, *λ*_{i} uniquely determine a steady state of ribosome distributions (and thus translation rate), and that perturbations in this distribution will not change the asymptotic behaviour of the dynamics. In particular, this explains why the simulations of the RFM in Reuveni *et al.* [15] converged to the same final state regardless of the initial condition. Changing the values of the positive paraments *λ*, *λ*_{i} will not change this qualitative behaviour; however, it will change *e*, i.e. the distributions and the translation rates at the steady state.

Determining the relationship between the RFM parameters and *e* is an important and difficult problem. It has been shown [21] that in some cases the transition rate along genes is constant, so the translation efficiencies of all the codons are identical. This happens, for example, when the rate is limited by the concentration of elongation factors and not by the local features of the coding sequence, such as tRNA molecules or when there is a balance between the codon frequencies and tRNA levels [22]. Margaliot & Tuller [23] considered the RFM in the special case where
i.e. the transition rates *λ*_{i} are all equal, and *λ*_{c} denotes their common value. Since this *homogeneous ribosome flow model* (HRFM) includes only two parameters, *λ* and *λ*_{c}, the analysis is simplified. The results in Margaliot & Tuller [23] show that the dependence of *e* on the parameters may be separated into two different regimes. The transition between these two regimes takes place when
1.2This is reminiscent of the phase transition that takes place in the TASEP [11]. For the HRFM, it is also possible to derive an explicit expression for *e* = (*e*_{1}, *…* ,*e _{n}*)

*‘*at the limit of very high initiation rate, namely, 1.3

The steady-state translation rate when *λ* → *∞* is thus
1.4This implies that, when the initiation rate is very high, the translation rate in the HRFM depends on the coding sequence, i.e. the length *n* of the sequence and the translation rate *λ*_{c}, and that elongation becomes the rate-limiting step. Note that, as *n* goes to infinity, *R* in (1.4) goes to *λ*_{c}/4. One may define the *capacity* of a gene as its maximal translation rate. Thus, in the case of the HRFM, we have a closed-form formula for the capacity. These results may be applied when analysing translation rates in mouse embryonic stem cells, as in this case translation rates are usually constant [21].

In this paper, we consider for the first time the RFM as a *control system*, with the initiation rate as the input, and the translation rate *R*(*t*) = *λ*_{n}*x*_{n}(*t*) as the output. We analyse the behaviour of the RFM after closing the loop from output to input with positive linear feedback. The motivation for this stems from the fact that in eukaryotic translation the poly(A)-binding protein binds both the poly(A) tail at the 3′ end of the transcript and eIF4G at the 5′ end of the transcript, promoting mRNA circularization. Thus, if the UTRs of the transcript are short enough, the ribosomes that terminate translation at the 3′ end can diffuse with increased probability to the 5′ end of the transcript, re-initiating another cycle of translation.

Our results show that several nice properties of the RFM and the HRFM hold also for the closed-loop system. In particular, there exists a unique globally asymptotically stable equilibrium point *e*. For the case of equal *λ*_{i}s, we obtain closed-form expressions relating the closed-loop system parameters with *e*. Our results may be used to explain experimental data and, in the context of synthetic biology, to re-engineer the biological translation process in order to obtain a desired synthesis rate.

The remainder of this paper is organized as follows. In §2, we describe the ribosome flow model with input and output (RFMIO) with feedback as a model for the eukaryotic translation process. Section 3 states our main results. The proofs, given in §5, use some known tools that are reviewed in §4. The final section includes a comparison between our model and the TASEP, and describes some open questions for further study.

## 2. Ribosome flow model with input and output with feedback

A typical eukaryotic mRNA includes, in addition to the open reading frame (ORF), also the 5′ and 3′ UTRs (figure 2) [24,25]; it also includes a cap structure (called RNA 7-methylguanosine cap, or *m7G* cap), which is a special tag added to the 5′ end of the mRNA; the cap is a terminal 7-methylguanosine residue that is linked to the first transcribed nucleotide. In addition, the mRNA includes the ORF (i.e. the codons that are translated to the protein), and the poly(A) tail at the 3′ end of the mRNA.

In eukaryotes, mRNA molecules usually form *circular* structures; a simplified schematic of the closed-loop model of translation initiation appears in figure 2 [26,27]. In this structure, the *eIF4F* complex (which includes a few proteins, *eIF4E* complexed with *eIF4G* and *eIF4A*) interacts with both the 5′ end of the mRNA (via the protein *eIF4E*) and the poly(A) tail (via the poly(A) binding protein, *PABP*) and recruits the 40S ribosomal subunit (i.e. the small ribosomal unit) via its interaction with the protein eIF3. The small ribosomal subunit is part of the 43S pre-initiation complex (which includes the 40S ribosomal subunit loaded with the proteins *eIF3*, *eIF1* and *eIF1A*, the initiator *Met-tRNAiMet*, *eIF2* and *GTP* binds the *eIF4F*–mRNA complex) that usually scans along the mRNA molecule from the 5′ end of the 5′ UTR towards the beginning ORF until it reaches the start codon (usually an AUG triplet). Circularization is thought to promote cycling of the ribosomal subunits (figure 2), leading to time-efficient translation, and may also function to ensure that only intact mRNA molecules are translated (partially degraded mRNA characteristically has no *m7G cap* and no poly(A) tail) [28].

The poly(A) tail protects the mRNA molecule from enzymatic degradation in the cytoplasm and aids in: transcription termination, export of the mRNA from the nucleus, and translation [29]; thus, almost all eukaryotic mRNAs are polyadenylated [30] and transcripts lacking the poly(A) tail are expected to have improper regulation.

Summarizing, it is biologically reasonable to consider the translation initiation rate to be affected by the translation termination rate. Mathematically, this can be done as follows. Consider the *RFMIO*
2.1where is a non-negative function of time, and then ‘close the loop’ by letting
2.2Here, *k*_{1} represents diffusion of ribosomes to the 5′ UTRs that is not related to recycling of ribosomes. In practice, this parameter depends on biochemical features such as the number of available ribosomal subunits and mRNA molecules in the cell, and the properties of the 5′ UTR.

The term *k*_{2}*y*(*t*) represents the recycling/reinitation of ribosomes that have already finished translating the mRNA (and is thus proportional to the output rate *y*(*t*)). The value of *k*_{2} depends on various biophysical properties of the mRNA sequence affecting the diffusion of ribosomes that terminated translation at the 3′ end back to the 5′ end.

In this paper, we study the closed-loop system, obtained by combining (2.1) and (2.2), i.e. 2.3

The next examples illustrate the dynamic behaviour of this closed-loop system. In order to allow the trajectories to be plotted, we consider examples with *n* = 3.

### Example 2.1.

Figure 3 depicts the trajectories of the closed-loop system (2.3), with *n* = 3, *λ*_{1} = *λ*_{2} = *λ*_{3} = 1, *k*_{1} = 1 and *k*_{2} = 100, for three initial conditions in *C*. It may be seen that all three trajectories converge to an equilibrium point
2.4(all numerical values are to four-digit accuracy).

### Example 2.2.

Consider (2.3) with *n* = 3, *λ*_{1} = *λ*_{2} = *λ*_{3} = 1 and *k*_{1} = 0.01. The simulations indicate that the closed-loop system admits a unique globally attractive equilibrium point for all *k*_{2} > 0. Figure 4 depicts *e*_{3} as a function of . It may be seen that *e*_{3} increases with *k*_{2}. For small values of *k*_{2}, the increase rate is very low. The reason for this is that *k*_{1} is close to zero, and when *k*_{1} = *k*_{2} = 0 the closed-loop system admits an equilibrium point at 0. The slow increase corresponds to ‘escaping’ from this equilibrium point. For larger values of *k*_{2}, the increase of *e*_{3} is more or less linear in *k*_{2}, and then the graph ‘flattens’ and the limit exists.

If is an equilibrium of (2.3) then the right-hand side of every equation in (2.3) must be zero, so
2.5If *e _{n}* = 0, then this implies that

*e*= 0 for all

_{i}*i*, and

*k*

_{1}= 0, which is a contradiction to our assumption that

*k*

_{1}> 0. Thus, , and then 2.6and 2.7Note that (2.6) implies that and, more generally, for all

*i*≥ 1, 2.8where on the right-hand side the term

*e*appears a total of

_{n}*i*+ 1 times. This implies in particular that

*e*.

_{n}uniquely determines eIt is important to note that the closed-loop system considered here is quite different from the RFM. Indeed, the RFM is an open-loop system with a constant initiation rate, whereas (2.3) includes a feedback connection from *x _{n}* to

*x*

_{1}. From a biological point of view, the closed-loop system is a more accurate description of translation in eukaryotes, as it includes recycling of ribosomal subunits that have not been analysed mathematically before. In fact, the closed-loop system is a generalization of the original RFM, as it includes both a term related to ribosomal recycling and a term related to ribosomal diffusion without recycling.

## 3. Main results

As will be shown in §5, the closed-loop system is a *monotone dynamical system*. Roughly speaking, *bounded* trajectories of such systems have a ‘simple’ behaviour: they almost always converge to an equilibrium point.

### 3.1. Bounded trajectories

The next result shows that *C* is an *invariant set* of the closed-loop dynamics so, in particular, trajectories that emanate from *C* remain bounded for all *t* ≥ 0. Let *∂C* = *C\*Int(*C*), i.e. the boundary of *C*. We use *x*(*t*,*a*) to denote the solution of (2.3) at time *t* for the initial condition *x*(0) = *a*.

### Proposition 3.1.

*For every* , *the solution of the closed-loop system satisfies* *for all t* ≥ 0*. Furthermore, for every* *there exists a time ε* > 0

*such that*.

Section 3.2 shows that all these bounded trajectories converge to a unique equilibrium point.

### 3.2. Equilibrium point and stability

### Theorem 3.2.

*The set C includes a unique equilibrium point e of the closed-loop system* (*2.3*)*. This equilibrium point is globally asymptotically stable in C*.

This implies that , for all , so a simulation of the closed-loop system from any initial condition in *C* will converge to the same final state.

From a biophysical point of view, this means that the closed-loop system admits a unique steady state of ribosome distributions along the mRNA, and thus a unique steady-state translation rate. Perturbations in the distribution of ribosomes due, for example, to events that are assumed to be rare, such as ribosomal drop-off or internal initiation, will not change the asymptotic behaviour of the translation dynamics. It will still converge to the same distribution of ribosomes and the same translation rates.

When the gain *k*_{2} is sufficiently small, more can be said about what happens until convergence. Recall that the *L*_{1} norm of a vector is .

### Theorem 3.3.

*Consider the closed-loop system* (*2.3*)*. If* *then*
3.1*for all a,* .

In other words, the convergence is monotone in the sense that the (*L*_{1}) distance between trajectories is always bounded by the distance between their initial conditions. Taking *b* = *e* in (3.1) yields
i.e. the convergence to *e* is monotone, as the distance to *e* can never increase.

Note that combining proposition 3.1 and theorem 3.2 implies that , so for all *i*. We already saw that *e _{n}* determines

*e*, i.e. that there exist functions

*g*: (0, 1) → (0, 1) such that It is natural to consider the following question. Suppose that we fix the

_{i}*λ*

_{i}s, and change the control parameters (

*k*

_{1},

*k*

_{2}), thus changing

*e*. How will this affect the other coordinates

_{n}*e*? The next result shows that if

_{i}*e*increases [decreases] then every

_{n}*e*increases [decreases].

_{i}### Proposition 3.4.

*The g _{i}*s

*satisfy*

*for all i*= 1, … ,

*n*− 1

*, and all*.

From a biophysical point of view, this implies that if a change in (*k*_{1}, *k*_{2}) yields an increase [decrease] in the translation rate, then it also yields an increase [decrease] in all the ribosome distributions along the mRNA.

For two vectors *a*, , we write *a* ≤ *b* if *a _{i}* ≤

*b*for

_{i}*i*= 1, … ,

*n*. We write

*a*<

*b*if

*a*≤

*b*and

*a*<

_{i}*b*for some

_{i}*i*, and we write if

*a*<

_{i}*b*for

_{i}*i*= 1, … ,

*n*.

The next result provides more information on the change in *e* induced by a change in the control parameters.

### Proposition 3.5.

*Suppose that the λ_{i}*s

*are fixed and let*(

*k*)

_{1}, k_{2}*and*

*be two sets of feasible control parameters. Denote by e and*

*the corresponding equilibrium points. Let*3.2

*If d* > *0 then* . *If* *d* = *0 then* . *If* *d* < *0 then* .

The case where we change only one control parameter follows immediately.

### Corollary 3.6.

*Suppose that the λ_{i}*s

*are fixed. Let e and*

*correspond to the control parameters*(

*k*)

_{1}, k_{2}*and*

*, respectively. If*

*then*

*if and only if*

*. If*

*then*

*if and only if*.

From a biophysical point of view, this implies the intuitive result that increasing any one of the positive feedback parameters leads to an increase in the ribosome distribution along the mRNA and thus to an increase in the translation rate.

It is important to obtain more explicit information on the relationship between *e* and the system parameters. This seems to be a non-trivial problem. However, in the special case of the *homogeneous ribosome flow model with input and output* (HRFMIO) the analysis becomes simpler.

### 3.3. Homogeneous ribosome flow model with input and output with feedback

From here on, we assume that
3.3with *λ*_{c} > 0. In this case, we refer to (2.3) as the HRFMIO with feedback.

The next result gives a closed-form expression for all the coordinates of *e* in terms of *e _{n}*.

### Proposition 3.7.

*Suppose that* (*3.3*) *holds. Let* . *If s* < 1*, let* *be such that* cos(*θ*) = *s. If s* > *1, let θ* > 0

*be such that*cosh(

*θ*) =

*s. Then*3.4

*for k*= 1, 2, … ,

*n*− 1.

### Example 3.8.

Consider the closed-loop system in example 3.1, i.e. with *n* = 3, *λ*_{c} = 1, *k*_{1} = 1 and *k*_{2} = 100. In this case, the simulation results show that *e* is as given in (2.4). In particular, *e _{n}* =

*e*

_{3}= 0.3809, so , and (3.4) yields and and this agrees with (2.4).

Since *e _{n}* determines all the coordinates of

*e*(see (3.4)), it is enough to consider how

*e*depends on the parameters

_{n}*λ*

_{c},

*k*

_{1}and

*k*

_{2}.

### Theorem 3.9.

*Suppose that* (*3.3*) *holds. Define s and θ as in proposition 3.7. Then*
3.5

An important goal in synthetic biology and biotechnology is to artificially engineer and control the expression levels of various genes. Equation (3.5) connects the flow from the last site of the mRNA (i.e. the protein production rate) with *k*_{1}, *k*_{2}. Thus, it may be used for reprogramming the expression levels of genes. For example, modifying the properties or structure of the 5′ UTR may be used to change the value of *k*_{1}, and thus change *e _{n}* (and consequently all the other

*e*s) and the translation rate. The next example demonstrates this.

_{i}### Example 3.10.

Consider the HRFMIO with *n* = 3 and *λ*_{c} = 1. Suppose that we are interested in determining parameter values such that *e*_{3} = 1/4. This corresponds to *s* = 1, so (3.5) becomes
Thus, we can take, for example, *k*_{1} = 3/20 and *k*_{2} = 1; these parameters correspond to the features of the UTRs and can be manipulated relatively easily. Solving (2.6) for these parameter values yields *e* = [3/8, 1/3, 1/4]’, so *e*_{3} = 1/4, as required.

Equation (3.5) implies that the dependence of *e* on the parameters has two different forms. The transition between these forms takes place when *s* = 1, i.e. when *e _{n}* = 1/4, and then (3.5) yields
3.6For example, if

*n*= 3,

*λ*

_{c}= 1 and

*k*

_{1}= 0.01, this implies that, for

*k*

_{2}= 1.56, we have

*e*

_{3}= 1/4 (so the steady-state translation rate is

*R*=

*λ*

_{c}

*e*= 1/4). This agrees with figure 4.

_{n}When either *k*_{1} → *∞* or *k*_{2} → *∞*, equation (3.5) implies that sin((*n* + 2)*θ*) must go to zero. Thus, *θ* → *π/*(*n* + 2), so the equation yields
Note that this implies that *e _{n}* is a monotonically decreasing function of

*n*. For example, for

*n*= 3 this yields and this agrees with figure 4.

From a biophysical point of view, the cases *k*_{1} → *∞* or *k*_{2} → *∞* correspond to the situation where the effective initiation rate is very high because of either general diffusion of ribosomes to the 5′ UTR (large *k*_{1}) or highly efficient recycling of ribosomes (large *k*_{2}). In this case, the steady-state translation rate
depends on the rate-limiting factor, which is the (uniform) elongation rate.

### Remark 3.11.

Note that (3.5) may be written as 3.7

This implies that, given a desired value *e _{n}*, any two control parameters (

*k*

_{1},

*k*

_{2}) and that satisfy the linear equation (3.7) lead to this same

*e*, and thus to the same

_{n}*e*. The next example demonstrates this.

### Example 3.12.

Suppose that *n* = 3, *λ*_{c} = 1, and that the desired value is *e*_{3} = 1/3. Then , and (3.5) becomes
Figure 5 depicts the trajectories of the closed system for (*k*_{1}, *k*_{2}) = (2/3, 1) and for for the same three initial conditions. It may be seen that the dynamics in these two cases is different (although similar), yet in both cases all the trajectories converge to the same equilibrium point, namely,
3.8

### Remark 3.13.

Consider the special case *k*_{2} = 0. In this case, *u*(*t*) ≡ *k*_{1}, so the closed-loop system becomes the HRFM with initiation rate *k*_{1}. Equation (3.5) becomes
3.9

This provides a closed-form expression for the mapping from *e _{n}* to

*k*

_{1}. The

*inverse*of this mapping, i.e. the mapping from the initiation rate

*k*

_{1}to

*e*, combined with (3.4), is the

_{n}*input-to-state characteristic*[31] of the HRFM with input

*u*and output

*y*=

*λ*

_{c}

*e*.

_{n}The next result shows that the steady-state distribution satisfies a kind of ‘traffic jam’ or ‘diffusion-like’ behaviour: as we move towards the end of the chain (i.e. from the 5′ end to the 3′ end of the ORF), the number of ribosomes decreases.

### Proposition 3.14.

*Suppose that* (*3.3*) *holds. The equilibrium point* *satisfies*
3.10

One may think of *e _{i}* as the steady-state ribosomal density at site

*i*. A higher value of

*e*implies a higher chance that site

_{i}*i*is occupied by a ribosome, and so higher values along some part of the chain corresponds to a ‘traffic jam’. The monotonic decrease of

*e*in (3.10) resembles the diffusion of particles (ribosomes in our case) from a region with higher particle density (5′ end of the ORF in our case) to a region with lower density (3′ end of the ORF in our case). These results seem to agree with experimental results for eukaryotes (i.e. usually containing mRNA circulation and thus feedback) showing that there is a decrease in ribosomal density from the 5′ end to the 3′ end of the ORF even when the elongation speed is constant [21]. It is interesting to note that even when the elongation speed is not constant there is an increase in ribosomal density at the beginning of genes [1] (see also [7,32]).

_{i}## 4. Preliminaries

In order to make this paper more self-contained, we briefly review some known results that will be used in the proofs of the main results.

### 4.1. Monotone dynamical systems

Let be an open set. Consider a system of *n* ordinary first-order differential equations
4.1where is continuously differentiable. For *t* ≥ 0 and , let *x*(*t*, *x*_{0}) denote the solution of (4.1) at time *t* for the initial condition *x*(0) = *x*_{0}.^{1}

### Definition 4.1.

The vector field is said to satisfy the Kamke condition if, for any two vectors *a*, satisfying *a* ≤ *b* and *a _{i}* =

*b*,

_{i}The Kamke condition implies that the flow of (4.1) is *monotone* in the following sense.

### Proposition 4.2.

([33]*, ch. 3*) *Let* <* _{r} denote any of the relations* ≤, < or .

*Suppose that the vector field f in*(

*4.1*)

*satisfies the Kamke condition. Then*

*implies that*

In other words, the flow preserves the ordering of the initial conditions for all *t* ≥ 0.

Let denote the Jacobian matrix of *f*, i.e. the matrix whose *ij*th entry is *∂f _{i}/∂x_{j}*. If

*Ω*is convex and 4.2then the Kamke condition holds [33].

Note that (4.2) implies that a positive change in *x _{j}* increases

*f*(

_{i}*x*). Since , this implies that different state variables augment the growth of each other. A system (4.1) that satisfies (4.2) is called a

*cooperative system*. If a cooperative system satisfies

*∂f*= 0 for all |

_{i}/∂x_{j}*i*−

*j*| > 1, then the system is said to be a

*tridiagonal cooperative system*[34]. If, furthermore,

*∂f*> 0 for

_{i}/∂x_{j}*|i*−

*j|*= 1, then the system is called a

*strongly cooperative tridiagonal system*(SCTS). Smillie [34] has shown that any

*bounded*trajectory of an SCTS must converge to an equilibrium point.

As noted in Margaliot & Tuller [20], the RFM (1.1) is an SCTS on the open unit cube Int(*C*). For example, since , with *f*_{2}(*x*) = *λ*_{1}*x*_{1}(1 − *x*_{2}) − *λ*_{2}*x*_{2}(1 − *x*_{3}), we have only for *i* = 1,2,3. Also, for every ,
and
However, the closed-loop system (2.3) considered in this paper is *not* tridiagonal, as *f*_{1}(*x*) depends on *x _{n}*.

### 4.2. Periodic continued fractions

Consider the finite continued fraction
4.3If *b _{i}* =

*b*and

*a*=

_{i}*a*for all

*i*, then (4.3) is (a special case of) a finite

*periodic*continued fraction [35]. In this case, there exists a closed-form expression for

*c*in terms of

*Chebyshev polynomials of the second kind*. These polynomials are defined by the recurrence relation ([36], ch. 1) 4.4

### Lemma 4.3.

[37] *Consider the continued fraction* (*4.3*) *with b _{i}* =

*1 and a*=

_{i}*a for all i, where a*>

*0. Then*

We list some properties of the *U _{i}*s that will be used below. For more details and proofs, see Mason & Handscomb [36]. For ,
4.5where is the angle such that cos(

*θ*) =

*z*. Similarly, for , 4.6where

*θ*is such that cosh(

*θ*) =

*z*.

### 4.3. Contraction principle

We briefly review a *contraction principle* that will be used in the proof of theorem 3.3. For more details, see Vidyasagar ([38], ch. 3), Russo *et al.* [39] and Lohmiller & Slotine [40]. Let be a vector norm. The induced matrix norm is
and the induced *matrix measure* is

Matrix measures have an important role in the stability analysis of dynamical systems. Indeed, consider the linear system . Then, up to a first-order approximation
so *|y*(*t* + *ε*)*|* ≤ *||I* + *εA|||y*(*t*)*|*, and
Hence,
where d^{+}/d*t* denotes the right-hand derivative.

Recall that a set is called an *invariant set* of the dynamical system if implies that for all *t* ≥ 0. Intuitively, this means that, once a trajectory of the system enters *K*, it will never leave it.

### Theorem 4.4 (*Contraction principle*)

*Consider the equation*
4.7*with* *continuously differentiable. Let* *denote the Jacobian of f. Suppose that a convex set* *is an invariant set for* (*4.7*)*, and that there exists a vector norm* *such that the induced matrix measure satisfies*

*Then for all* *a*, *and all* *t* ≥ 0,
For a self-contained proof of this result, see Russo *et al.* [39].

If *r* < 0 then the distance between the two trajectories *x*(*t*, *a*) and *x*(*t*, *b*) decays exponentially with *t*, the system is said to be contracting, and *r* is called the *contraction rate*.

From the point of view of mathematical analysis, the RFM has three important properties. It is a monotone dynamical system, it is almost a contractive system, i.e. it is contractive with zero contraction rate, and the equations describing *e* can be put in the form of a continued fraction (in the HRFM, a periodic continued fraction).

For our choice of input and output, the RFMIO is a *monotone control system* as defined by Angeli & Sontag [41]. There are several interesting results on the behaviour of monotone control systems with feedback [41–44]. An important tool in the analysis of such systems is a reduced-order system that depends on the input-state/output characteristic of the monotone control system. For the RFMIO, it seems that obtaining an explicit expression for this characteristic is non-trivial. Instead, our analysis relies on combining the theory of monotone systems (and not monotone control systems) with the special structure of the HRFM.

## 5. Proofs

An immediate yet important observation is that the closed-loop system (2.3) is a monotone dynamical system. Indeed, a calculation of the Jacobian *J*(*x*) of (2.3) appears below, where *u* = *k*_{1} + *k*_{2}*λ*_{n}*x*_{n}. It may be seen that all the non-diagonal elements are non-negative for all , so the Kamke condition holds.

### Proof of proposition 3.1

Seeking a contradiction, assume that there exists a (first) time *T* > 0 such that . Then , i.e. for at least one index *k*. This implies at least one of the following two cases.

*Case 1.* There exists a (minimal) index *i* such that
5.1and *x _{j}*(

*T*) > 0 for all

*j*<

*i*. If

*i*= 1 then Since for

*t*<

*T*, this implies that

*x*

_{1}(

*T*) > 0. We conclude that the case

*i*= 1 is not possible, so

*i*> 1. Now (2.3) yields . Since

*x*(

_{j}*T*) > 0 for all

*j*<

*i*, this implies that , so

*x*(

_{i}*T*) > 0. This contradicts (5.1), so we conclude that case 1 is not possible.

*Case 2.* There exists a (maximal) index *i* such that
and *x _{j}*(

*T*) < 1 for all

*j*>

*i*. If

*i*=

*n*, (2.3) yields . Since for

*t*<

*T*, this implies that

*x*(

_{n}*T*) < 1. We conclude that the case

*i*=

*n*is not possible, so

*i*<

*n*. Using (2.3) again yields . Since

*x*(

_{j}*T*) < 1 for all

*j*>

*i*, this implies that , so

*x*(

_{i}*T*) < 1. We conclude that case 2 is also not possible. This contradiction implies that Int(

*C*) is an invariant set of the RFM.

To complete the proof of the proposition, consider an initial condition . Then for at least one index *k*. This implies at least one of the following two cases.

*Case 1.* There exists a (minimal) index *i* such that *a _{i}* = 0, and

*a*> 0 for

_{j}*j*<

*i*. We will show that 5.2If

*i*= 1, then (2.3) yields , so (5.2) indeed holds. If

*i*> 1, then and since

*x*

_{i−}_{1}(0) =

*a*

_{i−}_{1}> 0, this implies that , so again (5.2) holds. It follows from the proof above that, for all

*j*<

*i*, for all

*t*≥ 0. Thus, for all , and then for all and all

*t*≥

*τ*.

*Case 2.* There exists a (maximal) index *i* such that *a _{i}* = 1 and

*a*< 1 for

_{j}*j*>

*i*. A similar argument shows that for some

*τ*> 0.

Inductively, this implies that there exists a time *η* > 0 such that for all and all *t* > *η*. This completes the proof of proposition 3.1. *▪*

### Proof of theorem 3.2

Since *C* is a convex and compact invariant set of the dynamics, it must include an equilibrium point. Assume that *e*, are two different equilibrium points in *C*. Since *e _{n}* determines

*e*(see (2.8)), this means that , so we may assume that 5.3Equation (2.6) yields and proceeding in this fashion yields , so (2.7) yields Since

*k*

_{1}> 0 and

*k*

_{2}≥ 0, this implies that . This contradicts (5.3). We conclude that the equilibrium point in

*C*is unique. For cooperative systems, the existence of an invariant set that contains a

*unique*equilibrium implies that this equilibrium is globally asymptotically stable [45] (see also [46]), and this completes the proof.

*▪*

### Proof of theorem 3.3

Let denote the matrix measure induced by the *L*_{1} norm. It is well known ([38], ch. 3) that
5.4i.e. the maximum of the column sums, with non-diagonal elements taken in absolute values. Consider *μ*_{1}(*J*(*x*)). For , the off-diagonal elements of *J*(*x*) are non-negative, so we may ignore the absolute values in (5.4). Let *s _{i}*(

*x*) denote the sum of the elements in column

*i*of

*J*(

*x*). Then and

*s*

_{2}(

*x*) =

*s*

_{3}(

*x*) = … =

*s*

_{n−}_{1}(

*x*) = 0 for all

*x*. For , we have for all . Combining this with theorem 4.4 proves (3.1).

*▪*

### Proof of proposition 3.4

The proof is by induction. First, it follows from (2.6) that *g _{n−}*

_{1}(

*z*) =

*λ*

_{n}

*z*/(

*λ*

_{n−}

_{1}(1 −

*z*)), so

Now suppose that *∂/*(*∂z*)*g _{n−i}*(

*z*) > 0 for

*i*= 1, … ,

*k*− 1. Using (2.6) yields , so Using the induction hypothesis and the fact that completes the proof.

*▪*

### Proof of proposition 3.5

Consider the case . We already know that this happens if and only if . In particular, if and only if . By (2.7), this happens if and only if
i.e. if and only if *d* > 0. The proof in the cases and is similar. *▪*

### Proof of corollary 3.6

Consider first the case where . Then (3.2) becomes 5.5We consider three subcases.

*Case 1.* Suppose that *d* < 0. Then proposition 3.5 implies that , so in particular . Combining this with (5.5) yields .

*Case 2.* Suppose that *d* = 0. Proposition 3.5 implies that . Combining this with (5.5) yields .

*Case 3.* Suppose that *d* > 0. Proposition 3.5 implies that , and combining this with (5.5) yields .

This completes the proof for the case . The proof for the case is similar.*▪*

### Proof of proposition 3.7

Combining (3.3) and (2.6) yields
5.6where the term *e _{n}* appears a total

*i*+ 1 times. The right-hand side of this equation is a periodic continued fraction, and, by lemma 4.3, If

*e*< 1/4, then . Let

_{n}*θ*be such that . Using (4.6) yields If

*e*= 1/4, then . It follows from (4.4) that

_{n}*U*(1) =

_{k}*k*+ 1, so If

*e*> 1/4, then . Letting be the angle such that , and using (4.5), yields This completes the proof of (3.4).

_{n}*▪*

### Proof of theorem 3.9

Combining (2.6), (2.7) and (3.3) yields
5.7where *f* = *λ*_{c}*e _{n}/(k*

_{1}+

*k*

_{2}

*λ*

_{c}

*e*), and on the right-hand side the term

_{n}*e*appears a total of

_{n}*n*times. Applying lemma 4.3 yields where ). Thus, where the last equation follows from (4.4). Suppose for the moment that , and let be the angle such that cos(

*θ*) =

*s*. Then (4.5) yields

This proves (3.5) when *s* < 1. The proof in the cases *s* = 1 and *s* > 1 is similar.*▪*

### Proof of proposition 3.14

Since , we have in particular *e _{n}* < 1. Combining this with (2.6) and (3.3) yields
5.8Using (2.6) again yields
and (5.8) implies that

*e*

_{n−}_{2}>

*e*

_{n−}_{1}. Proceeding in this fashion proves (3.10).

*▪*

## 6. Discussion

In this paper, we analysed for the first time the RFM with positive linear feedback. Our results show that several nice properties of the RFM/HRFM remain true for the closed-loop system. In particular, it admits a unique globally attractive equilibrium point *e* in *C*, and, in the case of equal translation rates, it is possible to derive expressions relating *e* to the model parameters.

Any good mathematical model must encapsulate the complexity of the real system, yet remain tractable. This is true for the RFMIO as well. However, even the particular case of the HRFMIO model is not sequence independent. For example, the value of the parameter *λ*_{c} depends on features of the coding sequence and their effect on translation elongation; genes with sequence features that promote a higher elongation rate (e.g. higher adaptation to the tRNA pool [47] and/or weaker folding of the mRNA sequence [18]) correspond to a higher value of *λ*_{c}. Similarly, the variables *k*_{1} and *k*_{2}, related to the closed-loop initiation rate, depend among other on features of the 5′ UTR such as the Kozak sequence [17], the strength of the folding at the 5′ end of the ORF [47,48], and the length and nucleotide content of the ORF and UTRs [49]. Thus, the results reported in this study encapsulate a wide range of sequence features related to translation initiation and elongation efficiency.

As noted earlier, some of the model predictions can be compared with biological experiments; for example, the monotonic decrease of the ribosomal density (*e _{i}*) along the coding sequence is in agreement with previously reported results [21,50]. It may be interesting to try and validate other predictions of the model. For example,

— The convergence rate of the translation process to steady state (see theorem 3.3) can be experimentally studied via real-time observation of translation of genes with high-enough mRNA half-lives [51].

— The relations between (

*k*_{1},*k*_{2}) and the ribosomal density and translation rate (see proposition 3.5 and corollary 3.6) can be validated experimentally by the synthesis and analysis of heterologous gene libraries based on the manipulations of the UTRs (which should affect these parameters) [3], but not the ORFs, and the measuring of the ribosomal density [1] and protein levels (for example, with green fluorescent protein [3]). Other formulae reported in this paper (see proposition 3.4 and theorem 3.9) can also be validated via the same procedure.

It is interesting to compare our results with those obtained for the TASEP model with feedback connections. Recall that the one-dimensional open-boundaries TASEP includes *L* sites numbered 1, … , *L*. The configuration of the particles is defined by *n*_{1}, … , *n _{L}*, where

*n*= 0 [

_{i}*n*= 1] indicates that site

_{i}*i*is vacant [occupied by a particle]. The total number of particles on the lattice is . The dynamics is random and sequential: at each time step a particle is chosen in random and moves to the consecutive site with unit rate, provided that this site is not occupied. Particles enter site 1 with rate

*α*and exit from site

*L*with rate

*β*. At steady state, the average overall density is constant, and the system settles into one of three different phases:

—

*maximal current*(MC), here*α*and*β*are large, and the internal hopping rate is the limiting factor;—

*low density*(LD), where the entry rate*α*is the limiting factor; and—

*high density*(HD), where the exit rate*β*is the limiting factor, resulting in a ‘traffic jam’ at the end of the lattice.

A summary of the behaviour of the model is given by
6.1The average current of particles in all these cases is *J* = *ρ*(1 − *ρ*), i.e. the product of the density *ρ* and the ‘hole density’ (1 − *ρ*). Note that these results are mean-field solutions, i.e. they are approximations that become exact when *L* → *∞*.

The LD and HD regions coexist on the line *α* = *β* < 1/2. Domain wall theory [52] assumes that the system admits two regions separated by a zero-width wall. In other words, an LD *ρ*_{−} = *α* on sites [1, *k*], and an HD *ρ*_{+} = 1 − *β* for sites [*k* + 1, *L*]. The well-defined boundary between these domains is called shock, and this scenario is known as a shock phase. Intuitively, this corresponds to a road with a high-speed, low-density region ending with a traffic jam.

Note that (3.5) suggests that the HRFMIO with feedback admits two different phases. The transition between these phases takes place when *s* = 1, i.e. when *e _{n}* = 1/4. For

*λ*

_{c}= 1, this corresponds to the steady-state translation rate

*R*=

*λ*

_{c}

*e*= 1/4. This is reminiscent of the phase transitions in the TASEP that take place along lines in the (

_{n}*α*,

*β*) plane corresponding to

*J*= 1/4 (see (6.1)).

Ha & den Nijs [53] introduced the *constrained TASEP* to model the effects of *finite resources*. In their model particles enter the lattice from a *finite* pool. The pool models a parking garage, and the lattice models a closed road that starts and terminates at the garage. Let *N*_{p} denote the number of particles in the pool. The total number of particles, *N*_{tot} = *N* + *N*_{p}, is a fixed constant, and the effective entry rate, denoted *α*_{eff}, depends on the size of the pool, i.e.
Thus, the total occupation on the lattice feeds back into its filling process. In the study of Ha & den Nijs [53], *f*(0) = 0 (a car cannot emerge from an empty garage) and *f*(*k*) = 1, for all *k* > 0.

Adams *et al*. [54] considered the constrained TASEP in the context of modelling translation, and thus suggested using a function *f*(*x*) that is linear in *x* for small *x*. Specifically, they used
6.2where *N*^{*} is the average number of particles in the standard TASEP. The main interest in these studies is to determine how *N* varies as *N*_{tot} is increased, while the exit rate remains fixed at *β*.

Note that these models correspond to *negative* feedback, as when the number of particles on the lattice *N* increases, *N*_{p} = *N*_{tot} − *N* decreases, and thus the entry rate decreases.

A comparison between the constrained TASEP and the RFMIO with feedback is not straightforward. Indeed, a coherent theory of constrained TASEP is still a challenge, and also the physical point of view, which focuses on phase transitions, domain wall theory, etc., is different from the control-theoretic approach used here.

Two important differences are that in constrained TASEP the feedback is defined in a somewhat implicit way via *f*(*N*_{p}). Indeed, the emphasis is on modelling finite resources, rather than the feedback itself. In the model studied here, *u* = *k*_{1} + *k*_{2}*y* is an explicit function of the translation rate *y*. In most naturally expressed genes (i.e. endogenous genes), the number of mRNA molecules of each gene is relatively low (e.g. less than 1% of the total mRNA molecules in the cells). Therefore, the effect of the number of ribosomes on the mRNA related to one specific gene on the total ribosomal pool is negligible; thus, changes in the ribosomal densities and mRNA levels of one such gene are not expected to significantly affect the total ribosomal pool. (There is a rare exception; namely, highly expressed heterologous genes in which the mRNA molecules related to one gene may be a significant part of the total cellular mRNA levels [16].)

Also, analysis of the TASEP always provides approximate results (which become accurate as the number of sites goes to infinity) while the analysis of the deterministic RFMIO provides results that are valid for every *n*. Summarizing, we believe that the RFMIO is a more appropriate model than TASEP both for studying gene translation and for applications in synthetic biology.

Topics for further research include the following. The TASEP (or its variants) have been used to model not only protein synthesis, but several other real systems as well, including biomolecular motors [55,56], collective motion of ants [57], traffic flow [58–60] and surface growth [61]. In many of these systems, it is reasonable to consider feedback connections, so modelling such systems using the RFMIO with feedback may be of interest.

Several researchers studied the scenario in which several TASEP models compete for a joint pool of resources (see [19,62] and the references therein). This may model several translation processes taking part in parallel in a cell, while competing for the finite pool of ribosomal units. The competition effectively couples all the lattices and leads to interesting new phenomena. In particular, a phase transition in one lattice affects the others. It may be interesting to develop a similar model using a set of coupled RFMIOs (or HRFMIOs).

Finally, another study [63] has demonstrated selection for coding sequence features that optimize tRNA molecule recycling during translation elongation (e.g. pairs of consecutive codons of the same amino acid tend to be identical). Thus, from the functional genomic and evolutionary points of view, it may be interesting to study whether there is selection for transcript features that contribute to ribosomal recycling re-initiation. In addition, a study by Kondrashov *et al.* [64] has suggested a new layer of specificity in the control of gene expression and mammalian development that requires additional modelling and analysis.

## Footnotes

↵1 For the sake of simplicity, we assume from here on that a unique solution

*x*(*t*,*x*_{0}) exists for all*t*≥ 0.

- Received March 24, 2013.
- Accepted May 8, 2013.

- © 2013 The Author(s) Published by the Royal Society. All rights reserved.