## Abstract

Circadian clocks are gene regulatory networks whose role is to help the organisms to cope with variations in environmental conditions such as the day/night cycle. In this work, we explored the effects of molecular noise in single cells on the behaviour of the circadian clock in the plant model species *Arabidopsis thaliana*. The computational modelling language Bio-PEPA enabled us to give a stochastic interpretation of an existing deterministic model of the clock, and to easily compare the results obtained via stochastic simulation and via numerical solution of the deterministic model. First, the introduction of stochasticity in the model allowed us to estimate the unknown size of the system. Moreover, stochasticity improved the description of the available experimental data in several light conditions: noise-induced fluctuations yield a faster entrainment of the plant clock under certain photoperiods and are able to explain the experimentally observed dampening of the oscillations in plants under constant light conditions. The model predicts that the desynchronization between noisy oscillations in single cells contributes to the observed damped oscillations at the level of the cell population. Analysis of the phase, period and amplitude distributions under various light conditions demonstrated robust entrainment of the plant clock to light/dark cycles which closely matched the available experimental data.

## 1. Introduction and motivations

Circadian clocks are gene regulatory networks present in most organisms that help them to adapt to the 24 h day/night cycle. They are composed of a small number of genes involved in interlocking transcriptional feedback loops and enable eukaryotic organisms to anticipate daily changes in environmental conditions such as the duration of the light period. Circadian clocks have two essential features: (i) in the absence of external stimuli (e.g. constant light), the amounts of the involved mRNAs and proteins oscillate rhythmically with a period of approximately 24 h (*circadian*) and (ii) in the presence of external stimuli (e.g. light on/off), the oscillations *entrain* to the external stimuli, adjusting their rhythm to it. Stochasticity is known to play a major role in the behaviour of circadian clocks owing to the small copy numbers in which most of the involved molecules are usually present, and a number of studies on the effect of stochasticity and noise-induced fluctuations on circadian clocks in various organisms exist. For instance, a discrete stochastic model of the circadian clock in mammals has been presented by Forger & Peskin [1], whereas Gonze *et al.* [2–4] developed and compared a deterministic and a discrete stochastic model of the *Drosophila* clock; Akman *et al.* [5,6] studied the effect of discreteness and stochasticity on the robustness of the clock in *Neuropora* and presented a discrete stochastic model of the clock in the *Ostreococcus* alga, which is considered to be a naturally simplified version of the clock in higher plants such as *Arabidopsis*. In all these studies, important differences between discrete stochastic models and their continuous deterministic approximation were found, and in most cases stochasticity was shown to have a positive effect on the robustness of the free-running (i.e. not entrained) clock.

*Arabidopsis thaliana* is the most studied model plant and it has been widely used for studying the plant circadian clock. Several deterministic models of the circadian clock in *A. thaliana* have been developed previously [7–10], and they are based on multiple data in wild-type and mutant plants.

Here, we explore the impact that intrinsic molecular noise has on the plant clock behaviour, by adding stochasticity to the behaviour of the recent deterministic model presented by Pokhilko *et al.* [10]. The size of the system (i.e. the copy numbers of the molecules involved in the system) is unknown, but it is supposed to be small by analogy to other species and, consequently, the molecular noise is believed to have a significant impact on the behaviour of the system. This makes a discrete stochastic interpretation of the model essential in order to capture the noise-induced behaviour of the clock. We investigate the effect of the stochastic fluctuations in a single cell on the clock behaviour, the differences between continuous deterministic and the discrete stochastic interpretations of the model, and their relation to experimental data. We also apply measures commonly used for time-series analysis in signal processing in order to quantify the relevant features of the dynamic behaviour of the clock, such as the dampening of oscillations and their distribution of phase, period and amplitude.

## 2. The clock model

The principal scheme of the *Arabidopsis* clock introduced by Pokhilko *et al.* [10] is shown in figure 1. The key dawn genes *LATE ELONGATED HYPOCOTYL* (*LHY*) and *CIRCADIAN CLOCK ASSOCIATED 1* (*CCA1*) are described in the model by a single variable, *LHY/CCA1*. They activate the expression of their own inhibitors, *PSEUDO-RESPONSE REGULATOR 9 (PRR9), PRR7* and the night inhibitor *NI* (*PRR5*) inside the morning loop. LHY and CCA1 proteins also inhibit the expression of the evening genes *GIGANTEA* (*GI*), *TIMING OF CAB EXPRESSION 1* (*TOC1*) and its hypothetical activator gene *Y* in the morning, and thus provide a delay in their expression. GI promotes the inhibition of TOC1 protein function by positively regulating the F-box protein ZEITLUPE (ZTL). Evening genes form an evening loop of the clock, which feeds back to the morning loop by activating *LHY/CCA1* expression at the end of the night. The evening loop also inhibits *PRR9* expression at night and thus reduces night inhibition of *LHY/CCA1*, increasing the robustness of oscillations. Light activates LHY/CCA1, PRR9, PRR7, NI, GI and Y at the transcriptional and/or post-transcriptional levels, providing the day length sensing by both morning and evening loops.

Our model is based on the deterministic model introduced by Pokhilko *et al.* [10], which formalizes the scheme described above.

In order to model the light and the experimental settings, we applied a two-stage light function to describe an initial entrainment stage followed by an observation stage. Indeed, in laboratory experiments, plants are usually kept in alternating light/dark cycles for a few days, and then transferred to the desired experimental condition (e.g. constant light, dark or different photoperiods). The light on/off switch can be modelled either as a discrete step function, which switches instantly from 0 to 1 at dawn and from 1 to 0 at dusk, or as a smooth function in which the switch is rapid but not instantaneous. Though the discrete step function matches the experimental conditions more closely, the smooth function is closer to real-life conditions and, moreover, has the advantage of being integrable. In our experiments, we have not noticed significant differences in the model response to these two light functions. The results shown in the following have been obtained using the smooth function introduced in Salazar *et al.* [11]. The formal definition of the light function is given, together with the full model, in the electronic supplementary material, appendix C.

We coded the clock model in a computational language called Bio-PEPA [12], which is a stochastic process algebra developed for modelling biochemical systems. Process algebras are formal languages commonly used in computer science areas, such as performance modelling and concurrency theory. In addition to simulation, formal languages enable modellers to *verify* systems: for example, formal properties of models can be verified to uncover causal relations between events, reachability of specific states or equivalences between different systems. Formal methods and process algebras have been used to model and analyse biochemical systems in several studies in recent years (see, for instance, [13,14] for reviews on the approach). The Bio-PEPA language has been developed to represent biochemical systems in a formal and compositional way. It is equipped with both a discrete stochastic semantics defined in terms of a continuous-time Markov chain and a continuous deterministic semantics defined in terms of ordinary differential equations (ODEs). This allows us to analyse our model of the clock both by running stochastic simulations (using standard methods such as the Gillespie [15] and Gibson–Bruck [16] algorithms) and by solving its (equivalent) underlying set of ODEs. The fact that the discrete stochastic and continuous deterministic models are formally derived from the same Bio-PEPA description ensures consistency between them, thus allowing us to compare the stochastic and the deterministic behaviour of the clock. A short introduction to Bio-PEPA is given in the electronic supplementary material, appendix B; for a more detailed presentation of the language, the reader is referred to Ciocchetta & Hillston [12].

In order to have a correct stochastic interpretation of the model, its variables must represent discrete molecule counts rather than continuous concentrations. Hence, the model variables and kinetic laws described in Pokhilko *et al.* [10] are rescaled by a parameter *Ω* which represents the size of the system (see §3.1 for a discussion of *Ω*). Consequently, the set of ODEs underlying the Bio-PEPA model is the same set of ODEs of the model presented in Pokhilko *et al.* [10] except for the scaling factor *Ω*. In our model, the species amounts represent molecule counts, where species that reached peak values close to 1 in the relative units of Pokhilko *et al.* [10] now peak close to *Ω*. The stochastic semantics of our Bio-PEPA model follows the standard stochastic chemical kinetics approach, where reaction rates represent propensity functions which define the probability of occurrence of each reaction from a given state, governed by the chemical master equation and where stochastic simulation methods (such as the Gillespie method [15]) are commonly adopted for model analysis. It is worth pointing out that our model is not a continuous stochastic model described by stochastic differential equations (in which a Gaussian noise term is added to ODEs to represent stochasticity), but rather a discrete stochastic model in which stochasticity emerges by the randomness of the behaviour defined by the reaction propensity functions.

For model analysis, we used a tool called the Bio-PEPA Eclipse Plug-in [17]. Among the various simulators and solvers available within this tool, we used the Gibson–Bruck stochastic simulator [16] and a Dormand–Prince [18] adaptive step-size deterministic solver.

The Bio-PEPA model of the clock is described in the electronic supplementary material, appendix C, and the full model is available as a Bio-PEPA file (electronic supplementary material, appendix F) and as an equivalent Systems Biology Markup Language (SBML) model exported from Bio-PEPA (electronic supplementary material, appendix G). The SBML model is also available from the Biomodels database [19] and the Plant Systems Biology Modelling database [20].

## 3. Results

Here, we discuss some of the simulation results we have obtained, comparing them with experimental results reported elsewhere [7,8,10,21,22]. In the following, we will use the notations LL and DD for constant light and constant dark, respectively, and *h*_{l} L∶*h*_{d} D for alternating cycles of *h*_{l} hours of light and *h*_{d} hours of dark. The initial conditions are the same for all the simulation results reported below, for both the deterministic and the stochastic simulations: namely, the initial values are taken from the limit cycle reached by the deterministic model in 12 L : 12 D, as in Pokhilko *et al.* [10] (rescaled to obtain molecule counts, as described in detail in the following section), and dawn is assumed to be at *t* = 0.

We will use the classic notions of phase, period and amplitude to characterize the oscillatory behaviour of the system; their most commonly adopted definitions are the following.

—

*Peak*: highest value obtained by the variable in one oscillation.—

*Trough*: lowest value obtained by the variable in one oscillation.—

*Phase*: time of the day at which the peak occurs, relative to dawn.—

*Period*: peak-to-peak time difference between two cycles.—

*Amplitude*: difference between peak and trough values.

Further details on these notions and on the method we used to compute them for time-series data obtained from stochastic simulation are given in the electronic supplementary material, appendix D.

### 3.1. The system size and its effect on molecular noise

The continuous approximation underlying ODE models of biochemical systems is valid when the copy numbers of the involved molecules are large. As for many genetic networks, this assumption is generally not valid for circadian clocks: their system size (i.e. the average copy numbers of the involved molecular components) is believed to be small and, therefore, stochastic noise generally has a major effect on the system's behaviour (see, for instance, [3,5,6]).

The system size of the clock network in *Arabidopsis* is unknown. The variables of the original deterministic model are given in relative units and are normalized to peak at 1. In order to correctly quantify stochastic fluctuations, the species amounts must represent numbers of molecules per cell. Hence, we introduced the characteristic size of the system as an additional parameter *Ω*, which is used to scale the species amounts and kinetic laws accordingly (see the electronic supplementary material, appendix C). Note that *Ω* is not the maximum number of molecules of any or all species. Rather, *Ω* scales the variables of the ODEs such that a concentration of 1 in the ODEs becomes a molecule count close to *Ω*.

To compare the individual variations in the copy numbers of the clock components, we analysed the data underlying relative quantitative polymerase chain reaction (qPCR) measurements, which demonstrated similar peak expression levels of the clock genes, with at most a 10-fold range among the transcripts: this justifies our use of a single scaling factor (shown in the electronic supplementary material, table S1, appendix A).

The average copy number of the clock components at the peaks was estimated to be around several hundreds of molecules per cell (see the electronic supplementary material, appendix A and figure S1, for details).

Quantifying the fluctuations owing to stochastic variations provides another method to estimate the unknown system size. Therefore, we used stochastic simulation to verify the above estimations: by observing the effect of the parameter *Ω* on the behaviour of the system (such as the ability of protein and mRNA amounts to entrain to the light stimulus), we could identify the value of *Ω* for which the behaviour is closest to the experimental data. Figure 2 reports a comparison of the behaviour of the clock for *Ω* = 10 and *Ω* = 100 (which represent 10 and 100 molecules, respectively). These results show that, in the entrained clock (12 L : 12 D, figure 2*a*–*d*), while for *Ω* = 10 the protein and mRNA amounts exhibit poor entrainment in a single simulation (i.e. representing a single cell, figure 2*a*), for *Ω* = 100, instead, the entrainment is good (figure 2*c*). Moreover, the mean behaviour for *Ω* = 100 (figure 2*d*) is closer to the experimental data (figure 2*e*) than the mean behaviour for *Ω* = 10 (figure 2*b*); in particular, the asymmetry in the peak visible for *Ω* = 10 is not observed in the experimental data, and the standard deviation for *Ω* = 10 is very high. It is worth noting that experimental techniques introduce additional extrinsic noise into the data, which is not considered in our stochastic simulations. In particular, qPCR measurements provide more precise information about the shape of profiles than luciferase (LUC) bioluminescence traces, because the qPCR measures native RNA directly, whereas LUC luminescence is an indirect measure of transcription [25]. However, individual qPCR assays are more variable than LUC image analysis (figure 2*e,f*). Our simulations in the free-running clock (LL, figure 2*g*–*j*) also show that a very small system size (*Ω* = 10) results in low-amplitude, fast-damping oscillations (figure 2*h*), owing to the high noise, whereas *Ω* = 100 (figure 2*j*) gives a better match with the experimental data (figure 2*k*). Similar results (not shown) have been obtained for all the other considered photoperiods. Intermediate values for *Ω* such as 50 (results not shown) also resulted in worse agreement with the experimental results than *Ω* = 100. On the other hand, higher values for *Ω* such as 1000 (results not shown) led to a further decrease in noise compared with *Ω* = 100 that makes the stochastic effects become almost insignificant and results in worse agreement with the experimental data in constant light than *Ω* = 100. The conclusion that we can draw from these results is that the size of the real system must be of the order of a few hundreds of molecules per cell. Further confidence in the correctness of these model estimations of the system size is provided by the fact that they are of the same order of magnitude as the estimations, based on the experimental data, which are discussed above and are in the electronic supplementary material, appendix A. All the results reported in the remainder of this paper are obtained using the estimated value *Ω* = 100 as the system size.

### 3.2. Individual versus mean behaviour

Each individual stochastic simulation run is one possible evolution of the model and hence, given that the model represents the behaviour of a cell, it can be seen as the behaviour of one single cell. The experimental data suggest that individual oscillators in the plant cells are essentially independent over time scales of several days [26]. One report quantifies the coupling across the surface of detached leaves and finds that it is very weak, with most potential significance over long time scales [27]. The weak coupling between cells allows us to consider the mean behaviour of a stochastic model (obtained by averaging over multiple simulation runs) and the deterministic behaviour (which also, implicitly, is a mean behaviour) as representative of the mean behaviour of a uniform population of independent cells, at least over several days. This is the typical time scale for the data available in plants, and the main focus of our work: the suggested intercellular coupling is too weak to have much effect on this time scale. We also simulate long-term behaviour under the same simplifying assumption. If the results of Fukuda *et al.* [27] are representative of cells in the intact plant, then the intercellular coupling cannot prevent desynchronization but may limit its extent.

Generally, experimental data represent a mean behaviour of a population of cells (standard currently available experimental tools report average measures over tens of thousands of cells taken from whole seedlings or even multiple plants). However, learning how single cells behave is essential in order to understand how a whole organism behaves and, therefore, simulation can be a powerful tool, enabling us to hypothesize the behaviour of a single cell which would be impossible to observe experimentally. On the other hand, experimental data can be used for model validation by comparing them with the mean simulated behaviour.

In the following, we consider constant light and light/dark conditions, and compare the deterministic behaviour with the stochastic one. The qualitative behaviour observed in the deterministic model for both the entrained (12 L : 12 D) and free-running clocks (LL and DD) is a perfectly periodic permanent oscillation (see Pokhilko *et al.* [10]). The differences in these three settings lie in the period, phase and amplitude of oscillations; for instance, the period is 24 h in 12 L : 12 D, around 24.5 h in LL, and around 27 h in DD.

The behaviour observed in a single evolution of the stochastic model in 12 L : 12 D is similar to the regular oscillatory behaviour of the deterministic model [10], with the addition of noise-induced stochastic fluctuations (e.g. figure 2*c*): the noise causes the oscillations to be less regular in terms of amplitude, but the light entrainment forces the oscillations to synchronize with the light and, consequently, only allows for a very small variation in period.

Contrary to the entrained system, the effect of discreteness and stochasticity on the behaviour of the clock is clearly visible in the free-running clock: while the deterministic behaviour is a persistent oscillation, the mean stochastic behaviour is a damped oscillation, matching well with the experimental population data (figure 3 shows the LL system).

Our stochastic simulation results show that the damping of oscillations in constant light conditions is most probably the result of averaging over multiple noisy oscillators in different cells: despite the stable oscillations in the single run, the mean stochastic behaviour damps fast (figure 3*b*) because of the lack of synchronization in the oscillations in different runs.

In addition to asynchrony, there are other potential explanations for the experimentally observed damping. First, the depletion of the luciferin substrate could contribute to the damping by reducing the signal progressively over time. Indeed, data suggest that a decaying exponential with a time constant greater than 3 days might reflect this substrate depletion, as recently observed in algal cultures [28]. The observed faster damping of the amplitude of oscillations suggests the existence of another major mechanism of damping unrelated to experimental variation (electronic supplementary material, figure S2, appendix E). The oscillations at the level of the whole organism might also subside because of the damping of oscillations in each individual cell. Though we cannot completely rule out this hypothesis, the recent deterministic model of the plant clock, which was based on multiple experimental data, suggests that oscillations in single cells are stable [10]. The damping of the oscillations in a population of persistent cellular clocks was also observed in fibroblast cultures and in peripheral explants from mammalian tissues [29,30]. Oscillation damping could also be reproduced by a population of deterministic oscillators with different periods [31]. However, because of the low molecule numbers involved in this (and other) biological clock, the presence of stochastic effects is certain, with or without additional period variation. Thus, we conclude that noise-induced asynchrony among different cells is the most probable mechanism of the population-level damping of oscillations in constant light conditions.

Additional insight on the model behaviour and its variability is obtained by computing statistical measures such as the distribution of phase, period and amplitude of oscillations. We compute these distributions over individual time series for both the deterministic and the stochastic model; for the latter, we also compute time-dependent distributions at a specific time over a number of different simulation runs. Details on the computation of these distributions can be found in the electronic supplementary material, appendix D.

Figure 4 shows the distribution of the period and amplitude of the *LHY* mRNA level in a single 80 day-long stochastic simulation run of the 12 L : 12 D system: this confirms that the period is distributed quite tightly around 24 h, while the amplitude is more variable. Comparing these distributions with the ones for the free-running clock in LL and DD (figure 4 and the electronic supplementary material, figure S5, appendix E), we can observe how the light entrainment makes the period more stable (the distribution is less wide), whereas the variation in amplitude is smaller in LL than in 12 L : 12 D. This agrees with the experimental observation [7,22]. The additional noise in the amplitude in 12 L : 12 D is probably related to the noise brought by acute activation of *LHY*, *PRR9* and *GI* expression by light.

The distributions of period and amplitude shown in figure 4 illustrate the variability of the system in an individual run, i.e. how regular the behaviour is over time. Instead, Figure 5 and the electronic supplementary material (figure S6, appendix E) illustrate the variability over multiple runs, i.e. how similarly different runs behave: they report the time-dependent distribution of phase, period and amplitude of the oscillations over 1000 independent simulation runs at specific times (on the second and on the 80th day of the observation phase) for each of the three light conditions. Figure 5*a* confirms that in 12 L : 12 D the distribution of period is more narrow than the one of amplitude, shows that the entrainment is quite quick (i.e. the difference between the distributions at day 2 and those at day 80 are not significant) and shows that the period of *LHY* mRNA oscillations is always very close to 24 h and the phase is near dawn. In free-running systems, instead, the variability of period length results over time in desynchronization of runs so that, at day 80, the distribution of phase spans the entire cycle (figure 5*b* and electronic supplementary material, figure S6, appendix E). The fact that, in the free-running clock, the period and amplitude distributions are quite similar at early and later times while the phase distributions are wider at later times confirms the visual observation that the oscillations in each individual run are rather regular but they are not synchronized (i.e. the peak occurs at different times of the day), thus causing the dampening of oscillations observed under multiple runs and in experimental population data (figure 3 and [10]). Figure S3 in the electronic supplementary material, appendix E, shows the distribution of period and amplitude of the *LHY* mRNA level over 80 days in the deterministic solution of the model: in the free-running LL system, the period is slightly longer (24.5 h instead of 24) and the amplitude is smaller.

Summing up the comparison between the distributions of phase, period and amplitude in ODEs and stochastic simulations: (i) in the entrained system they agree (although the stochastic simulation results show quite a lot of variation in amplitude) and the distributions at early times are similar to the distributions at later times (i.e. there is little or no transient effect) and (ii) in the free-running system, they show substantial differences—the ODEs show stable oscillations with a shift in phase caused by the period longer than 24 h, the single stochastic simulation also shows persistent (noisy) oscillations, but the mean stochastic behaviour dampens quickly owing to the variation in phase in different runs.

The lack of synchronization between different cells is further illustrated in figure 6*b*. The five random simulation runs plotted clearly show how the oscillations quickly become out of phase in LL; this makes the probability distribution of values so spread that the average over multiple runs dampens fast. This demonstrates why we are unable to obtain the persistent oscillatory behaviour of the cellular system from the population data, neither the experimental ones (figure 3*a*) nor those obtained by averaging multiple stochastic runs (Figures 3*b* and 6*b*). In contrast, the individual oscillations of the entrained clock are synchronized, although with variable amplitude. The distribution of values is quite tight around the mean (figure 6*a*), so the population data (figure 2*d*–*f*) resemble the single cell data (figure 2*c*). Figure 6 also shows that, in LD, the distribution of *LHY* mRNA levels is broader during the rising phase than during the falling phase. Our model suggests an explanation for this asymmetry. Because different runs are not perfectly synchronized, the rising phase of *LHY* mRNA during the night can begin at slightly different times in different runs, hence causing some variability of the levels at any specified time in the rising phase. The faster light-dependent degradation of *LHY* mRNA, however, causes *LHY* mRNA levels to quickly decrease at dawn in all runs. The regulated mRNA degradation derives directly from experimental results [24,32]. Consequently, we observe less variability in mRNA levels at a particular time in the falling phase, and hence better synchronization among runs. A secondary effect is that the *LHY* mRNA peak is lower in runs in which the rising phase started later. This explains both the asymmetry in the broadness of the distribution of *LHY* mRNA level in LD and also the higher variability in peak height compared with LL, in which the rising phase is not abruptly interrupted by dawn.

### 3.3. Moderate noise in a multi-cellular environment accelerates light entrainment

The mean stochastic behaviour gives a good description of the experimental data and, particularly in certain experiments, exhibits a faster entrainment to the light when compared with the deterministic behaviour of the system. Figure 7 shows, for instance, the behaviour of the model after the transition from short days (9 h light) to long days (15 h light). In the deterministic model (figure 7*a*), the amplitude of *LHY* mRNA peak value exhibits a transient *doubling of period*, that is, an alternation of a higher and a lower peak, which is still observed after 50 days (figure 7*a* shows the first 20 days). The stochastic model does not show this effect and instead predicts that *LHY* mRNA is quickly entrained to a regular value. A similar behaviour is observed for other proteins and mRNAs (results not shown).

The doubling of period in the current deterministic model is a consequence of superposition of the mutual regulation of *LHY* and *PRR* genes and strong acute light induction of the PRR wave. This results in an alternation of higher and lower amplitudes of LHY: a higher LHY amplitude results in high *PRR7/5* expression, which in turn reduces the expression of *LHY* the following morning, while strong acute light induction of *PRRs* cuts LHY amplitude at a lower level immediately after dawn on the next day. Such a strong regulation greatly improves the entrainment and robustness of the clock [10]. However, it can cause some transient doubling of period in the deterministic model. The stochastic simulations explain why the doubling of period was never observed experimentally: the noise, together with the averaging over multiple runs, reduces the effect of variations in amplitude, which causes doubling of period in the deterministic solution of the model, consequently improving the speed of entrainment.

### 3.4. Stochasticity improves the entrainment to ‘skeleton’ artificial light patterns

To investigate the effects of the dawn and dusk light separately, some of the experiments were carried out in special entrainment conditions where 12 h of light was interrupted by 6 h of dark. This creates a second dawn in the evening and, as *LHY* is responsive to light, it has the effect of introducing a secondary peak in addition to the morning peak (figure 8*d*).

The evening peak is not reproduced by the deterministic model (figure 8*a*), but is instead clearly present in the mean stochastic behaviour (figure 8*b*). The better match to the data obtained using the stochastic model is related to the low LHY peaks during the first light pulse, which sometimes occur in single runs. The first (dawn) light pulse occasionally results in a lower than normal LHY peak (figure 8*c* at 72 h). This causes lower *PRR7/5* expression while TOC1 trough levels are elevated. This in turn increases the *LHY* response to the second (dusk) light pulse (figure 8*c* at 81 h). Thus, the addition of noise-induced behaviour improves the description of experimental data on the ‘skeleton’ photoperiods also.

### 3.5. Autocorrelation of time-series data: a quantification of phase distribution

Most of the above-mentioned considerations have been drawn by visual comparison of time-series data obtained from simulation laboratory experiments. Using this approach, qualitative features such as ‘oscillations dampen’ or ‘oscillations are persistent’ can be easily observed, but it is hard to provide a precise quantitative measure of the stochastic effects. As discussed in the electronic supplementary material, appendix D, when considering noisy data such as experimental data or those coming from stochastic simulation, performing data analysis (e.g. measuring phase, period and amplitude of oscillations) can be difficult. Time-series analysis techniques originating in signal processing can be used in order to compute summary measures of the quality of the circadian clock related to the regularity of the oscillations, such as the distribution of phase.

The autocorrelation function of a time series measures the similarity of the time series with a shifted version of itself, and is commonly used in signal processing to detect periodicity in noisy signals. For perfectly periodic signals, the autocorrelation function oscillates regularly between +1 and −1, with the same period as the signal, and with the highest value at time *t* = 0. For noisy periodic signals, the autocorrelation function is also oscillating; if noise affects the phase of the oscillations of the signal, then the oscillations of the autocorrelation are dampened, the envelope of the function decreases exponentially and the speed of dampening can be used as a measure to quantify the effect of noise on signal periodicity. For signals which are only random noise, the autocorrelation function immediately reaches values very close to 0.

Generally, a signal is considered to be statistically different from random noise if its autocorrelation function is outside the 95% confidence band (which can be computed approximately as , where *N* is the number of samples in the time series). The *half-life* of the autocorrelation, corresponding to its 50 per cent decrease (i.e. the time at which the envelope of the autocorrelation function is smaller than half its maximum), can be used as a measure of the robustness of periodic signals with respect to noise. An application of autocorrelation to biochemical oscillators can be found in Gaspard [33].

Figure 9 shows the autocorrelation function for both the entrained and the free-running system computed over *LHY* mRNA time-series data obtained from stochastic simulation. Figure S4 in the electronic supplementary material, appendix E, reports the autocorrelation of the deterministic behaviour. In the 12 L : 12 D system, the autocorrelation of the deterministic time series is periodic and regular, from +1 to −0.5 (electronic supplementary material, figure S4*a*). The mean autocorrelation of multiple stochastic runs is similar to the deterministic one: it is also regular but with a smaller amplitude, owing to the variation in the oscillations (figure 9*a*). The autocorrelation of a single stochastic run is also regular in periodicity, but not in amplitude as the amplitude of the time series is not regular (figure 9*b*).

In LL, the autocorrelation of the deterministic time series is similar to the one in 12 L : 12 D, it is periodic and regular from +1 to − 0.8 (electronic supplementary material, figure S4*b*). The mean autocorrelation of multiple stochastic runs instead quickly dampens (figure 9*c*), and its half-life is around 31 h. The autocorrelation of a single stochastic run, despite being noisy, persistently oscillates outside the 95% confidence band (figure 9*d*), thus clearly showing the presence of a regular circadian rhythm. In figure 9*c*, we can also observe that the mean autocorrelation function remains outside the 95% confidence band for 8–10 days. Hence, in addition to confirming the persistence of rhythmic oscillations in the individual cell behaviour, the above analysis also suggests that phase synchrony is totally lost only after 8–10 days.

The results for DD are similar to those for LL (electronic supplementary material, figure S7, appendix E): the autocorrelation of the deterministic time series is periodic and regular, from +1 to −0.7, the autocorrelation of the mean stochastic behaviour dampens and the autocorrelation of an individual stochastic run is noisy and constantly outside the 95% confidence band. Compared with the LL system, in LL the autocorrelation half-life is longer, around 55 h, and the phase distribution is slower, with the mean autocorrelation function remaining outside the 95% confidence band for 18–20 days.

Being normalized, the autocorrelation amplitude and its half-life are good measures to compare the effect of noise on phase distribution and its relation with system size. The autocorrelation for the stochastic model with *Ω* = 10 (electronic supplementary material, figure S8, appendix E) is qualitatively similar to those for *Ω* = 100 but shows important quantitative differences, and provides a further measure to quantify the noise-induced variations and their relation with the system size. Specifically, in the LD system, we can observe that the autocorrelation amplitude is considerably smaller for *Ω* = 10 because of the larger variation in amplitude. In the LL system, we notice that the mean autocorrelation of multiple stochastic runs immediately drops to zero, with a half-life shorter than a single cycle, around 20 h, showing that no synchrony is observed among the oscillations of individual cells; the autocorrelation of a single stochastic run, in fact, does not exhibit oscillations with a constant period owing to the huge noise-induced fluctuations observed in the single cell behaviour.

## 4. Conclusions

The importance of molecular noise in the behaviour of genetic networks such as circadian clocks has been shown previously in various systems (e.g. [3,5,6]). In this work, we have investigated the noise-induced behaviour in the circadian clock of the model organism *A. thaliana*. The model we have presented here is obtained by extending an existing model of the clock [10] with the addition of stochasticity.

The simulation results obtained by analysing the discrete stochastic representation of the system allowed us to estimate the unknown size of the system to around 100 copies of proteins per cell. We also showed how moderate noise is able to accelerate the entrainment of the plant clock under certain light conditions and that the stochastic interpretation of the clock model showed better agreement with the available experimental data than the deterministic solution. The analysis of individual simulation runs, representing the behaviour of individual independent cells, helped us to explain the experimentally observed dampening of oscillations in constant light, which was not captured by the deterministic model: the oscillations in single cells, although persistent, are not synchronized in different cells, thus causing phase diffusion.

The distribution of phase, period and amplitude computed over multiple simulation runs closely matched the available experimental data, exhibiting robust entrainment of the plant clock to light/dark cycles with some fluctuations of the amplitudes. The half-life of the autocorrelation of simulation time-series data was used as a measure to quantify the dampening of oscillations in the free-running system.

The suitability of stochastic models for understanding single cell behaviour in the presence of noise is well known. Interestingly, our stochastic model has also been shown to give a better account of population data than the deterministic model: the mean stochastic dynamics captures the damping of oscillations experimentally observed in constant light (§3.2) and shows a better entrainment to light in several light patterns, such as a faster entrainment to changes in photoperiod (§3.3) and the occurrence of secondary evening peaks in ‘skeleton’ photoperiods (§3.4).

## Acknowledgements

The Centre for Systems Biology at Edinburgh is a Centre for Integrative Systems Biology (CISB) funded by BBSRC and EPSRC, reference BB/D019621/1. A.P. was supported by BBSRC award E015263 to A.J.M., and by European Commission FP7 Collaborative Project TiMet (award 245143) to A.J.M and others. A.P.F. was supported by the ROBuST project, award F005237 from BBSRC and EPSRC to K.J.H. and others. We are grateful to Dr Hirofumi Ishihara and Ronan Sulpice of the TiMet project team, Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany, for the information about the total number of cells in *Arabidopsis* tissues. We gratefully acknowledge the expert technical assistance of Gavin Steel.

- Received June 13, 2011.
- Accepted August 8, 2011.

- This journal is © 2011 The Royal Society