## Abstract

Biological organisms rely on their ability to sense and respond appropriately to their environment. The molecular mechanisms that facilitate these essential processes are however subject to a range of random effects and stochastic processes, which jointly affect the reliability of information transmission between receptors and, for example, the physiological downstream response. Information is mathematically defined in terms of the entropy; and the extent of information flowing across an information channel or signalling system is typically measured by the ‘mutual information’, or the reduction in the uncertainty about the output once the input signal is known. Here, we quantify how extrinsic and intrinsic noise affects the transmission of simple signals along simple motifs of molecular interaction networks. Even for very simple systems, the effects of the different sources of variability alone and in combination can give rise to bewildering complexity. In particular, extrinsic variability is apt to generate ‘apparent’ information that can, in extreme cases, mask the actual information that for a single system would flow between the different molecular components making up cellular signalling pathways. We show how this artificial inflation in apparent information arises and how the effects of different types of noise alone and in combination can be understood.

## 1. Introduction

Information theory—as conceived by Claude Shannon—is the branch of the mathematical sciences that deals with the quantification of structures, regularities or semantic patterns in a stream of symbols or observations [1]. Information, different from meaning, was defined probabilistically in Shannon's work and this notion has been applied with great success in the engineering and physical sciences [2,3]. More recently, information theoretic approaches have gained in popularity in the biological sciences [4,5].

It is obviously important for biological organisms, ranging from single cells to multicellular organisms, to sense, process and correctly adapt to their environment or their own physiological state [6–11]. A host of recent studies have applied such information theoretical measures and analyses to biological systems, in particular gene regulation and signal transduction systems [12–16]. In such a framework we can, for example, study how well a given molecular pathway relays information arriving at cell receptors into the cellular interior, and in eukaryotes perhaps into the nucleus. If a cell ‘misinterprets’ its environment—or fails to initiate an appropriate physiological response—then this can have obvious detrimental effects; we would therefore expect molecular signal transduction and information processing to have been finely honed by evolution. Information theory provides a framework in which we can attempt to quantify the accuracy and efficiency with which information is mapped onto physiological responses or actions [17,18].

In molecular and cellular systems biology, but also in engineering or signal processing applications, we are often primarily concerned with the efficiency of information transmission, which is also, of course, the context in which Shannon's original work was set. For example, we may model the input and output of an information channel as random variables, *X* and *Z*, respectively. The *mutual information*, *I*(*X*, *Z*), is then a measure that tells us how much the uncertainty about *Z* is reduced if we know the state of the random variable *X* [3]. In terms of the entropies, *H*(*X*), *H*(*Z*), of random variables, *X*, *Z*, the canonical measure of the uncertainty associated with a random variable, we have

$$I(X, Z) = H(X) - H(X \mid Z) = H(Z) - H(Z \mid X), \tag{1.1}$$

where *H*(*Z*|*X*) is the appropriate conditional entropy of *Z* if the state of *X* is known, etc.; below, we assume that we can calculate entropies and derived quantities using the Lebesgue, counting or mixed measures as appropriate. For a perfect, noiseless channel, information transmission is lossless and, if the event spaces of *X* and *Z* are identical, we will therefore have *H*(*Z*|*X*) = 0.
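As a minimal sketch of the relation between mutual information and entropies described above, the following Python snippet computes *I*(*X*, *Z*) = *H*(*Z*) − *H*(*Z*|*X*) for discrete random variables from a joint probability table. The function names and the three example channels are illustrative assumptions and are not part of the analysis reported here.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability table (zero entries are ignored)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(joint):
    """I(X, Z) = H(Z) - H(Z|X) for a joint table with rows indexed by X, columns by Z."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    h_x = entropy(joint.sum(axis=1))
    h_z = entropy(joint.sum(axis=0))
    h_z_given_x = entropy(joint) - h_x  # chain rule: H(X, Z) = H(X) + H(Z|X)
    return h_z - h_z_given_x

# Hypothetical binary channels with a uniform input
noiseless = [[0.5, 0.0], [0.0, 0.5]]        # Z copies X exactly
noisy = [[0.45, 0.05], [0.05, 0.45]]        # Z differs from X with probability 0.1
independent = [[0.25, 0.25], [0.25, 0.25]]  # Z carries no information about X

for name, table in [("noiseless", noiseless), ("noisy", noisy), ("independent", independent)]:
    print(f"{name}: I = {mutual_information(table):.3f} bits")
```

For the noiseless table the conditional entropy vanishes and the mutual information equals the input entropy of one bit; for the independent table it is zero.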

There are two subtle but we feel important differences between traditional applications of information theory and those found in biological systems. First, any molecular signal transduction pathway typically maps the input (such as an environmental stimulus), *X*, onto an appropriate cellular response (e.g. the concentration of an active transcription factor), *Z*. Instead of faithfully reproducing the signal it is in fact *processed*, i.e. altered. Adaptive behaviour, where the response of the system attenuates back to the baseline level for a continuing stimulus, serves as a useful example, where *I*(*X*, *Z*) would vary over time and eventually, for perfectly adapted behaviour, will approach zero. This can be partly accommodated into a conventional information theoretical approach by considering *X* and *Z* to be random variables with different event spaces [9]. This may happen, for example, for switch-like behaviour, where *Z* ≈ 0 for *X* smaller than some threshold, *X* < *X*_{t}, and *Z* ≈ *c* > 0 for *X* > *X*_{t}; here, a continuous input is mapped onto distinct ‘ON’ and ‘OFF’ states [19].

The second difference lies in the physical manifestation of biomolecular signal transduction systems, which differ profoundly from their engineering counterparts: in physical systems, there is a clear distinction between the channel or information transmission infrastructure (typically wires or antennas), the message (electrons or electromagnetic waves) and the energy required to deliver the message, although in, for example, single molecule transistors such a separation no longer holds. In biomolecular systems, the information processing machinery, the message and the energy units are all molecules, and medium and message are intricately linked [11,20,21].

Here, we investigate how information is transduced by simple molecular systems, and using extensive simulation studies, we attempt to distil the principles underlying molecular information processing. In particular, we shall focus on noise and its impact on information processing in signal transduction [16,22–24]. The concept of ‘noisy channels’ has been central to information theory since its conception [1], but in a molecular context, as outlined above, we are not necessarily able to separate between the channel and the transmitted message.

Given that information is measured by entropy and that dynamics at the molecular scale tend to be stochastic, we may expect systemic distortion of signals owing to the noise inherent to molecular dynamics [22]. Furthermore, in addition to such ‘intrinsic noise’, different individual cells will be subject to ‘extrinsic noise’ sources [25,26]. These include, by definition, variability among cells owing to factors not explicitly considered in the analysis. In signal transduction, this may, for example, be due to variability in the number of cell receptors, ribosomes, proteasomes, kinases, phosphatases, etc.

Below, we will discuss the roles of intrinsic and extrinsic noise in the context of very simple signal transduction systems. We will focus on proteomic processes and very simple ON/OFF signals. Three main lessons emerge from this analysis: (i) noise, especially extrinsic noise, can lead to a systematic inflation in the apparent information transmitted through molecular interaction networks; (ii) transmission of stationary and even the simplest time-varying variable signals can differ quite profoundly (also in respect to which different noise sources affect the information transmission) even for very simple signalling motifs; and (iii) whereas we by now have good insights, often even intuition, as to how different molecular architectures/motifs affect the dynamics of a biological system, understanding the transmission of information, despite considerable recent progress, is still challenging. Even for the simple motifs considered here, a rich behaviour of the conventional mutual information can be found.

## 2. Methods

In order to analyse the effects of noise in cellular signal processes, we use information theoretical concepts, in particular mutual information. Obtaining its value is widely regarded as challenging, and there is little consensus as to the best estimator of mutual information [27–30]. Except for the special case where the joint distribution *p*_{X,Y} is multivariate Gaussian, analytical values cannot be obtained. For this reason, many different types of estimators have been devised; here, we focus on a kernel density estimator-based approach applied on instantaneous measurements of input and output states; we also use the linear noise approximation (LNA) for trajectories, where we have analytical results as the distributions are multivariate normal (MVN).
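For reference, the Gaussian special case mentioned above has a well-known closed form. For a bivariate normal pair with correlation coefficient *ρ* (a symbol introduced here purely for illustration), and more generally for jointly normal vectors with marginal covariance matrices $\Sigma_X$, $\Sigma_Y$ and joint covariance matrix $\Sigma$, the mutual information in nats is

$$I(X, Y) = -\tfrac{1}{2}\ln\left(1 - \rho^{2}\right), \qquad I(X, Y) = \tfrac{1}{2}\ln\frac{\det\Sigma_X \,\det\Sigma_Y}{\det\Sigma}.$$

As a worked example, *ρ* = 0.5 gives $-\tfrac{1}{2}\ln 0.75 \approx 0.144$ nats, while *ρ* = 0 gives zero, as expected for independent variables.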

### 2.1. Kernel density estimator

The kernel density estimation (KDE) was employed by Steuer *et al*. [31]. Considering a Gaussian kernel, a one-dimensional distribution *f*(*x*) is approximated from a dataset *x*_{1},…,*x*_{N} as

$$\hat{f}(x) = \frac{1}{N h} \sum_{i=1}^{N} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(x - x_i)^2}{2h^2}\right), \tag{2.1}$$

where *h* is the bandwidth. For our Gaussian kernel, we use Silverman's rule of thumb [32],

$$h = \left(\frac{4}{3N}\right)^{1/5} \sigma \approx 1.06\,\sigma N^{-1/5},$$

which is considered to be the optimal choice when the underlying distribution is Gaussian, where *σ* is the standard deviation. These one- and two-dimensional estimates can be plugged into the continuous form of mutual information, which is a functional of probability densities

$$I(X, Y) = \int\!\!\int f(x, y)\,\log\frac{f(x, y)}{f(x)\,f(y)}\,\mathrm{d}x\,\mathrm{d}y. \tag{2.2}$$

However, in order to simplify the algorithm, we will typically represent the mutual information by

$$I(X, Y) = \left\langle \log\frac{f(x, y)}{f(x)\,f(y)} \right\rangle_{f(x, y)}, \tag{2.3}$$

and sample *N* times from a mixture of multivariate Gaussians with mean (*x*_{i}, *y*_{i}) and apply the KDE with respect to each Gaussian. In addition, we apply the copula transformation by transforming the input data into quantiles [33]. It is important to note that mutual information appears less sensitive with respect to the chosen smoothing parameter than the probability density.
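The estimator described above can be sketched in a few lines of Python. This is a simplified variant and not the implementation used for the results reported here: it relies on `scipy.stats.gaussian_kde` with its built-in Silverman bandwidth, applies a rank-based copula transformation, and averages the log-ratio over the observed data points rather than over fresh draws from the fitted Gaussian mixture. The function names and the synthetic test data are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde, rankdata

def copula_transform(values):
    """Map observations to quantiles in (0, 1) via their ranks."""
    values = np.asarray(values, dtype=float)
    return rankdata(values) / (len(values) + 1.0)

def kde_mutual_information(x, y):
    """Plug-in KDE estimate of I(X, Y) in nats, averaged over the data points."""
    u, v = copula_transform(x), copula_transform(y)
    joint = np.vstack([u, v])  # shape (2, N), as expected by gaussian_kde
    f_xy = gaussian_kde(joint, bw_method="silverman")(joint)
    f_x = gaussian_kde(u, bw_method="silverman")(u)
    f_y = gaussian_kde(v, bw_method="silverman")(v)
    return float(np.mean(np.log(f_xy / (f_x * f_y))))

# Synthetic data: one strongly dependent pair and one independent pair
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
y_dependent = 0.9 * x + np.sqrt(1 - 0.9**2) * rng.normal(size=n)
y_independent = rng.normal(size=n)

mi_dependent = kde_mutual_information(x, y_dependent)
mi_independent = kde_mutual_information(x, y_independent)
print(f"dependent: {mi_dependent:.3f} nats, independent: {mi_independent:.3f} nats")
```

Because the densities are evaluated at the points used to fit them, the estimate carries a small bias for finite samples; the dependent pair should nevertheless yield a clearly larger value than the independent pair.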

### 2.2. The linear noise approximation

When dealing with intracellular processes, low copy numbers of molecules often lead to significant stochastic fluctuations governing the system dynamics. The LNA provides a reliable solution for many such systems, especially those for which the molecule number does not pass below ≈10 and which are not highly nonlinear [34]. It has been previously successfully applied to simulate biochemical systems, for the inference of kinetic rate parameters [35,36], and for the sensitivity and robustness analysis of stochastic reaction systems [37].

We consider a general system of *N* species made up of *X*_{i}, *i* = 1,…,*N* molecules inside a volume *Ω*, giving a concentration *x*_{i} = *X*_{i}/*Ω*. The state of the system can change by one of *R* chemical reactions corresponding to an event *j*, leading to a change in species *i* according to the stoichiometric matrix *S* = {*S*_{ij}}_{i=1,2,…,N; j=1,2,…,R}. The probability of an event occurring in the time interval [*t*, *t* + d*t*] is given by the mesoscopic transition rates. The LNA approximates the chemical master equation by dividing the system's state into a deterministic and a stochastic part, which describe the mean concentration of the reactants and the deviation of the reactants from their mean concentration values, respectively,

$$x(t) = \varphi(t) + \Omega^{-1/2}\,\xi(t), \tag{2.4}$$

where *φ*(*t*) denotes the deterministic part and *ξ*(*t*) the stochastic part.

The equation describing the stochastic part is itself made up of two terms, respectively, comprising the drift *A* and diffusion *E*

$$\mathrm{d}\xi(t) = A(\varphi, t)\,\xi(t)\,\mathrm{d}t + E(\varphi, t)\,\mathrm{d}W(t), \tag{2.5}$$

where *W*(*t*) is a Wiener process, and the final distribution across all species is MVN at all times. We use the Stochsens package [38] to obtain the LNA equations for our given systems.
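To make the LNA concrete, the sketch below integrates the mean and covariance equations for a hypothetical two-species linear cascade (production and degradation of *X*, and production of *Z* catalysed by *X*), and then evaluates the Gaussian mutual information between the two species. It does not use the Stochsens package; the reaction scheme, the rate constants and the choice *Ω* = 1 are assumptions made for illustration. The covariance obeys the standard LNA result d*V*/d*t* = *AV* + *VA*ᵀ + *EE*ᵀ, with *A* the Jacobian of the macroscopic rate equations and *EE*ᵀ = *S* diag(*F*) *S*ᵀ for the vector of reaction rates *F*.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical rate constants (volume set to 1, so states are molecule numbers)
K_X, G_X, K_Z, G_Z = 50.0, 1.0, 2.0, 1.0

# Stoichiometry: rows are species (X, Z); columns are the four reactions
S = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])

def rates(phi):
    """Reaction rates: X production, X degradation, Z production, Z degradation."""
    x, z = phi
    return np.array([K_X, G_X * x, K_Z * x, G_Z * z])

def rate_jacobian(phi):
    """Derivatives of the reaction rates with respect to (x, z); constant here."""
    return np.array([[0.0, 0.0], [G_X, 0.0], [K_Z, 0.0], [0.0, G_Z]])

def lna_rhs(t, state):
    phi, V = state[:2], state[2:].reshape(2, 2)
    F = rates(phi)
    A = S @ rate_jacobian(phi)   # drift matrix
    D = S @ np.diag(F) @ S.T     # diffusion matrix E E^T
    return np.concatenate([S @ F, (A @ V + V @ A.T + D).ravel()])

def lna_moments(t_end=50.0):
    """Integrate mean and covariance from an empty system up to t_end."""
    sol = solve_ivp(lna_rhs, (0.0, t_end), np.zeros(6), rtol=1e-8, atol=1e-10)
    return sol.y[:2, -1], sol.y[2:, -1].reshape(2, 2)

def gaussian_mi(V):
    """Mutual information (nats) between two jointly Gaussian species."""
    rho2 = V[0, 1] ** 2 / (V[0, 0] * V[1, 1])
    return -0.5 * np.log(1.0 - rho2)

phi, V = lna_moments()
print("mean:", phi, "covariance:", V.ravel(), "MI (nats):", gaussian_mi(V))
```

For this linear scheme the stationary moments are also available analytically (mean 50 and 100, covariance entries 50, 50 and 200), which provides a convenient check on the numerical integration.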

### 2.3. Simulations

In addition to analyses using the LNA, we also considered stochastic differential equations, which were simulated using the Euler–Maruyama approximation [39]; the SciPy library (www.scipy.org) was used to solve ordinary differential equations. The simulations account for three different cases: the presence of intrinsic noise, of extrinsic noise and of both. In order to account for extrinsic noise in the systems we consider parameters characterizing different cells distributed according to a Gaussian distribution. In our simulations, we balanced the relative effects of extrinsic and intrinsic noise to study their respective effects on information transmission, and simulations were used to calibrate the system such that the variance in system output at steady state owing to each individual noise source amounted to 10% of the mean/deterministic steady-state abundance.
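The three noise scenarios can be illustrated with a short Euler–Maruyama simulation. The sketch below uses a hypothetical one-species birth–death model written as a chemical Langevin equation, d*z* = (*k* − *γz*) d*t* + √(*k* + *γz*) d*W*, and introduces extrinsic noise by drawing the production rate *k* of each cell from a Gaussian distribution. The model, the parameter values and the function name are assumptions for illustration and do not correspond to the motifs analysed below.

```python
import numpy as np

def simulate_cells(intrinsic, extrinsic, n_cells=2000, k0=100.0, gamma=1.0,
                   cv_ext=0.1, t_end=10.0, dt=0.01, seed=0):
    """Euler-Maruyama simulation of dz = (k - gamma z) dt + sqrt(k + gamma z) dW."""
    rng = np.random.default_rng(seed)
    if extrinsic:
        # Extrinsic noise: each cell has its own Gaussian-distributed rate constant
        k = rng.normal(k0, cv_ext * k0, size=n_cells)
    else:
        k = np.full(n_cells, k0)
    z = np.full(n_cells, k0 / gamma)  # start at the deterministic steady state
    for _ in range(int(round(t_end / dt))):
        step = (k - gamma * z) * dt
        if intrinsic:
            # Intrinsic noise: diffusion term of the chemical Langevin equation
            diffusion = np.sqrt(np.maximum(k + gamma * z, 0.0))
            step = step + diffusion * rng.normal(0.0, np.sqrt(dt), size=n_cells)
        z = z + step
    return z

cases = {"intrinsic": (True, False), "extrinsic": (False, True), "both": (True, True)}
results = {name: simulate_cells(*flags) for name, flags in cases.items()}
for name, z in results.items():
    print(f"{name}: mean = {z.mean():.1f}, variance = {z.var():.1f}")
```

With these illustrative parameters each noise source alone produces a standard deviation of roughly 10% of the mean output, so the two sources are balanced, and their combination roughly doubles the output variance.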

## 3. Results

### 3.1. Noise in simple motifs

We begin by estimating the mutual information between molecular species that form the inputs and outputs of three increasingly complex molecular signalling motifs [40–42] under the effects of different types of noise (figure 1). The first motif we consider is a basic input–output system with just two molecular species, corresponding to, for instance, a kinase and its regulated target substrate, with no additional interactions. Next, we introduce an additional species, to create a cascade motif, made up of a chain of elements with linear dependencies, which in a biological context could represent a simple signalling cascade [43]. The final motif is a three-species system of the simple feed-forward loop (FFL) type [40,44]. Note that there are eight different structural types of FFL based on different combinations of activation and repression, each categorized into coherent and incoherent depending on whether the signs of the direct and indirect regulation path are the same or opposite. Here, the focus is on the so-called coherent type-1 FFL which appears commonly in both *Escherichia coli* and *Saccharomyces cerevisiae* [44].

Using a stochastic differential equation (SDE) model, extrinsic noise is introduced via parameters that themselves follow a Gaussian distribution. We observe that most of the time the mutual information *I*(*X*; *Z*) is highest for the cases with extrinsic-only noise; information transmission is most affected, therefore, by the presence of intrinsic noise and, perhaps counterintuitively, the loss in fidelity is not sufficiently mitigated by increasing signal strengths. Interestingly, the mutual information value in the presence of both types of noise appears, most of the time, to be between that displayed solely in the presence of intrinsic and solely extrinsic types of noise. This is somewhat counterintuitive as we would expect the mutual information of the systems in the presence of both types of noise to be the lowest of the three. It leads us to believe that the two types of noise interact in a non-additive manner.

Furthermore, although this conclusion about the relative impact of the different types of noise cannot be generalized to all signalling pathways, the observed pattern is surprisingly similar across the three otherwise non-trivially different systems. The pattern also remains consistent when analysing these systems with different input signals *S*. In addition, our results show that increasing the value of *S* decreases the mutual information significantly in the presence of extrinsic noise.

To investigate the generality of our results further, the same set of motifs was also considered but accounting for the possibility of basal transcription and activation of each of the molecular species under mass action kinetics—this would be expected to reduce the ability of the system to trace the signal appropriately. In figure 2, we show the trajectories for molecular species *Z* in the simple, linear and FFL motifs obtained with the LNA compared with 100 trajectories obtained with the SDE; the mean value of the latter (red line) corresponds to the ODE trajectory obtained with the former (green line). As is apparent in figure 2, while the average behaviour of the LNA and the SDE is in excellent agreement (as are their respective variances), the (analytical) mutual information (MI) estimates obtained for the LNA are consistently higher than those obtained from the SDE simulations (which were estimated using KDE described in the Methods section); this reflects the way in which the LNA fails to capture non-Gaussian noise. The MI estimates in figure 2 for the LNA appear inflated compared with the SDE case, as the LNA restricts the joint distribution of the state variables; here input, *X*, and output, *Z*, compared with the SDE (which captures the stochasticity of the system fully). Nevertheless, both estimates display the same qualitative dependence on the signal and are consistent across motifs.

Thus far we have only looked at stationary signals. Biologically more interesting are, of course, dynamically changing signals—such as spatial differences in nutrient abundance or temporally varying environmental signals. For simplicity we consider a very simple form for a signal that changes with time: a square wave process that alternates between successive ‘ON’ and ‘OFF’ states. We model the motifs under two scenarios of this signal, corresponding to different switching rates, and investigate the covariances and mutual information values between the inputs and outputs of the three motifs for different levels of noise.
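A square-wave input of this kind, and the deterministic response of a motif to it, can be sketched as follows. The one-step model d*z*/d*t* = *kS*(*t*) − *γz*, the parameter values and the two periods are assumptions chosen for illustration; `max_step` is kept small so that the solver does not step over the discontinuities in the signal.

```python
import numpy as np
from scipy.integrate import solve_ivp

def square_wave(t, period, s_on=10.0, s_off=0.0):
    """Signal that is ON for the first half of every period and OFF for the second half."""
    return s_on if (t % period) < 0.5 * period else s_off

def simulate_response(period, k=1.0, gamma=1.0, t_end=40.0, n_points=401):
    """Deterministic response dz/dt = k S(t) - gamma z of a hypothetical one-step motif."""
    def rhs(t, z):
        return k * square_wave(t, period) - gamma * z
    t_eval = np.linspace(0.0, t_end, n_points)
    sol = solve_ivp(rhs, (0.0, t_end), [0.0], t_eval=t_eval, max_step=0.05)
    return sol.t, sol.y[0]

t, z_slow = simulate_response(period=20.0)  # slow switching: output tracks the signal
_, z_fast = simulate_response(period=2.0)   # fast switching: output is low-pass filtered

print(f"slow switching: z ranges over [{z_slow[t >= 20].min():.2f}, {z_slow[t >= 20].max():.2f}]")
print(f"fast switching: z ranges over [{z_fast[t >= 30].min():.2f}, {z_fast[t >= 30].max():.2f}]")
```

When the signal switches slowly relative to the degradation rate the output relaxes fully between the ON and OFF levels, whereas for fast switching it oscillates within a much narrower band.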

In a first instance, the motifs are simulated with the LNA, which already accounts for the intrinsic noise, to which varying degrees of extrinsic noise are added as displayed in figure 3. The mutual information varies with the signal dynamics: it oscillates between two values, reaching its peak and base points just after the switch is turned OFF and ON, respectively. The average mutual information between input and output is not affected by the level of extrinsic noise, but its variance increases quite considerably; we do, however, observe that this increase with extrinsic noise is somewhat suppressed as the motifs become more complicated (as the effects of different origins of noise can balance one another out). Interestingly, the covariances shown in figure 3 trace the dynamics of the inputs more faithfully than do the MI estimates; this is because the MI also depends on the individual variances of *X* and *Z*, which have their own temporal dependencies. The details of this do, of course, depend on the parameters of the motifs analysed here, but the results obtained here are characteristic of the behaviour that can be observed for even such simple motifs. Obviously, signalling dynamics will also depend on the frequency characteristics of the input signal.

In particular, for the FFL, depending on the parameters, any number of different types of behaviour can be observed. But generally, we find that the mutual information between *X* and *Z*, *I*(*X*, *Z*), is less variable, compared with the other motifs, as the extrinsic noise is increased. More generally, the mutual information traces the signal more faithfully for the FFL, as *X* affects *Z* both directly and indirectly via *Y*, which integrates out some of the variability resulting from extrinsic noise.

To complete the analysis of the motifs under dynamic conditions, we analyse the behaviour of the motifs with the same signal conditions, simulated with ordinary differential equations but perturbed by extrinsic noise only, and compare them with the corresponding stochastic system (intrinsic noise) and a system with both types of noise. In figure 4, we show trajectories for molecular species *Z* in the three systems. What can be seen from the trajectories is that the intrinsic noise appears to govern the dynamics of *Z*; in fact, combining both noise sources affects the trajectories only marginally. At the bottom of figure 4, we focus on the mutual information estimates. The mutual information was computed via the KDE across all types of noise for two specific time points. The time points *T*1 and *T*2, displayed on the trajectories of species *Z*, were selected based on the observations made in figure 3; specifically, we chose time points just after the switch is turned OFF and ON to estimate the peak and base in the mutual information oscillation.

We show that the estimated mutual information for the linear three-node motif and the FFL is highest for the ODE with extrinsic noise, whereas for the simple motif it is lowest. This result is highly dependent on the relative sizes of the different parameters and can be explained by the effect that different parameters have on the system dynamics. Generally, cells that have different internal parameters will map inputs onto different outputs, which may appear as inflated information transmission: note that the steady-state abundances (if steady states exist) of all molecules will depend on the parameters. We also compare the information transmission efficiency for different signal frequencies (*S*1 and *S*2); the effects here are more pronounced in the presence of both sources of noise: when extrinsic noise is present, we find a greater difference between minimal and maximal transmitted mutual information. When only intrinsic noise is present, this apparent dependence of transmitted information on the frequency of the signal is not pronounced. Across these systems, it would appear once again that the addition of intrinsic noise to extrinsic noise is not cumulative, as the value of mutual information in the presence of both types of noise seems to oscillate around that of the intrinsic noise.

### 3.2. Noise in protein expression and activation

So far we have considered generic models that have previously been described in, or applied to, biological signalling or regulation dynamics. Here, we apply the same perspective to a model of protein expression that is more immediately connected to biological processes [45,46]. Protein expression requires a cascade of biomolecular reactions to produce functional protein. Each reaction is associated with a relative loss of information, but some may distort signals more than others. We consider the model used by Komorowski *et al*. [23] to describe gene expression and activation of the protein product via reversible phosphorylation, where the kinase and phosphatase are assumed to be abundant and at constant activity levels.

The following equations, involving mRNA *m*, protein *P* and active (phosphorylated) protein *P**, were considered in order to simulate the system in a similar fashion to that above
3.1

The model represented in figure 5 was considered with different rates of dephosphorylation (parameter *ω*) and degradation of the active protein (parameter *μ*), which were previously shown to be the reactions that make the largest relative contributions to the variability in the abundance of the active protein [23]. For this system, we proceed as before and estimate the MI for the three noise scenarios between the three molecular species at steady state. Again, we find that extrinsic noise leads to an apparent increase in the mutual information (see figure 5), whereas intrinsic noise leads to a clear reduction in the mutual information. Mutual information is always highest between *P* and *P** but low overall. In the presence of both types of noise, the mutual information typically takes on intermediate values.

The dependence of the mutual information (for the two cases exhibiting extrinsic noise) on the rates of dephosphorylation and degradation is such that it initially decreases as the rates of these two processes increase, before increasing again with further increase in the dephosphorylation and degradation rates. We will revisit these observations below. The most important result of this finding is that the apparent amount of mutual information between, for example, mRNA and protein or active protein can be inflated by cell-to-cell variability owing to extrinsic sources of variability.

## 4. Discussion

An important issue to remember is that, while mutual information can shed light upon the effectiveness of transmission of information, information is only a statistical measure for the regularity of patterns in a stream of data, and not all of this may be biologically relevant. Extrinsic noise (the systematic differences in molecular parameters between different cells) will often (but not always) act to distort or stretch out signals (see also figure 6). For the purpose of illustration, we consider a system with two molecular species described by random variables *X* and *Y* (e.g. the simple linear motif), with *X* ∼ *f*(*S*, *θ*) and *Y* ∼ *g*(*x*, *θ*) (we consider systems where *Y* is independent of *S* conditional on the state of *X*). Then, given an input signal (which may change over time), *S*(*t*), in cells which have identical parameters *θ*_{0}, we obtain measurements $x_1^{(0)}, x_2^{(0)}, \ldots$ and $y_1^{(0)}, y_2^{(0)}, \ldots$

In the presence of extrinsic noise, the cells will differ in their respective parameters, and we obtain *x*_{1}, *x*_{2},… and *y*_{1}, *y*_{2},…. Then, we generally expect (and have indeed found in the results shown here) that

$$I\left(X^{(0)}, Y^{(0)}\right) \le I(X, Y), \tag{4.1}$$

where $X^{(0)}$ and $Y^{(0)}$ refer to cells with identical parameters *θ*_{0}, and *X* and *Y* to cells whose parameters differ.

The dynamics of the signal transduction system affect the mutual information as well as the entropies of the random variables *X* and *Y*; suppression of the effects of extrinsic noise will not be the rule, but its effect will be reduced if intrinsic noise is appreciable. We can also rationalize the inequality (4.1) by considering the effects of extrinsic noise on the terms in the definition of the mutual information:

$$I(X, Y) = H(Y) - H(Y \mid X).$$

Extrinsic noise will tend to lead to an increase in the spread of *Y*, and hence *H*(*Y*) will increase under extrinsic noise. Depending on the dynamics of the signalling system, we would also expect the conditional entropy *H*(*Y*|*X*) to decrease as both *Y* and *X* are functions of the parameters *θ* that differ between cells.
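The inflation of apparent information can be reproduced with a deliberately simple toy model, sketched below. Each cell carries a parameter *θ* that scales the mean abundance of *X*, and *Y* is a noisy copy of *X*; all variables are Gaussian, so entropies and mutual information follow from sample variances and the sample correlation. The model, its parameter values and the function names are assumptions for illustration. In this toy model *θ* acts on *Y* only through *X*, so *H*(*Y*|*X*) stays fixed and the increase in mutual information comes from *H*(*Y*) alone.

```python
import numpy as np

def gaussian_entropies(x, y):
    """Return H(Y), H(Y|X) and I(X, Y) in nats under a bivariate Gaussian model."""
    var_y = np.var(y)
    rho = np.corrcoef(x, y)[0, 1]
    h_y = 0.5 * np.log(2 * np.pi * np.e * var_y)
    h_y_given_x = 0.5 * np.log(2 * np.pi * np.e * var_y * (1 - rho**2))
    return h_y, h_y_given_x, h_y - h_y_given_x

def sample_cells(n_cells, theta_sd, rng, x_mean=100.0, noise_sd=10.0):
    """Draw one (x, y) pair per cell; theta scales the mean abundance of X."""
    theta = rng.normal(1.0, theta_sd, size=n_cells)  # theta_sd = 0: identical cells
    x = theta * x_mean + rng.normal(0.0, noise_sd, size=n_cells)
    y = x + rng.normal(0.0, noise_sd, size=n_cells)
    return x, y

rng = np.random.default_rng(0)
identical = gaussian_entropies(*sample_cells(20000, 0.0, rng))
heterogeneous = gaussian_entropies(*sample_cells(20000, 0.2, rng))

for name, (h_y, h_y_x, mi) in [("identical cells", identical), ("heterogeneous cells", heterogeneous)]:
    print(f"{name}: H(Y) = {h_y:.3f}, H(Y|X) = {h_y_x:.3f}, I(X, Y) = {mi:.3f} nats")
```

Within any single cell the channel from *X* to *Y* is the same in both scenarios; only the pooled, population-level estimate of the mutual information increases once the cells differ in *θ*.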

For intrinsic noise alone, the interplay between the dynamics of the molecular information processing system and the concomitant inherent stochasticity is already difficult enough to disentangle and has attracted considerable attention [45–48]. Especially for nonlinear systems, the combined effects of noise and dynamics can give rise to rich and diverse behaviour of the system [24]. Here, we have mostly focused on the stationary dynamics and there, as far as the information transmission is concerned, we can typically ignore much of this complexity (provided a stable set of equilibrium solutions exists). Because of the lack of normalization of the mutual information, the channel capacity [3,5], that is, the mutual information maximized over the distribution of the input, is sometimes preferred over the mutual information; this is a variational problem over the possible input distributions, *p*(*S*), of the signal, *S*.
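For a discrete channel, this variational problem can be solved numerically with the Blahut–Arimoto iteration, a standard algorithm that is not used in the present study and is shown here only to illustrate what maximizing over *p*(*S*) involves. In the sketch below, `channel[i, j]` is the probability of output *j* given input *i*; the function names and the two example channels are assumptions for illustration.

```python
import numpy as np

def _kl_rows(channel, q):
    """KL divergence (nats) between each row of the channel matrix and the output marginal q."""
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(channel > 0, np.log(channel / q), 0.0)
    return np.sum(channel * log_ratio, axis=1)

def channel_capacity(channel, tol=1e-12, max_iter=100_000):
    """Blahut-Arimoto iteration. Returns (capacity in bits, maximizing input distribution)."""
    channel = np.asarray(channel, dtype=float)
    p = np.full(channel.shape[0], 1.0 / channel.shape[0])  # start from a uniform input
    for _ in range(max_iter):
        d = _kl_rows(channel, p @ channel)
        p_new = p * np.exp(d)  # reweight inputs by how distinguishable their outputs are
        p_new /= p_new.sum()
        converged = np.max(np.abs(p_new - p)) < tol
        p = p_new
        if converged:
            break
    capacity_nats = float(np.sum(p * _kl_rows(channel, p @ channel)))
    return capacity_nats / np.log(2), p

# Hypothetical channels: a symmetric one and an asymmetric one
symmetric = np.array([[0.9, 0.1], [0.1, 0.9]])
asymmetric = np.array([[1.0, 0.0], [0.5, 0.5]])

for name, w in [("symmetric", symmetric), ("asymmetric", asymmetric)]:
    capacity, p_opt = channel_capacity(w)
    print(f"{name}: capacity = {capacity:.4f} bits, optimal input = {p_opt}")
```

The maximizing input distribution is uniform for the symmetric channel but not for the asymmetric one, which shows that the capacity depends on the conditional distribution of the output given the input, and therefore on whatever parameters shape that distribution.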

But the channel capacity also implicitly depends on the parameters, *θ*, characterizing the information processing network, i.e. the function *f*(*S*; *θ*). This makes the interpretation of the information processing capability of populations of systems/cells in the presence of extrinsic noise less straightforward. In principle, we could consider the channel capacity averaged across the ensemble but this would hopelessly skew the results, as it will be the between-cell variability that will drive the ‘apparent’ information between inputs and outputs that is captured by the mutual information. In each single cell—or any ensemble of cells with the same kinetic parameters—the mutual information will be much smaller as intrinsic noise alone will only decrease and never increase information—of course, extrinsic noise only increases apparent information (the ‘level of surprise’ at seeing a given symbol/signal). Taken together, we cannot predict *a priori* how these contrasting forces will interact. Certainly, the effects of different types of noise on signal transduction are not simply additive.

Extrinsic noise, i.e. different parameters characterizing the biomolecular reaction networks in different cells, can even lead to qualitatively different behaviour across a population of cells; some cells might, for example, oscillate, whereas others attain a stable equilibrium (depending on the eigenvalues of the corresponding Jacobian matrices describing the different systems). Different parameters (see figure 6) are associated with different gradient fields that may drive solutions (even for identical initial values) to diverge; especially for linear and monotonic systems, we would then expect to observe an apparent increase in the mutual information for extrinsic uncertainty. Intrinsic noise, by contrast, will typically (but not always) broaden a deterministic solution. How and when these different sources of noise work together, and how they affect information transmission, is highly dependent on the system under consideration; this is especially true for nonlinear and non-monotonic dynamical systems.

However, it is clear that these types of noise do not contribute to the information transmission across the system in a simple additive way. The present analysis appears to suggest that extrinsic noise can give rise to ‘apparent information’ and it is important to be aware of this when assessing biological information processing systems, or comparing single-cell- and population-level processes from an information theoretical perspective.

What this analysis has provided is a quantitative assessment of the effects of different types of noise on the information transmission along simple network motifs inspired by biophysical systems. We feel that there are two important lessons that follow from this work: (i) even for very simple systems and simple signals, the information theoretic analysis reveals rich and diverse behaviour. Because of the statistical definitions of entropies and mutual information this diversity may be hard to glean from looking at the dynamics of the system alone; instead, we really have to understand the effect of the dynamics of the molecular reaction network on the distributions of inputs; (ii) as single cell data are becoming available more routinely, it becomes important to be able to deal with extrinsic noise as it tends to affect our assessment of biological information processing and can lead to inflated estimates of the mutual information from single cell data. There are different ways to implement extrinsic noise, and the one chosen here is perhaps among the most straightforward and convenient [25,26].

The comparison of the LNA with exact stochastic simulations was instructive in showing that the amount of variability and the shape of the distribution of outputs can play a profound role. The LNA may miss some of this as real distributions may differ quite substantially from the underlying Gaussian assumption in the LNA, especially for low molecule abundances and/or nonlinear dynamics. The dynamical features of a system affect the information transmission even at stationarity. Analysis of such simple motifs and the way that they shape cellular information transmission can only be a first step, but it is a necessary one, towards understanding of cellular decision-making processes. Relevant information is, however, not always just encoded in terms of abundances (or amplitudes); frequency and gradients are sensed as well, and for these more explicitly dynamical notions of mutual information such as the transfer entropy need to be considered from the outset. The present results suggest, however, that the role of extrinsic noise ought to be considered explicitly, as failure to properly account for its effects is likely to be misleading.

## Competing interests

We declare we have no competing interests.

## Funding

S.S.M.M. and O.L. acknowledge studentship support from the Department of Life Sciences and the BBSRC, respectively. M.P.H.S. is a Royal Society Wolfson Research Merit Award holder and acknowledges support from the BBSRC, EPSRC and the Leverhulme Trust.

## Acknowledgements

We thank the members of the *Theoretical Systems Biology Group* for helpful discussions.

- Received July 4, 2015.
- Accepted August 10, 2015.

- © 2015 The Author(s)

Published by the Royal Society. All rights reserved.