Abstract
There is increasing interest in the use of the percolation paradigm to analyse and predict the progress of disease spreading in spatially structured populations of animals and plants. The wider utility of the approach has been limited, however, by several restrictive assumptions, foremost of which is a strict requirement for simple nearestneighbour transmission, in which the disease history of an individual is influenced only by that of its neighbours. In a recent paper, the percolation paradigm has been generalized to incorporate synergistic interactions in host infectivity and susceptibility, and the impact of these interactions on the invasive dynamics of an epidemic has been demonstrated. In the current paper, we elicit evidence that such synergistic interactions may underlie transmission dynamics in realworld systems by first formulating a model for the spread of a ubiquitous parasitic and saprotrophic fungus through replicated populations of nutrient sites and subsequently fitting and testing the model using data from experimental microcosms. Using Bayesian computational methods for model fitting, we demonstrate that synergistic interactions are necessary to explain the dynamics observed in the replicate experiments. The broader implications of this work in identifying diseasecontrol strategies that deflect epidemics from invasive to noninvasive regimes are discussed.
1. Introduction
The statistical analysis of observations of spatial disease spread using contact distribution models and percolation methods has been recognized as an extremely useful approach for characterizing transmission mechanisms since it was first applied to virus diseases of citrus [1]. Standard epidemiological models such as Susceptible, Infectious or Removed (SIR) cellular automata can be conveniently considered as percolation processes [2,3], with concepts such as the percolation threshold being analogous to the threshold between invasive and noninvasive behaviours [3]. This enables the properties of largescale phenomena, in this case an epidemic, to be related to the interactions among individuals in the population, or vice versa. Work by Gibson et al. [4] has advanced this to the stage of obtaining estimates of the percolation parameters using Markov chain Monte Carlo (MCMC) methods on experimental data, comprising successive spatial snapshots of infected and susceptible individuals in plant populations. The use of the percolation paradigm for epidemiological spread has been advocated for an increasingly wide range of host–pathogen systems. Hence, previous work has adapted the paradigm not only to the spread of pathogenic fungi through populations of live hosts in soil [5] but also to the spread of saprotrophic fungi through fragments of organic matter in soil [6]. Within animal populations, the percolation paradigm has recently been used to analyse the spread of plague in gerbil populations in Kazakhstan [7]. There are, however, certain assumptions that may restrict the application of the standard percolation paradigm to epidemiology. Foremost among these is the assumption that the challenge presented by an infected to a susceptible site does not depend upon the state of any other neighbours of these sites. This would suggest that multiple challenges to a susceptible site from one, two or more neighbouring infected sites are mutually independent. Other assumptions include the absence of a latent period, whereby infected sites are immediately infectious, and restrictions to nearestneighbour transmission of infection. Recently, the need for the first assumption has been removed thanks to a generalization of the percolation paradigm [8] that includes synergistic interactions. Here, we focus on methods of inference for this more general model and on applying it directly in the analysis of experimental data to provide the first demonstration of the potential presence of synergy in a real system.
We use a tractable experimental model system and a particular type of synergy that relates to fungal infection in soilborne plant pathogens [9]. For many soilborne fungi, infection occurs by mycelial growth from infected to susceptible sites. Translocation of nutrients along mycelium to hyphal tips is known to occur for many fungi [10–12], and hence the connectivity of infected sites may be assumed to influence the probability of infection of newly encountered susceptible sites. In such a model, the infection process is therefore assumed to depend upon the number of other infectious individuals connected to the infectious site (figure 1a). We introduce a method for estimation of the parameters defining the infectious properties of individuals in the presence of synergy by analysing the largescale dynamics of an epidemic. Our experimental methods also apply to saprotrophic dynamics involving soilborne fungi that spread by mycelial extension through populations of organic fragments and which are also known to exhibit translocation and synergy [13]. We have shown previously that the epidemiological paradigm can be used to analyse the spatial and temporal dynamics of saprotrophic colonization by treating uncolonized, colonized and nutrientexhausted sites as the equivalents of susceptible, infected and removed sites in epidemiological models [6,14].
We first introduce a Bayesian computational method to estimate parameters for a percolation model with synergistic enhancement of infectivity, hereafter referred to, for brevity, as synergy and test the sensitivity of the inferences using simulated data. Thereafter, we apply the methods to data from a simple experimental system that mimics both the pathogenic and saprotrophic systems. Successive snapshots for the spread of the fungus Rhizoctonia solani Kühn over a lattice of nutrient sites are used to estimate the parameters for a percolation model with synergy and to infer mechanisms underlying the synergy. Rhizoctonia solani is a ubiquitous soilborne fungus with both pathogenic and saprotrophic activities that is representative of a broad range of fungi. It causes dampingoff disease of seedlings and root or stembase rots in a very broad range of host species that include economically important agricultural crops as well as species in natural and seminatural habitats [15]. It is therefore an economically and ecologically important fungus in its influence on plant and crop health and in nutrient cycling in soils.
Specifically, we investigate the following biological hypotheses:

Do the transmission rates depend upon the local state of the infection neighbourhood?

How does the number of infectious neighbours connected to a site affect the transmission rate?

Does the relationship change with the nutrient state of the sites?
The consequences of the results for the interpretation and prediction of fungal and pathogen invasion are briefly discussed together with the robustness of the results to the assumptions that underpin the model and analyses and the general role of the methodology in epidemiological inference.
2. System and model
The spread of R. solani (R5, AG21, IMI385769) was monitored through a lattice of nutrient sites as a series of spatiotemporal snapshots (cf. figure 1b). The experimental system is the same as that detailed in Otten et al. [6] and described in detail in §3. In essence, the system consists of a circular Petri plate with 177 nutrient sites comprising potato dextrose agar (PDA) available for fungal colonization arranged in a square lattice. The central site is inoculated with R. solani, and the progress of colonization during successive 24 h periods is mapped for a maximum of 30 days. There were two treatments: one with high nutrient density (HND) at vertices to encourage sharing of nutrients and enhance synergy in colonization, and another with low nutrient density (LND) which is expected to suppress the synergy. Note that differences in nutrient density at nutrient sites are expected to influence both the standard rate of transmission as well as an additional synergy parameter. There were 20 replicates of the HND treatment, and 10 for the LND.
Since synergy depends upon the connectivity between infected sites, the model for the system depends upon the state of the sites (vertices) and the links (or bonds) between sites. The model is based upon a standard SIR formulation. Each site is in one of the three states: Susceptible, Infectious or Removed. On becoming infectious, a site remains so for a fixed period τ after which it enters the removed state. During the infectious period a bond to each neighbour is created as the first event in a Poisson process with instantaneous rate λ. Should this first event fall outside of the infectious period, the bond is not created. Creating a bond between an infectious and a susceptible individual causes the infection of the susceptible. Bonds can also be created between two infectious sites, enabling sharing of nutrients, possibly changing the rate of bond creation of each. A site becomes removed at time τ after it has become infected.
We consider two models for synergistic enhancement. In both cases, the transmission rate λ for an infectious site varies with the number n of infected neighbours to which it is connected. For example, figure 1a shows an infectious site that is surrounded by two infectious and two susceptible sites and creates a bond with each susceptible site independently during its infectious period according to a Poisson process with rate λ. Since the central infectious site has a connection to only one of its infected neighbours, n = 1. In the first synergy formulation, λ has the simplest linear dependence whereby λ = α + (n − 1)β, and α is the initial rate of infection, and β is a parameter quantifying the amount of synergy. The appearance of n −1 in the rate as opposed to n is motivated as follows: initially, each infectious site (apart from the first) has one connected neighbour—the source of the infection. The above definition of λ therefore ensures that the initial rate of infection is always α, making comparison with the standard percolation model (which corresponds to β = 0) more convenient. If λ is a constant, this model precisely corresponds with bond percolation [3].
The second model is a generalization of the first that allows for any specified dependence on n, and allows us to assess whether the linear model is adequate to describe the synergy. In this case, the rate for each distinct number of infectious neighbours is estimated independently leading to four parameters, λ(n) for n ∈ (0, 1, 2, 3). This allows for unknown effects, such as, for example, saturation of the rate. Both models exemplify the concept of dsynergy as described by PerezReche et al. [8]. The first corresponds to the linear situation considered therein [8], whereas the second provides a more flexible framework for representing synergistic interactions. The results of this paper suggest that the generality of the latter model may be required to explain realworld synergistic interactions. The focus on dsynergy is felt to be appropriate as this form of synergy captures the known ability of fungi to translocate nutrients over considerable distances. A site with multiple connections to other infectious sites may therefore be expected to have access to a greater supply of nutrients through this mechanism and consequently exhibit greater infectivity.
3. Experiments and model fitting
The fungal plant pathogen R. solani Kühn (R5, AG21, IMI385769) was introduced on to a Petri plate (140 mm diameter) containing a population of PDA sites. Each site was a 3 mm diameter dot of either 10% (w/v) PDA for the LND treatment, or 40% (w/v) PDA for the HND treatment. The sites were arranged in a square lattice with an intersite distance of 8 mm, extending as far as the circular boundary of the plate, giving a total of 177 sites. The central site was inoculated with a single hyphal strand, 1 mm in length, removed from the edge of a 4dayold colony of R. solani growing on water agar. To avoid desiccation of the agar, moist filter paper was placed into the lid of each plate. The plates were sealed and incubated in the dark at 23°C, and assessed daily for 28 days using a dissecting microscope (magnification 40×) and the number and location of colonized sites recorded. The LND treatment was replicated 10 times and the HND treatment 20 times. The results of the 10 per cent PDA treatment were first published and analysed previously [6], whereas the 40 per cent treatment has not been published previously. Note that the path of infestation and presence or absence of hyphal links between sites was not recorded.
To fit the mathematical model to the experimental data, a Bayesian analysis using Markov chain Monte Carlo (MCMC) methods has been performed. These are a set of algorithms used in this work to determine a distribution of belief in the values of parameters that underlie the data—in this case, α, β and τ, or λ(0, …, 3) and τ—by simulating the distribution using a Markov chain whose state space is the parameter space. A general MCMC calculation requires the use of any prior knowledge of the parameters, encoded as a probability distribution p(θ), where θ is the set of parameters defined above, and a likelihood function L(xθ), where x is the set of observed data. This likelihood is, in essence, the sampling density of obtaining the data given values of the parameters. In the Bayesian paradigm, these are combined to produce the posterior density: 3.1
The prior distribution represents the beliefs regarding the parameters before the experiment is performed. In these experiments, the spacing and nutrient density of the agar dots were chosen to ensure that the system was close to the critical point separating invasive from noninvasive behaviour. For the linear model, this corresponds to the value of α being close to the critical point of a pure percolation process, i.e. α_{c} = ln 2/τ [3]. Similarly for the unconstrained model, this corresponds with λ(1) being close to α_{c}. The priors on these parameters (α and λ(1)) can therefore be fairly informative, whereas the other parameters (β and λ(0, 2, 3)) have vaguer priors. In all cases, independent uniform prior distributions have been used to represent prior belief regarding parameters. The specific hyperparameters used to specify each uniform distribution are tabulated in table 1.
The likelihood L(x,θ) in this case is the joint probability density of creation times of the bonds x = {t_{i}} considered as a function of the parameters of the model θ = (α, β, τ). By exploiting the independence of the Poisson processes underlying bond creation, the likelihood can be expressed as a product of the individual contributions l_{i} from each bond, both created and absent: 3.2where 3.3for the created bonds, where λ(t,θ) is the rate of bond creation as a function of time. For absent bonds, l_{i} is simply the probability that the bond was not created, and is equal to 3.4
The rate λ_{i}(t,θ) is calculated from the history of the environment of bond i over time. The bond connects two sites A and B, and its rate of creation is a function of the creation times of the neighbouring bonds, as well as the removal times of the neighbouring sites of A and B. As soon as A or B becomes infectious, λ_{i}(t,θ) takes a value of α + (n − 1)β or λ(1). For this initial event, n will always be one, since it takes one and only one infectious site connecting to infect A, hence λ_{i}(t,θ) is initially α (or λ(1)). As the environment changes, λ_{i}(t,θ) also changes, increasing as bonds are added to A and B, and decreasing as neighbours of A and B become removed. If B becomes infectious, the rate λ_{i}(t,θ) becomes the average of the rate due to A and the rate due to B. Once A or B becomes removed, λ_{i}(t,θ) drops to 0, since bonds cannot be made with removed sites. Biologically, the removal of a site corresponds to the exhaustion of the nutrients at that site, and therefore no sharing can happen.
In order to obtain the above likelihood, not only are the exact infection times for sites required, but also knowledge of the bonds created for nutrient sharing and their creation times. However, this information is not contained explicitly in the experimental data analysed. Observations were taken at daily intervals, where the state of each site was recorded. The exact path of infection (and hence the states of the bonds) was not recorded. Thus, the data provide the infection times of each site rounded up to the next day. These observations are denoted as y.
Since the infection process is determined entirely by the process of bond creation, we view creation times of bonds as the modelled variables rather than the infection times of the sites. The experimental observations then specify constraints on the bond creation times. A bond's existence implies that both A and B are infectious (as the bond's creation immediately infects any noninfectious sites). The minimum time the bond can have been created is therefore max[min(t_{A}), min(t_{B})], where t_{X} is the range of possible infection times of site X. Similarly, a bond cannot be created linking a removed cell, therefore the maximum time the bond can have been created is min[max(t_{A}), max(t_{B})] + τ, where τ is the infectious period. Any set of exact bond creation times x = {t_{i}} that satisfy the observation y gives a particular value of the likelihood. An average should be taken over the set X of such x to obtain the marginal likelihood as a function of the model parameters θ: 3.5The integration is not tractable, but can be approximated via computational Bayesian methods. Firstly, the unobserved bond creation times are treated as additional unknown parameters, and MCMC methods are then used to investigate the posterior distribution of this ‘extended’ parameter vector.
Following standard procedure, a joint posterior density on Θ × X is defined by 3.6where p(θ) is the prior distribution and L(θx) is evaluated using the method described above. The integration is performed by constructing a Markov chain on Θ × X. Steps are taken by either proposing new times for the individual bonds, proposing that bonds change their state from ‘created’ to ‘absent’ or vice versa, or updating the parameters θ. These proposed steps are then accepted or rejected according to the standard Metropolis–Hastings updating. For each component of θ, α, β and τ (or λ(0, …, 3) and τ), a new value is proposed, giving the new parameter vector θ^{p}. This is then accepted with probability 3.7
The proposal of changes for the individual bonds is complicated by the fact that it is not known a priori which bonds are created and which are not. Thus, the updating of an individual bond can either modify its creation time, or change the state (‘on’ to ‘off’ or vice versa). This necessitates the use of Reversible Jump MCMC [16], which modifies the standard Metropolis–Hastings acceptance probability to become 3.8where j_{m}(x) is the probability of proposing move m from x, for example the probability of proposing to change the time at which a bond was created, or of proposing to remove the bond entirely. The term denotes the probability of proposing the reverse move m^{r} from state x′. Here g_{m}(u) is the probability density at the particular value of the random quantity u required to make the jump. When moving up in dimension, this is the probability density of the new variable: in this case, the probability density of choosing the new bond creation time which is selected uniformly from its permissible range. The corresponding term is unity when moving down in dimensionality. The Jacobian ∂(x′,u′)/∂x,u is also unity in this case, since the dimensionchanging moves are simply projections.
From a state x, the new proposed state is constructed by selecting one of four possibilities, depending on the current state of the bond. If the bond is present in state x, it can either be removed, or its time altered. If the bond is currently not present, it can either be inserted with an allocated creation time, or left as it is. The two alternatives for each state are chosen randomly with equal probability, so that j_{m}, the probability of choosing a particular jump, is identical for all m, and cancels from the above acceptance probability. Therefore, when the dimensionality remains the same, the acceptance probability reduces to standard Metropolis–Hastings. When moving up or down in dimensionality, a factor of 1/Δ is introduced, where Δ is the width of the permissible range of the bondcreation time. In order to analyse all the experimental replicates jointly, the likelihood function is simply replaced with the joint likelihood function, L^{J}(θx) = ∏_{j}L(θx_{j}), where x_{j} represents the state vector of the j^{th} replicate. This new likelihood function is used when updating components of the model parameter vector θ.
This updating scheme specifies a Markov chain whose stationary distribution is the target distribution π(θ, xy), and by simulating this chain the marginal posterior distributions of the parameters θ = (α, β, τ) or (λ(0, …, 3), τ) can be obtained. These distributions represent the belief regarding parameter values given the experimental data.
Applying this technique to real data presents additional challenges in that infections between nextnearest neighbours occurred in the experimental results. These should not be permitted by the model outlined so far. There are several techniques for dealing with this problem, e.g. allowing for longrange infection processes [17] or by introducing a small rate of primary infection [4]. However, in this case, we have chosen to model these rare events as spontaneous infection of sites, with the exact infection time an additional parameter over which to integrate the likelihood, constrained only by the observation and not by the creation time of an appropriate bond.
4. Results
The statistical methodology was applied to fit the two models of synergy to the data from the microcosm experiments. Figure 1b illustrates spatiotemporal maps for the HND and LND experiments at 7, 12 and 20 days after initiation. It is evident that the HND sites are being colonized at a much higher rate than the LND sites: there are also far fewer gaps in the resulting pattern for colonization in the HND treatment. This is significant in that the presence of gaps provides important information with which to infer the lifetime τ of the fungus on a site. Inspection of the timecourse for individual replicates (figure 2) clearly shows that the higher rate of spread occurs in the HND treatment. The data from the LND treatment also show that apart from having a lower average rate of spread, there is much more variability in the final size of the epidemic, e.g. in two replicates, no site other than the first became colonized.
The principal results of the MCMC analysis are shown in figures 3 and 4 and offer some immediate insights into the questions outlined in §1. In the first instance. there is clear evidence that for both LND and HND treatments, a synergistic effect is present with the marginal posterior density for β being concentrated far from zero in both cases. This can be seen by inspection of the MCMC samples from the posterior density of (α, β) for the respective treatments (figure 3). The joint posterior density from the MCMC estimation for the transmission parameters, α and β, shows that nutrient density affects not only the initial rate of infection, α, but also the nature and strength of synergy (figure 4a,b) when it is characterized by fitting the linear synergy model. The HND treatment exhibits both a higher initial rate of infection as well as a greater synergistic effect (figure 4b,d).
There is clear evidence that the nature of the synergistic interaction varies with the treatment. The linear and unconstrained models are compared in figure 4c,d, along with the 95% highest posterior density interval (HPDI) calculated from the posterior densities of α and β. The models characterize the rates in a very similar manner for the lowdensity treatment (figure 4c), with good overlap for all values of n, indicating that the linear model gives a parsimonius description of the synergy effect. However, for the highdensity experiment (figure 4d), there is a strong deviation (concave upwards) from the linear model in the rates fitted using the unconstrained model. Note that the linear model attempts to capture this feature, but is bounded by the requirement that λ(n) > 0, and is perhaps most influenced by the rates λ(1) and λ(2) for which there are most data available. Predictions made using the linear model in the HND should be expected to underestimate significantly the infectivity of sites with three infected neighbours.
Examination of the timeseries plots for the parameters α, β and τ as the Markov chain progresses (figure 4a,b) verifies that the Markov chains mix well and that the estimated posterior densities are likely to be accurate. However, we note that the traceplot for τ covers smaller values and exhibits lowfrequency fluctuations in the highdensity treatment indicating lessefficient mixing of the chain in this case. This phenomenon may be anticipated given that the acceptable range for the lifetime (τ) is constrained by the current configuration of bonds and creation times. Should the proposed lifetime invalidate the presence or creation time of any bond, the proposal would immediately be rejected. This is more likely to happen under the highdensity treatment for which there were more replicates and more bonds per replicate. An accurate estimate of the posterior density of τ is nevertheless possible within the length of simulation runs described here (figure 4a,b). We infer that the smaller value for τ, and hence the shorter ‘infectious’ period associated with the highnutrient source, reflects differences in modes of growth, whereby the fungus with ready access to nutrients grows faster and more vigorously, and hence uses the nutrients more quickly.
5. Discussion
The percolation paradigm is widely invoked as a plausible mechanism for the spread of epidemics [18]. The impact of percolation models on decisionmaking is generally achieved through studies that first estimate key parameters in order to identify whether the system is in the invasive or noninvasive regions of parameter space. Control measures can then be designed that affect the parameters so as to move the system from the invasive to the noninvasive region. Until recently, the utility of this approach was hampered by several of the simplifying assumptions in the standard percolation paradigm. One such assumption that is likely to be violated in many biological systems is the restriction to nearestneighbour interactions in which interactions between a given pair of neighbours are not influenced by the states of other neighbours. Failure to account for these interactions could give highly misleading inferences about the probability of invasion and the effectiveness of potential control strategies. Our methods exploit the flexible paradigm of percolation with synergy, as introduced earlier [8], to provide a statistical scheme to avoid this risk by explicitly detecting and characterizing synergy in the transmission process. In particular, we formulate Bayesian computational techniques that can distinguish synergistic interactions at the individual scale that underlie the statistical properties of largescale patterns of disease spread. The methods use modest amounts of experimental data and replication.
We have used the methods to answer the three questions listed in §1 concerning the effects of synergy on transmission rates. In particular, under both LND and HND treatments, the inferences on parameters suggest that the transmission dynamics are not captured by a standard percolation model; rather, there is overwhelming evidence that favours the presence of synergistic enhancement of infectiousness. We conclude that the transmission rates depend upon the local state of the infection neighbourhood, since the posterior distribution for β is far from zero. Moreover, while the analysis shows that a linear dependence of infectiousness on the number of infected neighbours is supported by the data for the LND treatment, the parameter estimates for (λ(0), λ(1), λ(2), λ(3)) when the unconstrained model is fitted to the HND treatment show that the dependence is highly nonlinear. This confirms the hypothesized effect of nutrient status of sites on the characteristics of local transmission. It is plausible that this nonlinearity could arise through a synergistic effect owing to connections with infectious sites beyond the nearest neighbours in the HND case. Any site with connections to three infectious neighbours may tend to be located in a region of high infection density and may consequently have synergistic connections with other infectious sites. The influence of these sites may be being captured in the nearestneighbour model with general synergy by λ(3) taking an inflated value, leading to the nonlinearity. A further plausible mechanism may be a tendency of the fungus to explore preferentially directions that avoid areas already colonized. These mechanisms are speculative at this stage.
Attention here is focused on a particular system using models that make a number of strong assumptions. We remark that the methods can be readily applied, with appropriate modification, in more general settings. For example, the ‘memoryless’ assumption that underlies the formation of bonds can be dispensed with, as can the assumption of a constant infectious period (cf. [19,20]) without compromising the feasibility of the parameter estimation techniques. It is also possible to investigate synergistic effects for models that include transmission beyond nearest neighbours. We are confident that there exist numerous opportunities for widespread application of the approach.
 Received July 28, 2011.
 Accepted October 10, 2011.
 This journal is © 2011 The Royal Society