Royal Society Publishing

A stochastic model for ecological systems with strong nonlinear response to environmental drivers: application to two water-borne diseases

Claudia Torres Codeço, Subhash Lele, Mercedes Pascual, Menno Bouma, Albert I Ko


Ecological systems with threshold behaviour show drastic shifts in population abundance or species diversity in response to small variation in critical parameters. Examples of threshold behaviour arise in resource competition theory, epidemiological theory and environmentally driven population dynamics, to name a few. Although expected from theory, thresholds may be difficult to detect in real datasets due to stochasticity, finite population size and confounding effects that soften the observed shifts and introduce variability in the data. Here, we propose a modelling framework for threshold responses to environmental drivers that allows for a flexible treatment of the transition between regimes, including variation in the sharpness of the transition and the variance of the response. The model assumes two underlying stochastic processes whose mixture determines the system's response. For environmentally driven systems, the mixture is a function of an environmental covariate and the response may exhibit strong nonlinearity. When applied to two datasets for water-borne diseases, the model was able to capture the effect of rainfall on the mean number of cases as well as the variance. A quantitative description of this kind of threshold behaviour is of more general application to predict the response of ecosystems and human health to climate change.


1. Introduction

Ecology is pervaded by models with threshold behaviour, that is, models showing drastic shifts in population abundance or species diversity in response to relatively small changes in the parameters that control their dynamics. Resource competition theory, for example, predicts that microbial community should change drastically when resource ratio crosses a critical point where growth limitation shifts from one resource to another (Grover 1997). Epidemic theory predicts that parasites should persist only if host population density increases above a critical value (Anderson & May 1992). Population dynamic models also describe drastic population responses to environmental factors. Ellis & Post (2004), for example, show that the effect of temperature and density on wolf population is dramatically different below and above a critical density value. Below this threshold, population density is positively and linearly associated with temperature due to its effect on prey availability. Above the threshold, the effect of temperature becomes weak when compared with density effects. Grenfell et al. (1998) and Stenseth et al. (2004) describe a similar threshold behaviour for soya sheep in islands. Jacobson et al. (2004) show that the rate of population increase of ungulates changes drastically as snow depth crosses a critical point. The existence of such thresholds has significant consequences for the impact of climate change on ecological systems.

Classical approaches for modelling systems with threshold behaviour assume the existence of two (or more) non-interacting regimes: ‘low-density’ and ‘high-density’ (Ellis & Post 2004) or ‘regular’ and ‘explosive’ (Engel et al. 2001). Depending on the current regime, the system's response to covariates is governed by a different probability distribution or a different dynamic equation. The value of the threshold is often fixed and deterministic, and identified from data by visual inspection of scatter plots (Jacobson et al. 2004) or by investigating the threshold value that optimizes the fit of piecewise functions (Grenfell et al. 1998; Engel et al. 2001). This approach, however, often hides several difficulties that arise when either looking for theoretically predicted threshold phenomena in real datasets or representing observed thresholds using mathematical models. First, in real data, the existence of thresholds may be blurred by stochasticities and inadequate data resolution (Lloyd-Smith et al. 2005) which may soften the observed transition between regimes, making the identification of critical points a difficult task. Moreover, the variability in the system's response and not just its mean can vary drastically across a threshold. Figure 1 shows two examples of this pattern that will be considered in this paper. Similar patterns arise in the abundance of blue-green algae as a function of N : P ratios in lakes (Smith, 1983).

Figure 1

Two examples of systems with threshold response to the environmental driver. (a) Number of weekly severe cases of leptospirosis in a hospital in Salvador, Brazil, versus the amount of rain (mm) in the previous week (log-transformed), 1996–2001 (Flannery et al. 2001). (b) Number of monthly cholera deaths in Parganas, Bangladesh, versus the amount of rainfall in the same month (log-transformed), 1893–1940 (Pascual et al. 2002).

Here we propose a statistical modelling framework that allows for a flexible treatment of the transition between regimes, including different degrees of sharpness in the transition as well as drastic changes in system variability. The proposed model assumes two underlying stochastic processes, each of which dominates at one end of the environmental spectrum. Between these two extremes, the observations are a mixture of the two processes and the mixture proportion is a function of the driver. Thus, the transition between states can be more or less gradual and the actual shape of the transition is determined from the data as a function of the covariates. With this approach, we model situations where shifts between regimes do not follow a simple on and off response.

We illustrate the model with an application to time-series data for two water-borne diseases, leptospirosis and cholera, and their seasonal response to rainfall. The proposed threshold model captures not only the change in mean disease levels, but also their variances, as a function of rainfall, providing a functional form for the stochastic transition between regimes.

2. Two examples of environmental thresholds

Figure 1 shows scatter plots of disease versus rainfall for leptospirosis in Salvador, Brazil (Flannery et al. 2001) and cholera in Parganas, Bangladesh (Pascual et al. 2002). Leptospirosis is a water-borne acute bacterial disease that affects both humans and animals. Human infection occurs by contact with rodent urine-contaminated environment, surface water, soil and objects (WHO 2005). In many tropical urban centres, leptospirosis is an endemic disease, presenting a baseline incidence throughout the year and sporadic outbreaks after heavy rains, possibly due to the increased contact of people with contaminated flood water (Sarkar et al. 2002; Karande et al. 2003).

Historically, large cholera outbreaks in Bangladesh have been associated with deficient monsoon rainfalls (see Pascual et al. 2002, for a review of environmental drivers). This pattern has been classically attributed to the short supply of potable water during droughts. Similarly, the typical seasonal pattern in this endemic region involves two peaks per year with a drastic reduction in cases during the monsoons. A number of other hypotheses have been proposed linking rainfall to seasonal cholera cases through environmental factors that affect the concentration of the pathogen in aquatic environments (brackish and freshwater). Larger inocula of Vibrio cholerae in the environment would result from warming up of the water and associated plankton blooms outside the rainy season (Lobitz et al. 2000). Increased rainfall would dilute the concentration of the pathogen through an increase in volume of the environmental reservoir, lower salinity (below 15 0/00; Miller et al. 1982), lower pH (below 8.5; Huq et al. 1984) and reduced numbers of lysogenic vibriophages in the water (Faruque et al. 2005). Since a heavy bacterial load must be consumed to initiate a life-threatening case of cholera, the number of total deaths by cholera goes down in response to this dilution effect. A negative effect of rainfall has also been described for other regions of the world where periodic floods occur, as in the region of Madras, India (Pascual et al. 2002) and the Brazilian Amazon (Codeço 2001). However, the relationship between rainfall and cholera is complex with historical patterns of positive seasonal association in dry regions such as the Punjab and the northern districts of Madras where intermittent outbreaks occur (Pascual et al. 2002), as well as in more endemic, less dry regions at longer lags in southern Madras (Ruiz-Moreno et al. 2007). Evidence for positive interannual associations have also been recently described between floods and cholera transmission following El Niño events in recent decades in a rural area south of Dhaka (Koelle et al. 2005).

Despite the anecdotal evidence and the many known mechanisms by which rainfall can drive the seasonal outbreaks of water-borne diseases, a quantitative analysis of the nonlinear response of incidence to this environmental driver is still missing. Here, we illustrate a positive response of leptospirosis, with increased incidence after heavy rainfall in Salvador, Brazil, and a negative response of cholera, with a decrease in cases during the monsoon season in Bangladesh.

Both diseases show a different response at low and high values of the environmental driver and a sharp transition between these behaviours. At one extreme, the number of cases are consistently low (relatively to the average), while at the other extreme, a mixture of high and low cases is found. With the threshold model, the objective is to describe the effect of the environmental driver not only on the expected number of cases but also on the system's variability.

2.1 Threshold models: description and inference

To identify and describe the threshold behaviour suggested by the above disease patterns, we introduce a threshold model that incorporates two underlying processes and considers that the observed response is a mixture of these two processes. At extreme values of the environmental driver, one of the processes predominates. Between these two extremes, the observations are a mixture of the two processes and the mixture proportion is a function of the values of the environmental covariate.

McLachlan & Peel (2000) provide a good introduction to the general theory of probabilistic mixtures of distributions and likelihood-based inference. We use here the theory of estimating functions (Godambe 1991), instead of applying the full probabilistic modelling approach and likelihood-based inference, and consider only the mean and the variance functions of the process to estimate the parameters of the model.

Let μ1, μ2 denote the mean responses and Embedded Image, Embedded Image denote the variances of the two processes, respectively. Let π(z) denote the mixture proportion that is a function of the observed covariates Z. Then, the mean and the variance of the observed mixed process are given byEmbedded Image(2.1)

Embedded Image(2.2)

In the water-borne diseases we consider here, we model the individual processes such that Embedded Image and Embedded Image. This is similar to modelling the component processes as having a Poisson distribution. One may also consider other mean and variance functions. Furthermore, note that if one takes μ1=μ2, this approach allows one to model observations where the variance of the observed process is a mixture of the variance of the component processes depending on the covariates, but the mean is unaffected by these values. We model the proportion in the mixture using the logistic function, namely,Embedded Image(2.3)This flexible function takes values in the (0,1) range and allows for sharp or slow changes in the mixture proportion. Of course, any other function that takes values between 0 and 1 may also be used. In our examples, the change in the mixture proportion as a function of the covariate is fairly sharp and hence we call these models ‘threshold models’.

Let us denote the set of parameters that we want to estimate as θ. To estimate θ, we minimize the objective function as follows (Lele & Taper 2002):Embedded Image(2.4)with respect to the parameters. The theory of estimating functions can be used to prove that the estimators obtained by this procedure are consistent and asymptotically normal. The standard errors and confidence intervals (CI) for the parameters are obtained by the parametric bootstrap approach (Efron & Tibshirani 1993).

3. Results

Table 1 shows the point estimates and their bootstrap 95% CI for the two datasets. All calculations were performed using the R v. 1.5 statistical package (scripts are available upon request).

View this table:
Table 1

Parameter estimates for the threshold model.

For leptospirosis, we found an expected incidence of 1.77 cases per week in the drought scenario, and 13.8 cases per week after extreme floods. The other parameters, a and b, describe the effect of the environmental driver on the mixture of the two processes. Since rainfall is associated with increasing reporting of leptospirosis cases, parameter b is positive. Figure 2a shows the estimated mixture function for leptospirosis. The abrupt change in the mixture proportion when log(rainfall) exceeds 2 supports the hypothesis of a threshold response to rainfall.

Figure 2

Estimated mixing proportion (Embedded Image(z)) for (a) the leptospirosis and (b) cholera datasets.

For cholera, the expected number of deaths in the drought scenario is 2214 (95% CI: 2132, 2348) per month. During extreme floods, deaths go down to an average of 157 (95% CI: 156,159). The mixture function shows a negative association between rainfall and deaths (figure 2b), and the estimated response is less steep than that for leptospirosis.

To evaluate goodness of fit, we compared empirically measured and model-derived estimates of mean number of cases per level of rainfall intensity, as well as its standard deviation (figure 3). Empirical estimates were obtained by calculating mean and standard deviations for the observations within the rainfall intervals: [0,1], [1,2], …, [5,6]. For the threshold model, mean and average were calculated using fitted values within the same intervals. We further included, for the sake of comparison, estimates from a standard Poisson regression model in which expected cases are considered as a function of rainfall. From the results, it is clear that the threshold model captures the effect of rainfall reasonably well on the expected number of cases and the variance, for both diseases. By contrast, the simple Poisson regression model captures well only the expected value but not the variability.

Figure 3

Observed and predicted means and standard deviations of (a) leptospirosis cases (i,ii) and (b) cholera deaths (i,ii) as a function of rainfall (open circles, observed; solid line, threshold model; dashed line, standard Poisson regression model).

4. Discussion

The probabilistic model introduced here allows the estimation of nonlinear functional responses of ecological systems to environmental variables. Contrary to previous approaches, this model is capable of capturing sharp transitions between system states without resorting to a priori definition of cut points. The presence of thresholds is left to the user as it is not imposed by the model. For example, in the leptospirosis case, the steep mixing function suggests the presence of a threshold response to rainfall; hypothetically, when rainfall exceeds the structure of the city's run-off system, flooding occurs. In the absence of flooding, leptospira contamination is restricted to a few sites; in its presence, contamination spreads to large areas and the number of exposed individuals increases drastically. From the estimated mixing function, a threshold point for hospital alerts could be determined and used in combination with rainfall forecasts.

On the other hand, a less sharp mixing function is estimated for cholera and may not be compatible with a mechanistic threshold (on–off) response. The relationship between rainfall and cholera is known to be more elusive, involving both positive and negative responses depending on geographical region, temporal lag and overall water availability in the environment (Pascual et al. 2002). The negative nonlinear response at zero lag described here for seasonal cholera to rainfall in Parganas is also present in other districts of former Bengal (results not shown). Future work should examine the basis for a positive effect of rainfall in other regions (e.g. Punjab), for longer lags than that of the dilution effect itself (e.g. Madras; Ruiz-Moreno et al. 2007), and at longer (interannual) temporal scales as indicated by the recently reported association between floods and cholera transmission in Matlab, Bangladesh (Koelle et al. 2005).

As mentioned in §1, sharp transitions between regimes have been observed in many ecological and epidemiological systems. For epidemiological systems, the identification of thresholds is important for the development of climate-based alert systems (Curriero et al., 2001; Charron et al., 2004; Grover-Kopec et al. 2005; Platonov 2006); for ecological systems, to evaluate the impact of environmental changes on ecological dynamics. More generally, a characterization of the nature of the transition between states may serve as a basis for the development of more mechanistic models for intervention analysis.

The threshold model we have developed is static in the sense that it does not involve the temporal dynamics of the system itself. As such, the model is concerned with the estimation of functional responses to environmental drivers and differs in scope from the dynamical models of Grenfell et al. (1998) and Stenseth et al. (2004). The latter are time-series autoregressive models in which the threshold is not a function of an environmental covariate but of the state of the system itself, specifically of the (log) population density in the previous year. The population evolves in time according to a piecewise linear equation whose parameters and additive noise level vary across the threshold. Future developments of our model should investigate two possible directions. The first one is to incorporate probabilistic threshold responses to environmental covariates within time-series models for ecological dynamics. This could be done within the framework of new inference methods for stochastic dynamical systems (e.g. Ionides et al. 2006) to examine, for example, the association of residuals with environmental covariates. The second one is to incorporate the type of threshold considered here, involving a mixture of stochastic processes and changes in the variance, within time-series models in which the autoregressive variable itself exhibits a threshold, as in Grenfell et al. (1998) and Stenseth et al. (2004). More generally, the resulting approaches should be applicable to threshold dynamics in other types of ecological systems (e.g. Scheffer & Carpenter 2003).


This project was carried out as part of the working group on ‘Seasonality and the population dynamics of infectious diseases’ supported by the National Center for Ecological Analysis and Synthesis, a centre funded by NSF, the University of California, and UC Santa Barbara. The leptospirosis project is supported by grants of the Brazilian National Research Council (300.861/96-6) and the National Institutes of Health (AI-052473 and TW-00919). The cholera research was supported by joint funding from the National Science Foundation, National Institutes of Health (Ecology of Infectious Diseases Grant EF 0430 120) and the National Oceanic and Atmospheric Administration (Oceans and Health Grant NA 040 AR 460019; to MP). The first author held an International Fellowship from the Santa Fe Institute during this project.


    • Received June 10, 2007.
    • Accepted July 12, 2007.


View Abstract