## Abstract

We investigate the role of human mobility as a driver for long-range spreading of cholera infections, which primarily propagate through hydrologically controlled ecological corridors. Our aim is to build a spatially explicit model of a disease epidemic, which is relevant to both social and scientific issues. We present a two-layer network model that accounts for the interplay between epidemiological dynamics, hydrological transport and long-distance dissemination of the pathogen *Vibrio cholerae* owing to host movement, described here by means of a gravity-model approach. We test our model against epidemiological data recorded during the extensive cholera outbreak occurred in the KwaZulu-Natal province of South Africa during 2000–2001. We show that long-range human movement is fundamental in quantifying otherwise unexplained inter-catchment transport of *V. cholerae*, thus playing a key role in the formation of regional patterns of cholera epidemics. We also show quantitatively how heterogeneously distributed drinking water supplies and sanitation conditions may affect large-scale cholera transmission, and analyse the effects of different sanitation policies.

## 1. Introduction

Cholera is an enteric disease caused by the bacterium *Vibrio cholerae*. Although a large number of infected individuals develops just mild symptoms that do not require medical attention—or even no symptoms at all [1]—cholera may in many cases cause profuse diarrhoea, which in turn leads to progressive dehydration and, in the most severe cases, to death. In fact, cholera still represents a global threat to public health [2], especially in developing countries where infrastructures to provide access to safe water and basic sanitation are lacking. The most recent data provided by the World Health Organization (WHO) indicate in fact that more than 220 000 cases have been reported worldwide in 2009, with about 5000 cholera-related casualties and a case-fatality rate of 2.24 per cent. However, these figures might be even seriously underestimated, because of under-reporting, lack of surveillance and inconsistencies in case definitions [3]. Understanding and possibly predicting cholera outbreaks are thus crucial goals of both public health policies and humanitarian efforts, as also testified by the post-earthquake Haiti cholera outbreak. Started in late October 2010, the epidemic peaked during January 2011, with more than 344 000 reported cases and about 5400 deaths to date (data retrieved on 12 June 2011, available online at http://mspp.gouv.ht/).

Cholera infection occurs through the ingestion of food (e.g. raw seafood, fruits and vegetables) or water contaminated by *V. cholerae*. While this pathogen can be found as a natural member of the microbial communities of brackish coastal ecosystems, where it lives in association with zooplankton and phytoplankton [4,5], the main source of water contamination during disease outbreaks is represented by the faecal excretions of infected individuals [6], particularly in regions where sanitation conditions are inadequate [7]. Hydrological controls are argued to play a crucial role in the spread of the disease [8,9]. Riverine corridors can in fact propagate cholera pathogens from coastal to inland regions, and from inland epidemic sites to the neighbouring areas through active and passive transport mechanisms. Building on the spatially implicit, mean-field epidemiological model proposed by Codeço [10], Bertuzzo *et al.* [8] developed a spatially explicit framework to model the spreading of cholera epidemics through surface waters as a reactive transport process along networks, thus proposing that the spread of the disease is mediated by the connectivity provided by the hydrological network.

However, spatio-temporal patterns of cholera epidemics can also be affected by other factors, typically by human mobility. Susceptible people who need to travel or commute on a daily basis for educational, business or personal reasons may be exposed to cholera in (possibly infected) destination sites and take the disease back to the (possibly uninfected) communities where they regularly live. On the other side, moving infected individuals can be responsible for long-range dissemination of the bacteria to far-reaching water reservoirs. In fact, most people infected by cholera do not show any acute symptoms, yet they may shed bacteria into the environment through their faeces, thus potentially infecting other susceptibles [7]. Therefore, human mobility is expected to favour the emergence of the disease far from epidemic hotbeds, thus being a major determinant of epidemic spread [4]. This represents a key issue in the dynamics of any infectious disease [11], yet it has received relatively poor attention with specific reference to cholera epidemics.

The first step along this avenue is represented by the work by Righetto *et al.* [12], who analysed the connection between human mobility and cholera spreading via a reaction–diffusion approach accounting for diffusive movements of infected individuals and pathogen dispersion along a simplified, one-dimensional river network structure. However, mobility patterns of human pathogens are likely to be rather complex owing to the intertwining of short- and long-distance movements of people. These movements do typically cause just a temporary displacement of the hosts from their home locations, differently from what is implicitly assumed by diffusion models. Even more important, human mobility is neither expected to be limited by the structure of the underlying hydrological matrix, nor to be confined to the originating river basin. Patterns of mobility are in fact mostly related to the transportation network, e.g. the road network that connects communities of different importance and dimensions. A simple, yet powerful approach to describe such mobility patterns is the so-called ‘gravity transmission’, whose main ingredients are the sizes of local communities and the distances separating them [13]. Because of their effectiveness, gravity models from transportation theory [14] have already been applied in the epidemiological literature to describe the impact of human mobility on the emergence of a suite of human diseases, including influenza [11,15,16], human immunodeficiency virus [17] and measles [13,18,19]. They can in fact provide a satisfactory depiction of human movement among surrounding locations when data on actual mobility patterns are not available [16,18]. The recent Haiti cholera outbreak, which renewed interest on large-scale cholera modelling, has been the occasion for the first applications of gravity models to cholera epidemics [20–22].

In this work we aim to assess, in particular, the specific role played by the short-term human mobility as a driver of cholera propagation complementary to the hydrological transport of pathogens via surface- and groundwaters. To this end, we add a specific model of the mobility of susceptible and/or infected individuals to the connectivity structure provided by the hydrological network [8,9], thus creating a two-layer network for disease propagation. We test this novel, two-layered network model against epidemiological data recorded during the cholera epidemic occurred in KwaZulu-Natal (KZN, South Africa) over the period 2000–2002. Specifically, we focus on the major outbreak (2000–2001), which involved the whole KZN province and totalled about 124 000 reported cases (figure 1). Because of the large spatial extent of the geographical region under analysis (92 100 km^{2} with about 8.5 million inhabitants), particular attention is devoted to study how drinking water supplies and sanitation conditions (which are rather heterogeneously distributed in the province; figure 2) may affect cholera transmission. Specifically, georeferenced data on the availability of piped drinking water supplies and of improved sanitation (from basic toilet facilities to wastewater and sewage collection networks preventing uncontrolled vibrio dispersal from stools to the environment through surface- or groundwaters), recorded during the 2001 South African census, are used to derive spatially distributed values for the relevant epidemiological parameters.

## 2. The model

We extend the spatially explicit network model proposed by Bertuzzo *et al.* [8,9] to include long-range dissemination of pathogens owing to short-term movement of susceptibles and of infected individuals who do not require medical attention (i.e. hospitalization). As in Bertuzzo *et al.* [8], we also account for hydrological transport of *V. cholerae* over the river network, which is described as an oriented graph in which nodes depict human settlements, each characterized by its local water reservoir, while edges portray hydrological interconnections among communities (figure 3*a*). Local cholera dynamics are described via a susceptible–infected–bacteria (SIB) model based on the work by Codeço [10].

### 2.1. A spatially explicit model of cholera epidemiological dynamics and spread via hydrological transport

Let *S*_{i}(*t*) and *I*_{i}(*t*) be the local abundances of susceptible and infected individuals in each node *i* of the river network at time *t*, and let *B*_{i}(*t*) be the concentration of *V. cholerae*. Epidemiological dynamics and pathogen transport over the river network are described by the following set of ordinary differential equations [9]
2.1

As for epidemiological dynamics, the evolution of the susceptible compartment (first equation) is described as a balance between population demography and infections owing to contact with *V. cholerae*. The host population, if uninfected, is assumed to be at a demographic equilibrium *H*_{i} (the size of the *i*th local community), with *μ* being the human mortality rate. The parameter *β*_{i} represents the site-dependent rate of exposure to contaminated water, and
is the probability of becoming infected owing to the exposure to a concentration *B*_{i} of vibrios (with *K* being the half-saturation constant; see figure 4*a* and [10]). The dynamics of the infected compartment (second equation) is represented as the balance between newly infected individuals and losses owing to recovery and natural/cholera-induced mortality, with *γ* and *α* being the rates of recovery and mortality owing to cholera, respectively. The evolution of the local concentration of vibrios that live free in the aquatic environment (third equation) assumes that bacteria are excreted by infected individuals and immediately diluted in a well-mixed local water reservoir *W*_{i} at a site-dependent rate *p*_{i}. Free-living bacteria are also assumed to die at a constant rate *μ*_{B}.

As for hydrological transport, the spread of cholera over the river network is described as a biased random walk process on an oriented graph [23]. Specifically, we assume that vibrios have a mobility rate *l* and move between any two nodes of the hydrological network (say from *i* to *j*) with a probability *P*_{ij} given by
where *P*_{out} (*P*_{in}) is the probability of moving along an outward (inward) edge and *d*_{out} (*d*_{in}) is the outdegree (indegree) of node *i*, that is the number of outward (inward) edges. The bias of hydrological transport along the river can thus be defined as *b* = *P*_{out} − *P*_{in} = 2*P*_{out} − 1. The transport process is assumed to be conservative, i.e. ∑_{j=1}^{d(i)} *P*_{ij} = 1, where *d*(*i*) = *d*_{out}(*i*) + *d*_{in}(*i*) is the total degree of node *i*. Jump probabilities can also be used to account for suitable boundary conditions for vibrio transport at inlet and outlet nodes. Note, finally, that if *l* = 0, model (2.1) reduces to a set of *n*-disjointed local models (with *n* being the number of network nodes) equivalent to that proposed by Codeço [10].

### 2.2. The role of human mobility patterns in long-range cholera dissemination

While preserving the structure of the network model proposed by Bertuzzo *et al.* [8,9], we add a second pathway layer describing the effects of human mobility on epidemic dynamics. We assume that the nodes (human communities) of this second layer correspond to those of the hydrological layer, whereas edges are defined by connections among communities (figure 3*b*). We also suppose that susceptible and infected individuals can undertake short-term trips from the communities where they live towards other settlements for educational, business or personal reasons. While travelling or commuting, susceptible individuals can be exposed to *V. cholerae* and return as infected carriers to the settlement where they usually live. Similarly, infected hosts can disseminate the disease away from their home community.

Human mobility patterns are technically defined according to a gravity model in which individuals leave their original node (say *i*) with a probability *m*, reach their target location (say *j*) with a probability *Q*_{ij} and then come back to node *i*. The movement probability *Q*_{ij} is proportional to the ‘mass’ of the target node (its local community size) divided by some measure of the distance *D*_{ij} between *i* and *j*, i.e.
where *ξ* is a shape parameter discounting the effect of distance between nodes. This represents a simplified version of gravity transmission that assumes a linear relationship between movement fluxes and the ‘masses’ of both donor and recipient locations [15,17,19]. With these hypotheses in mind, model (2.1) can be modified as follows
2.2where *σ* is the fraction of infected individuals who actually show symptoms requiring intense medical care (i.e. hospitalization) and therefore do not move. In the first two equations of system (2.2), the term within brackets represents the rate of new infections owing to exposure to *V. cholerae* during short-term trips, while the last term of the third equation describes long-distance water contamination owing to travelling asymptomatic individuals. If *m* = 0, model (2.2) does obviously reduce to model (2.1).

## 3. Application of the model to the KwaZulu-Natal case study

We apply the model described above to the cholera outbreak occurred in the KZN province of South Africa in 2000–2001. To this end, we make use of (i) local epidemiological and demographic data; (ii) data about availability and distribution of drinking water resources and toilet facilities; (iii) information on local hydrological networks. Specifically, the relevant epidemiological data (i) are provided by the KZN Health Department (http://www.kznhealth.gov.za/). Data consist of a record of cholera cases including information about date and location (i.e. health subdistrict) of each hospitalized case. The dataset also includes the record of local population size for each health subdistrict (see again figure 1). Georeferenced data on the availability of piped drinking water and improved toilet facilities (ii) have been retrieved from the geographic information system (GIS) set up for the 2001 South African census (http://www.statssa.gov.za/census01/html/default.asp; see again figure 2). Hydrological data about the river networks of the KZN province (iii) are derived from the GIS provided by the South African Department of Water Affairs and Forestry (http://www.dwaf.gov.za/).

In our model, perennial rivers and channel endpoints are considered as edges and nodes of the hydrological network, respectively (figure 3*a*). Demographic, epidemiological, hydrological and sanitary data have thus to be interpolated from health subdistricts to network nodes. Specifically, nearest neighbour interpolation has been used, with distances being computed from subdistrict centroids. Interpolated population abundances and Euclidean node-to-node distances have been used for the gravity model as well (figure 3*b*).

Water resources are assumed to be related to local population size according to the following relation
where *c* is a proportionality constant and *H*_{T} is the population threshold below which water resources are considered to be independent of the size of the local community (figure 4*b*; see also the relevant discussion in Bertuzzo *et al.* [8]). Sanitary data can then be used to derive a plausible formalization for the spatially distributed parameters, namely exposure to contaminated water *β*_{i} and contamination rate *p*_{i}. As for exposure, we assume the following dependence upon the availability of piped water
where *β*_{m} is the maximum exposure rate, *ω*_{i} is the local fraction of households without access to piped water and *ɛ* is a shape parameter (figure 4*c*). As for contamination rate, we assume
where *p*_{m}, *τ*_{i} and *ϕ* are the maximum contamination rate, the local fraction of households without access to improved toilet facilities and a shape parameter, respectively (figure 4*d*).

By introducing the dimensionless concentration *B* ≜ *B*/*K*, model (2.2) can thus be written as
and
where *θ* = *p*_{m}/(*Kc*) is the aggregated contamination rate. A linear stability analysis of the model carried out for *l* = 0 and *m* = 0 [10] shows that, given an initial condition of the type *S*_{i}(0) = *H*_{i} , *I*_{i}(0) > 0, *B*_{i} (0) = 0, an epidemic outbreak can occur in node *i* only if
where is the local value of the basic reproductive number of the disease at site . From a technical perspective, is the condition for the disease-free equilibrium to be unstable. Therefore, if , the epidemic is expected to start growing at a rate corresponding to the dominant eigenvalue of the disease-free equilibrium, which can be readily derived with the same linear stability analysis used to compute and reads as

The basic reproductive number and the initial growth rate of the disease can be used to estimate another important epidemiological parameter, namely the generation time *T*_{i} of the disease (also known as serial interval), defined as the time interval between the time of infection of a secondary infectee and the time of infection of its primary infector [24,25]. Assuming a delta distribution for generation times (i.e. that secondary infections always occur with a constant delay), the generation time can be computed [25] as

Model simulations can be initialized by setting *S*_{i}(0) = *H*_{i} and *I*_{i}(0) = 0 in almost all nodes *i* of the network (as the KZN province is not endemic for cholera, all individuals were potentially susceptible to the disease in 2000–2001), except for those nodes *j* where the first cases were reported. Specifically, we set *I*_{j}(0) = 1 and *S*_{j}(0) = *H*_{j} − 1 in the nodes where cholera cases were reported during the first week of the outbreak. As for pathogens, we set *B*_{i}(0) = *B*_{j}(0) = 0, thus assuming that no free-living bacteria were present in the water before the onset of the epidemic, which is a reasonable assumption for non-endemic countries [8]. For the same reason, we impose absorbing boundary conditions at the outlets of the river networks (while we set reflecting boundaries at the inlet nodes). In order to compare simulation results with epidemiological records, we have to evaluate the dynamics of reported cholera cases, i.e. those requiring hospitalization. According to the model, the dynamics of cumulative hospitalized hosts would be the solution of the following equation
which is initialized as I′_{i}(0) = *I*_{i}(0) = 0 and I′_{j}(0) = *I*_{j}(0) = 1.

## 4. Model parametrization and parameter estimation

Several model parameters can be reliably drawn from the literature, or else from demographic/epidemiological data (table 1). The mortality rate of the population (*μ*) can be computed as the inverse of the average human lifetime in the KZN region (about 60 years; [8]), hence *μ* = 4.6 × 10^{−5} (d^{−1}). Following Codeço [10] and Bertuzzo *et al.* [8], we assume that people can be exposed to contaminated water or food at most once a day in the worst-case scenario (*ω* = 1), thus we set *β*_{m} = 1.0 (d^{−1}). The recovery rate *γ* from cholera can be evaluated as the inverse of the average duration of the disease in infected individuals, which is approximatively 5 days [10], therefore *γ* = 0.20 (d^{−1}). Based on the count of lethal cholera cases recorded during the KZN epidemic, Bertuzzo *et al.* [8] estimated the value of the additional mortality rate owing to cholera as *α* = 4.0 × 10^{−4} (d^{−1}). In the same study, it is suggested that the numerical value of the mortality rate *μ*_{B} of free-living vibrios in the Thukela basin (the largest watershed of KZN), i.e *μ*_{B} = 0.23 (d^{−1}), which is well within the range usually reported in the literature [10]. Bertuzzo *et al.* [8] also estimated the numerical threshold *H*_{T} for settlement size in the water-population relationship introduced above, based on data from the KZN province, i.e. *H*_{T} = 2.9 × 10^{4} (no. of individuals). By considering that about 25 per cent of infected individuals show symptoms, and that among these people just around 20 per cent develop profuse diarrhoea requiring medical attention [7], the fraction of symptomatic hospitalized individuals can be set at *σ* = 5.0 × 10^{−2}.

The remaining parameters cannot be derived from the existing literature. Although some of them could perhaps be evaluated by targeted field campaigns, the conceptual nature and the likely time variability of some others (for instance, exposure and contamination rates) may hardly result in directly observable identifications. This is specifically the case of the quantities *θ*, *ɛ* and *ϕ*, which are related to KZN sanitation conditions—which are studied here for the first time in a spatially explicit framework. The parameters *l* and *P*_{out} describing pathogen transport along the river network cannot be deduced from the existing studies either. Although Bertuzzo *et al.* [8] estimated both parameters for the Thukela basin, the proposed numerical values clearly reflect their assumption that *V. cholerae* moves solely along the river network—an assumption that is relaxed here by assuming that there exists different propagation pathways for the disease. For the same reason, no direct estimates for the parameters describing human movement (namely *m* and *ξ*) exist. All these parameters have thus to be numerically tuned through proper techniques.

Parameter estimation is performed using an optimization approach based on Markov chain Monte Carlo (MCMC) sampling, i.e. a family of methods allowing for the exploration of the posterior density function of a desired probability distribution (in our case, the joint probability distribution of the set of tuning parameters; see [26]). Specifically, we use the DREAM (differential evolution adaptive Metropolis) algorithm [27], an efficient implementation of MCMC that runs multiple different chains simultaneously to ensure global exploration of the parameter space, and adaptively tunes the scale and orientation of the jumping distribution using differential evolution (a genetic algorithm proposed by Storn & Price [28]) and a Metropolis–Hastings update step [29,30]. More specifically, we adopt the DREAM_{ZS} variant of the DREAM algorithm, which also makes use of sampling from past states visited by the Markov chains and of a snooker update step (in addition to the parallel update steps) to generate candidate points in each individual chain—thus reducing the number of parallel chains needed for an effective exploration of the posterior distribution while at the same time increasing the diversity of candidate points [31]. The goodness of each single simulation is computed as the residual sum of squares between weekly hospitalized cholera cases per hydrological unit as evaluated from the KZN epidemiological record and from the model. The algorithm is initialized with broad flat prior distributions for parameter values and is let running until convergence (about 15 000 iterations).

## 5. Simulation results

The best-fit model is able to reproduce fairly well the spatio-temporal dynamics of the KZN cholera epidemics (table 2 and figure 5). In fact, not only does the model provide a good match of the temporal evolution of the outbreak (figure 5*c*, coefficient of determination *r*^{2} = 0.89), but it also provides a satisfactory description of the spatial distribution of the reported cholera cases (figure 5*d*; *r*^{2} = 0.79). In particular, the model is able to correctly reproduce the diffuse spread of the epidemics in coastal regions, the relatively small incidence of the disease in the highly populated area around Durban (i.e. the eThekwini metropolitan municipality; figure 3*b*), the high number of cases reported in the Thukela basin (the largest basin in the region; figure 3*a*) and the patchy distribution of hospitalized cases in other inland regions. In addition, the model reproduces the fact that the highest values of cholera incidence are found in medium-sized local communities (figure 5*d*, inset), as already found by Bertuzzo *et al.* [8] with reference to the KZN epidemic.

As a result of the spatially heterogeneous distribution of population densities, water resources and sanitation conditions, all the epidemiological parameters introduced above (*R*_{0}, *r* and *T*) are found to be highly variable in space. In particular, the basic reproductive number *R*_{0} (computed with *l* = 0 and *m* = 0) assumes values between 1.55 × 10^{−2} and 1.31 (5–95th percentiles), with a mean value of 0.321 (1.51 for the nodes where *R*_{0} > 1) and about 90 per cent of the network nodes characterized by *R*_{0} ≤ 1. This suggests that transport processes play a role of paramount importance in the development and the maintenance of the epidemic. As for the other epidemiological parameters, the initial growth rate of the epidemic takes values between 1.30 × 10^{−3} and 0.114 (d^{−1}), with a mean value of 4.62 × 10^{−2} (d^{−1}), while the generation time falls between 7.13 and 9.29 days, with a mean value of 8.51 days. Note that growth rates and generation times have been evaluated only for the nodes where *R*_{0} > 1. Note also that, quite interestingly, our estimates of the generation time are very close to those usually proposed for cholera epidemics (7–10 days; see Checchi *et al.* [32]).

To disentangle the impact of each model parameter on the intensity of the epidemic outbreak, we have performed a sensitivity analysis of the model outcomes with respect to variations in the parameter values. In particular, we allowed the parameters to vary one by one through repeated model runs (±20% variations with respect to the best-fit parameter set). We have then computed the variations of simulated yearly cholera incidence (i.e. the total number of reported cholera cases) evaluated in the whole area with respect to the best-fit simulation. The results of this sensitivity analysis (figure 6) show that the parameters quantifying the availability of clean water (*β*_{m} and *ɛ*) and toilet facilities (*θ* and *ϕ*) are responsible for some of the largest variations of cholera incidence, thus showing that sanitation conditions do obviously play a key role in controlling the spread of the disease. Note that epidemiological dynamics and the ecology of *V. cholerae* do also matter, respectively, through the recovery rate *γ* and the mortality rate *μ*_{B}.

On the contrary, the variations of the human mobility parameters *m* and *ξ* considered in figure 6 produce a small effect on cholera incidence. However, this does not imply that human mobility can be neglected. The role played by human mobility in the formation of the spatio-temporal epidemic pattern can be in fact better understood by examining the performance of a model that does not account for it. To this end, we have fitted a model with *m* = 0 (i.e. model 2.1). Our results (table 2 and figure 7) show that in the absence of human movement, cholera cannot spread outside the hydrological catchments where the epidemic started. As a consequence, both the intensity (figure 7*a*) and the spatial extent (figure 7*b*) of the epidemic outbreak turn out to be remarkably underestimated. Therefore, human mobility is the actual driver of inter-catchment dissemination of cholera pathogens, thus proving instrumental to the understanding of large-scale patterns of cholera spread. Therefore, a model without human mobility cannot be expected to perform as well as a model accounting for human-mediated vibrio dissemination. Similar performances might perhaps be achieved in a model without mobility at the expense of modelling the ‘seeding’ of different outbreaks with the right timing in each basin—which would lead to a remarkably less parsimonious approach, as it would require the detailed knowledge of the initial condition of the epidemic, and would preclude the use of models like this as a predictive/management tool.

Because of their effects on disease incidence (see again figure 6), sanitation conditions are ideal candidates as targets for possible proactive intervention strategies aimed at reducing the impact of cholera epidemics. As suggested by WHO, in fact, prevention is essential to avert cholera outbreaks, namely by expanding access to clean water resources and toilet facilities [3]. Model 2.2 can be effectively used to assess the impacts of interventions designed to reduce the fraction of households without access to piped water or toilet facilities. To this end, we assume that
where () is the fraction of households currently without access to clean water (toilet facilities) and *u*_{ω} (*u*_{τ}) is the sanitation effort. As a proof of concept, we separately consider actions on piped water and toilets, and assume three different spatial allocations for sanitation efforts, i.e. (i) spatially homogeneous; (ii) targeted to poorly sanitated communities ( or ); (iii) targeted to large communities (*H*_{i} ≥ 10 000). Higher levels of sanitation efforts do obviously lead to larger portions of the population being subject to interventions. Our analysis (figure 8) shows that all such interventions may greatly reduce the expected incidence of the disease (or even prevent its insurgence). However, owing to KZN hygienic conditions, interventions targeted to large communities are found to be far less effective than either a spatially homogeneous allocation of sanitation efforts or actions targeted to poorly sanitated communities, with the latter being the most effective interventions for relatively small values of sanitation effort (in particular with actions on piped water). Therefore, understanding the correlations between population distribution and hygienic conditions is fundamental to the planning of cost-effective sanitation campaigns.

## 6. Discussion and conclusions

In this work, we have proposed a mechanistic spatially explicit model for cholera epidemics combining different pathways of long- and short-range propagation of the disease, and we have applied it to a cholera outbreak occurred in the KZN province of South Africa during years 2000–2001. The novelty of our approach lies in the blending of different propagation mechanisms, namely hydrological transport and human-mediated dissemination of *V. cholerae* pathogens. At a regional scale, in fact, pathogen transport along river systems alone cannot explain the emergence of complex epidemic patterns, which include inter-catchment dispersal of *V. cholerae*. Other mechanisms have thus to be invoked. Here, we have explicitly analysed the role played by human mobility in the long-range dissemination of the disease. By extending an epidemiological model of reactive transport along networks [8,9], we have identified a second transport network for *V. cholerae* pathogens produced by the temporary displacements of susceptible individuals and asymptomatic infected carriers. Mobility patterns indeed proved to be not only a plausible, but indeed an actual and important driver of disease propagation. In particular, the proposed value of the shape parameter discounting the effect of distance between communities on human movement (*ξ* ≈ 1, as commonly found in epidemiological studies, e.g. [13,16]) supports the idea that mobility patterns are determined by directed movements along the fastest routes to target locations (while *ξ* ≥ 2 would correspond to diffusive-like movement patterns; see the relevant discussion in Xia *et al.* [13]).

The combination of local epidemiological dynamics, hydrological transport and long-distance *V. cholerae* redistribution may explain the main spatial and temporal patterns of cholera outbreak recorded in the KZN epidemics. Notice that the complexity of the proposed model is kept to a minimum, in that the model accounts for a few selected processes that are relevant to the spread of the disease on short timescales (less than 5 years, corresponding to the timescale of immunity loss; see Koelle *et al.* [33]). Other mechanisms not described here might possibly play important roles in the formation of spatio-temporal patterns of cholera propagation in the long run, especially in regions where the disease is endemic. In particular, the existence of hyperinfectious bacterial states (e.g. [34–36]), the long-term loss of immunity of recovered hosts [33] and the seasonality of environmental or climatic forcings [37–39] may all be significant ingredients of cholera dynamics on longer timescales than those investigated here. Note also that human mobility patterns other than gravity transmission (e.g. random movement, small-world connectivity, etc.) might be considered in our framework effortlessly.

Our analysis points out the importance of a spatially explicit approach to modelling large-scale cholera epidemics arising in regions characterized by heterogeneous environmental conditions. In fact, a proper spatial description of local sanitation conditions proved to be crucial for the model to accurately reproduce the epidemic patterns observed in the KZN region. A fully spatially explicit approach is also necessary whenever the timescale imposed by spreading processes (e.g. hydrological transport) is larger than that related to local epidemiological dynamics (e.g. disease transmission), as recently proved on a theoretical basis [9]. In this work, spatial interactions have been described within a multi-layer network framework. We deem this approach an effective and general tool to model the spatio-temporal evolution of cholera epidemics and, in general, of waterborne diseases characterized by multiple dispersal pathways. In a broader context, as the coexistence of multiple propagation pathways is not unique to epidemiology, the idea of a multi-layer network approach can be profitably extended to other disciplines, such as the study of harmful invasive species [40,41].

In conclusion, a comprehensive picture of the mechanisms that drive the evolution of cholera outbreaks at different time and spatial scales begins to emerge, with possible relevance to other waterborne diseases. Such a picture involves different pathways for pathogen dispersal. At present, the effectiveness of our framework relies on the availability of reliable epidemiological and demographic datasets, as well as of a detailed description of water use habits and sanitation conditions. Such requirements might be restrictive in contexts where cholera is non-endemic, or site-specific estimates of the relevant epidemiological parameters are lacking. Nevertheless, our work suggests that mechanistic models like the one presented here, properly tuned with the acquisition of a relatively small set of core epidemiological data, could be turned into general predictive/management tools to produce and analyse realistic epidemic scenarios. Such predictions are believed to be meaningful for contexts where cholera still exacts a heavy toll of human lives.

## Acknowledgements

The authors wish to thank the KwaZulu-Natal Department of Health, the provincial office of Statistics South Africa and the South African Department of Water Affairs for providing demographic and epidemiological data on the 2000–2002 KZN epidemics. The authors also wish to thank three anonymous reviewers for their comments on a previous version of this manuscript, and Dr Jasper A. Vrugt (University of California, IR, USA) for making available the source code of the DREAM and DREAM_{ZS} algorithms. L.M., E.B., L.R. and A.R. gratefully acknowledge funding from ERC Advanced Grant RINEC 22761 and SNF Grant 200021_124930/1. R.C. and M.G. acknowledge the support provided by Politecnico di Milano and Italian Ministry of Research (Interlink project no. II04CE49G8). I.R.-I. acknowledges the support of the James S. McDonnell Foundation through a grant for Studying Complex Systems (220020138).

- Received May 16, 2011.
- Accepted June 22, 2011.

- This Journal is © 2011 The Royal Society