Modelling cholera epidemics: the role of waterways, human mobility and sanitation

L. Mari, E. Bertuzzo, L. Righetto, R. Casagrandi, M. Gatto, I. Rodriguez-Iturbe, A. Rinaldo


We investigate the role of human mobility as a driver for long-range spreading of cholera infections, which primarily propagate through hydrologically controlled ecological corridors. Our aim is to build a spatially explicit model of a disease epidemic, which is relevant to both social and scientific issues. We present a two-layer network model that accounts for the interplay between epidemiological dynamics, hydrological transport and long-distance dissemination of the pathogen Vibrio cholerae owing to host movement, described here by means of a gravity-model approach. We test our model against epidemiological data recorded during the extensive cholera outbreak occurred in the KwaZulu-Natal province of South Africa during 2000–2001. We show that long-range human movement is fundamental in quantifying otherwise unexplained inter-catchment transport of V. cholerae, thus playing a key role in the formation of regional patterns of cholera epidemics. We also show quantitatively how heterogeneously distributed drinking water supplies and sanitation conditions may affect large-scale cholera transmission, and analyse the effects of different sanitation policies.

1. Introduction

Cholera is an enteric disease caused by the bacterium Vibrio cholerae. Although a large number of infected individuals develops just mild symptoms that do not require medical attention—or even no symptoms at all [1]—cholera may in many cases cause profuse diarrhoea, which in turn leads to progressive dehydration and, in the most severe cases, to death. In fact, cholera still represents a global threat to public health [2], especially in developing countries where infrastructures to provide access to safe water and basic sanitation are lacking. The most recent data provided by the World Health Organization (WHO) indicate in fact that more than 220 000 cases have been reported worldwide in 2009, with about 5000 cholera-related casualties and a case-fatality rate of 2.24 per cent. However, these figures might be even seriously underestimated, because of under-reporting, lack of surveillance and inconsistencies in case definitions [3]. Understanding and possibly predicting cholera outbreaks are thus crucial goals of both public health policies and humanitarian efforts, as also testified by the post-earthquake Haiti cholera outbreak. Started in late October 2010, the epidemic peaked during January 2011, with more than 344 000 reported cases and about 5400 deaths to date (data retrieved on 12 June 2011, available online at

Cholera infection occurs through the ingestion of food (e.g. raw seafood, fruits and vegetables) or water contaminated by V. cholerae. While this pathogen can be found as a natural member of the microbial communities of brackish coastal ecosystems, where it lives in association with zooplankton and phytoplankton [4,5], the main source of water contamination during disease outbreaks is represented by the faecal excretions of infected individuals [6], particularly in regions where sanitation conditions are inadequate [7]. Hydrological controls are argued to play a crucial role in the spread of the disease [8,9]. Riverine corridors can in fact propagate cholera pathogens from coastal to inland regions, and from inland epidemic sites to the neighbouring areas through active and passive transport mechanisms. Building on the spatially implicit, mean-field epidemiological model proposed by Codeço [10], Bertuzzo et al. [8] developed a spatially explicit framework to model the spreading of cholera epidemics through surface waters as a reactive transport process along networks, thus proposing that the spread of the disease is mediated by the connectivity provided by the hydrological network.

However, spatio-temporal patterns of cholera epidemics can also be affected by other factors, typically by human mobility. Susceptible people who need to travel or commute on a daily basis for educational, business or personal reasons may be exposed to cholera in (possibly infected) destination sites and take the disease back to the (possibly uninfected) communities where they regularly live. On the other side, moving infected individuals can be responsible for long-range dissemination of the bacteria to far-reaching water reservoirs. In fact, most people infected by cholera do not show any acute symptoms, yet they may shed bacteria into the environment through their faeces, thus potentially infecting other susceptibles [7]. Therefore, human mobility is expected to favour the emergence of the disease far from epidemic hotbeds, thus being a major determinant of epidemic spread [4]. This represents a key issue in the dynamics of any infectious disease [11], yet it has received relatively poor attention with specific reference to cholera epidemics.

The first step along this avenue is represented by the work by Righetto et al. [12], who analysed the connection between human mobility and cholera spreading via a reaction–diffusion approach accounting for diffusive movements of infected individuals and pathogen dispersion along a simplified, one-dimensional river network structure. However, mobility patterns of human pathogens are likely to be rather complex owing to the intertwining of short- and long-distance movements of people. These movements do typically cause just a temporary displacement of the hosts from their home locations, differently from what is implicitly assumed by diffusion models. Even more important, human mobility is neither expected to be limited by the structure of the underlying hydrological matrix, nor to be confined to the originating river basin. Patterns of mobility are in fact mostly related to the transportation network, e.g. the road network that connects communities of different importance and dimensions. A simple, yet powerful approach to describe such mobility patterns is the so-called ‘gravity transmission’, whose main ingredients are the sizes of local communities and the distances separating them [13]. Because of their effectiveness, gravity models from transportation theory [14] have already been applied in the epidemiological literature to describe the impact of human mobility on the emergence of a suite of human diseases, including influenza [11,15,16], human immunodeficiency virus [17] and measles [13,18,19]. They can in fact provide a satisfactory depiction of human movement among surrounding locations when data on actual mobility patterns are not available [16,18]. The recent Haiti cholera outbreak, which renewed interest on large-scale cholera modelling, has been the occasion for the first applications of gravity models to cholera epidemics [2022].

In this work we aim to assess, in particular, the specific role played by the short-term human mobility as a driver of cholera propagation complementary to the hydrological transport of pathogens via surface- and groundwaters. To this end, we add a specific model of the mobility of susceptible and/or infected individuals to the connectivity structure provided by the hydrological network [8,9], thus creating a two-layer network for disease propagation. We test this novel, two-layered network model against epidemiological data recorded during the cholera epidemic occurred in KwaZulu-Natal (KZN, South Africa) over the period 2000–2002. Specifically, we focus on the major outbreak (2000–2001), which involved the whole KZN province and totalled about 124 000 reported cases (figure 1). Because of the large spatial extent of the geographical region under analysis (92 100 km2 with about 8.5 million inhabitants), particular attention is devoted to study how drinking water supplies and sanitation conditions (which are rather heterogeneously distributed in the province; figure 2) may affect cholera transmission. Specifically, georeferenced data on the availability of piped drinking water supplies and of improved sanitation (from basic toilet facilities to wastewater and sewage collection networks preventing uncontrolled vibrio dispersal from stools to the environment through surface- or groundwaters), recorded during the 2001 South African census, are used to derive spatially distributed values for the relevant epidemiological parameters.

Figure 1.

Epidemiology of the 2000–2001 KZN cholera epidemics. (a) Spatial patterns of recorded cholera cases, (b) population densities and (c) yearly cholera incidence (defined as the ratio between the total number of recorded cases and local population abundances) in each health subdistrict. Temporal evolution of the epidemic outbreak through the whole KZN province, shown as the weekly count of reported cases (d) (cumulated cases (e)). Data from the KZN Health Department and the 2001 South African Census.

Figure 2.

Hygienic conditions of the KZN province. Spatial patterns of availability of (a) piped water and (b) toilet facilities. Data from the 2001 South African Census.

2. The model

We extend the spatially explicit network model proposed by Bertuzzo et al. [8,9] to include long-range dissemination of pathogens owing to short-term movement of susceptibles and of infected individuals who do not require medical attention (i.e. hospitalization). As in Bertuzzo et al. [8], we also account for hydrological transport of V. cholerae over the river network, which is described as an oriented graph in which nodes depict human settlements, each characterized by its local water reservoir, while edges portray hydrological interconnections among communities (figure 3a). Local cholera dynamics are described via a susceptible–infected–bacteria (SIB) model based on the work by Codeço [10].

Figure 3.

Two-layer network model. (a) River systems of the KZN province, with the three most populated basins (uMngeni, Thukela and Mfolozi) highlighted. (b) Human mobility patterns. For each node only the three most important outbound connections (as computed from the gravity model detailed in the text) are shown. Note the role played by Durban, the most populated town in the KZN province, in organizing the southmost part of the mobility network. Data from (a) the South African Department of Water Affairs and Forestry and (b) the 2001 South African Census.

2.1. A spatially explicit model of cholera epidemiological dynamics and spread via hydrological transport

Let Si(t) and Ii(t) be the local abundances of susceptible and infected individuals in each node i of the river network at time t, and let Bi(t) be the concentration of V. cholerae. Epidemiological dynamics and pathogen transport over the river network are described by the following set of ordinary differential equations [9]Embedded Image 2.1

As for epidemiological dynamics, the evolution of the susceptible compartment (first equation) is described as a balance between population demography and infections owing to contact with V. cholerae. The host population, if uninfected, is assumed to be at a demographic equilibrium Hi (the size of the ith local community), with μ being the human mortality rate. The parameter βi represents the site-dependent rate of exposure to contaminated water, andEmbedded Image is the probability of becoming infected owing to the exposure to a concentration Bi of vibrios (with K being the half-saturation constant; see figure 4a and [10]). The dynamics of the infected compartment (second equation) is represented as the balance between newly infected individuals and losses owing to recovery and natural/cholera-induced mortality, with γ and α being the rates of recovery and mortality owing to cholera, respectively. The evolution of the local concentration of vibrios that live free in the aquatic environment (third equation) assumes that bacteria are excreted by infected individuals and immediately diluted in a well-mixed local water reservoir Wi at a site-dependent rate pi. Free-living bacteria are also assumed to die at a constant rate μB.

Figure 4.

Functional relationships used in the epidemiological model. (a) Probability of infection f as a function of exposure to a given concentration of V. cholerae cells. (b) Water availability W as a function of local community size (c = 1, HT = max(H)/4). (c) Rate of exposure to contaminated water β as a function of piped water availability (ɛ = 2). (d) Contamination rate p as a function of access to toilet facilities (ϕ = 2).

As for hydrological transport, the spread of cholera over the river network is described as a biased random walk process on an oriented graph [23]. Specifically, we assume that vibrios have a mobility rate l and move between any two nodes of the hydrological network (say from i to j) with a probability Pij given byEmbedded Image where Pout (Pin) is the probability of moving along an outward (inward) edge and dout (din) is the outdegree (indegree) of node i, that is the number of outward (inward) edges. The bias of hydrological transport along the river can thus be defined as b = PoutPin = 2Pout − 1. The transport process is assumed to be conservative, i.e. ∑j=1d(i) Pij = 1, where d(i) = dout(i) + din(i) is the total degree of node i. Jump probabilities can also be used to account for suitable boundary conditions for vibrio transport at inlet and outlet nodes. Note, finally, that if l = 0, model (2.1) reduces to a set of n-disjointed local models (with n being the number of network nodes) equivalent to that proposed by Codeço [10].

2.2. The role of human mobility patterns in long-range cholera dissemination

While preserving the structure of the network model proposed by Bertuzzo et al. [8,9], we add a second pathway layer describing the effects of human mobility on epidemic dynamics. We assume that the nodes (human communities) of this second layer correspond to those of the hydrological layer, whereas edges are defined by connections among communities (figure 3b). We also suppose that susceptible and infected individuals can undertake short-term trips from the communities where they live towards other settlements for educational, business or personal reasons. While travelling or commuting, susceptible individuals can be exposed to V. cholerae and return as infected carriers to the settlement where they usually live. Similarly, infected hosts can disseminate the disease away from their home community.

Human mobility patterns are technically defined according to a gravity model in which individuals leave their original node (say i) with a probability m, reach their target location (say j) with a probability Qij and then come back to node i. The movement probability Qij is proportional to the ‘mass’ of the target node (its local community size) divided by some measure of the distance Dij between i and j, i.e.Embedded Image where ξ is a shape parameter discounting the effect of distance between nodes. This represents a simplified version of gravity transmission that assumes a linear relationship between movement fluxes and the ‘masses’ of both donor and recipient locations [15,17,19]. With these hypotheses in mind, model (2.1) can be modified as followsEmbedded Image 2.2where σ is the fraction of infected individuals who actually show symptoms requiring intense medical care (i.e. hospitalization) and therefore do not move. In the first two equations of system (2.2), the term within brackets represents the rate of new infections owing to exposure to V. cholerae during short-term trips, while the last term of the third equation describes long-distance water contamination owing to travelling asymptomatic individuals. If m = 0, model (2.2) does obviously reduce to model (2.1).

3. Application of the model to the KwaZulu-Natal case study

We apply the model described above to the cholera outbreak occurred in the KZN province of South Africa in 2000–2001. To this end, we make use of (i) local epidemiological and demographic data; (ii) data about availability and distribution of drinking water resources and toilet facilities; (iii) information on local hydrological networks. Specifically, the relevant epidemiological data (i) are provided by the KZN Health Department ( Data consist of a record of cholera cases including information about date and location (i.e. health subdistrict) of each hospitalized case. The dataset also includes the record of local population size for each health subdistrict (see again figure 1). Georeferenced data on the availability of piped drinking water and improved toilet facilities (ii) have been retrieved from the geographic information system (GIS) set up for the 2001 South African census (; see again figure 2). Hydrological data about the river networks of the KZN province (iii) are derived from the GIS provided by the South African Department of Water Affairs and Forestry (

In our model, perennial rivers and channel endpoints are considered as edges and nodes of the hydrological network, respectively (figure 3a). Demographic, epidemiological, hydrological and sanitary data have thus to be interpolated from health subdistricts to network nodes. Specifically, nearest neighbour interpolation has been used, with distances being computed from subdistrict centroids. Interpolated population abundances and Euclidean node-to-node distances have been used for the gravity model as well (figure 3b).

Water resources are assumed to be related to local population size according to the following relationEmbedded Image where c is a proportionality constant and HT is the population threshold below which water resources are considered to be independent of the size of the local community (figure 4b; see also the relevant discussion in Bertuzzo et al. [8]). Sanitary data can then be used to derive a plausible formalization for the spatially distributed parameters, namely exposure to contaminated water βi and contamination rate pi. As for exposure, we assume the following dependence upon the availability of piped waterEmbedded Image where βm is the maximum exposure rate, ωi is the local fraction of households without access to piped water and ɛ is a shape parameter (figure 4c). As for contamination rate, we assumeEmbedded Image where pm, τi and ϕ are the maximum contamination rate, the local fraction of households without access to improved toilet facilities and a shape parameter, respectively (figure 4d).

By introducing the dimensionless concentration BB/K, model (2.2) can thus be written asEmbedded Image andEmbedded Image where θ = pm/(Kc) is the aggregated contamination rate. A linear stability analysis of the model carried out for l = 0 and m = 0 [10] shows that, given an initial condition of the type Si(0) = Hi , Ii(0) > 0, Bi (0) = 0, an epidemic outbreak can occur in node i only ifEmbedded Image where Embedded Image is the local value of the basic reproductive number of the disease at site Embedded Image. From a technical perspective, Embedded Image is the condition for the disease-free equilibrium to be unstable. Therefore, if Embedded Image, the epidemic is expected to start growing at a rate Embedded Image corresponding to the dominant eigenvalue of the disease-free equilibrium, which can be readily derived with the same linear stability analysis used to compute Embedded Image and reads asEmbedded Image

The basic reproductive number and the initial growth rate of the disease can be used to estimate another important epidemiological parameter, namely the generation time Ti of the disease (also known as serial interval), defined as the time interval between the time of infection of a secondary infectee and the time of infection of its primary infector [24,25]. Assuming a delta distribution for generation times (i.e. that secondary infections always occur with a constant delay), the generation time can be computed [25] asEmbedded Image

Model simulations can be initialized by setting Si(0) = Hi and Ii(0) = 0 in almost all nodes i of the network (as the KZN province is not endemic for cholera, all individuals were potentially susceptible to the disease in 2000–2001), except for those nodes j where the first cases were reported. Specifically, we set Ij(0) = 1 and Sj(0) = Hj − 1 in the nodes where cholera cases were reported during the first week of the outbreak. As for pathogens, we set Bi(0) = Bj(0) = 0, thus assuming that no free-living bacteria were present in the water before the onset of the epidemic, which is a reasonable assumption for non-endemic countries [8]. For the same reason, we impose absorbing boundary conditions at the outlets of the river networks (while we set reflecting boundaries at the inlet nodes). In order to compare simulation results with epidemiological records, we have to evaluate the dynamics of reported cholera cases, i.e. those requiring hospitalization. According to the model, the dynamics of cumulative hospitalized hosts would be the solution of the following equationEmbedded Image which is initialized as I′i(0) = Ii(0) = 0 and I′j(0) = Ij(0) = 1.

4. Model parametrization and parameter estimation

Several model parameters can be reliably drawn from the literature, or else from demographic/epidemiological data (table 1). The mortality rate of the population (μ) can be computed as the inverse of the average human lifetime in the KZN region (about 60 years; [8]), hence μ = 4.6 × 10−5 (d−1). Following Codeço [10] and Bertuzzo et al. [8], we assume that people can be exposed to contaminated water or food at most once a day in the worst-case scenario (ω = 1), thus we set βm = 1.0 (d−1). The recovery rate γ from cholera can be evaluated as the inverse of the average duration of the disease in infected individuals, which is approximatively 5 days [10], therefore γ = 0.20 (d−1). Based on the count of lethal cholera cases recorded during the KZN epidemic, Bertuzzo et al. [8] estimated the value of the additional mortality rate owing to cholera as α = 4.0 × 10−4 (d−1). In the same study, it is suggested that the numerical value of the mortality rate μB of free-living vibrios in the Thukela basin (the largest watershed of KZN), i.e μB = 0.23 (d−1), which is well within the range usually reported in the literature [10]. Bertuzzo et al. [8] also estimated the numerical threshold HT for settlement size in the water-population relationship introduced above, based on data from the KZN province, i.e. HT = 2.9 × 104 (no. of individuals). By considering that about 25 per cent of infected individuals show symptoms, and that among these people just around 20 per cent develop profuse diarrhoea requiring medical attention [7], the fraction of symptomatic hospitalized individuals can be set at σ = 5.0 × 10−2.

View this table:
Table 1.

Model parametrization. Parameter values as derived from the literature (see text for the relevant references).

The remaining parameters cannot be derived from the existing literature. Although some of them could perhaps be evaluated by targeted field campaigns, the conceptual nature and the likely time variability of some others (for instance, exposure and contamination rates) may hardly result in directly observable identifications. This is specifically the case of the quantities θ, ɛ and ϕ, which are related to KZN sanitation conditions—which are studied here for the first time in a spatially explicit framework. The parameters l and Pout describing pathogen transport along the river network cannot be deduced from the existing studies either. Although Bertuzzo et al. [8] estimated both parameters for the Thukela basin, the proposed numerical values clearly reflect their assumption that V. cholerae moves solely along the river network—an assumption that is relaxed here by assuming that there exists different propagation pathways for the disease. For the same reason, no direct estimates for the parameters describing human movement (namely m and ξ) exist. All these parameters have thus to be numerically tuned through proper techniques.

Parameter estimation is performed using an optimization approach based on Markov chain Monte Carlo (MCMC) sampling, i.e. a family of methods allowing for the exploration of the posterior density function of a desired probability distribution (in our case, the joint probability distribution of the set of tuning parameters; see [26]). Specifically, we use the DREAM (differential evolution adaptive Metropolis) algorithm [27], an efficient implementation of MCMC that runs multiple different chains simultaneously to ensure global exploration of the parameter space, and adaptively tunes the scale and orientation of the jumping distribution using differential evolution (a genetic algorithm proposed by Storn & Price [28]) and a Metropolis–Hastings update step [29,30]. More specifically, we adopt the DREAMZS variant of the DREAM algorithm, which also makes use of sampling from past states visited by the Markov chains and of a snooker update step (in addition to the parallel update steps) to generate candidate points in each individual chain—thus reducing the number of parallel chains needed for an effective exploration of the posterior distribution while at the same time increasing the diversity of candidate points [31]. The goodness of each single simulation is computed as the residual sum of squares between weekly hospitalized cholera cases per hydrological unit as evaluated from the KZN epidemiological record and from the model. The algorithm is initialized with broad flat prior distributions for parameter values and is let running until convergence (about 15 000 iterations).

5. Simulation results

The best-fit model is able to reproduce fairly well the spatio-temporal dynamics of the KZN cholera epidemics (table 2 and figure 5). In fact, not only does the model provide a good match of the temporal evolution of the outbreak (figure 5c, coefficient of determination r2 = 0.89), but it also provides a satisfactory description of the spatial distribution of the reported cholera cases (figure 5d; r2 = 0.79). In particular, the model is able to correctly reproduce the diffuse spread of the epidemics in coastal regions, the relatively small incidence of the disease in the highly populated area around Durban (i.e. the eThekwini metropolitan municipality; figure 3b), the high number of cases reported in the Thukela basin (the largest basin in the region; figure 3a) and the patchy distribution of hospitalized cases in other inland regions. In addition, the model reproduces the fact that the highest values of cholera incidence are found in medium-sized local communities (figure 5d, inset), as already found by Bertuzzo et al. [8] with reference to the KZN epidemic.

View this table:
Table 2.

Model fitting. Best-fit parameter values and 90% credible intervals for the full model and the model without human mobility (m = 0). Credible intervals are computed by randomly sampling 1000 states of the Markov chains after convergence of the DREAMZS algorithm (see text for details).

Figure 5.

Simulation of the KZN cholera outbreak. Comparison between (a) observed and (b) generated spatial patterns of cumulated cholera cases (best-fit full model). Spatial linear interpolation of the node values has been performed in both panels. White dots in (a) show the location of the cases recorded during the first week of the epidemics, used as initial condition for model simulations. (c) Temporal evolution of the epidemic as recorded from field data (red) and simulated by the best-fit full model (blue). The shaded area shows the uncertainty of model simulations evaluated as the envelope of a set S of 200 randomly selected runs corresponding to parameter settings explored by the DREAMZS algorithm after convergence. (d) Comparison between observed and generated epidemic records catchment by catchment (best-fit full model). Dot size is proportional to catchment population (log scale). Error bars show the variability found within S. Inset: yearly cholera incidence versus local community size, as evaluated from data (red) and the best-fit full model simulation (blue). Incidence values for each network node have been grouped via a logarithmic binning based on local population size. Bins and error bars represent the mean value for the best-fit simulation and minimum/maximum values among the simulations in S. Parameter values as in tables 1 and 2 (full model).

As a result of the spatially heterogeneous distribution of population densities, water resources and sanitation conditions, all the epidemiological parameters introduced above (R0, r and T) are found to be highly variable in space. In particular, the basic reproductive number R0 (computed with l = 0 and m = 0) assumes values between 1.55 × 10−2 and 1.31 (5–95th percentiles), with a mean value of 0.321 (1.51 for the nodes where R0 > 1) and about 90 per cent of the network nodes characterized by R0 ≤ 1. This suggests that transport processes play a role of paramount importance in the development and the maintenance of the epidemic. As for the other epidemiological parameters, the initial growth rate of the epidemic takes values between 1.30 × 10−3 and 0.114 (d−1), with a mean value of 4.62 × 10−2 (d−1), while the generation time falls between 7.13 and 9.29 days, with a mean value of 8.51 days. Note that growth rates and generation times have been evaluated only for the nodes where R0 > 1. Note also that, quite interestingly, our estimates of the generation time are very close to those usually proposed for cholera epidemics (7–10 days; see Checchi et al. [32]).

To disentangle the impact of each model parameter on the intensity of the epidemic outbreak, we have performed a sensitivity analysis of the model outcomes with respect to variations in the parameter values. In particular, we allowed the parameters to vary one by one through repeated model runs (±20% variations with respect to the best-fit parameter set). We have then computed the variations of simulated yearly cholera incidence (i.e. the total number of reported cholera cases) evaluated in the whole area with respect to the best-fit simulation. The results of this sensitivity analysis (figure 6) show that the parameters quantifying the availability of clean water (βm and ɛ) and toilet facilities (θ and ϕ) are responsible for some of the largest variations of cholera incidence, thus showing that sanitation conditions do obviously play a key role in controlling the spread of the disease. Note that epidemiological dynamics and the ecology of V. cholerae do also matter, respectively, through the recovery rate γ and the mortality rate μB.

Figure 6.

Effects of parameter variations on the simulation of the best-fit full model. (a) Variations of total yearly cholera incidence produced by ±20% variations of the parameters reported in table 1. (b) As in (a), with reference to the tuning parameters reported in table 2 (±20% of the best-fit values obtained with the full model). Black (grey) refers to simulations in which the value of the relevant parameter has been increased (decreased) by 20%.

On the contrary, the variations of the human mobility parameters m and ξ considered in figure 6 produce a small effect on cholera incidence. However, this does not imply that human mobility can be neglected. The role played by human mobility in the formation of the spatio-temporal epidemic pattern can be in fact better understood by examining the performance of a model that does not account for it. To this end, we have fitted a model with m = 0 (i.e. model 2.1). Our results (table 2 and figure 7) show that in the absence of human movement, cholera cannot spread outside the hydrological catchments where the epidemic started. As a consequence, both the intensity (figure 7a) and the spatial extent (figure 7b) of the epidemic outbreak turn out to be remarkably underestimated. Therefore, human mobility is the actual driver of inter-catchment dissemination of cholera pathogens, thus proving instrumental to the understanding of large-scale patterns of cholera spread. Therefore, a model without human mobility cannot be expected to perform as well as a model accounting for human-mediated vibrio dissemination. Similar performances might perhaps be achieved in a model without mobility at the expense of modelling the ‘seeding’ of different outbreaks with the right timing in each basin—which would lead to a remarkably less parsimonious approach, as it would require the detailed knowledge of the initial condition of the epidemic, and would preclude the use of models like this as a predictive/management tool.

Figure 7.

The role of human mobility in the spread of cholera. (a) Temporal evolution of the epidemic as recorded from field data (red) and simulated by the best-fit model without human mobility (m = 0, blue line). The shaded area shows the uncertainty of model simulations and is evaluated as the envelope of a set S′ of 200 randomly selected runs corresponding to parameter settings explored by the DREAMZS algorithm after convergence. (b) Comparison between observed and generated epidemic records catchment by catchment (best-fit model without human mobility). Dot size is proportional to catchment population (log scale). Error bars show the variability found within S′. Inset: spatial distribution of cumulated cholera cases as predicted by the best-fit full model with m = 0.

Because of their effects on disease incidence (see again figure 6), sanitation conditions are ideal candidates as targets for possible proactive intervention strategies aimed at reducing the impact of cholera epidemics. As suggested by WHO, in fact, prevention is essential to avert cholera outbreaks, namely by expanding access to clean water resources and toilet facilities [3]. Model 2.2 can be effectively used to assess the impacts of interventions designed to reduce the fraction of households without access to piped water or toilet facilities. To this end, we assume thatEmbedded Image where Embedded Image (Embedded Image) is the fraction of households currently without access to clean water (toilet facilities) and uω (uτ) is the sanitation effort. As a proof of concept, we separately consider actions on piped water and toilets, and assume three different spatial allocations for sanitation efforts, i.e. (i) spatially homogeneous; (ii) targeted to poorly sanitated communities (Embedded Image or Embedded Image); (iii) targeted to large communities (Hi ≥ 10 000). Higher levels of sanitation efforts do obviously lead to larger portions of the population being subject to interventions. Our analysis (figure 8) shows that all such interventions may greatly reduce the expected incidence of the disease (or even prevent its insurgence). However, owing to KZN hygienic conditions, interventions targeted to large communities are found to be far less effective than either a spatially homogeneous allocation of sanitation efforts or actions targeted to poorly sanitated communities, with the latter being the most effective interventions for relatively small values of sanitation effort (in particular with actions on piped water). Therefore, understanding the correlations between population distribution and hygienic conditions is fundamental to the planning of cost-effective sanitation campaigns.

Figure 8.

Effects of preventive interventions on sanitation conditions. (a) Total yearly cholera incidence versus population involved in sanitation efforts aimed at improving access to piped water. Solid, dashed and dotted lines, respectively, represent spatially homogeneous interventions, interventions targeted to poorly sanitated communities (ω ≥ 0.5) and interventions targeted to large communities (Hi > 10 000). (b) As in (a), with reference to efforts aimed at improving access to toilet facilities.

6. Discussion and conclusions

In this work, we have proposed a mechanistic spatially explicit model for cholera epidemics combining different pathways of long- and short-range propagation of the disease, and we have applied it to a cholera outbreak occurred in the KZN province of South Africa during years 2000–2001. The novelty of our approach lies in the blending of different propagation mechanisms, namely hydrological transport and human-mediated dissemination of V. cholerae pathogens. At a regional scale, in fact, pathogen transport along river systems alone cannot explain the emergence of complex epidemic patterns, which include inter-catchment dispersal of V. cholerae. Other mechanisms have thus to be invoked. Here, we have explicitly analysed the role played by human mobility in the long-range dissemination of the disease. By extending an epidemiological model of reactive transport along networks [8,9], we have identified a second transport network for V. cholerae pathogens produced by the temporary displacements of susceptible individuals and asymptomatic infected carriers. Mobility patterns indeed proved to be not only a plausible, but indeed an actual and important driver of disease propagation. In particular, the proposed value of the shape parameter discounting the effect of distance between communities on human movement (ξ ≈ 1, as commonly found in epidemiological studies, e.g. [13,16]) supports the idea that mobility patterns are determined by directed movements along the fastest routes to target locations (while ξ ≥ 2 would correspond to diffusive-like movement patterns; see the relevant discussion in Xia et al. [13]).

The combination of local epidemiological dynamics, hydrological transport and long-distance V. cholerae redistribution may explain the main spatial and temporal patterns of cholera outbreak recorded in the KZN epidemics. Notice that the complexity of the proposed model is kept to a minimum, in that the model accounts for a few selected processes that are relevant to the spread of the disease on short timescales (less than 5 years, corresponding to the timescale of immunity loss; see Koelle et al. [33]). Other mechanisms not described here might possibly play important roles in the formation of spatio-temporal patterns of cholera propagation in the long run, especially in regions where the disease is endemic. In particular, the existence of hyperinfectious bacterial states (e.g. [3436]), the long-term loss of immunity of recovered hosts [33] and the seasonality of environmental or climatic forcings [3739] may all be significant ingredients of cholera dynamics on longer timescales than those investigated here. Note also that human mobility patterns other than gravity transmission (e.g. random movement, small-world connectivity, etc.) might be considered in our framework effortlessly.

Our analysis points out the importance of a spatially explicit approach to modelling large-scale cholera epidemics arising in regions characterized by heterogeneous environmental conditions. In fact, a proper spatial description of local sanitation conditions proved to be crucial for the model to accurately reproduce the epidemic patterns observed in the KZN region. A fully spatially explicit approach is also necessary whenever the timescale imposed by spreading processes (e.g. hydrological transport) is larger than that related to local epidemiological dynamics (e.g. disease transmission), as recently proved on a theoretical basis [9]. In this work, spatial interactions have been described within a multi-layer network framework. We deem this approach an effective and general tool to model the spatio-temporal evolution of cholera epidemics and, in general, of waterborne diseases characterized by multiple dispersal pathways. In a broader context, as the coexistence of multiple propagation pathways is not unique to epidemiology, the idea of a multi-layer network approach can be profitably extended to other disciplines, such as the study of harmful invasive species [40,41].

In conclusion, a comprehensive picture of the mechanisms that drive the evolution of cholera outbreaks at different time and spatial scales begins to emerge, with possible relevance to other waterborne diseases. Such a picture involves different pathways for pathogen dispersal. At present, the effectiveness of our framework relies on the availability of reliable epidemiological and demographic datasets, as well as of a detailed description of water use habits and sanitation conditions. Such requirements might be restrictive in contexts where cholera is non-endemic, or site-specific estimates of the relevant epidemiological parameters are lacking. Nevertheless, our work suggests that mechanistic models like the one presented here, properly tuned with the acquisition of a relatively small set of core epidemiological data, could be turned into general predictive/management tools to produce and analyse realistic epidemic scenarios. Such predictions are believed to be meaningful for contexts where cholera still exacts a heavy toll of human lives.


The authors wish to thank the KwaZulu-Natal Department of Health, the provincial office of Statistics South Africa and the South African Department of Water Affairs for providing demographic and epidemiological data on the 2000–2002 KZN epidemics. The authors also wish to thank three anonymous reviewers for their comments on a previous version of this manuscript, and Dr Jasper A. Vrugt (University of California, IR, USA) for making available the source code of the DREAM and DREAMZS algorithms. L.M., E.B., L.R. and A.R. gratefully acknowledge funding from ERC Advanced Grant RINEC 22761 and SNF Grant 200021_124930/1. R.C. and M.G. acknowledge the support provided by Politecnico di Milano and Italian Ministry of Research (Interlink project no. II04CE49G8). I.R.-I. acknowledges the support of the James S. McDonnell Foundation through a grant for Studying Complex Systems (220020138).

  • Received May 16, 2011.
  • Accepted June 22, 2011.


View Abstract