Perspectives on the basic reproductive ratio

J.M Heffernan, R.J Smith, L.M Wahl


The basic reproductive ratio, R0, is defined as the expected number of secondary infections arising from a single individual during his or her entire infectious period, in a population of susceptibles. This concept is fundamental to the study of epidemiology and within-host pathogen dynamics. Most importantly, R0 often serves as a threshold parameter that predicts whether an infection will spread. Related parameters which share this threshold behaviour, however, may or may not give the true value of R0. In this paper we give a brief overview of common methods of formulating R0 and surrogate threshold parameters from deterministic, non-structured models. We also review common means of estimating R0 from epidemiological data. Finally, we survey the recent use of R0 in assessing emerging diseases, such as severe acute respiratory syndrome and avian influenza, a number of recent livestock diseases, and vector-borne diseases malaria, dengue and West Nile virus.


1. Introduction

The basic reproductive ratio, R0, is a key concept in epidemiology, and is inarguably ‘one of the foremost and most valuable ideas that mathematical thinking has brought to epidemic theory’ (Heesterbeek & Dietz 1996). Originally developed for the study of demographics (Böckh 1886; Sharp & Lotka 1911; Dublin & Lotka 1925; Kuczynski 1928), it was independently studied for vector-borne diseases such as malaria (Ross 1911; MacDonald 1952) and directly transmitted human infections (Kermack & McKendrick 1927; Dietz 1975; Hethcote 1975). It is now widely used in the study of infectious disease, and more recently, in models of in-host population dynamics. Two excellent surveys of the tangled history of R0 can be found in Dietz (1993) and Heesterbeek (2002). An excellent overview of the demographic history can be found in Smith & Keyfitz (1977).

As a general definition, R0 is the expected number of secondary individuals produced by an individual in its lifetime. The interpretation of ‘secondary’, however, depends on context. In demographics and ecology, R0 is taken to mean the lifetime reproductive success of a typical member of the species. In epidemiology, we take R0 to mean the number of individuals infected by a single infected individual during his or her entire infectious period, in a population which is entirely susceptible. For in-host dynamics, R0 gives the number of newly infected cells produced by one infected cell during its lifetime, assuming all other cells are susceptible.

From this definition, it is immediately clear that when R0<1, each infected individual produces, on average, less than one new infected individual, and we therefore predict that the infection will be cleared from the population, or the microparasite will be cleared from the individual. If R0>1, the pathogen is able to invade the susceptible population. This threshold behaviour is the most important and useful aspect of the R0 concept. In an endemic infection, we can determine which control measures, and at what magnitude, would be most effective in reducing R0 below one, providing important guidance for public health initiatives.

The magnitude of R0 is also used to gauge the risk of an epidemic or pandemic in emerging infectious disease. For example, the estimation of R0 was of critical importance in understanding the outbreak and potential danger from severe acute respiratory syndrome (SARS) (Choi & Pak 2003; Lipsitch et al. 2003; Lloyd-Smith et al. 2003; Riley et al. 2003). R0 has been likewise used to characterize bovine spongiform encephalopathy (BSE) (Woolhouse & Anderson 1997; Ferguson et al. 1999; de Koeijer et al. 2004), foot and mouth disease (FMD) (Ferguson et al. 2001; Matthews et al. 2003), novel strains of influenza (Mills et al. 2004; Stegeman et al. 2004) and West Nile virus (Wonham et al. 2004). The incidence and spread of dengue (Luz et al. 2003), malaria (Hagmann et al. 2003), Ebola (Chowell et al. 2004b) and scrapie (Gravenor et al. 2004) have also been assessed using R0 in recent literature. Topical issues such as the risks of indoor airborne infection (Rudnick & Milton 2003), bioterrorism (Kaplan et al. 2002; Longini et al. 2004), and computer viruses (Lloyd & May 2001) also rely on this important concept.

Ongoing theoretical work has extended R0 for a range of complex models, including stochastic and finite systems (Nasell 1995), models with spatial structure (Mollison 1995b; Lloyd & May 1996; Keeling 1999) or age-structure (Anderson & May 1991; Diekmann & Heesterbeek 2000; Hyman & Li 2000), and macroparasite models (Anderson & May 1991; Diekmann & Heesterbeek 2000). We note, however, that the practical use of R0 has been, for the most part, restricted to very simple deterministic systems. For comparison with this ‘field’ literature in epidemiology, we restrict our attention in the following sections to deterministic, unstructured microparasite models.

The purpose of this paper is to review the various methods currently in use for the derivation of R0, highlighting the difference between R0 and surrogate parameters with equivalent threshold behaviour. We then discuss methods commonly used to estimate R0 from incidence data. Finally, we give an overview of the recent use of R0 in assessing emerging and endemic disease. Our aim in this final section of the paper is to determine the usefulness of this endeavour: to what extent has estimating R0 informed public health measures?

2. Derivations of R0 from a deterministic model

The derivation of R0 from a non-spatial, deterministic model is fairly straightforward from first principles. The survival function method (§2.1) gives the ‘gold standard’ determination of R0, and is applicable even when non-constant transition probabilities between classes (i.e. non-exponential lifetime distributions) are assumed. For models which include multiple classes of infected individuals, the next generation operator is the natural extension of this approach (§2.2). However, we note that the definition of R0 may have more than one possible interpretation in the multi-class system, as discussed below.

2.1 Survival function

The method we describe as the ‘survival function’ approach is, in essence, a first-principles definition of R0, and thus has a rich history of use. The approach is described in detail in Heesterbeek & Dietz (1996), who also give an interesting historical overview.

Consider a large population and let F(a) be the probability that a newly infected individual remains infectious for at least time a. This is called the survival probability. Also, let b(a) denote the average number of newly infected individuals that an infectious individual will produce per unit time when infected for total time a. Then, R0 is given by:

$$R_0 = \int_0^\infty b(a)\,F(a)\,\mathrm{d}a. \tag{2.1}$$

As this expression yields R0 by definition, this approach will be appropriate for any model in which closed-form expressions can be given for the underlying survival probability, F(a), and the infectivity as a function of time, b(a). In particular, it is straightforward to handle situations in which infectivity depends on time since infection, or in which other transition probabilities between states vary with time. Thus, this derivation of R0 is not restricted to systems described by ordinary differential equations (ODEs).
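As a minimal numerical sketch of the survival function integral, suppose (purely for illustration; this special case is an assumption, not an example from the text) that infectivity is constant, b(a)=β, and that survival is exponential, F(a)=exp(−γa). The integral should then reduce to β/γ. The function name, the parameter values and the truncation of the infinite upper limit at `a_max` are all hypothetical choices.

```python
import math

def r0_survival(b, F, a_max=200.0, steps=200_000):
    """Approximate R0 = integral_0^inf b(a) F(a) da with the trapezoidal rule.

    The infinite upper limit is truncated at a_max, which assumes that
    F(a) is negligible beyond that point.
    """
    h = a_max / steps
    total = 0.5 * (b(0.0) * F(0.0) + b(a_max) * F(a_max))
    for i in range(1, steps):
        a = i * h
        total += b(a) * F(a)
    return total * h

beta, gamma = 0.5, 0.2  # hypothetical infectivity and recovery rates
r0_numeric = r0_survival(lambda a: beta, lambda a: math.exp(-gamma * a))
r0_exact = beta / gamma  # closed form for this special case
print(r0_numeric, r0_exact)
```

Because `b` and `F` are passed as arbitrary functions of time since infection, the same routine could be reused with non-exponential survival curves, which is the situation in which this approach is most useful.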

This method can also be naturally extended to describe models in which a series of states are involved in the ‘reproduction’ of an infected individual. As an example of the latter technique, consider epidemic modelling of malaria. An infected human may pass the infection to a mosquito, which may in turn infect more humans. This complete cycle must be taken into account in our derivation of R0, which we might expect to yield the total number of infected humans produced by one infected human. In general, if only two distinct infectious states are involved in such an infection cycle, F(a) can be defined as the probability that an individual in state 1 at time zero produces an individual who is in state 2 until at least time a. Similarly, b(a) is the average number of new individuals in state 1 produced by an individual who has been in state 2 for time a. In modelling malaria, F(a) could be the probability that a human infected at time zero produces an infected mosquito which remains alive until at least time a. In more concrete terms, F(a) would be the integral of the following product:

$$F(a) = \int_0^a \Pr(\text{human infected at time 0 exists at time } t) \times \Pr(\text{human infected for total time } t \text{ infects a mosquito}) \times \Pr(\text{infected mosquito lives to be age } a-t)\,\mathrm{d}t, \tag{2.2}$$

while b(a) would simply be the average number of humans newly infected by a mosquito which has been infected for time a. (Note that we could also take the infected mosquito as state 1, deriving an analogous expression which would yield the same value of R0.)

Unfortunately, derivations such as equation (2.2) become increasingly cumbersome as this method is extended to infection cycles involving three or more states (Hethcote & Tudor 1980; Lloyd 2001b; Huang et al. 2003). In these situations, the next generation operator offers an elegant solution, as described in the following section.

2.2 Next generation method

A rich history in the literature addresses the derivation of R0, or an equivalent threshold parameter, when more than one class of infectives is involved (Rushton & Mautner 1955; Hethcote 1978; Nold 1980; Hethcote & Thieme 1985).

The next generation method, introduced by Diekmann et al. (1990), is a general method of deriving R0 in such cases, encompassing any situation in which the population is divided into discrete, disjoint classes. The next generation operator can thus be used for models with underlying age structure or spatial structure, among other possibilities. For typical implementations, continuous variables within the population are approximated by a number of discrete classes. This approximation assumes that transition probabilities between states are constant, or equivalently, that the distribution of residence times in each state is exponential.

The next generation operator is fully described in Diekmann & Heesterbeek (2000) and a number of salient cases are elucidated in van den Driessche & Watmough (2002). Recent examples of this method are given in Matthews et al. (1999), Porco & Blower (2000), Castillo-Chavez et al. (2002), Hill & Longini (2003) and Wonham et al. (2004).

In the next generation method, R0 is defined as the spectral radius of the ‘next generation operator’. The formation of the operator involves determining two compartments, infected and non-infected, from the model. In this section, we outline the steps needed to find the next generation operator in matrix notation (assuming only finitely many types), and then employ this method for a susceptible–exposed–infectious–recovered (SEIR) model and a model of malaria. (For a detailed explanation on the formation of the next generation operator when there are infinitely many types see pp. 95–96 of Diekmann & Heesterbeek (2000).)

Let us assume that there are n compartments of which m are infected. We define the vector $x = (x_1, \ldots, x_n)$, where $x_i$, i=1,…,n, denotes the number or proportion of individuals in the ith compartment. Let $\mathcal{F}_i(x)$ be the rate of appearance of new infections in compartment i and let $\mathcal{V}_i(x) = \mathcal{V}_i^-(x) - \mathcal{V}_i^+(x)$, where $\mathcal{V}_i^+$ is the rate of transfer of individuals into compartment i by all other means and $\mathcal{V}_i^-$ is the rate of transfer of individuals out of the ith compartment. The difference $\mathcal{F}_i(x) - \mathcal{V}_i(x)$ gives the rate of change of $x_i$. Note that $\mathcal{F}_i$ should include only infections that are newly arising, and should not include terms which describe the transfer of infectious individuals from one infected compartment to another.

Assuming that $\mathcal{F}_i$ and $\mathcal{V}_i$ meet the conditions outlined by Diekmann et al. (1990) and van den Driessche & Watmough (2002), we can form the next generation matrix (operator) $FV^{-1}$ from matrices of partial derivatives of $\mathcal{F}_i$ and $\mathcal{V}_i$. Specifically,

$$F = \left[\frac{\partial \mathcal{F}_i(x_0)}{\partial x_j}\right], \qquad V = \left[\frac{\partial \mathcal{V}_i(x_0)}{\partial x_j}\right],$$

where i, j=1,…,m and where $x_0$ is the disease-free equilibrium. The entries of $FV^{-1}$ give the rate at which infected individuals in $x_j$ produce new infections in $x_i$, times the average length of time an individual spends in a single visit to compartment j. R0 is given by the spectral radius (dominant eigenvalue) of the matrix $FV^{-1}$.

As an example, let us consider an SEIR model. Since we are concerned with the populations that spread the infection we only need to model the exposed, E, and infected, I, classes. Let us define the model dynamics using the following equations:

$$\frac{dE}{dt} = \beta S I - (\mu + k)E, \qquad \frac{dI}{dt} = kE - (\gamma + \mu)I, \tag{2.3}$$

where μ is the per capita natural death rate, β is the efficacy of infection of susceptible individuals S, k is the rate at which a latent individual becomes infectious and γ is the per capita recovery rate. For this system

$$F = \begin{pmatrix} 0 & \beta\lambda/\mu \\ 0 & 0 \end{pmatrix}$$

(where λ is the birth rate of susceptibles, so that S=λ/μ at the disease-free equilibrium) and

$$V = \begin{pmatrix} \mu + k & 0 \\ -k & \gamma + \mu \end{pmatrix},$$

and thus

$$R_0 = \frac{k\beta\lambda}{\mu(\mu + k)(\mu + \gamma)}. \tag{2.4}$$

Note that this is also the value of R0 determined by the survival function method.
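The next generation calculation for an SEIR model of this kind can be sketched numerically. The sketch assumes mass-action incidence, dE/dt = βSI − (μ+k)E and dI/dt = kE − (γ+μ)I, with S=λ/μ at the disease-free equilibrium; the function name and all parameter values are hypothetical.

```python
import numpy as np

def seir_r0_next_generation(beta, k, gamma, mu, lam):
    """Spectral radius of F V^{-1} for the infected classes (E, I)."""
    s0 = lam / mu                        # susceptibles at the disease-free equilibrium
    F = np.array([[0.0, beta * s0],      # new infections enter E and are caused by I
                  [0.0, 0.0]])
    V = np.array([[mu + k, 0.0],         # E is left by death or by becoming infectious
                  [-k, gamma + mu]])     # I gains from E and is left by recovery or death
    next_gen = F @ np.linalg.inv(V)
    return float(max(abs(np.linalg.eigvals(next_gen))))

params = dict(beta=0.002, k=0.2, gamma=0.1, mu=0.01, lam=1.0)  # hypothetical values
r0_ngm = seir_r0_next_generation(**params)
r0_closed = (params["k"] * params["beta"] * params["lam"]
             / (params["mu"] * (params["mu"] + params["k"])
                * (params["mu"] + params["gamma"])))
print(r0_ngm, r0_closed)
```

Under these assumptions the numerical spectral radius and the closed-form expression kβλ/(μ(μ+k)(μ+γ)) should agree to rounding error.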

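For the second example, we consider a model of malaria. Let us describe the rate of change of the infected human, HI, and mosquito, MI, populations by the following equations:

$$\frac{dH_I}{dt} = \beta_{MH} H_S M_I - (\mu_H + \sigma + \alpha) H_I, \qquad \frac{dM_I}{dt} = \beta_{HM} M_S H_I - \mu_M M_I. \tag{2.5}$$

Infected humans are produced by the infection of susceptible humans, HS, by an infected mosquito with efficacy βMH. We assume that they die with natural death rate μH, die due to infection with rate σ and recover from the infection with rate α. Infected mosquitoes are produced when susceptible mosquitoes, MS, bite infected humans. We assume that this process has efficacy βHM and assume that infected mosquitoes can only leave the infected compartment by dying naturally with rate μM. For this system we find that, with HS and MS evaluated at the disease-free equilibrium,

$$F = \begin{pmatrix} 0 & \beta_{MH} H_S \\ \beta_{HM} M_S & 0 \end{pmatrix}$$

and

$$V = \begin{pmatrix} \mu_H + \sigma + \alpha & 0 \\ 0 & \mu_M \end{pmatrix}.$$

Since V is non-singular we can determine V−1. Thus,

$$R_0 = \sqrt{\frac{\beta_{MH}\,\beta_{HM}\,H_S\,M_S}{\mu_M(\mu_H + \sigma + \alpha)}}. \tag{2.6}$$

For comparison, we also compute the value of R0 for this system using the survival function method:

$$R_0 = \frac{\beta_{MH}\,\beta_{HM}\,H_S\,M_S}{\mu_M(\mu_H + \sigma + \alpha)}. \tag{2.7}$$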

The difference here is a matter of definition: the survival function gives the total number of infectives in the same class produced by a single infective of that class, while the next generation operator gives the mean number of new infectives per infective in any class, per generation. Values corresponding to the latter definition thus depend on the number of infective classes in the infection cycle. We note that the latter definition is widely accepted as standard in the biomathematics literature (e.g. Diekmann & Heesterbeek 2000), but the former definition has also been used extensively (Anderson & May 1991; Barbour & Kafetzaki 1993; Nowak & May 2000), and is still in standard use in epidemiology (Hagmann et al. 2003; Luz et al. 2003) and immunology (Huang et al. 2003).
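The two definitions can be compared numerically. The sketch below assumes a mass-action malaria model, dHI/dt = βMH·HS·MI − (μH+σ+α)HI and dMI/dt = βHM·MS·HI − μM·MI, with HS and MS fixed at their disease-free values; the function name and every parameter value are hypothetical.

```python
import numpy as np

def malaria_r0(beta_mh, beta_hm, h_s, m_s, mu_h, sigma, alpha, mu_m):
    """Return (per-generation R0, full-cycle R0) for the human-mosquito cycle."""
    F = np.array([[0.0, beta_mh * h_s],    # infected mosquitoes infect humans
                  [beta_hm * m_s, 0.0]])   # infected humans infect mosquitoes
    V = np.array([[mu_h + sigma + alpha, 0.0],
                  [0.0, mu_m]])
    per_generation = float(max(abs(np.linalg.eigvals(F @ np.linalg.inv(V)))))
    # Human -> mosquito -> human: product of the two one-step reproduction factors.
    full_cycle = (beta_hm * m_s / (mu_h + sigma + alpha)) * (beta_mh * h_s / mu_m)
    return per_generation, full_cycle

r0_generation, r0_cycle = malaria_r0(
    beta_mh=2e-4, beta_hm=1e-4, h_s=1000.0, m_s=5000.0,  # hypothetical values
    mu_h=1e-4, sigma=1e-3, alpha=1e-2, mu_m=0.1)
print(r0_generation, r0_cycle)
```

With two infective classes in the cycle, the per-generation value is the square root of the full-cycle value, so both fall on the same side of one and give the same threshold.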

3. Derivations of threshold criteria

As mentioned in §1, the most important feature of R0 is that it reflects the stability of the disease-free equilibrium. When R0<1, this equilibrium is stable and we predict that the pathogen will be cleared.

Surveying the recent literature, it quickly becomes apparent that a number of related quantities, all of which share this ‘threshold’ behaviour, are used as surrogates for R0. For example, $R_0^n$ (n>0) will give an equivalent threshold, but does not give the number of secondary infections produced by a single infectious individual.

The methods outlined in the following section each derive, from a deterministic model, a quantity which shares this predictive threshold with R0. For some models, these methods will, in fact, yield the true value of R0, but this is by no means guaranteed. If a prediction of whether the pathogen will persist or be cleared is the only feature of interest, a threshold criterion is sufficient—however, these methods cannot be used to compare risks associated with different pathogens.

We outline three such threshold criteria below, giving examples where each is used in the literature.

3.1 Jacobian and stability conditions

A predictive threshold is often found through the study of the eigenvalues of the Jacobian at the disease-free equilibrium (for an overview see Diekmann & Heesterbeek 2000). This is a simple, widely used method for ODE systems. Using this method, a parameter is derived from the condition that all of the eigenvalues of the Jacobian have a negative real part. This can easily be done using the characteristic polynomial and the Routh–Hurwitz stability conditions.
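A minimal sketch of the Jacobian criterion, assuming an SEIR model with dE/dt = βSI − (μ+k)E and dI/dt = kE − (γ+μ)I and S=λ/μ at the disease-free equilibrium. Only the infected-class block of the Jacobian is examined, on the assumption that the remaining eigenvalue of the full system (−μ, from the susceptible equation) is always negative. Function names and parameter values are hypothetical.

```python
import numpy as np

def seir_r0(beta, k, gamma, mu, lam):
    """Closed-form R0 for the assumed SEIR model."""
    return k * beta * lam / (mu * (mu + k) * (mu + gamma))

def dfe_growth_rate(beta, k, gamma, mu, lam):
    """Largest real part among eigenvalues of the infected-class Jacobian."""
    s0 = lam / mu  # susceptibles at the disease-free equilibrium
    jac = np.array([[-(mu + k), beta * s0],
                    [k, -(gamma + mu)]])
    return float(np.max(np.linalg.eigvals(jac).real))

for beta in (0.0005, 0.002):  # hypothetical transmission parameters
    args = (beta, 0.2, 0.1, 0.01, 1.0)
    print(f"beta={beta}: R0={seir_r0(*args):.3f}, "
          f"growth rate={dfe_growth_rate(*args):+.4f}")
```

In this example the sign of the largest real part changes exactly where the closed-form R0 crosses one, but the magnitude of the eigenvalue is a rate, not a number of secondary infections.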

The Jacobian method clearly allows us to derive a parameter that reflects the stability of the disease-free equilibrium. The parameter obtained in this way, however, may or may not reflect the biologically meaningful value of R0. An example where the Jacobian method does not yield R0 is described in detail in Diekmann & Heesterbeek (2000; exercise 5.43). Despite this caveat, the technique remains popular; recent uses of this criterion in the literature include Porco & Blower (1998); Murphy et al. (2002); Kawaguchi et al. (2004); Laxminarayan (2004) and Moghadas (2004). In Roberts & Heesterbeek (2003), it is suggested that if this threshold parameter does not have the same biological interpretation as the dominant eigenvalue of the next generation matrix, then it should not be called the basic reproductive ratio, nor denoted R0.

3.2 Existence of the endemic equilibrium

Similarly, we can often derive a condition based on parameter values such that when the condition holds, the endemic equilibrium exists, whereas when the condition is false, only the disease-free equilibrium exists. Mathematically, we are referring to a transcritical bifurcation, and we know that the condition must switch from being false to true at parameter values which give R0=1.

For example, consider the model of herpes simplex virus described in Blower et al. (1998). For simplicity, we can ignore drug resistance (i.e. p1=p2=0). This model then consists of three differential equations

[equations omitted]

where X is the susceptible population, QS represents those infected with the virus in the non-infectious latent state, HS represents those infected with the virus in the infectious state and N=X+QS+HS. (Other letters are positive parameters.) At equilibrium,

[equation omitted]

Thus, either HS=0 (the disease-free equilibrium) or

[equation omitted]

(the endemic equilibrium). It follows that the endemic equilibrium only exists when

[inequality omitted]

and does not exist if the reverse inequality holds. (Infectious episodes are brief, but recur over the course of the patient's lifetime, with the virus quiescent at other times. This makes calculating R0 by other methods quite complicated.)

3.3 Constant term of the characteristic equation

For more complex models, the characteristic equation may be of the form

$$\lambda^n + p_{n-1}\lambda^{n-1} + \cdots + p_1\lambda + p_0 = 0,$$

with p1, p2,…,pn−1>0. In this special case, n−1 roots of the polynomial have negative real part. When p0=0, the nth root, or largest eigenvalue, is zero; when p0>0, all eigenvalues have negative real part; and when p0<0, the largest eigenvalue has positive real part. Thus, the stability is determined solely by the sign of the constant term of the characteristic equation.

For example, consider the multi-strain tuberculosis model described in Blower & Chou (2004). In eqn (6) in their appendix, their characteristic polynomial is

[equation omitted]

where

[equation omitted]

and all parameters are non-negative. Note that Bi has the property that Bi>0 when Ci=0. For each strain i, the equation for Ci=0 is rearranged to produce

[equation omitted]

Each R0(i) value has the property that R0(i)=1 when Ci=0, R0(i)<1 when Ci>0 and R0(i)>1 when Ci<0.

Calculating an R0(i) for each strain using the methods from previous sections is extremely difficult, as is calculating a formula for the endemic equilibrium. However, the Jacobian matrix at the disease-free equilibrium is relatively tractable, so an R0(i) for each strain can be calculated from the constant term. This method generally allows for the calculation of threshold criteria when other methods fail.
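As a small worked illustration of the constant-term criterion, consider the infected classes of an SEIR model, assumed here to obey dE/dt = βSI − (μ+k)E and dI/dt = kE − (γ+μ)I with S=λ/μ at the disease-free equilibrium. The eigenvalue is written Λ (a symbol chosen for this sketch) to avoid a clash with the birth rate λ.

```latex
\begin{align*}
J &= \begin{pmatrix} -(\mu+k) & \beta\lambda/\mu \\ k & -(\gamma+\mu) \end{pmatrix}, \\
\det(\Lambda I - J) &= \Lambda^2 + p_1 \Lambda + p_0, \\
p_1 &= 2\mu + k + \gamma > 0, \\
p_0 &= (\mu+k)(\mu+\gamma) - \frac{k\beta\lambda}{\mu}, \\
p_0 > 0 &\iff \frac{k\beta\lambda}{\mu(\mu+k)(\mu+\gamma)} < 1 .
\end{align*}
```

Setting p0=0 and rearranging so that one side equals one recovers, in this simple case, the same expression that the next generation method gives for R0; for more complicated models the rearrangement is not unique, which is why the result is only guaranteed to be a threshold.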

4. Estimations from epidemiological data

The previous sections addressed methods of formulating R0 in terms of the parameters of some deterministic model. In order to estimate the value of R0 from incidence data, however, we require numerical estimates of a number of these parameters. Typically, death rates and recovery rates are readily estimated; in contrast, the contact or transmission rate is difficult to determine from direct measures. For this reason, R0 is rarely estimated using formulae such as equations (2.6) and (2.7) above. We outline a number of alternative approaches for estimating R0 from available data in §§4.1–4.4. These approaches typically involve simplifying assumptions to reduce the number of unknown parameters. For more complete overviews of these techniques, we refer the reader to Mollison (1995a), Diekmann & Heesterbeek (2000) and Hethcote (2000).

4.1 Susceptibles at endemic equilibrium

This method assumes that an endemic equilibrium is attained and uses the prevalence of the infection at this equilibrium to estimate R0. Following Mollison's (1995a) derivation, we consider a single infected individual and note that the number of successful contacts (in which the infection is passed on) for that individual should be given by R0πs, where πs is the probability that a given contact is with a susceptible. At equilibrium, the average number of new infections per infected individual must be exactly one, allowing us to write R0=1/πs. Under the assumption of homogenous mixing, the unknown probability, πs, can be estimated as the fraction of the host population that is susceptible at the endemic equilibrium. This yields an extremely simple estimate of the basic reproductive ratio, which has been used extensively (see Anderson & May (1991) for review).
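The estimate R0=1/πs can be checked against a simulated endemic equilibrium. The sketch below assumes a simple SIR model with births and deaths, written in population fractions (ds/dt = μ − βsi − μs, di/dt = βsi − (γ+μ)i) and integrated by the forward Euler method; the model, step size and parameter values are hypothetical choices made for illustration.

```python
def endemic_susceptible_fraction(beta, gamma, mu, dt=0.01, t_end=2000.0):
    """Integrate the assumed SIR model and return the long-run susceptible fraction."""
    s, i = 0.99, 0.01  # start close to the fully susceptible state
    for _ in range(int(t_end / dt)):
        new_infections = beta * s * i
        ds = mu - new_infections - mu * s
        di = new_infections - (gamma + mu) * i
        s, i = s + dt * ds, i + dt * di
    return s

beta, gamma, mu = 0.6, 0.1, 0.02       # hypothetical rates
r0_model = beta / (gamma + mu)         # R0 of this particular model
pi_s = endemic_susceptible_fraction(beta, gamma, mu)
r0_estimate = 1.0 / pi_s               # estimate from susceptibles at equilibrium
print(r0_model, r0_estimate)
```

In practice πs would come from serological or prevalence data rather than a simulation; the simulation only confirms that, for this homogeneously mixing model, the reciprocal of the equilibrium susceptible fraction recovers R0.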

An interesting point here is that R0 reflects not only the behaviour of the system at the uninfected equilibrium (which is apparent by definition), but may also, under certain assumptions, reflect important features of the endemic equilibrium. Similar to other ODE methods, we must first assume that the host population is homogenous, that is, all hosts have intrinsically similar epidemiological properties, independent of age, genetic make-up, geography, and so on. We also assume mass-action transmission, specifically, that the number of contacts per infective is independent of the number of infectives. The accuracy of this estimate will clearly depend on the degree to which these assumptions hold; if infectivity or mortality vary with age, for example, the approximation suffers.

Mathematically, this method may seem unrealistic at first glance, as R0<1 would imply that the fraction of susceptibles is greater than one. This is because there is a transcritical bifurcation at R0=1, and for R0<1 the number of infectives at the ‘endemic’ equilibrium is actually negative. On this portion of the bifurcation diagram, the uninfected equilibrium is stable, so a trajectory starting from non-negative initial conditions can never reach negative numbers of individuals. Practically, this means that when R0<1 we would never find a population at the endemic equilibrium, and could not apply this method. (Note that when the assumption of mass-action transmission is relaxed, a backward bifurcation may occur at R0=1, and diseases with R0<1 may persist (Dushoff 1996; Dushoff et al. 1998).)

Recent examples of this method include Heesterbeek (2003) and Ferguson et al. (2001).

4.2 Average age at infection

A related approach, also based on the endemic equilibrium, is that R0 can be estimated as L/A, where L is the mean lifetime and A is the mean age of acquiring the disease (Dietz 1975). A derivation for this simple relation is also provided by Mollison (1995a) and Hethcote (2000); for further discussion, see Anderson & May (1991) and Brauer (2002). In brief, we must assume that all individuals are born susceptible, that after acquiring the disease they are no longer susceptible, that the population is at the endemic equilibrium (i.e. R0>1) and that homogenous mixing, particularly among age groups, occurs. While this strong set of assumptions might never be fully realized in a practical setting, the usefulness of this approach is clear since both L and A are readily measured. This method was recently used to estimate R0 for endemic canine pathogens (Laurenson et al. 1998).
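The arithmetic of this estimate is simple enough to state as a short function. The mean lifetime and mean age at infection used below are hypothetical values, not data from any of the cited studies.

```python
def r0_from_mean_age(mean_lifetime, mean_age_at_infection):
    """Estimate R0 as L / A, with both quantities in the same time units."""
    if mean_lifetime <= 0 or mean_age_at_infection <= 0:
        raise ValueError("both L and A must be positive")
    return mean_lifetime / mean_age_at_infection

r0_example = r0_from_mean_age(70.0, 5.0)  # hypothetical L = 70 years, A = 5 years
print(r0_example)
```

The estimate inherits all of the assumptions listed above, so it is best read as an order-of-magnitude guide.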

4.3 The final size equation

While the previous two methods estimate R0 from the endemic equilibrium, the final size equation is applicable to closed populations only, where the infection leads either to immunity or death. In this situation, the number of susceptibles can only decrease and the final fraction of susceptibles, s(∞), can be used to estimate R0:

$$R_0 = \frac{\ln s(\infty)}{s(\infty) - 1}.$$

This was first recognized by Kermack & McKendrick (1927); for a detailed derivation and discussion, see Diekmann & Heesterbeek (2000), Hethcote (2000) and Brauer (2002). This estimate holds when the disease itself does not interfere with the contact process, or when contact intensity is proportional to population density.
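A brief sketch of the final size relation in both directions, assuming the form R0 = ln s(∞)/(s(∞)−1) with almost the whole population initially susceptible. The bisection solver, function names and the value R0=2 are hypothetical choices for illustration.

```python
import math

def r0_from_final_size(s_inf):
    """R0 implied by the final susceptible fraction s_inf, with 0 < s_inf < 1."""
    if not 0.0 < s_inf < 1.0:
        raise ValueError("s_inf must lie strictly between 0 and 1")
    return math.log(s_inf) / (s_inf - 1.0)

def final_susceptible_fraction(r0, tol=1e-12):
    """Solve ln(s) = r0 * (s - 1) for the root in (0, 1/r0) by bisection."""
    if r0 <= 1.0:
        raise ValueError("a non-trivial final size requires r0 > 1")
    g = lambda s: math.log(s) - r0 * (s - 1.0)
    lo, hi = math.exp(-r0 - 1.0), 1.0 / r0   # g(lo) < 0 < g(hi) when r0 > 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

s_inf = final_susceptible_fraction(2.0)
print(s_inf, r0_from_final_size(s_inf))
```

For R0=2 roughly one fifth of the population escapes infection, and feeding that fraction back into the estimator returns the original R0.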

4.4 Calculation from the intrinsic growth rate

Finally, R0 may be determined from the intrinsic growth rate of the infected population. This growth rate, often denoted r0, is the rate at which the total number of infectives, I, grows in a susceptible population, such that dI/dt=r0I. We note that this is an implicit definition of r0, and thus from a modelling perspective using r0 is seldom elegant.

Using incidence data, however, r0 can often be approximately measured from the growth rate of the infected class, and R0 can subsequently be estimated from r0. There are several possible problems with this approach: firstly, stochastic fluctuations in the early stages of the epidemic can obscure the measure of r0 (see Heffernan & Wahl in press); secondly, reporting inaccuracies are very likely to bias the incidence data. Finally, even when r0 can be measured with some confidence, the relationship between R0 and r0 is highly model dependent.

In the simplest possible models, when infectivity is constant throughout the infectious period, R0 can be estimated as 1+r0L, where L is the expected duration of the infectious period. (The ‘one’ is necessary in this expression because R0 reflects the total number of new infections, whereas the overall growth rate r0 includes the death of the founding individual.) For more complex models, the relation between r0 and R0 can be derived by expressing both in terms of the model parameters, exploiting the fact that the dominant eigenvalue (the eigenvalue with the largest real part) of the Jacobian, evaluated at the disease-free equilibrium, gives r0. (This is apparent from the definition of r0.) We also note that r0 itself can be used as a threshold parameter, since R0<1 implies r0<0. Thus, the condition r0<0 is actually equivalent to the ‘Jacobian’ method described in §3.1.

As an example, consider Nowak et al. (1997) and Lloyd (2001a), who studied the within-host dynamics of viral disease. From standard models of viral dynamics, they find that the relationship between R0 and r0 is

$$R_0 = \left(1 + \frac{r_0}{a}\right)\left(1 + \frac{r_0}{u}\right), \tag{4.1}$$

where a is the death rate of the infected cells and u is the clearance rate of the virions. If $u \gg r_0$ then the relation approaches

$$R_0 = 1 + \frac{r_0}{a}. \tag{4.2}$$

Since 1/a is the expected lifetime of an infected cell, this expression is consistent with our previous approximation of R0.
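The relations between r0 and R0 can be explored numerically. The sketch assumes the standard target-cell model of viral dynamics, whose linearization about the uninfected state has characteristic equation r² + (a+u)r + au(1−R0) = 0; that equation, the function names and the parameter values are assumptions made for illustration.

```python
import math

def r0_constant_infectivity(growth, duration):
    """R0 = 1 + r0 * L for constant infectivity over a period of mean length L."""
    return 1.0 + growth * duration

def r0_viral_dynamics(growth, a, u):
    """R0 from the growth rate, infected-cell death rate a and virion clearance u."""
    return (1.0 + growth / a) * (1.0 + growth / u)

def growth_rate_viral_dynamics(r0, a, u):
    """Dominant root of r^2 + (a + u) r + a u (1 - R0) = 0 (assumed linearization)."""
    disc = (a + u) ** 2 - 4.0 * a * u * (1.0 - r0)
    return 0.5 * (-(a + u) + math.sqrt(disc))

a, u = 0.5, 3.0                                   # hypothetical rates per day
growth = growth_rate_viral_dynamics(4.0, a, u)    # growth rate when R0 = 4
print(growth, r0_viral_dynamics(growth, a, u))

# With very fast virion clearance the estimate approaches 1 + r0 / a.
print(r0_viral_dynamics(0.5, a, 1e6), r0_constant_infectivity(0.5, 1.0 / a))
```

The round trip (R0 to growth rate and back) is a consistency check on the algebra only; with real viral load data the growth rate would be fitted to the early exponential phase.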

This method proves useful since r0 can be readily estimated from viral load data, for in-host models, or from incidence data in epidemiology. A number of recent studies have used this approach, including Pybus et al. (2001) and Lipsitch et al. (2003).

5. Recent use of R0 in the epidemiology of microparasites

5.1 SARS and influenza

5.1.1 SARS

The emergence of SARS underscored the need for careful epidemiological modelling, in order to better understand and contain such novel pathogens. A number of models were developed to study SARS and to compute R0 for outbreaks in Hong Kong, Singapore and Canada.

Lipsitch et al. (2003) estimated R0 for the outbreaks in Canada and Singapore, including the effects of super-spreaders (infected individuals who directly infect a large number of people). The exponential growth rate of the cumulative number of cases in the epidemic was taken as an estimate for r0. R0 was then estimated by computing the largest eigenvalue of a linearized SEIR model (assuming no depletion of susceptibles), and expressing this spectral radius as a function of R0, the ratio of the infectious period to the serial interval, f, and the length of the serial interval, L. This technique yielded the following equation for R0:

$$R_0 = 1 + r_0 L + f(1-f)(r_0 L)^2.$$

The resulting estimates of R0 were approximately 2.2–3.6 for serial intervals of 8–12 days. The serial intervals were estimated from the data, but at the time were not well defined for SARS. A strength of this approach is that the various parameters of the SEIR model ‘collapse’, such that epidemiological estimates of only three parameters are necessary: r0, f and L. Although the usual problems of underreporting before an epidemic, overreporting during an epidemic and stochasticity are unavoidable in estimates of r0, Lipsitch et al. conducted thorough sensitivity analyses, concluding that R0 will still have a relatively low value. This suggests that the spread of SARS can be contained when proper control protocols are put into place.
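A sketch of how such an estimate might be computed, assuming the relation R0 = 1 + r0L + f(1−f)(r0L)² and a linearized SEIR model without deaths in which the mean latent period is (1−f)L and the mean infectious period is fL. The doubling time, the other parameter values and the function names are hypothetical and are not taken from Lipsitch et al.

```python
import math

def r0_from_growth(growth, serial_interval, f):
    """R0 = 1 + r0 L + f (1 - f) (r0 L)^2."""
    x = growth * serial_interval
    return 1.0 + x + f * (1.0 - f) * x * x

def seir_growth_rate(r0, serial_interval, f):
    """Dominant eigenvalue of the assumed linearized SEIR model (E, I classes)."""
    k = 1.0 / ((1.0 - f) * serial_interval)   # rate of leaving the latent class
    gamma = 1.0 / (f * serial_interval)       # rate of leaving the infectious class
    disc = (k + gamma) ** 2 - 4.0 * k * gamma * (1.0 - r0)
    return 0.5 * (-(k + gamma) + math.sqrt(disc))

doubling_time = 6.0                           # hypothetical, in days
growth = math.log(2.0) / doubling_time        # exponential growth rate r0
print(r0_from_growth(growth, serial_interval=10.0, f=0.5))

# Consistency check: the growth rate implied by R0 = 3 maps back to R0 = 3.
round_trip = r0_from_growth(seir_growth_rate(3.0, 8.4, 0.7), 8.4, 0.7)
print(round_trip)
```

Because f(1−f) is at most one quarter, the estimate is more sensitive to the product r0L than to f, which is consistent with the emphasis placed on the serial interval.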

Lipsitch et al. then extended the SEIR model to explore the effects of isolation of symptomatic cases and quarantine of asymptomatic contacts on the spread of the disease. They found that to reduce R0 from approximately 3 to 1, isolation and quarantine must reduce total infectiousness by at least two-thirds. Further analysis of these control policies enabled Lipsitch et al. to conclude that quarantine would impose a large burden on the population if SARS was allowed to spread over a long period with an R0>1 in a susceptible population. Individuals could be quarantined multiple times over the course of the infection or for very long periods of time. These conclusions offer useful guidance for public health initiatives, but as several parameters of this model are unknown, Lipsitch et al. were unable to give concrete estimates for the levels of quarantine and isolation necessary to decrease the value of R0 below one.
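The two-thirds figure follows from a one-line calculation if we assume, as a simplification, that control measures scale R0 by a factor (1−q), where q (a symbol introduced here) is the fractional reduction in total infectiousness.

```latex
\begin{align*}
(1 - q)\,R_0 &\le 1 \\
q &\ge 1 - \frac{1}{R_0} \\
R_0 = 3 &\;\Rightarrow\; q \ge 1 - \tfrac{1}{3} = \tfrac{2}{3}
\end{align*}
```

The same expression shows why higher estimates of R0 demand disproportionately more effective control: at R0=2 a reduction of one half would suffice under this assumption.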

Chowell et al. (2003) developed a system of ODEs to describe the spread of SARS in the three geographical populations mentioned above. Their model includes two classes of susceptibility, low risk and high risk, and also includes two types of infected individuals, symptomatic and asymptomatic, which differ in their rate of diagnosis and mode of transmission. The main goal of this study was not to determine R0, but to estimate the diagnostic rate and isolation effectiveness for the three separate regions, with an emphasis on the Toronto outbreak. These two parameter values were estimated by first determining the exponential growth rates from SARS incidence data in all three regions and fitting the model to the data assuming that all of the other model parameters were roughly constant between regions. In a brief section the parameter estimates were used to calculate R0 using the next generation approach. R0 was 1.2 for Hong Kong, approximately 1.2 for Toronto and 1.1 for Singapore. A weakness of this model is that R0 depends on estimating many (approximately 10) model parameters. These estimates of R0 are comparable to those estimated by Lipsitch et al. (2003) when the latter group assumed the serial interval to be small, around 4 days. However, the serial interval in this study was taken to be between 7 and 10 days. This disparity was not discussed in detail.

Using the same model, Chowell et al. (2004a) conducted sensitivity analyses for R0, quantifying the effects of changes in the model parameters. They found that the transmission rate and the relative infectiousness after isolation have the largest effect on R0. They also found that it is unlikely that the implementation of a single control measure will reduce R0 below one. The practical conclusion of this work is that control measures that affect the diagnostic rate, relative infectiousness after isolation and the per capita transmission rate should be implemented.

In another study (Riley et al. 2003), R0 was determined by fitting a stochastic mathematical model to incidence data for SARS. Riley et al. developed a stochastic, compartmental metapopulation model capturing both spatial variability and the growth dynamics at the early stages of the epidemic. Using data from the Hong Kong epidemic, Riley et al. determined probability distributions for transitions between the model compartments of susceptible, latent, infectious, hospitalized, recovered and deceased individuals. R0 was calculated using multiple realizations of the model to be approximately 3.

Riley et al. also found that the SARS control measures were effective and, most importantly, concluded that the Hong Kong epidemic was under control by early April. This conclusion was made by determining R0 when control measures were implemented. An advantage of this approach is that multiple realizations of the model can generate predicted case incidence time-series, quantifying any reduction in the transmission rates after control measures are in place. However, this complex model relies heavily on the quality of the data. Another drawback of this model is that the effects of superspreaders were not included.

Lloyd-Smith et al. (2003) developed a stochastic model of a SARS outbreak in a community and its hospital. The goal of this model was to evaluate contact precautions, quarantine and isolation as containment procedures while assuming a particular value of R0. Using a value of R0≈3 for the Hong Kong and Singapore outbreaks they found that isolation alone could control the spread of SARS if it met very stringent requirements. However, they concluded that the control measures that were most successful were limiting contact between people in hospitals and decreasing the number of contacts between people inside and outside of the hospital.

Summarizing the results above, we can conclude that the estimated value of R0 for SARS is relatively low, suggesting that the epidemic can be controlled. We can also conclude that the control policies studied are most effective when used in combination. These conclusions are reassuring and give direction to public health initiatives. These results should be viewed with some caution, however, as the data used in these studies are limited, the models are complex, and aspects of the virulence and persistence of SARS that might affect public health initiatives have not yet been addressed.

5.1.2 1918 Pandemic influenza.

Mills et al. (2004) used mortality data to estimate R0 for the 1918 influenza pandemic in 45 cities in the USA. Interestingly, this approach relied on none of the mathematical techniques described in previous sections; instead, the number of susceptibles, incident infections and infectious hosts were estimated using a discrete time simulation. Using a case fatality proportion of 2%, the total number of deaths was estimated and this was compared with ‘excess’ mortality data, that is, the number of deaths in 1918 above the median for 1910–1916. A value of R0 was determined which minimized the sum of squared differences between the simulated and observed data. The median estimate for R0 was 2.9.
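The general idea of fitting R0 by simulation and least squares can be sketched as follows. This is not the simulation used by Mills et al.: it assumes a generic discrete-time epidemic with one time step per generation of infection, homogeneous mixing, no delay between infection and death, and a synthetic 'observed' series. All function names and parameter values are hypothetical.

```python
def simulate_deaths(r0, population, initial_infected, steps, cfp=0.02):
    """Discrete-time epidemic with one time step per generation of infection.
    Each step, the expected number of new infections is
    r0 * infectious * (susceptible / population), capped at the number of
    susceptibles left. Returns expected deaths per step, taken as new
    infections times the case fatality proportion (cfp)."""
    susceptible = population - initial_infected
    infectious = initial_infected
    deaths = []
    for _ in range(steps):
        new_infections = min(susceptible,
                             r0 * infectious * susceptible / population)
        susceptible -= new_infections
        infectious = new_infections  # the previous generation is removed
        deaths.append(cfp * new_infections)
    return deaths


def fit_r0(observed, population, initial_infected, candidates):
    """Grid search for the r0 minimizing the sum of squared differences
    between simulated and observed deaths."""
    def sse(r0):
        simulated = simulate_deaths(r0, population, initial_infected,
                                    len(observed))
        return sum((s - o) ** 2 for s, o in zip(simulated, observed))
    return min(candidates, key=sse)


# Synthetic "observed" series generated with r0 = 2.9, used only to check
# that the fit recovers the value it was generated from.
observed = simulate_deaths(2.9, 100_000, 10, 8)
grid = [1.0 + 0.1 * k for k in range(41)]  # 1.0, 1.1, ..., 5.0
best = fit_r0(observed, 100_000, 10, grid)
```

A grid search is used here only because it keeps every assumption visible; any standard optimizer could replace it.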

It is interesting to note that in this study, one of the most careful and recent investigations of R0 in the literature, the authors relied on a very simple simulation and least-squares fitting, rather than any more sophisticated mathematical approaches. The advantage of the simulation is that the many assumptions which must be made are explicit, and their effects can be examined individually, as these authors have done in extensive supplementary material. In all cases, the sensitivity analyses predicted that the overall conclusion of the work—that R0 was approximately 3–4—was robust.

As noted by the authors, various possible sources of downward bias, including heterogeneous mixing, intervention measures, and the depletion of susceptibles, are ignored in this approach. To correct for this, for each city, the two weeks in which the growth rate of mortality data was highest were also fit separately; this increased the median estimate of R0 to 3.9. It seems likely, however, that any heterogeneous mixing and intervention measures were in place during these two weeks of rapid epidemic growth as well, since these weeks were not always the first weeks of the epidemic. Thus, this ‘extreme’ estimate of R0 is only the most extreme value that can be observed from the data, under the same assumptions regarding lack of control measures and homogeneous mixing. The extent to which any control measures were in place, and their mitigation of R0, were not addressed.

The aim of the study was to evaluate the risk of an impending pandemic from a novel strain of influenza. The results suggest that control of such a pandemic will be possible, given the ‘modest’ reproductive number of the 1918 strain. From a statistical point of view, however, R0 for the 1918 pandemic was a single observation of an extreme value, and it is very difficult to predict the magnitude of a single future extreme value drawn from the same distribution. Thus, the conclusions only hold under the assumption that a future influenza strain will be ‘similarly’ infectious. Nonetheless, it is important to have demonstrated that even for the worst influenza pandemic in recent history, R0 was probably not large relative to other diseases.

5.1.3 Avian influenza.

Stegeman et al. (2004) quantified the between-flock transmission characteristics of high-pathogenicity avian influenza virus during the 2003 epidemic in the Netherlands, which led to the culling of 30 million birds. R0 was calculated as the product of the infectious period at flock level and the transmission rate at flock level; however, neither parameter was measured directly. Instead, the infectious period was estimated as the period between the moment of detection and the moment of culling, plus 4 days. The transmission probability of the stochastic SEIR model was estimated by means of a generalized linear model. An estimate of the variance of R0 was used to calculate the confidence interval for the period of infection and the transmission probability. A variety of potential control measures were evaluated.

This study estimated that R0 reached as high as 6.5 in some regions and decreased to 1.2 after the outbreak. Although R0 still exceeded one, between-flock transmission nevertheless decreased significantly after the outbreak. This discrepancy between the calculated value of R0 and the ultimate course of the epidemic suggested that control measures designed to reduce the transmission rate were inadequate. It was instead hypothesized that containment of the epidemic was probably owing to the reduction in the number of susceptible flocks caused by culling rather than the reduction of the transmission rate by other control measures. From these observations, it was suggested that effective control in the future could be achieved only by depopulation of the whole affected area.

5.2 Livestock disease

5.2.1 Bovine spongiform encephalopathy (BSE).

Bovine spongiform encephalopathy affects populations of cattle and other livestock and may pose a threat to human health. A number of models of BSE have been analysed; these models include key transmission routes and evaluate the efficacy of various control policies.

Ferguson et al. (1999) developed a model to describe the spread of BSE. The goal of this paper was to demonstrate how different assumptions regarding the infectivity of BSE affect R0. Two models of infectivity that represent epidemiological extremes were considered: the first assumes that infectivity rises exponentially with a growth coefficient of two per year throughout the incubation period of BSE; the second assumes that infectivity is constant during this time. Using the next generation approach, Ferguson et al. estimated that R0≈10–12 for the first case and that R0≈2–2.5 for the second. These values were determined using a back calculation model (see Gail & Rosenberg 1992) to estimate the force of infection of BSE in Great Britain between 1980 and 1996. The transmission coefficient of BSE was estimated using a model for infectivity as a function of incubation stage.

Ferguson et al. also determined the effect that the 1988 ban on MBM (recycling of animals into ruminant-based meat and bone meal) had on R0. They found that, for both cases of infectivity, R0 was reduced to a value of approximately 0.15. This result has important implications as it shows that the spread of BSE can be controlled for the extreme cases of infectivity, implying that this will be true for all intermediate models. These estimates of R0 also suggest that BSE will not become endemic in the UK. A drawback of this model is that it assumes that underreporting of BSE cases does not exist after 1988. This assumption can result in a lower value of R0. Also, the effects of clustering were not modelled; instead, homogeneous mixing was assumed. However, Ferguson et al. concluded that this would have only a minor effect on the conclusions of the study.

In a more recent study by de Koeijer et al. (2004), R0 was calculated for BSE assuming five different transmission routes: horizontal, vertical, diagonal (the disease can be spread to other animals close by during a birth), feed-based transmission and infectious material in the environment (use of MBM as fertilizer). Separating the infected population into two classes of infected individuals, those that are infected from birth and those that become infected by all other routes, de Koeijer et al. determined the expected number of new infections during the whole infectious period for both classes. These expressions were then used to formulate the next generation matrix to determine R0. Using parameter estimates from BSE data from the United Kingdom and the Netherlands, values for R0 were determined for separate outbreaks in 1986, 1991, 1995 and 1998. The estimated values of R0 were approximately 14 and 0.7 in 1986 for the United Kingdom and the Netherlands, respectively, whereas R0 values were far less than unity in later years when control measures were in effect.

This study also attempted to quantify the impact of the control policies in use. They found that there are three major control measures: a feed ban on MBM to cattle, optimization of the rendering process (how cattle feed is made, temperature, etc.) and removing and incinerating any materials that increase the risk of contracting BSE. They also found that, in order to reduce R0 to a value less than unity, at least two of the three control measures should be applied. However, the authors stated that even when all three control measures are in place, infection routes other than via feed will remain difficult to control, and therefore, R0 cannot be reduced to zero. This is not a serious concern, as they find that R0 is only 0.06 when transmission via feed has been eliminated. In this study, then, the primary use of R0 was as a measure of the efficacy of control measures, with the goal of predicting control measures that reduce R0 to below unity. A drawback of this model is that calculating R0 relied on estimating many model parameters using BSE data and procedures that have high uncertainty. This resulted in a very wide confidence interval around R0. The effects of clustering were also ignored.

5.2.2 Scrapie.

Matthews et al. (1999) developed a model of scrapie transmission within a single flock of sheep. The model includes both horizontal and vertical transmission, as well as genetic variation in susceptibility. R0 was calculated through the next generation operator.

Using parameters for a single, well-studied flock of Cheviot sheep, an estimate of 3.9 was obtained for R0 in a natural outbreak of scrapie between 1970 and 1982. We note, however, that the detailed parameters needed for this estimate, including the initial frequencies of the susceptible and resistant alleles, are not likely to be routinely available.

The real importance of this study, however, is in the accompanying sensitivity analyses. R0 is found to vary little with the vertical transmission rate, but is sensitive to the horizontal transmission rate. Thus, measures reducing the latter are recommended. Similarly, slaughter of preclinically infected animals is able to reduce R0 by over 90%. This paper thus encourages using early diagnostic tests as effective control measures. Finally, this model allows genetic control measures to be evaluated, and predicts that inbreeding may increase R0 if the susceptibility allele is recessive. Although the precise value of R0 may be impossible to determine in a given flock, this study demonstrates the use of R0 as an important predictor of the efficacy of control measures.

In a more recent study by Gravenor et al. (2004), the estimated flock-to-flock value of R0 for scrapie in Cyprus was between 1.4 and 1.8. This model uses a four-compartment ODE system, and evaluates R0 using the survival function. The model is then fitted to weekly incidence data to estimate three unknown parameters.

This study also investigates the impact of interventions, estimating both the epidemiological impact and the cost of each intervention. The usefulness of each control measure, however, is gauged not by changes in R0, but by estimating the total number of farms affected by the epidemic. The estimate of R0 in this paper is thus somewhat peripheral to the main conclusions of the work.

5.2.3 Foot and mouth disease.

Determining the magnitude of R0 for FMD has also proved important, guiding policies for culling and vaccination, the two major control measures implemented for FMD.

Ferguson et al. (2001) determined R0 for FMD by considering contact tracing data and the number of susceptibles at equilibrium. They found that R0≈4.5 and that it was reduced to approximately 1.6 when control measures were implemented. Also, by developing a model of differential equations to describe FMD dynamics and fitting this model to R0 values over time, they were able to conclude that slaughtering on all farms within 24 h of case reporting (without necessarily waiting for laboratory confirmation) can significantly slow the epidemic. However, they found that even these improvements in slaughter times did not reduce R0 below one. They concluded that it is necessary to consider other interventions, especially those capable of rapidly controlling infections established in multiple regions.

Ring culling and vaccination were also explored using the model. Ferguson et al. concluded that both are highly effective strategies if implemented rigorously, but that this may be very costly. The high initial value of R0 estimated in this study confirmed that FMD is highly transmissible, and estimates of R0 were essential in determining which control measures might be effective against this pathogen.

Matthews et al. (2003) extended previous models of FMD by defining an optimal control policy. This policy included removing newly discovered infected holdings and the pre-emptive removal of holdings deemed to be at enhanced risk of infection. Matthews et al. employed a simple SIR model to determine the magnitude of the effect of different control policies on a chosen value of R0. They found, not surprisingly, that the level of control required to minimize the number of animals removed increases with R0. They also found that non-zero levels of control can optimize the outcome of the epidemic even when R0<1. In this case, the impact of the control measure was assessed using the fraction of animals removed.

Extending their model to a metapopulation, Matthews et al. concluded that a greater level of control is needed in this case, but most importantly, they found that to minimize losses to livestock populations, R0 should be only sufficiently reduced; there is a tradeoff between the amount by which R0 can be reduced and the fraction of animals removed. The key points which emerge are that total losses are not highly sensitive to small variations in the control effort around the optimal values, and that losses increase only gradually as control effort increases beyond the optimal value. They concluded that some leeway is acceptable in practice, but that over-control is generally safer than under-control when trying to avoid large losses to the population. Similar arguments were also applied for variation in R0; that is, over-control should be implemented if there is any uncertainty or variability in the value of R0.

5.3 Vector-borne disease

5.3.1 Dengue.

Luz et al. (2003) used R0 to evaluate the risk of dengue fever outbreaks in Rio de Janeiro, and to assess possible control measures. R0 was calculated from the survival function, assuming two spatial compartments with high and low vector density, respectively. Latin hypercube sampling of probability density functions was used to explore the effects of uncertain parameter values.
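Latin hypercube sampling of uncertain parameters can be sketched as below. As a stand-in for the two-compartment model of Luz et al., which is not reproduced here, the sketch uses the classical Ross–Macdonald expression for R0 (infected humans per infected human). The parameter ranges are hypothetical and chosen only for illustration, and uniform distributions are assumed for simplicity.

```python
import math
import random


def latin_hypercube(ranges, n_samples, rng):
    """Latin hypercube sample from independent uniform ranges.
    Each range is cut into n_samples equal-width strata; every stratum is
    sampled exactly once per parameter, and strata are paired at random
    across parameters."""
    columns = {}
    for name, (low, high) in ranges.items():
        width = (high - low) / n_samples
        points = [low + (k + rng.random()) * width for k in range(n_samples)]
        rng.shuffle(points)
        columns[name] = points
    return [{name: columns[name][i] for name in ranges}
            for i in range(n_samples)]


def ross_macdonald_r0(p):
    """Classical Ross-Macdonald R0 (infected humans per infected human).
    m: mosquitoes per human, a: biting rate, b, c: transmission
    probabilities, mu: mosquito mortality rate, n: incubation period in
    the mosquito, r: human recovery rate."""
    return (p["m"] * p["a"] ** 2 * p["b"] * p["c"]
            * math.exp(-p["mu"] * p["n"]) / (p["r"] * p["mu"]))


# Hypothetical ranges (rates per day, periods in days), illustration only.
ranges = {
    "m": (1.0, 5.0), "a": (0.2, 0.6), "b": (0.3, 0.7), "c": (0.3, 0.7),
    "mu": (0.05, 0.2), "n": (7.0, 14.0), "r": (0.1, 0.25),
}
rng = random.Random(0)  # fixed seed so the sample is reproducible
samples = latin_hypercube(ranges, 200, rng)
r0_values = sorted(ross_macdonald_r0(p) for p in samples)
median_r0 = r0_values[len(r0_values) // 2]
```

Comparing how strongly the sampled R0 values vary with each parameter, for example through rank correlations, then indicates which field estimates matter most.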

The goal of this paper was not so much to calculate an accurate value of R0, but to assess which of the many unknown parameter values are most important to the model. Luz et al. concluded that field estimates of mosquito mortality and the incubation period of dengue in mosquitoes are of critical importance. We note that although dengue is a vector-borne disease and multiple classes of infectives are defined in the model, the definition of R0 used here is the number of infected humans per infected human, not the square root of this value as would be obtained by the next generation operator.

5.3.2 Malaria.

Although quantifying the incidence and spread of malaria has an extremely rich history (Garrett-Jones 1964), work in characterizing R0 for malaria is ongoing. A recent paper investigates the incidence of malaria on an island in the Gulf of Guinea with a population of 6000 (Hagmann et al. 2003).

The paper estimates the ‘vectorial capacity’ of malaria (Garrett-Jones 1964), that is, the rate at which future human infections arise from a currently infective human host. This capacity C is estimated using maximum likelihood fits to observed age-prevalence data, and R0 is predicted as a function of C. We note once again that the value of R0 thus obtained corresponds to the definition provided in §2.1, not that of §2.2. The paper also reports detailed incidence data, stratified by age, sex, residence of patient and grade of malarial infection. Finally, a detailed survey on the use of mosquito nets, dwelling types, etc., was conducted; fully 17% of the population participated in the survey.

The low value of R0 obtained in this study (1.6) was used to justify the overall conclusion of the work that malaria can probably be eliminated from the island through simple control measures. However, calculating R0 was otherwise incidental; arguably the most important findings in this study were obtained through the detailed surveying and reporting of incidence and demographic data.

5.3.3 West Nile virus.

Wonham et al. (2004) derived a system of ODEs to describe the behaviour of West Nile virus. Their model consisted of susceptible, infectious, recovered and dead birds, and larval, susceptible, exposed and infectious mosquitoes. The next generation method was used to calculate R0 from this model in order to evaluate the ability of the virus to invade the system. The calculated value of R0 was then interpreted biologically as the square root of the product of (i) the disease R0 from mosquitoes to birds and (ii) the R0 from birds to mosquitoes. Each of these R0 values was further analysed as a product of disease transmission and infectious lifespan in case (i) and the product of the transmission probability, the number of initially susceptible mosquitoes per bird that survive the exposed period and the bird's infectious lifespan in case (ii). R0 was then used to establish a threshold mosquito level, above which the virus will invade a constant population of susceptible mosquitoes.
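The square-root interpretation can be illustrated with a two-type next generation matrix in which birds are infected only by mosquitoes and mosquitoes only by birds. The sketch below is a simplified illustration, not the model of Wonham et al.: the numerical values are hypothetical, and the threshold function assumes that the bird-to-mosquito term scales linearly with the number of mosquitoes per bird.

```python
import math


def spectral_radius_2x2(k):
    """Largest absolute eigenvalue of a non-negative 2x2 matrix, from the
    characteristic polynomial (the discriminant is non-negative whenever
    the off-diagonal entries are non-negative)."""
    (a, b), (c, d) = k
    trace, det = a + d, a * d - b * c
    disc = math.sqrt(trace * trace - 4.0 * det)
    return max(abs((trace + disc) / 2.0), abs((trace - disc) / 2.0))


def vector_host_r0(r_mosquito_to_bird, r_bird_to_mosquito):
    """R0 from the next generation matrix of a two-type vector-host cycle.
    Row/column 0 is birds, row/column 1 is mosquitoes."""
    k = [[0.0, r_mosquito_to_bird],
         [r_bird_to_mosquito, 0.0]]
    return spectral_radius_2x2(k)


def threshold_mosquitoes_per_bird(r_mosquito_to_bird, r_bm_per_unit_ratio):
    """Mosquito-to-bird ratio at which R0 = 1, assuming the
    bird-to-mosquito term equals r_bm_per_unit_ratio times that ratio."""
    return 1.0 / (r_mosquito_to_bird * r_bm_per_unit_ratio)


# Hypothetical values, for illustration only.
r_mb, r_bm = 0.8, 5.0
r0 = vector_host_r0(r_mb, r_bm)  # square root of the product r_mb * r_bm
per_cycle = r_mb * r_bm          # bird-to-bird infections over a full cycle
```

Both `r0` and `per_cycle` cross one at the same parameter values, which is why either can serve as a threshold even though they differ numerically.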

The R0 value derived was then used to evaluate public health policy markers. Two such policies were evaluated: mosquito control and bird control. It was demonstrated that a small increase in mosquito mortality can lead to a disproportionately large increase in the outbreak threshold. More surprisingly, however, R0 was used to show that reducing crow densities would have the opposite effect and actually enhance disease transmission (unless extremely low densities limited mosquito biting rates). Thus, R0 was used to show that reducing the initial mosquito population below the calculated threshold would have prevented the West Nile outbreak for New York in 2000. Conversely, bird control would have had the opposite effect.

6. Discussion

Our review of the practical use of R0 has focused largely on literature from a 2-year period, 2003 and 2004. The number of papers included here (and our review was by no means exhaustive) testifies to the current relevance of this important concept.

The methods used to calculate R0 from incidence data vary. Model fitting using standard optimization techniques is often used to estimate parameters, which are then used to determine R0 by either the survival function or next generation methods. Estimating the initial growth rate, r0, has also been widely used. For multiple classes of infectives (e.g. vector-borne disease), we find examples both where R0 is defined per generation, and examples where it is defined per infection cycle (see §2.2). Owing to the usual limitations in using real data, we note that models typically used ‘in the field’ are simple, deterministic and non-structured (but see Ferguson et al. 1999; Lloyd-Smith et al. 2003; Matthews et al. 2003; and Riley et al. 2003 for counter examples).

The basic reproductive ratio for emerging or endemic pathogens described above has been estimated for two main purposes. First, R0 is estimated in order to gauge the relative risk associated with a pathogen. These estimates are then used to compare the transmissibility of the disease to other well-known (and better understood) pathogens. Unfortunately, some time is needed to accrue sufficient incidence data for these estimates of R0, and typically, R0 is only quantified after the epidemic has run its course, or is at least well established. The degree to which R0 for one emerging infectious agent might be predictive of future novel pathogens is questionable (Mills et al. 2004). Furthermore, a numerical estimate of R0 for a specific disease does not, in and of itself, inform public health measures. These values are instead used to justify severe or costly control measures (e.g. FMD; Ferguson et al. 2001; Matthews et al. 2003), or less severe, more sustainable measures (e.g. malaria on Principe; Hagmann et al. 2003).

Evaluating these control measures reveals the second, and more important, use of R0 in the recent literature. In most of the studies outlined above, R0 is evaluated both before and after a putative control measure is applied, with the aim of determining which measures, at what magnitudes and in what combinations, are able to reduce R0 to a value less than one. The results of these efforts have clearly offered useful practical guidelines: in some cases the results are counter-intuitive (e.g. West Nile virus: Wonham et al. 2004), in many cases they are sobering.

Although R0 offers a simple, universal measure of control efficacy, it is important to note that using R0 for this purpose ignores other important issues, such as the timing of secondary infections, or the negative impact of control measures on the population. For example, it is possible that some patterns of quarantine may be roughly equivalent in their effect on R0, but may have different effects on the growth rate of the epidemic. Matthews et al. (2003) discuss the trade-off between reducing R0 and culling as few animals as possible; Lipsitch et al. (2003) discuss similar trade-offs between reducing R0 and burdening the population with excessive quarantine. These studies suggest that R0 may not always be the best overall measure of control efficacy. In contrast, the total mortality or morbidity, the total number of affected farms and other such measures may offer more practical indicators of control success, and can be balanced against the associated costs (e.g. Gravenor et al. 2004). We argue that R0 may be somewhat overused in evaluating control measures, presumably because it is more readily calculated than these alternative indicators, and is widely recognized and understood.

For host–pathogen interactions, R0 stresses the role of the pathogen. An alternative, more host-centred characterization has been suggested by Bowers (2001). Nicknamed the basic depression ratio, D0 measures the degree to which the infected host population is depressed below its uninfected equilibrium level. Consideration of both R0 and D0 allows modelling of the complex trade-offs in the evolution of host–pathogen interactions.

When control is targeted at specific subgroups, R0 is not a good indicator of the required control effort, and the type-reproduction number, T, has been suggested instead (Roberts & Heesterbeek 2003; Heesterbeek & Roberts in press). This quantity is equivalent to R0 in homogeneous populations, but in heterogeneous populations it singles out the control effort required to achieve eradication when control is targeted towards a particular host type (or subset of types), rather than the population as a whole, assuming the other types cannot sustain an epidemic by themselves. In many cases, T is easier to formulate than R0 and both share the same threshold behaviour.
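For two host types, the type-reproduction number for type 1 can be sketched as follows. The closed form used here, T = k11 + k12·k21/(1 − k22), is our reading of the definition of Roberts & Heesterbeek (2003) for a 2×2 next generation matrix and should be checked against that paper; the matrix entries are hypothetical.

```python
def type_reproduction_number(k):
    """Type-reproduction number for type 1 in a two-type next generation
    matrix k, where k[i][j] is the expected number of type-(i+1)
    infections caused by one type-(j+1) individual. Infections of type 1
    are counted whether they arise directly or through chains of type-2
    cases, which requires k[1][1] < 1 (type 2 cannot sustain an epidemic
    by itself)."""
    (k11, k12), (k21, k22) = k
    if k22 >= 1.0:
        raise ValueError("type 2 can sustain an epidemic by itself")
    return k11 + k12 * k21 / (1.0 - k22)


# Hypothetical vector-host matrix: type 1 = hosts, type 2 = vectors.
k = [[0.0, 0.8],
     [5.0, 0.0]]
t_host = type_reproduction_number(k)   # product of the two off-diagonal terms
control_fraction = 1.0 - 1.0 / t_host  # effort needed if targeted at hosts only
```

In this example T equals the per-cycle host-to-host value, that is, the square of the R0 obtained from the next generation matrix, and both exceed one together.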


This work followed from a mini-symposium on the same topic at the Society for Mathematical Biology Annual Meeting, Ann Arbor, Michigan, 2004. We are indebted to Hans Heesterbeek and an anonymous referee for a number of insightful suggestions; we also thank Sally Blower, Erin Bodine, Romulus Breban, Elissa Schwartz, Raffaello Vardavas and David Wilson for valuable discussions. We are also grateful to the Natural Sciences and Engineering Research Council of Canada and to the Ontario Ministry of Science, Technology and Industry for their support.


  • Note that this derivation of R0,E differs from that in Blower et al. (1998) due to a missing σ in eqn. (7), p. 678 of that manuscript.

    • Received November 3, 2004.
    • Accepted March 29, 2005.

