## Abstract

Understanding the influence of non-susceptible hosts on vector-borne disease transmission is an important epidemiological problem. However, investigation of their impact can be complicated by uncertainty in the location of the hosts. Estimating the risk of transmission of African horse sickness (AHS), a viral disease transmitted by *Culicoides* biting midges, in Great Britain (GB) provides an insightful example because: (i) the patterns of risk are expected to be influenced by the presence of non-susceptible vertebrate hosts (cattle and sheep) and (ii) only incomplete information on the spatial distribution of horses is available because the GB National Equine Database records owner, rather than horse, locations. Here, we combine land-use data with available horse owner distributions and, using a Bayesian approach, infer a realistic distribution for the location of horses. We estimate the risk of an outbreak of AHS in GB, using the basic reproduction number (*R*_{0}), and demonstrate that mapping owner addresses as a proxy for horse location significantly underestimates the risk. We clarify the role of non-susceptible vertebrate hosts by showing that the risk of disease in the presence of many hosts (susceptible and non-susceptible) can be ultimately reduced to two fundamental factors: first, the abundance of vectors and how this depends on host density, and, second, the differential feeding preference of vectors among animal species.

## 1. Introduction

A large body of ecological and epidemiological studies has highlighted the profound effects of spatial distributions of living organisms on population and disease dynamics (see [1], and references therein). This issue has also raised considerable interest outside the scientific community; inaccurate knowledge of spatial host distribution is regarded as a central problem for health authorities, especially in the presence of a sudden outbreak of disease when control measures need to be quickly implemented. As such information is often only partially available, developing mathematical tools that overcome the limited predictive capacity due to uncertainty in host distribution is a key scientific goal [2–4]. To this end, one study examined the specific case of foot-and-mouth disease spreading between farms where spatial clustering is ignored [1]. Tildesley *et al.* [1] showed that, if their model is carefully parametrized to match epidemic behaviour, then assuming that farms are randomly located within a region is sufficient to determine optimal control measures. The approach relies on an artificial parametrization that incorporates the complex effects of the spatial structure in the data [1] and was relevant for a disease where spatial ecological variability is not directly important; despite this, the approach is particularly appealing in the absence of precise demographic data. However, there are important examples where additional information, which can be used as a proxy for host locations, is available. Human travelling statistics have been assessed by analysing the circulation of bank notes in the USA [5]; spatio-temporal changes in population density have been measured by quantifying anthropogenic light from satellite imagery [6]; mobile phone technology has been used to collect data on social networks and behavioural data to explore the networks of transmission [7]. 
The spatial distribution of horses in Great Britain (GB) is another intriguing example, as the National Equine Database (NED) recorded only the address of the owners and not the actual location of the horses. Mapping owner addresses as a simple proxy for horse location is tempting but previous studies have shown that such an approach introduces an important source of error resulting in a spatial distribution biased towards large urban areas [8]. Errors in an unrealistic equine map could be further amplified in a disease risk map in the presence of strong environmental dependencies. This is the case, for instance, when the epidemiological parameters depend on the temperature, and therefore location, or when the presence of non-susceptible hosts for disease influences transmission.

Estimating the risk of African horse sickness (AHS), one of the most important equine diseases, in GB is an illuminating case that exemplifies these issues. AHS is a highly fatal viral disease (mortality can be as high as 90%) caused by the African horse sickness virus (AHSV), which is closely related to bluetongue virus (BTV). Similar to BTV, AHSV is transmitted and amplified by *Culicoides* biting midges. The outbreaks of bluetongue in 2006 [9,10] and the recent incursion of Schmallenberg virus [11,12] in ruminants are clear examples of incursions of novel species of arbovirus into Europe with important economic consequences [13–15]. Thus, the incursion and spread of BTV-8 in northern Europe has greatly increased the concerns of the GB equine industry, since AHSV could similarly be introduced, with potentially devastating economic and welfare consequences for the equine population and associated industries.

The risk of AHS spread depends on a range of both epidemiological and entomological factors, many with strong environmental dependencies. These include the temperature dependence of key virological parameters, the ecology of *Culicoides* species and how their abundance depends on host densities and the influence of non-susceptible hosts (e.g. ruminants, whose location in GB is known) kept in proximity to equine hosts; all these factors are spatially dependent. Thus, reliable data for the spatial distribution of horses in GB are of fundamental importance to assess the risk of the disease.

The role of non-susceptible hosts in mitigating or amplifying disease is poorly understood, although the idea of deploying a preferred host to protect man or animals from insect-borne disease has been suggested before [16]. However, the influence of non-susceptible hosts is complicated by two potential, but contrasting effects: a dilution effect, whereby *Culicoides* exhibit a feeding preference for a non-susceptible, non-equid host; and an amplification effect, whereby increased vertebrate host densities result in increased vector abundance. It is essential to disentangle these processes when assessing the risk of a disease.

Here, we address these issues by developing a credible distribution of horses in GB that can be used to re-assess, in the light of current knowledge, the risk of AHSV spread in GB and the efficacy of potential control measures. To this end, we combined NED and land-use data in a Bayesian framework, developing an algorithm to infer a realistic distribution for the location of horses. Using this inferred distribution of horses, we estimated the spatial and temporal variation in risk by computing the basic reproductive number *R*_{0} (the average number of secondary cases arising from the introduction of a single infected individual to an otherwise susceptible population) [17,18]. In particular, we explore the impact of non-susceptible vertebrate hosts on the risk of transmission and the efficacy of vaccination.

## 2. Material and methods

### 2.1. Developing a credible national distribution of horses in Great Britain

Previous studies [8,19,20] have shown that the distribution of NED-registered horse owners does not mirror the distribution of locations where the corresponding horses are kept. In particular, a survey of NED-registered owners (1009 samples) provided complete postcode records for owners and their corresponding horse locations [19], and revealed an inverse relationship between built-up land use and the proportion of horses kept at the same postcode as owners' addresses [8]. Data from the same survey also showed that the distribution of the horse-owner distances was well described by a power-law distribution, irrespective of the local values of built-up coverage (see §3.1).

Heavy-tailed distributions are compatible with the reasonable assumption that suitable horse premises in the neighbourhood of the owner's address are the most preferred locations, although cases of large owner–horse separations are not precluded. Based on this information, we combined available NED and land-use data [21] in a Bayesian approach to develop a plausible national distribution of horses in GB.

#### 2.1.1. Mechanistic model

A mechanistic model was formulated that provided the conditional probability of a horse being kept at position **r** = (*x*,*y*) when we know the owner's location **r**_{O}. The modelling approach combines: (i) the empirical, inverse relationship between built-up land use and the proportion of horses kept at the same postcode as the owners' addresses [8]; and (ii) a fat-tailed spatial kernel allowing a non-negligible probability of large owner–horse separations. Accordingly, the probability that a horse is kept at the location **r** = (*x*,*y*) when we know the owner's location **r**_{O} can be written as

$$P(\mathbf{r}\mid\mathbf{r}_{O}) = P_{O}(\mathbf{r}_{O})\,\delta(\mathbf{r}-\mathbf{r}_{O}) + \left[1-P_{O}(\mathbf{r}_{O})\right]P_{\text{out}}(\mathbf{r}\mid\mathbf{r}_{O}), \tag{2.1}$$

where *δ* is the Dirac delta function. The probability *P*_{O}(**r**) reflects the suitability of a location for keeping horses (e.g. rural area with the presence of stables) and is related solely to land use. In particular, *P*_{O}(**r**_{O}) can be interpreted as the probability that a horse is kept at the owner's location **r**_{O}. If the horse is not kept at the same location as the owner, *P*_{out}(**r**|**r**_{O}) gives the probability that the horse is located at position **r** = (*x*,*y*) at a Euclidean distance |**r** − **r**_{O}| from the owner's location. This is modelled as the joint probability

$$P_{\text{out}}(\mathbf{r}\mid\mathbf{r}_{O}) = A\,P_{O}(\mathbf{r})\,P_{K}(|\mathbf{r}-\mathbf{r}_{O}|), \tag{2.2}$$

where *P*_{K}(|**r** − **r**_{O}|) is the probability that the horse and owner locations are a given distance, |**r** − **r**_{O}|, apart. The constant *A* is fixed by the constraint

$$\int_{\mathcal{D}-\mathbf{r}_{O}} P_{\text{out}}(\mathbf{r}\mid\mathbf{r}_{O})\,\mathrm{d}\mathbf{r} = 1,$$

where *𝒟* − **r**_{O} represents the entire spatial domain (i.e. GB except the owner's location). This constraint ensures that each horse is associated with one and only one location (although it is possible that more than one horse may be kept at the same location, as occurs in stables).

Complete postcode records for 1009 owners and their horses were available from a survey of NED-registered owners [19]. These were used to estimate the distribution of the distances between the two. A visual inspection of this empirical distribution, which is related to *P*_{sep}, revealed a large scatter of the data. This suggests that, although 92 per cent of horses resided within 10 km of their owners [19], there is a non-negligible probability of large horse–owner separations comparable to length scales at country level (e.g. the owner might reside in southern England and the horse in Scotland). This behaviour can be captured by modelling the spatial kernel *P*_{K} as a fat-tailed distribution (equation (2.3)). Underlying this choice is the expectation that suitable horse locations in the neighbourhood of the owner's address are favoured.

Data on built-up coverage (the fraction of built-up surface) [21] were obtained for each of the 1009 postcode records for owners and their horses from the NED-registered owner survey. The probability, *P*_{O}, that a horse is kept at the owner's location, **r**_{O}, given the local value of built-up coverage *u*(**r**_{O}) was modelled as in equation (2.4), where *N*_{in} is a normalization constant to ensure normalization when integrated over the entire domain (i.e. GB). The parameter *α* represents the fraction of horses kept at the owner's location (according to the empirical NED-registered owner survey, *α* = 0.7).
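The structure of equation (2.1) can be illustrated on a discrete set of candidate locations. The following is a minimal sketch, not the authors' implementation: the kernel form in `kernel`, the function names, the example coordinates and the suitability values in `p_own` are all assumptions made for illustration, and distances are taken to be in the same (unspecified) units as `d0`.

```python
import numpy as np

def kernel(d, d0, sigma):
    """Hypothetical fat-tailed kernel (assumed form): ~ d**(-sigma) for d >> d0."""
    return 1.0 / (1.0 + (d / d0) ** sigma)

def horse_location_probs(owner_idx, coords, p_own, d0, sigma):
    """Discrete analogue of the mixture in equation (2.1).

    coords : (n, 2) array of candidate cell centres
    p_own  : (n,) array, suitability P_O evaluated at each cell
    Returns an (n,) array of probabilities that sums to 1.
    """
    d = np.linalg.norm(coords - coords[owner_idx], axis=1)
    w = p_own * kernel(d, d0, sigma)       # suitability times distance kernel
    w[owner_idx] = 0.0                     # the 'away' term excludes the owner's cell
    away = w / w.sum()                     # normalization plays the role of the constant A
    probs = (1.0 - p_own[owner_idx]) * away
    probs[owner_idx] = p_own[owner_idx]    # point mass at the owner's location
    return probs

# Illustrative values only (three cells, owner in cell 0).
coords = np.array([[0.0, 0.0], [1000.0, 0.0], [0.0, 5000.0]])
p_own = np.array([0.7, 0.2, 0.9])
probs = horse_location_probs(0, coords, p_own, d0=1071.0, sigma=2.82)

rng = np.random.default_rng(0)
horse_cells = rng.choice(len(coords), size=5, p=probs)  # simulated horse locations
```

In this sketch the point mass at the owner's cell replaces the Dirac delta term, and the remaining probability is shared among the other cells in proportion to suitability and distance kernel.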

#### 2.1.2. Parameter estimation

Parameters in the model were estimated using Bayesian inference. In this case, the likelihood for the model parameters (*d*_{0}, *σ*) is

$$L(d_{0},\sigma\mid D)=\prod_{i=1}^{N_{\text{obs}}}P\left(\mathbf{r}_{i}\mid\mathbf{r}_{O}^{(i)};d_{0},\sigma\right), \tag{2.5}$$

where *D* represents the set of observed data, *N*_{obs} is the number of records from the NED-registered owner survey, and **r**_{O}^{(*i*)} and **r**_{i} are the positions of the owner and horse, respectively, for the *i*th record. We assumed non-informative prior distributions for all parameters.

The parameters *d*_{0} and *σ* were estimated by using a Markov chain Monte Carlo (MCMC) approach to generate samples from the joint posterior distribution for the parameters (figure SB-1, electronic supplementary material). However, the parameter *λ* was fixed at its least-squares estimate (*λ* = 8.76) to reduce computational cost (figure SB-2, electronic supplementary material). One chain of 95 000 iterations was run using the MCMCpack package in R [22], with the first 5000 iterations discarded to allow for burn-in of the chain. Convergence of the chain was monitored visually and using standard diagnostics. Posterior estimates (mean and 95% credible interval (CI)) for the parameters are *d*_{0} = 1071 (95% CI 208–2772) and *σ* = 2.82 (95% CI 2.37–3.45).
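The sampling scheme described above (a single chain with an initial burn-in, summarized by a posterior mean and 95% credible interval) can be sketched with a random-walk Metropolis sampler. This is an illustration under stated assumptions rather than the MCMCpack routine used in the study: the function `metropolis`, the toy target `toy_log_post` and its numerical values are hypothetical stand-ins for the log-posterior built from equation (2.5), and the chain is shorter than the 95 000 iterations reported.

```python
import numpy as np

def metropolis(log_post, x0, step, n_iter, burn_in, seed=0):
    """Random-walk Metropolis sampler; returns the post-burn-in samples."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp = log_post(x)
    samples = []
    for i in range(n_iter):
        proposal = x + step * rng.standard_normal(x.shape)
        lp_prop = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.random()) < lp_prop - lp:
            x, lp = proposal, lp_prop
        if i >= burn_in:
            samples.append(x.copy())
    return np.array(samples)

def toy_log_post(theta):
    """Toy log-posterior (assumption): independent Gaussians for two parameters."""
    mean = np.array([1.0, 3.0])
    sd = np.array([0.5, 0.2])
    return -0.5 * np.sum(((theta - mean) / sd) ** 2)

chain = metropolis(toy_log_post, x0=[0.0, 0.0], step=np.array([0.5, 0.2]),
                   n_iter=20000, burn_in=5000)
post_mean = chain.mean(axis=0)
ci_low, ci_high = np.percentile(chain, [2.5, 97.5], axis=0)
```

Replacing `toy_log_post` with the sum of the log of equation (2.1) over the survey records, plus a log-prior, would give a sampler of the same kind as the one described in the text.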

#### 2.1.3. A map of the locations of horses in Great Britain

The NED provides the number of owners present at 8670 locations (here each location is identified by an index *i* and has an associated number of owners). There were up to 10 667 owners at any one location, with the highest number of owners occurring in the Newmarket postcode sector, resulting in a density of a few hundred horses per square kilometre. For each owner location, corresponding horse locations were generated according to the conditional probability in equation (2.1) using the Metropolis algorithm. The simulation was implemented by using the R package MCMC [23], sampling (at least) every 1000 iterations. Considering the large number of simulations required (each of the 8670 different locations requires an independent simulation), diagnostic analysis was done for a randomly selected sample of simulations.

### 2.2. Basic reproductive number *R*_{0}

The transmission model underlying the current host–vector model is similar to that described previously for BTV [18,24]. The basic reproductive number *R*_{0} was calculated by using the next-generation matrix (NGM) approach [17]. For AHS, the NGM has elements, *k*_{hl}, given by the expected number of infections of type *h* (either horse (H) or vector (V)) arising from a single infected individual of type *l*, so that

$$\mathbf{K}=\begin{pmatrix}k_{HH} & k_{HV}\\ k_{VH} & k_{VV}\end{pmatrix}. \tag{2.6}$$

Two elements of the NGM are straightforward to derive, because there is little or no direct transmission between horses (i.e. *k*_{HH} = 0) or between vectors (i.e. *k*_{VV} = 0). The two elements representing transmission from vector to horse (*k*_{HV}) or horse to vector (*k*_{VH}) are computed as follows.

*Transmission from vector to horse.* Once a vector takes an infected blood meal, it must complete the extrinsic incubation period (EIP), i.e. latent period, before it becomes infectious. The EIP is assumed to follow a gamma distribution with mean 1/*ν* and variance 1/(*n*_{V}*ν*^{2}) [25], where *n*_{V} is the shape parameter (table 1). If the vector mortality rate is *μ*, the probability that a vector survives the EIP (and so becomes infectious) is $\left(n_{V}\nu/(\mu+n_{V}\nu)\right)^{n_{V}}$. Following completion of the EIP the vector will survive for 1/*μ* days on average, during which time it will bite susceptible horses *aϕ* times per day (here *a* is the reciprocal of the time interval between blood meals and *ϕ* is the proportion of bites on horses), and a proportion, *b*, of these bites will result in an infected horse. Consequently, the expected number of infected horses arising from a single infected vector is given by

$$k_{HV}=\frac{ba\phi}{\mu}\left(\frac{n_{V}\nu}{\mu+n_{V}\nu}\right)^{n_{V}}. \tag{2.7}$$
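The survival factor entering equation (2.7) can be checked directly. As a sketch, assume the EIP, $T$, is gamma distributed with shape $n_{V}$ and rate $n_{V}\nu$ (which gives the stated mean $1/\nu$ and variance $1/(n_{V}\nu^{2})$) and that vector mortality is constant at rate $\mu$. The probability of surviving the EIP is then the expectation of $e^{-\mu T}$:

$$\mathbb{E}\left[e^{-\mu T}\right]=\int_{0}^{\infty}e^{-\mu t}\,\frac{(n_{V}\nu)^{n_{V}}\,t^{\,n_{V}-1}e^{-n_{V}\nu t}}{\Gamma(n_{V})}\,\mathrm{d}t=\left(\frac{n_{V}\nu}{\mu+n_{V}\nu}\right)^{n_{V}}.$$

Two limits give a quick sanity check: for $n_{V}=1$ (exponentially distributed EIP) the factor reduces to $\nu/(\mu+\nu)$, and for $n_{V}\to\infty$ (fixed EIP of length $1/\nu$) it tends to $e^{-\mu/\nu}$.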

*Transmission from host to vector*. The duration of viraemia in horses (assumed to indicate infectiousness) was assumed to follow a gamma distribution with mean 1/*r*_{H} and variance 1/(*n*_{H}*r*_{H}^{2}), where *n*_{H} is the shape parameter (table 1). If disease-associated mortality occurs at a rate *d*_{H}, the mean duration of infectiousness is given by $\frac{1}{d_{H}}\left[1-\left(n_{H}r_{H}/(d_{H}+n_{H}r_{H})\right)^{n_{H}}\right]$. During this time period, a host is bitten by susceptible midges on average *maϕ* times per day (here *m* is the vector-to-host ratio), a proportion, *β*, of which become infected. Hence, the expected number of infected vectors arising from a single infected horse is given by

$$k_{VH}=\frac{\beta ma\phi}{d_{H}}\left[1-\left(\frac{n_{H}r_{H}}{d_{H}+n_{H}r_{H}}\right)^{n_{H}}\right]. \tag{2.8}$$

Some linear algebra shows that the dominant eigenvalue of the NGM (i.e. *R*_{0}) is

$$R_{0}=\sqrt{k_{HV}k_{VH}}, \tag{2.9}$$

which, on substituting the expressions for *k*_{HV} and *k*_{VH}, yields

$$R_{0}=\sqrt{\frac{b\beta a^{2}m\phi^{2}}{\mu d_{H}}\left(\frac{n_{V}\nu}{\mu+n_{V}\nu}\right)^{n_{V}}\left[1-\left(\frac{n_{H}r_{H}}{d_{H}+n_{H}r_{H}}\right)^{n_{H}}\right]}. \tag{2.10}$$

The vector-to-host ratio and the proportion of bites were calculated as

$$m=\frac{N_{V}}{H_{H}},\qquad \phi=\frac{H_{H}}{H_{H}+\sigma_{C}H_{C}+\sigma_{S}H_{S}}, \tag{2.11}$$

where *H*_{H}, *H*_{C} and *H*_{S} are the populations of horses, cattle and sheep; *N*_{V} is the number of vectors representing the abundance of *Culicoides*; and *σ*_{C} and *σ*_{S} are measures of vector preference for cattle and sheep compared with horses: if *σ*_{C}, *σ*_{S} < 1, vectors feed preferentially on horses, otherwise they feed preferentially on cattle/sheep. In the non-spatial analysis, we assumed that only one type of non-susceptible host was present. In this case, the proportion of bites reduces to *ϕ* = *H*_{H}/(*H*_{H} + *σ*_{L}*H*_{L}), where *H*_{L} is the population of non-susceptible hosts and *σ*_{L} the corresponding vector preference. The risk of AHS is potentially influenced by the presence of other non-susceptible animals, such as goats and wild ruminants. However, the impact of these animals is expected to be negligible in GB because of their limited abundance (approx. 88 000 and 500 000–600 000, respectively, while the number of cattle and sheep is 10 and 36 million, respectively; see www.archive.defra.gov.uk).

Plausible ranges rather than point estimates were considered for most epidemiological and virological parameters, which constitute *R*_{0} (table 1). Where possible, estimates applicable to GB and AHS were used; otherwise, data for other species and countries were used (for details, see table 1). Replicated Latin hypercube sampling (LHS) was used to explore the parameters influencing the basic reproduction number, *R*_{0} (see [18] and references therein). The LHS results were used to compute the mean and maximum values for *R*_{0}. Results are based on 500 replicates in non-spatial cases and 100 replicates in the spatial cases.
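The calculation of *R*_{0} and its exploration by LHS can be sketched as follows. This is an illustrative outline only: the function names, the uniform sampling within each range and every numerical range below are assumptions (they are not the values in table 1), and the expressions in `r0` follow the verbal definitions of the two NGM elements given above.

```python
import numpy as np

def r0(b, beta, a, mu, nu, n_v, r_h, d_h, n_h, N_v, H_h,
       H_c=0.0, H_s=0.0, sigma_c=1.0, sigma_s=1.0):
    """Basic reproduction number as the geometric mean of k_HV and k_VH."""
    m = N_v / H_h                                       # vector-to-host ratio
    phi = H_h / (H_h + sigma_c * H_c + sigma_s * H_s)   # proportion of bites on horses
    p_eip = (n_v * nu / (mu + n_v * nu)) ** n_v         # vector survives the EIP
    t_inf = (1.0 - (n_h * r_h / (d_h + n_h * r_h)) ** n_h) / d_h  # mean infectious period
    k_hv = b * a * phi / mu * p_eip                     # vector -> horse
    k_vh = beta * m * a * phi * t_inf                   # horse -> vector
    return np.sqrt(k_hv * k_vh)

def latin_hypercube(ranges, n_samples, rng):
    """Uniform LHS: one draw per equal-width stratum, strata shuffled per parameter."""
    samples = {}
    for name, (low, high) in ranges.items():
        u = (rng.permutation(n_samples) + rng.random(n_samples)) / n_samples
        samples[name] = low + (high - low) * u
    return samples

# Hypothetical parameter ranges, for illustration only.
ranges = {"b": (0.5, 1.0), "beta": (0.01, 0.1), "a": (0.1, 0.5),
          "mu": (0.1, 0.5), "nu": (0.05, 0.2), "r_h": (0.1, 0.3),
          "d_h": (0.05, 0.3)}
rng = np.random.default_rng(42)
draws = latin_hypercube(ranges, 500, rng)
values = r0(n_v=10, n_h=5, N_v=5000.0, H_h=10.0, H_c=50.0, H_s=100.0, **draws)
print(values.mean(), values.max())   # mean and maximum R0 over the replicates
```

Because `r0` is written with array operations, the same call evaluates all replicates at once; a spatial version would repeat it per postcode sector with the local host numbers and temperature-dependent parameters.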

### 2.3. Seasonal maps for *R*_{0}

The spatial distribution of *R*_{0} was calculated at the same level of resolution as the NED owners' data, i.e. postcode sectors, an abbreviated form of address (e.g. CB8 9) used in GB. There are approximately 9000 postcode sectors in GB, each containing approximately 3000 addresses (see www.ons.gov.uk). The size of postcode sectors varies, ranging from 2864 km^{2} in a low-populated region in the Scottish Highlands to approximately 0.001 km^{2} in most of the densely populated sectors of London.

Temperature data for 2006 were used, because this was an exceptionally warm year, with all GB regions recording their warmest rolling 12-month period. Monthly averaged mean temperatures were obtained from the BADC/MIDAS database (see http://badc.nerc.ac.uk/view/badc.nerc.ac.uk__ATOM__dataent_ukmo-midas). Seasonality in vector activity was obtained from an analysis of data from a network of 12 suction traps in England, covering a variety of habitat types [34] (table 1).

## 3. Results

### 3.1. A credible national distribution of horses in Great Britain and its impact on risk predictions

The distribution of owners (figure 1*a*) strongly mirrors urban coverage (figure 1*b*). In particular, the density of owners is exceptionally high (more than 2000 owners per square kilometre) in two highly urbanized locations (City of Westminster and one of the Greater London boroughs), although the actual horse density is low in these locations. Re-distribution of the NED data according to the algorithm developed here appears to correct this source of bias (figure 1*c*), with the re-distributed horse population more evenly spread out towards rural areas and the exceptionally high densities in urban settlements being removed. The output of the correction algorithm was compared with the sample from the NED survey, which showed that it is governed by the same statistics (figure 2).

The highly clustered distribution of owners results in a sparse distribution of *R*_{0} with many postcode sectors having values less than 1 (figure 3*a*). A more realistic distribution of horses, however, leads to a more even distribution and, most importantly, more locations where *R*_{0} > 1 as shown in the insets (figure 3*b*).

### 3.2. Temporal and spatial variations of the risk of an outbreak of African horse sickness in Great Britain

Figure 4 shows spatial variation of mean *R*_{0} and the locations where *R*_{0} > 1 from January to December based on temperature data for 2006 (maximum values are shown in figure SC-1, electronic supplementary material). It is evident that the risk of AHS reflects the spatial distribution of temperature (figure 1*d*), where, for instance, higher values for *R*_{0} occur in warmer regions such as southeast England. It is also driven by the seasonally varying activity of *Culicoides* (which is lower in August than in July and September, see table 1; [34]). For instance, *R*_{0} in August is lower than that in September despite the average mean temperature being similar (15.3°C in August and 15.6°C in September). Furthermore, the distribution of horses influences the magnitude of *R*_{0}. For example, the lowest values of *R*_{0} occur in the London area, where the number of horses is small and the number of livestock negligible (figure 1*e*,*f*) despite the high temperatures.

#### 3.2.1. Influence of non-susceptible hosts and effect of vector abundance on *R*_{0}

The influence of non-susceptible hosts on the basic reproductive number, denoted here as *R*_{0}^{NSH}, is essentially driven by the feeding preference *σ*_{L}, the vector abundance *N*_{V} and its dependency on host population size. In table 2, we present the analytical expression for *R*_{0}^{NSH} and the conditions leading to *R*_{0}^{NSH} < 1 for different scenarios. The simplest scenario (regimen I) corresponds to the case when the population of *Culicoides* midges is not altered by the introduction of an alternative vertebrate host. Underlying this choice is the assumption that the key factor in the ecology of *Culicoides* midges is land use. This assumption is likely to be unrealistic as one would expect that the abundance (owing to survival and active search) of *Culicoides* increases with the resource available (e.g. linearly in regimen II and as a power law in regimen III). Regimen IV represents the more general case when the population of *Culicoides* depends on an arbitrary function of the total host population. As an illustrative case, we considered a scenario in regimen II with *H*_{H} equids and *H*_{L} non-susceptible hosts and calculated the basic reproductive number in the presence of all hosts relative to that in the presence of horses only, i.e. the ratio *R*_{0}^{NSH}/*R*_{0}. When this ratio is below 1, then it is advantageous to keep non-susceptible hosts in the proximity of horses. As shown in figure 5*a*, two distinct regions (*R*_{0}^{NSH}/*R*_{0} < 1 and *R*_{0}^{NSH}/*R*_{0} > 1) can be identified. The extension of each region is delimited by the vector preference *σ*_{L} and the ratio of non-susceptible hosts to horses (*H*_{L}/*H*_{H}). The existence and shape of such regions depend on the particular regimen for vector abundance. If the corresponding condition in table 2 is satisfied, then the ratio of non-susceptible hosts per horse leading to the extinction of the disease can be estimated (figure 5*b*). These conditions depend solely on the basic reproductive number *R*_{0} in the absence of alternative hosts, the number of equids, the number of non-susceptible hosts and the vector preference *σ*_{L}. The effect of vector abundance on the basic reproductive number, in the presence of horses only, can be readily investigated from the analytical solution displayed in table 2 with the condition *H*_{L} = 0.

The influence of alternative hosts in the spatial case is highlighted by the map of *R*_{0}^{NSH} in July for two contrasting values of the vector feeding preference: high preference towards cattle and sheep and high preference towards horses (figure 6).
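The competing dilution and amplification effects can be made concrete with a short calculation. The sketch below is not taken from table 2: it assumes that *R*_{0}^{2} is proportional to *mϕ*^{2} with *m* = *N*_{V}/*H*_{H} and *ϕ* = *H*_{H}/(*H*_{H} + *σ*_{L}*H*_{L}), that regimen I keeps *N*_{V} fixed, and that regimen II means *N*_{V} is proportional to the total number of hosts *H*_{H} + *H*_{L}. The function name `r0_ratio` is hypothetical.

```python
import numpy as np

def r0_ratio(x, sigma_l, regimen="II"):
    """Ratio R0_NSH / R0 for x = H_L / H_H non-susceptible hosts per horse."""
    x = np.asarray(x, dtype=float)
    phi = 1.0 / (1.0 + sigma_l * x)      # proportion of bites on horses (dilution)
    if regimen == "I":
        growth = np.ones_like(x)         # vector abundance unchanged
    elif regimen == "II":
        growth = 1.0 + x                 # assumed: N_V proportional to H_H + H_L
    else:
        raise ValueError("only regimens 'I' and 'II' are sketched here")
    return np.sqrt(growth) * phi         # amplification times dilution

# One non-susceptible host per horse under the assumed regimen II.
print(r0_ratio(1.0, sigma_l=1.0))   # below 1: dilution outweighs amplification
print(r0_ratio(1.0, sigma_l=0.1))   # above 1: amplification outweighs dilution
```

Under these assumptions the regimen II ratio falls below 1 when *H*_{L}/*H*_{H} > (1 − 2*σ*_{L})/*σ*_{L}^{2}, so a strong preference for horses (small *σ*_{L}) requires many more non-susceptible hosts per horse before their presence becomes protective, whereas in regimen I the ratio never exceeds 1.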

#### 3.2.2. Impact of vaccination

Vaccination is a principal control measure for AHS [30]. For a perfect vaccine which renders the host immune to infection, the fraction of vaccinated horses required to reduce *R*_{0}^{Vacc} below 1 (vaccination coverage) depends on the basic reproductive number as 1 − 1/*R*_{0}^{2}, where *R*_{0}^{Vacc} and *R*_{0} are the basic reproductive numbers in the presence and the absence of vaccinated horses, respectively [35]. For an imperfect vaccine, for instance one that reduces the transmission rate or the mean duration of viraemia, this ideal vaccination coverage must be rescaled by an appropriate reduction factor depending on the parameters affected by the vaccine (see the electronic supplementary material). However, this critical vaccination coverage ought to be seen as an upper limit, as it is based on the assumption that the host and vector populations mix uniformly. Figure 7*a* shows the value of *R*_{0}^{Vacc} as a function of the proportion of vaccinated horses and the reduction factor in the transmission rate. The top-right region in parameter space is characterized by *R*_{0}^{Vacc} < 1 and therefore extinction of the epidemic. Here, the effect of vaccination is assumed to reduce the probability of transmission (either from vector to host, *b*, or from host to vector, *β*) and the mean duration of viraemia by 50 per cent.

Vaccination is likely to affect other virological parameters, including a reduction in the rate of mortality of the host. As a host will live longer, this might increase the risk of an epidemic if not compensated by a reduction, for example, in the transmission rate and/or the mean duration of viraemia. To this end, we considered a thought experiment in which vaccination reduces only the transmission rate and the mortality rate. This is illustrated in figure 7*b*, which displays the ratio of the basic reproductive number *R*_{0}^{Vacc} (with 100% of horses vaccinated) to *R*_{0} in the absence of vaccinated horses. For a more realistic case, when vaccination is also assumed to reduce the mean duration of viraemia, the effect could persist but the region of parameter space leading to an increase in *R*_{0} is smaller (figure 7*c*).

## 4. Discussion

The current work has addressed a number of issues in spatially explicit epidemic modelling of vector-borne disease, exemplified by the important case of AHS, relating to host location, the evaluation of dilution effects when vectors may feed on both susceptible and non-susceptible hosts (a key factor that is frequently ignored) and the effect of vaccination.

The work highlights the importance of a credible host distribution when assessing the risk of a disease and the impact of control. Previous studies have shown that the distribution of NED-registered horse owners does not mirror the distribution of locations where the corresponding registered horses are kept [8,19,20]. We showed that mapping owner addresses as a simple proxy for horse location underestimates the risk of an outbreak of AHS in GB. To address this problem, a correction algorithm was implemented to infer a more realistic distribution of the equine population in GB. The correction algorithm was built on the empirical dependence between spatial separation and land use combined with NED data in a Bayesian framework. Inferring spatial knowledge from related information used as a proxy is an increasingly common approach [5–7]. The approach formulated in the present study provides an additional tool for this class of problems.

Combining the new host spatial distribution with existing national data on ambient temperatures at different times of the year, seasonal abundance of *Culicoides* and the distribution of other host species (especially cattle and sheep) resulted in a meaningful spatio-temporal assessment of the risk of AHS in GB. The modelling framework was built on contributions by Lord *et al.* [36–38], Backer & Nodelijk [27] and Gubbins *et al.* [18]. An important finding, in agreement with [18], was that the risk of AHS is strongly affected by temperature, being higher in warmer regions or warmer years. This is a particular source of concern as climate change has also been associated with alterations of *Culicoides* distributions and consequently their associated diseases [39].

The risk is also driven by the seasonally varying abundance of *Culicoides*. Using sensitivity analysis, Lord *et al.* [37] identified *Culicoides* population size as one of the most important factors in determining whether or not an epidemic occurred and in influencing the size of the epidemic. A key problem is that measurements of vector abundance and distribution are affected by the methodologies used (e.g. UV light/suction trap, CO_{2} trap, animal-baited drop trap). Direct collection of *Culicoides* from animals is considered the most reliable method for measuring biting rate [40–42]. However, owing to a paucity of such data, determining vector abundance and vector-to-host ratios accurately remains challenging.

In the particular case of *Culicoides*, the vector-to-host ratio is known anecdotally to vary by several orders of magnitude according to a wide variety of factors. Also, most studies do not investigate how the abundance of *Culicoides* is affected by the densities of available hosts. For example, two putative vectors of BTV (*Culicoides dewulfii* and *Culicoides chiopterus*), and potentially AHSV, develop as larvae in cattle dung [43,44] and, hence, would only be expected to come into contact with horses through overlapping host populations. The hypothesis that *Culicoides* abundance is density dependent is supported by recent findings that *Culicoides* abundance was significantly higher at trap locations with a high density of cattle in the locality [34]. Another study suggested that catches in light traps increase linearly with sheep numbers, at least for small host numbers [45]. Although the particular design of this experiment (low number of sheep, the presence of only one host, no habitat variations, measurements based on light trap catches) prevents robust generalizations, the findings are compatible with the common assumption of a fixed vector-to-host ratio [18,37,38].

Previously, Lord *et al.* [36] calculated *R*_{0} under different hypothetical relationships between vector population dynamics and either host or vector density, though assuming no vector preference for different hosts. Here, we have shown that knowledge of *Culicoides* abundance alone is not sufficient to discriminate whether the presence of non-susceptible hosts is beneficial or not and information on the feeding preference is essential. Despite a growing body of research that has focused on feeding patterns of midges [16,46–49], reliable measurements suitable for use in epidemiological studies are still scarce. The probability of taking a blood meal on a particular animal depends not only on its attractiveness but also on the numeric availability of a host. A common limitation in these studies is that knowledge of host abundance is only approximate.

To the authors' knowledge, the joint effects of *Culicoides* abundance and feeding preferences have not been rigorously investigated. If robust, accurate measurements of these factors were available, the current framework could be readily used to assess whether or not the proximity of non-susceptible hosts is likely to reduce or increase the risk of an AHS epidemic. In the absence of reliable measurements, we explored different hypotheses on the abundance of *Culicoides*. A key focus of the current work was to explore *R*_{0} and its dependence on (i) vector abundance and its relationship with the density of susceptible and non-susceptible host species and (ii) vector-feeding preferences between hosts. This allowed us to provide quantitative estimates for this potential dilution effect.

A variety of vaccines have been developed to prevent AHSV infection (see [50], and references therein). These include inactivated and live attenuated virus vaccines, virus-like particles produced from recombinant baculoviruses, a recombinant vaccinia-vectored vaccine and a DNA vaccine. Polyvalent cell culture attenuated vaccines are still routinely used for protective immunization of horses in sub-Saharan Africa to achieve sufficient protection against all nine AHSV serotypes. However, the simultaneous administration of multiple vaccine strains can result in interference during vaccine virus replication, possibly resulting in incomplete immunity [51,52]. For example, a recent study has shown that immunized horses in an AHS endemic area were infected with AHSV over a 2 year period [53]. Our results suggest that incomplete immunity with a reduction in the mortality rate of the horses might lead to an increase in the basic reproduction number (figure 7). It is conceivable that in a more realistic case, e.g. when vaccination appreciably reduces the mean duration of viraemia, this risk could be negligible. However, this emphasizes the need for accurate measurements of all virological parameters for live attenuated vaccines. By contrast, inactivated vaccines, such as the recombinant canarypox virus-vectored vaccine described by Guthrie *et al.* [50], result in a suppression of viraemia and no risk of transmission, and these are promising vaccine candidates for use in non-endemic areas, such as Europe.

In the event of a vaccination campaign, a key epidemiological parameter is the proportion of vaccinated horses required to generate herd immunity. For a perfect vaccine, this critical vaccination coverage is given by 1 − 1/*R*_{0}^{2}, but for an imperfect vaccine this threshold must be rescaled using an appropriate reduction factor depending on the parameters affected by the vaccine. In general, this leads to predictions that the required level of vaccination coverage is high (in figure 7*a*, with *R*_{0} for unvaccinated horses equal to 2.6, the critical coverage is 85%). Such a high prediction for vaccine coverage is not surprising; for example, Lord *et al.* [38] estimated that the prevention of 50 per cent of epidemics required 75 per cent coverage of horses and donkeys or 90 per cent coverage of horses only.
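As a minimal sketch of the arithmetic behind the perfect-vaccine threshold quoted above, the following Python snippet evaluates 1 − 1/*R*_{0}^{2} for the value *R*_{0} = 2.6 given in the text. The function name `critical_coverage` and the convention of returning zero when *R*_{0} ≤ 1 are assumptions made for illustration; the rescaling required for an imperfect vaccine is not specified here and is therefore not attempted.

```python
def critical_coverage(r0: float) -> float:
    """Critical vaccination coverage for a perfect vaccine, 1 - 1/R0^2.

    Hypothetical helper: the formula is the one quoted in the text; the
    handling of r0 <= 1 (no vaccination needed) is an assumption.
    """
    if r0 <= 1.0:
        # Below the epidemic threshold no coverage is required.
        return 0.0
    return 1.0 - 1.0 / r0**2


# R0 for unvaccinated horses as quoted in the text (figure 7a).
coverage = critical_coverage(2.6)
print(f"Critical coverage: {coverage:.0%}")  # prints "Critical coverage: 85%"
```

Because the threshold depends on the square of *R*_{0}, even moderate values of *R*_{0} translate into a high required coverage, which is consistent with the high coverage levels discussed in the text.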

In the current work, spatial clustering, e.g. horses kept in livery yards, was not incorporated in the model. At the resolution used, this is expected to have little impact since estimates of the range of the spatial movement of *Culicoides* [54] are comparable with the typical sizes of the postcode sectors; the choice also captures the expectation that movement of *Culicoides* is reduced in highly urbanized areas (i.e. smaller areas of the postcode sectors) as streets act as barriers to disease vectors [55].

In the present model, the movement of horses was not included, despite its potential impact on transmission. To the authors' knowledge, data on horse movements in GB are limited, and modelling horse movement between countries has proved challenging owing to large uncertainty in model inputs [56]. More importantly, one of the first actions following confirmation of AHS in the UK would be a movement restriction zone [57] and possibly a national movement ban on all equids. These considerations led to the choice of focusing our analysis on a local measure of disease risk (*R*_{0} at a particular location and time).

In summary, we have shown how it is possible to address the problem of inaccurate spatial demographic data by exploiting the partial information available. Here, combining NED and land-use data in a Bayesian approach, we developed an algorithm to infer a realistic distribution for the location of horses. Based on such a credible distribution of the host, we explored the impact of using inaccurate maps of equine distribution in predicting risk. In addition, we have clarified the role of non-susceptible hosts by showing that the risk of disease in the presence of many hosts (susceptible and non-susceptible) can be ultimately reduced to two fundamental factors: (i) abundance of vectors and how this depends on host density and (ii) differential feeding preference among animal species. Our results here identify key measurements needed for a better understanding of the elusive role of non-susceptible hosts.

## Acknowledgements

The authors acknowledge the generous support received for their work in this area from the Horserace Betting Levy Board. G.L.I. is funded by the Ecosystem Services for Poverty Alleviation Programme (ESPA). The ESPA programme is funded by the Department for International Development (DFID), the Economic and Social Research Council (ESRC) and the Natural Environment Research Council (NERC). J.L.N.W. is supported by the Alborada Trust and the RAPIDD Program of the Science and Technology Directorate, US Department of Homeland Security and Fogarty International Center. S.G. is supported by the BBSRC. J.R.N. is supported by contributions to the Animal Health Trust's Equine Infectious Disease Service from the Horserace Betting Levy Board, Racehorse Owners' Association and Thoroughbred Breeders' Association. Horse population data were provided by the National Equine Database: NED Ltd. Cattle and sheep data were provided by RADAR Archive (DEFRA). Land-use data were provided by the NERC—Centre for Ecology and Hydrology Database, copyright NERC—Centre for Ecology and Hydrology. All rights reserved. Temperature data were provided by the Met Office 2006. This work is based on data provided through EDINA UKBORDERS with the support of the ESRC and JISC and uses boundary material which is copyright of the Crown and the Post Office.

- Received February 28, 2013.
- Accepted March 25, 2013.

© 2013 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/3.0/, which permits unrestricted use, provided the original author and source are credited.