## Abstract

The objective of this study was to develop and parametrize a mathematical model of the sensitivity of pooled sampling of faeces to detect *Salmonella* infection in pigs. A mathematical model was developed to represent the effect of pooling on the probability of *Salmonella* isolation. Parameters for the model were estimated using data obtained by collecting 50 faecal samples from each of two pig farms. Each sample was tested for *Salmonella* at individual sample weights of 0.1, 0.5, 1, 10 and 25 g, and pools of 5, 10 and 20 samples were created from the individual samples. The highest test sensitivity for individual samples was found at 10 g (90% sensitivity), with the 25 g test sensitivity equal to 83%. For samples of less than 10 g, sensitivity decreased as sample weight decreased. Incubation for 48 h produced a more sensitive test than incubation for 24 h. Model results showed sensitivity increasing with the number of samples in the pool, with the pools of 5, 10 and 20 being more sensitive than individual sampling, and the pools of 20 being the most sensitive of those considered.

## 1. Introduction

In 2003, there were 16 343 laboratory confirmed cases of human salmonellosis in the UK (Anon 2004), but it is recognized that there is considerable under-reporting and the true figure within the population is probably in excess of 50 000 cases yr^{−1} (Anon 2000). In 1999–2000, an abattoir survey revealed that approximately 23% of finisher pigs slaughtered in Britain carried *Salmonella* in their caecal contents. The most frequent serotype isolated from pigs is *Salmonella typhimurium* (Cook *et al*. 2003*a*; Davies *et al*. 2004; Veterinary Laboratories Agency 2004), which was identified in 2205 (14%) of the human cases in Great Britain, making it the second most common cause of human disease (Anon 2004). *Salmonella typhimurium* is a non-host-adapted serovar and was also isolated from cattle, sheep and poultry in Great Britain and, therefore, the proportion of human infections attributable to pigs has not been accurately estimated.

In response to the potential threat to public health posed by *Salmonella* infection in pigs, the British Pig Executive launched the Zoonoses Action Plan *Salmonella* Monitoring Programme (ZAP) in June 2002. ZAP was based on a Danish *Salmonella* control programme (Sorensen 2003) and utilizes a meat juice mix enzyme-linked immunosorbent assay (MJ ELISA) system to detect antibodies against group B and C_{1} *Salmonella* (van der Heijden 2001). Farms are assigned a ZAP score of 1 (less than 65% of samples positive), 2 (65–85% of samples positive) or 3 (greater than 85% of samples positive) and those that receive a ZAP score of 2 or 3 must act to reduce the prevalence of MJ ELISA-positive pigs or face loss of their quality assured status (Armstrong 2001). The aim of ZAP is ‘to reduce the prevalence of *Salmonella* in assured pigs at slaughter by 25%’ in 3 years (British Pig Executive 2003). A similar programme was instigated in Denmark in 1995 and is believed to have contributed to a reduction in human cases of salmonellosis over the following 5 years (Hald & Andersen 2001; Krarup 2002).

A pilot study was conducted in which farms were randomly allocated to a control or intervention group (Cook *et al*. 2003*b*). The latter adopted a farm-level enhanced hygiene and biosecurity programme while the former maintained usual practices. On all farms, pooled pen faecal samples were collected at monthly intervals and cultured for *Salmonella*. The impact of the intervention was assessed by comparing the pen incidence rate between intervention and control farms. Pooled pen samples offer several advantages over individual *per rectum* samples: (i) pig welfare is not compromised by capture, restraint and obtaining the sample; (ii) *Salmonella* excretion is intermittent and a negative sample may be obtained from an infected individual pig; and (iii) there is a reduced cost, since sampling is quicker and farm staff can feasibly be trained in sample collection. The validity of pooled sampling for detection of *Salmonella* in pigs has not been established and there is an urgent need to do so, since advice to individual farmers and to the industry as a whole will be based on these results.

Despite its importance, there is little work in the literature on quantitative estimates of the sensitivity of pooled pen faecal samples for *Salmonella* in pigs. A study by Funk *et al*. (2000) examined the impact of sample weight on the sensitivity of the test for individual samples. However, this study was based in the US and the results may not be directly applicable to Great Britain since there are differences in the methods used for isolation. For example, Funk *et al*. (2000) used direct selective enrichment whereas the Veterinary Laboratories Agency (VLA) method uses buffered peptone water (BPW) for pre-enrichment before selective enrichment. Furthermore, existing theoretical models of pooled sampling may not be appropriate as they either do not consider imperfect test sensitivity (Sacks *et al*. 1989; Evers & Nauta 2001) or they assume that the sensitivity of the diagnostic test applied to the pooled sample is independent of the number of infecteds in the pool (Abel *et al*. 1999; Cowling *et al*. 1999). This assumption may not be valid for pooled pen faecal sampling as there are unpredictable effects from mixing samples. Firstly, there is a dilution effect from combining *Salmonella*-positive and *Salmonella*-free samples. Secondly, there may be various inhibitory factors in one or more of the individual samples that are combined in a pool that affect growth of *Salmonella*. These include the presence of other organisms that compete for nutrients and that may release metabolic products that inhibit growth of *Salmonella*, e.g. colicins (Harvey & Price 1974). Moulds may produce antibacterial substances and yeasts may produce alcohols through fermentation. Copro-antibodies and cytokines secreted into the gut by some infected pigs may also be present at varying concentrations and some samples may contain bacteriophages that can kill or damage *Salmonella*. 
As the culture of a sample progresses, the medium becomes progressively more acidic and this, too, inhibits *Salmonella* growth (Blackburn 1993; Busse 1995; Feder *et al*. 1998; Davies *et al*. 2000; Pangloli *et al*. 2003; Chen *et al*. 2004).

The objective of this project was to produce a mathematical model of the sensitivity of pooled pen faecal samples for *Salmonella* in pigs. The model was parametrized using the data collected and analysed as a part of this study. The optimal number of individual faeces to include in a pooled pen sample was also investigated.

## 2. Materials and methods

### 2.1 Outline of bacteriological methods

Pen faecal samples were pre-enriched in BPW (Merck) at 37 °C for 18 h and selectively enriched in Diasalm agar plates (Merck) for 48 h at 41.5 °C. Samples from this were inoculated at 24 and 48 h onto a Rambach agar plate (Merck) for 24 h at 37 °C and suspect *Salmonella* colonies were subjected to a slide agglutination test using a range of typing sera and to the minimum phenotypic criteria for identification to *Salmonella* species (Davies *et al*. 2001). A subculture of each confirmed *Salmonella* isolate was submitted for full serotyping and phage typing, where applicable.

### 2.2 Mathematical model for sensitivity of testing individual faecal samples for *Salmonella*

We assume, as in Cannon & Nicholls (2002), that *Salmonella* is clustered in faeces at the rate of *C* clusters g^{−1}, and the number of clusters is Poisson distributed. Once the sample has been mixed in BPW the *Salmonella* organisms become homogeneously distributed and multiply. The final concentration of *Salmonella* organisms in the BPW depends upon:

- the growth rate of the serotype;
- the growth rate of other organisms;
- the amount of inhibitory substances;
- the carrying capacity of the BPW.

The probability of detecting *Salmonella* depends on the final concentration in the BPW and its further growth in the selective Diasalm enrichment culture and Rambach plating agar. In this model, we assume that the number of inhibitory factors described above is directly proportional to the sample weight. Therefore, we assume that the probability of *Salmonella* detection in selective media, denoted *τ*, depends on the ratio of the number of *Salmonella* clusters in the faecal sample and the sample weight (representing the inhibitory factors), according to formula (2.1). Formula (2.1) describes an increasing probability of detection as the ratio of the number of clusters in the sample to the sample weight increases, up to a maximum probability of detection of 1.

The probability of detecting *Salmonella* in an individual faecal sample of weight *w*, denoted *η*(*w*), is given by formula (2.2). Given that an individual faecal sample contains *Salmonella*, formula (2.2) gives the probability that the sample will test positive. This model can be adapted to determine the probability of detecting *Salmonella* in pooled samples.
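To make the structure of the individual-sample model concrete, the following minimal sketch averages a detection probability over a Poisson-distributed number of clusters. The exponential saturation form used for *τ* is an assumption chosen only because it matches the verbal description (one parameter *ρ*, increasing in the ratio of clusters to weight, bounded by 1); it is not confirmed by the text, and the function names `tau` and `eta` are hypothetical.

```python
import math


def tau(x, w, rho):
    """ASSUMED form of the detection probability for x clusters in w grams.

    Increases with x / w and saturates at 1; the true formula (2.1)
    may differ from this illustrative choice.
    """
    return 1.0 - math.exp(-rho * x / w)


def eta(w, C, rho):
    """Individual-level sensitivity for a sample of weight w (grams).

    Averages tau over a Poisson(C * w) number of clusters, where C is
    the cluster rate per gram. The x = 0 term contributes nothing
    because tau(0, w, rho) = 0.
    """
    lam = C * w
    # Truncate the infinite sum well beyond the bulk of the Poisson mass.
    x_max = int(lam + 12.0 * math.sqrt(lam) + 30.0)
    pmf = math.exp(-lam)  # P(X = 0)
    total = 0.0
    for x in range(1, x_max + 1):
        pmf *= lam / x  # recurrence: P(X = x) = P(X = x - 1) * lam / x
        total += pmf * tau(x, w, rho)
    return total
```

Under this assumed form, small samples lose sensitivity mainly because they may contain no cluster at all, while large samples are penalized through the larger weight in the denominator of the ratio.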

### 2.3 Mathematical model for pooled faecal samples

We assume an equal mass of faeces is collected from each pig. We assume further that the distribution of the number of *Salmonella* clusters g^{−1} is the same for all infected pigs, irrespective of serotype and faecal consistency, and that in a pooled faecal sample it is directly proportional to the prevalence of infected faeces in the sample. Therefore, with a prevalence of *π* the pool-level sensitivity is equivalent to the individual-level sensitivity with *Cπ* in place of *C*, and the sample weight equal to the total pooled sample weight. However, while this approach enables a mathematical formula for the pool-level sensitivity to be derived, there is little known about the values that the parameters should take.
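The substitution of *Cπ* for *C* can be sketched as follows. The closed form for `eta` below follows from the Poisson generating function, but only under an assumed detection probability of 1 − exp(−*ρx*/*w*) for *x* clusters; that functional form, and the names `eta` and `pool_sensitivity`, are illustrative assumptions rather than the authors' stated formulae.

```python
import math


def eta(w, C, rho):
    """Individual-level sensitivity under the ASSUMED detection form.

    With X ~ Poisson(C * w) clusters and detection probability
    1 - exp(-rho * X / w), the expectation has this closed form.
    """
    return 1.0 - math.exp(-C * w * (1.0 - math.exp(-rho / w)))


def pool_sensitivity(prevalence, total_weight, C, rho):
    """Pool-level sensitivity: the individual model with C scaled by the
    prevalence of infected faeces in the pool, at the total pooled weight."""
    return eta(total_weight, C * prevalence, rho)
```

For example, `pool_sensitivity(0.2, 25.0, C, rho)` would represent a 25 g pool in which one sample in five is infected.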

### 2.4 Estimating the parameters

Two experiments were performed in the study. Brief details are given below and a flow chart describing both experiments is given in figure 1.

#### 2.4.1 Experiment 1: study of clustering

Faecal samples from 100 individuals were collected, 50 each from two sources, which we denote farm A and farm B. Each faecal sample was divided into 0.1, 0.5, 1, 10 and 25 g samples and tested for *Salmonella*. Individual samples were not homogenized prior to testing, since this would destroy the clusters and would not reflect reality where individual samples would be pooled on the farm.

Since each sample was tested at five different weights, with each sample testing either positive or negative at each weight, there are, in total, 2^{5}=32 different outcomes for each sample. Therefore, the data arise from a multinomial distribution, where the individual-level sensitivity, *η*, at each sample weight, *w*_{i}, is given in equation (2.2).

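The probability of negative test results at all five sample weights, assuming an overall prevalence of true positive samples of *π*, is given by

$$P(\text{all negative}) = 1 - \pi + \pi \prod_{i=1}^{5} \bigl(1 - \eta(w_i)\bigr),$$

and the probability of each outcome for at least one positive test out of the five is given by

$$P(j_1, \ldots, j_5) = \pi \prod_{i=1}^{5} \eta(w_i)^{j_i} \bigl(1 - \eta(w_i)\bigr)^{1 - j_i},$$

where *j*_{i}=0 if test *i* was negative and 1 if test *i* was positive.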
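The 32 outcome probabilities can be enumerated directly. The sketch below assumes that a truly negative sample never tests positive and that, for a truly positive sample, the five tests are conditionally independent; the function name `outcome_probabilities` and its arguments are hypothetical.

```python
from itertools import product


def outcome_probabilities(sens, prevalence):
    """Probability of every positive/negative pattern across the tests.

    sens       -- individual-level sensitivities, one per sample weight
    prevalence -- proportion of samples that are truly positive
    Returns a dict mapping each tuple of 0/1 results to its probability.
    """
    probs = {}
    for outcome in product((0, 1), repeat=len(sens)):
        # Contribution from a truly positive sample.
        p = prevalence
        for j, s in zip(outcome, sens):
            p *= s if j else (1.0 - s)
        # A truly negative sample can only yield the all-negative pattern.
        if not any(outcome):
            p += 1.0 - prevalence
        probs[outcome] = p
    return probs
```

With five sensitivities the dictionary has 32 entries whose probabilities sum to one, which is the multinomial cell structure described above.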

The parameter estimates for *C* and *ρ* were obtained from the data using a Markov chain Monte Carlo (MCMC) method (Gelman *et al*. 1995) in WinBugs 3.1 (code available from the authors on request). Since we had no firm knowledge of the values that *C* and *ρ* should take, vague priors were set for *C* and *ρ*, which were each assumed to be gamma distributed (mean=1, variance=10^{3}).
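As a worked illustration of how vague this prior is, the stated moments can be converted to gamma parameters. This assumes a shape-rate parameterization with shape *a* and rate *b* (so that the mean is *a*/*b* and the variance is *a*/*b*²); the text itself gives only the mean and variance.

$$a = \frac{\text{mean}^2}{\text{variance}} = \frac{1^2}{10^{3}} = 10^{-3}, \qquad b = \frac{\text{mean}}{\text{variance}} = \frac{1}{10^{3}} = 10^{-3}.$$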

#### 2.4.2 Experiment 2: study of pooling

In order to provide validation of the pooling model and to obtain further data for parameter estimation of *C* and *ρ*, a set of pooled samples was made up from the 100 individual samples. A total of 210 pools were formed, 70 pools each of the following pool types: 5×5 g samples, 10×2.5 g samples and 20×1.25 g samples. Since the material from farm A and farm B was collected at different times, two sets of 35 pools were formed, one set from each source. Individual faecal samples were randomly allocated to these pools, without regard to whether *Salmonella* had been identified in them. For each pool, samples were randomly selected without replacement so that each sample could be selected only once in each pool, but were able to be used in several different pools.
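The allocation scheme described above (no repeats within a pool, reuse allowed across pools) can be sketched as follows. The function name `make_pools` and the use of integer sample identifiers are hypothetical; the study's actual randomization procedure is not described in further detail.

```python
import random


def make_pools(sample_ids, pool_size, n_pools, seed=None):
    """Form n_pools pools, each of pool_size distinct samples.

    Sampling is without replacement within a pool, but every pool is
    drawn from the full list, so a sample may appear in several pools.
    """
    rng = random.Random(seed)
    return [rng.sample(sample_ids, pool_size) for _ in range(n_pools)]
```

For one farm and one pool type this would be called as, for instance, `make_pools(list(range(50)), 5, 35)`.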

Each pooled sample had a weight of 25 g. The sensitivity of a pooled faecal sample test, of which *π*_{pool} out of *n*_{pool} samples are truly positive for *Salmonella*, is assumed to be equal to the individual sample model with the number of clusters g^{−1} scaled by *π*_{pool}/*n*_{pool}. The pool-level sensitivity for a faecal sample with *π*_{pool} positives out of *n*_{pool} samples, *η*(*π*_{pool}, *n*_{pool}), is therefore assumed to be given by equation (2.2) with *w*=25 g and with *C* replaced by *Cπ*_{pool}/*n*_{pool}. The number of test-positive pools, *x*, among the pools containing *π*_{pool} truly positive samples is binomially distributed with parameters *η*(*π*_{pool}, *n*_{pool}) and the number of such pools. We assumed that a proportion, *ϕ*, of the true positive samples was not detected by the individual-level testing, so that the number of false negative samples in each pool was binomially distributed with parameters *p*=*ϕ* and *n*=the number of samples in the pool−the number of known positives in the pool. Therefore, *π*_{pool}=the number of identified positives in the pool+the binomial sample of the number of false negatives. A binomial model was fitted to the pooled testing data using an MCMC method in WinBugs 3.1. The priors for *C* and *ρ* were obtained from the posterior estimates obtained from fitting the individual sample model, with *C* and *ρ* each assumed to follow a normal distribution.
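The treatment of undetected positives can be illustrated with a single random draw. This is only a sketch of the binomial step described above, not the MCMC model fitted in the study; the function name `impute_true_positives` and its arguments are hypothetical.

```python
import random


def impute_true_positives(pool_size, known_positives, phi, rng):
    """Draw one plausible number of truly positive samples in a pool.

    Each sample that tested negative individually is treated as a false
    negative with probability phi, so the number missed is
    Binomial(pool_size - known_positives, phi).
    """
    candidates = pool_size - known_positives
    missed = sum(rng.random() < phi for _ in range(candidates))
    return known_positives + missed
```

A call such as `impute_true_positives(5, 0, 0.06, random.Random())` shows how a pool with no known positives can still contain a positive sample, which is how positive results from such pools are accommodated.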

### 2.5 Investigation of the optimal pooling strategy

Using the parameters estimated from experiments 1 and 2 in equation (2.2) enables the estimation of the pool-level sensitivity given the total sample weight, the number of samples, and the prevalence of pigs infected with *Salmonella* in a pen. The expected test sensitivity of a 25 g pooled pen sample from a pen of 30 pigs is calculated when the number of faecal samples in the pool is varied between 1 and 30. Assuming that the number of samples per pool, the number of *Salmonella*-positive samples in the pool, and the number of pigs in the pen are given by *n*_{pool}, *π*_{pool} and *n*_{pen}, respectively, the expected pool-level sensitivity, *η*_{pool}, is given by

$$\eta_{\text{pool}} = \sum_{j=0}^{n_{\text{pool}}} P(\pi_{\text{pool}} = j)\, \eta(j, n_{\text{pool}}),$$

where *η*(*j*, *n*_{pool}) is the pool-level sensitivity for a 25 g sample for a given proportion of positive samples in the pool. *P*(*π*_{pool}=*j*) is the probability that there are *j* positive faeces in the pooled pen sample, which is calculated from a hypergeometric probability mass function assuming there are a total of *π*_{pen} positive faeces out of a total of *n*_{pen} faeces in the pen, thereby assuming that there is one fresh faecal sample per pig. It is possible to scale up the number of positive faeces and total faeces in the pen by the expected number of individual faecal samples per animal per day (or relevant period). This will not affect the expected value of the pool-level sensitivity, but it would affect estimates of the variance of the pool-level sensitivity.
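The expectation over the hypergeometric distribution can be sketched as below. The closed-form `eta` relies on an assumed detection probability of 1 − exp(−*ρx*/*w*), which is an illustrative choice and not confirmed by the text; all function names are hypothetical.

```python
import math


def eta(w, C, rho):
    """Sensitivity at weight w under the ASSUMED detection form."""
    return 1.0 - math.exp(-C * w * (1.0 - math.exp(-rho / w)))


def hypergeom_pmf(j, n_pen, pos_pen, n_pool):
    """P(j positives among n_pool faeces drawn without replacement
    from n_pen faeces of which pos_pen are positive)."""
    return (math.comb(pos_pen, j)
            * math.comb(n_pen - pos_pen, n_pool - j)
            / math.comb(n_pen, n_pool))


def expected_pool_sensitivity(n_pool, pos_pen, n_pen, C, rho, w=25.0):
    """Average the pool-level sensitivity over the number of positives
    that happen to be included in the pool."""
    total = 0.0
    for j in range(0, min(n_pool, pos_pen) + 1):
        p_j = hypergeom_pmf(j, n_pen, pos_pen, n_pool)
        # C is scaled by the proportion of positives in the pool.
        total += p_j * eta(w, C * j / n_pool, rho)
    return total
```

Evaluating `expected_pool_sensitivity(n, pos_pen, 30, C, rho)` for `n` from 1 to 30 would trace out a curve of the kind the text describes, under the stated assumptions.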

## 3. Results

### 3.1 Clustering results

Of the 100 individual faecal samples cultured for detection of *Salmonella*, at sample weights of 0.1, 0.5, 1, 10 and 25 g, 44 were positive for at least one sample weight after 24 h of incubation, and 48 were positive after 48 h incubation. The increase in detected positives was largely due to the increased sensitivity of the 25 g sample test, which increased from 35 positives after 24 h incubation to 40 positives after 48 h incubation. The corresponding number of 10 g sample positives increased from 42 to 43. There was no observed increase in the number of detected positives for lower sample weights. The number of positives and estimates of the individual-level sensitivity at each sample weight are given in table 1, assuming that all *Salmonella*-positive faeces were detected by at least one of the tests.
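As a worked check of the 48 h figures, taking the 48 samples positive at any weight as the set of true positives (the assumption stated above), the observed sensitivities at 25 g and 10 g are approximately

$$\frac{40}{48} \approx 0.83, \qquad \frac{43}{48} \approx 0.90,$$

which agree with the values of 83% and 90% quoted in the abstract.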

For the estimation of the parameters in WinBugs, 30 000 iterations were performed, with a burn-in of 4000. The fit of the model for the sensitivity of testing individual faecal samples to the individual sample data with 48 h incubation time is given in table 1. This shows that the model can broadly reproduce the observed increase of individual-level sensitivity with increasing sample weights. The median value of *C* was 4.4 (2.5 and 97.5 percentiles: 2.7, 7.9) and that of *ρ* was 0.4 (2.5 and 97.5 percentiles: 0.2, 0.8).

### 3.2 Pooling results

#### 3.2.1 5×5 g pools

The individual sample results with 48 h of incubation were used to determine the number of positives in each pool. Of the 18 pools with no positive samples by any of the individual 48-h tests, 4 gave positive tests (table 2), indicating that there were some positive samples which were not detected by the individual sample tests. Of the remaining 52 pools, 39 gave positive tests after 24 h of incubation and 41 gave positive tests after 48 h of incubation. The distribution of the number of positive samples in each pool and the number of the positive pools which tested positive after 48 h of incubation are given in table 2.

#### 3.2.2 10×2.5 g pools

Of the 6 pools with no positive samples by any of the individual sample 48-h incubation tests, 1 tested positive after 48 h of incubation (table 2). Of the remaining 64 pools, 44 were positive after 24 h of incubation and 49 were positive after 48 h of incubation (table 2).

#### 3.2.3 20×1.25 g pools

Of the 68 pools containing at least one positive sample, 45 were positive after 24 h of incubation, and 55 were positive after 48 h of incubation (table 2).

### 3.3 Estimation of pool-level sensitivity parameters

The median value of *C* was 7.3 (2.5 and 97.5 percentiles: 5.3, 9.8) and the median value of *ρ* was 0.55 (2.5 and 97.5 percentiles: 0.40, 0.73). The resulting fit of the model to the pooled sample data is given in figure 2, and shows that the pool-level sensitivity increases with the number of positive samples in the pool. This means that work by Abel *et al*. (1999) and Cowling *et al*. (1999) would not be applicable in this case, as they assume that the pool-level sensitivity is independent of the number of positives in the pool, i.e. the pool sensitivity would be assumed to be equal to 83% (table 1) for all pools with at least one positive.

The estimated proportion of false negative individual samples, *ϕ*, inferred from assuming binomially distributed false negatives and from the positive results from pools with no individual-level test positives, equalled 6%, i.e. there are most likely to be three positives missed by the individual-level testing.
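The figure of three missed positives is consistent with applying *ϕ* to the samples that tested negative individually. Assuming these are the 100 − 48 = 52 samples with no positive result after 48 h of incubation, the expected number of false negatives is

$$0.06 \times 52 \approx 3.1.$$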

## 4. Comparison of pooling strategies

The expected pool-level sensitivity of a 25 g sample from a pen of 50 pigs when the number of faecal samples in the pool is varied between 1 and 25 is given in figure 3. This represents the pooling scenario in experiment 2. The graph demonstrates the predicted sensitivity for three scenarios: that the number of infected pigs was 41 (as in farm A), 7 (as in farm B) or approximated to the national prevalence. This national distribution has a mean of 25%, approximately the estimate of *Salmonella* prevalence in a recent abattoir survey (Davies *et al*. 2004). A beta distribution with parameter values of *α*=2.1 and *β*=6.5 was assumed, thus yielding estimates of 2.5 and 97.5 percentiles of 3.8 and 56%, respectively.
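The mean of the assumed beta distribution can be checked from its parameters using the standard formula for a beta mean:

$$\frac{\alpha}{\alpha + \beta} = \frac{2.1}{2.1 + 6.5} = \frac{2.1}{8.6} \approx 0.244,$$

which is close to the 25% stated above.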

The beta-distributed prevalence resulted in a test sensitivity of approximately 67% with 20 samples in the pool. Results for all three values of prevalence indicate an increase in pool-level sensitivity as the number of individual faeces included in the pool is increased, especially for the first five individual faeces. This is due to the increased probability of capturing positive faeces in the pool as the number of individual faeces is increased.

## 5. Discussion

The individual-level test results showed a higher sensitivity for the 10 g sample than the 25 g sample (table 1). The difference between the 25 and 10 g sample sensitivity was significant after 24 h incubation (*p*=0.02) but not after 48 h (*p*=0.63). This is an unexpected finding and further experiments are needed to clarify whether this was a random event in this study or whether it is related to the procedures used. Other authors have reported a detrimental effect of increasing sample weights if a large number of competing micro-organisms are present (Leifson 1936; Harvey & Phillips 1955). Inhibition of *Salmonella* growth in the 25 g samples is also suggested by the higher rate of positives after 48 h of selective enrichment with these samples. This finding conflicts with the EU reference method for *Salmonella* testing, ISO 6579:2002, which specifies a 25 g sample and 24 h selective enrichment. In this study, we took subsamples from a faecal mass. Since clusters of *Salmonella* are not evenly distributed within an affected faecal mass, subsamples may not always have *Salmonella* present. This variation could have been reduced by homogenizing the faecal mass before obtaining subsamples (Cannon & Nicholls 2002).

Results from this pooling study indicate close to 100% sensitivity for pools with greater than 50% prevalence of positive faecal samples. This was higher than that observed for the individual 25 g samples (approximately 80%). In this study, we made the assumption that every sample contained an equal amount of *Salmonella*-inhibiting factors. To our knowledge, very little research has been done on bacterial growth inhibiting properties in a faecal sample. It can be argued that since local immunological factors such as copro-antibodies and cytokines are usually associated with infection in the animal, samples from infected pigs would contain more inhibiting factors than samples from a healthy pig. Other micro-organisms may have adapted to the presence of *Salmonella* in infected pigs and thus be more competitive to *Salmonella* than organisms from non-infected pigs. If this is the case, diluting samples from infected pigs with samples from non-infected pigs may increase the possibility of isolating *Salmonella*. This may explain the higher sensitivity in pooled samples compared to individual samples, but more work is needed to elucidate these speculations.

More work is also needed to confirm some of the results of the study, including comparison of 10 and 25 g samples from a wider range of sources. Pools that include a larger number of individual samples may lead to definition of an optimum pool size. It should also be taken into consideration that it is not always possible to collect individual faecal samples which are unaffected by environmental contamination, and impractical to carefully pool large numbers of them in equal proportions within the pool. Thus, it is also necessary to transfer the theoretically identified optimum sample and pool sizes to on-farm properties and practical use, including the use of naturally pooled floor faeces. The model estimates a pool-level sensitivity of approximately 70% for a mean pen prevalence equal to an approximate estimate of prevalence of the national herd (25%). Further work is needed to verify that the estimates of pool-level sensitivity are applicable to the pooling practices that are carried out in the field. Another useful aspect would be to develop a defined approach for mixing large amounts of material from a group of animals and generating a subsample for testing. This approach would be likely to further enhance the sensitivity of detection with a minimum number of samples.

The most frequent serotypes isolated from pigs in Great Britain are *S. typhimurium* and *Salmonella derby*. However, by chance in this study, the main serotypes were *Salmonella reading* (farm A) and *Salmonella enteritidis* (farm B). These serotypes have been infrequently identified in British pigs and *S. enteritidis* is typically associated with poultry. Current research will enable us to identify farms that are infected with *S. typhimurium* and *S. derby* serotypes and select samples from these for further experiments on pooling. This would help test the assumption that the number of *Salmonella* clusters g^{−1} is independent of serotype, and ensure that the model is fully representative of the serotypes occurring in Great Britain.

The relationship between the concentration of *Salmonella* and the probability of a positive test result was assumed to have only one unknown parameter (equation (2.1)). In reality, a more complex model may represent this relationship more precisely. For example, Jordan *et al*. (2004) found that a four-parameter Gompertz model produced a better fit than a two-parameter logistic model for detection of *Salmonella* as the concentration of organisms increased, using an immunomagnetic separation method followed by culture. The main objective of the study was to produce and parametrize a model for the sensitivity of pooled pen faecal sampling for *Salmonella* in pigs. The fit of the model to the pooled sample data (figure 2) shows that this has been achieved for this small-scale study, and that the use of a more complex model would not be justified for this study. These results will be valuable in the design and implementation of future studies of *Salmonella* infection in pigs, and crucial in interpreting current surveillance data.

This study has focussed on the sensitivity of pooled faecal samples for detection of *Salmonella* in pigs. However, pooled pen sampling is used for surveillance of *Salmonella* in other species (Kivela *et al*. 1998) and of other diseases (Wagner *et al*. 2002), and so the model presented here is likely to have wider applicability.

## Acknowledgments

We are grateful to Defra for funding this work and would like to thank our colleagues at the VLA for their assistance. Finally, this work depended upon the kind co-operation of the two farms involved.

## Footnotes

- Received February 4, 2005.
- Accepted June 1, 2005.

- © 2005 Crown Copyright