It is anticipated that the next generation of computational epidemic models will simulate both infectious disease transmission and dynamic human behaviour change. Individual agents within a simulation will not only infect one another, but will also have situational awareness and a decision algorithm that enables them to modify their behaviour. This paper develops such a model of behavioural response, presenting a mathematical interpretation of a well-known psychological model of individual decision making, the health belief model, suitable for incorporation within an agent-based disease-transmission model. We formalize the health belief model and demonstrate its application in modelling the prevalence of facemask use observed over the course of the 2003 Hong Kong SARS epidemic, a well-documented example of behaviour change in response to a disease outbreak.
In this paper, we lay a mathematical foundation for the representation of individual health decision making within agent-based simulations of contagious disease transmission and discuss the key questions and challenges in implementing this framework.
1. Introduction
A number of recent studies involving mathematical and computational epidemiological models have illustrated the substantial impact that individual and group behaviour have on the epidemic process. Most of these previous efforts either evaluate the immediate impact of public health policy on disease transmission [2,3] or retrospectively explain the impact of behaviour modification policies on historical outbreaks [4–6]. A few have begun to dynamically model individual decision making and study its effect on the course of an epidemic [7,8].
It is anticipated that the next generation of computational epidemic models will incorporate both infectious disease and dynamic behaviour change so that agents within a simulation will not only transmit infection to one another, but will also have situational awareness and a decision algorithm allowing them to make decisions regarding their behaviour [1,9]. Such models will require dynamic feedbacks between events in the simulated community, personal behaviour, and disease incidence, which together form the information upon which the agents can base their decisions. This paper outlines the groundwork for such a model, presenting a mathematical framework for implementing a well-known psychological model of individual decision making, the health belief model (HBM), into a classical agent-based disease transmission model. We formalize the HBM and demonstrate its applicability in modelling the prevalence of facemask use observed over the course of the 2003 Hong Kong SARS epidemic, a well-documented example of behaviour change in response to a disease outbreak.
The HBM, initially proposed by Irwin Rosenstock and subsequently expanded by others, is one of a number of models from the health psychology literature that represents individual behaviour as motivated by several core, well-defined constructs (belief variables) [11,12]. The original formulation of the HBM held that adoption of a health promoting behaviour was motivated by four constructs: (i) an individual's perceived susceptibility (likelihood of contracting the disease), (ii) perceived severity (the adverse consequences of contracting the disease), (iii) perceived barriers to behaviour adoption (the social, physical or psychological difficulties of adopting the behaviour), and (iv) perceived benefits of the behaviour (how effective the behaviour will be in protecting the individual). Other constructs such as cues to action (a trigger prompting the individual to take action) and self-efficacy (one's confidence in his or her ability to execute the behaviour) were added later, but in this paper, we focus on the four constructs of the original formulation.
Recent public belief surveys during the Hong Kong SARS epidemic, H5N1 influenza outbreak and H1N1 influenza pandemic found that behaviours such as facemask wearing, increased hand-washing, avoidance of crowded places and vaccination acceptance correlate significantly with some constructs of the HBM [13–15]. Other health behaviour models such as the theory of planned behaviour and the theory of reasoned action share some components of the HBM and represent directions in which our mathematical approach could be extended.
2. Model framework
The general behaviour decision algorithm within an agent-based simulation is represented in figure 1. Contextual information external to the agent, such as the lethality of the disease, its prevalence, and the media coverage, affects the agent's evaluation of the HBM constructs in the belief updating step. The agent then compares the numerical output of his personal HBM to a pre-specified threshold to determine whether he adopts the behaviour in the behaviour decision step. The feedback from behaviour change to contextual information reflects the effect of the individual's behaviour on the course of the epidemic, e.g. a substantial percentage of the population engaging in a health promoting behaviour might be sufficient to dampen the epidemic or end it altogether [17,18]. The specifics of the behaviour change algorithm will depend in large part on the nature of the disease and the behaviour under consideration.
3. Mathematical expression of health belief model
Several authors have suggested techniques for mathematically expressing the constructs of the HBM. These suggestions include linear regression, multiplicative models and logistic regression. The logistic framework has several advantages: (i) it is well-suited to binary classification; (ii) binary variables are reasonable descriptors of the construct states and are simple to model; and (iii) it is a standard convention in the public health literature to describe the relationships between belief constructs and behaviour with logistic models, facilitating model parametrization [15,23,24].
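The logistic model, written in terms of the odds ratios (OR) of behaviour associated with each HBM construct, takes the form:
$$p(\text{behaviour}) = \frac{\mathrm{OR}_0 \prod_{i=1}^{4} \mathrm{OR}_i^{\,x_i}}{1 + \mathrm{OR}_0 \prod_{i=1}^{4} \mathrm{OR}_i^{\,x_i}} \tag{3.1}$$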
Here, i = 1, … , 4 denotes the four HBM constructs. ORi indicates the relative odds of the behaviour when the corresponding belief construct is ‘high’ relative to when it is ‘low’. xi is a binary variable representing the state of the corresponding HBM construct, with a value of 1 indicating a ‘high’ state of the HBM construct and a value of 0 indicating a ‘low’ state. OR0 functions as a calibration constant by defining the probability of the behaviour when all xi variables are in the ‘low’ state.
Equation (3.1) gives a value p(behaviour) between 0 and 1. Using a binary classification model, behaviour is classified as ‘engages in behaviour’ if p(behaviour) ≥ 0.5 and as ‘does not engage in behaviour’ otherwise. Framing the HBM as proposed in equation (3.1) allows for a parsimonious representation of the probability of behaviour, with straightforward parametrization using standard logistic regression on public belief and behaviour survey data. By stratifying the ORs across different population segments (such as age or gender), individual or demographic heterogeneity may be programmed into the model. Individual variability can be represented by defining distributions across the ORs from which individuals sample upon model initiation.
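As an illustration of the decision rule described above, the following minimal Python sketch assumes the usual odds-ratio form of a logistic model, in which the odds of the behaviour are OR0 multiplied by ORi for every construct in the ‘high’ state. The function names and the numerical values are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
from math import prod


def behaviour_probability(or0, odds_ratios, states):
    """Probability of the behaviour under an odds-ratio logistic model.

    or0         -- baseline odds when every construct is 'low' (assumed form)
    odds_ratios -- one OR per HBM construct
    states      -- binary construct states x_i (1 = 'high', 0 = 'low')
    """
    if len(odds_ratios) != len(states):
        raise ValueError("need one state per odds ratio")
    odds = or0 * prod(o ** x for o, x in zip(odds_ratios, states))
    return odds / (1.0 + odds)


def engages(or0, odds_ratios, states, cutoff=0.5):
    """Binary classification: adopt the behaviour if p(behaviour) >= cutoff."""
    return behaviour_probability(or0, odds_ratios, states) >= cutoff


# Made-up illustrative values, not fitted parameters.
example_ors = [2.0, 3.0, 1.0, 4.0]
p_all_low = behaviour_probability(0.25, example_ors, [0, 0, 0, 0])
p_three_high = behaviour_probability(0.25, example_ors, [1, 1, 0, 1])
```

With all constructs ‘low’ the probability reduces to OR0 / (1 + OR0), which is the sense in which OR0 acts as a calibration constant.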
The logistic model represents the behaviour decision as a function of a set of measured states of the HBM constructs. To be useful in a dynamic simulation, these constructs must be responsive to changes within the simulation, denoted ‘contextual information’. Furthermore, the relationship between contextual information and the constructs of the HBM should be empirical. This is a fundamental challenge of behaviour modelling, as there has been very little research on the factors that influence an individual's HBM belief constructs. We propose some possible relationships between contextual information and the HBM constructs, but recognize the acute need for further research in this area.
3.1. Perceived susceptibility
The existence and prevalence of reported cases of an epidemic disease in the country of residence have been shown to correlate with higher personal perceived susceptibility to that illness [25–27]. Therefore, it is reasonable to hypothesize that both the most recent state and the cumulative history of the epidemic affect individual perceived susceptibility. Research in behavioural economics and risk perception has shown that the evaluation of a risk is determined in part by the perceived salience of the risk in the recent past, suggesting a cognitive discounting of less recently experienced risks. Additionally, the documented tendency of the media to report the cumulative number of illnesses during an epidemic, which may not accurately communicate a waning epidemic, lends support to the inclusion of the cumulative number of illnesses in the model for updating susceptibility.
Equation (3.2) presents a model for updating perceived susceptibility that captures these three components: recent prevalence of disease, cumulative epidemic history and discounting of past illnesses. This behaviour may be captured by defining ct as the number of new illnesses at time t and st as the weighted sum of ct over time. st is evaluated relative to the agent's threshold λ, a calibration parameter ideally chosen from data. When st ≥ λ, the agent perceives his susceptibility to be high; when st < λ, perceived susceptibility is low. To represent individual heterogeneity, small variations in individual values of λ can be introduced for each agent upon model initiation.
$$s_t = \delta\, s_{t-1} + c_{t-1}, \qquad \text{perceived susceptibility} = \begin{cases} \text{high}, & s_t \ge \lambda \\ \text{low}, & s_t < \lambda \end{cases} \tag{3.2}$$
In closed form, with s_0 = 0,
$$s_t = \sum_{k=0}^{t-1} \delta^{\,t-1-k}\, c_k \tag{3.3}$$
This equation defines a weighted sum over the history of the outbreak. 0 ≤ δ ≤ 1 is a time discounting rate signifying that the agent pays less attention to the number of illnesses at previous time steps and the most attention to the most recent status of the outbreak. For δ = 1, the entire history of the outbreak is weighted equally, and st represents the cumulative sum of the number of cases. For δ = 0, only yesterday's illnesses are remembered. An intermediate value of δ represents a combination of instantaneous and cumulative illness counts.
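The limiting cases above can be checked with a short Python sketch. It assumes the recursion s_t = δ·s_(t−1) + c_(t−1) with s_0 = 0, which is one form consistent with the description (δ = 1 gives the cumulative count, δ = 0 gives yesterday's count); the function names and the case counts are hypothetical.

```python
def discounted_case_sum(new_cases, delta):
    """Return [s_0, s_1, ..., s_T] where s_0 = 0 and
    s_t = delta * s_{t-1} + c_{t-1} (assumed recursion)."""
    s = [0.0]
    for c in new_cases:
        s.append(delta * s[-1] + c)
    return s


def susceptibility_high(s_t, lam):
    """Perceived susceptibility is 'high' when s_t reaches the threshold lam."""
    return s_t >= lam


cases = [1, 3, 5, 2]                              # made-up daily new illnesses
cumulative = discounted_case_sum(cases, 1.0)      # full memory
yesterday_only = discounted_case_sum(cases, 0.0)  # no memory
mixed = discounted_case_sum(cases, 0.5)           # intermediate discounting
```

In this toy series an agent with λ = 6 and δ = 0.5 would perceive high susceptibility on day 3 (s_3 = 6.75) and low susceptibility again on day 4 (s_4 = 5.375), whereas with δ = 1 the perception would remain high once the cumulative count passes the threshold.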
3.2. Perceived severity
3.2.1. Option 1: updating based on estimated lethality
Modelling perceived severity as a function of the ongoing official estimate of the case lethality rate of the disease may be suitable for situations in which a high-profile disease is concentrated in a relatively small geographical area. This was the case during the Hong Kong SARS outbreak, where the public closely followed the media and government reports and was well-informed of the current status of the disease and the number of fatalities [10,29,30]. In Hong Kong, the estimated case fatality rate of SARS was found to be a significant predictor of facemask usage.
Perceived severity is modelled in equation (3.4) by comparing the ratio of cumulative deaths to cumulative illnesses (as tracked within the simulation) to a threshold α. This definition of the case fatality ratio represents the dynamic information that is available to the public as an outbreak unfolds. The initial number of deaths may lag behind illnesses during the first days of an emerging disease. Our simple metric effectively captures this influence, allowing agents to respond to the most recent public knowledge of lethality, which may well change as the epidemic progresses and the ratio of deaths to illnesses stabilizes.
$$\text{perceived severity} = \begin{cases} \text{high}, & \dfrac{\text{cumulative deaths at time } t}{\text{cumulative illnesses at time } t} \ge \alpha \\ \text{low}, & \text{otherwise} \end{cases} \tag{3.4}$$
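A minimal Python sketch of this severity rule follows. It assumes that severity is ‘high’ when the running ratio of cumulative deaths to cumulative illnesses reaches α, and that the ratio is treated as zero before any illness has been recorded; both choices, the function names and the example series are illustrative assumptions.

```python
def severity_high(cum_deaths, cum_cases, alpha):
    """'High' perceived severity when the running case fatality ratio
    reaches the threshold alpha (assumed >= comparison)."""
    if cum_cases <= 0:
        return False  # assumption: no cases recorded yet means 'low'
    return cum_deaths / cum_cases >= alpha


def severity_series(new_cases, new_deaths, alpha):
    """Severity state on each day from daily incidence and death counts."""
    cum_cases = 0
    cum_deaths = 0
    states = []
    for c, d in zip(new_cases, new_deaths):
        cum_cases += c
        cum_deaths += d
        states.append(severity_high(cum_deaths, cum_cases, alpha))
    return states


# Made-up series in which deaths lag behind illnesses.
states = severity_series([10, 20, 30, 10], [0, 1, 4, 6], alpha=0.05)
```

Because deaths lag behind illnesses in this toy series, the ratio starts near zero and crosses the threshold only on the third day, which is the lag effect the text describes.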
3.2.2. Option 2: updating perceived severity based on news coverage
There is substantial empirical support for a relationship between the frequency of news coverage surrounding a risk and the perceived severity of that risk among the public [31–34]. Shih et al. found that media coverage of epidemics is largely event-oriented with spikes in coverage being prompted by factors such as reports of new cases, the announcement of government policies and new scientific findings.
For this option, perceived severity is modelled as a function of the number of news stories on day t of the epidemic, where the parameter τ is the threshold of high severity:
$$\text{perceived severity} = \begin{cases} \text{high}, & \text{number of news stories on day } t \ge \tau \\ \text{low}, & \text{otherwise} \end{cases} \tag{3.5}$$
For modelling epidemics in the recent past, actual time series of news articles can be retrieved from the web using tools such as Google News (http://news.google.com/). For prospective modelling, it is necessary to simulate the news coverage endogenously. Following Wei et al., the number of news stories can be modelled as a damped exponential form with two parameters, b0 and b1. y(t), shown in equation (3.6), gives the cumulative number of news stories t days after some initiating event, while its time derivative (equation (3.7)) approximates the number of stories on day t.
(3.6) and (3.7)
Wei et al. provide guidelines for parametrizing the news model from the observed or expected cumulative quantity of coverage , and the number of stories on the first day after the event and yielding , . They showed this model to be applicable to news coverage of a wide range of disasters, including the 2003 SARS outbreak in Guangzhou, China. By adding together several copies of the media model under different parametrizations, we expand the framework to allow the model to include additional peaks (equations (3.8) and (3.9)) as was observed for the 2003 SARS outbreak. The constant parameters ta and tb (in units of days) are offsets for the ‘start time’ of the respective news cycles. tb might be set several months into the future, representing a spike in media coverage coincident with vaccine development or a second infectious wave.
(3.8) and (3.9)
To demonstrate this approach, we collected news stories on the 2003 Hong Kong SARS outbreak using Google News searches for keywords ‘Hong Kong SARS’. Figure 2 shows the best fit to equation (3.9), with the relevant parameters given in table 1. The second peak in figure 2, occurring in December 2003, may have been precipitated by concurrent suspected SARS and H5N1 influenza cases in Hong Kong and China in addition to progress made towards a SARS vaccine [37,38]. This option would be useful if, for example, a simulation included a lag between the detection of an outbreak and the release of a suitable vaccine.
A limitation of this approach is the lack of ability to distinguish between positive and negative media coverage. However, several studies have demonstrated that, while negative media coverage outweighs positive coverage in its effect on risk perception, the total quantity of coverage also has a significant effect through the accessibility heuristic: the salience of a risk in the daily media greatly outweighs its objective danger in influencing individual risk perception [31,33].
3.3. Perceived benefits
Although perceived benefits of health-protective behaviours are a well-documented determinant of behaviour adoption [15,23,24], it is unclear how likely they are to change over the course of an epidemic. Perceived benefits have been shown to respond positively to concerted governmental information campaigns in some contexts [39,40]. Yet in other situations, perceived benefits may not be strongly affected by any dynamic influence during the epidemic. During the 2003 Hong Kong SARS outbreak, for example, numerous government messages endorsed the use of facemasks, yet over this time, there was no measurable change in the population's opinions of the benefits of wearing masks. During the 2009 H1N1 outbreak in the United Kingdom, researchers surveyed respondents about health perceptions and behaviours, and recorded whether each respondent had read a recently distributed government leaflet describing personal protective behaviour. They found no significant predictive effect of having read the document on having adopted a health-protective behaviour.
Perceived benefits are thus context-dependent, and their inclusion in a behaviour model should be determined on a case-by-case basis. For facemask wearing during the 2003 SARS outbreak, we did not find evidence of any significant change in the population's perceived benefits over the course of the Hong Kong outbreak. For that reason, in this paper, we treat perceived benefits as a static variable that is not updated and therefore has no influence on the decision to wear a facemask (ORben = 1.0). If evidence warranted updating perceived benefits with media coverage or governmental announcements, we suggest a threshold formulation similar to that proposed for perceived susceptibility.
3.4. Perceived barriers
Perceived barriers may be composed of multiple factors, not all of which are usually explored in public opinion studies. Two such factors, price and availability, may present barriers in some contexts. However, during the Hong Kong SARS outbreak, neither factor was found to present a significant obstacle to facemask wearing.
A possible factor affecting perceived barriers of facemasks is the social acceptability of wearing one in public. Although this has not been extensively studied, there are anecdotal reports suggesting that social stigma is a primary barrier to facemask wearing. The reverse was true during the Hong Kong SARS epidemic, when some survey respondents cited fear of discrimination if they chose not to wear a facemask in public. For example, a published report of two researchers' travel experiences during the SARS epidemic observed a number of fellow airline passengers on a multi-stop flight purchasing and wearing facemasks after having observed a high level of mask usage in the Bangkok airport.
We propose modelling barriers to facemask wearing as a function of the number of individuals the agent observes wearing a mask. This could be taken as a global count of facemask prevalence, where the agent is aware, via media coverage, of the overall prevalence of facemasks in the simulated society. Alternatively, the agent's social network could come into play:
$$\text{perceived barriers}_i = \begin{cases} \text{low}, & \sum_{j} \gamma_{ij}\, \phi(j) > \eta \\ \text{high}, & \text{otherwise} \end{cases} \tag{3.10}$$
The perceived barriers experienced by individual i are scored ‘low’ (corresponding to a higher likelihood of wearing a mask), if the weighted sum of the members of an agent's social network wearing a facemask exceeds a threshold η. Here, the sum runs over the members j of the social network of individual i, and the weight γij signifies the ‘social influence’ of individual j upon individual i. Two individuals who are family or close friends will have a higher γ coefficient than would two individuals who are relative strangers. The function ϕ(j) equals 1 if individual j wears a facemask, and equals 0 otherwise. The parameter η represents the threshold at which the individual's perceived barriers change from high to low.
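The network version of the barrier rule can be sketched in a few lines of Python. The data layout (a dictionary of neighbour weights per agent), the function name and the weights are hypothetical; the rule itself follows the description above, with barriers scored ‘low’ when the weighted sum of mask-wearing contacts exceeds η.

```python
def barriers_low(agent, wearing, influence, eta):
    """True if the influence-weighted count of mask-wearing contacts
    of `agent` exceeds the threshold eta.

    wearing   -- dict: individual -> True if wearing a facemask (phi(j) = 1)
    influence -- dict: agent -> {contact: gamma_ij weight}
    """
    total = sum(w for j, w in influence[agent].items() if wearing[j])
    return total > eta


# Made-up network: one close contact and two relative strangers.
influence = {"i": {"sibling": 1.0, "stranger_1": 0.2, "stranger_2": 0.2}}
wearing = {"sibling": False, "stranger_1": True, "stranger_2": True}

low_without_sibling = barriers_low("i", wearing, influence, eta=0.5)
wearing["sibling"] = True
low_with_sibling = barriers_low("i", wearing, influence, eta=0.5)
```

In this example two mask-wearing strangers are not enough to lower the agent's barriers, but a single close contact is, which is the effect the γ weights are meant to capture. Setting every weight to 1 recovers a plain count of mask wearers.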
4. Model of facemask wearing during the Hong Kong SARS outbreak
During the 2003 global SARS outbreak, Hong Kong emerged as the epidemiological and media focus of the disease. SARS permeated every aspect of life in the region. Facemask wearing became ubiquitous in Hong Kong during the first weeks of the epidemic and through its decline. This was documented in a longitudinal survey that recorded facemask wearing increasing from 10–15% in late March 2003 to 85–90% in mid-April.
The dynamics of facemask wearing in Hong Kong provide us with a case study for exploring model parameters. We developed an agent-based behaviour model in C++ and calibrated it to Hong Kong facemask wearing. The model simulation was run for 60 daily time steps from 18 March 2003 to 16 May 2003, with each of 10 000 agents making a daily decision, using the algorithm outlined in this paper, whether to wear a facemask that day. The order in which agents acted was randomized at each time step to approximate synchronous agent action.
In this exploratory analysis, we made three simplifications. (i) Perceived severity was determined by the dynamic case lethality of the disease (rather than the intensity of media coverage). (ii) We did not include a dynamic epidemic simulation, instead reading in World Health Organization records of Hong Kong SARS incidence and fatality from a file. Agents within the model made behaviour decisions in direct response to the historically recorded events. This allowed us to decouple the behavioural and epidemic processes and explore behaviour adoption in isolation. (iii) Network effects were disabled; each agent calculated his perceived barriers by observing the total number of individuals in the simulation wearing a facemask, rather than just members of the agent's own social network (i.e. all γij = 1 in equation (3.10)).
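The daily loop described in this section might look roughly like the Python sketch below. It is not the authors' C++ implementation. It assumes the odds-ratio logistic form for p(behaviour), the discounted recursion for perceived susceptibility, information lagged by one day, a barrier construct coded 1 when barriers are ‘high’ (so its odds ratio would be below 1), and a barrier count that excludes the agent himself. All dictionary keys and parameter values are hypothetical.

```python
import random


def simulate(new_cases, new_deaths, agents, seed=0):
    """Return daily facemask prevalence for externally supplied incidence."""
    rng = random.Random(seed)
    n = len(agents)
    wearing = [False] * n
    s = [0.0] * n                      # per-agent discounted case sums
    cum_cases = 0
    cum_deaths = 0
    prevalence = []
    for t in range(len(new_cases)):
        if t > 0:                      # agents see yesterday's records
            cum_cases += new_cases[t - 1]
            cum_deaths += new_deaths[t - 1]
        cfr = cum_deaths / cum_cases if cum_cases > 0 else 0.0
        order = list(range(n))
        rng.shuffle(order)             # randomized acting order each day
        n_wearing = sum(wearing)
        for i in order:
            a = agents[i]
            if t > 0:
                s[i] = a["delta"] * s[i] + new_cases[t - 1]
            x_sus = int(s[i] >= a["lam"])
            x_sev = int(cfr >= a["alpha"])
            others = n_wearing - int(wearing[i])
            x_bar = int(not others > a["eta"])   # 1 = barriers 'high'
            odds = (a["or0"] * a["or_sus"] ** x_sus
                    * a["or_sev"] ** x_sev * a["or_bar"] ** x_bar)
            decision = odds / (1.0 + odds) >= 0.5
            n_wearing += int(decision) - int(wearing[i])
            wearing[i] = decision
        prevalence.append(n_wearing / n)
    return prevalence


# Toy population: one agent never feels a barrier (eta = -1), three need
# to see at least one other mask wearer (eta = 0). Values are made up.
base = dict(or0=0.5, or_sus=4.0, or_sev=1.0, or_bar=0.25,
            delta=1.0, lam=5.0, alpha=1.0, eta=0.0)
toy_agents = [dict(base, eta=-1.0)] + [dict(base) for _ in range(3)]
prev = simulate([3, 3, 0, 0], [0, 0, 0, 0], toy_agents)
```

In the toy run nobody wears a mask until the discounted case sum reaches λ on the third simulated day; the barrier-free agent adopts first, and the remaining agents follow once they observe him, which illustrates the cascade that the barrier threshold produces.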
4.1. Model parametrization
The two HBM constructs for which data were available from survey results were perceived susceptibility and perceived severity. Mean and standard deviation for both ORs were estimated across several values from the literature [15,23,24] using the one-factor fixed effects model. The remaining parameters of the model (OR0, ORbar, δ, λ, α, η) are calibration parameters for which literature estimates were unavailable. ORben was set to 1.0 in accordance with our assumption that perceived benefits were not correlated with mask wearing. Agent heterogeneity was incorporated by allowing each individual parameter to vary across a distribution; a mean and standard deviation were specified for every parameter of interest, from which agents sampled their values using log-normal distributions for the OR and normal distributions for δ, λ, α and η. Negative random samples or, for δ, those that exceeded 1, were re-sampled.
Although the data on facemask wearing do not extend beyond 12 May 2003, it is clear anecdotally that facemask wearing eventually declined later in the year. We do not extend our model to these later dates for lack of data, but remark that the model is capable of capturing declining dynamics with properly chosen parameters. As the quantity of media coverage and new SARS cases decline, those agents more sensitive to the susceptibility and severity influences will be the first to remove their facemasks, thereby decreasing the overall prevalence of facemask wearing in the community and decreasing the barriers threshold for other agents.
By carefully examining the implications of different values for the parameters δ, λ, α and η, and by evaluating equation (3.1) under various HBM belief states, we estimated plausible parameter ranges (table 2) across each parameter and searched the parameter space to estimate the best least-squares fit between our model and the recorded prevalence of facemask wearing during the epidemic. The high dimensionality and non-convexity of the search space precluded an exhaustive parameter search; instead, we focused on what we considered to be a reasonable search range. The best fit in our search space is shown in figure 3 over a 10 000 iteration Monte Carlo simulation.
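The sampling scheme described above can be sketched in Python as follows. The re-sampling of out-of-range normal draws follows the text; the way the log-normal is parametrized (matching the stated mean and standard deviation on the natural scale) is an assumption, since the text does not say on which scale the moments were specified. Function names and numerical values are hypothetical.

```python
import math
import random


def sample_truncated_normal(rng, mean, sd, lower=0.0, upper=None):
    """Draw from a normal distribution, re-sampling any draw that falls
    below `lower` or (if given) above `upper`."""
    while True:
        value = rng.gauss(mean, sd)
        if value < lower:
            continue
        if upper is not None and value > upper:
            continue
        return value


def sample_odds_ratio(rng, mean, sd):
    """Draw an OR from a log-normal whose arithmetic mean and standard
    deviation equal `mean` and `sd` (assumed moment matching)."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return rng.lognormvariate(mu, math.sqrt(sigma2))


rng = random.Random(42)
# delta must lie in [0, 1]; the thresholds would only need to be non-negative.
deltas = [sample_truncated_normal(rng, 0.9, 0.2, lower=0.0, upper=1.0)
          for _ in range(1000)]
odds_ratios = [sample_odds_ratio(rng, 2.0, 0.5) for _ in range(20000)]
```

Note that re-sampling truncates the normal distribution, so when the mean lies close to a bound (as for a high δ) the realized mean of the sampled values will differ somewhat from the nominal mean.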
Table 2 lists the parameter distribution moments producing figure 3. The tightness of the Monte Carlo simulation is attributable to its threshold nature, which makes the model sensitive only to parameter deviations that shift an agent's calculation of equation (3.1) above or below the 0.5 threshold. Construct updates that do not result in crossing the p(behaviour) threshold have no effect on the agent's behaviour decision.
4.2. Sensitivity analysis
The sensitivity of the model to individual parameters is shown in table 3. Here, the sum of squared residuals (SSR) resulting from varying the mean of each model parameter by plus and minus 10 per cent of its optimal value is shown.
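A one-at-a-time sensitivity table of this kind can be produced with a short routine such as the Python sketch below. The `model` argument stands for any function that maps a parameter dictionary to a predicted prevalence series; the linear toy model and its parameter names are placeholders used only to make the example runnable, not the behaviour model of this paper.

```python
def ssr(predicted, observed):
    """Sum of squared residuals between two equally long series."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def one_at_a_time_sensitivity(model, best_params, observed, rel_change=0.10):
    """SSR obtained when each parameter mean is moved down and up by
    `rel_change` of its best-fit value, all others held fixed."""
    table = {}
    for name, value in best_params.items():
        row = {}
        for label, factor in (("minus", 1.0 - rel_change),
                              ("plus", 1.0 + rel_change)):
            perturbed = dict(best_params)
            perturbed[name] = value * factor
            row[label] = ssr(model(perturbed), observed)
        table[name] = row
    return table


def toy_model(params):
    """Placeholder model: a straight line evaluated at t = 0, 1, 2."""
    return [params["a"] * t + params["b"] for t in range(3)]


best = {"a": 1.0, "b": 2.0}
observed = toy_model(best)          # pretend the best fit is exact
table = one_at_a_time_sensitivity(toy_model, best, observed)
```

For a stochastic agent-based model, `model` would typically return the prevalence averaged over repeated runs, so that differences in SSR reflect the parameter change rather than sampling noise.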
The parameter δ, which calibrates perceived susceptibility by specifying the relative weight of cumulative and instantaneous disease cases (equation (3.2)), has by far the greatest influence on model behaviour. Figure 4 shows the resulting model fit upon decreasing δ by 10 per cent of its optimal value, making the agents much more sensitive to the instantaneous disease count and less so to the cumulative epidemic history. This figure reveals the importance of the cumulative epidemic history in our model's ability to accurately reproduce the observed facemask dynamics. Under the optimal fit, agents have a long memory of the SARS epidemic (a high δ), a finding that is consistent with the prolonged ubiquity of SARS in the Hong Kong public consciousness following the outbreak [45,46].
5. Discussion
This work represents the first demonstration of a quantitative HBM suitable for incorporation in an agent-based epidemic simulation. The HBM has its share of skeptics [47–49], but for the purpose of this paper, establishing a mathematical framework for modelling decision making, the four-construct HBM presents a reasonable starting point for future work. Ultimately, inclusion of behavioural decision making into agent-based simulations requires a compromise between psychological realism and model tractability.
Table 3 and figure 4 demonstrate the sensitivity of the model to δ, and hence to the specific manner in which agents evaluate the relative weight of cumulative and instantaneous disease counts in making their decisions. Any future work building upon our framework should carefully consider the implications of this weighting, and parametrize it empirically to the extent possible.
An important limitation of our SARS analysis is the decoupling of behaviour and disease dynamics. We imposed this constraint in order to study behaviour parametrization in isolation, but recognize the significant role that feedback between these two processes could play. Dynamic interaction between behaviour and disease has been explored in other contexts and represents an important direction in which our model could be extended [8,17].
The HBM is an imperfect model of decision making. Emotion has been cited as a significant determinant of behaviour during recent epidemics, yet does not fit well into the HBM framework [29,50,51]. The HBM, which assumes a rational actor unswayed by emotion, may be less appropriate for situations that violate these assumptions. Emotional and affective factors are increasingly recognized as a primary determinant of behaviour. The presence of visceral cues, which might include the fear of an unpredictable and poorly understood pathogen, has been shown to have a significant effect on decision making [53–55].
The least well understood element of our framework is the belief updating component. Although the correlations between protective health behaviour and the HBM constructs are well documented, there has not been significant research into how these psychological constructs dynamically respond to external influences during an epidemic. Research is needed into the relationship between external contextual influences and internal perceptions, especially those external influences that may be explicitly represented within an agent-based simulation, such as disease prevalence, network effects, information flows, neighbour actions and beliefs, and government and media messages. There is an acute need for empirically validated relationships between such drivers and the model constructs.
This paper provides a guide for further empirical research. For example, the form and parametrization of equations (3.2) and (3.3) (perceived susceptibility) and equations (3.4) and (3.5) (perceived severity) should be closely examined through a carefully designed study conducted during an epidemic. Longitudinal public health surveys repeated over the course of a high-publicity public health event can be designed to support calibration, verification of model output and exploration of the relationships between HBM constructs and temporal contextual factors, such as policy messages, media focus and case-reporting metrics. An empirical study tailored to parametrize our representation of the HBM could be optimized by taking into account the model sensitivity discussed in table 3, to closely focus on those determinants (the susceptibility threshold and discounting) predicted to most strongly affect behaviour.
It is important to stress the value of conducting parametrization research during an ongoing epidemic, rather than prospectively studying participants' anticipated behaviours. Research in behavioural economics and psychology has consistently shown the significant inaccuracy with which individuals predict future behaviour, both in a health behaviour context and in general [56,57]. Care must be taken in using such hypothetical behaviour predictions in an epidemic disease and behaviour model. This limitation makes parametrization of our framework more empirically challenging, and underscores the need for researchers to prepare in advance for, and move quickly during, an event such as the H1N1 pandemic or the SARS outbreak.
Several alternatives to the HBM include the theory of reasoned action, the theory of planned behaviour, and the transtheoretical model. A modelling approach such as ours could be used to compare the predictive power of these different decision models in replicating observed health behaviour trends. By carefully defining and parametrizing drivers of the model constructs, correlating the constructs with behaviour, and evaluating the ability of the model to reproduce observed behavioural trends, these different formulations of health behaviour can be rigorously evaluated for their utility in health behaviour modelling.
We are grateful to Benoît Morel, Steven Albert and Joshua Epstein for their helpful comments. We also wish to thank Donald Burke for suggesting this topic and providing financial support. We acknowledge the Master's thesis of John Kraemer, which served as inspiration for this research. This research was funded by the NSF Graduate Research Fellowship Programme, Carnegie Mellon University, Department of Engineering and Public Policy and the University of Pittsburgh School of Public Health. The content of this manuscript does not necessarily represent the views of the funders.
- Received May 25, 2011.
- Accepted June 30, 2011.
- This Journal is © 2011 The Royal Society