Comparing Agent-Based and Differential Equation Models


Heterogeneity and Network Structure in the Dynamics of Diffusion: Comparing Agent-Based and Differential Equation Models

Hazhir Rahmandad

Virginia Tech, Falls Church, VA 22043

John Sterman

MIT Sloan School of Management, Cambridge MA 02142

Revision of August 2007

Forthcoming

Management Science

We thank Reka Albert, Joshua Epstein, Rosanna Garcia, Ed Kaplan, David Krackhardt, Marc Lipsitch, Nelson Repenning, Perwez Shahabuddin, Steve Strogatz, Duncan Watts, Larry Wein, members of the MIDAS Group and 2006 MIDAS workshop, the associate editor and referees, and seminar participants at MIT, the 2004 NAACSOS conference and 2004 International System Dynamics Conference for helpful comments. Ventana Systems and XJ Technologies generously provided their simulation software and technical support. Financial support provided by the Project on Innovation in Markets and Organizations at the MIT Sloan School.



Abstract

When is it better to use agent-based (AB) models, and when should differential equation (DE) models be used? Where DE models assume homogeneity and perfect mixing within compartments, AB models can capture heterogeneity across individuals and in the network of interactions among them. AB models relax aggregation assumptions but entail computational and cognitive costs that may limit sensitivity analysis and model scope. Because resources are limited, the costs and benefits of such disaggregation should guide the choice of models for policy analysis. Using contagious disease as an example, we contrast the dynamics of a stochastic AB model with those of the analogous deterministic compartment DE model. We examine the impact of individual heterogeneity and different network topologies, including fully connected, random, Watts-Strogatz small world, scale-free, and lattice networks. Obviously, deterministic models yield a single trajectory for each parameter set, while stochastic models yield a distribution of outcomes. More interestingly, the DE and mean AB dynamics differ for several metrics relevant to public health, including diffusion speed, peak load on health services infrastructure and total disease burden. The response of the models to policies can also differ even when their base case behavior is similar. In some conditions, however, these differences in means are small compared to variability caused by stochastic events, parameter uncertainty and model boundary. We discuss implications for the choice among model types, focusing on policy design. The results apply beyond epidemiology: from innovation adoption to financial panics, many important social phenomena involve analogous processes of diffusion and social contagion.

Keywords: Agent Based Models, Networks, Scale free, Small world, Heterogeneity, Epidemiology, Simulation, System Dynamics, Complex Adaptive Systems, SEIR model


Spurred by growing computational power, agent-based modeling (AB) is increasingly applied to physical, biological, social and economic problems previously modeled with nonlinear differential equations (DE). Both approaches have yielded important insights. In the social sciences, agent models explore phenomena from the emergence of segregation to organizational evolution to market dynamics (Schelling 1978; Levinthal and March 1981; Carley 1992; Axelrod 1997; Lomi and Larsen 2001; Axtell, Axelrod, Epstein and Cohen 2002; Epstein 2006; Tesfatsion 2002). Differential and difference equation models, also known as compartmental models, have an even longer history in social science, including innovation diffusion (Mahajan, Muller and Wind 2000) and epidemiology (Anderson and May 1991).

When should AB models be used, and when are DE models appropriate? Each method has strengths and weaknesses. The importance of each depends on the model purpose. Nonlinear DE models can easily encompass a wide range of feedback effects, but typically aggregate agents into a relatively small number of states (compartments). For example, innovation diffusion models may aggregate the population into categories including unaware, aware, in the market, adopters, and so on (Urban, Hauser and Roberts 1990; Mahajan et al. 2000). However, within each compartment people are assumed to be homogeneous and well mixed; the transitions among states are modeled as their expected value, possibly perturbed by random events. In contrast, AB models can readily include heterogeneity in individual attributes and in the network structure of their interactions; like DE models, AB models can be deterministic or stochastic and can capture feedback effects.

The granularity of AB models comes at some cost. First, the extra complexity significantly increases computational requirements, constraining the ability to conduct sensitivity analysis. A second cost of agent-level detail is the cognitive burden of understanding model behavior. Linking the behavior of a model to its structure becomes more difficult as model complexity grows. Finally, limited time and resources force modelers to trade off disaggregate detail and the breadth of the model boundary. Model boundary here stands for the richness of the feedback structure captured endogenously in the model (Meadows and Robinson 1985, Sterman 2000). For example, an agent-based demographic model may portray each individual separately but assume exogenous fertility and mortality; such a model has a narrow boundary. In contrast, an aggregate model may lump the entire population into a single compartment, but model fertility and mortality as functions of food per capita, health care, pollution, norms for family size, etc., each of which, in turn, is modeled endogenously; such a model has a broad boundary. DE and AB models may in principle fall anywhere on these dimensions of disaggregation and scope. In particular, there is no intrinsic limitation that prevents AB models from incorporating behavioral feedback effects or encompassing a broad model boundary. In practice, however, where time, budget, and computational resources are limited, modelers must trade off disaggregate detail and breadth of boundary. Choosing wisely is central in selecting appropriate methods for any problem.

The stakes are large. Consider potential bioterror attacks. Kaplan, Craft, and Wein (2002) used a deterministic nonlinear DE model to examine a smallpox attack in a large city, comparing mass vaccination (MV), in which essentially all people are vaccinated after an attack, to targeted vaccination (TV), in which health officials trace and immunize those contacted by potentially infectious individuals. Capturing vaccination capacity and logistics explicitly, they conclude MV significantly reduces casualties relative to TV. In contrast, using different AB models, Eubank et al. (2004) and Halloran et al. (2002) conclude TV is superior, while Epstein et al. (2004) favor a hybrid strategy. The many differences among these models make it difficult to determine whether the conflicting conclusions arise from relaxing the perfect mixing and homogeneity assumptions of the DE (as argued by Halloran et al. 2002) or from other assumptions such as the size of the population (ranging from 10 million for the DE model to 2000 in Halloran et al. to 800 in Epstein et al.), other parameters, or boundary differences such as whether capacity constraints on immunization are included (Koopman 2002; Ferguson et al. 2003; Kaplan and Wein 2003). Kaplan and Wein (2003) and Kaplan, Craft and Wein (2003) show that their DE model closely replicates the Halloran et al. AB results when simulated with the Halloran et al. parameters, including vaccination rates, population and initial attack size, concluding that parameterization accounts for the different conclusions, not differences in mixing and homogeneity.

Here we carry out controlled experiments to compare AB and DE models in the context of contagious disease. We choose disease diffusion for four reasons. First, the dynamics of contagion involve important characteristics of complex systems, including positive and negative feedbacks, time delays, nonlinearities, stochastic events, and individual heterogeneity. Second, network topologies linking individuals are important in the diffusion process (Davis 1991; Watts and Strogatz 1998; Barabasi 2002; Rogers 2003), providing a strong test for differences between the two approaches. Third, the DE paradigm is well developed in epidemiology (for reviews see Anderson and May 1991 and Hethcote 2000); AB models also have a long history (e.g. Abbey 1952) and have recently gained momentum (for reviews see Newman 2002, 2003 and Watts 2004).

Finally, diffusion is a fundamental process in diverse physical, biological, social and economic settings. Many diffusion phenomena in human systems involve processes of social contagion analogous to infectious disease, including word of mouth, imitation and network externalities. From the diffusion of innovations to rumors, financial panics and riots, contagion-like dynamics, and formal models of them, have a rich history in the social sciences (Bass 1969; Watts and Strogatz 1998; Mahajan et al. 2000; Barabasi 2002; Rogers 2003). Insights into the advantages and disadvantages of AB and DE models in epidemiology can inform understanding of diffusion in many domains of concern to social scientists and managers.

Our procedure is as follows. We develop a stochastic AB version of the classic SEIR model, a widely used nonlinear deterministic DE model. The DE divides the population into four compartments: Susceptible (S), Exposed (E), Infected (I), and Removed (R). In the AB model, each individual is separately represented and must be in one of these four states. Both the AB and DE models use the same parameters. Therefore any differences in outcomes arise only from relaxing the restrictive assumptions of the DE model. In practice, DE modelers add compartments to capture heterogeneity in individuals and their contact networks, for example, disaggregating by biological or behavioral attributes (e.g., differences in age or contact frequencies), or by location (as in patch models; see e.g. Riley 2007). Here we use the classic SEIR model to maximize potential differences between the two approaches. We run the AB model under five widely used network topologies (fully connected, random, small world, scale-free, and lattice) and test each with homogeneous and heterogeneous individuals. We compare the resulting diffusion dynamics on a variety of metrics relevant to public health, including cumulative cases, peak prevalence, and the speed with which the disease spreads (the time available for health officials to respond).

The most obvious difference between the models we compare is that, for given parameters, the stochastic AB model generates a distribution of outcomes, while the deterministic DE generates a single path representing the expected trajectory under the mean-field approximation for contacts between infectious and susceptible individuals. Further, due to chance events, the epidemic never takes off in some realizations of the stochastic model. Deterministic models, whether DE or AB, cannot generate this mode of behavior. Capturing outcome variability in the DE paradigm requires moving to a stochastic compartment model, an intermediate method between deterministic models and the full stochastic AB representation. More interesting are differences due to network topology and individual heterogeneity. On average, diffusion slows as contact networks become more tightly clustered compared to the DE. On average, heterogeneity accelerates the initial take-off, as highly connected individuals quickly spread the disease, but reduces overall diffusion as these same individuals quickly exit the infectious pool.

In a second set of tests, we also examine the ability of the DE model to capture the dynamics of each network structure in the realistic situation where parameters are poorly constrained by biological and clinical data. Epidemiologists often estimate potential diffusion, for both novel and established pathogens, by fitting models to the aggregate data as an outbreak unfolds (Dye and Gay 2003; Lipsitch et al. 2003; Riley et al. 2003). Calibration of innovation diffusion and new product marketing models is similar (Mahajan et al. 2000). We mimic this practice by treating the AB simulations as the “real world” and fitting the DE model to them. On average, the fitted models closely match the individual AB realizations under all network topologies and heterogeneity conditions tested. However, the estimated parameters are biased in the highly clustered and heterogeneous cases. Further, the ability to fit such data does not imply that the AB and calibrated DE models will respond to policy interventions in the same way, demanding caution in their use. When different models yield different inferences about policies it is important to identify the assumptions responsible to guide data collection, to improve the models and to select the most appropriate model for the purpose at hand.

The implications of the differences across models depend on the purpose of the analysis. Here we focus on the policy context. Policymakers face a world of time pressure, inadequate data and limited knowledge of parameters such as pathogen virulence, transmissibility, incubation latency, treatment efficacy, etc. Further, the appropriate boundary for analysis is often unclear: resources for vaccination and treatment may be limited; an outbreak, whether natural or triggered by bioterror, may alter the behavior of the public and first-responders, endogenously disrupting the contact networks that feed back to condition disease spread through processes of risk amplification and attenuation (Kasperson et al. 1988, Glass and Schoch-Spana 2002). Hence we consider whether the differences in mean behavior between DE and AB models are large relative to the uncertainties policymakers face. We also consider how these differences in mean behavior might affect the assessment of the costs and benefits, and hence the optimality, of policies.

The mean behavior of different models may be significantly different in the statistical sense, yet be small relative to the variation in output caused by uncertainty about parameters, model boundary, and stochastic events (McCloskey and Ziliak 1996). For example, consider the variability in outcomes generated by a stochastic AB model. Each realization of the model will differ: some exhibiting fast diffusion, some slow; some with many individuals afflicted, some with fewer, depending on the chance nature of contacts between infectious and susceptible individuals. An ensemble of many simulations generates the distribution of possible epidemics, but only one will be observed in a particular outbreak. Several questions may now be asked.

One important question is whether the expected values of key metrics differ in different models. For example, does the mean value of peak prevalence under a scale-free network differ from the value generated by the corresponding deterministic compartment model? By running a sufficiently large number of simulations, sampling error can be made arbitrarily small, and any differences in the mean behavior of the models will be highly statistically significant.

Another question is whether the differences among means are significant from the point of view of policymakers seeking appropriate responses to a potential outbreak. Models with similar “base case” behavior can have similar or different responses to policies, and, conversely, models with different base behaviors may nevertheless yield the same inferences about policy impacts. Differences in policy response across models can be statistically significant yet small relative to uncertainty in parameters, network structure, individual attributes, and model boundary. Policymakers must assess the practical significance of each model assumption given the likely range of outcomes generated by all sources of uncertainty, not only uncertainty caused by random events.

The results document a number of differences between the DE and mean AB dynamics. The results, for both the base-case and calibrated DE models, also show that the differences between the deterministic compartment model, with its assumptions of homogeneous individuals and perfect mixing, and the mean behavior of the stochastic AB models are often small compared to the variability in AB outcomes caused by chance encounters among individuals, at least for the public health metrics examined here. However, cost/benefit assessments of policy interventions, and hence the optimal policy, can depend on network structure and model boundary, underscoring the importance of sensitivity analysis across these dimensions.

The next section reviews the literature comparing AB and DE approaches. We then describe the models, the design of the simulation experiments, and results, closing with implications and directions for future work.

A spectrum of aggregation assumptions: AB and DE models should be viewed as regions in a space of modeling assumptions, not as incompatible modeling paradigms. Aggregation is one dimension of that space. Models can range from lumped deterministic differential equations (also called deterministic compartmental models), to stochastic compartmental models, in which the continuous variables of the DE are replaced by counts of discrete individuals, to event history models, where the states of individuals are tracked but their network of relationships is ignored, to models with explicit contact networks linking individuals (e.g., Koopman et al. 2001; Riley 2007).

A few studies compare AB and DE models. Axtell et al. (1996) call for “model alignment” or “docking” and illustrate with the Sugarscape model. Edwards et al. (2003) contrast an AB model of innovation diffusion with an aggregate model, finding that the two can diverge when multiple attractors exist in the deterministic model. In epidemiology, Jacquez and O'Neill (1991) and Jacquez and Simon (1993) analyze the effects of stochasticity in individual-level SIS and SI models, finding some differences in mean behavior for small populations. However, the differences practically vanish for homogeneous populations above 100. Similarly, Gani and Yakowitz (1995) examine deterministic approximations to stochastic disease diffusion processes, and find a high correspondence between the two for larger populations. Greenhalgh and Lewis (2001) compare a stochastic model with the deterministic DE version in the case of AIDS spread through needle-sharing, and find similar behavior for those cases in which the epidemic takes off.

Heterogeneity has also been explored in models with different mixing sites for population subgroups. Anderson and May (1991, Chapter 12) show that the immunization fraction required to quench an epidemic rises with heterogeneity if immunization is implemented uniformly but falls if those most likely to transmit the pathogen are the focus of immunization. Ball et al. (1997) and Koopman et al. (2002) find expressions for cumulative cases and epidemic thresholds in stochastic SIR and SIS models with global and local contacts, finding that the behavior of deterministic and stochastic DE models can diverge for small populations, low basic reproduction numbers (R0), or highly clustered contact networks where transmission occurs in mixing sites such as schools and offices. Keeling (1999) formulates a DE model that approximates the effects of spatial structure when networks are highly clustered. Chen et al. (2004) develop AB models of smallpox, finding the dynamics generally consistent with DE models. In sum, AB and DE models of the same phenomenon sometimes agree and sometimes diverge, especially when compartments contain smaller populations. Multiple network topologies and heterogeneity conditions have not been compared, and the practical significance of differences in mean behavior relative to uncertainties in stochastic events, parameters and model boundary has not been explored.

Model Structure: The SEIR model is a deterministic nonlinear differential equation model in which all members of a population are in one of four states: Susceptible, Exposed, Infected, or Removed. Contagious individuals can infect susceptibles before they are “removed” (i.e., recover or die). The exposed compartment captures latency between infection and the emergence of symptoms. Depending on the disease, exposed individuals may become infectious before symptoms emerge, and can be called early-stage infectious. Typically, such individuals have more contacts than those in later stages because they are asymptomatic.

SEIR models have been successfully applied to many diseases. Additional compartments are often introduced to capture more complex disease lifecycles, diagnostic categories, therapeutic protocols, population heterogeneity and mixing patterns, birth or recruitment of new susceptibles, loss of immunity, etc. (see Anderson and May 1991 and Murray 2002). In this study we maintain the standard assumptions of the classic SEIR model (four stages, fixed population, permanent immunity). The DE implementation of the model imposes several additional assumptions, including perfect mixing and homogeneity of individuals within each compartment and mean field aggregation (the flows between compartments equal the expected value of the sum of the underlying probabilistic rates for each individual). To derive the differential equations, consider the rate at which each infectious individual generates new cases:

$$c_{is} \cdot \mathrm{Prob}(\text{Contact with Susceptible}) \cdot \mathrm{Prob}(\text{Transmission} \mid \text{Contact with Susceptible}) \tag{1}$$

where the contact frequency $c_{is}$ is the expected number of contacts between infectious individual $i$ and susceptible individual $s$; homogeneity implies $c_{is}$ is a constant, denoted $c_{IS}$, for all individuals $i$, $s$. If the population is well mixed, the probability of contacting a susceptible individual is simply the proportion of susceptibles in the total population, $S/N$. Denoting the probability of transmission given contact between individuals $i$ and $s$, or infectivity, as $i_{is}$ (which, under homogeneity, equals $i_{IS}$ for all $i$, $s$), and summing over the infectious population yields the total flow of new cases generated by contacts between the I and S populations, $c_{IS} \, i_{IS} \, I \, (S/N)$. The number of new cases generated by contacts between the exposed and susceptibles is formulated analogously, yielding the total Infection Rate, $f$,

$$f = (c_{ES} \, i_{ES} \, E + c_{IS} \, i_{IS} \, I)(S/N). \tag{2}$$

To model emergence and recovery, consider these to be Markov processes with certain transition probabilities. In the classic SEIR model each compartment is assumed to be well mixed so that the probability of emergence (or recovery) is independent of how long an individual has been in the E (or I) state. Denoting the individual hazard rates for emergence and recovery as ε and δ, the mean emergence time and disease duration are then 1/ε and 1/δ, respectively. Summing over the E and I populations and taking expected values yields the flows of emergence and recovery:

$$e = \varepsilon E \quad \text{and} \quad r = \delta I. \tag{3}$$

The full model is thus:

$$\frac{dS}{dt} = -f, \quad \frac{dE}{dt} = f - e, \quad \frac{dI}{dt} = e - r, \quad \frac{dR}{dt} = r. \tag{4}$$

Equation (3) implies the probabilities of emergence and recovery are independent of how long an individual has been in the E or I states, respectively, and results in exponential distributions for the residence times in these states. Exponential residence times are not realistic for most diseases, where the probability of emergence and recovery is initially low, then rises, peaks and falls. Note, however, that any lag distribution can be captured through the use of partial differential equations, approximated in the ODE paradigm by adding additional compartments within the exposed and infectious categories (Jacquez and Simon 2002). For simplicity we maintain the assumption of a single compartment per disease stage of the classic SEIR model.
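As an illustration, equations (2) through (4) can be integrated numerically. The sketch below is not the authors' implementation: the function names, the fourth-order Runge-Kutta integrator, the step size, and the parameter values are all assumptions made for this example. The values are chosen so that c_ES·i_ES/ε + c_IS·i_IS/δ = 3 + 1.125 = 4.125, which matches the base-case R0 reported in the paper, but they should not be read as the paper's Table 1.

```python
# Illustrative parameter values: assumptions for this sketch, not values
# quoted in the text above. Rates are per day.
C_ES, I_ES = 4.0, 0.05        # contact frequency and infectivity, exposed
C_IS, I_IS = 1.25, 0.06       # contact frequency and infectivity, infectious
EPS, DELTA = 1 / 15, 1 / 15   # hazard rates of emergence and recovery


def derivs(state):
    """Right-hand side of equation (4) for state = (S, E, I, R)."""
    s, e_pop, i_pop, r_pop = state
    n = s + e_pop + i_pop + r_pop
    f = (C_ES * I_ES * e_pop + C_IS * I_IS * i_pop) * (s / n)  # eq. (2)
    e = EPS * e_pop                                             # eq. (3)
    r = DELTA * i_pop                                           # eq. (3)
    return (-f, f - e, e - r, r)


def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    k1 = derivs(state)
    k2 = derivs(shifted(k1, dt / 2))
    k3 = derivs(shifted(k2, dt / 2))
    k4 = derivs(shifted(k3, dt))
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_seir(n=200.0, exposed0=2.0, days=300.0, dt=0.25):
    """Return the list of (S, E, I, R) states, one per time step."""
    state = (n - exposed0, exposed0, 0.0, 0.0)
    path = [state]
    for _ in range(round(days / dt)):
        state = rk4_step(state, dt)
        path.append(state)
    return path
```

Calling `simulate_seir()` returns the deterministic trajectory; the final size can be read off as `(200 - path[-1][0]) / 200`. Because the four derivatives sum to zero, the total population stays constant up to rounding error, which is a convenient sanity check on any implementation of equation (4).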

The AB model relaxes the perfect mixing and homogeneity assumptions of the DE. Each individual j ∈ (1, …, N) is in one of the four states S, E, I, or R. The individual state transitions f[j], e[j], and r[j] equal 1 at the moment of infection, emergence, and recovery, respectively, and 0 otherwise, and depend on individual attributes such as contact frequencies and on the chances of interaction with others as specified by the contact network. The aggregate flows f, e, and r over any interval dt are the sum of the individual transitions during that interval. The online supplement details the formulation of the AB model and shows how the DE model can be derived from it by assuming homogeneous agents and applying the mean-field approximation.
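The individual-level logic can be sketched in a few lines. What follows is a hypothetical discrete-time approximation, not the formulation in the authors' supplement: the names `simulate_ab` and `complete_graph`, the fixed time step, the default parameter values, and the assumption that each contagious agent spreads its contacts evenly over its links are all choices made for this illustration. In each step an agent's transition f[j], e[j], or r[j] fires with probability 1 - exp(-hazard × dt).

```python
import math
import random

S, E, I, R = range(4)  # agent states


def complete_graph(n):
    """Neighbor lists for a fully connected network of n agents."""
    return [[i for i in range(n) if i != j] for j in range(n)]


def simulate_ab(neighbors, c_es=4.0, i_es=0.05, c_is=1.25, i_is=0.06,
                eps=1 / 15, delta=1 / 15, n_exposed=2,
                days=300.0, dt=0.5, seed=None):
    """Return a list of (S, E, I, R) head counts, one per time step."""
    rng = random.Random(seed)
    n = len(neighbors)
    state = [S] * n
    for j in rng.sample(range(n), n_exposed):
        state[j] = E
    p_emerge = 1 - math.exp(-eps * dt)
    p_recover = 1 - math.exp(-delta * dt)
    # Transmission hazard per day, by state of the contagious agent.
    hazard_rate = {E: c_es * i_es, I: c_is * i_is}

    counts = [tuple(state.count(x) for x in (S, E, I, R))]
    for _ in range(round(days / dt)):
        # Accumulate the infection hazard each susceptible agent faces.
        hazard = [0.0] * n
        for i, st in enumerate(state):
            if st in hazard_rate and neighbors[i]:
                per_link = hazard_rate[st] / len(neighbors[i])
                for j in neighbors[i]:
                    if state[j] == S:
                        hazard[j] += per_link
        new_state = list(state)
        for j, st in enumerate(state):
            if st == S:
                # f[j] = 1 with probability 1 - exp(-hazard * dt)
                if rng.random() < 1 - math.exp(-hazard[j] * dt):
                    new_state[j] = E
            elif st == E and rng.random() < p_emerge:
                new_state[j] = I      # e[j] = 1
            elif st == I and rng.random() < p_recover:
                new_state[j] = R      # r[j] = 1
        state = new_state
        counts.append(tuple(state.count(x) for x in (S, E, I, R)))
        if counts[-1][1] + counts[-1][2] == 0:
            break  # no contagious agents remain
    return counts
```

For example, `simulate_ab(complete_graph(200), seed=1)` produces one realization on a fully connected network; repeating the call with different seeds gives an ensemble whose spread reflects only the chance nature of contacts, emergence, and recovery.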

A central parameter in epidemic models is the basic reproduction number, R0, the expected number of new cases each contagious individual generates before removal, assuming all others are susceptible. The base case parameters yield R0 = 4.125 (Table 1), similar to diseases like smallpox, R0 ≈ 3–6 (Gani and Leach 2001), and SARS, R0 ≈ 2–7 (Lipsitch et al. 2003; Riley et al. 2003). The base value provides a good opportunity to observe potential differences between DE and AB models: diseases with R0 < 1 pose little risk to public health, while those with R0 >> 1, e.g., measles, cause a severe epidemic in (nearly) any network. The AB models use the same infectivities and expected residence times, and we choose individual contact frequencies so that mean total contact rates in each network and heterogeneity condition are the same as in the DE model. We set the population N = 200, all susceptible except for two randomly chosen exposed individuals. Though small compared to settings of interest in policy design, e.g., cities, the effects of random events and network type are likely to be more pronounced in small populations (Gani and Yakowitz 1995), providing a stronger test for differences between the DE and AB models. A small population also reduces computation time, allowing more extensive sensitivity analysis. The DE has 4 state variables; computation time is negligible for all N. The AB model has 4N state variables and must also track interactions among the N individuals, implying that computation time can grow at rates up to $O(N^2)$, depending on the contact network. We report sensitivity analysis of R0 and N below. The supplement includes the models and full documentation.

Experimental design: We vary both the network structure of contacts among individuals and the degree of individual heterogeneity in the AB model and compare the resulting dynamics to the DE. We implement a full factorial design with five network structures and two heterogeneity conditions. In each of the ten conditions we generate an ensemble of 1000 simulations of the AB model, each with different realizations of the random variables determining contacts, emergence, and recovery. Since the parameters in each simulation are identical to the DE model, differences in outcomes can only be due to differences in network topology, heterogeneity among individuals, or the discrete, stochastic treatment of individuals in the AB model.

Network topology: The DE model implemented here assumes perfect within-compartment mixing, implying any infectious individual can meet any susceptible individual with equal probability. Realistic networks are far more sparse and clustered. We explore five different network structures: fully connected, random (Erdos and Renyi 1960), small-world (Watts and Strogatz 1998), scale-free (Barabasi and Albert 1999), and lattice (where contact only occurs between neighbors on a ring). We parameterize the model so that all networks (other than the fully connected case) have the same mean number of links per node, k = 10 (Watts 1999).

The fully connected network corresponds to the perfect mixing assumption of the DE model. The random network is similar except people are linked with equal probability to a subset of the population. To test the network most different from the perfect mixing case, we also model a one-dimensional ring lattice with no long-range contacts. With k = 10 each person contacts only the five neighbors on each side. The small world and scale-free networks are intermediate cases with many local and some long-distance links. These widely-used networks characterize a number of real systems (Watts 1999; Barabasi 2002). We set the probability of long-range links in the small world networks to 0.05, in the range used by Watts (1999). We build the scale-free networks using the preferential attachment algorithm (Barabasi and Albert 1999) in which the probability a new node links to existing nodes is proportional to the number of links each node already has. Preferential attachment yields a power law for the probability that a node has k links, $\mathrm{Prob}(k) = \alpha k^{-\gamma}$. Empirical studies typically show 2 ≤ γ ≤ 3; the mean value of γ in our experiments is 2.6.
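The four sparse topologies can be generated with short routines. The sketch below is an assumption-laden illustration rather than the construction described in the authors' supplement: the function names are hypothetical, the random network fixes the total number of links instead of drawing each link independently, and the scale-free generator seeds preferential attachment with a small complete graph, so its mean degree falls slightly below 10 (9.85 for N = 200 with 5 links per new node).

```python
import random


def ring_lattice(n, k):
    """Ring where each node links to its k/2 nearest neighbors per side."""
    nbrs = [set() for _ in range(n)]
    for j in range(n):
        for step in range(1, k // 2 + 1):
            other = (j + step) % n
            nbrs[j].add(other)
            nbrs[other].add(j)
    return nbrs


def small_world(n, k, p, rng):
    """Ring lattice with each link rewired to a random node with prob. p."""
    nbrs = ring_lattice(n, k)
    for j in range(n):
        for step in range(1, k // 2 + 1):
            old = (j + step) % n
            if old in nbrs[j] and rng.random() < p:
                free = [t for t in range(n) if t != j and t not in nbrs[j]]
                if free:
                    new = rng.choice(free)
                    nbrs[j].remove(old)
                    nbrs[old].remove(j)
                    nbrs[j].add(new)
                    nbrs[new].add(j)
    return nbrs


def random_graph(n, k, rng):
    """Random graph with exactly n*k/2 links placed uniformly at random."""
    pairs = [(a, b) for a in range(n) for b in range(a + 1, n)]
    nbrs = [set() for _ in range(n)]
    for a, b in rng.sample(pairs, n * k // 2):
        nbrs[a].add(b)
        nbrs[b].add(a)
    return nbrs


def scale_free(n, m, rng):
    """Preferential attachment: each new node links to m existing nodes."""
    nbrs = [set() for _ in range(n)]
    ends = []  # node j appears once per link it has, so choice ~ degree
    for a in range(m + 1):          # seed: complete graph on m + 1 nodes
        for b in range(a + 1, m + 1):
            nbrs[a].add(b)
            nbrs[b].add(a)
            ends += [a, b]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(ends))
        for t in sorted(targets):
            nbrs[new].add(t)
            nbrs[t].add(new)
            ends += [new, t]
    return nbrs


def mean_degree(nbrs):
    """Average number of links per node."""
    return sum(len(s) for s in nbrs) / len(nbrs)
```

With `rng = random.Random(0)`, the calls `ring_lattice(200, 10)`, `small_world(200, 10, 0.05, rng)`, `random_graph(200, 10, rng)`, and `scale_free(200, 5, rng)` give one realization of each topology. Rewiring preserves the number of links, so the lattice, small-world, and random cases all have a mean degree of exactly 10.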

The fully connected and lattice networks are deterministic, so every simulation of these cases has the same network governing contacts among individuals. The Erdos-Renyi, small world, and scale-free cases are random networks. Each simulation of these cases uses a different realization of the network structure. In realistic networks the links among individuals change through time even as overall topology can remain stable (e.g., Kossinets and Watts 2006), introducing mixing that brings the AB model closer to the assumptions of the compartment model. To maximize the differences between the AB and DE conditions, however, we assume the network realization in each simulation is fixed. The supplement details the construction of each network.

Individual Heterogeneity: Each individual has four relevant characteristics: expected contact rate, infectivity, emergence time, and disease duration. In the homogeneous condition (H=) each individual is identical with parameters set to the values of the DE model. In the heterogeneous condition (H≠) we vary individual contact frequencies.

Heterogeneity in contacts is modeled as follows. Given that two people are linked (that they can come into contact), the frequency of contact between them depends on two factors. First, how often each person uses their links, on average: some people are gregarious, others shy. Second, time constraints may limit contacts. At one extreme, the frequency of link use may be constant, so that people with more links have more total contacts per day, a reasonable approximation for some airborne infections and easily communicated ideas: a professor may transmit an airborne virus or a simple concept to many people with a single sneeze or comment, (roughly) independent of class size. At the other extreme, if the time available to contact people is fixed, the chance of using a link is inversely proportional to the number of links, a reasonable assumption when transmission requires extended personal contact: the professor can only tutor a limited number of people each day. We capture these effects by assigning individuals different propensities to use their links, λ[j], yielding the expected contact frequency for the link between individuals i and j, c[i,j]:

$$c[i,j] = \frac{\kappa\,\lambda[i]\,\lambda[j]}{(k[i]\,k[j])^{\tau}} \qquad (5)$$

where k[j] is the total number of links individual j has, τ captures the time constraint on contacts, and κ is a constant chosen to ensure that the expected contact frequency for the population equals the value used in the DE model. In the H= condition λ[j] = 1 for all j and τ = 1 so that expected contact frequencies are equal for all individuals, independent of how many links each has. In the H≠ condition λ[j] is a random variable and τ = 0: individuals have different contact rates and those with more links have more contacts per day. We use a uniform distribution, λ[j] ~ U[0.25, 1.75].

Calibrating the DE Model: In practice the parameters determining R0 are often poorly constrained by biological and clinical data. For emerging diseases such as vCJD, BSE and avian flu, data are not available until the epidemic has already spread. Parameters are usually estimated by fitting models to aggregate data as an outbreak unfolds; SARS provides a typical example (Dye and Gay 2003; Lipsitch et al. 2003; Riley et al. 2003). Because R0 also depends on contact networks that are often poorly known, models of established diseases are commonly estimated the same way (e.g., Gani and Leach 2001). To mimic this protocol we treat each realization of the AB model as the “real world” and estimate the parameters of the DE to yield the best fit to the cumulative number of cases. We estimate infectivity (i_ES and i_IS) and incubation time (1/ε) by nonlinear least squares in a large set of individual AB realizations (see the supplement). Results assess whether calibrated compartment models can capture the behavior of heterogeneous individuals in realistic settings with different contact networks.
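As an illustration of Equation (5), the following minimal Python sketch computes the expected contact frequency for each link in a small network under the two heterogeneity conditions. The four-person network, the function name `contact_frequencies`, and the placeholder value κ = 1 are assumptions made for this example only; in the paper κ is chosen so that the population mean matches the DE model, a normalization not reproduced here.

```python
import random


def contact_frequencies(links, lam, tau, kappa=1.0):
    """Expected contact frequency c[i,j] for each link, following Equation (5).

    links: list of (i, j) pairs; lam: propensity lambda[j] per individual;
    tau: time-constraint exponent; kappa: scaling constant (placeholder here).
    """
    # k[i] is the number of links each individual has.
    degree = {}
    for i, j in links:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
    return {
        (i, j): kappa * lam[i] * lam[j] / (degree[i] * degree[j]) ** tau
        for i, j in links
    }


# Hypothetical network: individual 0 is a hub linked to everyone else.
links = [(0, 1), (0, 2), (0, 3), (1, 2)]

# H= condition: lambda[j] = 1 for all j and tau = 1.
homogeneous = contact_frequencies(links, {n: 1.0 for n in range(4)}, tau=1)

# H≠ condition: lambda[j] ~ U[0.25, 1.75] and tau = 0.
rng = random.Random(0)
lam = {n: rng.uniform(0.25, 1.75) for n in range(4)}
heterogeneous = contact_frequencies(links, lam, tau=0)
```

With τ = 1 the frequency on each link shrinks with the number of links at both ends, while with τ = 0 the denominator equals one, so individuals with more links accumulate more contacts per day.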

Results: For each experimental condition we examine three measures relevant to public health. The maximum symptomatic infectious population (peak prevalence, I_max) indicates the peak load on public health infrastructure including health workers, immunization resources, hospitals and quarantine facilities. The time from initial exposure to the maximum of the infected population (the peak time, T_p) measures how quickly the epidemic spreads and therefore how long officials have to deploy those resources. The fraction of the population ultimately infected (the final size, F) measures the total burden of morbidity and mortality. To illustrate, Figure 1 compares the base case DE model with a typical simulation of the AB model (in the heterogeneous scale-free case). The sample scale-free epidemic grows faster than the DE (T_p = 37 vs. 48 days), has similar peak prevalence (I_max = 27%), and ultimately afflicts fewer people (F = 85% vs. 98%).
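The three measures can be read directly off a simulated trajectory. The sketch below is one way to do so, assuming the trajectory is stored as parallel lists of times, symptomatic infectious counts, and cumulative cases, with the first time point marking initial exposure. The function name and the sample numbers are hypothetical and are not taken from the paper's simulations.

```python
def epidemic_metrics(times, infectious, cumulative_cases, population):
    """Return peak prevalence, peak time, and final size for one trajectory."""
    # Index of the first time point at which the infectious population peaks.
    peak_index = max(range(len(infectious)), key=lambda t: infectious[t])
    return {
        "I_max": infectious[peak_index] / population,  # peak prevalence
        "T_p": times[peak_index],                      # peak time
        "F": cumulative_cases[-1] / population,        # final size
    }


# Hypothetical trajectory for a population of 200, sampled every 10 days.
metrics = epidemic_metrics(
    times=[0, 10, 20, 30, 40],
    infectious=[1, 20, 54, 30, 5],
    cumulative_cases=[1, 25, 90, 150, 170],
    population=200,
)
```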

In this study we focus on the practical significance of differences between the mean output of AB and DE models. Specifically, we explore whether the differences among models are large relative to the variability in outcomes for which policymakers should plan and whether the differences alter the choice of optimal policies. To begin, we conservatively consider outcome variability arising only from stochastic interactions among individuals. Specifically, suppose policymakers planning for a possible outbreak know with certainty mean infectivity, incubation period, disease duration, network type, and all other parameters conditioning contagion and diffusion, and that these characteristics are unaffected by the course of the epidemic. In short, assume policymakers possess a perfect agent-based model of the situation, and lack only knowledge of which individuals will, by chance, encounter each other at any moment and transmit the disease. As an example, suppose the contact network is characterized by a scale-free degree distribution with known parameters, and that individuals are heterogeneous in their behavior (but with known distribution). For the hypothetical disease we examine, prevalence peaks on average after 44 days at a mean of 23.9% of the population. In the deterministic compartment model with the same parameters, prevalence peaks after 48 days at 27.1% of the population. Given the large sample of AB realizations, these differences are statistically significant (p < .001), but they are not practically significant. Because of unobservable stochastic interactions among individuals, policymakers who want to be, for example, 95% confident that resources will be sufficient must plan to handle an epidemic peaking between 4 and 75 days after introduction, with peak prevalence between 4% and 31.5% of the population. Of course, the deterministic model yields a single trajectory representing the expected path under the mean-field approximation. No responsible policymaker should plan for the mean epidemic without considering uncertainty. To assess the range of outcomes arising from the random nature of individual interactions, policymakers using compartment models would have to estimate the impact of uncertainty by, for example, moving to a stochastic DE representation. Such a model would be computationally efficient relative to the full AB model, but would still assume within-compartment mixing and homogeneous agents.
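The comparison described above, checking whether a single DE value falls inside the range that contains 95% of the AB outcomes, can be sketched as follows. This uses a simple nearest-rank percentile, which is an assumption of this example; the paper does not state how its outcome ranges were computed. The ensemble shown is placeholder data, not simulation output.

```python
import math


def outcome_range(samples, coverage=0.95):
    """Empirical interval containing `coverage` of the samples (nearest rank)."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - coverage) / 2.0
    lower = ordered[max(0, math.ceil(tail * n) - 1)]
    upper = ordered[min(n - 1, math.ceil((1.0 - tail) * n) - 1)]
    return lower, upper


def within_outcome_range(value, samples, coverage=0.95):
    """True if a single (e.g. DE) value lies inside the ensemble's range."""
    lower, upper = outcome_range(samples, coverage)
    return lower <= value <= upper


# Placeholder ensemble of peak times in days (hypothetical, for illustration).
peak_times = list(range(1, 101))
print(outcome_range(peak_times))
print(within_outcome_range(48, peak_times))
```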

Policymakers should also consider how model assumptions affect the optimality of interventions. Consider, for example, a quarantine policy. Quarantine should be implemented if its benefit/cost ratio (e.g., the value of QALYs or DALYs saved and avoided health costs relative to the costs of quarantine implementation) is favorable and higher than that of other policy options (including no action). Two models may yield similar estimates of epidemic diffusion, yet respond differently to policies. In such cases the differences between the models may be of great practical significance even if their base case behavior is similar. We provide an example below.

Figure 2 shows the symptomatic infectious population, I, in 1000 AB simulations for each network and heterogeneity condition. Also shown are the mean of the ensemble and the trajectory of the base case DE model. Table 2 reports results for the fitted DE models; Tables 3-4 compare the means of T_p, I_max, and F for each condition with both the base and fitted DE models. Except for the lattice, the DE and mean AB dynamics are qualitatively similar. Initial diffusion is driven by positive feedback as contagious individuals spread the infection. The epidemic peaks when the susceptible population is sufficiently depleted that the mean number of new cases generated by contagious individuals is less than the rate at which they are removed from the contagious pool.

Departures from the DE model increase from the connected to the random, scale-free, small world, and lattice structures (Figure 2; Tables 3-4). The degree of clustering explains some of these variations. In the fully connected and random networks the chance of contacts in distal regions is the same as for neighbors. The positive contagion feedback is strongest in the connected network because an infectious individual can contact everyone else, minimizing local contact overlap. In contrast, the lattice has maximal clustering. When contacts are localized in a small region of the network, infectious individuals contact their common neighbors repeatedly. As these people become infected the chance of contacting a susceptible and generating a new case declines, slowing diffusion on average, even if the total susceptible population remains high.

In the deterministic DE model there is always an epidemic if R0 > 1. Due to the stochastic nature of interactions in the AB model, it is possible that no epidemic occurs or that it ends early if, by chance, the few initially contagious individuals recover before generating new cases. As a measure of early burnout, Table 3 reports the fraction of cases where cumulative cases remain below 10%. (Except for the lattice, the results are not sensitive to the 10% cutoff. The online appendix shows the histogram of final size for each network and heterogeneity condition.) Early burnout ranges from 1.8% in the homogeneous connected case to 6.8% in the heterogeneous scale-free case. Heterogeneity raises the incidence of early burnout in each network since there is a higher chance that the first cases will have few contacts and recover before spreading the disease. Network structure also affects early burnout. Greater contact clustering increases the probability that the epidemic burns out in a local patch of the network before it can jump to other regions, slowing diffusion and increasing the probability of global quenching.

Heterogeneity results in smaller final size, F, in all conditions: the mean reduction over all ten conditions is 0.10, compared to a mean standard deviation across all conditions, σ̄, of 0.19. Similarly, heterogeneity reduces T_p in all conditions (by a mean of 9.5 days, with σ̄ = 26 days). Maximum prevalence also falls in all conditions (by 1.5%, σ̄ = 5.1%). In the H≠ condition high-contact individuals tend to become infected sooner, causing, on average, faster take-off compared to the H= case (hence earlier peak times). These individuals are also removed sooner, reducing mean contact frequency, and hence the reproduction rate, among those who remain compared to the H= case. Subsequent diffusion is slower, peak prevalence is smaller, and the epidemic ends sooner, yielding fewer cumulative cases.

Consider now the differences between the DE and AB cases by network type.

Fully Connected: The fully connected network corresponds closely to the perfect mixing assumption of the DE. As expected, the base DE model closely tracks the mean of the AB simulations. In the H= condition, T_p, I_max, and F in the base DE model fall well within the 95% confidence interval defined by the ensemble of AB simulations. In the H≠ case, T_p and I_max also fall within the 95% range, but F lies just outside the range encompassing 95% of the ensemble.

Random: The random network behaves much like the connected case. The DE values of T_p and I_max fall within the 95% outcome range for both heterogeneity conditions. The value of F in the DE falls outside the 95% range for both H= and H≠, because the sparse contact network means more people escape contact with infectious individuals compared to the perfect mixing case.

Scale-Free: The scale-free network departs substantially from perfect mixing. Most nodes have few links, so initial takeoff is slower, but once the infection reaches a hub it spreads quickly. The base DE values of T_p and I_max fall well within the 95% outcome interval for both heterogeneity conditions. However, as the hubs are removed from the infectious pool, the remaining nodes have lower average contact rates, causing the epidemic to burn out at lower levels of diffusion; the 95% range for final size is 2% to 98% for H= and 1% to 92% for H≠, while the base DE value is 98%.

Small World: Small world networks are highly clustered and lack highly connected hubs. Nevertheless, the presence of a few long-range links is sufficient to seed the epidemic throughout the population (Watts and Strogatz 1998). Diffusion is slower on average compared to the DE and the connected, random, and scale-free networks. The existence of a few randomly placed long-range links also increases the variability in outcomes. The 95% range for T_p is 22 to 154 days for H= (7 to 176 days for H≠), easily encompassing the base DE value. Slower diffusion relative to the DE causes peak prevalence in the DE to fall outside the 95% interval of AB outcomes for both H= and H≠. The main impact of heterogeneity is greater dispersion and a reduction in final size.

Lattice: In the lattice, individuals contact only their k nearest neighbors, so the epidemic advances roughly linearly in a well-defined wave of new cases trailed by symptomatic and then recovered individuals. Such waves are observed in the spread of plant pathogens, where transmission is mostly local, though in two dimensions more complex patterns are common (Bjornstad et al. 2002; Murray 2002). Because the epidemic wave front reaches a stochastic steady state in which removal balances new cases, the probability of burnout is roughly constant over time, and I_max is lower, with the base DE value falling outside the 99% range. For the same reason, mean final size is much lower and peak time longer than the base DE. Interestingly, the variance is higher as well, so that in the H= condition the DE values of F and T_p fall within the 95% range of AB outcomes.

In sum, peak time in the uncalibrated base DE model falls within the envelope encompassing 95% of the AB simulations in all ten network and heterogeneity conditions. Peak prevalence falls within the 95% range in all but the small world and lattice. Final size, however, is sensitive to clustering and heterogeneity, falling within the 95% range in only three cases.

Calibrated DE Model: In practice parameters such as R0 and incubation times are poorly constrained and are estimated by fitting models to aggregate data. Table 2 summarizes the results of fitting the DE model to 200 randomly selected AB simulations in each experimental condition, a total of 2000 calibrations. The median R² for the fit to cumulative cases exceeds 0.985 in all scenarios. The mean values of F, T_p, and I_max in the calibrated DE fall within the range encompassing 95% of the AB outcomes in all network and heterogeneity conditions. The DE model fits well even though it is clearly mis-specified in all but the homogeneous fully connected network. Why? As the network becomes increasingly clustered and diffusion slows, the estimated parameters adjust accordingly. Specifically, in deterministic SEIR compartment models, R0 and final size are related by R0 = −ln(1 − F)/F (Anderson and May 1991). Consequently, when contact clustering leads to smaller F, the estimated incubation time or transmission rates must shift to yield a smaller estimate of R0. The parameter estimates are biased because deviations from their underlying values are the only way the DE, with its within-compartment homogeneity and mixing assumptions, can capture the impact of heterogeneity and network structure. Further, the close fit of the compartment model does not imply that its response to policies will be the same as that of the underlying clustered and heterogeneous network. The supplement provides further details.
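The final-size relation quoted above can be explored numerically. The sketch below evaluates R0 = −ln(1 − F)/F directly and inverts it by bisection on the equivalent form F = 1 − exp(−R0·F). The function names and the bisection tolerance are choices made for this illustration, and the inversion returns zero whenever R0 ≤ 1, the case in which the deterministic model produces no epidemic.

```python
import math


def r0_from_final_size(final_size):
    """R0 implied by a final size F, using R0 = -ln(1 - F) / F."""
    return -math.log(1.0 - final_size) / final_size


def final_size_from_r0(r0, tolerance=1e-10):
    """Solve F = 1 - exp(-R0 * F) for the positive root by bisection."""
    if r0 <= 1.0:
        return 0.0  # no epidemic in the deterministic model
    low, high = 1e-9, 1.0 - 1e-12
    while high - low > tolerance:
        mid = (low + high) / 2.0
        # g(F) = 1 - exp(-R0 F) - F is positive below the root, negative above.
        if -math.expm1(-r0 * mid) - mid > 0.0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# The base case R0 of 4.125 gives a final size close to the 98% reported
# for the base DE model.
base_final_size = final_size_from_r0(4.125)
```

Read in the other direction, a smaller observed F maps to a smaller implied R0, which is the mechanism by which the calibrated parameters absorb the effects of clustering.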

Sensitivity to Population Size: We repeated the analysis for N = 50 and 800 (see the supplement). The results change little over this factor of 16. For most conditions, the rate of early burnout falls in the larger population, so the final fraction of the population infected is slightly larger (and therefore closer to the value in the DE). Population size has little impact on the other metrics.

Sensitivity to R0: We varied R0 from 0.5 to 2 times the base value; detailed results are reported in the supplement. Naturally, diffusion is strongly affected by R0. Somewhat surprisingly, however, over the range tested the differences between the DE and mean AB outcomes remain small relative to the 95% outcome range for most of the metrics. Changes in R0 have two offsetting effects. First, the smaller the value of R0, the larger are the differences between the DE and means of the AB trajectories. Second, however, the smaller R0, the greater the variation in outcomes within a given network and heterogeneity condition caused by chance encounters among individuals. Small values of R0 reduce the expected number of new cases each infectious individual generates before removal. In effect, the fraction of the contact network sampled by each infectious individual is smaller, so the probability that the epidemic will be seeded at multiple points in the network decreases. In highly clustered and heterogeneous networks, the lower representativeness of these small samples increases the difference between the DE and the mean of the AB trajectories (for example, more cases of early quenching will be observed). For the same reason, however, individual realizations of the same network and heterogeneity condition will differ more with small R0, increasing the variance in outcomes for which policy makers must prepare. Similarly, larger R0 reduces the differences between the DE model and the means of the AB models but also reduces variability in outcomes because each infectious individual samples the network many times before recovering. These offsetting effects imply that, over the range examined here, the differences between the DE and the mean behavior of the AB models are relatively insensitive to variations in R0.

Sensitivity to disease natural history: In many diseases the exposed gradually become more infectious prior to becoming symptomatic. This progression can be modeled by adding additional compartments to the exposed stage with different infectivities in each. In the classic SEIR model used here, with only one compartment per stage, pre-symptomatic infectivity is approximated by assuming the exposed are contagious, though with i_ES < i_IS. To test the impact of this assumption we set i_ES = 0, adjusting i_IS to keep R0 at its base value. Results are reported in the supplement. As expected, diffusion slows and the probability of early quenching grows. However, the differences in the mean values of the metrics across models generally remain small relative to the 95% range of AB outcomes. Assuming that exposed individuals are not contagious has little impact on the differences between the DE and mean AB behavior relative to the variability in AB outcomes.

Policy Analysis and Sensitivity to Model Boundary: Another important question is whether the behavior of the models differs in response to policy interventions and expansion of the model boundary. While comprehensive treatment of these questions is beyond the scope of this paper, we illustrate by examining the impact of actions that reduce contact rates. For example, the 2003 SARS epidemic appears to have been quenched through contact reduction (Wallinga and Teunis 2004, Riley et al. 2003, Lipsitch et al. 2003). Contact reduction can arise from policies, e.g., quarantine (including mandatory isolation and travel restrictions), and from behavioral feedbacks, e.g., social distancing, where individuals who fear infection reduce contacts with others. For simplicity we assume contact rates fall linearly to a minimum value as the total number of confirmed cases (cumulative prevalence P = I + R) rises.1 Specifically, we model the contact
frequency c_js between infectious persons, j ∈ {E, I}, and susceptibles, s ∈ {S}, as a weighted average of the initial frequency, c*_js, and the minimum achieved under quarantine, c^q_js:

$$c_{js} = (1 - q)\,c^{*}_{js} + q\,c^{q}_{js} \qquad (6)$$

$$q = \mathrm{MIN}\left[1,\ \mathrm{MAX}\left(0,\ \frac{P - P_0}{P_q - P_0}\right)\right] \qquad (7)$$

The impact of contact reduction, q, rises linearly as cumulative prevalence, P, rises from a threshold, P_0, to the level at which the effect saturates, P_q. We set P_0 = 2 and P_q = 10 cases. Neither social distancing nor quarantine are perfect; we set the minimum contact frequency, c^q_js = 0.15c*_js. This value gradually reduces R0 in the DE model from 4.125 to ≈ 0.6, roughly similar to the reduction Wallinga and Teunis (2004) estimate for the SARS epidemic.
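Equations (6) and (7) amount to a clamped linear ramp followed by a weighted average. The sketch below is a minimal rendering of both, with the threshold of 2 cases, saturation at 10 cases, and minimum fraction of 0.15 taken from the text. The function names are hypothetical, and the last line assumes that R0 in the DE model scales in proportion to contact frequency, which is consistent with the reported fall from 4.125 to about 0.6 but is stated here as an assumption.

```python
def contact_reduction(cumulative_cases, threshold=2.0, saturation=10.0):
    """Equation (7): q rises linearly from 0 at the threshold to 1 at saturation."""
    ramp = (cumulative_cases - threshold) / (saturation - threshold)
    return min(1.0, max(0.0, ramp))


def contact_frequency(cumulative_cases, base_frequency, minimum_fraction=0.15):
    """Equation (6): weighted average of the initial and minimum frequencies."""
    q = contact_reduction(cumulative_cases)
    minimum_frequency = minimum_fraction * base_frequency
    return (1.0 - q) * base_frequency + q * minimum_frequency


# Under the proportionality assumption, full contact reduction scales the
# base R0 of 4.125 by 0.15.
reduced_r0 = contact_frequency(cumulative_cases=50, base_frequency=4.125)
```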

1 Other policies, such as targeted immunization, can exploit the structure of the contact network, if it is known, and generally require an AB model, though some such policies can be approximated in DE models (e.g., Kaplan et al. 2003).

As expected, contact reduction quenches the epidemic earlier. In the DE model, prevalence peaks 17 days sooner, I_max falls from 27% to 4.4%, and F falls from 98% to 19% of the population, greatly easing the burden on public health resources. Contact reduction has similar benefits in the AB cases. The differences between the means of the metrics in the AB models and their DE value are small relative to the variation in outcomes caused by stochastic interactions in the AB models. The DE results fall within the 95% outcome range for all three metrics in all network and heterogeneity conditions, with one exception: the value of F in the lattice (Table 5). However, as in the base case, clustering and heterogeneity cause some differences between the DE and mean AB outcomes. Under contact reduction heterogeneity increases mean F because high-contact individuals tend to be infected first, increasing the exposed population relative to H= before contact reduction is triggered. In the base case, however, heterogeneity lowers F because early high-contact cases are also removed early, lowering the reproduction rate. Therefore the mean reduction in F under contact reduction is smaller in the heterogeneous cases.

Policies should be implemented if their cost-benefit ratio is favorable compared to other options, including no action. As a simple illustration, suppose the per-capita costs of mandatory contact reduction policies, denoted C, are fixed and that the benefits are linear in avoided cases, ΔF = F_{no policy} − F_{policy}. Ignoring uncertainty, and hence issues of policymaker risk aversion, mandatory measures should be implemented if bΔF > C, where b is the benefit per avoided case. In the scale-free case, ΔF = 0.75 for the H= case but falls to 0.59 in the H≠ condition (see the supplement). For 0.59 ≤ C/b ≤ 0.75, whether mandatory measures are indicated depends on whether the population is homogeneous or not. Uncertainty, nonlinear costs and benefits, or risk-averse policymakers will change the width of this interval of policy sensitivity but not the principle that the choice among policies may be sensitive to network type, individual heterogeneity and other assumptions.
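The decision rule bΔF > C can be made concrete with the two scale-free values of ΔF given above. In the sketch below the cost and benefit figures are hypothetical, chosen only so that C/b = 0.65 lies inside the interval of policy sensitivity; the function name is likewise an assumption of this example.

```python
def mandatory_measures_indicated(avoided_fraction, cost_per_capita, benefit_per_case):
    """Apply the rule b * dF > C, ignoring uncertainty and risk aversion."""
    return benefit_per_case * avoided_fraction > cost_per_capita


# Hypothetical cost and benefit with C/b = 0.65, between 0.59 and 0.75.
cost, benefit = 0.65, 1.0

homogeneous_decision = mandatory_measures_indicated(0.75, cost, benefit)    # H= case
heterogeneous_decision = mandatory_measures_indicated(0.59, cost, benefit)  # H≠ case
```

With these placeholder values the rule recommends mandatory measures under the homogeneous assumption and not under the heterogeneous one, which is the sensitivity the paragraph describes.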

The size of the region of policy sensitivity also depends on the model boundary. For example, if awareness of the epidemic arising from, e.g., media reports causes individuals to engage in social distancing spontaneously, contacts will fall even without quarantine and travel restrictions, reducing the benefits of mandatory measures. If spontaneous social distancing reduces R0 persistently below one, mandatory measures would not be needed to quench the epidemic and would not be justified on cost-benefit grounds. At the other extreme, if the public’s
