
MBW:Stochastic Epidemic ModelingFrom MathBioContentsIntroductionThis wiki page presents results from our final project for APPM 5390 Mathematical Biology. An exact copy with functioning equations is available here. In this wiki we present our work regarding stochastic modeling of epidemics using SIS and SIR models. We provide an alternate discretization scheme to handle larger problems and discuss deterministic vs. stochastic model selection. We apply the models to the flu epidemic in an English boarding school reported in 1978 by the British medical journal 'The Lancet' [1]. Some characteristics of the epidemic
Summary of Project* Mathematics used: This project utilizes first order ordianary differential equations to model the spread of disease in a population. These deterministic models are then compared with discrete stochastic models that rely upon statistical techniques such as Markov chains and probability distributions. * Type of Model: The models in this project simulate the dynamics of SIS and SIR diseases for a population of school children in an English boarding school. * Biological System: The epidemic of a flu strain in a human population over a period of two weeks. Markov Chains for Stochastic Epidemic ModelsTechniques from Markov chains, applied to epidemic modeling, may allow for more detailed analysis of epidemics. All but the simplest epidemic models are nonlinear, preventing direct analytic computation of the time dependent probability distribution. However, numerical solutions can be computed for systems of moderate size (100's or 1000's individuals). A continuous time Markov chain is a discrete set of states, known as the state space, and a set of rules for transitions among the states. The Markov property means transitions between states occur randomly and depend only on the previous state. Mathematically, Markovianity is represented as follows: Let be a random variable that takes a value on the state space and represents the state of the Markov chain at time t then for all sequences in the state space and all times . Commonly, Markov chains are represented as directed graphs. Imagine your self arriving at node B in the following figure. Immediately, a random clock on each outward edge starts ticking. When the first clock rings you move along that edge to another node. The process is then repeated indefinitely. The behavior of a continuous time Markov chain is completely specified by its transition rate matrix, Q, and an initial distribution. The off diagonal elements are rate parameters for exponential random variables describing the transition from state i to j. The diagonal elements are negative with a value such that every row sums to zero. This construction yields the following governing system of differential equations known as the Kolmogorov Backward Equation Where p is a discrete, time dependent probability distribution on the state space. Given initial conditions p(0) the solution to the Kolmogorov Backward Equation is In practice the solutions can be determined through by this method or by numerical solution to the system of ODEs. Computing the exponential of a large matrix is computationally and memory intensive but an ODE integrator may require many time steps to reach the desired output time and could lose accuracy. The results in this wiki were computed using the exponential solution. SIS ModelThe deterministic model is: for parameters 'r' and 'a'. The stochastic model is: The SIS model is a typical 'compartmental'type model. It is effective in modeling onestep reactions with feedback. One might apply the SIS model to epidemiological problems where infection does not deliver immunity. Examples include some sexually transmitted diseases and the common cold. Adjusting Rate ConstantsQuite often, when dealing with mass action kinetic equations, we need to convert rate constants when changing form from a deterministic equation to a stoichiometric one. The rate constant units of measurement for a deterministic model differ from those of a stochastic model. The following examples are presented with the assumption that the reactions are elementary (i.e. consist of only one step). Examples1. X > Y, with reaction rate k
Let Then Thus, in this case 2. X + Y > Z
Deterministic and Stochastic SIS ModelsThe deterministic and stochastic models give different information about the development of the disease. For detail of the benefits and drawbacks of each model type, see below. In sum, the deterministic model is easier to implement but gives less information than the stochastic model. (For more information on the SIS Model and another application see APPM4390:Modelling the dynamics of a complex lifecycle parasite ) Adjusting Rate Constants for the Stochastic ModelTo adjust the rate constants for the stochastic SIS model, consider the examples above. In the deterministic problem, the I>S rate constant is a function of the parameter a, the deterministic rate constant. This is similar to the first example in which no adjustment in the constant is necessary; thus, For the susceptible to infected reaction, the S+I>2I rate constant is a function of r. This is similar to the second example. The result of interest is where c is the deterministic rate constant and k is the stochastic rate constant. Since the particle in the SIS model is a person, (meaning that there is one person per person). For this problem, . Thus we do not need to adjust the parameter r. ResultsOnce the rate constants have been transformed properly, one can compute the solution. The results of the stochastic solution agree with the deterministic solution. The animation below shows the probability distribution evolving in time.
Conditional ProbabilityOne adjustment that needs to be made is to recognize that the stochastic simulation will give a significant (around 26%) probability of there being no infecteds. Of course, the premise is that an outbreak did occur. To correct for this, it makes sense to look not at the final simulated (day 14) probability distribution but at the final simulated conditional probability distribution, that is, the distribution of possible numbers of infected given that there is at least one infected. One computes the conditional probability as follows: where the probability that one or more people are infected is In sum, the data was normalized to reflect this conditional dependence before plotting. SIR ModelThe SIR epidemic is modeled with the following equations where S, I and R are the numbers of susceptible, infected and recovered students. The parameters r and a respectively control the rate of infection and rate of recovery. The deterministic solution was calculated for the initial conditions of 1 infected and 762 suceptibles using the ode45 integrator in Matlab. For an additional example of an SIR model see APPM4390:Population Biology of Infectious Diseases. Scaling the Stochastic SystemAs a discrete process the SIR model takes the form One drawback of the stochastic models is encountered immediately upon trying to discretize the model of the boarding school epidemic. The state space for the SIR model consists of all 3tuples (S,I,R) where S,I,R are nonnegative and S+I+R=763. That's approximately 300,000 elements which is far to large to apply the standard solution. Instead, the system was discretized into 109 groups of 7 students. The size of the groups was dictated by the integer factors of 763 and the new state space has 6105 elements and is computationally tractable. The new initial conditions are: 1 infected group, 108 suceptible groups. The change in scale requires a change in the r parameter. In the conversion from concentration to discrete stochastic, combining the students into groups requires division by the number of groups per student (this corresponds to the conversion from moles to molecules in the concentration case however, here the students are converted from individuals to groups whereas the molecules are converted from groups to individuals). The a parameter is unchanged as in the stochastic SIS model. The following figure describes the transition rates for the stochastic SIR. Boundary conditions apply to keep each element of the 3tuple nonnegative and less than or equal to 109. Matlab code to implement this model is attached to the bottom of this wiki. ResultsThe animation to the right presents the results from the stochastic SIR model. The animation shows the probability mass function evolving across a simplex describing the state space. Probabilities for states with no infecteds are not shown. The stochastic SIR model agrees with the deterministic results but suggests that the timing of the epidemic is highly variable. Practically, it would be difficult to validate the probability the model assigns to the case that no epidemic occurs. Still, it is interesting to consider the timing of end of the epidemic. The figure below plots the probability (cumulative distribution function) that the system is in a state with no infecteds as a function of time (at which point the epidemic is over and the model will remain stationary). The plot indicates that there is a significant probability (~0.25) that the epidemic won't happen. However, if there are still infected students after two days there will almost certainly be an epidemic and it won't end until sometime after the tenth day.
Model ComparisonThe ODE and stochastic models have complementary strengths. The stochastic models account for individual interaction and provide the entire probability distribution output. However, accounting for individual interaction is intractable for large systems: for N individuals the state space in the stochastic SIS model has N+1 elements, the state space in the stochastic SIR model requires (N+1)(N+2)/2 elements. Epidemic models with more compartments, such as SEIR or MSIR, require geometrically increasing state spaces. On a personal computer, it becomes impractical to solve a Markov chain with more than a few thousand elements in the state space. Solving the system using an ODE integrator requires considerably less memory but computational requirements may be just as impractical, especially for large times. Finally, verifying the stochastic models experimentally would require many epidemics under controls and would be very difficult for human subjects. The ODE models are easy to solve accurately but only provide mean output. For large systems, the continuous nature of the ode solution will not significantly change the results. Recent AdvancementsIn a recent 2010 paper entitled "Stochasticity in staged models of epidemics: quantifying the dynamics of whooping cough", Andrew J. Black and Alan J. McKane attempt to improve the accuracy of stochastic epidemic models by incorporating subclasses into the standard stages of disease progression (SEIR). As discussed in this wiki project, a usual assumption in these models is that exposed and infectious periods are exponentially distributed, which makes the probability of moving from one class to another independent of time spent within that class. This assumption has been shown to be flawed, as the probability of recovery is strongly dependent on the time since infection. Black and McKane intend to remedy this problem by incorporating staged subclasses which are traversed sequentially. By using this method, the authors were able to reinterpret the dynamics of a seasonally forced model of whooping cough and better quantify this dynamical system, which could not be done successfully with the standard stochastic epidemic model. Further models are being studied and developed that study the effect of human decision on the outcome of the epidemic. References
Stochastic SIR Matlab Source% Matlab script to run scaled version of the epidemic %% Parameters N=109; %number of groups of boys  choose this small at first or it may take several days to run initI=1; %initial number of infecteds initR=0; %initial number of recovereds initS=NinitIinitR; %initial number of suceptibles r=(2.18E3)*(763/N); a=0.44036; %% Set up variables numS=0:N; numR=0:N; % number of recoverd, suceptible, and infecteds [R,S]=meshgrid(0:N,0:N); I=ones(N+1,N+1)*N(S+R); %flip make I upper triangular, transpose makes it lower triangular % have to flip the others to match I=fliplr(I)'; R=fliplr(R)'; S=fliplr(S)'; % create an index matrix that maps (S,I,R) tuples that sum to N into a % vector. index(ss,rr) gives index in state space Q, % S(ss,rr),I(ss,rr),R(ss,rr) gives number of sucepts, infecteds and recvds nn=0; index=zeros(N+1,N+1); for rr=1:(N+1) %index rows of S, R N+1rr gives number of recovereds for ss=1:rr %index cols of S, R. ss1 gives number of susceptibles nn=nn+1; index(rr,ss)=nn; % index is upper triangular end end numStates=nn; %total number of states in the state space %% Compute the probability rate matrix n=0; Q=zeros(numStates); % probability rate matrix for rr=1:(N+1) %index rows of S, R N+1rr gives number of recovereds for ss=1:rr %index cols of S, R. ss1 gives number of susceptibles n=index(rr,ss); %S>I: S decreases by one (so we take ss1), R %remains the same rate is rSI if ss>=2 m=index(rr,ss1); Q(n,m)=r*S(rr,ss)*I(rr,ss); end %I>R: I decreases by one (ss will stay the same), R increases (so %we need rr1, since r is indexed backwards). Rate is aI if rr>=2 && rr>ss m=index(rr1,ss); Q(n,m)=a*I(rr,ss); end Q(n,n)=sum(Q(n,:)); end end %% Compute the distribution, p, as a function of time % build an index array to map vector p to matrix p revIndex=zeros(numStates,1); %index statevector back into matrix for ii=1:numStates revIndex(ii)=find(index==ii); end %compute distribution t=(0:0.1:15); p=zeros(N+1,N+1,numel(t)); % 2d probability distribution, 1st index is S, 2nd is I, 3rd is time p0=zeros(1,numStates); p0(index(N+1initR,initS+1))=1; %set initial condition ptempsqr=zeros(N+1,N+1); for ii=1:numel(t) % compute distribution as a function of time (solving p'=pQ, p(0)=p0) ptemp = p0*expm(Q*t(ii)); ptempsqr(revIndex)=ptemp; p(:,:,ii)=ptempsqr; end %% Condition that there is always at least 1 person infected %% visualization figure; h=surfc(rot90(rot90(fliplr(p(:,:,ii))))); axis([0 N+1 0 N+1 0.04 0.04]); hold on; shading flat; for ii=1:numel(t) delete(h); h=surfc(rot90(rot90(fliplr(p(:,:,ii))))); title([num2str(t(ii)) ' Days']); xlabel('S (students)'); ylabel('R (students)'); zlabel('Probability'); shading interp; F(ii)=getframe; end 