September 20, 2017, Wednesday

# Spatial Genetics

Article review by Jason Hammond and Sekson Sirisubtawee

R. A. Fisher.The wave of advance of advantageous genes , Ann. Eugenics 7:353–369, 1937.

## Brief

The contents of this page demonstrate methods for analyzing Fisher's wave equation which models gene diffusion through a uniform linearly distributed population. One method, which differs from Fisher's, uses linear stability analysis to study equilibrium solutions and compares these to Fisher's results. The other two sections are summaries of the rest of the article.

## Model

The main purpose of this paper is to try to mathematically describe how an advantageous mutation of a gene will spread linearly through a habitat. Let us define the following variables:

• p = the frequency of the mutant gene
• q = the frequency of the parent allelomorph, i.e. q=(1-p)
• m = the intensity of selection in favor of the mutant gene; assumed to be independent of p
• k = coefficient of diffusion; assumed to be constant over the entire domain
• x = the spatial variable, $x\in (-\infty ,\infty )$
• t = time, measured in generations

With these definions p must satisfy the following partial differential equation along the linear habitat for

${\frac {\partial p}{\partial t}}=k{\frac {\partial ^{{2}}p}{\partial x^{{2}}}}+mpq$

## Waves of Stationary Form

One way to look at this model is to look for a stationary wave solution. Next we will diverge slightly from the technique used by Fisher, but will arrive at the same result. To represent a wave of stationary form advancing with velocity v, we may set:

$s=x-vt$

With this substitution we change the PDE into the following nonlinear 2nd order ODE:

$k{\frac {d^{{2}}p}{ds^{{2}}}}+v{\frac {dp}{ds}}+mpq=0$

If we let ${\frac {dp}{ds}}=w$ then we have the following first order system of ODE's:

$\left[{\begin{array}{c}p'\\w'\end{array}}\right]={\begin{cases}w\\{\frac {1}{k}}(-vw-mp(1-p))\end{cases}}$

Using linear Stability analysis we find there are two equilibria; (p,w)=(0,0)&(1,0), and the Jacobian matrix is given by:

$J\left[{\begin{array}{c}p,w\end{array}}\right]=\left[{\begin{array}{cc}0&1\\-{\frac {m}{k}}+2{\frac {m}{k}}p&-{\frac {v}{k}}\end{array}}\right]$

The eigenvalues associated with J[0,0] are

$-{\frac {v}{2k}}\pm {\frac {1}{2k}}{\sqrt {v^{{2}}-4mk}}$

and the eigenvalues associated with J[1,0] are

$-{\frac {v}{2k}}\pm {\frac {1}{2k}}{\sqrt {v^{{2}}+4mk}}$

Thus if $v^{{2}}>4mk$ then (0,0) is a stable equilibrium and (1,0) is an unstable saddle equilibrium, and if $v^{{2}}<4mk$ then (0,0) is a stable spiral equilibrium and (1,0) is an unstable saddle equilibrium. So the only situation we can realistically have is the case when $v^{{2}}>4mk$, because in the other case the solutions would allow p to be negative (but $p\in [0,1]$). Fisher comes to this result on the bottom of page 356. Another point to notice is that when ${\frac {d^{2}p}{ds^{2}}}=0$ there is a point of inflection at which the change of p with respect to s is largest in magnitude (i.e. largest gradient). This occurs when $w={\frac {-mpq}{v}}$. We will solve the problem numerically and see that this minimum occurs at the intersection of the solution curve and the horizontal nullcline in the phase portrait.

### Example Solutions with $v^{2}\geq 4mk$

We will now construct an example with m=k=1 and v=2 to illustrate some solutions. We used an initial value (p,w) very close to the unstable equilibrium at (1,0) to generate these plots:

compared with

on page 360.They match quite well. And we can see that we achieved the same p-value at the max gradient.

## The tabulation of the wave form for c=1

The following two sections are direct summaries of the work done by Fisher starting on page 361.

It is possible to get the numerical value of z for each value of p from 0 to 1, and therefore to construct the wave. The following three stages are the process for constructing the wave :

(a) An expansion of z in terms of p is obtained for the neighborhood of p =1,

(b) At any point on the curve dz/dp is calculated from the differential equation, and from this, and preceding values, then the next point on the curve is obtained.

(c) Since ${\frac {dz}{dp}}\rightarrow \infty$ as $p\rightarrow 0$ at a certain stage, a series of values of p for given z are obtained by interpolation, and the process continued using ${\frac {dp}{dz}}$ instead of ${\frac {dz}{dp}}$.

If $z={\sqrt {2}}-1+aq+bq^{{2}}+cq^{{3}}+...$ when $q$ is small, by substitution in the differential equation, and equating power of $q$ , we find that $a=0.10582293,\,b=0.0538084,\,c=0.0341074$. However, the fourth coefficient is numerically about 0.02417. The value of z corresponding to p = 0.96 is z = 0.41853482, then the value of ${\frac {dz}{dp}}$ can be calculated from ${\frac {dz}{dp}}={\frac {1}{pq}}\left({\frac {1}{z}}+2qz-2-z\right).$ When $pq$ is as small as 0.0384, the error in ${\frac {dz}{dp}}$ is about 170 times as big as that in z with the opposite direction. When $pq$ increases, the error in ${\frac {dz}{dp}}$ becomes less than 100 times that in z. The increment added to z to give the next value becomes sufficiently accurate.

If $D$ is the operation of differentiation, $\Delta$ is the forward differentiation, and $\nabla$is the backward differentiation, then

$\Delta z=\left\{1+{\frac {1}{2}}\nabla +{\frac {5}{12}}\nabla ^{{2}}+{\frac {3}{8}}\nabla ^{{3}}+...\right\}Dz=1+{\frac {1}{2}}\nabla (1+{\frac {5}{6}}\nabla (1+{\frac {9}{10}}\nabla (1+...,$

where three differences are used.

To minimize initial errors eight places were used in the values of z for p=0.96, 0.95 and 0.94. The second differences of ${\frac {dz}{dp}}$ show the slight oscillation ( this is shown as in Table II in the original paper.) When the process is continued, ${\frac {dz}{dp}}$ and its differences increase. Then the third and the fourth differences is appreciable at about p = 0.70 and p = 0.35, respectively. The values of p corresponding with z = 0.663, 0.664, 0.665 and 9.666 are calculated, using nine digits, then the values of ${\frac {dz}{dp}}$ calculated from the differential equation.

The values from z = 0.667 (in Table III from the original paper) is found by calculating the successive differences from the differential coefficient. The last point calculated gives z=0.875, p=0.000401738, at which stage p is decreasing by more than a quarter of its value at each step.

## Numerical Applications from the Paper

The total number of heterozygotes in any length of habitat in comparison with the number of organisms is $\int 2pq\,dx$ on the considered interval. Since $dx=\lambda {\frac {dp}{pqz}}$ , $\int 2pq\,dx=2\lambda \int {\frac {dp}{z}}$. With the simple algebra, we can have ${\frac {dp}{z}}=2dp-d(pqz)$. Consequently, the total number of heterozygotes can be expressed in the form,

$2\lambda \int {\frac {dp}{z}}=2\lambda (2p-pqz).$

If $p=1$, then the total number of heterozygotes maintained at any time is $4\lambda =4{\sqrt {\left({\frac {k}{m}}\right)}}$ which is proportional to the square root of the coefficient of diffusion, k, and inversely to the square root of the intensity of selection, m.

The effective center of the wave in its advance is the point at which there are as many mutant genes in front as there are parent genes behind. The number of parent genes behind any point is $\int _{{-\infty }}qdx=\lambda \int ^{{1}}{\frac {dp}{pz}}$.

The form $\lambda \int _{{0}}{\frac {dp}{qz}}$ is unsuitable, since the differential coefficient of z is infinite at $p=0$, so we have to change such form to make it more appropriate. Since ${\frac {1}{z}}=2-{\frac {d}{dp}}(pqz)$, $d(pqz)=pzdq+qd(pz)$, and integration by parts, we get that

{\begin{aligned}\lambda \int _{{0}}{\frac {dp}{qz}}&=\lambda \left(-2lnq-\int _{{0}}{\frac {1}{q}}d(pqz)\right)\\&=\lambda \left(-2lnq-pz-\int _{{0}}pz{\frac {dq}{q}}\right)\\&=\lambda \left(-2lnq-pz-(p+lnq)z-\int ^{{1}}(p+lnq)dz\right).\end{aligned}}

The term $p+ln(q)$ is of the order of ${\frac {1}{2}}p^{{2}}$, and is negligible within the range tabulated.

Here is a concrete situation : Suppose that a mutation giving a selective advantage of 1% is spreading along a continuously occupied shore line, that the standard displacement of young from parents in each generation is 100 yards. Then with m = 0.01/generation, k = 5000 square yards/generation, and $\lambda =717$ yards, the number of heterozygotes is equal to the population of 2868 yards(or more than a mile and a half of coast), even though it is spread over 6 or 8 miles. The rate of advance $v=2{\sqrt {mk}}$ is about 14 yards/generation(or less than 10 miles in 1000 generations.) In order to spread over a habitat of several hundred miles may take 10,000 or 100,000 generations. The effective center in this example is about 210 yards behind the 50% point, while the steepest gradient of gene ratio is about 330 yards in advance of this point.

The proportion of mutant genes is $vg=2pqmz<{\frac {1}{4}}\%$ at its highest point, where p is about 44%. If both homozygotes and heterozygotes can be distinguished with certainty, then the sampling variance of p will be ${\frac {pq}{2n}}$, while that of the difference from two such counts will be ${\frac {pq}{n}}$. If n is as high as 10,000, then the standard error is 0.5% when p = 0.5 and the rate of change is 0.244% in each generation, so that about 5 generations must elapse. The rate of change is greatest at the maximum value of

$z={\frac {1}{1+{\sqrt {\left({\frac {1}{2}}+p\right)}}}}$

which occurs when p = 0.377 $\approx 1.297\lambda .$ If $n={\frac {1}{m^{{2}}}}$ , then the rate of change is 0.5005,its standard deviation.

### Mathematics Used

This is a reaction diffusion system.Fisher uses a stationary wave solution to analyze the partial differential equation. The group analyzing the paper used a different approach - they converted the equation into an ODE and used linear stability analysis to determine equilibrium solutions.

### Type of Model

This is a population model, analyzing how fast beneficial mutations travel through a population.

### Biological System Used

The system used is a population with a mutation that has a known positive effect. The population is spread out evenly (although it would be possible to adjust the density), and there are some basic factors that are known about approximately how much offspring travel and how often they reproduce.

### Discussion of a Recent Paper: Maini, McElwain, Leavesley

Maini, P. K. and McElwain, S. and Leavesley, D. (2004) Travelling waves in a wound healing assay. Applied Maths Letters, 17 (5). pp. 575-580.

In this paper, instead of analyzing a mutation wave through a population, the authors analyze cell growth propagation in wound healing. The authors performed experiments on human peritoneal mesothelial cells by introducing a wound into cell culture and measuring cell growth. The hypothesis that a constant speed travelling wave can be seen in biological systems was tested. The experimental data shows that the constant speed travelling wave model that is predicted by the Fisher equations accurately describes wound healing. Similar propagation scenarios can also occur in cancer cells, but the paper does not go into any detail.

## Related Articles

Relevant to Fisher's article is the work of Kolmogrov, Petrovskii and Piskunov. Many citations of Fisher's work is accompanied by citations of the three Russian mathematicians. One such article, authored by B.H. Gilding and R Kersner extends Fisher's model to describe wave solutions of a growth model for the bacteria, "Paenbacillus dendritiformis". With certain parameters the growth equation can be reduced for the extensively studied Fisher-KPP equation.

• A. KOLMOGOROV, I. PETROVSKII, and N. PISKUNOV, ´ Etude de l’´equation de la diffusion avec croissance de la quantit´e de la matier´e et son application ´a un probl´eme biologique, Moscow Univ. Bull. Math. 1:1–25 (1937).

## References

R. A. Fisher.The wave of advance of advantageous genes , Ann. Eugenics 7:353–369, 1937.