September 22, 2017, Friday

MBW:Geometry of Multisite Phosphorylation

From MathBio



Most of the work found on this wiki pertains to the description of biological phenomena as systems of differential equations. Results are then obtained by solving the system, often numerically, and quantifying the dynamics. However, such methods rely heavily on the parameters that describe the system. Recently, Manrai and Gunawardena developed a method for studying a simple biochemical system in a framework where there is no need to estimate parameters [1]. The analysis presented here walks through one of the main results reached by Manrai and Gunawardena. It continues by discussing how to use experimental data to determine the mechanism of kinases and phosphatases.


Phosphorylation in Biology

Phosphorylation regulates enzyme activity.

In biological systems, phosphorylation is one of the principal methods used by a cell to regulate protein activity. Phosphorylation is the reversible addition of a phosphate group to a protein. As many as half of the proteins found in the cell are regulated in this manner. The enzyme that adds the phosphate group is known as a kinase, while the enzyme that removes the phosphate group is a phosphatase.

One of the best characterized phosphorylation systems is part of signal transduction. For intercellular communication to occur, cells must be able to react to stimulants on the extracellular surface of the membrane. Generally, the extracellular signals bind to proteins in the cell membrane. The binding then creates a conformational change in the protein. Because of this conformational change, the cytosolic portion of the membrane protein interacts differently with proteins inside the cell. In many cases the conformational change activates the kinase activity of the transmembrane protein. More specifically, there is a class of these membrane proteins known as receptor tyrosine kinases (RTKs), such as the receptor for epidermal growth factor. When a signalling molecule binds to the extracellular portion of an RTK, the RTK is then able to dimerize with adjacent RTKs. In the dimerized state, the kinase domains of the RTKs are activated and the phosphorylation cascade begins. Other important instances of phosphorylation in biology include the transcription factor p53 and the kinase Wee1.

Because of the prominence of phosphorylation in cellular biology, the properties of these two groups of enzymes are relevant in determining the equilibrium concentrations of different phosphorylation states within the cell. One aspect of the enzyme dynamics is whether the enzyme is distributive or processive. A processive enzyme is able to add multiple phosphate groups to a substrate in the same reaction, while a distributive enzyme adds only one. Differentiating between these two classes will be the focus of the mathematics presented in this article.

Modeling Chemical Reactions

One of the most common methods to model chemical reactions is through mass action kinetics. Mass action kinetics assumes that the rate of product formation is proportional to the product of the concentrations of the reactants. For example, in a reaction with two species A and B that react to create C, depicted below, the rate of formation of C is proportional to the product of the concentrations of A and B.

A+B\rightarrow C

Then, under the mass action model, the rate equations would be

{\frac  {dA}{dt}}={\frac  {dB}{dt}}=-kAB


{\frac  {dC}{dt}}=kAB

where k, the proportionality constant, is known as the rate constant.
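The rate equations above can be integrated numerically. The following is a minimal sketch, assuming illustrative initial concentrations, an illustrative rate constant, and a hand-written fourth-order Runge-Kutta step; none of these values come from the source.

```python
def mass_action_rhs(state, k):
    """Right-hand side of dA/dt = dB/dt = -kAB and dC/dt = kAB."""
    a, b, c = state
    rate = k * a * b
    return (-rate, -rate, rate)


def rk4_step(state, k, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(x + scale * s for x, s in zip(base, slope))

    s1 = mass_action_rhs(state, k)
    s2 = mass_action_rhs(shifted(state, s1, dt / 2), k)
    s3 = mass_action_rhs(shifted(state, s2, dt / 2), k)
    s4 = mass_action_rhs(shifted(state, s3, dt), k)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, s1, s2, s3, s4)
    )


def simulate(a0, b0, c0, k, dt, steps):
    """Integrate the A + B -> C system for a fixed number of steps."""
    state = (a0, b0, c0)
    for _ in range(steps):
        state = rk4_step(state, k, dt)
    return state


# Illustrative values only (assumptions, not taken from the source).
final_a, final_b, final_c = simulate(a0=1.0, b0=2.0, c0=0.0, k=0.5, dt=0.01, steps=1000)
print(final_a, final_b, final_c)
```

Because every molecule of C consumes one A and one B, the sums A + C and B + C stay constant along the trajectory, which gives a quick sanity check on the integration.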

Intuitively, mass action kinetics is reasonable. When the reactant concentrations increase, a greater number of favorable collisions will occur between the two reactants in each time step, so more reactions will take place. Mass action kinetics can be derived using statistical mechanics.

The Parameter Problem in Mathematical Biology

In mathematical biology, one of the most widely used modelling techniques involves describing the biological subject as a dynamical system. Inherently, dynamical systems have many parameters. For example, in the chemical reaction scheme discussed above, each reaction in the system has an associated rate constant. Therefore, for valid conclusions to be reached, all parameters must be accurately estimated. If the dynamical system is not robust, small perturbations in the parameter values could drastically invalidate any conclusions. Unfortunately, at the same time parameters can be difficult to estimate. In the case where the system is not observable, this becomes impossible. What can be said about a dynamical system without quantifying the parameters? The work below will derive properties of a simple phosphorylation system without knowledge of any parameter values.

Mathematical Model

The simplified phosphorylation model considered in this work.

In the paper described here, the authors considered a simplified model of phosphorylation. The model consists of a substrate, a kinase, and a phosphatase, shown to the right. The substrate has two distinct phosphorylation sites, which can be phosphorylated in either order. Once one of the sites is phosphorylated, the substrate can either be phosphorylated again by the kinase or dephosphorylated by the phosphatase. It is assumed that a single kinase phosphorylates both sites and a single phosphatase dephosphorylates both sites. The dotted arrow in the diagram indicates the action of a processive enzyme because it represents a change in the phosphorylation state of both sites at the same time.

The phosphorylation reactions are represented as seen in part b of the figure. Each reaction takes place in two steps. First, the enzyme and the substrate react to form a complex. Then, the catalytic step takes place, creating the new phosphorylation state and releasing the enzyme.
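The two-step scheme can be written as mass-action rates. The sketch below assumes the common form E + S <-> ES -> E + P, with reversible binding and irreversible conversion; whether binding is reversible in the original model is an assumption here, and the names k_on, k_off, and k_cat are hypothetical.

```python
def enzyme_step_rates(e, s, es, k_on, k_off, k_cat):
    """Mass-action rates for E + S <-> ES -> E + P.

    Returns (dE/dt, dS/dt, dES/dt, dP/dt). The rate constant names are
    hypothetical labels, not taken from the original paper.
    """
    bind = k_on * e * s      # complex formation
    unbind = k_off * es      # complex dissociation (assumed reversible)
    convert = k_cat * es     # catalytic step releasing enzyme and product
    d_e = -bind + unbind + convert
    d_s = -bind + unbind
    d_es = bind - unbind - convert
    d_p = convert
    return d_e, d_s, d_es, d_p


# Illustrative concentrations and rate constants (assumptions).
d_e, d_s, d_es, d_p = enzyme_step_rates(
    e=0.1, s=1.0, es=0.02, k_on=5.0, k_off=1.0, k_cat=2.0
)
print(d_e, d_s, d_es, d_p)
```

Total enzyme (E + ES) and total substrate (S + ES + P) are conserved, so the corresponding rates sum to zero.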

We will let the vector x denote the concentrations of the species, including the concentrations of the enzymes alone, the different complexes, and the different phosphorylated states of the substrate. In all there are twelve such species. Further, we will denote the parameters by the vector a. The rate equations then take the form

{\frac  {dx_{i}}{dt}}=f_{i}(x;a)

where the function f_i is the reaction rate described by mass action kinetics. The exact form of each function can be found in the Mathematica notebook written by the authors of the original paper. [2]

We will be particularly interested in the steady states of the system, the values of x such that

f_{i}(x;a)=0,\quad i=1,...,12

The set of steady states is denoted W. In an attempt to be experimentally relevant, it is assumed that only the concentrations of the phosphorylated states of the substrate are measurable, so W will be projected onto these components by

\pi :{\mathbb  {R}}^{{12}}\rightarrow {\mathbb  {R}}^{4}

This mapping simply forgets the unknown concentrations of the other species. This model will be used to determine whether the kinase and phosphatase are distributive or processive.


The mathematical work discussed below assumes knowledge of abstract algebra. For a brief review of abstract algebra, see the original paper. Additionally, many of the polynomials that arise through the analysis, in particular anything after the calculation of the Gröbner basis, are fairly complicated and their inclusion would obscure the important details. They can be found in the Mathematica notebook written by the authors of the original paper.

First, assume that both the kinase and the phosphatase are distributive. In other words, the rates associated with forming the doubly phosphorylated substrate from the unphosphorylated substrate and its reverse are zero. We will derive an invariant that is characteristic of this situation.

Consider the field extension of the real numbers over the indeterminates a.

{\mathbb  {R}}(a)

Recall that this field extension is the smallest field containing the real numbers together with the indeterminates a. Its elements are the rational functions of the parameters with real coefficients. In this way, we can analyze the model without direct knowledge of the parameter values.

Then, the rate functions are polynomials over this field.

f_{i}\in {\mathbb  {R}}(a)[x_{1},...,x_{{12}}]

and further, define the ideal generated by the rate functions as

I=\langle f_{1},...,f_{{12}}\rangle \subset {\mathbb  {R}}(a)[x_{1},...,x_{{12}}]

Since it was assumed that only substrate concentrations are observable, we are interested in the 8th elimination ideal, which eliminates the first eight variables. This is because the variables x9,...,x12 correspond to the substrate concentrations.

I_{8}=I\cap {\mathbb  {R}}(a)[x_{9},...,x_{{12}}]

In order to calculate this elimination ideal, we must first calculate a Gröbner basis for the ideal I with respect to an elimination ordering. This can be computed simply by using the GroebnerBasis function in Mathematica. Then, the elements of the Gröbner basis that involve only x9,...,x12 generate the desired elimination ideal. Following this calculation, it is clear that

I_{8}=\{0\}

However, the 7th elimination ideal is nontrivial. By definition, this ideal consists of the polynomials in I that involve only x8,...,x12, which are the concentration of the phosphatase and the concentrations of the substrate states. The first three polynomials in the Gröbner basis depend only on x8,...,x12; therefore, this ideal is generated by those polynomials, namely

I_{7}=\langle x_{8}.p_{1},x_{8}.p_{2},x_{8}.p_{3}\rangle


p_{1},p_{2},p_{3}\in {\mathbb  {R}}(a)[x_{9},...,x_{{12}}]
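The elimination step can be tried on a toy ideal. The sketch below uses SymPy's groebner function in place of Mathematica's GroebnerBasis (a substitution, not the authors' code), and the two generators are hypothetical polynomials chosen only because the eliminated result is easy to recognize.

```python
from sympy import groebner, symbols

# Toy variables: t is eliminated, x and y are kept (all hypothetical).
t, x, y = symbols("t x y")

# Lexicographic order with t listed first ranks t above x and y,
# which is what makes the basis usable for eliminating t.
basis = groebner([x - t, y - t**2], t, x, y, order="lex")

# By the elimination theorem, the basis elements free of t generate
# the elimination ideal in the polynomial ring in x and y alone.
eliminated = [g for g in basis.exprs if not g.has(t)]
print(eliminated)
```

The same pattern applies to the phosphorylation model: order the variables so that the unobserved species come first, compute the basis, and keep the elements that involve only the retained variables.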

Recall that we are studying the steady states of the system, which correspond to the algebraic variety

V(I)=V(f_{1},...,f_{{12}})

Since the variety of an ideal does not depend on the choice of generators, points in the original variety must also lie in the variety defined by the Gröbner basis. As a consequence, the three polynomials above must vanish at every steady state. Since the concentration of the phosphatase, x8, is not equal to zero, it is necessary that p1=p2=p3=0.

Next, we introduce new constants to make the form of p1, p2, p3 easier to observe.

p_{1}=\alpha _{1}x_{{10}}^{2}+\alpha _{2}x_{{11}}^{2}+\alpha _{3}x_{{10}}x_{{11}}+\alpha _{4}x_{{10}}x_{{12}}+\alpha _{5}x_{{11}}x_{{12}}

p_{2}=\alpha _{6}x_{{11}}^{2}+\alpha _{7}x_{{9}}x_{{10}}+\alpha _{8}x_{{9}}x_{{12}}+\alpha _{9}x_{{10}}x_{{11}}+\alpha _{{10}}x_{{10}}x_{{12}}+\alpha _{{11}}x_{{11}}x_{{12}}

p_{3}=\alpha _{{12}}x_{{11}}^{2}+\alpha _{{13}}x_{{9}}x_{{11}}+\alpha _{{14}}x_{{9}}x_{{12}}+\alpha _{{15}}x_{{10}}x_{{11}}+\alpha _{{16}}x_{{10}}x_{{12}}+\alpha _{{17}}x_{{11}}x_{{12}}


\alpha _{i}\in {\mathbb  {R}}(a)

Notice that all three polynomials are homogeneous quadrics: every monomial they contain has degree two. This implies the interesting property that if

\left(x_{9},...,x_{{12}}\right)\in V(p_{1},p_{2},p_{3})

then, for any scaling factor λ,

\left(\lambda x_{9},...,\lambda x_{{12}}\right)\in V(p_{1},p_{2},p_{3})
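The scaling property follows because a homogeneous quadric satisfies p(λx) = λ²p(x), so a zero of p remains a zero after scaling. The sketch below checks this identity numerically for a polynomial with the same monomials as p1; the coefficient values and the test point are hypothetical stand-ins for the α_i and the concentrations.

```python
def quadric(coeffs, x):
    """Evaluate the sum of c * x[i] * x[j] over the (i, j) -> c entries."""
    return sum(c * x[i] * x[j] for (i, j), c in coeffs.items())


# Indices 0..3 stand for x9..x12. The monomials match the shape of p1;
# the numerical coefficients are hypothetical.
coeffs = {(1, 1): 2.0, (2, 2): -1.0, (1, 2): 0.5, (1, 3): -3.0, (2, 3): 1.5}

x = (0.7, 1.2, 0.4, 2.0)   # arbitrary test point
lam = 3.0                  # arbitrary scaling factor
scaled = tuple(lam * xi for xi in x)

lhs = quadric(coeffs, scaled)         # p(lambda * x)
rhs = lam ** 2 * quadric(coeffs, x)   # lambda^2 * p(x)
print(lhs, rhs)
```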

The three polynomials can be combined to yield the invariant we desire

\mu _{1}[x_{{10}}]^{2}+\mu _{2}[x_{{10}}][x_{{11}}]+\mu _{3}[x_{{11}}]^{2}-\mu _{4}[x_{{9}}][x_{{12}}]=0

again for

\mu _{i}\in {\mathbb  {R}}(a)

Therefore, substituting the substrate concentrations for their associated values of x and dividing by [S00][S11], the coordinates

\left({\frac  {[S_{{01}}]^{2}}{[S_{{00}}][S_{{11}}]}},{\frac  {[S_{{01}}][S_{{10}}]}{[S_{{00}}][S_{{11}}]}},{\frac  {[S_{{10}}]^{2}}{[S_{{00}}][S_{{11}}]}}\right)

must be coplanar for a fixed set of parameters.
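To see why these coordinates lie in a plane, set the invariant to zero at steady state and divide by the product of the unphosphorylated and doubly phosphorylated concentrations, assumed nonzero. The shorthand X, Y, Z for the three ratios is introduced here only for illustration.

```latex
% Assumes x_9 = [S_{00}], x_{10} = [S_{01}], x_{11} = [S_{10}], x_{12} = [S_{11}],
% with [S_{00}][S_{11}] \neq 0. X, Y, Z are shorthand introduced for this sketch.
\mu_1 x_{10}^2 + \mu_2 x_{10} x_{11} + \mu_3 x_{11}^2 - \mu_4 x_9 x_{12} = 0
\;\Longrightarrow\;
\mu_1 X + \mu_2 Y + \mu_3 Z = \mu_4,
\qquad
X = \frac{[S_{01}]^2}{[S_{00}][S_{11}]},\quad
Y = \frac{[S_{01}][S_{10}]}{[S_{00}][S_{11}]},\quad
Z = \frac{[S_{10}]^2}{[S_{00}][S_{11}]}
```

For fixed parameters the μ_i are constants, so this is the equation of a plane in (X, Y, Z).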


Dependence of enzyme mechanism on location of data points with respect to the invariant
Plots of curves parameterized by t

The above analysis derived a planar invariant for functions of the concentrations of the phosphorylated states of the substrate. In a way the invariant is nonintuitive. However, when the substrate only has one phosphorylation site, a similar invariant holds. In that case,

{\frac  {[S_{1}]^{2}}{[S_{0}][S_{2}]}}=constant

in which setting the derived results appear more straightforward.

Experimentally, this theory can be directly applied by observing the steady state concentrations under different concentrations of kinase and phosphatase. If both enzymes are distributive, all the data points should be coplanar.
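A coplanarity check of this kind can be sketched as follows. The invariant coefficients, the synthetic concentrations, and the tolerance are all hypothetical; real measurements are noisy and would call for a statistical fit rather than a fixed threshold.

```python
def invariant_coordinates(s00, s01, s10, s11):
    """Map substrate concentrations to the three ratios used by the invariant."""
    denom = s00 * s11
    return (s01 ** 2 / denom, s01 * s10 / denom, s10 ** 2 / denom)


def subtract(p, q):
    return tuple(a - b for a, b in zip(p, q))


def det3(u, v, w):
    """Determinant of the 3x3 matrix with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def are_coplanar(points, tol=1e-9):
    """Check that every point lies in the plane through the first three.

    Assumes the first three points are not collinear; tol is an arbitrary
    illustrative threshold, not a noise model.
    """
    origin = points[0]
    e1 = subtract(points[1], origin)
    e2 = subtract(points[2], origin)
    return all(abs(det3(e1, e2, subtract(p, origin))) <= tol for p in points[3:])


# Hypothetical invariant coefficients (mu_1..mu_4), used only to build data.
mu = (1.0, 2.0, 0.5, 4.0)


def synthetic_s11(s00, s01, s10):
    """Solve the invariant for [S11] so the synthetic sample satisfies it."""
    return (mu[0] * s01 ** 2 + mu[1] * s01 * s10 + mu[2] * s10 ** 2) / (mu[3] * s00)


# Hypothetical ([S00], [S01], [S10]) triples standing in for measurements.
samples = [(1.0, 0.5, 0.2), (0.8, 0.3, 0.6), (1.5, 1.0, 1.0),
           (0.6, 0.9, 1.1), (2.0, 0.7, 0.3)]
points = [invariant_coordinates(s00, s01, s10, synthetic_s11(s00, s01, s10))
          for s00, s01, s10 in samples]

# Push the last point off the plane to show the check failing.
perturbed = points[:-1] + [(points[-1][0], points[-1][1], points[-1][2] + 0.1)]

print(are_coplanar(points), are_coplanar(perturbed))
```

The determinant test measures the signed volume spanned by a candidate point and the plane through the first three points; a value near zero means the point lies in that plane.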

The paper discussed also analyzes cases in which either enzyme is processive, which is beyond the scope of the analysis presented above. In the case where not all enzymes are distributive, the authors were able to determine which side of the invariant plane the data points would fall on depending on which enzymes were processive. The results of that analysis can be seen in the figure on the right. They introduced the variable t, which parameterizes the curves shown in the figure.


As can be seen in the figure, the shape of the curve for increasing values of t depends highly on the mechanism of the enzyme.

To further demonstrate this, the authors randomly selected parameter values and plotted the associated curves as a function of t, shown on the right. These curves corroborate the algebraic analysis.


The most apparent restriction to the results is the simplicity of the model studied. Most molecules of interest have many phosphorylation sites, not simply two. Because of this, the invariant plane that was reached is not applicable to many real systems of interest. Additionally, the analysis does not directly extrapolate to more phosphorylation sites. In such cases, it might be more difficult to compute the proper elimination ideal, and once it is computed, it is not clear that the polynomials would remain homogeneous quadrics. Similarly, it is not always the case that a single kinase and a single phosphatase are able to react with the substrate. Accounting for multiple enzymes would further increase the complexity of the analysis.

Although to an extent the conclusions of this paper are remarkable, there is still some ambiguity about the application of the developed theory to experimental data. In any experimental assay there will be noise obscuring the observations. As with the estimation of parameters for a dynamical system, the noise could make it difficult to determine the planarity of the desired values. Based solely on the analysis discussed above, the sensitivity of the conclusion to such error is uncharacterized. However, it is likely that the error is dramatically below that which would be incurred by a method that relied on parameter estimation.


  1. Manrai, A.K. and J. Gunawardena (2008). "The Geometry of Multisite Phosphorylation." Biophysical Journal 95(12): 5533-5543