# Introduction

(1)

According to the Centers for Disease Control and Prevention (CDC),
heart disease is the leading cause of death for both men and
women [5], accounting for a substantial share of all
deaths per year in the United States. Because the number of
deaths caused by heart disease is so high, considerable effort is
spent trying to understand the causes and implications while also
creating better ways to predict disease onset. One physiological
aspect that is studied is the heart rate, or the rate at which the
heart contracts, pumping blood through the arteries. Heart rate is
usually studied via *electrocardiography*, a branch of medicine
which uses sensors to detect the electrical activity in the heart.
These sensors, also known as EKGs, generate time series data of
electrical signals. The time between contractions (or, strictly, its
reciprocal) is commonly referred to simply as "heart rate."

In this work I further investigate heart rate dynamics by analyzing
time series data. I use various nonlinear dynamics techniques such as
delay coordinate embedding, Lyapunov exponents, and capacity dimension
calculations in the hope of determining whether heart rate time series
data is chaotic. However, there are two novel differences I
introduce. First, I conduct nonlinear time series analysis of heart
rate data collected from a triathlete training for and racing
Ironman-distance events. Second, this data is not collected by a
clinical EKG machine, but by wearable, consumer-grade electronics such
as a Garmin Heart Rate Monitor. The main source of data for previous
research comes from clinical or hospital settings. That data is
usually gathered from broad categories of people,
such as "normal," "male, ages 30-34," or congestive heart failure
patients. Also, that data is collected while the patients are asleep
or at rest. Results from the work presented here can be used in
determining differences between healthy people and congestive heart
failure patients, for example. Also, it may be possible to help
create better training programs for endurance multi-sport athletes.

Three different regimes are investigated. The hypothesis is that each
of the following regimes will provide measurably different dynamics:
training vs. racing, early cycle vs. late cycle, and sequential
vs. isolated events. I believe that racing data will exhibit more
evidence for chaos than training data. This is based on the personal
experience that there are significantly more things out of your
control during a race, and these added variables will cause the system
to be chaotic. Also, early cycle training data will demonstrate more
chaotic evidence than late cycle data due to the fact that early in a
cycle the body is struggling in more ways. Lastly, isolated events
will show more indications of chaos than sequential events. This is
mostly a guess from experience: the longer a race goes on, the fewer
fluctuations you experience.

# Background

(2)

Due to the large number of people who suffer from heart disease,
research into cardiac functions, such as heart rate, is prevalent.
Previous research has analyzed the heart rate (as time series data),
especially from the perspective of nonlinear dynamics, in an effort to
further understand the details of the cardiovascular system. In fact,
in 2009 the journal *Chaos* introduced a new feature,
"Controversial Topics in Nonlinear Dynamics," and the topic of the
first edition was entitled "Is the Normal Heart Rate Chaotic?" The
edition included eight relatively introductory papers on this broad
topic. One paper, by Baillie, Cecen, and Erkal, makes the claim that
heart rate is not chaotic [1]. This result
corroborates earlier work by Kanters et
al. [4]. However, both of these papers directly
conflict with earlier work by Ivanov
et al. [3]. The work by Ivanov et al. is
related to other multifractality investigations by Sassi et
al. [6], which suggest that heart rate
variability may be multi-fractal for certain conditions, but not all.
There is also work suggesting that the original question, "Is the
Normal Heart Rate Chaotic?" is too general; its authors, Wessel
et al., suggest that the "chaotic-ness" of the heart rate is related
to respiration [9]. Clearly, the subject is not well
understood, and I intend to further the discussion by taking a
somewhat different approach.

# Methods

(3)

## Delay Coordinate Embedding

(4)

An implied assumption I have made is that the heart rate is a single
state-variable in a much larger biological system. For example, from
experience I know that both hydration-level and temperature affect the
heart rate variable. Also, pace, terrain, energy stores, rest, and
injuries all play a role in the heart rate. However, using a heart
rate time series, there is no access to any of these other variables.
Therefore, I use delay coordinate embedding to "generate" these
other variables. These generated values are not used directly;
rather, Takens' theorem guarantees that the reconstructed state space
is *diffeomorphic* to the original state space, so topological
invariants such as Lyapunov exponents and the capacity dimension are
preserved under the reconstruction [8].
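Since the actual analysis in this work relied entirely on TISEAN, the following is only an illustrative Python sketch of what a delay reconstruction does; the function `delay_embed` and the sine-wave input are my own hypothetical example, not part of the toolchain used here.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Reconstruct a state space from the scalar series x using delay
    coordinates: row i is (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim/tau")
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

# Embedding a sine wave with dim=2 turns it into a closed loop,
# the reconstructed "attractor" of a periodic signal.
t = np.linspace(0, 20 * np.pi, 2000)
points = delay_embed(np.sin(t), dim=2, tau=25)
print(points.shape)  # (1975, 2)
```

Each row of `points` is one reconstructed state vector; with a longer delay vector (`dim` > 2) the same function produces the higher-dimensional embeddings used throughout this work.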

The first step in creating an embedding is to estimate the delay
value, τ. To do this, I use the TISEAN program **mutual**.
The **mutual** program computes the time-delayed mutual information
between the series and a delayed copy of itself. A typical method is
to use the first minimum of the mutual information curve as the delay
value, τ.
Once we have a value for the delay, the next step is to estimate the
number of dimensions. The TISEAN program **false_nearest** is
what I used to do this. The **false_nearest** program reports
the percentage of "false" nearest neighbors, neighbors that separate
when the embedding dimension is increased, as a function of the
dimension. A common technique is to choose the dimension where the
false-neighbor percentage drops below a threshold of 10 percent.
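The first-minimum heuristic can be sketched as follows; this is a rough Python approximation of the idea, not TISEAN's implementation, and the histogram bin count and function names are my own choices.

```python
import numpy as np

def mutual_information(x, tau, bins=16):
    """Time-delayed mutual information between x[t] and x[t+tau],
    estimated from a 2-D histogram (a rough stand-in for what
    TISEAN's mutual computes)."""
    a, b = x[:-tau], x[tau:]
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)   # marginal of x[t]
    py = p.sum(axis=0, keepdims=True)   # marginal of x[t+tau]
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz])))

def first_minimum_delay(x, max_tau=100):
    """Return the first local minimum of the mutual information curve,
    the usual heuristic for choosing the embedding delay."""
    mi = [mutual_information(x, tau) for tau in range(1, max_tau + 1)]
    for i in range(1, len(mi) - 1):
        if mi[i] < mi[i - 1] and mi[i] <= mi[i + 1]:
            return i + 1  # mi[i] corresponds to delay i + 1
    return max_tau

# For a periodic signal, information falls off as the delay grows
# toward a quarter period.
x = np.sin(np.linspace(0, 20 * np.pi, 2000))
tau_hat = first_minimum_delay(x)
print(tau_hat)
```

The same first-minimum rule applied to **mutual**'s output is what produced the delays used in the rest of this work.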

## Lyapunov Exponents

(5)

Now that a reconstructed state space, or embedding, has been created,
the calculation of the maximal Lyapunov exponent can proceed.
TISEAN's **lyap_k** program is used to do this. In doing so, I
supply both the estimated delay, τ, and the approximate dimension
of the data, both of which were calculated previously in
Section 4. Without going into too much detail, the
output of the **lyap_k** program is essentially an estimated
"stretching factor." The **lyap_k** program calculates this
stretching factor for various points and for different values of a
parameter called ε, which controls how big the local
neighborhood for each calculation is. For more information, please
see [2]. When analyzing the **lyap_k**
output, I look for scaling regions of constant slope, where
the value of that slope is the maximal Lyapunov exponent.
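Extracting the exponent from a scaling region amounts to a straight-line fit. Below is a minimal sketch of that step, using a synthetic stretching-factor curve of my own rather than real **lyap_k** output.

```python
import numpy as np

def scaling_region_slope(iterations, stretch, lo, hi):
    """Least-squares slope of the stretching factor over the scaling
    region [lo, hi]. For lyap_k-style output, this slope estimates
    the maximal Lyapunov exponent."""
    mask = (iterations >= lo) & (iterations <= hi)
    slope, _intercept = np.polyfit(iterations[mask], stretch[mask], 1)
    return slope

# Synthetic curve: grows linearly with slope 0.3, then saturates,
# mimicking the "linear region then scatter" shape described above.
it = np.arange(50, dtype=float)
s = np.minimum(0.3 * it, 6.0)
lam = scaling_region_slope(it, s, lo=0, hi=19)
print(lam)  # 0.3 (up to floating-point error)
```

In practice the scaling-region bounds `lo` and `hi` must be chosen by eye from the plot, which is exactly the manual step described in the results below.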

## Capacity Dimension

(6)

When we think of dimension, either in a physical sense or a higher,
multi-dimensional sense, we usually think of integer-valued
dimensions. For example, a line is one dimension, while physical
objects exist in three dimensions. However, some objects, such as the
Koch curve, exhibit different properties. The arc length between any
two points on the Koch curve is infinite, yet the curve bounds
a finite area. There is a calculation called the
capacity dimension which captures this relationship. The Koch curve,
as well as many other fractals, has a non-integer-valued capacity
dimension. Chaotic systems often have fractal structure. Therefore,
computing the capacity dimension of the embedded time series data can
provide more evidence of a chaotic system. To compute the capacity
dimension of the heart rate data I used the TISEAN program
**d2**. Again, without going into the details of the **d2**
algorithm, the output gives a correlation sum for each value of the
parameter ε, as discussed in Section 5. Instead
of looking for a constant-slope scaling region, we are looking for a
region which is *asymptotically invariant*: a
region of the curve where the correlation sum does not change over many
values of ε.
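The correlation sum itself is simple to state. The following is an illustrative Grassberger-Procaccia-style computation of my own, not the **d2** algorithm, demonstrated on a set whose dimension is known to be one.

```python
import numpy as np

def correlation_sum(points, eps):
    """Fraction of distinct point pairs closer than eps. The dimension
    estimate is the slope of log C(eps) versus log eps inside the
    scaling region."""
    n = len(points)
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    close = np.sum(dists < eps) - n          # drop the zero self-distances
    return close / (n * (n - 1))

# Points spread uniformly on a line segment have dimension one, so the
# local slopes of log C versus log eps should all be near 1.
line = np.linspace(0.0, 1.0, 400)[:, None]
eps = np.array([0.025, 0.05, 0.1])
c = np.array([correlation_sum(line, e) for e in eps])
slopes = np.diff(np.log(c)) / np.diff(np.log(eps))
print(slopes)  # both values close to 1
```

A capacity dimension near one, as found repeatedly below for the heart rate data, corresponds to exactly this behavior: the log-log slope of the correlation sum settles at about 1.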

## Process

(7)

The first question I investigated was this: do events completed in
isolation (a single run, for example) show different properties than
events completed in succession (a run preceded by a bike ride)? In a
triathlon, each event, swimming, biking and running, is performed in
quick succession. That is, there is no resting between events. In
fact, the time between events (called *transition* ) is typically
a very stressful period for athletes. However, most athletes train
events singularly: go to the pool and swim, or just leave your house
for a run. To study this, I looked at two similar length bike rides.
The first was a 3 hour, 30 minute bike ride in isolation. The second
was a 2 hour 45 minute bike ride which was preceded by a 10 kilometer
(approximately 56 minute) run. Even though a race places the swim
before the bike, getting to a pool and biking right afterwards can be
difficult, so we simulate that sequence by running first instead of
swimming. This figure shows the TISEAN output for
**mutual** and **false_nearest**, used to approximate the
delay and the dimension, respectively.

From these graphs I chose the delay, τ, and the dimension, m, for
the isolated bike event, and likewise for the sequential bike
event. Now that we have estimated these parameters we can calculate the
Lyapunov exponents. The next figure shows the
TISEAN output: the y-axis is the "stretching factor"
and the x-axis is the iteration. Each of the different colors
represents a different value of the algorithm's parameters: ε,
the local neighborhood size, and m, the dimension. Looking at the left side of
each graph in this figure, a positive and
constant-slope region is visible before the graph becomes much more
scattered. This is exactly the scaling region that indicates the
Lyapunov exponent.

In the left-hand figure the scaling
region is clear, but in the right-hand figure, while the
region is still present, it is significantly less pronounced. I found
the slope of both regions: isolated = 0.31881 and sequential =
0.27798. To do these calculations, I took the average slope of the
scaling regions over all the different parameter settings.
The next figure shows lines fitted to the scaling regions
for both the isolated and sequential bike events.

Next we look at the capacity dimension results, which can be seen
here.

In both plots we clearly see an
asymptotically invariant region where the points are clustered around
a line before dropping off. In fact, both plots are very similar. To
find the capacity dimension, I fitted a line to these points (only the
asymptotically invariant ones) and found where that line intersects
the y-axis. This happens at approximately one for both plots,
so I estimated the capacity dimension to be 1 in both cases.
This does not indicate a fractal nature to the time series. And
while it does not support the claim of a chaotic system, it does not
necessarily detract from it either.

The above illustrates the process I used for the entirety of this
work. I do not detail the other investigations in this way (isolated
run vs. sequential run, training vs. racing, etc), but present the
results in Section 22.

# Results

(22)

## Isolated vs. Sequential

(23)

I have shown the isolated vs. sequential results in
Section 7, for the biking case.
The next images present the
results for the running case. Here the scaling region in the Lyapunov
calculation is significantly less pronounced. In fact, I believe it
is not actually present, indicating the maximal Lyapunov exponent is
not positive. Also, the capacity dimension calculations are similar to
what was calculated above: approximately one. Again, this result does
not support the evidence of a chaotic system. Indeed, for these
running examples, there has been no strong evidence of a chaotic system.

Here we have the sequential results:

## Training vs. Racing

(30)

For this case I am displaying results from biking events. This is for
two reasons: first, the biking events are longer and provide more data,
and second, the batteries on my heart rate device sometimes do not
last the whole race, which truncates the running data. We
compare the 3 hour 30 minute bike presented previously against data
collected during the bike event at the Boulder Half Ironman. This
case provides significant information, because these events were
completed on the same bike course, removing some variables. However,
the race data is, by definition, sequential. All race data is
sequential; therefore, I am unfortunately unable to prevent the
introduction of this new variable. Race results are shown in
Figure 33.

From this figure, we clearly see a scaling region
similar to the isolated bike event. This is strong evidence that
there is a positive Lyapunov exponent. The capacity dimension is also
very similar to the previous results, again indicating a capacity
dimension of approximately one.

## Early vs Late in the Training Cycle

(34)

One strategy for endurance training is the concept of periodization.
This is a loose organization of workouts that gradually build in
duration and intensity over a period of several weeks. At the end of
this period, both the duration and intensity are significantly
reduced, allowing the athlete some recovery before starting the next
cycle. To investigate the heart rate dynamics that occur at different
places in this cycle, I compared the same 3 hour 30 minute bike ride,
which occurred just a few weeks before a half-Ironman race, against
the very first bike ride of this year's season, a 2 hour ride. The
calculations for the 2 hour ride are depicted here.

These results are, again, very similar to the isolated bike
investigation. We see a clearly defined scaling region
indicating a positive maximal Lyapunov exponent, and the capacity
dimension still appears to be around one.

# Discussion

(38)

## Dealing with Real Data

(39)

One of the main challenges of this work was working with real
data. For example, for all the above algorithms and ideas to work,
the assumption is made that the data are sampled at evenly spaced
points. Unfortunately, this is not the case with the device I used.
Therefore I attempted to use inter-spike interval
embedding [7]. Inter-spike interval embedding
basically states that if your data events are the result of some
integral-like action (the firing of a neuron and a heart beat are good
examples), then you can simply use the differences in time between
events. My device does not report when the heart actually
beats; it reports (as far as I can tell) a running average over the
last several seconds. This value is not the result of an integral-like
process, and therefore inter-spike interval embedding is not
applicable. The only other option was to interpolate the points at
regular intervals. This is usually a bad idea, but because I
am at the mercy of the data and the sensor, I had no other choice. I
would also like to point out that because I did use interpolation, the
reported results are cast under a shadow of uncertainty.
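A sketch of that interpolation step, using hypothetical uneven timestamps of my own invention; `np.interp` performs the linear interpolation.

```python
import numpy as np

def resample_uniform(times, values, dt):
    """Linearly interpolate an unevenly sampled series onto a grid
    with uniform spacing dt, since the TISEAN tools assume evenly
    spaced samples."""
    grid = np.arange(times[0], times[-1], dt)
    return grid, np.interp(grid, times, values)

# Hypothetical uneven heart-rate samples: (seconds, beats per minute)
t = np.array([0.0, 1.1, 1.9, 3.2, 4.0, 5.3])
hr = np.array([92.0, 95.0, 97.0, 99.0, 98.0, 101.0])
grid, hr_even = resample_uniform(t, hr, dt=1.0)
print(grid)  # [0. 1. 2. 3. 4. 5.]
```

Linear interpolation smooths the series slightly, which is one reason the results above carry that shadow of uncertainty.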

Another issue I ran into was the manual intervention required at many
of the steps. For example, all of the algorithms I used assume that
the trajectory has settled onto its long-term behavior. To say
that another way, the algorithms do not operate on the transient part
of the trajectory. But my data definitely has a transient portion,
while the heart rate climbs to an endurance level. Therefore, I had
to manually remove the transient from each data file. This was a
time-consuming and tedious process.
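The transient removal was done by hand in this work, but the idea can be sketched as a heuristic; the window size, tolerance, and steady-state assumption below are all my own illustrative choices, not the procedure actually used.

```python
import numpy as np

def trim_transient(hr, window=60, tol=0.05):
    """Drop leading samples until a sliding window stays within
    tol (fractionally) of the late-series mean, which is assumed
    to represent the steady endurance level."""
    steady = hr[len(hr) // 2:].mean()
    for i in range(len(hr) - window):
        w = hr[i : i + window]
        if np.all(np.abs(w - steady) <= tol * steady):
            return hr[i:]
    return hr  # no steady stretch found; leave untouched

# Synthetic example: heart rate ramps from 80 to 150 bpm, then holds.
hr = np.concatenate([np.linspace(80.0, 150.0, 120), np.full(300, 150.0)])
trimmed = trim_transient(hr)
print(len(hr), len(trimmed))  # the ramp is mostly removed
```

An automated rule like this could make the trimming less tedious, though any threshold choice still deserves a visual check against the plotted series.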

Lastly, for the algorithms to work correctly they must operate on the
same regime of data. But many of my longer training sessions have
breaks; in several workouts during the summer, when it is very hot
outside, I am forced to walk during a run, causing my heart rate to
drop and leave the regime I am trying to investigate. Stopping or
resting also introduces another transient period, which I would need
to remove. I did not use data files where multiple regimes were
present. This limited my choices, especially for longer workouts.

## Selecting Data

(40)

Before calculating embeddings, Lyapunov exponents, and capacity
dimensions, I first selected specific workouts to analyze. Selecting
these workouts was a somewhat arbitrary task. I have over two years
of data, hundreds of hours and thousands of miles. However, as
mentioned in Section 39, many of the data files contained
problems, such as transient trajectories and regime changes.
Therefore, I tried to choose the data files that best fit the
requirements of the algorithms. I picked longer time series that did
not contain rest. I also removed the initial transient trajectories.
This is not to say that I chose files that would support my claim of
chaos more than others. Longer files with no regime changes were
given preference.

## Data Chunks

(41)

To further validate the results, I also ran the same calculations on
portions of the sequential bike data. The idea is that if we analyze,
say, the middle two-thirds of the data, we should get similar results.
The next figure shows the results of doing just that. The
average slope in this case is 0.12842; while this is a
different number, it is still positive and of similar magnitude.
Also, the plot does not change much. This further strengthens our
confidence in the positive maximal Lyapunov exponent.
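The chunking itself is straightforward; below is a sketch, with `middle_fraction` being my own helper name rather than anything from the actual pipeline.

```python
import numpy as np

def middle_fraction(x, frac=2/3):
    """Return the central frac of a series, used as a stability check:
    re-running the analysis on a sub-span should give similar results
    if the estimate is robust."""
    drop = int(len(x) * (1 - frac) / 2)
    return x[drop : len(x) - drop]

x = np.arange(900)
mid = middle_fraction(x)
print(len(mid), mid[0], mid[-1])  # 600 150 749
```

Running the full delay/Lyapunov/dimension pipeline on `mid` and comparing against the full-series estimates is exactly the consistency check performed here.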

# Conclusion

(43)

I have shown that for biking data there is evidence of a positive
maximal Lyapunov exponent in all test scenarios: isolated
vs. sequential, racing vs. training, and early vs. late season. In
that sense, my initial hypothesis was incorrect: the dynamics
were similarly chaotic. All of the running data, across all
regimes, showed little evidence for chaos. The capacity dimensions
for all cases provided little support because the calculations were
all approximately one. I also looked at operating on different chunks
of data, and this further strengthened the case for a positive
Lyapunov exponent. There are many issues in this work, not least of which is
the interpolated data and the amount of data. Future work should
focus on a better data collection mechanism while also focusing on a
method to collect more data.

# References

[1] Richard T Baillie, Aydin A Cecen, and Cahit Erkal. Normal heartbeat series are nonchaotic, nonlinear, and multifractal: New evidence from semiparametric and parametric tests. Chaos: An Interdisciplinary Journal of Nonlinear Science, 19(2):028503–028503, 2009.

[2] Rainer Hegger, Holger Kantz, and Thomas Schreiber. Practical implementation of nonlinear time series methods: The TISEAN package. Chaos: An Interdisciplinary Journal of Nonlinear Science, 9(2):413–435, 1999.

[3] Plamen Ch Ivanov, Luís A Nunes Amaral, Ary L Goldberger, Shlomo Havlin, Michael G Rosenblum, Zbigniew R Struzik, and H Eugene Stanley. Multifractality in human heartbeat dynamics. Nature, 399(6735):461–465, 1999.

[4] Jørgen K Kanters, Niels-Henrik Holstein-Rathlou, and Erik Agner. Lack of evidence for low-dimensional chaos in heart rate variability. Journal of Cardiovascular Electrophysiology, 5(7):591–601, 1994.

[5] Kenneth D Kochanek, Jiaquan Xu, Sherry L Murphy, Arialdi M Miniño, and Hsiang-Ching Kung. National vital statistics reports. National Vital Statistics Reports, 59(4):1, 2011.

[6] Roberto Sassi, Maria Gabriella Signorini, and Sergio Cerutti. Multifractality and heart rate variability. Chaos: An Interdisciplinary Journal of Nonlinear Science, 19(2):028507–028507, 2009.

[7] Tim Sauer. Interspike interval embedding of chaotic signals. Chaos: An Interdisciplinary Journal of Nonlinear Science, 5(1):127–132, 1995.

[8] Floris Takens. Detecting strange attractors in turbulence. In Dynamical systems and turbulence, Warwick 1980, pages 366–381. Springer, 1981.

[9] Niels Wessel, Maik Riedl, and Jürgen Kurths. Is the normal heart rate chaotic due to respiration? Chaos: An Interdisciplinary Journal of Nonlinear Science, 19(2):028508–028508, 2009.