Are Green Number Runners More Likely to Bail?

2013-06-22     R running

Comrades Marathon runners are awarded a permanent green race number once they have completed 10 journeys between Durban and Pietermaritzburg. For many runners, once they have completed the race a few times, achieving a green number becomes a possibility. And once the idea takes hold, it can become something of a compulsion. I can testify to this: I am thoroughly compelled! For runners with this goal in mind, every finish is one step closer to a green number. Read more »

The Green Number Effect

2013-06-18     R running

Following up on a suggestion from my previous post, here are the statistics for medal count versus age. Each point on the plot shows the number of athletes (see colour legend on the right) who have achieved a given number of medals by a particular age. Read more »

Age Distribution of Comrades Marathon Athletes

2013-06-18     R running

I can clearly remember watching the end of the 1989 Comrades Marathon on television and seeing Wally Hayward coming in just before the final gun, completing the epic race at the age of 80! I was in awe. Since I have been delving into the Comrades Marathon data, this got me thinking about the typical age distribution of athletes taking part. The plot below indicates the ages of athletes who finished the race, going all the way back to 1984. You can clearly spot the two years when Wally Hayward ran (1988 and 1989). My data indicates that he was only 79 on the day of the 1989 Comrades Marathon, but I am not going to quibble over a year and I am more than happy to accept that he was 80! Read more »

Kagi Chart Indicator


In addition to a range of data analysis services, Exegetic Analytics also implements algorithms for automated FOREX trading. I am currently developing an expert advisor (EA) for a client. The strategy was developed on the ProRealTime charting software using Kagi Charts. My client wants to automate the strategy and implement it in MQL on the MetaTrader platform. One snag: Kagi Charts are independent of time. Or, more accurately, they do not have a uniform time axis. Charts in MetaTrader are of the classical variety with a nice linear time axis. So my first problem was to implement something analogous to the Kagi Chart under MetaTrader. Read more »

Medal Allocations at the Comrades Marathon

2013-06-09     R running

Comrades Marathon Attrition Rate

2013-06-07     R

It is a bit of a mission to get the complete data set for this year’s Comrades Marathon. The full results are easily accessible, but come as an HTML file. Embedded in this file are links to the splits for individual athletes. So with a bit of scripting wizardry it is also possible to download the HTML files for each of the individual athletes. Parsing all of these yields the complete result set, which is the starting point for this analysis. Read more »

Analysis of Cable Morning Trade Strategy

2013-05-29     R

A couple of years ago I implemented an automated trading algorithm for a strategy called the “Cable Morning Trade”. The basis of the strategy is the range of GBPUSD during the interval 05:00 to 09:00 London time. Two buy stop orders are placed 5 points above the highest high for this period; two sell stop orders are placed 5 points below the lowest low. All orders have a protective stop at 40 points. When either the buy or sell orders are filled, the other orders are cancelled. Of the filled orders, one exits at a profit equal to the stop loss, while the other is left to run until the close of the London session. Read more »
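
The order levels described above can be sketched in a few lines of R. This is only an illustration of the arithmetic, not the code used for the automated strategy: the function name, the argument names and the assumption that one point equals 0.0001 on GBPUSD are all mine, and the highs and lows are made-up numbers.

```r
# Sketch of the Cable Morning Trade order levels.
# Assumptions (not from the original strategy code): the function and
# argument names, and a point size of 0.0001 for GBPUSD.
cable_morning_levels <- function(high, low, point = 0.0001,
                                 entry_offset = 5, stop_points = 40) {
  range_high <- max(high)  # highest high in the 05:00 to 09:00 window
  range_low  <- min(low)   # lowest low in the same window

  buy_entry  <- range_high + entry_offset * point
  sell_entry <- range_low  - entry_offset * point

  list(
    buy_entry   = buy_entry,
    buy_stop    = buy_entry - stop_points * point,   # protective stop
    buy_target  = buy_entry + stop_points * point,   # profit equal to the stop loss
    sell_entry  = sell_entry,
    sell_stop   = sell_entry + stop_points * point,
    sell_target = sell_entry - stop_points * point
  )
}

# Hypothetical hourly highs and lows for one morning window
orders <- cable_morning_levels(
  high = c(1.5120, 1.5135, 1.5128, 1.5131),
  low  = c(1.5102, 1.5110, 1.5098, 1.5115)
)
str(orders)
```

Only the first order of each pair has a fixed target; the second is left to run until the London close, so it needs no target level here.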

Package MatchIt: Balancing experimental data

2013-05-23     R

A balanced experimental design is one in which the distribution of the covariates is the same in both the control and treatment groups. However, although achievable in an experimental scenario, for observational data this ideal is seldom attained. The MatchIt package provides a means of pre-processing data so that the treated and control groups are as similar as possible, minimising the dependence between the treatment variable and the other covariates. Read more »

xkcd Style Bubble Plot

2013-05-23     R

A package was recently released to generate plots in the style of xkcd using R. Being a big fan of the cartoon, I could not resist trying it out. So I set out to produce something like one of Hans Rosling’s bubble plots. Read more »

Swing Alert Indicator


I’ve just finished coding a swing alert indicator for a client. The rules are rather straightforward and it all depends on two simple moving averages (by default with periods of 25 and 5). The indicator generates alerts via Read more »

Package party: Conditional Inference Trees

2013-05-21     R

I am going to be using the party package for one of my projects, so I spent some time today familiarising myself with it. The details of the package are described in Hothorn, T., Hornik, K., & Zeileis, A. (1999). “party: A Laboratory for Recursive Partytioning” which is available from CRAN. Read more »

Plotting categorical variables

2013-05-20     R

In the previous installment we generated a few plots using numerical data straight out of the National Health and Nutrition Examination Survey. This time we are going to incorporate some of the categorical variables into the plots. Although going from raw numerical data to categorical data bins (like we did for age and BMI) does give you less precision, it can make drawing conclusions from plots a lot easier. We will start off with a simple plot of two numerical variables: age against BMI. Read more »
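
As a rough illustration of the binning mentioned above, base R's cut() turns a numerical column into a factor. The data frame, the column names and the bin edges below are assumptions chosen for the example, not necessarily those used in the talks.

```r
# Made-up sample standing in for the survey data
people <- data.frame(
  age = c(23, 37, 45, 62, 71),
  BMI = c(17.9, 22.4, 27.3, 31.8, 24.6)
)

# Assumed bin edges; right = FALSE makes each interval closed on the left,
# so a BMI of exactly 25 would fall into the "overweight" bin.
people$age.cat <- cut(people$age, breaks = c(0, 30, 50, 70, Inf), right = FALSE)
people$BMI.cat <- cut(people$BMI,
                      breaks = c(0, 18.5, 25, 30, Inf),
                      labels = c("underweight", "normal", "overweight", "obese"),
                      right = FALSE)

# Cross-tabulate the two categorical variables
table(people$age.cat, people$BMI.cat)
```

The resulting factors can be passed straight to plotting functions such as boxplot(), which is where the loss of precision is repaid with plots that are easier to read.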

Plotting numerical variables

2013-05-18     R

In the previous installment we generated some simple descriptive statistics for the National Health and Nutrition Examination Survey data. Now we are going to move on to an area in which R really excels: making plots and visualisations. There are a variety of systems for plotting in R, but we will start off with base graphics. Read more »

Descriptive Statistics

2013-05-18     R

Categorical Variables

2013-05-12     R

In the previous installment we sucked some data from the National Health and Nutrition Examination Survey into R and did some preliminary work: selecting only the fields of interest, renaming columns and removing missing data. Now we are going to play with some categorical data. There is already one categorical field in the data representing gender. However, the labels are not ideal:

    > head(DS0012)
         id gender age mass height BMI
    1 41475      2  62 138.

Read more »
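
One way to give the numeric gender codes readable labels is to convert the column to a factor. The sketch below uses a small made-up data frame rather than the real DS0012, and it assumes the survey's usual coding of 1 for male and 2 for female; the labels "M" and "F" are my choice.

```r
# Hypothetical stand-in for a few rows of the survey data (values made up)
survey <- data.frame(
  id     = c(1, 2, 3, 4),
  gender = c(2, 1, 1, 2),
  age    = c(62, 71, 34, 49)
)

# Assumption: code 1 = male, 2 = female
survey$gender <- factor(survey$gender, levels = c(1, 2), labels = c("M", "F"))

# Count the observations in each category
table(survey$gender)
```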

Loading Data

2013-05-12     R

I have just started preparing a series of talks aimed at introducing the use of R to a rather broad audience consisting of physicists, chemists, statisticians, biologists and computer scientists (plus a few other disciplines thrown in for good measure). I want to use a single consistent set of data throughout the talks. Finding something that would resonate with such a disparate set of people was quite a challenge. After playing around with a couple of options, I settled on using data for age, height and mass. Read more »

Support & Resistance Indicator


I was recently browsing through the variety of MetaTrader indicators for support and resistance levels. None of them ticked all of my boxes. Either they were not aesthetically pleasing (making a mess of my pristine charts) or they failed to produce what I consider to be reasonable levels. So, embracing my pioneering spirit, I set out to fashion my own indicator, one which will ultimately tick all of my boxes! Read more »

Locations of Geosynchronous Satellites

2013-04-16     R

A year or so ago I went to a talk which included the diagram below. It shows the locations of the Earth’s fleet of geosynchronous satellites. According to the speaker, the information in this diagram was already quite dated: the satellites and their locations had changed. I decided to update the diagram using the locations of satellites from the list of geosynchronous satellites published on Wikipedia. Probably not the most definitive source of data on this subject, but it was a good starting point. Read more »


