
Are Green Number Runners More Likely to Bail?

2013-06-22     R running

Comrades Marathon runners are awarded a permanent green race number once they have completed 10 journeys between Durban and Pietermaritzburg. For many runners, once they have completed the race a few times, achieving a green number becomes a possibility. And once the idea takes hold, it can become something of a compulsion. I can testify to this: I am thoroughly compelled! For runners with this goal in mind, every finish is one step closer to a green number. Read more »

The Green Number Effect

2013-06-18     R running

Following up on a suggestion from my previous post, here are the statistics for medal count versus age. Each point on the plot shows how many athletes (see colour legend on right) have achieved a given number of medals by a particular age. Read more »

Age Distribution of Comrades Marathon Athletes

2013-06-18     R running

I can clearly remember watching the end of the 1989 Comrades Marathon on television and seeing Wally Hayward coming in just before the final gun, completing the epic race at the age of 80! I was in awe. Since I have been delving into the Comrades Marathon data, this got me thinking about the typical age distribution of athletes taking part. The plot below indicates the ages of athletes who finished the race, going all the way back to 1984. You can clearly spot the two years when Wally Hayward ran (1988 and 1989). My data indicates that he was only 79 on the day of the 1989 Comrades Marathon, but I am not going to quibble over a year and I am more than happy to accept that he was 80! Read more »

Kagi Chart Indicator


In addition to a range of data analysis services, Exegetic Analytics also implements algorithms for automated FOREX trading. I am currently developing an expert advisor (EA) for a client. The strategy was developed on the ProRealTime charting software using Kagi Charts. My client wants to automate the strategy and implement it in MQL on the MetaTrader platform. One snag: Kagi Charts are independent of time. Or, more accurately, they do not have a uniform time axis. Charts in MetaTrader are of the classical variety with a nice linear time axis. So my first problem was to implement something analogous to the Kagi Chart under MetaTrader. Read more »

Medal Allocations at the Comrades Marathon

2013-06-09     R running

Comrades Marathon Attrition Rate

2013-06-07     R

It is a bit of a mission to get the complete data set for this year’s Comrades Marathon. The full results are easily accessible, but come as an HTML file. Embedded in this file are links to the splits for individual athletes. So with a bit of scripting wizardry it is also possible to download the HTML files for each of the individual athletes. Parsing all of these yields the complete result set, which is the starting point for this analysis. Read more »

Analysis of Cable Morning Trade Strategy

2013-05-29     R

A couple of years ago I implemented an automated trading algorithm for a strategy called the “Cable Morning Trade”. The basis of the strategy is the range of GBPUSD during the interval 05:00 to 09:00 London time. Two buy stop orders are placed 5 points above the highest high for this period; two sell stop orders are placed 5 points below the lowest low. All orders have a protective stop at 40 points. When either the buy or sell orders are filled, the other orders are cancelled. Of the filled orders, one exits at a profit equal to the stop loss, while the other is left to run until the close of the London session. Read more »
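The order placement described above comes down to a little arithmetic on the high and low of the 05:00 to 09:00 range. The sketch below is only an illustration of that arithmetic, not the original trading code: the helper name `cable_levels` is hypothetical, and it assumes that one "point" equals 0.0001 in GBPUSD price terms (pass a third argument to use a different point size).

```shell
#!/bin/sh
# cable_levels HIGH LOW [POINT]
# Hypothetical helper: prints entry, stop-loss and take-profit prices for the
# buy and sell stop orders. POINT defaults to 0.0001 (an assumption).
cable_levels() {
    awk -v high="$1" -v low="$2" -v point="${3:-0.0001}" 'BEGIN {
        offset = 5 * point     # entries sit 5 points outside the range
        stop   = 40 * point    # protective stop distance
        buy  = high + offset
        sell = low - offset
        # The take profit (equal in size to the stop loss) applies to only one
        # order of each pair; the other runs until the London close.
        printf "buy_stop %.4f stop_loss %.4f take_profit %.4f\n", buy, buy - stop, buy + stop
        printf "sell_stop %.4f stop_loss %.4f take_profit %.4f\n", sell, sell + stop, sell - stop
    }'
}

# Example with a made-up range of 1.5400 to 1.5450.
cable_levels 1.5450 1.5400
```

Cancelling the opposite pair once one side fills, and closing the runner at the end of the London session, are left to the trading platform and are not modelled here.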

Package MatchIt: Balancing experimental data

2013-05-23     R

A balanced experimental design is one in which the distribution of the covariates is the same in both the control and treatment groups. However, although achievable in an experimental scenario, for observational data this ideal is seldom attained. The MatchIt package provides a means of pre-processing data so that the treated and control groups are as similar as possible, minimising the dependence between the treatment variable and the other covariates. Read more »

xkcd Style Bubble Plot

2013-05-23     R

A package was recently released to generate plots in the style of xkcd using R. Being a big fan of the cartoon, I could not resist trying it out. So I set out to produce something like one of Hans Rosling’s bubble plots. Read more »

Swing Alert Indicator


I’ve just finished coding a swing alert indicator for a client. The rules are rather straightforward and it all depends on two simple moving averages (by default with periods of 25 and 5). The indicator generates alerts via Read more »

Package party: Conditional Inference Trees

2013-05-21     R

I am going to be using the party package for one of my projects, so I spent some time today familiarising myself with it. The details of the package are described in Hothorn, T., Hornik, K., & Zeileis, A. (1999). “party: A Laboratory for Recursive Partytioning” which is available from CRAN. Read more »

Plotting categorical variables

2013-05-20     R

In the previous installment we generated a few plots using numerical data straight out of the National Health and Nutrition Examination Survey. This time we are going to incorporate some of the categorical variables into the plots. Although going from raw numerical data to categorical data bins (like we did for age and BMI) does give you less precision, it can make drawing conclusions from plots a lot easier. We will start off with a simple plot of two numerical variables: age against BMI. Read more »

Plotting numerical variables

2013-05-18     R

In the previous installment we generated some simple descriptive statistics for the National Health and Nutrition Examination Survey data. Now we are going to move on to an area in which R really excels: making plots and visualisations. There are a variety of systems for plotting in R, but we will start off with base graphics. Read more »

Descriptive Statistics

2013-05-18     R

Categorical Variables

2013-05-12     R

In the previous installment we sucked some data from the National Health and Nutrition Examination Survey into R and did some preliminary work: selecting only the fields of interest, renaming columns and removing missing data. Now we are going to play with some categorical data. There is already one categorical field in the data representing gender. However, the labels are not ideal:

> head(DS0012)
     id gender age mass height BMI
1 41475      2  62 138.

Read more »

Loading Data

2013-05-12     R

I have just started preparing a series of talks aimed at introducing the use of R to a rather broad audience consisting of physicists, chemists, statisticians, biologists and computer scientists (plus a few other disciplines thrown in for good measure). I want to use a single consistent set of data throughout the talks. Finding something that would resonate with such a disparate set of people was quite a challenge. After playing around with a couple of options, I settled on using data for age, height and mass. Read more »

Support & Resistance Indicator


I was recently browsing through the variety of MetaTrader indicators for support and resistance levels. None of them ticked all of my boxes. Either they were not aesthetically pleasing (making a mess of my pristine charts) or they failed to produce what I consider to be reasonable levels. So, embracing my pioneering spirit, I set out to fashion my own indicator, one which will ultimately tick all of my boxes! Read more »

Locations of Geosynchronous Satellites

2013-04-16     R

A year or so ago I went to a talk which included the diagram below. It shows the locations of the Earth’s fleet of geosynchronous satellites. According to the speaker, the information in this diagram was already quite dated: the satellites and their locations had changed. I decided to update the diagram using the locations of satellites from the list of geosynchronous satellites published on Wikipedia. Probably not the most definitive source of data on this subject, but it was a good starting point. Read more »


Chapter 2 in “Digital Watermarking” in Mendeley.



Here’s an example. Make something simpler for illustration.

movement_models <- function(method = c("ets", "arima")) {
  # Build a closure which has all of the data.
  #
  models <- subscribers_long %>%
    group_by(offering, feature) %>%
    nest() %>%
    mutate(
      mod = map(data, function(df) {
        count_series = zoo(df$count, df$date)
        count_series = ts(
          coredata(count_series),
          start = c(lubridate::year(start(count_series)), lubridate::month(start(count_series))),
          end = c(lubridate::year(end(count_series)), lubridate::month(end(count_series))),
          frequency = 12
        )
        forecast::stlm(count_series, s.window = "periodic", robust = FALSE, method = method)
      })
    )

Read more »


Terminal

screen
tmux
tree

Text

split

{% highlight bash %}
$ split --lines=50 foo.txt
{% endhighlight %}
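To see what that `split` option does in practice, here is a small sketch that can be run safely in a scratch directory. The file name `foo.txt` comes from the note above. The 120-line test file, the `part-` prefix and the temporary directory are assumptions made purely for the demonstration.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Build a 120-line input file, one number per line.
seq 1 120 > foo.txt

# Split into pieces of at most 50 lines each: part-aa, part-ab, part-ac.
split --lines=50 foo.txt part-

# Count the pieces and the lines in the final (short) piece.
pieces=$(ls part-* | wc -l)
last_lines=$(wc -l < part-ac)
total_lines=$(cat part-* | wc -l)

echo "pieces: $pieces, lines in last piece: $last_lines, total lines: $total_lines"
```

Without an explicit prefix, `split` names the pieces `xaa`, `xab` and so on, which is easy to lose track of in a busy directory.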


Metapixel

{% highlight bash %}
$ sudo apt-get install metapixel
{% endhighlight %}

Pixelize

{% highlight bash %}
$ sudo apt-get install pixelize
{% endhighlight %}

THIS HAS A USEFUL SECTION ON ENCODINGS:

Identifying Encoding

Using file you can get an idea of a file’s encoding. Here are some possibilities:

ASCII text
UTF-8 Unicode text
UTF-8 Unicode (with BOM) text
Non-ISO extended-ASCII text

Simple Approach

A reasonable initial approach is simply guessing.

{% highlight bash %}
$ iconv -f latin1 -t UTF-8 input-file.txt
{% endhighlight %}

ENCA

enca is a useful tool for detecting a file’s encoding. Read more »
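The guessing approach can be made slightly more careful by first checking whether the file already decodes as UTF-8, and only falling back to the Latin-1 guess when it does not. The sketch below is one way to do that with `iconv` alone. The helper name `to_utf8` and the demo file are hypothetical and not part of the original notes.

```shell
#!/bin/sh
# to_utf8 FILE: write FILE to standard output as UTF-8 (hypothetical helper).
to_utf8() {
    # If the bytes already decode cleanly as UTF-8, pass them through unchanged.
    if iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1; then
        cat "$1"
    else
        # Otherwise fall back to the guess used above: treat the input as Latin-1.
        iconv -f latin1 -t UTF-8 "$1"
    fi
}

# Demo: byte 0xE9 (octal 351) is "é" in Latin-1 but is not valid UTF-8 on its own.
demo=$(mktemp)
printf 'caf\351\n' > "$demo"
to_utf8 "$demo"
```

Bear in mind that every byte sequence is valid Latin-1, so the fallback never reports an error. If the file is really in some other legacy encoding the output will be wrong without any warning, which is where a detection tool earns its keep.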

Pushing to the Cloud

Removing Containers

Removing Images

Use smaller base images (check out Alpine Linux!!)


Life Cycle

docker create
docker rename
docker run
docker rm
docker update

Misc

docker cp
docker export
docker exec

Start and Stop

docker start
docker stop
docker restart
docker pause
docker unpause
docker wait
docker kill
docker attach

Info

docker ps
docker logs
docker inspect
docker events
docker port
docker top
docker stats
docker diff


Protip for @github that not too many people seem to know about: when viewing a diff, append `?w=1` to the URL to ignore all whitespace diffs


DISK ACTIVITY

sudo iotop --only

FINDING CONTENTS OF PACKAGE:

apt-file list

FINDING WHAT PACKAGE A FILE IS FROM:

$ locate GL/glu.h
/usr/include/GL/glu.h
$ dpkg -S /usr/include/GL/glu.h
libglu1-mesa-dev:amd64: /usr/include/GL/glu.h

RESIZING WINDOW

wmctrl -l
wmctrl -i -r 0x032000c5 -e 0,0,0,1100,750

PACKAGES

List available packages

apt-cache pkgnames

See

SHELL

Keeping Track of Processes

top
htop
watch

watch -n 5 free


Download the archive here.

Processing 3

{% highlight bash %}
$ sudo tar -zxf processing-3.3.6-linux64.tgz -C /opt
{% endhighlight %}

{% highlight bash %}
$ sudo ln -s /opt/processing-3.3.6/processing /usr/local/bin/processing3
{% endhighlight %}

Processing 2

{% highlight bash %}
$ sudo tar -zxf processing-2.2.1-linux64.tgz -C /opt
{% endhighlight %}

{% highlight bash %}
$ sudo ln -s /opt/processing-2.2.1/processing /usr/local/bin/processing2
{% endhighlight %}

Adding Modes

Press the dropdown in the top right corner. Read more »
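The unpack-then-symlink pattern used for both Processing versions can be rehearsed without `sudo` by pointing it at a scratch directory. The sketch below does that with a stand-in archive rather than the real download. The helper name `install_archive`, the scratch layout and the fake launcher script are all assumptions made for illustration.

```shell
#!/bin/sh
set -eu

# install_archive ARCHIVE DEST_DIR LINK_TARGET LINK_PATH  (hypothetical helper)
# Unpacks ARCHIVE under DEST_DIR, then points LINK_PATH at DEST_DIR/LINK_TARGET.
install_archive() {
    mkdir -p "$2" "$(dirname "$4")"
    tar -zxf "$1" -C "$2"
    # -f replaces an existing link, -n avoids descending into a linked directory.
    ln -sfn "$2/$3" "$4"
}

# Build a stand-in archive with the same layout as the real tarball.
scratch=$(mktemp -d)
mkdir -p "$scratch/src/processing-3.3.6"
printf '#!/bin/sh\necho "stand-in launcher"\n' > "$scratch/src/processing-3.3.6/processing"
chmod +x "$scratch/src/processing-3.3.6/processing"
tar -zcf "$scratch/processing-3.3.6-linux64.tgz" -C "$scratch/src" processing-3.3.6

# Rehearse the install: scratch/opt stands in for /opt, scratch/bin for /usr/local/bin.
install_archive "$scratch/processing-3.3.6-linux64.tgz" "$scratch/opt" \
    processing-3.3.6/processing "$scratch/bin/processing3"

# Running the link executes the launcher inside the unpacked directory.
"$scratch/bin/processing3"
```

Giving each version its own link name, as the commands above do with `processing3` and `processing2`, lets both installs sit side by side under the same directory.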