Neha Narula: The future of money

2016-09-20     Blockchain TED Talk

View POST Data using Chrome Developer Tools

2016-09-19     Web Scraping

When figuring out how to formulate the contents of a POST request it’s often useful to see the “typical” fields submitted directly from a web form. Open Developer Tools in Chrome. Select the Network tab (at the top). Submit the form. Watch the magic happening in the Developer Tools console. Click on the first document listed in the Developer Tools console, then select the `Headers` tab. That’s just scratching the surface of the wealth of information available on the Network tab. Read more »
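
Once the fields are known, the same request can be rebuilt in code. The sketch below is only an illustration: the field names, values and URL are hypothetical placeholders, not taken from any real form, and the request is constructed but never sent.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request

# Hypothetical form fields, standing in for whatever Chrome shows for your form.
form_fields = {"username": "alice", "query": "graph databases", "page": "1"}

# Encode as application/x-www-form-urlencoded, the default encoding for HTML forms.
body = urlencode(form_fields).encode("ascii")

# Build (but do not send) the POST request; the URL is a placeholder.
request = Request(
    "https://example.com/search",
    data=body,
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)

print(request.get_method())
print(body.decode("ascii"))
```

Comparing the printed body with what the browser submitted is a quick way to spot a missing or misnamed field.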

Deleting All Nodes and Relationships

2016-09-15     Neo4j

Seems that I am doing this a lot: deleting my entire graph (all nodes and relationships) and rebuilding from scratch. I guess that this is part of the learning process. Route 1: Delete Relationships then Nodes. A relationship is constrained to join a start node to an end node. Every relationship must be associated with at least one node (a relationship may begin and end on the same node). No such constraint exists for nodes. Read more »
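
A minimal sketch of Route 1, assuming the two standard Cypher statements below are what is wanted. The helper name is hypothetical and the statements are only printed, not run against a database.

```python
# Route 1: relationships must go first, because a relationship cannot
# exist without its start and end nodes.
DELETE_RELATIONSHIPS = "MATCH ()-[r]->() DELETE r"
DELETE_NODES = "MATCH (n) DELETE n"


def route_one_statements():
    """Return the Cypher statements in the order they need to run."""
    return [DELETE_RELATIONSHIPS, DELETE_NODES]


for statement in route_one_statements():
    print(statement + ";")
```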

Running Cypher Queries from File on Windows

2016-09-14     Neo4j

Recent packages of Neo4j for Windows do not include neo4j-shell. The Neo4j browser will only accept one statement at a time, making scripts consisting of multiple Cypher commands a problem. Read more »
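
One possible workaround is to split the script into single statements before submitting them one at a time. The sketch below is a naive illustration, not the approach from the full post: it assumes statements end with `;` and that no semicolon appears inside a string literal or comment. The function name and sample script are made up.

```python
def split_statements(script):
    """Split a Cypher script into statements on ';' (naive: ignores quoting)."""
    statements = []
    for chunk in script.split(";"):
        # Drop blank lines and whole-line // comments.
        lines = [
            line
            for line in chunk.splitlines()
            if line.strip() and not line.strip().startswith("//")
        ]
        if lines:
            statements.append("\n".join(lines).strip())
    return statements


script = """
// Rebuild a tiny graph.
CREATE (a:Person {name: 'Ann'});
CREATE (b:Person {name: 'Bob'});
MATCH (a:Person {name: 'Ann'}), (b:Person {name: 'Bob'})
CREATE (a)-[:KNOWS]->(b);
"""

for number, statement in enumerate(split_statements(script), start=1):
    print(number, statement.replace("\n", " "))
```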

Remote Access to Neo4j on Windows

2016-09-13     Neo4j

Installing Neo4j on Ubuntu 16.04

2016-09-06     Neo4j Linux

Some instructions for installing Neo4j on Ubuntu 16.04. More for my own benefit than anything else. Installing Java: Neo4j is implemented in Java, so you’ll need to have the Java Runtime Environment (JRE) installed. If you already have this up and running, go ahead and skip this step. Otherwise run `sudo apt install default-jre default-jre-headless`. Check whether you can now run the java executable with `java`. If that works for you, great! It didn’t immediately work on one of my machines. Read more »

James Veitch: When you reply to spam email

2016-09-05     TED Talk

PLOS Subject Keywords: Association Rules

2016-09-01     R Association Rules

In a previous post I detailed the process of compiling data on subject keywords used in articles published in PLOS journals. In this instalment I’ll be using those data to mine Association Rules with the arules package. Good references on the topic of Association Rules are Section 14.2 of The Elements of Statistical Learning (2009) by Hastie, Tibshirani and Friedman; and Introduction to arules by Hahsler, Grün, Hornik and Buchta. Read more »
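
For readers new to the terminology, here is a rough sketch of the three standard rule metrics (support, confidence and lift). The transactions are made up and are not the PLOS data. arules also searches for frequent itemsets, which this sketch does not attempt.

```python
# Toy transactions: each set holds the subject keywords of one made-up article.
transactions = [
    {"genetics", "evolution"},
    {"genetics", "evolution", "ecology"},
    {"genetics", "medicine"},
    {"ecology", "evolution"},
    {"medicine"},
]


def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(lhs, rhs):
    """Estimated P(rhs | lhs) for the rule lhs => rhs."""
    return support(set(lhs) | set(rhs)) / support(lhs)


def lift(lhs, rhs):
    """Confidence relative to the baseline frequency of rhs."""
    return confidence(lhs, rhs) / support(rhs)


rule = ({"genetics"}, {"evolution"})
print("support    %.3f" % support(rule[0] | rule[1]))
print("confidence %.3f" % confidence(*rule))
print("lift       %.3f" % lift(*rule))
```

A lift above 1 suggests the two keywords occur together more often than they would if they were independent.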

ubeR: A Package for the Uber API

2016-08-31     R

Uber exposes an extensive API for interacting with their service. ubeR is an R package for working with that API which Arthur Wu and I put together during a Hackathon at iXperience. Installation: The package is currently hosted on GitHub. Installation is simple using the devtools package: run `devtools::install_github("DataWookie/ubeR")` followed by `library(ubeR)`. Authentication: To work with the API you’ll need to create a new application for the Rides API. Set Redirect URL to http://localhost:1410/. Read more »

Talks about the Blockchain

2016-08-29     Blockchain TED Talk

Finally educating myself about the blockchain. These videos are a good place to start: Don Tapscott, "How the blockchain is changing money and business", and Bettina Warburg, "How the blockchain will radically transform the economy".

PLOS Subject Keywords: Gathering Data

2016-08-24     R Association Rules Collaborative Filtering

I’m putting together a couple of articles on Collaborative Filtering and Association Rules. Naturally, the first step is finding suitable data for illustrative purposes. Read more »

Sportsbook Betting (Part 3): Evolving Odds

2016-08-23     R Gambling

In previous instalments in this series I have not taken into account how odds can change over time. Read more »

Garmin ANT on Ubuntu

2016-08-22     Linux

I finally got tired of booting up Windows to download data from my Garmin 910XT. I tried to get my old Ubuntu 15.04 system to recognise my ANT stick but failed. Now that I have a stable Ubuntu 16.04 system the time seems ripe. openant: Install openant, a Python library for downloading and uploading files from ANT-FS compliant devices. Download the zip file from Unpack the archive and install using $ sudo python setup. Read more »

Anthony Goldbloom: The jobs we'll lose to machines

2016-08-22     Machine Learning TED Talk

Sportsbook Betting (Part 2): Bookmakers' Odds

2016-08-10     R Gambling

In the first instalment of this series we gained an understanding of the various types of odds used in Sportsbook betting and the link between those odds and implied probabilities. We noted that the implied probabilities for all possible outcomes in an event may sum to more than 100%. At first sight this seems a bit odd. It certainly appears to violate the basic principles of statistics. However, this anomaly is the mechanism by which bookmakers assure their profits. Read more »
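
A small numerical sketch of that anomaly, assuming decimal odds and made-up prices for a three-way market. The proportional rescaling at the end is just one simple convention for recovering "fair" probabilities, not necessarily the one used in the series.

```python
# Made-up decimal odds for a three-way market.
decimal_odds = {"home": 2.10, "draw": 3.40, "away": 3.60}


def fractional_to_decimal(numerator, denominator):
    """Convert fractional odds such as 5/2 to decimal odds (stake included)."""
    return 1 + numerator / denominator


# The implied probability is the reciprocal of the decimal odds.
implied = {outcome: 1 / odds for outcome, odds in decimal_odds.items()}

total = sum(implied.values())
overround = total - 1  # the excess over 100% is the bookmaker's margin

# Rescale so that the probabilities sum to one.
fair = {outcome: p / total for outcome, p in implied.items()}

print("sum of implied probabilities: %.4f" % total)
print("overround: %.2f%%" % (100 * overround))
```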

Animated Mortality

2016-08-09     R

feedeR: Reading RSS and Atom Feeds from R

2016-08-08     R

I’m working on a project in which I need to systematically parse a number of RSS and Atom feeds from within R. I was somewhat surprised to find that no package currently exists on CRAN to handle this task. So this presented the opportunity for a bit of DIY. You can find the fruits of my morning’s labour here. Read more »
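
To give a feel for what parsing a feed involves, here is a minimal sketch in Python rather than R. It is not how feedeR works internally: it handles only a hand-written RSS 2.0 sample, and Atom feeds (which use XML namespaces) would need extra handling. The sample feed and function name are made up.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document standing in for a real feed.
RSS_SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First post</title><link>https://example.com/1</link></item>
    <item><title>Second post</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""


def parse_rss(xml_text):
    """Return the feed title and a list of items with their title and link."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    return {
        "title": channel.findtext("title"),
        "items": [
            {"title": item.findtext("title"), "link": item.findtext("link")}
            for item in channel.findall("item")
        ],
    }


feed = parse_rss(RSS_SAMPLE)
print(feed["title"])
for item in feed["items"]:
    print(item["title"], item["link"])
```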

Web Scraping and "invalid multibyte string"

2016-08-02     R Web Scraping

A couple of my collaborators have had trouble using read_html() from the xml2 package to access this Wikipedia page. Specifically they have been getting errors like this: Read more »

John Green: The Nerd's Guide to Learning Everything Online

2016-08-02     TED Talk

99% of my learning in the last decade has happened online, so this resonates with me.

Sportsbook Betting (Part 1): Odds

2016-08-01     R Gambling

This series of articles was written as support material for Statistics exercises in a course that I’m teaching for iXperience. In the series I’ll be using illustrative examples for wagering on a variety of Sportsbook events including Horse Racing, Rugby and Tennis. The same principles can be applied across essentially all betting markets. Read more »

Arthur Benjamin: Teach statistics before calculus!

2016-07-29     TED Talk Teaching

Arthur Benjamin thinks that the end goal of teaching Mathematics at school should be Statistics rather than Calculus. He has a point: in terms of understanding things in the real world, Statistics is definitely more powerful. These ideas are quite compatible with those of Conrad Wolfram, who thinks that we should be using computers more extensively in Mathematics education. Read more »

Building a Life Table

2016-07-28     R

Calculating Pi using Buffon's Needle

2016-07-26     R
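
A minimal sketch of the Buffon's Needle idea, written in Python and not taken from the post itself. It assumes needles of length 1 dropped on lines spaced 1 apart, so the chance of crossing a line is 2/π. The needle direction is drawn by rejection sampling in the unit disk so that π is not used to estimate π.

```python
import math
import random


def estimate_pi(n_needles, seed=1):
    """Estimate pi by dropping unit-length needles on lines spaced 1 apart."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_needles):
        # Distance from the needle's centre to the nearest line.
        centre = rng.uniform(0.0, 0.5)
        # Random direction: pick a point uniformly inside the unit disk.
        while True:
            x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
            r2 = x * x + y * y
            if 0 < r2 <= 1:
                break
        sin_theta = abs(y) / math.sqrt(r2)
        # The needle crosses a line if its half-length reaches it.
        if centre <= 0.5 * sin_theta:
            hits += 1
    # P(cross) = 2 / pi, so pi is approximately 2 * n / hits.
    return 2 * n_needles / hits


print(estimate_pi(100000))
```

Convergence is slow: the error shrinks roughly with the square root of the number of needles.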

Conrad Wolfram: Teaching kids real math with computers

2016-07-25     TED Talk Teaching

Conrad Wolfram gives a thought-provoking talk on a different way to teach Mathematics in schools. Read more »

Mortality by Year and Age

2016-07-22     R

Taking another look at the data from the lifespan package. The plot below shows the evolution of mortality in the US as a function of year and age. Read more »

Life Expectancy by Country

2016-07-20     R

I was rather inspired by this plot on Wikipedia’s List of Countries by Life Expectancy. Read more »

Mortality Rate by Age

2016-07-19     R

Working further with the mortality data from, I’ve added a breakdown of deaths by age and gender to the lifespan package on GitHub. Read more »

Escalating Life Expectancy

2016-07-18     R

I’ve added mortality data to the lifespan package. A result that immediately emerges from these data is that average life expectancy is steadily climbing. Read more »

Birth Month by Gender

2016-07-16     R

Based on some feedback to a previous post I normalised the birth counts by the (average) number of days in each month. As pointed out by a reader, the results indicate a gradual increase in the number of conceptions during (northern hemisphere) Autumn and Winter, roughly up to the end of December. Normalising the data to give births per day also shifts the peak from August to September. Read more »
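
The normalisation can be sketched as follows. The monthly counts are invented purely to illustrate how dividing by month length can move the peak; they are not the data from the post. February is taken as 28.25 days on average.

```python
# Average number of days in each month (February averaged over leap years).
DAYS_IN_MONTH = {
    "Jan": 31, "Feb": 28.25, "Mar": 31, "Apr": 30, "May": 31, "Jun": 30,
    "Jul": 31, "Aug": 31, "Sep": 30, "Oct": 31, "Nov": 30, "Dec": 31,
}

# Made-up monthly birth counts, for illustration only.
births = {
    "Jan": 8680, "Feb": 7910, "Mar": 8680, "Apr": 8400, "May": 8835, "Jun": 8700,
    "Jul": 9145, "Aug": 9300, "Sep": 9150, "Oct": 9145, "Nov": 8700, "Dec": 8835,
}

# Births per day removes the advantage that 31-day months have in raw counts.
births_per_day = {month: births[month] / DAYS_IN_MONTH[month] for month in births}

raw_peak = max(births, key=births.get)
daily_peak = max(births_per_day, key=births_per_day.get)

print("peak by raw count:", raw_peak)
print("peak by births per day:", daily_peak)
```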

Most Probable Birth Month

2016-07-15     R

In a previous post I showed that the data from support Malcolm Gladwell’s contention that more professional baseball players are born in August than any other month. Although this might be explained by the 31 July cutoff for admission to baseball leagues, it was suggested that it could also be linked to a larger proportion of babies being born in August. Read more »