Machine Learning

Classification: Get the Balance Right

For classification problems, the positive class (usually the one you’re trying to predict) is often sparsely represented in the data. Unless you do something to address this imbalance, your classifier is likely to be rather underwhelming. Yet achieving a reasonable balance in the proportions of the target classes is seldom emphasised. Perhaps it’s not very sexy. But it can have a massive effect on a model.
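One simple way to address the imbalance is random oversampling: duplicate rows from the sparse class until the classes are the same size. The sketch below is a minimal illustration in plain Python, not the approach from the post itself. The function name `oversample_minority` and the toy data are made up for the example.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, l in zip(rows, labels) if l == cls]
        # Draw (with replacement) enough extra rows to reach the target size.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy data: four negatives and a single positive.
rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = [0, 0, 0, 0, 1]

bal_rows, bal_labels = oversample_minority(rows, labels)
print(Counter(bal_labels))
```

Resampling like this should only be applied to the training split. The test data should keep the original class proportions so that the evaluation reflects what the model will see in practice.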

Clustering Time Series Data

I have been looking at methods for clustering time domain data and recently read TSclust: An R Package for Time Series Clustering by Pablo Montero and José Vilar. Here are the results of my initial experiments with the TSclust package.
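The general recipe for clustering time series is to compute a dissimilarity between every pair of series and then hand those dissimilarities to a standard clustering algorithm. The sketch below illustrates that recipe in plain Python rather than R. It assumes a correlation-based dissimilarity and single-linkage agglomerative clustering; the function names and toy series are invented for the example and are not TSclust's API.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cor_dissimilarity(x, y):
    """Correlation-based dissimilarity: 0 for perfectly correlated
    series, 2 for perfectly anti-correlated ones."""
    return math.sqrt(max(0.0, 2.0 * (1.0 - pearson(x, y))))


def single_linkage(series, k):
    """Merge the two closest clusters repeatedly until k remain."""
    n = len(series)
    d = {(i, j): cor_dissimilarity(series[i], series[j])
         for i in range(n) for j in range(i + 1, n)}

    def dist(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(d[(min(i, j), max(i, j))] for i in a for j in b)

    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        pairs = [(p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        p, q = min(pairs, key=lambda pq: dist(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Toy data: two rising and two falling series.
t = range(20)
series = [
    [i + math.sin(i) for i in t],
    [2 * i + 3 for i in t],
    [-i for i in t],
    [10 - 0.5 * i + math.sin(i) for i in t],
]

clusters = single_linkage(series, k=2)
print(clusters)
```

Because the dissimilarity depends only on correlation, series with the same shape end up together regardless of their scale or offset. A different measure, such as Euclidean distance on the raw values, could group the same series quite differently.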

Zeynep Tufekci: Machine intelligence and human morals

fast-neural-style: Real-Time Style Transfer

I followed up a reference to fast-neural-style from Twitter and spent a glorious hour experimenting with this code. Very cool stuff indeed. It’s documented in Perceptual Losses for Real-Time Style Transfer and Super-Resolution by Justin Johnson, Alexandre Alahi and Fei-Fei Li. The basic idea is to use feed-forward convolutional neural networks to generate image transformations. The networks are trained using perceptual loss functions and effectively apply style transfer. What is “style transfer”?

Talks about Bots

Seth Juarez and Matt Winkler have an informal chat about bots, and Matt Winkler talks about Bots as the Next UX: Expanding Your Apps with Conversation at the Microsoft Machine Learning & Data Science Summit (2016). At the confluence of the rise in messaging applications, advances in text and language processing, and mobile form factors, bots are emerging as a key area of innovation and excitement. Bots (or conversation agents) are rapidly becoming an integral part of your digital experience: they are as vital a way for people to interact with a service or application as is a web site or a mobile experience.

Anthony Goldbloom: The jobs we'll lose to machines

The Next Rembrandt

Creating The Next Rembrandt: using data to touch the human soul. How a team from ING, Microsoft, TU Delft, Mauritshuis and Rembrandthuis used technology to synthesise a painting in the style of the Dutch master, Rembrandt, almost 350 years after his death.

Review: Data Mining with Rattle and R

Review: Machine Learning with R Cookbook

“Machine Learning with R Cookbook” by Chiu Yu-Wei is nothing more or less than it purports to be: a collection of 110 recipes for applying Data Analysis and Machine Learning techniques in R. I was asked by the publishers to review this book and found it to be an interesting and informative read. It will not help you understand how Machine Learning works (that’s not the goal!) but it will help you quickly learn how to apply Machine Learning techniques to your own problems.