Data Science Weekly - Issue 164
Issue #164 Jan 12 2017
Editor Picks
TensorKart: self-driving MarioKart with TensorFlow
This winter break, I decided to try and finish a project I started a few years ago: training an artificial neural network to play MarioKart 64. It had been a few years since I’d done any serious machine learning, and I wanted to try out some of the new hotness (aka TensorFlow) I’d been hearing about. The timing was right...
Machine-Learning Algorithm Identifies Tweets Sent Under the Influence of Alcohol
An analysis of tweeting-while-drinking reveals patterns of alcohol-related behavior in unprecedented detail...
Five 2016 Trends We Expect to Come to Fruition in 2017
The start of a new year is an excellent occasion for audacious extrapolation. Based on 2016 developments, what do we expect for 2017? This blog post covers five prominent trends: Deep Learning Beyond Cats, Chat Bots - Take Two, All the News In The World - Turning Text Into Action, The Proliferation of Data Roles, and What Are You Doing to My Data?...
A Message from this week's Sponsor: Yhat
Rodeo: A Python IDE for Data Science
Doing data science on a Windows machine? Rodeo IDE ships with Python (Miniconda) included, so you don't have to go through the painful installation!
Also available for Mac & Linux.
Data Science Articles & Videos
Poker Is the Latest Game to Fold Against Artificial Intelligence
Two research groups have developed poker-playing AI programs that show how computers can out-hustle the best humans...
Analyzing Emotions using Facial Expressions in Video with Microsoft AI and R
The Emotion API uses a deep convolutional neural network model that has been trained on a number of images pre-labeled with universal expressions. We thought this was super cool and wanted to give it a try for ourselves. The original post used Python in part, but we couldn’t see any reason why we couldn’t do it all in R, so one of our team members, Yosuke, quickly took the original code and translated it all into R...
Was 2016 especially dangerous for celebrities? An empirical analysis.
It’s become cliché that unusually many prominent people died in 2016. Is this true? To answer this we need to know: (The easy part) What is unusually many? (The hard part) What is a celebrity?...
My Experience as a Freelance Data Scientist
Every so often, data scientists who are thinking about going off on their own will email me with questions about my year of freelancing (2015). In my most recent response, I was a little more detailed than usual, so I figured it'd make sense as a blog post too...
Is Google Hyping it? Why Deep Learning cannot be Applied to Natural Languages Easily
Neural networks (NNs), recently referred to as deep learning, only work "effectively" with data that is produced by a continuous function. My article should actually stop here with one sentence. However, there is so much hype, sadly, keeping the entire AI industry busy, not to mention some announcements from big players like Google and IBM. Not knowing exactly what they are doing forces us to give them the benefit of the doubt for now. Nevertheless, NNs are not a natural fit for natural languages and knowledge representation, as I explain below in layman's terms...
king - man + woman is queen; but why?
word2vec is an algorithm that transforms words into vectors, so that words with similar meaning end up lying close to each other. Moreover, it allows us to use vector arithmetic to work with analogies, for example the famous king - man + woman = queen. I will try to explain how it works, with special emphasis on the meaning of vector differences, at the same time omitting as many technicalities as possible...
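To make the analogy arithmetic concrete, here is a minimal sketch in Python. The two-dimensional vectors below are invented for illustration (roughly "royalty" and "gender" axes) and the `analogy` helper is hypothetical; real word2vec embeddings are learned from text and typically have hundreds of dimensions.

```python
import math

# Toy 2-D "embeddings" made up for this example: (royalty, gender).
# They are NOT real word2vec output.
vectors = {
    "king": (0.9, 0.9),
    "queen": (0.9, -0.9),
    "man": (0.1, 0.9),
    "woman": (0.1, -0.9),
    "apple": (-0.8, 0.0),
}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = tuple(
        x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])
    )
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


result = analogy("king", "man", "woman")
print(result)
```

Excluding the three input words from the candidates is a common convention in analogy lookups; without it, the nearest neighbour of the result is often one of the inputs themselves.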
An economics analogy for why adversarial examples work
One of the most interesting results from “Explaining and Harnessing Adversarial Examples” is the idea that adversarial examples for a machine learning model do not arise because of the supposed complexity or nonlinearity of the model, but rather because of high dimensionality of the input space. I want to take a stab at explaining “Explaining”’s result with an economics analogy. Take it with a grain of salt, since I have little to no formal training in either machine learning or economics. Let’s go!...
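The dimensionality point can be illustrated with a linear score w·x. If every input coordinate is nudged by a tiny amount eps in the direction of the sign of its weight, the score shifts by eps times the sum of the absolute weights, which grows with the number of dimensions. The sketch below assumes hypothetical weights of magnitude 1 and a made-up `score_shift` helper; it illustrates the idea only and is not code from the paper or the linked post.

```python
def score_shift(weights, eps):
    """Change in w.x when each coordinate moves by eps in the direction sign(w_i)."""
    perturbation = [eps if w >= 0 else -eps for w in weights]
    return sum(w * d for w, d in zip(weights, perturbation))


eps = 0.01  # per-coordinate change, small enough to look negligible
shifts = {}
for n in (10, 1000, 100000):
    # Hypothetical weights: alternating +1 and -1.
    weights = [1.0 if i % 2 == 0 else -1.0 for i in range(n)]
    shifts[n] = score_shift(weights, eps)

for n, shift in shifts.items():
    print(n, round(shift, 6))
```

The per-coordinate change stays fixed at 0.01, yet the total shift in the score scales linearly with the input dimension.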
How to do an NLG Evaluation: Human Ratings in Artificial Context
The quickest, cheapest, and most common type of human NLG evaluation is to ask human subjects to rate NLG texts in an artificial context (i.e., not while actually using the texts in a real-world setting). I give advice here on how to conduct such a study...
Jobs
Data Scientist - Airtime - New York
We are pioneering a new social experience, designed for togetherness. It’s an intimate space for people to share conversations and content in real time. A place for us to truly be together. This is Airtime.
Our company was founded a few years ago by Sean Parker and Shawn Fanning and is backed by Kleiner Perkins, Andreessen Horowitz, Google Ventures, Founders Fund, and a host of other amazing partners. Airtime is built on some amazing new technology crafted by a world-class team of brainiacs in Palo Alto and New York City.
We're well-funded, running at full sprint, and looking for extraordinary people to join us on this exciting adventure!...
Training & Resources
A comprehensive introduction to data wrangling
You may have heard the term data wrangling before. This example-filled guide will help you understand what exactly it is, and how you can start doing some data wrangling yourself, with plenty of code examples for you to follow along...
What every Python project should have
In this article I am going to provide a short list of items every Python project should have in order to be accessible and maintainable...
Rules of Machine Learning: Best Practices for ML Engineering
This document is intended to help those with a basic knowledge of machine learning get the benefit of best practices in machine learning from around Google...
Books
Weapons of Math Destruction "A former Wall Street quant sounds an alarm on the mathematical models that pervade modern life — and threaten to rip apart our social fabric"...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :) - All the best, Hannah & Sebastian