Data Science Weekly - Issue 97
Issue #97, October 1, 2015
Editor Picks
The Internet Knows If You’ll Be Dead
These authors used three years of electronic health record data to derive a predictive Bayesian network for patient status. Its scope: home, hospitalized, or dead. There are many simple models for predicting such things, but this one is interesting because it attempts to utilize multiple patient features, vital signs, and laboratory results in a continuously updating algorithm. Ultimately, their model was capable of predicting outcomes up through one week from the initial hospitalization event...
Google DeepMind Artificial Intelligence can beat Humans at 31 video games but can't master Pac-Man
Google-owned artificial intelligence start-up DeepMind has revealed that its deep learning software is now able to outperform humans in 31 different video games. The algorithm, which uses reinforcement learning to master the games, has been described as the "first significant rung of the ladder" towards proving such a system can work, and a significant step towards use in real-world applications...
Classifying Steps with Machine Learning at Jawbone
When we first began to explore the idea of building a step classifier, we knew we would be constrained to a very limited population of individuals (Jawbone employees) available to us for early development and testing...
A Message from this week's Sponsor: HipChat
Bring Your Team To Life
Make HipChat Your Collaboration Command Center. Group Chat, Video Chat & Screensharing. $0/Unlimited Users. Get Started >>
Data Science Articles & Videos
Hitachi Says it can Predict Crimes Before They Happen
Not quite Minority Report, but monitoring everything from the weather to Twitter may be able to detect where and when crime will occur...
The fourth generation of machine learning: Adaptive learning
The fourth generation of machine intelligence, adaptive learning, creates the first truly integrated human and machine learning environment...
Liberty Mutual Property Inspection, Winner's Interview: Qingchen Wang
The hugely popular Liberty Mutual Group: Property Inspection Prediction competition wrapped up on August 28, 2015 with Qingchen Wang at the top of a crowded leaderboard. A total of 2,362 players on 2,236 teams competed to predict how many hazards a property inspector would count during a home inspection...
How machine learning startups are defining the fintech market landscape
There is something extremely intriguing about the way startups are disrupting the fintech space. The expectation is that a new unicorn might emerge from anywhere, at any time...
Computer scientist receives $625K for work that helps find human traffickers
Christopher Re, a computer scientist at Stanford University, is taking big data to a whole new level. He's building powerful data-processing programs that are open for anyone to use for anything, from tracking down human traffickers to analyzing genes...
Optimizing RNN performance
This is part I of a multi-part series detailing some of the techniques we've used here at Baidu's Silicon Valley AI Lab to accelerate the training of recurrent neural networks. This part focuses on GEMM performance...
Statistics Without the Agonizing Pain
There are two essential skills for the data scientist: engineering and statistics. A great many data scientists are very strong engineers but feel like impostors when it comes to statistics. In this talk John will argue that the ability to program a computer gives you special access to the deepest and most fundamental ideas in statistics. John’s goal is to convince the non-statistician engineers in the audience that the road to statistical fluency is much, much shorter than they think...
Google voice search: faster and more accurate
Today, we’re happy to announce we built even better neural network acoustic models using Connectionist Temporal Classification (CTC) and sequence discriminative training techniques. These models are a special extension of recurrent neural networks (RNNs) that are more accurate, especially in noisy environments, and they are blazingly fast!...
The sorry state of football analytics
The sad truth of the matter is that the state of football analytics in 2015 is not good and isn't showing signs of improving. This is especially true in the NFL, though I think a lot of this applies to college football as well. The body of football research is not advancing at the same rate and is not of the same quality as in basketball, baseball, or hockey...
Jobs
VP Data Science - The Weather Company - NYC
We are The Weather Company, and our name speaks for itself. We are a company focused entirely on the weather, and we’re proud to say we reach two-thirds of all U.S. adults through a media portfolio that includes The Weather Channel, weather.com and our mobile applications, Weather Services International, and Weather Underground. The VP of Data Science will lead a team of data scientists, engineers, and product managers working on massive sets of weather, location, and audience data. The VP will define data science strategy across all of our advertising products, and own building and implementing proprietary methods in the areas of audience targeting, ad attribution, mobile and location-based targeting, and performance tracking across all of our ad products...
Training & Resources
Recurrent Neural Networks Tutorial, Part 2 – Implementing a RNN with Python, Numpy and Theano
In this part we will implement a full Recurrent Neural Network from scratch using Python and optimize our implementation using Theano, a library to perform operations on a GPU. The full code is available on Github. I will skip over some boilerplate code that is not essential to understanding Recurrent Neural Networks, but all of that is also on Github...
How do neural networks learn?
To help understand how neural networks learn, I built a visualization of a network at the neuron level, including animations that show how it learns...
Large-scale CelebFaces Attributes (CelebA) Dataset
CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations...
Books
The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World - New release...
"With terms like ‘Machine Learning’ and ‘Big Data’ regularly making headlines, there is no shortage of hype-filled business books on the subject. There are also textbooks that are too technical to be accessible. For those in the middle—from executives to college students—this is the ideal book, showing how and why things really work without the heavy math..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :) - All the best, Hannah & Sebastian