Data Science Weekly - Issue 37
Issue #37 Aug 7 2014
Editor Picks
Recommending music on Spotify with deep learning
This summer, I’m interning at Spotify in New York City, where I’m working on content-based music recommendation using convolutional neural networks. In this post, I’ll explain my approach and show some preliminary results...
Predicting Supreme Court Rulings: Model Can Predict 7 out of 10
Today, I am proud to announce the next evolution in Supreme Court prediction. Rather than relying on human predictions, my colleagues and I have developed an algorithm that can predict any case decided by the Supreme Court, since 1953, using only information available at the time of the cert grant...
Extracting audio from visual information: Algorithm recovers speech from the vibrations of a potato-chip bag filmed through soundproof glass.
Researchers at MIT, Microsoft, and Adobe have developed an algorithm that can reconstruct an audio signal by analyzing minute vibrations of objects depicted in video. In one set of experiments, they were able to recover intelligible speech from the vibrations of a potato-chip bag photographed from 15 feet away through soundproof glass...
Data Science Articles & Videos
Create your own machine learning powered RSS reader in under 30 minutes
I recently discovered SkimFeed, which I love and call my “dashboard into nerd-dom.” Basically, it is a single view of the major tech sites’ titles. However, I wanted more information on each article before I decided to click on one, so I thought: Why not use text analysis algorithms as a more efficient way of consuming my feeds?...
Why a deep-learning genius left Google & joined Chinese tech shop Baidu (interview)
The strength of Baidu lies not in youth-friendly marketing or an enterprise-focused sales team. It lives instead in Baidu’s data centers, where servers run complex algorithms on huge volumes of data and gradually make its applications smarter, including not just Web search but also Baidu’s tools for music, news, pictures, video, and speech recognition...
Self-editing video automatically cuts out the boring bits
LiveLight uses machine learning and algorithms to automatically edit video reels into a montage of the most interesting clips...
KCBO – A Bayesian Data Analysis Toolkit
The goal of KCBO is to provide an easy-to-use Bayesian framework to the masses...
How do I become a data scientist? An evaluation of 3 alternatives
One of the most frequent questions we hear, right behind “so, what exactly is a data scientist” or “what makes a great data scientist”, is “how do I become one? I should probably just get a Master’s, right?” Perhaps not anymore...
How ‘Game of Thrones’ Will Predict the Next Bin Laden
How do you predict the terror leader of the future? In sort of the same way you can predict what happens next on Game of Thrones: applied statistics...
Learning from Kaggle Masters
As a part of my master's thesis on competitive machine learning, I talked to a series of Kaggle Masters to try to understand how they were consistently performing well in competitions...
A Multiresolution Stochastic Process Model for Predicting Basketball Possession Outcomes
Basketball games evolve continuously in space and time as players constantly interact with their teammates, the opposing team, and the ball. However, current analyses of basketball outcomes rely on discretized summaries of the game that reduce such interactions to tallies of points, assists, and similar events. In this paper, we propose a framework for using optical player tracking data to estimate, in real time, the expected number of points obtained by the end of a possession. ...
Interview: Thomas Levi, PlentyOfFish - What Big Data tells us about Romance
We discuss interesting research on the state of romance in the US, how PlentyOfFish is managing competition, his personal journey from String Theory to Data Science, career advice and more...
Linear Discriminant Analysis bit by bit
I received a lot of positive feedback about the step-wise Principal Component Analysis (PCA) implementation. Thus, I decided to write a little follow-up about Linear Discriminant Analysis (LDA) — another useful linear transformation technique...
Jobs
Data Scientist - Domino's - Ann Arbor, MI (relocation provided)
Data Scientists at Domino's Pizza don't just crunch numbers...they view the universe as one large data set, and they decipher relationships and high level insights from that mass of information. The analytics they develop are then used across the organization to guide decisions, predict outcomes, and develop a quantitative ROI...
Training & Resources
R Documentation
Rdocumentation is a tool that helps you easily find and browse the documentation of all current and some past packages on CRAN (6741 R packages and 137527 R functions)...
Neural Networks and Deep Learning
Free online book...
Introduction to Recommender Systems: A 4-hour lecture
A couple of weeks ago, I gave a 4-hour lecture on Recommender Systems at the 2014 Machine Learning Summer School at CMU. The school was organized by Alex Smola and Zico Kolter and, judging by the attendance and the quality of the speakers, it was a big success...
Books
What is a p-value anyway? 34 Stories to Help You Actually Understand Statistics
Statistics and humor!...
"This book is a must-read for statistics students or teachers at any level and speaks directly to the "I just don't get it" thought that so many of us have experienced. After reading this book, you will get it. In a casual and easy-to-read manner, Vickers reviews statistical concepts and ideas that have puzzled students for years and explains them in a way that is easy to grasp..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Enjoyed the newsletter? Please forward it to friends and peers - we'd love to have them onboard too :-) - All the best, Hannah & Sebastian