Data Science Weekly - Issue 43
Issue #43 Sept 18 2014
Editor's Picks
Something About Cats, Dogs, Machine and Deep Learning
Deep Blue beat Kasparov at chess in 1997. Watson beat the brightest trivia minds at Jeopardy in 2011. Can you tell Fido from Mittens in 2013?...
Y Combinator Data Science Start-ups
There are new and exciting commercial opportunities in the data science space. We take a look at the data science start-ups from the latest Y Combinator batch....
Guess Who Rated This Movie: Identifying Users Through Subspace Clustering
It is often the case that, within an online recommender system, multiple users share a common account. Can such shared accounts be identified solely on the basis of the user-provided ratings? Can recommendations be adjusted accordingly...
Data Science Articles & Videos
Why are we still teaching t-tests?
My posting about the statistics profession losing ground to computer science drew many comments, not only here in Mad (Data) Scientist, but also in the co-posting at Revolution Analytics, and in Slashdot. One of the themes in those comments was that Statistics Departments are out of touch and have failed to modernize their curricula. Though I may disagree with the commenters’ definitions of “modern,” I have in fact long felt that there are indeed serious problems in statistics curricula...
Propensity Modeling, Causal Inference, and Discovering Drivers of Growth
Causality is incredibly important, yet often extremely difficult to establish...
What does randomness look like?
Here are two patterns, from Steven Pinker’s book, The Better Angels of our Nature. One of the patterns is randomly generated. The other imitates a pattern from nature. Can you tell which is which?...
How Twitter Handles Your Data - Interview with Jake Mannix, Machine Learning Engineer at Twitter
We sat down with Jake Mannix at the RE.WORK conference in Berlin to talk about how Twitter handles your data, and some of his past work at LinkedIn...
Using data science to build better products
This is a post on how data science is creating some of the coolest and most useful products ever...
Higgs Boson Machine Learning Challenge
Winning submission shared (a bag of 70 neural networks)...
Sequence to Sequence Learning with Neural Networks
Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure...
Google uses R to calculate ROI on advertising campaigns
Google has just released a new package for R: CausalImpact. Amongst many other things, this package allows Google to resolve the classical conundrum: how can we assess the impact of an intervention (for example, the effect of an advertising campaign on website clicks) when we can't know what would have happened if we hadn't run the campaign?...
Acting on Analytics: How to Build a Data-Driven Enterprise
Talk from Mario Faria, Chief Data Officer and Advisor, Bill and Melinda Gates Foundation and member of MIT Data Science Initiative...
QANTA: A Deep Question Answering Model
We introduce a recursive neural network model that is able to correctly answer paragraph-length factoid questions from a trivia competition called quiz bowl...
Jobs
Data Scientist, Instagram Analytics - Menlo Park, CA
We’re looking for data scientists with a passion for Internet technology to help drive informed business decisions for Instagram. You will enjoy working with top-notch people, one of the richest data sets in the world, cutting edge technology, and the ability to see your insights turned into real products on a regular basis...
Training & Resources
Dashing D3.js - Tutorial
Learn how to make Data Visualizations with D3.js...
DeepLearning.University – An Annotated Deep Learning Bibliography
DeepLearning.University is an annotated bibliography of recent publications (2014-) related to Deep Learning...
Machine Learning Cheat Sheet
Classical equations and diagrams in machine learning...
Books
The Norm Chronicles: Stories and Numbers About Danger and Death
Explains how statistical regularities and irregularities are central to every aspect of our lives...
"Very well written book, it is a very interesting look at risks and the associated statistics. Put into perspective by a presentation of the shortcomings of the same statistics..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Enjoyed the newsletter? Please forward it to friends and peers - we'd love to have them onboard too :-) - All the best, Hannah & Sebastian