Data Science Weekly - Issue 26
Issue #26 May 22 2014
Editor Picks
Data Science at Twitch - CEO Perspective: Emmett Shear Interview
We recently caught up with Emmett Shear, CEO of Twitch. We were keen to learn more about his background, how data and Data Science have influenced Twitch's growth to this point, and what role they have to play going forward...
Neural networks and a dive into Julia
A few weekends ago, I made the decision to casually brush up on my neural networks. Why? Well, for starters neural networks are super interesting. Additionally, I was keen to revisit the topic given all the activity around "deep learning" in the Twittersphere. Julia turned out to be the perfect language for digging into the guts of a machine learning algorithm...
Emerging Science of Superspreaders (And How to Tell If You're One)
Nobody has figured out how to spot the most influential spreaders of information in a real-world network. Now that looks set to change with important implications, not least for the superspreaders themselves...
Data Science Articles & Videos
The Next Big Thing You Missed: Airbnb’s Human Brains Crunch Data Better Than Computers
In 2011, Airbnb had a problem. The room-sharing site was growing fast, but so were customer complaints. People just couldn’t figure out how to use the service. The issue was so severe, Airbnb was getting an average of one customer service call for every room booked. To figure out how to fix this problem, the company asked Newman to look at the data...
The term Big Data is going to disappear in the next 2 years. Statistics will be what remains.
There is no question that big data have hit the business, government and scientific sectors. However, there is plenty of misleading hype around the terms 'big data' and 'data science'. This presentation gives a professional statistician's view on these terms, illustrates the connection between data science and statistics, and highlights some challenges and opportunities from a statistical perspective...
The Mind-Blowing Possibilities of plot.ly
We were fortunate enough to have Matt Sundquist of plot.ly come to our campus recently to talk about something that is his passion: sharing data for the purpose of data literacy...
Web-Scale Bayesian Click-Through Rate Prediction for Sponsored Search Advertising in Microsoft’s Bing Search Engine
We describe a new Bayesian click-through rate (CTR) prediction algorithm used for Sponsored Search in Microsoft’s Bing search engine. The algorithm is based on a probit regression model that maps discrete or real-valued input features to probabilities...
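For readers who want a feel for what "probit regression" means here, the following is a minimal sketch of the mapping from features to a click probability. It is not the Bing algorithm described in the paper: it uses fixed point-estimate weights rather than Bayesian beliefs, and the feature names, weights and bias are hypothetical values chosen only for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, the link function of a probit model."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_ctr(weights, features, bias=0.0):
    """Map a feature dict to a click probability.

    Discrete features are assumed to be one-hot encoded (value 1.0 when
    present); real-valued features are passed through as-is. Features with
    no learned weight contribute nothing to the score.
    """
    score = bias + sum(weights.get(name, 0.0) * value
                       for name, value in features.items())
    return normal_cdf(score)


# Hypothetical weights and one hypothetical ad impression.
weights = {"query=shoes": 0.4, "ad_position": -0.3}
features = {"query=shoes": 1.0, "ad_position": 2.0}

p = probit_ctr(weights, features, bias=-1.5)
print(f"predicted CTR: {p:.4f}")
```

The weighted sum can be any real number; passing it through the normal CDF squeezes it into the range 0 to 1, which is what lets it be read as a probability.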
Can We do Better than R-squared?
The R² calculated in Excel is often used as a measure of how well a model explains a response variable. There's a hidden trap, though. R² will increase as you add terms to a model, even if those terms offer no real explanatory power. By using the R² that Excel so helpfully provides, we can fool ourselves into believing that a model is better than it is. Below I'll demonstrate this and show an alternative that can be implemented easily in R...
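The trap described in this excerpt is easy to reproduce. The sketch below is our own illustration, not the linked article's code (which is in R): it fits ordinary least squares on simulated data, then adds columns of pure noise and reports R-squared alongside adjusted R-squared. Adjusted R-squared is one common alternative and is our assumption here; the article may recommend a different measure. The data, seed and helper names are all made up for the example.

```python
import random


def fit_ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
         + [sum(X[k][i] * y[k] for k in range(n))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def r_squared(X, y, beta):
    """1 - SS_residual / SS_total for the fitted coefficients."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
                 for row, yi in zip(X, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Penalise R-squared for k predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


random.seed(0)
n = 30
x = [random.uniform(0, 10) for _ in range(n)]
y = [2.0 + 0.5 * xi + random.gauss(0, 1) for xi in x]  # one real predictor
noise = [[random.gauss(0, 1) for _ in range(10)] for _ in range(n)]

results = []
for n_noise in (0, 5, 10):
    # Intercept, the real predictor, then the first n_noise junk columns.
    X = [[1.0, x[i]] + noise[i][:n_noise] for i in range(n)]
    beta = fit_ols(X, y)
    k = 1 + n_noise
    r2 = r_squared(X, y, beta)
    results.append((k, r2, adjusted_r_squared(r2, n, k)))

for k, r2, adj in results:
    print(f"predictors={k:2d}  R2={r2:.4f}  adjusted R2={adj:.4f}")
```

Because each model contains the previous one, R-squared cannot go down as junk columns are added, while adjusted R-squared pays a penalty for every extra term and always sits below the raw figure.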
Michael O’Connell, Chief Data Scientist, TIBCO on How to Lead in Big Data
We discuss Big Data vs. Fast Data, Data Visualization trends, Jaspersoft acquisition, factors differentiating future leaders of Big Data and more...
Python Implementation of Convolutional Neural Network
This is a solution to the Convolutional Neural Network exercise in the Stanford UFLDL Tutorial...
Ranking algorithms and the NFL (Part 1 of a series)
I recently picked up Who’s #1?: The Science of Rating and Ranking, a really fun read on the many ways to take a list of items and order them by some score. Obviously, rankings are a huge topic of interest in sports, and my day job is working on recommender systems, so I saw this as the natural intersection of these things...
VC Firm names Algorithm to its Board of Directors
Deep Knowledge Ventures, a firm that focuses on age-related disease drugs and regenerative medicine projects, says the program, called VITAL, can make investment recommendations about life sciences firms by poring over large amounts of data...
Jobs
Machine Learning Principal Scientist - Algorithms Engineering
Netflix - Los Gatos, CA
The Algorithms Engineering (AE) team owns the research, development and innovation for the algorithms driving the Netflix product including Personalization and Search. We are looking for an experienced machine learning leader to join our team and become the technical point of reference for a brilliant team of researchers and developers...
Training & Resources
A Primer on Deep Learning
In a presentation I gave at Boston Data Festival 2013 and at a recent PyData Boston meetup I provided some history of the method and a sense of what it is being used for presently. This post aims to cover the first half of that presentation, focusing on the question of why we have been hearing so much about deep learning lately...
Review of the First Three Johns Hopkins Coursera Data Science Courses
I am currently working towards the Johns Hopkins Data Science Specialization at Coursera. I posted my initial, and very positive, impressions when I was about half-way through the first four-week block. My impressions are still very favorable at completion. Now that the course is complete, I can post my complete thoughts for the first three courses...
An Introduction to Data-Driven Decisions for Managers Who Don’t Like Math
Not everyone needs to become a quant. But it is worth brushing up on the basics of quantitative analysis, so as to understand and improve the use of data in your business. We’ve created a reading list of the best HBR articles on the subject to get you started...
Books
The Signal and the Noise: Why So Many Predictions Fail — but Some Don't
Not a new book, though very well reviewed...
"This is the best general-readership book on applied statistics that I've read. Short review: if you're interested in science, economics, or prediction: read it. It's full of interesting cases, builds intuition, and is a readable example of Bayesian thinking."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Did you enjoy the newsletter? Do you have friends/colleagues who might like it too? If so, please forward it along - we would love to have them on board :)