Data Science Weekly - Issue 41
Issue #41, Sept 4, 2014
SPECIAL NOTICE: 20% DISCOUNT for Strata Conference + Hadoop World
October 15–17, 2014 | New York, NY
Strata + Hadoop World is where cutting-edge data science and new business fundamentals intersect—and merge. Strata brings together the decision makers using big data to drive business strategy and practitioners who collect and analyze the data. Combined with Hadoop World, the joint event is also the largest gathering of the Apache Hadoop community in the world.
Register with code DTSW to save 20%
Editor Picks
Making a Bayesian Model to infer Uber Rider Destinations
In this latest #UberData installment, we bring you the data science details of how we use classic Bayesian statistics to solve a uniquely Uber problem...
What I learned from competing against a ConvNet on ImageNet
The results of the 2014 ImageNet Large Scale Visual Recognition Challenge (ILSVRC) were published a few days ago. The New York Times wrote about it too. ILSVRC is one of the largest challenges in Computer Vision and every year teams compete to claim the state-of-the-art performance on the dataset...
Fields Medal Winners
A collection of video lectures from Fields Medal winners, going back to the 1980s...
Data Science Articles & Videos
Mathematical Predictions for the iPhone 6
Ok, this isn’t really math. Let’s instead call this a plain old model (you could argue it’s math if you like). Suppose I look at the historical progression of features on the previous iPhones. Could I use this to make a prediction about future iPhone models? In particular, what can I say about the rumored iPhone 6 that should be announced on September 9?...
Statistical Inference: The Big Picture
Definitive paper on the debate between frequentists and Bayesians...
Neglected machine learning ideas
This post is inspired by the “metacademy” suggestions for “leveling up your machine learning.” They make some halfway decent suggestions for beginners. The problem is, these suggestions won’t give you a view of machine learning as a field; they’ll only teach you about the subjects of interest to authors of machine learning books, which is different...
Data scientist: Your mileage may vary
For those madly scrambling to hire data scientists, make sure you're hiring the right kind. Getting it wrong can be very expensive...
John Wilbanks: Let’s Pool Our Medical Data
In this TED Talk, data scientist John Wilbanks discusses how strict privacy laws inhibit scientific research efforts, and asks us to imagine what potential discoveries could result from a giant pool of freely available anonymized health and genomic data...
What's in a Post, Part 1
What’s in a post? Reddit pulls in around 115 million unique visitors each month, amassing a staggering 5 billion page views per month. For a long time, I’ve wondered what factors draw people to certain Reddit posts while shunning others - does it have to do with the time of day a post is submitted? Do certain users have a monopoly on the most viewed posts? What about text posts vs. links?...
A visual proof that neural nets can compute any function
One of the most striking facts about neural networks is that they can compute any function at all...
The Next Big Thing in Sports data - Predicting (and Avoiding) Injuries
Can data tell a player's future? Teams like the San Antonio Spurs and New England Patriots are betting on it...
What I Learned As Pandora’s First Data Scientist
Three years ago, Gordon Rios became Pandora’s first official data scientist. Since then, he’s seen the team grow to over a dozen strong and become hugely influential in every decision the company makes. Considering how much of Pandora’s service is data-driven — from maintaining its famous Music Genome Project to creating even more ways for people to discover music they’ll love — it’s one of the best examples around of a data science team growing fast and lean to make a difference...
Jobs
Principal Data Scientist - Skype - Redmond, WA
We are seeking a highly capable Data Scientist who is passionate about data analysis and modeling, and is driven to make Skype and Lync have the best audio and video communication experience possible. The Skype Real-Time Media group applies core expertise in audio and video signal processing to problems in advanced telecommunication scenarios that are used by hundreds of millions of customers worldwide...
Training & Resources
JuliaCon 2014: Videos
These are all the talks from JuliaCon 2014, held in Chicago on June 26 and 27. It was attended by close to a hundred people, and featured talks on various aspects of Julia...
International Conference on Machine Learning 2014: Videos
TechTalks from event: International Conference on Machine Learning 2014...
A Methodology to Perform Linear Regression
Building a Linear Regression Predictor: The challenge is to build a predictor that takes in a set of tab delimited floating values and predicts a floating value...
Books
Machine Learning with R
Practical tutorial that uses hands-on examples to step through real-world application of machine learning...
"If you are new to both machine learning and R and want to learn both at the same time, I can't imagine there being a better book...."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.
P.S. Enjoyed the newsletter? Please forward it to friends and peers - we'd love to have them onboard too :-) - All the best, Hannah & Sebastian