Data Science Weekly - Issue 118
Issue #118, February 25, 2016
Editor Picks
Recommending for the World
The Netflix experience is driven by a number of Machine Learning algorithms: personalized ranking, page generation, search, similarity, ratings, etc. On the 6th of January, we simultaneously launched Netflix in 130 new countries around the world... In this post, we highlight the four most interesting challenges we’ve encountered in making our algorithms operate globally and, most importantly, how this improved our ability to connect members worldwide with stories they'll love...
Britain's diet in data
The British diet has undergone a transformation in the last half-century. Traditional staples such as eggs, potatoes and butter have gradually given way to more exotic or convenient foods such as aubergines, olive oil and stir-fry packs. Explore the changes across four decades and hundreds of food and drink categories in this interactive visualisation...
Viewing the US Presidential Primary Through the Lens of Twitter
Some of our work on social media analytics was highlighted in a recent Wall Street Journal article, and it gives us a great opportunity to talk about some of the methods we use for making sense of the Twitter firehose...
A Message from this week's Sponsor:
RJMetrics Pipeline: All Your Data in Redshift
RJMetrics Pipeline connects to the systems you use and streams that data to Amazon Redshift, ready for you to analyze. Query data from Salesforce, Postgres, Stripe, MongoDB, and 30 more integrations in SQL or in your favorite BI tool. Setup takes five minutes or less. Start your free 14-day trial today!
Data Science Articles & Videos
The Data Science of Firing Your (NHL) Coach
The Habs — as the Canadiens are affectionately known — have not been doing so great this year. In fact, they have lost so many games that Head Coach Michel Therrien’s job might very well be in jeopardy. This got me thinking: Does firing a coach during the season actually help a team improve their record? I decided to find out for myself...
Automate Your Oscars Pool with R
This is a program made with R / ggplot2 that machine-processes ballots and winners, and plots standings for your Oscars pool...
Where the f*** can I park?
I made a map showing the different residential parking areas in my city...
Using Data Science to Improve Diversity at Airbnb
We worried that some form of unconscious bias had infiltrated our interviews, leading to lower conversion rates for women. But before diving into a solution, we decided to treat this like any problem we work on — begin with research, identify an opportunity, experiment with a solution, and iterate...
Will It Shuffle?
D3 visualization comparing shuffling algorithms...
Data Science at Instacart
We work incredibly hard to make Instacart easy to use. Our site and app are intuitive – you fill your shopping cart, pick the hour you want delivery to occur in, and then the groceries are handed to you at your doorstep. But achieving this simplicity cost effectively at scale requires an enormous investment in engineering and data science...
A Billion Taxi Rides in Redshift
Truth is, outside of geospatial-specific queries, many columnar-based store engines would be a benefit to this dataset [1.1 billion Taxi journeys made in New York City between 2009 and 2015] in terms of query performance. In this blog post I'll look at getting the raw data from its original sources, denormalising it and importing it into a Redshift Data Warehouse on AWS...
Practical Black-Box Attacks against DL Systems using Adversarial Examples
What is the potential of machine art, and can it truly be described as creative or imaginative?...
Why pandas users should be excited about Apache Arrow
I'm super excited to be involved in the new open source Apache Arrow community initiative. There's plenty of places you can learn more about Arrow, but this post is about how it's specifically relevant to pandas users...
Robots That Teach Each Other
What if robots could figure out more things on their own and share that knowledge among themselves?...
Jobs
Data Scientist Intern, Summer 2016 - Groupon - Chicago, IL
Groupon is currently at the exciting intersection of local and online commerce, mobile, personalization and discovery. We are looking for the next wave of great algorithm engineers and data scientists to work on presenting just the right deals to the right users at the right time, for hundreds of millions of daily interactions. We work in small, independent and agile teams, and collaborate with product designers to implement stuff quickly...
Training & Resources
Introducing Vega-Lite
Today we are excited to announce the official 1.0 release of Vega-Lite, a high-level format for rapidly creating visualizations for analysis and presentation...
TensorFuse
Common interface for Theano, CGT, TensorFlow, and mxnet (experimental)...
So you want to build a generator…
This is a beginner-level advice essay for people just getting started with building generators. It’s also for practiced experts who want a way to organize their knowledge...
Books
Cognitive Computing: A Brief Guide for Game Changers
Concise report on the most noteworthy developments in artificial intelligence...
"I loved it- this book that is- but it scared the living (Stuffing) out of me. Very very thought provoking..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :)
All the best, Hannah & Sebastian