Data Science Weekly - Issue 158
Issue #158 Dec 1 2016
Editor Picks
The Simple Economics of Machine Intelligence
As economists, we believe some simple rules apply. Technological revolutions tend to involve some important activity becoming cheap, like the cost of communication or finding information. Machine intelligence is, in its essence, a prediction technology, so the economic shift will center around a drop in the cost of prediction...
The secret to smarter fresh-food replenishment? Machine learning
With machine-learning technology, retailers can address the common—and costly—problem of having too much or too little fresh food in stock...
An AI Ophthalmologist Shows How Machine Learning May Transform Medicine
Google researchers trained an algorithm to recognize a common form of eye disease as well as many experts can...
A Message from this week's Sponsor:
Level Up Your Python Workflow & Get Notebooks You Can Share
Mode is a SQL editor, Python notebook, and visualization builder all rolled into one. Explore data with SQL and pass results instantly into a Python notebook for further exploration and visualization. Pick and choose output cells to present to others, or send the whole notebook—you can even share with people who don't have a Python environment set up.
Data Science Articles & Videos
Google's Hand-Fed AI Now Gives Answers, Not Just Search Results
Deep learning is changing how Google's search engine works. But its newfound efficiency takes a lot of painstaking human work behind the scenes...
AI Machine Attempts to Understand Comic Books ... and Fails
The list of activities in which artificial intelligence machines have bested humans is increasing at an alarming rate. Face recognition, object recognition, chess, Go, various video games, and numerous other tasks have all fallen in this battle. So it’s natural to ask about the types of tasks that machines still have difficulty with. Where do humans still rule the roost?...
How the Trump Campaign Built an Identity Database and Used Facebook Ads to Win the Election
There may be some fake news on Facebook, but the power of the Facebook advertising platform to influence voters is very real. This is the story of how the Trump campaign used data to target African Americans and young women with $150 million of Facebook and Instagram advertisements in the final weeks of the election, quietly launching the most successful digital voter suppression operation in American history...
Fast Face-swap Using Convolutional Neural Networks
We consider the problem of face swapping in images, where an input identity is transformed into a target identity while preserving pose, facial expression, and lighting. To perform this mapping, we use convolutional neural networks trained to capture the appearance of the target identity from an unstructured collection of his/her photographs...
Probabilistic Data Structure Showdown: Cuckoo Filters vs. Bloom Filters
The Fast Forward Labs team explored probabilistic data structures in our “Probabilistic Methods for Real-time Streams” report and prototype. This post provides an update by exploring Cuckoo filters, a new probabilistic data structure that improves upon the standard Bloom filter...
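For readers new to the topic, here is a minimal sketch of the standard Bloom filter that the post uses as its baseline. It is not code from the Fast Forward Labs report: the class name, the default sizes, and the use of seeded SHA-256 as the hash family are all assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter (hypothetical names and default sizes)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real filter packs bits.
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Assumption: derive the hash functions by prefixing a seed byte
        # to the item and hashing with SHA-256.
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))


seen = BloomFilter()
seen.add("spark")
print("spark" in seen)  # an added item is always reported as present
```

A filter like this can return false positives but never false negatives, and it has no safe way to delete an item, because clearing a bit might also erase evidence of other items. Cuckoo filters store short fingerprints instead of raw bits, which is what lets them support deletion.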
Improving variational approximations
Nick Foti, Ryan Adams, and I just put a paper on the arxiv about improving variational approximations (short version accepted early to AABI2016). We focused on one problematic aspect of variational inference in practice — that once the optimization problem is solved, the approximation is set and there isn’t a straightforward way to improve it, even when we can afford some extra compute time...
Plotting Earthquakes with D3.js + Leaflet
I am still learning d3.js, and thought it would be a good idea to share with you my trial and error process (admittedly, sometimes more error than trial) when doing the earthquake visualization. Here is the visualization. I later go through some of the steps I took to complete it...
Decoding The Thought Vector
In this blog post I put forward a possible interpretation of these vectors. I argue we shouldn't take these vectors literally, but rather as an encoding for a simpler, sparse data structure. This gives rise to a simple technique (the k-SVD) for reverse engineering this data structure, and gives us the tools to decode the vectors' meaning...
Jobs
Senior Data Science Analyst - VSCO - Oakland, CA
VSCO is a leading creative platform with a monthly audience of over 45 million and growing.
We are looking for a Senior Data Science Analyst to build data at VSCO from the ground up. You will design our data model for user behavior and content impressions, and mine the data to bring out insights that will influence the product roadmap. Expect to get your hands dirty with Redshift, Spark, and data visualization tools under the guidance of our Director of Data Science...
Training & Resources
Calculating AUC: the area under a ROC Curve
In this post I’ll work through the geometry exercise of computing the area, and develop a concise vectorized function that uses this approach. Then we’ll look at another way of viewing AUC which leads to a probabilistic interpretation...
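The two views of AUC mentioned in the blurb can be sketched in a few lines of plain Python. This is an illustration rather than the post's own vectorized function: the function names and toy data are made up, and the trapezoid version assumes no tied scores.

```python
def auc_trapezoid(labels, scores):
    """Area under the ROC curve via the trapezoid rule (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(1 for _, label in ranked if label)
    n_neg = len(ranked) - n_pos
    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    area = 0.0
    # Lower the threshold one example at a time and add each trapezoid.
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        tpr, fpr = tp / n_pos, fp / n_neg
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_tpr, prev_fpr = tpr, fpr
    return area


def auc_pairwise(labels, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, label in zip(scores, labels) if label]
    neg = [s for s, label in zip(scores, labels) if not label]
    # Ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(auc_trapezoid(labels, scores), auc_pairwise(labels, scores))
```

On this toy data, 7 of the 9 positive/negative pairs are ranked correctly, so both functions return a value of 7/9 (about 0.78) up to floating-point rounding.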
D3js Tutorial Video: Basic Chart - Grouped Bar Chart
You will use the CSV data from the D3js.org website Grouped Bar Chart Example to see how a full D3 Grouped Bar Chart Example data visualization is built...
An Interactive Tutorial on Numerical Optimization
I ended up writing a bunch of numerical optimization routines back when I was first trying to learn JavaScript. Since I had all this code lying around anyway, I thought that it might be fun to provide some interactive visualizations of how these algorithms work. The cool thing about this post is that the code is all running in the browser, meaning you can interactively set hyper-parameters for each algorithm, change the initial location, and change what function is being called to get a better sense of how these algorithms work...
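To give a feel for the three knobs the tutorial exposes (hyper-parameters, initial location, and the function being minimized), here is a small sketch using plain gradient descent. Gradient descent is chosen only as a familiar example; the tutorial's own routines are not reproduced here, and every name below is hypothetical.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Follow the negative gradient from `start`, recording every point visited."""
    point = list(start)
    path = [tuple(point)]
    for _ in range(steps):
        gradient = grad(point)
        point = [p - learning_rate * g for p, g in zip(point, gradient)]
        path.append(tuple(point))
    return path


def bowl_gradient(point):
    """Gradient of the toy function f(x, y) = x**2 + y**2."""
    x, y = point
    return [2 * x, 2 * y]


# Change the start, learning rate, or gradient function to see the path change.
path = gradient_descent(bowl_gradient, start=(3.0, -2.0), learning_rate=0.1)
print(path[-1])  # ends very close to the minimum at (0, 0)
```

On this bowl-shaped function each step scales the point by (1 - 2 * learning_rate), so a rate of 0.1 shrinks it steadily toward the origin, while any rate above 1.0 makes the iterates grow instead of converge.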
Books
Learn Python the Hard Way: A Very Simple Introduction to the Terrifyingly Beautiful World of Computers and Code
"Zed Shaw has perfected the world's best system for learning Python. Follow it and you will succeed-just like the hundreds of thousands of beginners Zed has taught to date"...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :)
All the best, Hannah & Sebastian