Data Science Weekly - Issue 157
Issue #157, Nov 24, 2016
Editor Picks
In the fight against fake news, AI is waging a battle it cannot win
It’s become clear that the algorithms Facebook and Google designed to deliver news to their users have failed. But while fake news is a headache for those tech giants right now, the underlying research question—whether and how machines tell truth from lies on the internet—is one that will persist as long as the world wide web stays an open forum...
Here’s Waldo: Computing the optimal search strategy for finding Waldo
That’s when I decided what my weekend project would be: I was going to pull out every machine learning trick in my tool box to compute the optimal search strategy for finding Waldo. I was going to crush Slate’s supposed foolproof strategy and carve a trail of defeated Waldo-searchers in my wake...
Machine Learning for Everyday Tasks
Machine learning is often thought to be too complicated for everyday development tasks. We often associate it with things like big data, data mining, data science, and artificial intelligence. I have always felt like we can benefit from using machine learning for simple tasks that we do regularly...
A Message from this week's Sponsor:
The End of Job Searching As You Know It
Hired brings job offers to you, so you can stop wasting your time applying. Apply to 4,000+ companies simultaneously and get free personalized support when you want it.
Data Science Articles & Videos
Reproducible Research: Stripe’s approach to Data Science
When people talk about their data infrastructure, they tend to focus on the technologies: Hadoop, Scalding, Impala, and the like. However, we’ve found that just as important as the technologies themselves are the principles that guide their use. We’d like to share our experience with one such principle that we’ve found particularly useful: reproducibility...
iSee: Using deep learning to remove eyeglasses from faces
How long does it usually take you to pick out a new pair of glasses at the store? 10 minutes? 30? When left unsupervised, I’ve admittedly taken over an hour. Head tilt. Half smile. Side shot. Next pair. It’s 2016; there must certainly be some sort of technology that has solved this problem. Of course there is!...
Google’s AI translation tool invents its own secret internal language
All right, don’t panic, but computers have created their own secret language and are probably talking about us right now. Well, that’s kind of an oversimplification, and the last part is just plain untrue. But there is a fascinating and existentially challenging development that Google’s AI researchers recently happened across...
Even in the Moneyball Era, Baseball’s Pundits Won’t Go Away
This past season Miguel Cabrera of the Detroit Tigers won baseball’s triple crown – he led the American League in home runs, runs batted in, and batting average. No one had done this since 1967, so Cabrera was the near-universal pick as the league’s most valuable player, winning 22 of 28 first place votes. This is a bit odd, since he was only the second most valuable player in the league...
What I Discovered About Trump & Clinton From 4MM Facebook Posts
On Facebook, headlines are often more important than the articles themselves. Most headlines are browsed, not clicked. Think about your own Facebook behavior: how often do you click on links? Because of this, the headlines frame our positions on topics without our even having to read the content. It’s quick, simple, and we feel informed. But with respect to politics, this news feed browsing behavior creates an electorate that can become dangerously uninformed...
Machine Learning is About to Turn the Marketing World Upside Down
Scene: It’s 2017, and your CEO calls you in. She asks about your market segmentation—you have five personas, based on demographic data. Then she hands you a report from the data scientist. He’s identified nine distinct segments, based on purchase intent and customer behavior, for which there is an opportunity to increase margin by better targeting service offerings and marketing messages. You sneak a peek at the methodology, and see statistical and technical gobbledygook...
Statistical Mistakes and How to Avoid Them
Computer scientists in systemsy fields, myself included, aren’t great at using statistics. Maybe it’s because there are so many other potential problems with empirical evaluations that solid statistical reasoning doesn’t seem that important. Other subfields, like HCI and machine learning, have much higher standards for data analysis. Let’s learn from their example. Here are three kinds of avoidable statistics mistakes that I notice in published papers...
Solving 8 visualisation challenges with ggplot2
My presentation to NYC Data Vis last week...
Jobs
Senior Data Science Analyst - VSCO - Oakland, CA
VSCO is a leading creative platform with a monthly audience of over 45 million and growing.
We are looking for a Senior Data Science Analyst to build data at VSCO from the ground up. You will design our data model for user behavior and content impressions, and mine the data to bring out insights that will influence the product roadmap. Expect to get your hands dirty with Redshift, Spark, and data visualization tools under the guidance of our Director of Data Science...
Training & Resources
How to Learn Machine Learning: The Self Starter Way
In this guide, we're going to reveal how you can get a world-class machine learning education for free...
The 10 Best AI, Data Science and Machine Learning Podcasts
Learn the basics and keep up with the latest news in data science, machine learning and artificial intelligence by listening to these great podcasts...
Awesome Machine Learning
A curated list of awesome machine learning frameworks, libraries and software (by language)...
Books
Learn Python the Hard Way: A Very Simple Introduction to the Terrifyingly Beautiful World of Computers and Code
"Zed Shaw has perfected the world's best system for learning Python. Follow it and you will succeed, just like the hundreds of thousands of beginners Zed has taught to date"...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :) - All the best, Hannah & Sebastian