Data Science Weekly - Issue 141
Issue #141 Aug 4 2016
Editor Picks
Make Algorithms Accountable
Algorithms are ubiquitous in our lives. They map out the best route to our destination and help us find new music based on what we listen to now. But they are also being employed to inform fundamental decisions about our lives...
Build Algorithms Like You Give a Damn
For the second year in a row, WrangleConf did not disappoint. The conversation picked up right where last year’s left off: on the ethics of our craft. Last year the focus was on the humans building algorithms and the humans whom algorithms affect. This year, the discussion expanded in scope to consider the growing number of people who interact with data science teams...
Twitter Facial Analysis Reveals Demographics of Presidential Campaign Followers
If you follow Hillary Clinton or Donald Trump on Twitter, your face has probably been analyzed by a machine to determine your age, ethnicity, and social influence...
A Message from this week's Sponsor:
SQL Dashboards in a Flash.
Periscope Data lets you run analyses over billions of rows in seconds...
Data Science Articles & Videos
The AI That Cut Google’s Energy Bill Could Soon Help You
The same type of algorithm that beats humans at complex games is being applied in more practical areas....
Machine Learning over 1M hotel reviews finds interesting insights
In this post we will cover how we can use these machine learning models to analyze millions of reviews from TripAdvisor and then compare how people feel about hotels in different cities...
Machine Vision’s Achilles’ Heel Revealed by Google Brain Researchers
By some measures machine vision is better than human vision. But now researchers have found a class of “adversarial images” that easily fool it...
Does sentiment analysis work? A tidy analysis of Yelp reviews
Sentiment analysis is often used by companies to quantify general social media opinion (for example, using tweets about several brands to compare customer satisfaction). One of the simplest and most common sentiment analysis methods is to classify words as “positive” or “negative”, then to average the values of each word to categorize the entire document. But does this method actually work?...
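The word-averaging method described above can be illustrated with a minimal sketch. The tiny lexicon and the function names below are hypothetical and chosen only for illustration; they are not taken from the linked article, which works with a full sentiment lexicon and real Yelp reviews.

```python
import re

# Hypothetical toy lexicon (an assumption for this sketch):
# +1 marks a "positive" word, -1 marks a "negative" word.
LEXICON = {
    "good": 1, "great": 1, "love": 1,
    "bad": -1, "terrible": -1, "slow": -1,
}

def sentiment_score(text):
    """Average the lexicon values of the words in `text` that appear in LEXICON."""
    words = re.findall(r"[a-z']+", text.lower())
    values = [LEXICON[w] for w in words if w in LEXICON]
    if not values:
        return 0.0  # no known sentiment words: treat as neutral
    return sum(values) / len(values)

def classify(text):
    """Label a document by the sign of its average word score."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify("Great food, love it"))        # both matched words are positive
print(classify("Terrible and slow service"))  # both matched words are negative
print(classify("not good"))                   # negation is ignored by this method
```

Because each word is scored independently, a phrase like "not good" still counts as positive, which is one reason to ask whether the method actually works.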
Pinterest’s Founder: Algorithms Don’t Know What You Want
CEO Ben Silbermann says Pinterest is built on the idea that crowds of people are best at finding content that consumers care about...
An experiment in trying to predict Google rankings
In late 2015, JR Oakes and his colleagues undertook an experiment to attempt to predict Google ranking for a given webpage using machine learning. What follows are their findings, which they wanted to share with the SEO community...
Team USA by the Numbers
Is being an Olympic champion determined by your genes? Longtime readers will remember the three-part series from 2014 exploring this question. That project was prompted by David Epstein’s claim, based on his book The Sports Gene, that given a roster of Olympic athletes and their weights and heights, he could predict their events with high accuracy. Using a variety of machine learning models I determined that they could achieve about 30 percent accuracy, which made me dubious of Epstein’s claims...
Dreaming of names with RBMs
A classic problem in natural language processing is named entity recognition. Given a text, we have to identify the proper nouns. But what about the generative mirror image of this problem - i.e. named entity generation? What if we ask a model to dream up new names of people, places and things?...
Jobs
Junior Data Scientist - Penguin Random House - NYC
At Penguin Random House, the Data Science & Analytics group is an agile team composed of data scientists, software engineers, front-end developers, and industry experts capable of tackling any data-oriented problem.
As a junior data scientist on this team, you will have an opportunity to work on a variety of high-profile projects while working closely with senior data scientists and key decision makers across the organization to help solve analytical problems of strategic value. Our areas of focus include price elasticity, consumer research, marketing attribution, and title segmentation, as well as ad-hoc analysis and data exploration. Your domain of expertise will be equal parts feature engineering and statistical analysis – or equivalently, machine learning...
Training & Resources
Introduction to Statistics and Basics of Mathematics for Data Science - The Hacker's Way
This is the repository for the full day workshop conducted at Fifth Elephant 2016...
Deep Reinforcement Learning for Keras
keras-rl implements some state-of-the-art deep reinforcement learning algorithms in Python and seamlessly integrates with the deep learning library Keras...
Tips on Building Neural Machine Translation Systems
This tutorial will explain some practical tips about how to train a neural machine translation system. It is partly based around examples using the lamtram toolkit...
Books
Data Visualization with Python and JavaScript: Scrape, Clean, Explore & Transform Your Data
Learn how to turn raw data into rich, interactive web visualizations with the powerful combination of Python and JavaScript. With this hands-on guide, author Kyran Dale teaches you how to build a basic dataviz toolchain with best-of-breed Python and JavaScript libraries, including Scrapy, Matplotlib, Pandas, Flask, and D3, for crafting engaging, browser-based visualizations...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :) - All the best, Hannah & Sebastian