Data Science Weekly - Issue 123
Issue #123, March 31, 2016
Editor Picks
Debunking Narrative Fallacies with Empirically-Justified Explanations
Of all our many talents – bipedalism, opposable thumbs, etc. – one of humanity’s most remarkable traits is our tendency to infer meaning from what happens around us. We understand the world through stories, and this is such a fundamental part of our nature that it is almost impossible to stop ourselves from inventing very reasonable-sounding explanations for what we see. A lot of these stories are intuitive and a lot of them might be right (seasonality is real in many businesses!), but we’re not good at knowing when our stories are trustworthy and when they aren’t...
One Genius' Lonely Crusade to Teach a Computer Common Sense
Over July 4th weekend in 1981, several hundred game nerds gathered at a banquet hall in San Mateo, California. Doug Lenat, then a 29-year-old computer science professor at nearby Stanford University, was among the players. But he didn’t compete alone. He entered the tournament alongside Eurisko, the artificially intelligent system he built as part of his academic research....
I've Seen The Greatest AI Minds of My Generation Destroyed By Twitter
After barely a day of consciousness, Microsoft’s chat bot Tay became a racist, sexist, trutherist, genocidal maniac...
A Message from this week's Sponsor:
SQL Dashboards in a Flash
Periscope Data lets you run analyses over billions of rows in seconds.
Data Science Articles & Videos
Why Airbnb Has a Data Scientist on Every Leadership Team
Airbnb's head of data science shares his keys for success in data and business...
Building a High-Throughput Data Science Machine
Scaling is hard. Scaling data science is extra hard. What does it take to run a sophisticated data science organization? What are some of the things that need to be on your mind as you scale to a repeatable, high-throughput data science machine?...
Shutterstock Is Visualizing Image In A Whole New Way
Computer vision can show you images you're actually looking for...
What time did you go to bed? A simple Bayesian model to improve user experience in HRV4Training
In this post I'll cover the data science and modeling behind the next version of the bedtime detection algorithms in HRV4Training...
Baidu Uses Map Searches to Predict When Crowds Will Get Out of Control
China’s leading Internet search company, Baidu, says that data collected from its customers could be used to predict and preëmpt potentially deadly crowd gatherings in the real world...
Machines Just Got Better at Lip Reading
Bear and her colleague Richard Harvey have come up with a new lip-reading algorithm that improves a computer’s ability to differentiate between sounds—such as ‘p’, ‘b,’ and ‘m’—that all look similar on lips...
Can I Hug That?
Classifier tells you whether or not what's in an image is huggable...
ggplot2 and Joy Division
A while ago I had a great time answering a question on Stack Overflow that asked about recreating a plot from a FiveThirtyEight article in ggplot2. You can see the original and my attempt below...
Bar Charts with Brains
But there are things we can do to bar charts to smarten them up while preserving their familiarity. The idea behind glasseye is to develop d3 charts for the presentation rather than the exploration of data. These are two very different activities and their confusion is behind many dull or incomprehensible presentations. For one thing we can give them a layer of intelligence that will help the user make better decisions. Here is our version of a bar chart that helps make sense of some multiple choice survey data...
CrowdSignals Aims to Create a Marketplace for Smartphone Sensor Data
Words and pictures, culled from across the web, have been the digital grist for remarkable gains in computing tasks like image recognition and speech translation. But another huge data resource — sensor data from smartphones — lags behind as a fuel source for major research advances...
Jobs
Data Scientist - Washington Post - Washington D.C.
Washington Post is looking for passionate data scientists to join our Big Data Analytics team. Washington Post has huge volumes of activity data and related business data from millions of customers. We are building an integrated Big Data Platform that stores all aspects of customer profiles and activities (360-degree view of customers), content and its metadata, and business data. The data scientist will use the data from the platform to design and build systems that apply machine learning, statistical modeling, NLP (Natural Language Processing), data visualization and other data science techniques to provide personalized content and experiences for customers, generate insights, improve advertising strategies, automate processes, and help newsrooms and other business units make data-driven decisions. This role is equal parts scientist, statistician & software developer...
Training & Resources
Missing data visualization module for Python
Messy datasets? Missing values? missingno provides a flexible and easy-to-use missing data matrix (nullity matrix?) visualization that allows you to get a quick visual summary of the completeness (or lack thereof) of your dataset...
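For readers who have not met a nullity matrix before, here is a minimal plain-Python sketch of the idea: a grid that marks, for every record and column, whether a value is present. It deliberately does not use missingno's own API, and the sample records, the `nullity_matrix`, `completeness`, and `render` helpers are all hypothetical names made up for illustration.

```python
# Hypothetical sample records; None marks a missing value.
rows = [
    {"age": 34, "city": "Oslo", "income": None},
    {"age": None, "city": "Lima", "income": 52000},
    {"age": 29, "city": None, "income": None},
    {"age": 41, "city": "Pune", "income": 61000},
]


def nullity_matrix(records, columns):
    """Return one list of booleans per record: True where a value is present."""
    return [[rec.get(col) is not None for col in columns] for rec in records]


def completeness(matrix):
    """Return the fraction of present values in each column."""
    n_rows = len(matrix)
    return [sum(col) / n_rows for col in zip(*matrix)]


def render(matrix, columns):
    """Draw the matrix as text: '#' for present, '.' for missing."""
    header = " ".join(columns)
    lines = [
        " ".join(
            ("#" if present else ".").ljust(len(col))
            for present, col in zip(row, columns)
        )
        for row in matrix
    ]
    return "\n".join([header] + lines)


columns = ["age", "city", "income"]
matrix = nullity_matrix(rows, columns)
print(render(matrix, columns))
print(dict(zip(columns, completeness(matrix))))
```

A library like missingno draws the same kind of grid as a chart rather than as text, which is what makes gaps easy to spot in larger datasets.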
How to Build a Basic Bar Chart in D3.js
This video covers: a) Visual Code Walkthrough and b) JavaScript Code Build...
Saddles Again
Thanks to Rong for the very nice blog post describing critical points of nonconvex functions and how to avoid them. I’d like to follow up on his post to highlight a fact that is not widely appreciated in nonlinear optimization...
Books
AI for Humans, Volume 3: Deep Learning and Neural Networks
Demonstrates neural networks in a variety of real-world tasks such as image recognition and data science...
"The content is easy to digest and not heavy on the math. Great primer to get used to concepts before diving deeper..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :)
All the best, Hannah & Sebastian