Data Science Weekly - Issue 112
Issue #112, January 14, 2016
Editor Picks
The Fair Price to Pay a Spy: An Introduction to the Value of Information
This article covers the decision-theoretic concept of value of information through a classic example...The following example is from one of the important papers on decision theory and decision analysis, now in its 50th anniversary year(!)...
AI Algorithm Identifies Humorous Pictures
It’s easy to imagine that humor will be one of the last bastions that separates humans from machines. Computers, the thinking goes, cannot possibly develop a sense of humor until they can grasp the subtleties of our rich social and cultural settings. And even the most powerful AI machines are surely a long way from that. That thinking may soon have to change...
AMA Data Scientist: Jake Porway of DataKind - QUESTIONS ANSWERED
DataKind’s founder and executive director Jake Porway did his first ever Reddit AMA on January 13. It was a terrifically candid discussion of what it takes to apply data science for social good. (Hint - much more than good intentions.) He and the DataKind team answered all kinds of questions that you might find useful!...
Data Science Articles & Videos
Deep Grammar: Grammar Checking Using Deep Learning
Deep Grammar is a grammar checker built on top of deep learning. Deep Grammar uses deep learning to learn a model of language, and it then uses this model to check text for errors in three steps...
A 'Brief' History of Neural Nets and Deep Learning, Part 1
This is the first part of ‘A Brief History of Neural Nets and Deep Learning’. In this part, we shall cover the birth of neural nets with the Perceptron in 1958, the AI Winter of the 70s, and neural nets’ return to popularity with backpropagation in 1986...
Recognizing and Localizing Endangered Right Whales with Extremely Deep Neural Networks
In this post I’ll share my experience and explain my approach for the Kaggle Right Whale challenge. I managed to finish in 2nd place...
Experiments with style transfer
Since the original Artistic style transfer and the subsequent Torch implementation of the algorithm by Justin Johnson were released, I’ve been experimenting with other ways to use the algorithm. Here’s a quick dump of the results of my experiments...
Colorizing Black&White Movies with Neural Networks
Testing the "Automatic Colorization" Neural Network by Ryan Dahl...
Understanding the Pseudo-Truth as an Optimal Approximation
One of the things that set statistics apart from the rest of applied mathematics is an interest in the problems introduced by sampling: how can we learn about a model if we’re given only a finite and potentially noisy sample of data? Although frequently important, the issues introduced by sampling can be a distraction when the core difficulties you face would persist even with access to an infinite supply of noiseless data...
Implicit Recommender Systems: Biased Matrix Factorization
In today's post, we will explain a certain algorithm for matrix factorization models for recommender systems which goes by the name Alternating Least Squares (there are others, for example based on stochastic gradient descent). We will go through the basic ALS algorithm, as well as how one can modify it to incorporate user and item biases...
Implications of use of multiple controls in an A/B test
This post will examine the idea of using two control buckets in order to guard against Type I and Type II errors. We will demonstrate this causes significant problems, and that creating a single large control is a superior and unbiased way to achieve the same goal using the same amount of data....
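A rough way to see why one large control can help: the sketch below compares the standard error of a treatment-minus-control difference in means when the control has n users versus 2n users. It does not reproduce the linked post's analysis. The function name, the sample sizes, and the assumptions of independent buckets with a common standard deviation are all hypothetical choices made for illustration.

```python
import math


def se_of_difference(sigma, n_treatment, n_control):
    """Standard error of (treatment mean - control mean).

    Assumes independent buckets that share the same per-user
    standard deviation sigma (an illustrative simplification).
    """
    return sigma * math.sqrt(1.0 / n_treatment + 1.0 / n_control)


sigma = 1.0   # hypothetical per-user standard deviation
n = 10_000    # hypothetical bucket size

# Comparing treatment against just one of two control buckets.
se_single_bucket = se_of_difference(sigma, n, n)

# Comparing treatment against both control buckets pooled together.
se_pooled_control = se_of_difference(sigma, n, 2 * n)

# Pooling shrinks the standard error by a factor of sqrt(0.75).
ratio = se_pooled_control / se_single_bucket
print(f"single bucket SE:  {se_single_bucket:.5f}")
print(f"pooled control SE: {se_pooled_control:.5f}")
print(f"ratio:             {ratio:.4f}")
```

Under these assumptions the pooled control gives a narrower interval from the same data, which is consistent with the post's claim about a single large control.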
Why SLAM Matters, The Future of Real-Time SLAM, Deep Learning vs SLAM
Today's post contains a brief introduction to SLAM (Simultaneous Localization and Mapping), a detailed description of what happened at ICCV's (International Conference on Computer Vision) Future of Real-Time SLAM Workshop (with summaries of all 7 talks), and some take-home messages from the Deep Learning-focused panel discussion at the end of the session...
Seven things I learned at my first data science hackathon
I love hackathons. I can learn more in a day or two of hard work with friends than I could in six months studying on my own. So when I heard about the first Social Data Science Hackathon in the Twin Cities, I was one of the first to sign up. Here are a few of the most important lessons I learned from the experience...
Jobs
Data Scientist - Graphiq - Santa Barbara, CA
As a Data Scientist on Graphiq’s data team, you will eat, breathe, and sleep data. You will be responsible for building full-fledged data products that help our product managers provide users with deeper insights and tell a more compelling story with our data. You will research, design, and implement robust methods for statistical analysis that can be used to help understand our billions of data points. You will build scalable solutions for analyzing our large and very connected knowledge graph...
Learn more about the role and get some terrific advice for your Data Science resume, in our interview with the Hiring Manager, Nick Larusso...
Training & Resources
DataBasic: A Suite Of Data Tools For The Beginner
Easy-to-use web tools that help data newbies (and the more experienced) grasp & learn the basics...
Getting Started with Markov Chains
In this post, we’ll explore some basic properties of discrete time Markov chains using the functions provided by the markovchain package supplemented with standard R functions and a few functions from other contributed packages...
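For readers who want a feel for the "basic properties" mentioned above before opening the post, here is a minimal plain-Python sketch. It does not use the R markovchain package the post relies on. The two-state chain, its state names, and the helper functions are hypothetical. It computes n-step transition probabilities by repeated matrix multiplication and approximates the stationary distribution by power iteration.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def n_step(p, n):
    """n-step transition matrix: p multiplied by itself n times."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result


def stationary(p, iterations=1000):
    """Approximate the stationary distribution by power iteration.

    Starts from the uniform distribution and repeatedly applies the
    chain; this converges for the well-behaved chain used below.
    """
    size = len(p)
    dist = [1.0 / size] * size
    for _ in range(iterations):
        dist = [sum(dist[i] * p[i][j] for i in range(size)) for j in range(size)]
    return dist


# Hypothetical chain: state 0 = "sunny", state 1 = "rainy".
# Each row holds the probabilities of moving from that state.
P = [
    [0.9, 0.1],
    [0.5, 0.5],
]

two_step = n_step(P, 2)
pi = stationary(P)
print("two-step transition matrix:", two_step)
print("approximate stationary distribution:", pi)
```

For this chain the stationary distribution works out to 5/6 and 1/6, which can be checked by hand from the balance equation 0.1 * pi[0] = 0.5 * pi[1].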
Locality Preserving Projections in Python
lpproj is a Python implementation of Locality Preserving Projections, built to be compatible with scikit-learn...
Books
Superforecasting: The Art and Science of Prediction
Interesting take on prediction, drawing on decades of research and the results of a massive, government-funded forecasting tournament (The Good Judgment Project) involving tens of thousands of ordinary people...
"Superforecasting is the rare book that is both scholarly and engaging. The lessons are scientific, compelling, and enormously practical..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :) - All the best, Hannah & Sebastian