Data Science Weekly - Issue 105
Issue #105, November 26, 2015
Editor Picks
Taking a Neural Net out for a walk
Kyle McDonald hooked a neural network program up to a webcam and had it try to analyze what it was seeing in realtime as he walked around Amsterdam...
The Effects of Uber’s Surge Pricing: A Case Study
Uber's Surge Algorithm - remarkable consistency of the expected wait time for a ride...
Big Data or Pokemon?
Fun quiz!...
A Message from this week's Sponsor:
Join Sparkathon: It's Raining Data - $30,000 Hackathon with Devpost & IBM Bluemix (now through January 20, 2016)
Devpost just launched Sparkathon, a big data hackathon powered by IBM Bluemix with $30,000 in prizes. This global online competition challenges you to build weather apps using climate data and IBM’s new Analytics for Apache Spark. Bluemix's Spark service allows you to analyze data using Jupyter notebooks written in Python or Scala and connect to common data sources like SWIFT and Cloudant.
There's a $15,000 grand prize, plus special student & fan favorite awards. You'll also get extended Bluemix support for your app, including 5GB of object storage. Enter your Spark project by January 20! Get more info and register.
Sparkathon: It’s Raining Data is open to the following jurisdictions only: USA, Canada (with the exception of Quebec province), Hong Kong, China, Mexico, Germany, Japan, India, Israel, South Korea, United Kingdom, Australia, Netherlands, and France. Must be 18 or older. Void where prohibited. Contest ends January 20, 2016. For full challenge details and rules, go to Sparkathon.Devpost.com/rules.
Data Science Articles & Videos
What “50 Years of Data Science” Leaves Out
As a B-list celebrity data scientist, and skeptic of the underspecified, overhyped “Data Science” movement, I was so glad to find David Donoho’s critical take in 50 Years of Data Science, which has made its way around the Internet. Read it now. I suppose it should really be called 53 years of Data Science, but 50 is a popular number of things to have something of...
Machine Learning in the Wild
A bridge between robust control and reinforcement learning...
Character-based Neural Machine Translation
We introduce a neural machine translation model that views the input and output sentences as sequences of characters rather than words...
Reducing Overfitting in Deep Networks by Decorrelating Representations
One major challenge in training Deep Neural Networks is preventing overfitting. Many techniques such as data augmentation and novel regularizers such as Dropout have been proposed to prevent overfitting without requiring a massive amount of training data. In this work, we propose a new regularizer called DeCov which leads to significantly reduced overfitting (as indicated by the difference between train and val performance), and better generalization...
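For a feel of what a decorrelating penalty of this kind might look like, here is a minimal pure-Python sketch. It assumes the penalty is half the sum of squared off-diagonal entries of the batch covariance of a layer's hidden activations. The blurb above does not spell out the formula, so the function name `decov_penalty` and the toy batches are illustrative assumptions, not the authors' reference implementation.

```python
def decov_penalty(activations):
    """Half the sum of squared off-diagonal covariances over a batch.

    `activations` is a list of rows, one row of hidden-unit values per
    example. The name and signature are hypothetical, chosen for this sketch.
    """
    n = len(activations)
    d = len(activations[0])
    # Per-unit mean over the batch.
    means = [sum(row[j] for row in activations) / n for j in range(d)]

    penalty = 0.0
    for i in range(d):
        for j in range(d):
            if i == j:
                continue  # variances on the diagonal are not penalised
            cov = sum(
                (row[i] - means[i]) * (row[j] - means[j]) for row in activations
            ) / n
            penalty += cov ** 2
    return 0.5 * penalty


# Toy batches (made up for illustration): two hidden units, four examples.
redundant_units = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
independent_units = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

print(decov_penalty(redundant_units))
print(decov_penalty(independent_units))
```

The first batch, where both units carry the same signal, gets a positive penalty; the second, where the units are uncorrelated, gets zero. Added to a training loss, a term like this would push hidden units toward non-redundant features.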
Fun with Simpson's Paradox: Simulating Confounders
Wikipedia describes Simpson’s paradox as “a trend that appears in different groups of data but disappears or reverses when these groups are combined.”...
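The reversal is easy to reproduce with a handful of counts. The sketch below uses made-up numbers (hypothetical, not taken from the linked post): two options, A and B, tried on an "easy" and a "hard" group, where the group acts as the confounder because A is mostly tried on hard cases and B on easy ones.

```python
from fractions import Fraction

# Hypothetical (successes, trials) per group and option, for illustration only.
counts = {
    "easy": {"A": (9, 10), "B": (80, 100)},
    "hard": {"A": (60, 100), "B": (5, 10)},
}


def success_rate(successes, trials):
    """Exact success rate as a fraction."""
    return Fraction(successes, trials)


def pooled_rate(option):
    """Success rate for one option after combining all groups."""
    successes = sum(counts[group][option][0] for group in counts)
    trials = sum(counts[group][option][1] for group in counts)
    return Fraction(successes, trials)


# Within each group, compare A against B.
for group, by_option in counts.items():
    rate_a = success_rate(*by_option["A"])
    rate_b = success_rate(*by_option["B"])
    print(f"{group}: A={float(rate_a):.2f} B={float(rate_b):.2f}")

# After pooling the groups, compare again.
print(f"pooled: A={float(pooled_rate('A')):.2f} B={float(pooled_rate('B')):.2f}")
```

A has the higher success rate inside each group, yet B comes out ahead once the groups are pooled, because most of B's trials fall in the easy group.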
"Neural Art" in TensorFlow
An implementation of "A neural algorithm of Artistic style" in TensorFlow...
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning... and generate really cool bedroom images that look super real!...
The Hardest Parts of Data Science
Contrary to common belief, the hardest part of data science isn’t building an accurate model or obtaining good, clean data. It is much harder to define feasible problems and come up with reasonable ways of measuring solutions. This post discusses some examples of these issues and how they can be addressed....
Machine Learning methods used in a major breakthrough in Nutrition Science
Two groups led by Eran Elinav and Eran Segal have presented a stunning paper providing startling new insight into the personal nature of nutrition. The Israeli research teams have demonstrated that there exists a high degree of variability in the responses of different individuals to identical meals, and through the elegant application of machine learning, they have provided insight into the diverse factors underlying this variability...
Music Generation Using Stacked Denoising Autoencoder and LSTM model in Keras
I was able to generate music by training a NN model over Joanna Newsom's song "Sapokanikan"...
Jobs
Data Scientist - Shyp - San Francisco, CA
Data is the backbone for how we make decisions, build amazing products and operate cutting edge logistics. We’re looking for a technical candidate who is passionate about building and mining databases, turning data into insights, and leveraging data to make improvements in the product and throughout the company. As the first dedicated data science hire, this person will work closely with our Product, Operations and Marketing teams to level up our data-driven culture...
Training & Resources
Scikit Flow
This is a simplified interface for TensorFlow, to get people started on predictive analytics and data mining...
TensorFlow Examples
Code examples for some popular machine learning algorithms, using the TensorFlow library. This tutorial is designed to make it easy to dive into TensorFlow through examples. It includes both notebooks and code with explanations...
Awesome Reinforcement Learning
A curated list of resources dedicated to reinforcement learning...
Books
Python Data Science Cookbook
New release and getting very good reviews...
"This book gives a very practical approach to learn some of the important algorithms using Python. Great hands on experience. I am sure once you do the examples in this book, most of the fundamental concepts will be understood..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :)
All the best, Hannah & Sebastian