Data Science Weekly - Issue 104
Issue #104, November 19, 2015
Editor Picks
If Google predicts your future, will it be a cliché?
Hubris at the Next Economy conference around robotic writing reminded me of this essay from 5 years ago...
Play Go Against a DNN
Your opponent is a deep convolutional neural network trained to play Go...
Data Science of James Bond movies
Poor Spectre results will lead to new Bond actor...
A Message from this week's Sponsor:
Distribute Processing on Your Cluster with Anaconda
Using Python on distributed computing technologies like Hadoop and Spark makes it easier to create and deploy advanced analytics in production. But managing packages on your cluster can be a full-time job. And that's why we created the cluster features of Anaconda. Learn how to manage Python packages across an entire cluster with one line of code. Watch the Recording & Get the Slides
Data Science Articles & Videos
Short Story on AI: A Cognitive Discontinuity
The idea of writing a collection of short stories has been on my mind for a while. This post is my first ever half-serious attempt at a story, and what better way to kick things off than with a story on AI and what that might look like if you extrapolate our current technology...
Introducing a new way to visually search on Pinterest
Discovery products at Pinterest are built on top of Pins. Last year, we introduced Guided Search, a feature built on top of understanding Pins’ descriptions. Before that, we launched Related Pins, a service built on top of understanding Pin to board connections. Though we’ve been able to use these Pinner curated signals to build new products and features, there’s one signal within every Pin we haven’t been able to utilize, a Pin’s image - until now...
Deep Learning for Visual Question Answering
In this blog post, I’ll talk about the Visual Question Answering problem, and I’ll also present neural network-based approaches for the same. The source code for this blog post is written in Python and Keras, and is available on Github...
“Shrinking bull’s-eye” algorithm speeds up complex modeling
Now MIT researchers have developed a new algorithm that vastly reduces the computation of virtually any computational model. The algorithm may be thought of as a shrinking bull’s-eye that, over several runs of a model, and in combination with some relevant data points, incrementally narrows in on its target: a probability distribution of values for each unknown parameter...
Machine learning could solve riddles of galaxy formation
A new, faster modeling technique for galaxy formation has been developed by University of Illinois student Harshil Kamdar and professor Robert Brunner. The technique uses machine learning to cut down computing times from thousands of computing hours to mere minutes...
The Discovery of Statistical Regression
So how did regression, so simple to Gauss, and so essential to much of modern science, arise?...
TensorFlow vs. Theano vs. Torch
In this study, I evaluate some popular deep learning frameworks. The candidates are listed in alphabetical order: TensorFlow, Theano, and Torch. This is a dynamic document and the evaluation is based on the current state of their code, not what the authors claim in white papers...
Tensor Factorization: Statistically Recover Hidden Topics for New York Times
This example demonstrates the recovery of topics from New York Times data obtained in the following dataset: New York Times bag of words...
Generating Faces with Torch
In this blog post we'll implement a generative image model that converts random noise into images of faces! Code available on Github...
How To Get A Data Science Hiring Manager To Take You Seriously
You're going to industry events to network with data science hiring managers and they don't seem very interested in someone with your background...
Jobs
Commercial Building Energy Efficiency Analysis Tools - U.S. Dept of Energy, Office of Energy Efficiency & Renewable Energy, Building Technologies Office ... Post Graduate Opportunity; Washington, D.C. The program accelerates the voluntary uptake of efficient building technologies that are market-viable but underutilized and develops solutions to overcome non-technical barriers to energy efficiency. We accomplish this by engaging in applied research, demonstration and integration of newly commercialized or underutilized advanced technologies, developing tools and solutions to remove barriers to investment and drive uptake of efficiency measures, and utilizing market partnerships to drive technologies and solutions into the commercial buildings marketplace...
Training & Resources
Wrangle Conference 2015
All the videos are up...
10 more lessons learned from building Machine Learning systems
Presentation at MLConf 2015 in San Francisco...
Anyone Can Learn To Code an LSTM-RNN in Python
I learn best with toy code that I can play with. This tutorial teaches Recurrent Neural Networks via a very simple toy example, a short Python implementation...
Books
Storytelling with Data: A Data Visualization Guide for Business Professionals
New release and getting good reviews...
"A must-read for anyone who works with numbers. I've read other data viz books, but this one is by far the best. Full of useful tips, easy to read, and great examples..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :) - All the best, Hannah & Sebastian