Data Science Weekly - Issue 195
Issue #195 Aug 17 2017
Editor Picks
Hype or Not? Some Perspective on OpenAI’s DotA 2 Bot
The OpenAI news came as such a shock. How can this be true? Have there been recent breakthroughs that I wasn’t aware of? As I started looking more into what exactly the DotA 2 bot was doing, how it was trained, and what game environment it was in, I came to the conclusion that it’s an impressive achievement, but not the AI breakthrough the press would like you to believe it is. That’s what this post is about. I would like to offer a sober explanation of what’s actually new...
Amazing graphics from the 1950s New York Times archive
The “morgue” is a smelly storage room in a dark basement just down the street from The New York Times headquarters. About seven million photographs and tens of millions of clippings are stored there. A journalist’s dream, a minimalist’s nightmare...
Machine Learning for Flappy Bird using Neural Network and Genetic Algorithm
Here is the source code for an HTML5 project that implements a machine learning algorithm in the Flappy Bird video game using neural networks and a genetic algorithm. The program teaches a little bird how to flap optimally in order to fly safely through barriers for as long as possible...
A Message from this week's Sponsor:
Get started with Python for data science in minutes
Using Python for data science and machine learning is easy with ActiveState’s Python distribution. Pre-bundled with 300+ packages, ActivePython includes NumPy, SciPy, scikit-learn, TensorFlow, Theano and Keras, and leverages the Intel Math Kernel Library, so you can focus on your data rather than on setting up software. Download ActivePython and start developing for free.
Data Science Articles & Videos
Encartopedia: How machine learning can power new interfaces for exploring Wikipedia
For this experiment, Encartopedia, I used machine learning techniques and visualization to explore new navigation possibilities for Wikipedia while preserving its hypertextual feel. With Encartopedia, you can map the path of any journey through Wikipedia, or use the visualization to jump to articles near and far...
Meet the Bregman Divergences
What I hope to do in this post is gently introduce you to the Bregman divergences, point out some of their interesting properties, and highlight one result that I found surprising and I believe is underappreciated...
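For readers who want something concrete before clicking through, here is a minimal sketch of the standard definition, D_F(p, q) = F(p) - F(q) - <grad F(q), p - q>. It is not taken from the linked post; the function names below are made up for illustration, and vectors are plain Python lists.

```python
import math

def bregman_divergence(F, grad_F, p, q):
    """D_F(p, q) = F(p) - F(q) - <grad F(q), p - q> for list-valued vectors."""
    inner = sum(g * (pi - qi) for g, pi, qi in zip(grad_F(q), p, q))
    return F(p) - F(q) - inner

# Generator 1: squared Euclidean norm. The divergence is the squared distance.
def sq_norm(x):
    return sum(xi * xi for xi in x)

def sq_norm_grad(x):
    return [2.0 * xi for xi in x]

# Generator 2: negative entropy. On probability vectors with strictly
# positive entries, the divergence is the KL divergence KL(p || q).
def neg_entropy(x):
    return sum(xi * math.log(xi) for xi in x)

def neg_entropy_grad(x):
    return [math.log(xi) + 1.0 for xi in x]

squared_distance = bregman_divergence(sq_norm, sq_norm_grad, [1.0, 2.0], [3.0, 5.0])
kl = bregman_divergence(neg_entropy, neg_entropy_grad, [0.5, 0.5], [0.25, 0.75])
```

Note that the divergence is generally not symmetric: swapping p and q in the negative-entropy case gives a different value.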
An AI Dreamed Up Street Scenes, and They’re Surprisingly Good
You're looking at pure fiction: this image was actually created by an AI, trained on the kinds of driver's-eye labeled images often supplied to self-driving cars...
Inside the Increasingly Complex Algorithms That Get Packages to Your Door
Working out the best way to deliver parcels is a near-impossible job, and it’s only getting harder...
Captioning Novel Objects in Images
The task of visual description aims to develop visual systems that generate contextual descriptions about objects in images. Visual description is challenging because it requires recognizing not only objects (bear), but other visual elements, such as actions (standing) and attributes (brown), and constructing a fluent sentence describing how objects, actions, and attributes are related in an image (such as the brown bear is standing on a rock in the forest)...
Simple Square Packing Algorithm
In a recent project the design asked for a component which shows a small number of values in squares. It was important to represent the relation between the values, so they should be mapped to the area and not the size of the squares...
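To make the area-versus-size point concrete, here is a small sketch (not the linked post's code; `side_lengths` and its `max_side` parameter are hypothetical names chosen for this example). If the values are mapped to area, each square's side grows with the square root of its value rather than with the value itself.

```python
import math

def side_lengths(values, max_side=100.0):
    """Return side lengths so that each square's AREA is proportional to its value.

    Hypothetical helper: the largest value is drawn with side `max_side`.
    """
    if not values:
        return []
    largest = max(values)
    if min(values) < 0 or largest <= 0:
        raise ValueError("values must be non-negative with a positive maximum")
    # area is proportional to value, so side is proportional to sqrt(value)
    return [max_side * math.sqrt(v / largest) for v in values]

# A value 4x smaller gets a side 2x smaller (and an area 4x smaller).
sides = side_lengths([100, 25, 4])  # approximately [100.0, 50.0, 20.0]
```

Scaling the side directly by the value would instead make the square for 25 one sixteenth the area of the square for 100, which overstates the difference.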
Autoregressive Convolutional Neural Networks for Asynchronous Time Series
We propose 'Significance-Offset Convolutional Neural Network', a deep convolutional network architecture for multivariate time series regression. The model is inspired by standard autoregressive (AR) models and gating mechanisms used in recurrent neural networks. It involves an AR-like weighting system, where the final predictor is obtained as a weighted sum of sub-predictors while the weights are data-dependent functions learnt through a convolutional network...
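The "weighted sum of sub-predictors" idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's architecture: in the paper both the sub-predictors and the weights come from learned networks, whereas here they are plain numbers, and the softmax normalization and the function names are choices made for this sketch.

```python
import math

def softmax(scores):
    """Turn arbitrary real scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_prediction(sub_predictions, significance_scores):
    """Combine sub-predictors using data-dependent weights.

    Both arguments are stand-ins (assumptions of this sketch) for values that
    a trained model would compute from the input series.
    """
    if len(sub_predictions) != len(significance_scores):
        raise ValueError("need one score per sub-prediction")
    weights = softmax(significance_scores)
    return sum(w * p for w, p in zip(weights, sub_predictions))

# Equal scores give a plain average of the sub-predictions.
average = weighted_prediction([1.0, 3.0], [0.0, 0.0])
# A dominant score makes the result follow that sub-prediction.
dominated = weighted_prediction([1.0, 3.0], [0.0, 100.0])
```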
Thoughts after taking the Deeplearning.ai courses
Between a full time job and a toddler at home, I spend my spare time learning about the ideas in cognitive science & AI...
Jobs
Data Scientist - Qubit - London, UK
We’re looking for a Data Scientist to join our Research team, to help us develop intelligent products around this data, and conduct cutting-edge research into consumer behaviour on the web.
This is a great opportunity to conduct real R&D around human behaviour. Our data collection tools store more than 1 billion data points every day. Overall, Qubit technology tracks consumer journeys leading to billions of pounds of online spending worldwide every year, for some of the largest names in online retail.
We’re looking for someone smart and motivated, with experience solving real data analysis problems with statistical and machine learning techniques. As part of our research team you’ll help to understand our ever growing dataset, working closely with other parts of the business to ensure our products are ahead of the competition...
Training & Resources
Stanford Lecture Collection | Convolutional Neural Networks for Visual Recognition (Spring 2017)
This lecture collection is a deep dive into the details of deep learning architectures, with a focus on learning end-to-end models for visual recognition tasks, particularly image classification. From this lecture collection, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision...
Pandas tips and tricks
This post includes some useful tips on using Pandas for efficient preprocessing and feature engineering on large datasets...
Python Data Science Handbook
This website contains the full text of the Python Data Science Handbook by Jake VanderPlas; the content is available on GitHub in the form of Jupyter notebooks...
Books
The Book of R: A First Course in Programming and Statistics
"The Book of R is a comprehensive, beginner-friendly guide to R, the world’s most popular programming language for statistical analysis. Even if you have no programming experience and little more than a grounding in the basics of mathematics, you’ll find everything you need to begin using R effectively for statistical analysis"...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.
P.S. Want to break into Data Science? We've put together a comprehensive guide to get you started. Check it out here! :)
All the best, Hannah & Sebastian