Data Science Weekly - Issue 185
Issue #185, June 8, 2017
Editor Picks
Google Brain Residency
Last year, after nerding out a bit on TensorFlow, I applied and was accepted into the inaugural class of the Google Brain Residency Program. The program invites two dozen people, with varying backgrounds in ML, to spend a year at Google's deep learning research lab in Mountain View to work with the scientists and engineers pushing on the forefront of this technology. The year has just concluded and this is a summary of how I spent it...
A Retiree Discovers An Elusive Math Proof - And No-one Notices
As he was brushing his teeth on the morning of July 17, 2014, Thomas Royen, a little-known retired German statistician, suddenly lit upon the proof of a famous conjecture at the intersection of geometry, probability theory, and statistics that had eluded top experts for decades...
Customer Service Bots Are Getting Better at Detecting Your Agitation
SRI International, the Silicon Valley research lab where Apple’s virtual assistant Siri was born, is working on a new generation of virtual assistants that respond to users’ emotions...
A Message from this week's Sponsor:
Get started with Python for data science in minutes
Using Python for data science and machine learning is easy with ActiveState’s Python distribution. Pre-bundled with 300+ packages, ActivePython includes NumPy, SciPy, scikit-learn, TensorFlow, Theano and Keras, and leverages the Intel Math Kernel Library, so you can focus on your data and not setting up software. Download ActivePython and start developing for free.
Data Science Articles & Videos
You can probably use deep learning even if your data isn't that big
Over at Simply Stats Jeff Leek posted an article entitled “Don’t use deep learning your data isn’t that big” that I’ll admit, rustled my jimmies a little bit. To be clear, I don’t think deep learning is a universal panacea and I mostly agree with his central thesis (more on that later), but I think there are several things going on at once, and I’d like to explore a few of those further in this post...
How big data can help you pick better wine
There are currently over 5,000 distinct bottles of Bordeaux-style red blends available for purchase on Wine.com. Rather than segmenting these wines using traditional structured data — like price, vintage, winery, grape varietal — what if we could instead rely on the rich, expressive language used in the product description and expert reviews posted online? Enter NLP (natural language processing)...
These days in baseball, every batter is trying to find an angle
With increasingly sophisticated data available, major league hitters are focusing on getting the ball in the air...
How to Call BS on Big Data: A Practical Guide
“Nothing that you will learn in the course of your studies will be of the slightest possible use to you,” the Oxford philosophy professor John Alexander Smith told his students, in 1914, “save only this: if you work hard and intelligently, you should be able to detect when a man is talking rot.”...
What If People Run Out of Things to Do?
What gives our lives meaning? And what if one day, whatever gives us meaning went away—what would we do then? I’m still thinking about those weighty questions after finishing Homo Deus, the provocative new book by Yuval Noah Harari...
Google Sprinkles AI on Its Spreadsheets to Automate Away Some Office Work
In Google’s commercial for its virtual assistant, people ask it to play dance music, videos, and set a timer. A new feature from the search giant that lets you ask questions of its online spreadsheets is less flashy, but it could be the start of something that has a huge impact on how some companies operate...
Geometry of Optimization and Implicit Regularization in Deep Learning
We argue that the optimization plays a crucial role in generalization of deep learning models through implicit regularization. We do this by demonstrating that generalization ability is not controlled by network size but rather by some other implicit control. We then demonstrate how changing the empirical optimization procedure can improve generalization, even if actual optimization quality is not affected. We do so by studying the geometry of the parameter space of deep networks, and devising an optimization algorithm attuned to this geometry...
A simple neural network module for relational reasoning
Relational reasoning is a central component of generally intelligent behavior, but has proven difficult for neural networks to learn. In this paper we describe how to use Relation Networks (RNs) as a simple plug-and-play module to solve problems that fundamentally hinge on relational reasoning...
Jobs
Netflix - Los Gatos & Los Angeles, CA
We are looking to fill several key roles across our Data Science groups.
Director, Production Science & Algorithms
In this role, you will lead a high-impact data science team focused on the digital supply chain at Netflix. The problems this team will work on have a direct impact on the viewing experience of our global member base, including ensuring that the digital assets (video, audio, and subtitle/text files) are of high quality, and developing new algorithms and metrics to improve the perceptual quality of our encoded assets.
Manager, Content Programming Science & Algorithms
The ideal candidate for Manager of Content Programming Science & Algorithms is an experienced and entrepreneurial-minded data scientist. This is a high-impact and challenging role, and will require both strong leadership and technical prowess.
Senior Data Scientist, Content Science & Algorithms
We are looking for an experienced individual who is passionate about data science and enjoys working in a collaborative environment. Members of the Content Science team typically work on one or two projects (e.g. predicting movie viewership) over any six-month period.
Training & Resources
NeuroNER
Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results...
WebDNN: Fastest DNN Execution Framework on Web Browser
WebDNN is an open source software framework for executing pre-trained deep neural network (DNN) models in a web browser...
miner and craft
In addition to our miner package and our in-development bookdown book, the R/minecraft team from the ROpenSci Unconference had created a bunch of other useful code for interacting with Minecraft from R, which we’re putting into a second package...
Books
Probability - A Beginner's Guide To Permutations And Combinations: The Classic Equations, Better Explained
"The focus of this book is on understanding why the permutation and combination equations are what they are, which ends up making them a lot easier to understand, remember, and expand than simply memorizing the equations"...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Looking to hire a Data Scientist? Find an awesome one among our readers! Email us for details on how to post your job :)
All the best, Hannah & Sebastian