Data Science Weekly - Issue 244
Issue #244 July 26 2018
Editor Picks
One Data Science Job Doesn’t Fit All
At Airbnb, we recently established a role-defining framework. My hope is that what we've learned along the way can help other companies be strategic in defining data science roles. The main takeaway I will share is that companies should consider three tracks of data science work to meet the needs of their business: Analytics, Inference, and Algorithms. Below I'll describe the evolution of how we came to these three tracks of work and how it helps...
Google's AutoML: Cutting Through the Hype
In today’s post, I want to look specifically at Google’s AutoML, a product which has received a lot of media attention, and address the following: What is Google's AutoML? What is transfer learning? Why all the hype about Google's AutoML?...
uTensor - Test Release: AI inference library based on mbed and TensorFlow
uTensor is an extremely light-weight machine learning inference framework built on Mbed and TensorFlow. The project contains a runtime library and an offline tool. The total size of graph definition and algorithm implementation of a 3-layer MLP produced by uTensor is less than 32kB in the resulting binary (excluding the weights)...
A Message from this week's Sponsor:
Mode Studio: a complete toolkit for every analyst
Mode Studio combines a SQL editor, Python & R notebooks, and a visualization builder in one platform. And it's free forever. Connect data from anywhere and analyze with the best language for the job, without having to jump between tools. Build custom visualizations or use our out-of-the-box charts. Share your analysis with a click—every report lives at a URL.
Data Science Articles & Videos
Why Germany did not defeat Brazil in the final, or Data Science lessons from the World Cup
We review World Cup predictions (all failed), examine what makes such events difficult to predict, and suggest 3 golden rules to determine when you can trust the predictions...
Biased random number generation
Very clear explanation of how biases arise in RNG algorithms...
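As a small illustration of one common way such bias arises (not necessarily the case the linked article covers), the sketch below reduces a uniform source to a smaller range with the modulo operator and counts how often each result appears. The helper names `modulo_counts` and `unbiased_below` are hypothetical, and the rejection step shown is one standard remedy, assumed here for illustration.

```python
from collections import Counter
import itertools


def modulo_counts(source_size, n):
    """Count how often each value in 0..n-1 appears when every value of a
    uniform source over 0..source_size-1 is reduced with `% n`."""
    return Counter(x % n for x in range(source_size))


def unbiased_below(n, source_size, draw):
    """Return a value in 0..n-1 by rejection sampling.

    `draw` is any zero-argument callable returning a uniform integer in
    0..source_size-1. Draws in the uneven tail of the range are discarded
    so that every result is equally likely.
    """
    limit = source_size - (source_size % n)  # largest multiple of n that fits
    while True:
        x = draw()
        if x < limit:
            return x % n


# A 3-bit source (0..7) reduced modulo 3: results 0 and 1 each appear three
# times, result 2 only twice, so plain modulo is biased.
biased = modulo_counts(8, 3)

# Feeding the same eight source values through the rejection step discards
# 6 and 7, leaving each result equally represented.
source = itertools.cycle(range(8))
fair = Counter(unbiased_below(3, 8, lambda: next(source)) for _ in range(6))
```

The bias disappears whenever the source size is an exact multiple of the target range, which is why the rejection limit is rounded down to such a multiple.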
New Research on Multi-Task Learning
Multi-task learning is an alternative approach to training machine learning algorithms that allows machines to master more than one task; machines gain the ability to benefit from task relationships. Machine learning becomes a little bit more like human learning: capable of taking on more complex challenges involving richer representations of reality...
Software beats animal tests at predicting toxicity of chemicals
Machine learning on mountain of safety data improves automated assessments...
A survey on policy search algorithms for learning robot controllers in a handful of trials
Most policy search algorithms require thousands of training episodes to find an effective policy, which is often infeasible with a physical robot. This survey article focuses on the extreme other end of the spectrum: how can a robot adapt with only a handful of trials (a dozen) and a few minutes?...
signboardr
Utilize Google Vision API to extract text from archaeological photos containing a sign board. Further, the extracted text can be added as searchable XMP metadata tags to photos...
A New Hope: AI for News Media
If the news media wants to affect how news content is created, developed, presented and delivered to us in the future, they need to take an active role in AI development. If news organizations want to understand the way data and information are constantly affected and manipulated in digital environments, they need to start embracing the possibilities of machine learning...
Building your own Duplex AI agent using Rasa and Twilio
In this article I’m going to illustrate how you can build your own Duplex-like agent to handle phone calls autonomously. We’re going to approach the problem at hand from the other direction — calling a business and talking with a machine (rather than a machine calling a business)...
Jobs
Data Scientist (Product) - Spotify - NYC
We are looking for a Data Scientist to join the band and help drive a data-first culture across Spotify. As a Data Scientist, our mission is to turn terabytes of data into insights and get a deep understanding of how our people use our apps to impact the product, strategy and direction of Spotify. You will study user behavior, strategic initiatives, content, and new features and bring data and insights into every decision we make. Above all, your work will impact the way the world experiences music...
Training & Resources
Clip PyTorch Tensor Values To A Range
Learn how to clip PyTorch Tensor values to a range by using the PyTorch clamp operation, via a screencast video and full tutorial transcript...
Using Python to Figure out Sample Sizes for your Study
Understanding the sample size you need depends on the statistical test you plan to use. If it’s a straightforward test, then finding the desired sample size can be just a matter of plugging numbers into an equation. However, it can be more involved, in which case a programming language like Python can make life easier. In this post, I’ll go through one of these more difficult cases...
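For the straightforward "plug numbers into an equation" case mentioned above, a minimal sketch might look like the following. It assumes a two-sided comparison of two group means under a normal approximation; the function name `sample_size_per_group` and the default `alpha` and `power` values are illustrative choices, not taken from the linked post, which covers a more involved case.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for comparing two means.

    Uses the normal-approximation formula
        n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2
    where d is the standardized effect size (difference in means divided
    by the common standard deviation).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)  # round up to a whole participant


n_medium = sample_size_per_group(0.5)  # medium standardized effect
n_small = sample_size_per_group(0.2)   # small effect needs far more data
```

With these defaults a medium effect of 0.5 needs about 63 participants per group; an exact t-based calculation gives a slightly larger figure.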
Neural Networks gone wild! They can sample from discrete distributions now!
In the following sections you'll learn: what the Gumbel distribution is, how it is used for sampling from a discrete distribution, how the weights that affect the distribution's parameters can be trained, how to use all of that in a toy example (with code)...
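The core sampling idea can be sketched in a few lines using the Gumbel-max trick: add independent Gumbel noise to each log-probability and take the index of the largest value. This is a plain-Python illustration under the assumption that all probabilities are strictly positive; the function names are hypothetical, and the trainable relaxation discussed in the linked article is not shown.

```python
import math
import random


def gumbel_noise(rng):
    """Draw one sample from the standard Gumbel(0, 1) distribution."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() can return exactly 0.0
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_max_sample(probs, rng):
    """Sample an index from a discrete distribution via the Gumbel-max trick.

    Assumes every entry of `probs` is strictly positive. The index that
    maximizes log(p_i) + g_i, with g_i independent Gumbel(0, 1) noise, is
    distributed according to `probs`.
    """
    scores = [math.log(p) + gumbel_noise(rng) for p in probs]
    return max(range(len(probs)), key=scores.__getitem__)


# Empirical check: sampled frequencies should approach the target probabilities.
rng = random.Random(0)
probs = [0.1, 0.6, 0.3]
draws = 20000
counts = [0] * len(probs)
for _ in range(draws):
    counts[gumbel_max_sample(probs, rng)] += 1
freqs = [c / draws for c in counts]
```

Because the randomness enters only through additive noise, the log-probabilities remain ordinary inputs to the computation, which is what makes the approach attractive inside a neural network.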
Books
Guesstimation: Solving the World's Problems on the Back of a Cocktail Napkin
"Guesstimation enables anyone with basic math and science skills to estimate virtually anything--quickly--using plausible assumptions and elementary arithmetic"...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Want to reach our audience / fellow readers? Consider sponsoring - grab a spot now; first come first served!
All the best, Hannah & Sebastian