Data Science Weekly - Issue 266
Issue #266 Dec 27 2018
Editor Picks
The Netflix Data War
A recent article in the Wall Street Journal, “At Netflix, Who Wins When It’s Hollywood vs. the Algorithm?” by Shalini Ramachandran and Joe Flint, details some of the internal debates within Netflix between the Los Angeles-based content team, which is in charge of developing and marketing new content for the streaming service, and the data team. I thought it was a useful place to launch a discussion about the activity of a data team and how it interfaces with other aspects of a company...
We tried teaching an AI to write Christmas movie plots. Hilarity ensued. Eventually.
Using a neural network to create ridiculous plot lines takes a lot of work—and reveals the challenges of generating human language...
Why Data Is Never Raw
On the seductive myth of information free of human judgment...
A Message from this week's Sponsor:
Save the Date | Rev Summit for Data Science Leaders | May 23-24, 2019
Join Daniel Kahneman -- world-renowned psychologist and author of Thinking, Fast and Slow -- along with data science leaders from Netflix, Nike, Google, Slack, Turner Broadcasting System, Lloyds Banking Group and more at this year’s Rev Summit for Data Science Leaders in New York City, May 23-24. This year’s event, co-chaired by Derwen’s Paco Nathan and Domino Data Lab, will focus on providing practical guidance to teams aspiring to make data science an enterprise-grade capability. Teams of 4+ get a 50% discount.
Apply to speak or register to attend today.
Data Science Articles & Videos
One Giant Step for a Chess-Playing Machine
The stunning success of AlphaZero, a deep-learning algorithm, heralds a new age of insight — one that, for humans, may not last long...
Trends in Deep Learning with Jeremy Howard
In this episode of our AI Rewind series, we’re bringing back one of your favorite guests of the year, Jeremy Howard, founder and researcher at Fast.ai. Jeremy joins us to discuss trends in Deep Learning in 2018 and beyond. We cover many of the papers, tools and techniques that have contributed to making deep learning more accessible than ever to so many developers and data scientists...
The year in AI/ML advances: 2018 roundup
It has become a sort of tradition for me to try to summarize ML advances at this time of the year. As always, this summary will necessarily be biased by my own interests and focus, but I have tried to keep it as broad as possible...
Photo Wake-Up: 3D Character Animation from a Single Photo
We present a method and application for animating a human subject from a single photo. E.g., the character can walk out, run, sit, or jump in 3D. The key contributions of this paper are: 1) an application of viewing and animating humans in single photos in 3D, 2) a novel 2D warping method to deform a posable template body model to fit the person's complex silhouette to create an animatable mesh, and 3) a method for handling partial self occlusions...
Neuroevolution-Bots
Neuroevolution-Bots is a personal project that demonstrates neuroevolution in a browser environment using TensorFlow.js, Neataptic (for neural nets) and HTML5 Canvas (for graphics). I tried to create a scaled-down 2D version of Gym’s popular Humanoid-v2 environment using Planck.js, a JavaScript rewrite of Box2D...
DeepSolar: A Machine Learning Framework to Efficiently Construct a Solar Deployment Database in the United States
How deep learning helped to map every solar panel in the US...
Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation
Creating super slow motion videos by predicting missing frames using a neural network, instead of simple interpolation. With code...
Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different language families and written in 28 different scripts. Our system uses a single BiLSTM encoder with a shared BPE vocabulary for all languages, which is coupled with an auxiliary decoder and trained on publicly available parallel corpora. This enables us to learn a classifier on top of the resulting sentence embeddings using English annotated data only, and transfer it to any of the 93 languages without any modification...
Jobs
Senior Data Scientist/Machine Learning Engineer - PepsiCo eCommerce - NYC
Want to build an RL system with real money against business experts? Apply now! PepsiCo operates in an environment undergoing immense and rapid change, driven by eCommerce and emergent retail technologies. To ensure continued success in the food and beverage space, PepsiCo has assembled a dedicated eCommerce team – tasked with optimizing eCommerce operations and developing innovations that will give PepsiCo a sustainable competitive advantage. While tied closely to broader PepsiCo, the eCommerce group more closely resembles a start-up environment, embracing the core values of having a bias for action, being results-oriented, maintaining a community focus, and prioritizing people.
PepsiCo’s Data Science and Analytics group is a team of data scientists, technology specialists, and business innovators who operate within eCommerce to build industry-leading systems and solutions. By focusing on machine learning and automation, the Data Science & Analytics group is pushing the bounds of possibility for PepsiCo and its strategic partners...
Training & Resources
PyTorch item: Convert A 0-dim PyTorch Tensor To A Python Number
Learn how to use PyTorch's item operation to convert a 0-dim PyTorch Tensor to a Python number, via a screencast video and full tutorial transcript...
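For readers who want to try this before watching the screencast, here is a minimal sketch of the conversion. It is not taken from the tutorial; the variable names and the values are made up for illustration.

```python
import torch

# A 0-dim (scalar) tensor holds exactly one value and has an empty shape.
# The values below are arbitrary illustration values.
float_tensor = torch.tensor(3.5)
int_tensor = torch.tensor(7)

# .item() copies that single value out as a plain Python number.
float_value = float_tensor.item()
int_value = int_tensor.item()

print(float_tensor.dim(), type(float_value))
print(int_tensor.dim(), type(int_value))
```

The result is an ordinary Python float or int, so it can be logged or formatted without carrying the tensor along.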
Deep Graph Infomax
General approach for learning node representations within graph-structured data in an unsupervised manner based upon mutual information, rather than random walks...
Introducing Pandas-Sets: Set-oriented Operations in Pandas
I frequently find myself storing standard Python set objects in DataFrame columns. This usually happens when I have some kind of a tags or labels column for each observation. It can also be the output of a groupby operation where the end result needs to be a list-like (or set-like) object before it's aggregated. Using set operations (union, intersection etc.) can come in handy in such cases...
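To make the use case concrete, here is a small sketch of set operations on a DataFrame column using only plain pandas and built-in Python sets. It deliberately does not use the Pandas-Sets API, which the article itself introduces; the column names and data are hypothetical.

```python
import pandas as pd

# Hypothetical observations, each tagged with a plain Python set.
df = pd.DataFrame({
    "group": ["a", "a", "b"],
    "tags": [{"red", "small"}, {"red", "large"}, {"blue"}],
})

# Element-wise membership test: does each row's tag set contain "red"?
df["has_red"] = df["tags"].apply(lambda tags: "red" in tags)

# Element-wise intersection with a fixed set of interest.
wanted = {"red", "blue"}
df["matched"] = df["tags"].apply(lambda tags: tags & wanted)

# Union of all tag sets within each group, collected into a dict.
tags_by_group = {
    name: set().union(*tag_sets)
    for name, tag_sets in df.groupby("group")["tags"]
}

print(df)
print(tags_by_group)
```

Each operation here goes through `apply` or an explicit loop, which is the kind of per-row boilerplate a set-oriented accessor is meant to replace.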
Books
Math for Machine Learning: Open Doors to Data Science and Artificial Intelligence
From self-driving cars and recommender systems to speech and face recognition, machine learning is the way of the future. Would you like to learn the mathematics behind machine learning to enter the exciting fields of data science and artificial intelligence? There aren't many resources out there that give simple detailed examples and that walk you through the topics step by step.
This book not only explains what kind of math is involved and the confusing notation, it also introduces you directly to the foundational topics in machine learning. This book will get you started in machine learning in a smooth and natural way, preparing you for more advanced topics and dispelling the belief that machine learning is complicated, difficult, and intimidating.
Praise from students
"Your book is by far the best I’ve found for understanding the derivations of machine learning algorithms. I love that you don’t skip steps and that you provide clear examples." -- Robert H
Link to preview of first 2 chapters and table of contents available here
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Want to reach our audience / fellow readers? Consider sponsoring - grab a spot now; first come, first served!
All the best,
Hannah & Sebastian