Data Science Weekly - Issue 240
Issue #240, June 28, 2018
Editor Picks
A team of AI algorithms just crushed humans in a complex computer game
Five different AI algorithms have teamed up to kick human butt in Dota 2, a popular strategy computer game....
From quantitative finance to data science or how to escape the city
How to move from quantitative finance to data science and machine learning. Some insights from my experience escaping the city!...
Self-Supervised Tracking via Video Colorization
In “Tracking Emerges by Colorizing Videos”, we introduce a convolutional network that colorizes grayscale videos, but is constrained to copy colors from a single reference frame. In doing so, the network learns to visually track objects automatically without supervision. Importantly, although the model was never trained explicitly for tracking, it can follow multiple objects, track through occlusions, and remain robust over deformations without requiring any labeled training data...
A Message from this week's Sponsor:
Clark University: Transform Data Into Something Meaningful
Business Analytics at Clark University will give you the skills employers demand by teaching you how to synthesize data into powerful information. Whether you enroll in a full- or part-time master’s or accelerated certificate program, you will be equipped to make informed decisions and improve organizational performance.
You don’t need a background in statistics or science to succeed here. We offer:
Blended curriculum
Career-ready courses
Affordable excellence
Move your career forward in one of the fields with the largest demand. Learn more at clarku.edu/analytics
Data Science Articles & Videos
Everyone Poops
Here in San Francisco, human waste is a growing issue, both for the people who run into it and for the people who have no other option than to relieve themselves on public streets. This is a multi-faceted problem, with many potential solutions that are best developed by social scientists. However, I think there is a place for data science in this conversation...
Major or Minor? Classifying the Mode of a Song
For a while now, I have wanted to work with music in machine learning in one capacity or another. The day of reckoning has come...
Baseball Pitch Recommendation: a look into the data science process.
I’ll walk you through a model I created to recommend pitches to the Cubs in games against the Cardinals, and the steps I took to get there. (Technically, this model could help any team, or quite frankly any talented pitcher, when throwing pitches against Cardinals players, but my model is dedicated to my Cubs)...
Attentive GAN for Raindrop Removal from A Single Image
Raindrops adhered to a glass window or camera lens can severely hamper the visibility of a background scene and degrade an image considerably. In this paper, we address the problem by visually removing raindrops, and thus transforming a raindrop degraded image into a clean one...
Add Constrained Optimization To Your Toolbelt
At Stitch Fix, whenever we serve a customer, we must choose the right day to style that client’s fix, the right stylist for that client’s particular aesthetic, the right warehouse for that client’s shipping address. But stylists and warehouses are in high demand, with many clients competing for their time and attention. Constrained optimization helps us get work to stylists and warehouses in a manner that is fair and efficient, and gives our clients the best possible experience...
Data Dictionary: a how to and best practices
A data dictionary is a list of key terms and metrics with definitions: a business glossary. While it sounds simple, almost trivial, its ability to align the business and remove confusion can be profound. In fact, a data dictionary is possibly one of the most valuable artifacts that a data team can deliver to the business...
How Can Neural Network Similarity Help Us Understand Training and Generalization?
In our most recent collaboration with Google Brain, we measure the similarity between neural network representations to provide insights into generalisation and the training dynamics of RNNs...
Travel Time Optimization With Machine Learning And Genetic Algorithm
What is the relationship between machine learning and optimization? — On the one hand, mathematical optimization is used in machine learning during model training, when we are trying to minimize the cost of errors between our model and our data points. On the other hand, what happens when machine learning is used to solve optimization problems?...
Jobs
eCommerce Data Science & Machine Learning Analyst - PepsiCo - NYC
Have a strong opinion about TensorFlow lacking an autoregressive dynamic network? So do we!
PepsiCo’s eCommerce Data Science and Analytics group is a team of data scientists, technology specialists, and business innovators who operate within eCommerce to build industry-leading systems and solutions. By focusing on machine learning and automation, the Data Science & Analytics group is pushing the bounds of possibility for PepsiCo and its strategic partners...
Training & Resources
Turn A List Of PyTorch Tensors Into One Tensor
Learn how to turn a list of PyTorch tensors into one tensor, via a screencast video and full tutorial transcript...
The Hitchhiker’s Guide to Hyperparameter Tuning
If you go around and ask people how they tune their models, their most likely answer will be “just write a script that does it for you”. Well, that’s easier said than done... Apparently, there are a few things you should keep in mind when implementing such a script. Here, at Taboola, we implemented a hyperparameter tuning script. Let me share with you the things we learned along the way...
Using fastText and Comet.ml to classify relationships in Knowledge Graphs
In this post, we will examine how a simple model, fastText, learns to represent entities in a subset of the FB15K knowledge graph, by classifying the relationship between pairs of entities in the graph...
Books
Text Mining with R: A Tidy Approach
Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you’ll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You’ll learn how tidytext and other tidy tools in R can make text analysis easier and more effective...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Want to reach our audience / fellow readers? Consider sponsoring - grab a spot now; first come, first served!
All the best, Hannah & Sebastian