Data Science Weekly - Issue 89
Issue #89, August 6, 2015
Editor Picks
Machine Learning And Human Bias: An Uneasy Pair
I’d wager that most readers feel a little uneasy about how the Chicago PD Heat List was implemented – even if they agree that the intention behind the algorithm was good. To use machine learning and public data responsibly, we need to have an uncomfortable discussion about what we teach machines and how we use the output...
These Artworks Were Made by Algorithms
Think artificial intelligence and you probably envisage clear-cut tests of mental acumen: solving equations, processing scientific data, winning at chess. But intelligence, in the very human way we understand it, is about more than ticking boxes. It requires a little bit of creativity...
How To Prepare For A Data Science Training Course
You have decided to start a data science training program. Maybe it's a bootcamp, maybe it's a fellowship, maybe it's an apprenticeship, or maybe it's a professional degree like a master's program. In any case, you are ready to make the most out of the situation. The only thing left to do is to prepare for the program so that you can achieve your eventual goal of getting a data science job...
A Message from this week's Sponsor
Try JIRA for Free Today
For $10/month, enable collaboration across your team with the most flexible tool out there. Trusted by 30,000 companies of every size and industry.
Data Science Articles & Videos
Google And MIT Researchers Demo An Algorithm That Lets You Take Clear Photos Through Reflections
Whenever I try to take a photo through a plane or hotel window, chances are there are plenty of reflections that show up on the final image and ruin it. Now, however, Google and MIT researchers have found a way to take these photos and automatically remove these reflections and other obstructions...
Visualizing GoogLeNet Classes
Ever wondered what a deep neural network thinks a Dalmatian should look like? Well, wonder no more...
Cut the marketing nonsense: Will the real data scientist please stand up?
Marketing people are just muddying the waters by misappropriating the 'data scientist' job title, according to former CERN physicist Dr Paul Schaack...
Using other compute engines with Ibis
Several people have asked me about using Ibis with execution engines other than Impala. The purpose of this post is to explain how one can make Ibis work with other systems and what that might mean for the actual users...
Learning Seattle's Work Habits from Bicycle Counts (Updated!)
Last year I wrote a blog post examining trends in Seattle bicycling and how they relate to weather, daylight, day of the week, and other factors... where the previous post examined the data using a supervised machine learning approach for data modeling, this post will examine the data using an unsupervised learning approach for data exploration...
GRUV: Algorithmic Music Generation using Recurrent Neural Networks
We compare the performance of two different types of recurrent neural networks (RNNs) for the task of algorithmic music generation, with audio waveforms as input. Our results indicate that the generated outputs of the LSTM network were significantly more musically plausible than those of the GRU...
GAM: The Predictive Modeling Silver Bullet
Despite its lack of popularity in the data science community, GAM is a powerful and yet simple technique. Hence, the purpose of this post is to convince more data scientists to use GAM. Of course, GAM is no silver bullet, but it is a technique you should add to your arsenal...
Large-scale machine learning at Criteo
At Criteo, machine learning lies at the core of our business. We use machine learning for choosing when we want to display ads as well as for personalized product recommendations and for optimizing the look & feel of our banners (as we automatically generate our own banners for each partner using our catalog of products)... we’ve built a large scale distributed machine learning framework, called Irma, that we use in production and for running experiments when we search for improvements on our models...
Data scientists to CEOs: You can’t handle the truth
Too many big data initiatives fail because companies, top to bottom, aren’t committed to the truth in analytics. Let me explain...
Jobs
Data Scientist - American Express - New York, NY
American Express is working on our company’s next transformation: integrating into the digital universe and developing new forms of payment and lifestyle services. As a Data Scientist, you will be part of a team dedicated to helping American Express accelerate its digital transformation, and you will be challenged with designing winning applications and developing new Big Data capabilities and innovation that will elevate American Express to the forefront of the digital revolution...
Training & Resources
Deep Learning Playbook
To help those who are interested in learning deep learning and applying it to their own problems, I curated a minimal list of resources, based on my own learning experience, to get your hands dirty with deep learning quickly without being distracted by the flood of materials online...
Top 10 Machine Learning APIs
The APIs that made it to our top 10 machine learning APIs list offer a wide range of capabilities including image tagging, face recognition, document classification, speech recognition, predictive modeling, sentiment analysis, and pattern recognition...
DataSet - 18 months of NYC Taxi & Limo Trip Data
This dataset includes trip records from all trips completed in yellow and green taxis in NYC in 2014 and select months of 2015...
Books
Effective Python: 59 Specific Ways to Write Better Python
Recommended by several readers of the newsletter...
"Effective Python is a time-efficient way to learn – or remind yourself – what the best practices are and why we use them. It’s a concise book of practical techniques to write maintainable, performant and robust code using practices widely accepted in the community..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :) - All the best, Hannah & Sebastian