Data Science Weekly - Issue 81
Issue #81, June 11, 2015
Editor Picks
YCombinator 2015 predictions (based on machine learning)
I've created a machine learning algorithm that learns from the yclist of dead, active and exited startups. The tool only uses the names of the companies...
How Airbnb Uses Big Data And Machine Learning To Guide Hosts To The Perfect Price
Airbnb wants its hosts to set their own prices. But the home-sharing company, armed with billions of data points, is nevertheless starting to nudge hosts toward prices that earn them — and Airbnb — more money...
Computer Independently Solves 150 Year Old Biological Mystery
For the first time ever a computer has managed to develop a new scientific theory using only its artificial intelligence, and with no help from human beings...
A Message from this week's Sponsor
Want to be a Data Scientist, but don't know where to start?
Learn essential Data Science skills in SlideRule's Intro to Data Science Workshop. In this online bootcamp, you'll learn R, data wrangling, analytics and visualization by working on real projects, with 1-on-1 mentorship from expert Data Scientists from LinkedIn, Glassdoor, Trulia and Stripe.
Spots are limited; registration ends in 48 hours!
Data Science Articles & Videos
New Website can Identify Birds using Photos
In a breakthrough for computer vision and for bird watching, researchers and bird enthusiasts have enabled computers to achieve a task that stumps most humans—identifying hundreds of bird species pictured in photos...
Another Tottering Step Toward a New Era of Data-Making
Ken Benoit, Drew Conway, Benjamin Lauderdale, Michael Laver, and Slava Mikhaylov have an article forthcoming in the American Political Science Review that knocked my socks off when I read it this morning...
Talking Machines #12: The Economic Impact of Machine Learning and Using The Kernel Trick on Big Data
In episode twelve we talk with Andrew Ng, Chief Scientist at Baidu, we’re introduced to random features for large-scale kernel machines, and we take a listener question about the size of computing power in machine learning...
Predicting Gender from Music Tastes
Continuing on my mission to get better at Python I’ve been learning about the Pandas and sklearn libraries. I was looking for a challenge to use these libraries on and I had recently come across a nice lastFM data extract...
How Much Did It Rain? Winner's Interview: 2nd place, No Rain No Gain
Kagglers Sudalai and Marios came together to form team "No Rain No Gain!" and take second place in the How Much Did it Rain? competition. Sudalai had two goals in competing: to earn a Master's badge and to finish in the top 100. In the blog below, Sudalai shares how he managed to accomplish both (and get a new friend) by being part of a great team...
Developing for Development: Machine Learning for the Greater Good
When we contemplate machine learning, we might think of our phone companions Siri or Cortana, hot fitness items on the market now that adjust to our routines and abilities, or about big business and data science. Seldom does our picture of “cutting-edge” involve places that are not industrialized. But the “developing” world is a rich forum for machine learning and artificial intelligence...
Extracting text from an image using Ocropus
In this post, I'll explain how to extract text from images like these using the Ocropus OCR library...
Visualizing and Understanding Recurrent Networks
Recurrent Neural Networks (RNNs), and specifically a variant with Long Short-Term Memory (LSTM), are enjoying renewed interest as a result of successful applications in a wide range of machine learning problems that involve sequential data...
State of Hyperparameter Selection
Historically hyperparameter determination has been a woefully forgotten aspect of machine learning...
Jobs
Data Scientist - ZocDoc - New York, NY
Are you a driven and inquisitive data scientist with a PhD in a STEM field? Do you love finding insights in large, unique datasets? Want to play a crucial role at a dynamic company that’s improving healthcare for patients across the country? Join ZocDoc as a Data Scientist! As part of this fast-growing team, you’ll develop deep insights for our business units to execute as we continue to grow, and have a tremendous impact on the success of a game-changing company...
Training & Resources
Interview Questions for Data Scientist Positions
These are some questions I came up with when I was asked to conduct interviews...
Videos of Complete Deep Learning Course (Oxford University, 2015)
Nando de Freitas lecture series...
Chainer: A Powerful, Flexible, and Intuitive Framework of Neural Networks
Python Deep Learning library...
Books
Signal: Understanding What Matters in a World of Noise
New release!...
"In Signal, I provide straightforward and practical instruction in everyday signal detection. Using data visualization methods, I teach how you can apply statistics to gain a comprehensive understanding of your data, which will serve as the context for signal detection. I then adapt the techniques of Statistical Process Control in new ways to detect not just changes in the measures but also significant changes in the patterns that characterize your data..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :)
All the best, Hannah & Sebastian