Data Science Weekly - Issue 70
Issue #70 March 26 2015
Editor Picks
Three Things About Data Science You Won't Find In the Books
In case you haven’t heard yet, Data Science is all the rage. Courses, posts, and schools are springing up everywhere. However, every time I take a look at one of those offerings, I see that a lot of emphasis is put on specific learning algorithms. Of course, understanding how logistic regression or deep learning works is cool, but once you start working with data, you find out that there are other things equally important, or maybe even more...
Images that fool computer vision raise security concerns
Computers are learning to recognize objects with near-human ability. But Cornell researchers have found that computers, like humans, can be fooled by optical illusions, which raises security concerns and opens new avenues for research in computer vision...
How to Train a Data Scientist
Panel discussion with a range of Data Scientists...
Data Science Articles & Videos
Six Years In. A Few Thoughts on Foursquare.
But the real interesting part of the Foursquare story is all the technology we’ve had to build so that, say, the Foursquare app can ping you to suggest a sandwich shop you’d love as you walk through a neighborhood for the first time, or so the Swarm app can automatically “snap” you to the place we know you’re about to check in to. There’s a reason that we’re one of the only companies doing proactive and predictive local search and firing off contextual notifications — it’s hard...
The Math of March Madness
These days, when statistical algorithms can figure out what breakfast cereal you want based on your browser history, stats-minded hoops fans have thrown lots of complex analysis at the problem of picking winners. But what’s the best method?...
Congress is a Game, and We Have the Data to Show Who’s Winning
Most approaches to measuring the influence that different interest groups have in Congress favor quantitative metrics like money spent lobbying vs money saved in tax breaks. But while that kind of data is great at shining a spotlight on an individual corporation, it does little to illuminate the playing field and reveal how much influence groups have relative to one another...
Assembling thefacebook: Using heterogeneity to understand online social network assembly
How much of a network's assembly is driven by simple growth? How does a network's structure change as it matures? How does network structure vary with adoption rates and user heterogeneity, and do these properties play different roles at different points in the assembly? We investigate these and other questions using a unique dataset of online connections among the roughly one million users at the first 100 colleges admitted to Facebook, captured just 20 months after its launch...
Crushed it! Landing a data science job
Data science interviews are the worst because data science is interdisciplinary: code for “you have to know everything about all the disciplines.” Depending on the company and the team, your interview might look like a software developer’s interview, or it might look like a statistician’s interview, and the bad news is that virtually none of the material overlaps. I recently spent a ton of time studying for interviews and I’ve got some hot tips to pass along if you’re thinking about a move soon...
Brad Klingenberg, StitchFix on Decoding Fashion via Analytics and ML
We discuss the challenges in making personal styling recommendations, unexpected insights, interesting trends, motivation, advice, desired qualities in data scientists and more...
Pinnability: Machine learning in the home feed
Pinnability is the collective name of the machine learning models we developed to help Pinners find the best content in their home feed. It’s part of the technology powered by smart feed, which we introduced last August, and estimates the relevance score of how likely a Pinner will interact with a Pin. With accurate predictions, we prioritize those Pins with high relevance scores and show them at the top of home feed...
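The prioritization step described above can be illustrated with a minimal Python sketch. It is not Pinterest's implementation: the pin identifiers, the score values, and the `rank_feed` helper are all hypothetical, and the scores stand in for whatever a trained model would predict.

```python
def rank_feed(scored_pins):
    """Order pins so the highest relevance score comes first.

    `scored_pins` is a list of (pin_id, relevance_score) tuples.
    """
    return sorted(scored_pins, key=lambda item: item[1], reverse=True)


# Hypothetical pins with made-up relevance scores (illustrative only).
candidates = [("pin_a", 0.2), ("pin_b", 0.9), ("pin_c", 0.5)]
home_feed = [pin_id for pin_id, _ in rank_feed(candidates)]
```

Here `home_feed` lists the pin with the highest score first, which mirrors the idea of showing the most relevant Pins at the top of the feed.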
Memantic: A Medical Knowledge Discovery Engine
We present a system that constructs and maintains an up-to-date co-occurrence network of medical concepts based on continuously mining the latest biomedical literature. Users can explore this network visually via a concise online interface to quickly discover important and novel relationships between medical entities...
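As a rough illustration of what a co-occurrence network is, the Python sketch below counts how often pairs of concepts appear in the same document. It is an assumption-laden toy, not the Memantic system: the `build_cooccurrence` helper and the sample concept sets are hypothetical, and a real pipeline would first need to extract concepts from the literature.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(documents):
    """Count how often each pair of concepts appears in the same document."""
    edges = Counter()
    for concepts in documents:
        # Sort and de-duplicate so each unordered pair is counted once per document.
        for pair in combinations(sorted(set(concepts)), 2):
            edges[pair] += 1
    return edges


# Hypothetical concept sets, one per abstract (illustrative data only).
abstracts = [
    {"aspirin", "headache", "inflammation"},
    {"aspirin", "inflammation"},
    {"headache", "migraine"},
]
network = build_cooccurrence(abstracts)
strongest = network.most_common(1)[0]
```

Each key in `network` is an edge between two concepts and each value is its weight. Keeping the network up to date as new documents arrive then amounts to adding their pair counts to the existing counter.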
Would you hire a mathematician with limited statistics knowledge for a data scientist role?
Excellent set of answers to this question on Quora...
Jobs
Data Mining Intern - The Cheesecake Factory - Calabasas, CA
You may know us as a company with great food…You may also know us from ‘Fortune’s 100 Best Companies to Work For’ list…What you may not know is our Internship Program is reinventing what it means to be an “interestingly educational experience”...
Training & Resources
Google DeepMind
All the papers in one place...
New Online Tool for Seasonal Adjustment
Useful tool to enable seasonal adjustment of data sets...
10 Common Misconceptions about Neural Networks
As a computer scientist, I often get asked about neural networks because people would like to use them but often don't know how to go about it. Alternatively, they may have tried to use them but were disappointed in the results. Neural Networks don't have to be hard to use...
Books
The Drunkard's Walk: How Randomness Rules Our Lives
Excellent book on randomness in our day-to-day lives...
"This smart book will make you think. Academic yet easy to read, it explores how random events shape the world and how human intuition fights that fact...."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Enjoyed the newsletter? Please forward it along to friends and colleagues - we'd love to have them onboard! - All the best, Hannah & Sebastian