Data Science Weekly - Issue 67
Issue #67 - March 5, 2015
Editor Picks
Deep Learning at Flickr, Pierre Garrigues
Pierre Garrigues is a Researcher in Machine Perception and Learning at Flickr. He spoke at the Deep Learning Summit at the end of January, giving an insight into how Flickr are using Deep Learning techniques to automate the labelling of their image libraries, as well as of the 10 million uploads they receive each day...
The Architecture of a Data Visualization
Information Design is playing an increasingly critical role in everyday journalism. The movement from word and picture to “words within diagrams” is building a new form of truth-telling and storytelling — and with it, a new journalistic aesthetic...
Predicting and Plotting Crime in Seattle
I have recently been watching “The Wire” and along with my Amazon Prime membership looking better and better, it’s actually given me some things to think about. Besides making me an expert police detective, it has steadily been making an impact on how I view a city. ...
Data Science Articles & Videos
Seeing Circles, Sines and Signals
I wrote an interactive essay on signals, sampling, and the Discrete Fourier Transform...
Facebook Data Science Team analyse "The Dress"
Fascinating study by our friends at the Facebook Data Science group: men see "The Dress" black & blue more often than women, who see it gold & white more often...
Beginning Deep Learning with 500 lines of Julia
This post will present (a slightly cleaned up version of) the code as a beginner's neural network tutorial...
The JVM Option For Deep Learning When You Are Not Google Or Netflix
Realising the potential of large datasets through Deep Learning, for some, may be considered the remit of an exclusive club of organisations with deep, deep pockets and the infrastructure to assemble dedicated teams to venture into areas of research which have not yet been explored by humans. Not so, according to Adam Gibson, Founder of SkyMind...
Google, Stanford: Machine Learning on 37.8m data points for drug discovery
Researchers from Google and Stanford University have used machine learning methods – deep learning and multitask networks – to discover effective drug treatments for a variety of diseases...
Mapping Your Music Collection
In this article we'll explore a neat way of visualizing your MP3 music collection. The end result will be a hexagonal map of all your songs, with similar sounding tracks located next to each other...
Analyzing data from City of Montreal with Machine Learning
A range of examples using scikit-learn with open data from Montreal, from analyzing real estate prices to predicting food safety violations...
Q&A: How Rent the Runway dazzles shoppers with data
Rent the Runway, a self-described "fashion company with a technology soul," wants to revolutionize how the average person shops for clothes - using big data...
The Time Everyone “Corrected” the World’s Smartest Woman
When vos Savant politely responded to a reader’s inquiry on the Monty Hall Problem, a then-relatively-unknown probability puzzle, she never could’ve imagined what would unfold: though her answer was correct, she received over 10,000 letters, many from noted scholars and Ph.Ds, informing her that she was a hare-brained idiot...
Jobs
Data Scientist - Showpad - San Francisco, CA
We’re designing our jobs around the great people we attract, which means you won’t find a listing of your day-to-day duties here. Instead, here’s a little bit about the problem we’re working on...
We have tons of data about what happens during a sale over the course of weeks or months. In some cases, we know second-by-second what’s happening during and after a meeting. We’re building our Data & Analytics team to start mining that data and answer a simple question: why do some deals close and others flop? Imagine what happens if we actually figure it out. That means we know, scientifically, how to have great business meetings and how to close deals. That’s a total game-changer!
While we’ve been focused on the Design & UX aspects until now, we know that we’re sitting on a goldmine. We’re hiring Data Engineers to build out our data warehouse and Data Scientists to own data mining projects...
Training & Resources
RAD - Outlier Detection on Big Data - Now Open Source
Outlier detection can be a pain point for all data-driven companies, especially as data volumes grow. At Netflix we have multiple datasets growing by 10B+ records/day and so there’s a need for automated anomaly detection tools ensuring data quality and identifying suspicious anomalies. Today we are open-sourcing our outlier detection function, called Robust Anomaly Detection (RAD), as part of our Surus project...
Top 50 Data Science Resources
The best blogs, forums, videos and tutorials to learn all about Data Science...
Computerworld's R for Beginners Hands-On Guide
Computerworld's Sharon Machlis has done a great service for the R community, and especially for R novices, by creating the online Beginner's Guide to R. You can read our overview of her guide from 2013 here, but it's been regularly updated since then and is now available as a PDF...
Books
Thinking Statistically
An accessible introduction to a range of statistical techniques...
"A truly excellent read and far more fun than a book about statistics has any right to be..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Enjoyed the newsletter? Please forward it along to friends and colleagues - we'd love to have them onboard! - All the best, Hannah & Sebastian