Data Science Weekly - Issue 50
Issue #50, Nov 6, 2014
Editor Picks
Hacker's Guide To Neural Networks
You might be eager to jump right in and learn about Neural Networks, backpropagation, how they can be applied to datasets in practice, etc. But before we get there, I'd like us to first forget about all that. Let's take a step back and understand what is really going on at the core....
Text Visualization Browser - A Visual Survey Of Text Visualization Techniques
In this website, we present an interactive visual survey of text visualization techniques...
What Statistical Analysis Should I Use?
The table below covers a number of common analyses and helps you choose among them based on the number of dependent variables and the nature of your independent variables...
Data Science Articles & Videos
Blueprint For An Analytical NFL Franchise, Version 0.1
It feels like we're having another one of those moments where some fans are pining for a more analytical NFL. Here are my very nascent thoughts on this topic...
The State Of Deep Learning In 2014
Overview of some exciting Deep Learning developments as of October 2014...
The Browsemaps: Collaborative Filtering at LinkedIn [PDF]
This paper presents LinkedIn’s horizontal collaborative filtering infrastructure, known as browsemaps...We also present case studies on how LinkedIn uses this platform in various recommendation products, as well as lessons learned in the field over the several years this system has been in production...
Researchers: AI Program Smart Enough To Enter 80% Of Private Universities
An AI program can now pass the entrance examinations of 80 percent of private universities in Japan but still falls short of acceptance to the nation’s most prestigious school...
My Three Ex’s: A Data Science Approach For Applied Machine Learning
Today, I gave a talk at QCon SF entitled “My Three Ex’s: A Data Science Approach for Applied Machine Learning”. The talk wasn’t about machine learning as such, but rather about applying machine learning to solve problems...Hence my three ex’s:...
First Demonstration Of Artificial Intelligence On A Quantum Computer
A Chinese team of physicists have trained a quantum computer to recognise handwritten characters, the first demonstration of “quantum artificial intelligence”...
Unit Tests for Stochastic Optimization [PDF]
Optimization by stochastic gradient descent is an important component of many large-scale machine learning algorithms... In this paper we develop a collection of unit tests for stochastic optimization...
Neural Networks Demystified [YouTube Video]
In this short series, we will build and train a complete Artificial Neural Network in Python. New videos each Friday...
LinkedIn Had One Of The First Data Science Teams. Now It’s Breaking Up The Band
The data science team contained two subsections: the product data science team...and the decision sciences team...the social networking company has divided the subsections and stuck them in separate departments. The decision sciences team now reports to the office of the company’s chief financial officer, while the product data science team is now part of engineering...
2014 Conference On Empirical Methods In Natural Language Processing - Paper List With Mini-Reviews
I'm going to try something new and daring this time. I will talk about papers I liked, but I will mention some things I think could be improved. I hope everyone finds this interesting and productive...
Jobs
DataKind - Director of Programs
In this highly visible and impactful role, you’ll oversee our existing DataCorps and DataDive programs and help launch our new In-House Data Science team. You will play a key part in our continued growth through skillful development, direction and technical execution. Our Director of Programs will be located in our New York City offices, working closely with the Executive Director and the Director of Operations to enhance existing programs, design processes, create strategies, and build internal and external data science teams. At times, you will act as liaison to the Executive Director during meetings with external partners and advisors, business development calls, and technical conferences...
Training & Resources
Introduction To Principal Component Analysis (PCA)
Principal Component Analysis (PCA) is a dimensionality-reduction technique that is often used to transform a high-dimensional dataset into a smaller-dimensional subspace prior to running a machine learning algorithm on the data...
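To make the idea concrete, here is a minimal sketch of PCA as a projection onto a lower-dimensional subspace. It is not taken from the linked article: the function name pca_project, the random toy data, and the use of NumPy's SVD are all assumptions chosen for illustration.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto the top principal components.

    Hypothetical helper for illustration only (not from the linked article).
    """
    # Center each feature so the components describe variance, not the mean.
    X_centered = X - X.mean(axis=0)
    # The rows of Vt are the principal directions, ordered by decreasing
    # singular value (and therefore by decreasing explained variance).
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Coordinates of each sample in the reduced subspace.
    return X_centered @ components.T, components


# Toy data (an assumption): 100 samples with 5 features, reduced to 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z, components = pca_project(X, 2)
print(Z.shape)  # (100, 2)
```

The reduced matrix Z can then be passed to a downstream learning algorithm in place of X, which is the use the blurb describes.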
Nando de Freitas's Machine Learning Course at The University of British Columbia
Academically, ML is one of the fastest growing fields on all fronts: Theory, methodology and application...it is also revolutionizing biology, astrophysics, engineering, and all other areas of science...[Videos for lectures]...
SVM - Understanding The Math - Part 1
This is the first article from a series of articles I will be writing about the math behind SVM. There is a lot to talk about and a lot of mathematical background is often necessary. However, I will try to keep a slow pace and to provide in-depth explanations, so that everything is crystal clear, even for beginners...
Books
Bad Data Handbook: Cleaning Up The Data So You Can Get Back To Work
In this handbook, data expert Q. Ethan McCallum has gathered 19 colleagues from every corner of the data arena to reveal how they’ve recovered from nasty data problems...
Containing nineteen essays on lessons learned regarding bad data and data analysis workflows, this riveting good read is an excellent companion to more straightforward texts on specific methodologies and technologies used in data science....
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.
P.S. Enjoyed the newsletter? Please forward it to friends and peers - we'd love to have them onboard too :-) - All the best, Hannah & Sebastian