Data Science Weekly - Issue 19
Issue #19, April 3, 2014
Editor Picks
Meet the Man building an AI that mimics our Neocortex – and could kill off Neural Networks
Jeff Hawkins has bet his reputation, fortune, and entire intellectual life on one idea: that he understands the brain well enough to create machines with an intelligence we recognize as our own. If his bet is correct, the Palm Pilot inventor will father a new technology, one that becomes the crucible in which a general AI is one day forged. If his bet is wrong, then Hawkins will have wasted his life. At 56 years old that might sting a little...
Data scientists need their own GitHub. Here are four of the best options
Imagine if a company’s three highly valued data scientists can happily work together without duplicating each other’s efforts and can easily call up the ingredients and results of each other’s previous work. That day has come...
How the rise of the "R" Language is bringing open source to Science
Thanks to dwindling research budgets and the rising cost of science software, "open science" advocates may be succeeding at getting science to go open source. And it's thanks in part to a language called R...
Data Science Articles & Videos
META: What Data Scientists are reading. And why.
We recently posted an analysis of the most-read articles on this newsletter for the past two quarters. We were curious to understand what was getting the most clicks and if there were any consistent areas of interest...
Forget the Algorithms and Start Cleaning Your Data
The idea that the combination of predictive algorithms and big data will change the world is a tempting one. And it may end up being true. But for now, the industry is facing a reality check when it comes to big data analytics. Instead of focusing on what algorithms to use, your big data success depends more on how well you cleaned, integrated, and transformed your data...
The Sexiest Job of the 21st Century is Tedious, and that Needs to Change
As organizations collect increasingly large and diverse data sets, the demand for skilled data scientists will continue to rise. In fact, it was dubbed “The Sexiest Job of the 21st Century” by HBR. Unfortunately, the day-to-day reality of the role doesn’t quite match the romanticized version...
Big Data: Are we making a Big Mistake?
Cheerleaders for big data have made four exciting claims, each one reflected in the success of Google Flu Trends... Unfortunately, these four articles of faith are at best optimistic oversimplifications. At worst, according to David Spiegelhalter, Winton Professor of the Public Understanding of Risk at Cambridge University, they can be “complete bollocks. Absolute nonsense”...
Data Science + Crime Prevention = Predictive Policing
We recently caught up with George Mohler, Chief Scientist at PredPol, Inc and Assistant Professor of Mathematics and Computer Science at Santa Clara University. We were keen to learn more about his background, the theory and technology behind predictive policing and the impact PredPol is achieving...
SelfieCity might be the Ultimate Data-Driven Exploration of the Selfie
Understanding what keeps customers engaged is incredibly valuable, as it is a logical foundation from which to develop retention strategies and roll out operational practices aimed to keep customers from walking out the door. Consequently, there's growing interest among companies to develop better churn-detection techniques, leading many to look to data mining and machine learning for new and creative approaches...
Did Nvidia Just Demo SkyNet on GTC 2014? – Neural Net Based “Machine Learning” Intelligence Explored
Skynet of legend, I remember, was a Neural-Net based Artificial Intelligence. It worked on the concept of “Machine Learning”. It so happens that Nvidia showcased what appears to be the first fully Scalable Deep Neural Network based (Primitive) Intelligence System. A System that can deploy “Machine Learning” and actually learn just like a human...
Differential Equations in Data Science
The ordinary differential equation (ODE) is a tool often overlooked in data science... However, it's a tool that's been in use for centuries, modeling everything from predicting optimal pharmaceutical dosing schedules through estimating options pricing. Here at URX we feel no tool should be left behind. We've re-surfaced the ODE and, as a gentle introduction, would like to show how it relates to a very common data science tool, Markov chains...
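For readers curious about the link between ODEs and Markov chains, here is a minimal sketch of one standard connection. It is not taken from the URX article. It assumes a two-state continuous-time chain with a hypothetical rate matrix Q, whose state probabilities follow the ODE dp/dt = pQ. Each forward-Euler step of that ODE is the same as one step of a discrete-time Markov chain with transition matrix I + dt*Q. The rates a and b, the step size, and the function names are all illustrative assumptions.

```python
def euler_step(p, Q, dt):
    """One forward-Euler step of dp/dt = p Q, with p a row vector.

    Equivalent to multiplying p by the transition matrix I + dt * Q.
    """
    n = len(p)
    return [p[j] + dt * sum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]


def integrate(p0, Q, dt, steps):
    """Apply euler_step repeatedly, starting from the distribution p0."""
    p = list(p0)
    for _ in range(steps):
        p = euler_step(p, Q, dt)
    return p


# Hypothetical rates: leave state 0 at rate a, leave state 1 at rate b.
a, b = 0.3, 0.1
Q = [[-a, a],
     [b, -b]]

# Start with all probability mass in state 0 and integrate to t = 50.
p_final = integrate([1.0, 0.0], Q, dt=0.01, steps=5000)

# Known stationary distribution of a two-state chain with these rates.
stationary = [b / (a + b), a / (a + b)]
```

Over a long enough horizon, `p_final` should approach `stationary`, which is the same limit the discrete chain reaches.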
How the NSA can use Metadata to predict your Personality
The president and congressional leaders want to end NSA bulk metadata collection, but not the use of metadata, which may even be expanded. From a technical perspective, the question of what your metadata can reveal about you, or potential enemies, remains as important as it has been since the Edward Snowden scandal. The answer is more than you might think...
Swish Analytics: Algorithmic Sports, Predictions & Betting Recommendations
Swish Analytics Inc. is a sports technology startup based in San Francisco that has developed algorithmic sports predictions and betting recommendations. The three founders raised $300K in March from a group of private angel investors to deliver algorithmic sports predictions to bettors and fans in the underserved data science field...
Jobs
Maps Data Scientist, Apple - Santa Clara, CA
The Maps Data Insights team has an opening for a craftsman skilled in Large Scale Data Mining and Machine Learning for making significant contributions in improving Apple Maps. The role involves developing models for identifying patterns and anomalies and for mining structured, semi-structured and unstructured data. The person will get an opportunity to contribute to projects ranging from the ones involving massive datasets to the ones solving small scale but very complex problems using machine learning and probabilistic modeling techniques...
Training & Resources
Datasets: Webscope from Yahoo!
We have various types of data available to share. They are categorized into Ratings, Language, Graph, Advertising and Market Data, Computing Systems and an appendix of other relevant data and resources available...
Statistics Done Wrong: The Woefully Complete Guide
If you’re a practicing scientist, you probably use statistics to analyze your data. From basic t tests and standard error calculations to Cox proportional hazards models and geospatial kriging systems, we rely on statistics to give answers to scientific problems. This is unfortunate, because most of us don’t know how to do statistics...
Introduction to Artificial Neural Networks Part 2 - Learning
In part 1 we were introduced to what artificial neural networks are and we learnt the basics on how they can be used to solve problems. In this tutorial we will begin to find out how artificial neural networks can learn, why learning is so useful and what the different types of learning are...
An Introduction to Deep Learning: From Perceptrons to Deep Networks
In this article, I’ll introduce you to the key concepts and algorithms behind Deep Learning, beginning with the simplest unit of composition and building from there...
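As a taste of the "simplest unit" these tutorials start from, here is a minimal sketch of a single perceptron trained with the classic perceptron learning rule. It is not code from either article. The AND-gate training set, the learning rate, the epoch count, and the function names are illustrative assumptions.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    s = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if s > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron learning rule: nudge the weights by lr * error * input."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Assumed toy dataset: the logical AND of two binary inputs.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND)
predictions = [predict(weights, bias, x) for x, _ in AND]
```

AND is linearly separable, so the rule settles on weights that classify all four inputs correctly. A function like XOR is not, which is one reason deeper networks are needed.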
Books
Data Smart: Using Data Science to Transform Information into Insight
This book from John Foreman (Chief Data Scientist at Mailchimp) makes Data Science extremely practical and accessible - using Excel as a primary means for exploring Data Science concepts. The book introduces the major Data Science techniques, how they work, how to use them, and how they benefit your business, large or small. It's not about coding or database technologies. It's about turning raw data into insight you can act upon, and doing it as quickly and painlessly as possible...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Did you enjoy the newsletter? Do you have friends/colleagues who might like it too? If so, please forward it along - we would love to have them onboard :)