Data Science Weekly - Issue 66
Issue #66 Feb 26 2015
Editor Picks
The History of Machine Learning from the Inside Out
In episode five of Talking Machines, we hear the first part of our conversation with Geoffrey Hinton (Google and University of Toronto), Yoshua Bengio (University of Montreal) and Yann LeCun (Facebook and NYU)...
Why You Should Fear Machine Intelligence
Development of superhuman machine intelligence is probably the greatest threat to the continued existence of humanity...
Google’s AI Is Now Smart Enough to Play Atari Like the Pros
Last year Google shelled out an estimated $400 million for a little-known artificial intelligence company called DeepMind. Since then, the company has been pretty tight-lipped about what’s been going on behind DeepMind’s closed doors, but here’s one thing we know for sure: There’s a professional videogame tester who’s pitted himself against DeepMind’s AI software in a kind of digital battle royale...
Data Science Articles & Videos
Bayes' Theorem with Lego
What's a good blog on probability without a post on Bayes' Theorem? Bayes' Theorem is one of those mathematical ideas that is simultaneously simple and demanding. Its fundamental aim is to formalize how information about one event can give us understanding of another. Let's start with the formula and some lego, then see where it takes us...
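For reference, the formula the post alludes to is usually stated as below. This is the standard textbook form, with A and B as generic placeholder events (an assumption here, not the post's own Lego notation).

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}, \qquad P(B) > 0
```

Read as: learning that B occurred rescales the prior probability P(A) by how much more likely B is when A holds than it is overall.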
From Academia to Metis Data Science Bootcamp: Andy Martens Interview
We recently caught up with Dr. Andy Martens, former professor and researcher in social psychology and physiology who is in the process of moving from academia to data science. We were keen to learn more about his background, his move to data science, his choice of going to a data science bootcamp, and why he chose Metis Data Science Bootcamp in particular...
Data Mining Indian Recipes Reveals New Food Pairing Phenomenon
By studying the network of links between Indian recipes, computer scientists have discovered that the presence of certain spices makes a meal much less likely to contain ingredients with flavors in common...
Bringing Big Data to the Fight Against Benefits Fraud
A few years ago, the New York City Human Resources Administration decided to try a new way to root out fraud among people receiving government benefits. Data detectives began running benefit recipients through a computerized pattern-recognition system...
Proving that Android’s, Java’s and Python’s sorting algorithm is broken
(and showing how to fix it)
After we had successfully verified Counting and Radix sort implementations in Java with a formal verification tool called KeY, we were looking for a new challenge...
Demystification of DIY — Defining Basketball Analytics Down
The furor over analytics (re-)sparked by Charles Barkley’s pre-All-Star Week tirade has gotten us here at Nylon Calculus talking. Moving past the immediate, defensive reactions to Barkley’s particular perspective, there was something to be taken from the discussion. Barkley and many others don’t really know what “analytics” entail...
The Ethical Risks of Detecting Disease Outbreaks With Big Data
An over-prediction could cause panic, misallocation of limited supplies of vaccines or medical resources, and, as some reactions to the recent Ebola outbreak demonstrated, damaging stigmatization of people or communities who don't pose a risk...
"Data Science: Where are We Going?" - Dr. DJ Patil (Strata 2015)
Data Science, where are we going? What impact can we expect? With a special introduction from President Barack Obama....
The Best Stats You've Ever Seen
You've never seen data presented like this. With the drama and urgency of a sportscaster, statistics guru Hans Rosling debunks myths about the so-called "developing world."...
Jobs
Data Scientist/Modeler, Yield Management - Twitter - San Francisco, CA
The Yield Management team at Twitter (within the Sales Operations group) is being built to partner with our Global Sales, Product and Sales Finance teams to extract maximum value from Twitter’s ad inventory. This is a strongly data-driven team that provides analytical, strategic insights and recommendations to support monetization improvements and revenue growth and also helps execute them in collaboration with other teams.
As we’re building and growing the team, we’re looking for a Data-Scientist/Modeler to help mine advertising and auction data to build data models identifying drivers of monetization and monitor marketplace performance and dynamics. The focus of projects will be to uncover revenue and monetization improvement opportunities for the company...
Training & Resources
Dataset Inventorying Tool
Today we’re releasing Let Me Get That Data For You (LMGTDFY), a free, open source tool that quickly and automatically creates a machine-readable inventory of all the data files found on a given website...
Optimizing Python in the Real World: NumPy, Numba, and the NUFFT
Too often, tutorials about optimizing Python use trivial or toy examples which may not map well to the real world. Here, I'm going to take a different route: in this post I will outline the process of understanding, implementing, and optimizing a non-trivial algorithm in Python...
Topic Modeling for the Uninitiated
Topic models provide a nice way to explore a collection of documents that share a common theme, a so-called “topic”...
Books
Thinking Statistically: An accessible introduction to a range of statistical techniques...
"A truly excellent read and far more fun than a book about statistics has any right to be..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Enjoyed the newsletter? Please forward it along to friends and colleagues - we'd love to have them onboard! - All the best, Hannah & Sebastian