Data Science Weekly - Issue 111
Issue #111 January 2016
Editor Picks
Machine Learning is Fun! Part 2
Using Machine Learning to generate Super Mario Maker levels...
Why too much evidence can be a bad thing
Under ancient Jewish law, if a suspect on trial was unanimously found guilty by all judges, then the suspect was acquitted. This reasoning sounds counterintuitive, but the legislators of the time had noticed that unanimous agreement often indicates the presence of systemic error...
AMA Data Scientist: Jake Porway of DataKind
Kick off 2016 with DataKind’s founder and executive director Jake Porway for his first-ever Reddit AMA on January 13. Join in for a candid discussion of what it takes to apply data science for social good. (Hint: much more than good intentions.) Hope to see you on /r/DataScience!...
A Message from this week's Sponsor:
Build real-time apps.
Syncano. Database. Backend. Middleware. Real-time. Support. Start for free!
Data Science Articles & Videos
Bayes's Theorem: What's the Big Deal?
Bayes’s theorem, touted as a powerful method for generating knowledge, can also be used to promote superstition and pseudoscience...
10 More lessons learned from building real-life Machine Learning systems
I now decided to follow up with 10 new lessons that built upon the original ones. The present two-part blog post includes new lessons not only learned directly at Quora but also from talking to many people at different companies...
Netflix Recommender System: Algorithms, Business Value, and Innovation
This article discusses the various algorithms that make up the Netflix recommender system, and describes its business purpose. We also describe the role of search and related algorithms, which for us turns into a recommendations problem as well...
Who Controls Your Facebook Feed
A small team of engineers in Menlo Park. A panel of anonymous power users around the world. And, increasingly, you...
Analyzing networks of characters in 'Love Actually'
Every Christmas Eve, my family watches Love Actually. Even on the eighth or ninth viewing, it’s impressive what an intricate network of characters it builds. This got me wondering how we could visualize the connections quantitatively, based on how often characters share scenes. So last night, while my family was watching the movie, I loaded up RStudio, downloaded a transcript, and started analyzing...
Unearthing Data to Unleash Impact: Unique Data Sources to Drive Change
At DataKind and Tableau Foundation, we regularly work with nonprofits that have important questions to answer, but not necessarily the data to do so. Or, at least, they don’t think they have the data to do so...
The Myth Of AI
A Conversation With Jaron Lanier - Computer Scientist; Musician; Author of Who Owns the Future?...
Attention And Memory In Deep Learning And NLP
A recent trend in Deep Learning is Attention Mechanisms. In an interview, Ilya Sutskever, now the research director of OpenAI, mentioned that Attention Mechanisms are one of the most exciting advancements, and that they are here to stay. That sounds exciting. But what are Attention Mechanisms?...
Winning The Bias-Variance Tradeoff
Machine learning is a strange mix of math and weird heuristics. When I started studying machine learning, I was SO FRUSTRATED. Everything was “well it works in practice” and so little of it was math. I was a pure math major at the time, so arguments like “well it works in practice” made me REALLY MAD... The bias-variance decomposition is a small piece of math that actually explains why some things in machine learning work!...
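For readers who have not seen it, the decomposition mentioned above is usually written as follows for squared-error loss. This is the standard textbook form, not necessarily the notation used in the linked post. It assumes data generated as y = f(x) + ε with zero-mean noise of variance σ², and a model f̂ fitted on a random training set D:

```latex
\mathbb{E}_{D,\varepsilon}\!\left[\big(y - \hat{f}(x)\big)^2\right]
  = \underbrace{\big(\mathbb{E}_D[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\big(\hat{f}(x) - \mathbb{E}_D[\hat{f}(x)]\big)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

More flexible models tend to lower the bias term and raise the variance term, which is the tradeoff the title refers to.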
Machine Learning for Artists
This spring I will be teaching a course at NYU’s Interactive Telecommunications Program (ITP) called “Machine Learning for Artists.” Since the subject is fairly uncommon outside of the realm of scientific research, I thought it would be helpful to outline my motivations for offering this class...
Jobs
Data Scientist - Johns Hopkins Health System - Glen Burnie, MD
Upon joining Johns Hopkins Health System, you become part of a diverse organization dedicated to its patients, their families, and the community we serve, as well as to our employees. The Data Scientist is responsible for monitoring data quality, analyzing data using statistical tools like R, SAS, SPSS, WEKA, and RapidMiner, presenting data in charts, graphs, and tables, and leveraging relational databases for collecting data. Through innovation, the Data Scientist will find meaningful patterns, trends, and relationships by evaluating large amounts of healthcare data and will be able to interpret and explain the findings to the organization...
Training & Resources
bayes.js: A Small Library For Doing MCMC In The Browser
Bayesian data analysis is cool, Markov chain Monte Carlo is the cool technique that makes Bayesian data analysis possible, and wouldn’t it be coolness if you could do all of this in the browser? That was what I thought, at least, and I’ve now made bayes.js: A small JavaScript library that implements an adaptive MCMC sampler and a couple of probability distributions, and that makes it relatively easy to implement simple Bayesian models in JavaScript...
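To give a feel for what an MCMC sampler does, here is a minimal random-walk Metropolis sketch in Python. It is not bayes.js and it is not the adaptive sampler the library implements. The model (a Normal(0, 10) prior on a mean `mu`, unit-variance Normal likelihood), the data values, and all function names are assumptions made up for illustration.

```python
import math
import random

def log_posterior(mu, data):
    """Unnormalized log posterior: Normal(0, 10) prior, Normal(mu, 1) likelihood."""
    log_prior = -0.5 * (mu / 10.0) ** 2
    log_likelihood = -0.5 * sum((x - mu) ** 2 for x in data)
    return log_prior + log_likelihood

def metropolis(data, n_samples=5000, step=0.5, seed=0):
    """Random-walk Metropolis with a fixed (non-adaptive) Gaussian proposal."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    samples = []
    for _ in range(n_samples):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        samples.append(mu)
    return samples

data = [4.8, 5.1, 5.3, 4.9, 5.2]      # made-up observations
samples = metropolis(data)
kept = samples[1000:]                  # discard burn-in
posterior_mean = sum(kept) / len(kept)
print(round(posterior_mean, 2))
```

The retained samples approximate the posterior of `mu`; an adaptive sampler additionally tunes `step` as it runs instead of leaving it fixed.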
Introducing Guesstimate, a Spreadsheet for Things That Aren’t Certain
A spreadsheet that’s as easy to use as existing spreadsheets, but works for uncertain values. For any cell you can enter confidence intervals (lower and upper bounds) that can represent full probability distributions. 5000 Monte Carlo simulations are performed to find the output interval for each equation, all in the browser...
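The idea of propagating intervals through a formula by simulation can be sketched in a few lines of Python. This is an illustrative sketch, not Guesstimate's implementation: treating each (lower, upper) pair as the central 90% of a normal distribution is an assumption, as are the function names and the example numbers.

```python
import random

def sample_interval(low, high, rng):
    """Draw from a normal whose central 90% spans [low, high] (assumed mapping)."""
    mean = (low + high) / 2
    sigma = (high - low) / (2 * 1.645)   # 1.645 = z-score of the 95th percentile
    return rng.gauss(mean, sigma)

def simulate(formula, inputs, n=5000, seed=0):
    """Evaluate formula on n random draws and return its 5th and 95th percentiles."""
    rng = random.Random(seed)
    results = sorted(
        formula(*(sample_interval(lo, hi, rng) for lo, hi in inputs))
        for _ in range(n)
    )
    return results[int(0.05 * n)], results[int(0.95 * n) - 1]

# Hypothetical cell: revenue = price * units, with price in 8..12 and units in 90..110.
low, high = simulate(lambda price, units: price * units, [(8, 12), (90, 110)])
print(round(low), round(high))
```

Each input cell is sampled independently here; correlated inputs would need a joint distribution rather than separate intervals.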
5 More arXiv Deep Learning Papers, Explained
Top recent deep learning papers on arXiv are presented, summarized, and explained with the help of a leading researcher in the field...
Books
Prime Obsession: Bernhard Riemann and the Greatest Unsolved Problem in Mathematics
Fascinating account of a mathematical mystery that continues to challenge...
"Prime Obsession is a delight: a book about a hypothesis on the distribution of prime numbers that reads like a gripping mystery..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :)
All the best, Hannah & Sebastian