Data Science Weekly - Issue 113
Issue #113, January 21, 2016
Editor Picks
Data Science on State of the Union Addresses: Obama (2016) vs. Obama (2015) vs. ... vs. George Washington (1790)
Barack Obama recently gave his final State of the Union address, and since we’re interested in analyzing text data at Civis Analytics, I figured I ought to see if I could discover anything interesting...
Why video games are essential for inventing artificial intelligence
Having concrete problems to try to solve with AI is necessary in order to make progress; if you try to invent AI without having something to use it for, you will not know where to start. My chosen domain is games, and I will explain why this is the most relevant domain to work on if you are serious about AI...
R Users Will Now Inevitably Become Bayesians
There are several reasons why everyone isn’t using Bayesian methods for regression modeling. One of these reasons has recently been shattered in the R world by not one but two packages: brms and rstanarm...
A Message from this week's Sponsor:
Want to Read a Hiring Manager's Mind?
“I thought I prepared myself well to enter this field but the reality is much different than I imagined, and it's been really discouraging.”
Landing a Data Science interview isn't just about skill-building. You need to make a Hiring Manager want you.
Learn how with this actionable (and totally free) 5-day course on How to Read a Hiring Manager’s Mind. Get Lesson #1 right now!
Data Science Articles & Videos
The Unreasonable Reputation of Neural Networks
It is hard not to be enamoured by deep learning nowadays, watching neural networks show off their endless accumulation of new tricks. There are, as I see it, at least two good reasons to be impressed...
Analyzing Canada-US Border Crossing Data
Recently I found an open data source containing lots of interesting attributes from the last 9 years from the four major border crossings here on the West Coast. For each crossing, the dataset had Volume, Delay, Service Rate and Queue Length attributes for each lane type (cars, trucks, buses, NEXUS) in 5-minute intervals. I decided to look at three different questions...
T-Shirts Unravelled
We washed, dried, measured and weighed 800 of the most popular men's t-shirts available online. The shirts included a wide variety of price points ($5-$50), sizes (XXS up to 6XL) and fits ("slim", "tall", "relaxed", etc.). After compiling the data, we worked with beta testers in NYC to develop an algorithm that could recommend t-shirt brands and sizes for a wide range of body types...
Cash for A.I. startups
Interview with Shivon Zilis of Bloomberg Beta on the emerging wave of machine intelligence startups...
A Word is Worth A Thousand Vectors
Presentation from Chris Moody of Stitchfix: word2vec, LDA, and introducing a new hybrid algorithm: lda2vec...
Understanding Deep Convolutional Networks
Deep convolutional networks provide state-of-the-art classification and regression results over many high-dimensional problems. We review their architecture, which scatters data with a cascade of linear filter weights and non-linearities. A mathematical framework is introduced to analyze their properties...
Visualizing CNN architectures side by side with mxnet
Convolutional Neural Networks can be visualized as computation graphs with input nodes where the computation starts and output nodes where the result can be read...
China’s Baidu Releases Its AI Code
“China’s Google” is joining U.S. tech giants in giving away some of its code...
Anthony Goldbloom gives you the secret to winning Kaggle competitions
Who better than Kaggle CEO and Founder, Anthony Goldbloom, to dish out advice? We caught up with him at Extract SF 2015 in October to pick his brain about how best to approach a Kaggle competition...
How should a Data Scientist's resume differ from an Academic CV?
Your academic CV is very coursework and research focused. You've heard business resumes need to be more action and results oriented, but you're not sure what that means for you. You're looking for advice on how to re-work your academic CV and not finding much advice out here. To help get you started, here are some thoughts on what you'll need to do...
Jobs
Data Scientist - Bitly - New York
Bitly faces a variety of interesting challenges that are ideally suited for a data scientist to pursue. We see massive amounts of data giving us a fascinating view into what is happening on the internet. With this data, it is our mission to empower marketers to make better decisions by providing insight into the connected world...
Training & Resources
How to Make the Leap from Excel to SQL
Blog post designed for Excel users who are looking to learn some SQL. We walk through how people can translate their Excel knowledge to SQL, and we've included a free workbook of six go-to Excel functions and their SQL equivalents...
Making Causal Impact Analysis Easy
The purpose of this document is to describe a robust approach to intervention analysis based on two key R packages: the CausalImpact package written by Kay Brodersen at Google and the dtw package available in CRAN...
Introduction to Semi-Supervised Learning with Ladder Networks
This is a brief introduction to the implementation of Ladder networks...
Books
Superforecasting: The Art and Science of Prediction
Interesting take on prediction, drawing on decades of research and the results of a massive, government-funded forecasting tournament (The Good Judgment Project) involving tens of thousands of ordinary people...
"Superforecasting is the rare book that is both scholarly and engaging. The lessons are scientific, compelling, and enormously practical..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :)
All the best, Hannah & Sebastian