Data Science Weekly - Issue 163
Issue #163 Jan 5 2017
Editor Picks
Best Data Visualization Projects of 2016
I am sure though that there were a lot of good projects this year. Below are my favorites in no particular order...
The Programmer’s Guide to Booking a Plane
About two months ago, I wanted to go on a vacation. I had the hotel more or less picked out, but the transportation was still up in the air. I began by scouring the web for cheap plane tickets, like normal travel folks seem to do. I scoured all of the fancy airlines, but all of their fares were too high for my liking...
The Moving Sofa Problem
The mathematician Leo Moser posed in 1966 the following curious mathematical problem: what is the shape of largest area in the plane that can be moved around a right-angled corner in a two-dimensional hallway of width 1? This question became known as the moving sofa problem, and is still unsolved fifty years after it was first asked...
A Message from this week's Sponsor: Springboard
Springboard launches Data Science Career Track with job guarantee
Springboard has launched the first data science bootcamp to guarantee you a job — or your money back.
Data Science Articles & Videos
The Robotic Grocery Store of the Future Is Here
Swarm robotics, autonomous delivery vehicles, and machine-learned preferences will help deliver your food faster...
White House Special with DJ Patil, US Chief Data Scientist
We went to the White House to interview DJ Patil, the Chief Data Scientist of the United States. DJ talked with us about the relationship between government and Silicon Valley, the White House’s leadership on data ethics, and why the first US Chief Data Scientist was actually George Washington...
Amazon Alert
Track prices on Amazon and receive email alerts for price drops...
Quantifying and Visualizing “Deep Work”
One of the best books I read in 2016 is Cal Newport’s “Deep Work”. In his book Cal explains that technology and various social practices have eroded our capacity to work without distractions and that we need to find ways to spend more time doing what he calls “Deep Work”: long stretches of time of uninterrupted full-focus work. For the end of 2016 I decided to do a more systematic analysis than usual and to share it with the world thinking that: (1) you may be interested in doing the same and (2) you may have ideas on how to improve the process...
Which programming languages have the happiest (and angriest) commenters?
It’s officially winter, so what could be better than drinking hot chocolate while querying the new Stack Overflow dataset in BigQuery? It has every Stack Overflow question, answer, comment, and more — which means endless possibilities of data crunching. Inspired by Felipe Hoffa’s post on how response time varies by tag, I wanted to look at the comments table (53 million rows!)...
Pharma adopts data-science culture in move toward AI
To whom does pharma turn when facing cutting-edge research challenges such as designing algorithms to comb unstructured EHR data hunting for undiagnosed patients? Who helps glean patient insights from digital data streams such as social media? Who is designing the value-based frameworks behind pharma's latest wave of performance-based agreements with payers? These are just some of the tasks being fielded by data scientists, a new kind of insight and analytics professional showing up in the biopharma ranks...
Building a Brazilian Jiu-Jitsu family tree
Here, I will visualize the lineages of hundreds of elite BJJ practitioners drawn from the website BJJ heroes. I feel that lineages are intrinsically interesting; they allow BJJ practitioners, such as myself, to better understand our place in the context of an increasingly popular sport. Moreover, to the extent that style is transmitted through master-student relationships, practitioners who are nearby in a lineage-network are likely to be more stylistically similar than those from distant lineages...
Motordex
A blog post from Justin Chien, a recent Metis SF graduate, on using convolutional neural networks to identify car models, an approach with some interesting real-world applications...
Jobs
Data Scientist - New York Power Authority - White Plains, NY
The New York Power Authority is investing in the smart grid as part of NYPA’s Strategic Vision 2014-2020, which aims to provide both NYPA and New York State with the most advanced grid in the industry and to ensure that the most modern industry solutions are leveraged to deliver capability in key areas. Improved IT data management will increase benefits to customers by providing the State with market-leading management of future technologies, management of enterprise-wide systems providing near-real-time access to information and predictive analysis, improved systems management, and increased system efficiency. We are looking for someone who is very curious and who enjoys diving deep into the material to find an answer to an as-yet-unknown question...
Training & Resources
What I Learned Implementing a Classifier from Scratch in Python
In order to demystify some of the magic behind machine learning algorithms, I decided to implement a simple machine learning algorithm from scratch. I will not be using a library such as scikit-learn which already has many algorithms implemented. Instead, I’ll be writing all of the code in order to have a working binary classifier algorithm. The goal of this exercise is to understand its inner workings...
Python Machine Learning: Scikit-Learn Tutorial
Today’s scikit-learn tutorial will introduce you to the basics of Python machine learning: step by step, it will show you how to use Python and its libraries to explore your data with the help of matplotlib, work with the well-known algorithms KMeans and Support Vector Machines (SVM) to construct models, fit the data to these models, predict values, and validate the models that you have built...
NIPS 2016 Tutorial: Generative Adversarial Networks
This report summarizes the tutorial presented by the author at NIPS 2016 on generative adversarial networks (GANs)...
Books
Weapons of Math Destruction
"A former Wall Street quant sounds an alarm on the mathematical models that pervade modern life — and threaten to rip apart our social fabric"...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :)
All the best, Hannah & Sebastian