Data Science Weekly - Issue 159
Issue #159 Dec 8 2016
Editor Picks
The major advancements in Deep Learning in 2016
Deep Learning has been the core topic in the Machine Learning community the last couple of years and 2016 was not the exception. In this article, we will go through the advancements we think have contributed the most (or have the potential) to move the field forward and how organizations and the community are making sure that these powerful technologies are going to be used in a way that is beneficial for all...
Four Experiments in Handwriting with a Neural Network
We’ll start with a fun one that tries to predict your strokes as you write...
Six maps that show the anatomy of America’s vast infrastructure
Trump’s plan to invest about $550 billion in new infrastructure projects across the country was a central theme in his campaign. The maps you are about to see show the massive scope of America’s infrastructure using data from OpenStreetMap and various government sources. They provide a glimpse into where that half-trillion dollars may be invested...
A Message from this week's Sponsor:
Strata + Hadoop World 2017 price goes up soon
The Best Price for Strata + Hadoop World 2017 (happening March 13-16 in San Jose) ends next week. Check out the program and register by midnight PT on December 16 to save up to $400. Use code PCDSWEEKLY for an extra 20% off most passes.
Data Science Articles & Videos
Shoddy Medication? Search Engines May Already Know
People voice concerns about drugs via search queries, and that data can predict when one might be recalled...
World's longest pub crawl: Team plots route between 25,000 UK boozers
Two-year project maps shortest possible journey to visit thousands of pubs across the country...
Artificial Intelligence Invades the Home … In Toys
The first thing I learned about Cozmo is that it doesn’t like to stay put very long. Roused from slumber, the little robot’s face illuminates, and it begins zooming around the table in front of me. A moment later, it notices I’m watching and turns to greet me, saying my name with a computerized chirp...
Generative Art and Hamiltonian Monte Carlo
[Talking Machines Episode] we talk about Hamiltonian Monte Carlo, we take a listener question about unbalanced data, plus we talk with Doug Eck of Google’s Magenta project...
This AI Boom Will Also Bust
The bottom line here is that while some see this new prediction tech as like a new pipe tech that could improve all pipes, no matter their size, it is actually more like a tech only useful on very large pipes. Just as it would be a waste to force a pipe tech only useful for big pipes onto all pipes, it can be a waste to push advanced prediction tech onto typical prediction tasks. And the fact that this new tech is mainly only useful on rare big problems suggests that its total impact will be limited...
Finding the genre of a song with Deep Learning
A step-by-step guide to make your computer a music expert...
Predicting with confidence: Best machine learning idea you never heard of
One of the disadvantages of machine learning as a discipline is the lack of reasonable confidence intervals on a given prediction. There are all kinds of reasons you might want such a thing, but I think machine learning and data science practitioners are so drunk with newfound powers, they forget where such a thing might be useful...
Diagnosing Disease with a Snapshot
Many genetic conditions come with clues in a person’s face, and new technology can help doctors diagnose them...
Jobs
Data Science Initiative, Scientific Lead - UCSF Library - San Francisco, CA
The UCSF Library’s Data Science Initiative is hiring! We are looking for a biomedical researcher with an entrepreneurial spirit and a passion for programming in R/Python, bioinformatics, data curation, statistics, data visualization - or all of the above - to serve as the Scientific Lead for our Data Science Initiative. We are taking a broad approach to data science, and are looking for someone who will work to identify the data science needs of the UCSF research community, help build a Library-based hub for data science activities, develop programs and events, and teach workshops and classes...
Training & Resources
Tidy Data in Python
In this post, I will summarize some tidying examples Wickham uses in his paper and I will demonstrate how to do so using the Python pandas library...
BOPP: Bayesian Optimization for Probabilistic Programs
BOPP is a package for automated marginal maximum a posteriori inference (MMAP) based around the probabilistic programming system Anglican...
Machine Learning Theory - Part 3: Regularization and the Bias-variance Trade-off
We’ll begin to draw some practical concepts for the process of solving the ML problem. We’ll start by trying to get more intuition about why a more complex hypothesis space is bad...
Books
Guesstimation: Solving the World's Problems on the Back of a Cocktail Napkin
"Zed Shaw has perfected the world's best system for learning Python. Follow it and you will succeed-just like the hundreds of thousands of beginners Zed has taught to date"...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :) - All the best, Hannah & Sebastian