Data Science Weekly - Issue 170
Issue #170 Feb 23 2017
Editor Picks
Finding the most depressing Radiohead song with R, using the Spotify and Genius Lyrics APIs
Radiohead has been my favorite band for a while, so I am used to people politely suggesting that I play something “less depressing.” Much of Radiohead’s music is undeniably sad, and this post catalogs my journey to quantify that sadness, concluding in a data-driven determination of their most depressing song...
Neural Network Learns to Synthetically Age Faces, and Make Them Look Younger, Too
Deep-learning machines can make faces look older but often lose their identity in the process. Now computer scientists have solved this problem...
My Journey From Frequentist to Bayesian Statistics
If I had been taught Bayesian modeling before being taught the frequentist paradigm, I'm sure I would have always been a Bayesian...
A Message from this week's Sponsor:
Harness the business power of big data.
How far could you go with the right experience and education? Find out. At Capitol Technology University. Earn your PhD in Management & Decision Sciences, in as little as three years, in convenient online classes. Banking, healthcare, energy and business all rely on insightful analysis. And business analytics spending will grow to $89.6 billion in 2018. This is a tremendous opportunity, and Capitol’s PhD program will prepare you for it. Learn more now.
Data Science Articles & Videos
Tracking my movements on the football pitch with Fitbit
I mostly run and play football, and when it comes to tracking your movements while jogging, my brand new Fitbit Surge does the job almost perfectly. I decided to test its effectiveness on the football field, so I wore it during a game in Paris. Fitbit allows you to export your data in .TCX format. I did so, and then imported it into Google Earth to check whether the GPS was accurate or not...
Feel The Kern - Generating Proportional Fonts with AI
About a year ago I read two blog posts about generating fonts with deep learning; one by Erik Bernhardsson and TJ Torres at StitchFix... So why not just take what Erik and TJ have made and simply use that to generate new fonts? Because their models are lacking something: even though they manage to capture the styles of individual characters very well, they do not incorporate the styling found between pairs of characters, namely the intended spacing between them, known as kerning...
Learning from A.I. Duet
Google Creative Lab just released A.I. Duet, an interactive experiment which lets you play a music duet with the computer. You no longer need code or special equipment to play along with a Magenta music generation model. Just point your browser at A.I. Duet and use your laptop keyboard or a MIDI keyboard to make some music...
Twitter researchers offer clues for why Trump won
Two University of Rochester researchers are out with a new study about why the 2016 Presidential election turned out the way it did. Professor Jiebo Luo and PhD candidate Yu Wang conducted an extensive 14-month study of each candidate’s Twitter followers and arrived at some very interesting results...
Learning to generate one-sentence biographies from Wikidata
We investigate the generation of one-sentence Wikipedia biographies from facts derived from Wikidata slot-value pairs. We train a recurrent neural network sequence-to-sequence model with attention to select facts and generate textual summaries. These automated 1-sentence "biographies" from Wikidata are preferred by readers over Wikipedia's 1st sentence in 40% of cases...
Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation
The popularity of image sharing on social media reflects the important role visual context plays in everyday conversation. In this paper, we present a novel task, Image-Grounded Conversations (IGC), in which natural-sounding conversations are generated about shared photographic images...
Deep Nets Don't Learn Via Memorization
We use empirical methods to argue that deep neural networks (DNNs) do not achieve their performance by memorizing training data, in spite of overly expressive model architectures. Instead, they learn a simple available hypothesis that fits the finite data samples...
Playing SNES in the Retro Learning Environment
Mastering a video game requires skill, tactics and strategy. While these attributes may be acquired naturally by human players, teaching them to a computer program is a far more challenging task. As a result, the Arcade Learning Environment (ALE) has become a commonly used benchmark environment allowing algorithms to train on various Atari 2600 games. In this paper we introduce a new learning environment, the Retro Learning Environment (RLE), that can run games on the Super Nintendo Entertainment System (SNES), Sega Genesis and several other gaming consoles...
Jobs
Data Scientist - SeatGeek - NYC
SeatGeek operates a unique business model in a complicated, opaque market. Many of the hardest problems we face have never been tackled at scale and do not have clear questions, let alone answers. Moving forward requires critical thinking, rapid prototyping, and intellectual dexterity...
Training & Resources
Text mining in R: a tutorial
At the end of this tutorial, you’ll have developed the skills to read in large files with text and derive meaningful insights you can share from that analysis. You’ll have learned how to do text mining in R, an essential data mining tool...
rstudio::conf 2017 session recordings are now available
Whether you missed the conference, missed a talk, or just want to refresh your memory, you can find all the recordings from the first ever conference about All Things R & RStudio...
Experiment with Dask and TensorFlow
This post briefly describes potential interactions between Dask and TensorFlow and then goes through a concrete example using them together for distributed training with a moderately complex architecture...
Books
The Number Sense: How the Mind Creates Mathematics
"A fascinating look at the crossroads where numbers and neurons intersect"...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :) - All the best, Hannah & Sebastian