Data Science Weekly - Issue 186
Issue #186, June 15, 2017
Editor Picks
How Data Science Helps Power Worldwide Delivery of Netflix Content
In this post, we introduce some of the challenges in the content-delivery space where our data science and engineering teams collaborate to optimize the Netflix service...
An Algorithm Summarizes Lengthy Text Surprisingly Well
Training software to accurately sum up information in documents could have great impact in many fields, such as medicine, law, and scientific research...
Roman Roads
It’s finally done. A subway-style diagram of the major Roman roads, based on the Empire of ca. 125 AD. Creating this required far more research than I had expected—there is not a single consistent source that was particularly good for this...
A Message from this week's Sponsor:
Harness the business power of big data.
How far could you go with the right experience and education? Find out at Capitol Technology University. Earn your PhD in Management & Decision Sciences in as little as three years, in convenient online classes. Banking, healthcare, energy and business all rely on insightful analysis. And business analytics spending will grow to $89.6 billion in 2018. This is a tremendous opportunity, and Capitol’s PhD program will prepare you for it. Learn more now.
Data Science Articles & Videos
Work in progress: Portraits of Imaginary People
For a while now I’ve been experimenting with ways to use generative neural nets to make portraits. Early experiments were based on deepdream-like approaches using backprop to the image but lately I’ve focused on GANs...
Building Dot Density Maps with UK Census Data in R
So I thought I’d try to tap into the burgeoning world of #rstats and see if I could make a version of the ethnic dot map for my recent city of residence, London...
Robot Uses Deep Learning and Big Data to Write and Play its Own Music
Compositions created using database of well-known pop, classical and jazz artists...
Marching neural network: Visualizing level surfaces of a neural network with raymarching
In this demonstration you can play with a simple neural network in 3 spatial dimensions and visualize the functions the network produces (those are quite interesting despite the simplicity of the network; just click the 'randomize weights' button several times)...
An Adversarial Review of “Adversarial Generation of Natural Language”
I’ve been vocal on Twitter about a deep-learning paper on language generation titled “Adversarial Generation of Natural Language” from the MILA group at the University of Montreal (I didn’t like it), and was asked to explain why. Some suggested that I write a blog post. So here it is...
Self-Normalizing Neural Networks
Deep Learning has revolutionized vision via convolutional neural networks (CNNs) and natural language processing via recurrent neural networks (RNNs). However, success stories of Deep Learning with standard feed-forward neural networks (FNNs) are rare. FNNs that perform well are typically shallow and, therefore, cannot exploit many levels of abstract representations. We introduce self-normalizing neural networks (SNNs) to enable high-level abstract representations...
TextureGAN: Controlling Deep Image Synthesis with Texture Patches
In this paper, we investigate deep image synthesis guided by sketch, color, and texture. Previous image synthesis methods can be controlled by sketch and color strokes, but we are the first to examine texture control. We allow a user to place a texture patch on a sketch at an arbitrary location and scale to control the desired output texture...
We analyzed thousands of technical interviews on everything from language to code style. Here’s what we found
If you’re reading this post, there’s a decent chance that you’re about to re-enter the crazy and scary world of technical interviewing...
Jobs
Data Scientist - FactSet - NYC
FactSet is a financial data and software company headquartered in Norwalk, CT with offices in 35 locations worldwide. As a global provider of financial information and analytics, FactSet helps the world’s best investment professionals outperform. FactSet was ranked #89 on FORTUNE’s “100 Best Places to Work” list in 2016 and has consistently been recognized as a great workplace by leading publications. This role is to design, build and deploy machine learning models for intelligent trade automation...
Training & Resources
Visual Question Answering in Pytorch
We developed this code in the frame of a research paper called MUTAN: Multimodal Tucker Fusion for VQA which is (as far as we know) the current state-of-the-art on the VQA-1 dataset...
Counting Objects with Faster R-CNN
Below you can find a description of different approaches, common problems, challenges and the latest solutions in the field of object counting with neural networks. As a proof of concept, an existing Faster R-CNN model will be used to count objects on the street, with video examples given at the end of the post...
Optimizing Python in the Real World: NumPy, Numba, and the NUFFT
Too often, tutorials about optimizing Python use trivial or toy examples which may not map well to the real world. I've certainly been guilty of this myself. Here, I'm going to take a different route: in this post I will outline the process of understanding, implementing, and optimizing a non-trivial algorithm in Python, in this case the Non-uniform Fast Fourier Transform (NUFFT)...
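For readers who have not met the non-uniform Fourier transform before, here is a minimal pure-Python sketch of the direct sum that a NUFFT is designed to approximate quickly. It is not code from the linked post: the function name `ndft_direct`, the sign convention in the exponent, and the choice of `M` integer frequencies centered on zero are all assumptions made for illustration.

```python
import cmath

def ndft_direct(x, c, M):
    """Direct non-uniform DFT (hypothetical helper, O(N * M) work).

    x: sample locations (not necessarily evenly spaced)
    c: complex or real sample values, same length as x
    M: number of integer frequencies, centered on zero

    Returns F_k = sum_j c_j * exp(1j * k * x_j) for each frequency k.
    """
    # Assumed frequency grid: -(M // 2), ..., M - M // 2 - 1
    freqs = range(-(M // 2), M - M // 2)
    return [
        sum(cj * cmath.exp(1j * k * xj) for xj, cj in zip(x, c))
        for k in freqs
    ]

# Two samples at irregular positions, four output frequencies (-2, -1, 0, 1)
F = ndft_direct([0.0, 1.0], [1.0, 2.0], 4)
```

A slow reference like this is mainly useful as a correctness check when a faster, vectorized or compiled version is being developed.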
Books
Probability - A Beginner's Guide To Permutations And Combinations: The Classic Equations, Better Explained
"The focus of this book is on understanding why the permutation and combination equations are what they are, which ends up making them a lot easier to understand, remember, and expand than simply memorizing the equations"...
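As a quick illustration of the two standard equations the book's title refers to, here is a small sketch. It is not taken from the book; the function names `count_permutations` and `count_combinations` are hypothetical, and the formulas are the usual textbook ones: n! / (n - k)! ordered selections, then divide by k! to ignore order.

```python
from math import factorial

def count_permutations(n: int, k: int) -> int:
    """Ordered selections of k items from n distinct items: n! / (n - k)!"""
    return factorial(n) // factorial(n - k)

def count_combinations(n: int, k: int) -> int:
    """Unordered selections: each group of k items appears in k! orders,
    so divide the permutation count by k!."""
    return count_permutations(n, k) // factorial(k)

# Choosing 3 of 5 items: 5 * 4 * 3 ordered ways, 6 orderings per group
ordered = count_permutations(5, 3)    # 60
unordered = count_combinations(5, 3)  # 10
```

Writing combinations in terms of permutations makes the relationship between the two equations explicit: the only difference is the division by the number of ways to reorder each selection.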
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Looking to hire a Data Scientist? Find an awesome one among our readers! Email us for details on how to post your job :) - All the best, Hannah & Sebastian