Data Science Weekly - Issue 169
Issue #169 Feb 16 2017
Editor Picks
Were there more notable deaths than expected in 2016?
After exploring my study population of Wikipedia deaths, I want to analyse the time series of monthly counts of notable deaths. This is not a random interest of mine: my PhD thesis was about monitoring time series of counts, the application being the weekly number of reported cases of various diseases...
I ranked every Intro to Data Science course on the internet, based on thousands of data points
A year ago, I dropped out of one of the best computer science programs in Canada. I started creating my own data science master’s program using online resources. I realized that I could learn everything I needed through edX, Coursera, and Udacity instead. And I could learn it faster, more efficiently, and for a fraction of the cost...
Building a deep learning DOOM bot
This article is the first in a series of posts that will focus on an exploratory journey of reinforcement based Deep Learning utilizing the VizDoom platform. In terms of goals, my destination is the creation of a Doom AI capable of thriving in a Deathmatch environment (woohoo killer AI)...
A Message from this week's Sponsor:
Get hired as a data scientist with 1-on-1 mentorship from an expert
In data science, one size does not fit all. That's why Thinkful's 1-on-1 mentorship is so critical. Thinkful's Flexible Data Science Bootcamp teaches you Python, data analysis, and machine learning methods through real-world projects. Start learning 1-on-1 with an experienced data scientist, without quitting your day job.
Data Science Articles & Videos
Predicting gentrification using longitudinal census data
The objective of this research is to take a first step in exploring the feasibility of forecasting neighborhood change using longitudinal census data in 29 Legacy Cities...
Neural Network Learns to Select Potential Anticancer Drugs
Scientists from Mail.Ru Group, Insilico Medicine and MIPT have for the first time applied a generative neural network to create new pharmaceutical medicines with the desired characteristics. By using Generative Adversarial Networks (GANs) developed and trained to "invent" new molecular structures, there may soon be a dramatic reduction in the time and cost of searching for substances with potential medicinal properties...
Music Composition with LSTMs
For my final project at Metis Data Science, I designed a recurrent neural network utilizing Long Short-Term Memory nodes (LSTMs) to learn patterns in the Six Cello Suites by J.S. Bach, and subsequently generate its own musical fragments...
Trump2Cash
This bot watches Donald Trump's tweets and waits for him to mention any publicly traded companies. When he does, it uses sentiment analysis to determine whether his opinions are positive or negative toward those companies. The bot then automatically executes trades on the relevant stocks according to the expected market reaction. It also tweets out a summary of its findings in real time...
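The pipeline described above (spot a company mention, score the sentiment, pick a trade direction) can be sketched roughly as follows. This is a toy illustration only, not the Trump2Cash code: the word lists, the `TICKERS` table, and the function names are all hypothetical, and a simple word-count score stands in for real sentiment analysis.

```python
import re

# Hypothetical lookup tables for illustration; not taken from the Trump2Cash project.
POSITIVE = {"great", "amazing", "thank", "jobs"}
NEGATIVE = {"bad", "terrible", "unfair", "overpriced"}
TICKERS = {"boeing": "BA", "ford": "F", "toyota": "TM"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment(tokens):
    """Crude score: positive-word count minus negative-word count."""
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def decide(tweet):
    """Return (ticker, action) pairs for every known company mentioned in the tweet."""
    tokens = tokenize(tweet)
    score = sentiment(tokens)
    action = "buy" if score > 0 else "sell" if score < 0 else "hold"
    # dict.fromkeys removes repeated mentions while keeping their order.
    mentioned = dict.fromkeys(t for t in tokens if t in TICKERS)
    return [(TICKERS[name], action) for name in mentioned]


if __name__ == "__main__":
    print(decide("Boeing costs are out of control, terrible!"))
```

A real system would need a proper sentiment model, entity resolution for company names, and a brokerage API; none of those are shown here.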
'AI brain scans' reveal what happens inside machine learning
Bristol, UK-based Graphcore has used its AI processing units and software to create maps of what happens during a machine learning process...
Data Selfie
A new Chrome extension reveals the unsettling amount of information Facebook might have on you...
Fueling the Gold Rush: The Greatest Public Datasets for AI
Most people in AI forget that the hardest part of building a new AI solution or product is not the AI or algorithms — it’s the data collection and labeling. Standard datasets can be used as validation or a good starting point for building a more tailored solution...
Duplicate Question Detection with Deep Learning on Quora Dataset
Quora recently announced the first public dataset that they ever released. It includes 404351 question pairs with a label column indicating whether they are duplicates or not. In this post, I would like to investigate this dataset and at least propose a baseline method with deep learning...
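For a sense of what a duplicate-detection baseline on question pairs looks like, here is a minimal sketch that uses word-overlap (Jaccard) similarity. It is a deliberately simpler technique than the deep learning baseline the post proposes, and the function names and the 0.5 threshold are assumptions chosen for illustration.

```python
import re


def tokens(question):
    """Lowercase the question and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", question.lower()))


def jaccard(q1, q2):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    a, b = tokens(q1), tokens(q2)
    if not a and not b:
        return 0.0  # two empty questions: define similarity as zero
    return len(a & b) / len(a | b)


def is_duplicate(q1, q2, threshold=0.5):
    """Flag a pair as duplicate when the overlap reaches the (assumed) threshold."""
    return jaccard(q1, q2) >= threshold


if __name__ == "__main__":
    print(is_duplicate("How do I learn Python?", "How do I learn Python quickly?"))
```

Word overlap ignores word order and synonyms, which is exactly the gap a learned model is meant to close; it is mainly useful as a floor to compare against.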
Jobs
Data Scientist - SeatGeek - NYC
SeatGeek operates a unique business model in a complicated, opaque market. Many of the hardest problems we face have never been tackled at scale and do not have clear questions, let alone answers. Moving forward requires critical thinking, rapid prototyping, and intellectual dexterity...
Training & Resources
6 Deep Learning Applications a beginner can build in minutes (using Python)
Through this article, I will showcase 6 such applications, which might look difficult at the outset but can be achieved with a Deep Learning implementation in less than an hour...
A Simple Choropleth w/ Tangram & Leaflet
I’ll take you through the process of creating a map using Tangram...
R for Excel Users
Excel users have a strong mental model of how data analysis works, and this makes learning to program more difficult. However, learning to program will allow you to do things that you can't do easily in Excel, and it really is worth the pain of learning the new model...
Books
The Number Sense: How the Mind Creates Mathematics
"A fascinating look at the crossroads where numbers and neurons intersect"...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :)
All the best, Hannah & Sebastian