Data Science Weekly - Issue 221
Issue #221 Feb 15 2018
Editor Picks
Building a Deep Neural Net In Google Sheets
I want to show you that Deep Convolutional Neural Nets are not nearly as intimidating as they sound. And I’ll prove it by showing you an implementation of one that I made in Google Sheets...
The Secret to Happiness
When Pebble launched the Happiness App, an experimental app that enables users to record their mood and energy levels throughout the day, nearly 10k users installed the app and started logging their responses! So what makes people happy?...
Introduction to Learning to Trade with Reinforcement Learning
In this post, I’m going to argue that training Reinforcement Learning agents to trade in the financial (and cryptocurrency) markets can be an extremely interesting research problem. I believe that it has not received enough attention from the research community but has the potential to push the state of the art of many related fields. It is quite similar to training agents for multiplayer games such as DotA, and many of the same research problems carry over. Knowing virtually nothing about trading, I have spent the past few months working on a project in this field....
A Message from this week's Sponsor:
Driverless AI - AI to do AI. Free 21-Day Trial.
Driverless AI speeds up data science workflows by automating feature engineering, model tuning, ensembling, and model deployment. Use Driverless AI to avoid common mistakes such as underfitting or overfitting, data leakage, or improper model validation. Try Driverless AI today - request a free 21-day trial.
Data Science Articles & Videos
Color Advice for Data Visualization with D3.js: Using color in a post-category20 world
The field of data visualization is replete with warnings that Color is Hard. But color is powerful. If you don’t feel capable of selecting a color scheme based on the fundamental principles of how humans perceive color, then what makes you think you can select between a hive plot and a Gantt chart? Category20 might be dead but bad color use will never die. So here are some tips for those who want to actually improve their use of color...
When Waze no longer needs humans
When will Waze no longer need people? Currently, Waze users tag vehicles stopped on the side of the road, debris, slowed traffic, and those pesky hidden police. However, the state of the art in computer vision, powered by deep learning, means that it should not be too long before onboard cameras will enable the vehicles themselves to sense and interpret what is happening on the road around them...
Probabilistic Cookies!
In the spirit of Valentine’s Day, we thought it would be fun to bake cookies for our sweethearts. Being DIY nerds, we thought we’d math it up a bit. We used Python to generate probability distributions and matplotlib to check our distributions. Then we wrote a Python function to generate a SCAD file defining three-dimensional shapes from the distributions...
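For readers curious what the distribution-to-SCAD step might look like, here is a minimal sketch using only the Python standard library. It is not the authors' code: the function names (normal_pdf, pdf_outline, to_scad), the choice of a normal distribution, and the scale and height values are all assumptions made for illustration. It samples a density curve, closes the outline along the x-axis, and emits an OpenSCAD polygon extruded into a solid.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def pdf_outline(pdf, x_min, x_max, steps=50):
    """Sample the curve and close the outline along the x-axis."""
    xs = [x_min + (x_max - x_min) * i / steps for i in range(steps + 1)]
    curve = [(x, pdf(x)) for x in xs]
    # Add the two baseline corners so the polygon is a closed shape.
    return [(x_min, 0.0)] + curve + [(x_max, 0.0)]


def to_scad(points, height=5.0, scale_xy=10.0):
    """Render 2D points as an OpenSCAD polygon extruded to `height`."""
    coords = ", ".join(
        f"[{x * scale_xy:.3f}, {y * scale_xy:.3f}]" for x, y in points
    )
    return f"linear_extrude(height = {height}) polygon(points = [{coords}]);\n"


# Hypothetical example: a bell-curve shape spanning three standard deviations.
outline = pdf_outline(normal_pdf, -3.0, 3.0)
scad_source = to_scad(outline)
```

Writing scad_source to a file with a .scad extension should give something OpenSCAD can open and export for 3D printing. Any other density function of one variable can be passed to pdf_outline in place of normal_pdf.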
AI in the court: When algorithms rule on jail time
The centuries-old process of releasing defendants on bail, long the province of judicial discretion, is getting a major assist ... courtesy of artificial intelligence...
As China Marches Forward on A.I., the White House Is Silent
In July, China unveiled a plan to become the world leader in artificial intelligence and create an industry worth $150 billion to its economy by 2030. To technologists working on A.I. in the United States, the statement, which was 28 pages long in its English translation, was a direct challenge to America’s lead in arguably the most important tech research to come along in decades...
Deep Reinforcement Learning Doesn't Work Yet
Deep reinforcement learning is surrounded by mountains and mountains of hype. And for good reasons! Reinforcement learning is an incredibly general paradigm, and in principle, a robust and performant RL system should be great at everything. Merging this paradigm with the empirical power of deep learning is an obvious fit. Deep RL is one of the closest things that looks anything like AGI, and that’s the kind of dream that fuels billions of dollars of funding. Unfortunately, it doesn’t really work yet...
Processing a Trillion Rows Per Second on a Single Machine: How Can Nested Loop Joins be this Fast?
This blog post describes our experience debugging a failing test case caused by a cross join query running “too fast.” Because the root cause of the failing test case spans multiple layers, from Apache Spark to the JVM JIT compiler, we wanted to share our analysis in this post...
From zero to hero: Creating a chatbot with Rasa NLU and Rasa Core
AI assistants are a hot topic these days. Chances are that you have already had an encounter with at least one of them, as a user or as a developer. In this post, I would like to talk about a stack of software called Rasa, which you should definitely include in your toolbox if you would like to build conversational assistants yourself...
Jobs
VP, Data Science - Diply - Toronto
Diply's VP of Data Science, Machine Learning and AI will have the opportunity to build a superstar data science team from the ground up, both setting the strategy and ensuring tactical execution.
You will partner with business stakeholders to identify and prioritize top Data Science and AI opportunities, create business/technical requirements, transform over 50B monthly records of data into scientific models and AI-driven solutions, lead ML strategy and roadmap planning, and build out the data science and AI teams. The ideal leader will combine expert Data Science/AI/ML knowledge with hands-on experience in programming and building algorithms and models, plus outstanding skill in managing teams and delivering complex/critical projects. Media industry experience is an asset...
Training & Resources
Initialize A TensorFlow Variable With NumPy Values
Learn how to initialize a TensorFlow Variable with NumPy values by using TensorFlow's get_variable operation and setting the Variable initializer to the NumPy values, via a screencast video and full tutorial transcript...
autoplotly - One Line of R Code to Build Interactive Visualizations
Automatic generation of interactive visualizations for popular statistical results in ggplot2 and plotly styles...
Samplers, samplers, everywhere...
This notebook aims to provide a basic example of how to run a variety of MCMC and nested sampling codes in Python...
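As a rough illustration of what the simplest MCMC sampler does, here is a random-walk Metropolis sketch in plain Python. It is not taken from the notebook, which covers existing sampling packages. The target density (a standard normal), the function names, and the step size and seed are assumptions chosen for the example, and nested sampling is not shown.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a standard normal (illustrative target)."""
    return -0.5 * x * x


def metropolis(log_prob, n_samples, start=0.0, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional density."""
    rng = random.Random(seed)
    x = start
    lp = log_prob(x)
    samples = []
    for _ in range(n_samples):
        # Propose a move by adding Gaussian noise to the current state.
        proposal = x + rng.gauss(0.0, step)
        lp_new = log_prob(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            x, lp = proposal, lp_new
        # A rejected proposal repeats the current state in the chain.
        samples.append(x)
    return samples


samples = metropolis(log_target, n_samples=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

With enough samples, mean and var should land near 0 and 1, the moments of the target. Dedicated packages add adaptive proposals, multiple chains, and convergence diagnostics on top of this basic accept/reject loop.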
Books
Bit by Bit: Social Research in the Digital Age
"The book goes well beyond "big data" to unpack the possibilities of doing social science research at a massive scale, and relatively inexpensively. This book should be read by social scientists who want to expand their research horizons, data scientists who want to understand how to incorporate the insights of social science, and anyone in a line of work in which they have potential data that can give them insights into how people behave..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.
P.S. Want to reach our audience / fellow readers? Consider sponsoring - grab a spot now; first come, first served! All the best, Hannah & Sebastian