Data Science Weekly - Issue 255
Issue #255 Oct 11 2018
Editor Picks
The ML Engineering Loop
In this article, we’ll describe our conception of the “OODA Loop” of ML: the ML Engineering Loop, where ML Engineers iteratively analyze, select an approach, implement, and measure in order to rapidly and efficiently discover the best models and adapt to the unknown. In addition, we will give concrete tips for each of these phases, as well as for optimizing the process as a whole...
The hacker's guide to uncertainty estimates
It started with a tweet: "New years resolution: every plot I make during 2018 will contain uncertainty estimates". Nine months in and I have learned a lot...
A look at how we built the Emoji Scavenger Hunt using TensorFlow.js
In this post we’ll discuss the inner workings of the experimental game, Emoji Scavenger Hunt. We’ll show you how we used TensorFlow to train a custom model for object recognition and how we use that model on the web front-end with TensorFlow.js...
A Message from this week's Sponsor:
Move ahead of your peers in the most in-demand field
Business Analytics at Clark University will give you the skills employers demand by teaching you how to synthesize data into powerful information. Whether you enroll in a full- or part-time master’s or accelerated certificate program, you will be equipped to transform data into something meaningful.
You don’t need a background in statistics or science to succeed here. We offer:
Blended curriculum
Career-ready courses
Generous scholarships
Get started. Learn more at clarku.edu/analytics
Data Science Articles & Videos
Entropy is a measure of uncertainty
Eight properties, several examples and one theorem...
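For readers who want a concrete handle on the idea before clicking through, here is a minimal sketch of Shannon entropy for a discrete distribution, measured in bits. It is not taken from the linked article; the function name `shannon_entropy` and the example distributions are our own assumptions, chosen only to show that a more even distribution is more uncertain.

```python
import math


def shannon_entropy(probs):
    """Return the Shannon entropy, in bits, of a discrete distribution.

    `probs` is a sequence of probabilities that must be non-negative
    and sum to 1. (Hypothetical helper, for illustration only.)
    """
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Outcomes with p == 0 contribute nothing (the limit of p*log(p) is 0).
    return sum(-p * math.log2(p) for p in probs if p > 0)


fair_coin = shannon_entropy([0.5, 0.5])    # maximally uncertain two-way outcome
biased_coin = shannon_entropy([0.9, 0.1])  # less uncertain
sure_thing = shannon_entropy([1.0, 0.0])   # no uncertainty at all
```

A fair coin carries one full bit of uncertainty, a biased coin carries less, and a certain outcome carries none.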
YouTube Trending Videos Analysis
YouTube is the most popular and most used video platform in the world today. YouTube has a list of trending videos that is updated constantly. Here we will use Python with some packages like Pandas and Matplotlib to analyze a dataset that was collected over 205 days...
Multiple Comparisons in Induction Algorithms
A little-known paper explaining how the multiple comparisons problem underlies several important data science problems...
RecSys 2018: recommender systems that care!
Summary of RecSys2018...
Reinforcement Learning for Improving Agent Design
In many reinforcement learning tasks, the goal is to learn a policy to manipulate an agent, whose design is fixed, to maximize some notion of cumulative reward. The design of the agent’s physical structure is rarely optimized for the task at hand. In this work, we explore the possibility of learning a version of the agent’s design that is better suited for its task, jointly with the policy...
Deep Learning just tipped into Exascale Territory
Today, researchers from Berkeley Lab and Oak Ridge, along with development partners at Nvidia, demonstrated some rather remarkable results using deep learning to extract weather patterns based on existing high-res climate simulation data. This places the collaboration in the running for this year’s Gordon Bell Prize, an annual award based on high performance, efficient use of real-world applications that can scale on some of the world’s most powerful supercomputers...
SOTAWHAT - A script to keep track of state-of-the-art AI research
I often get frustrated searching for the latest research results on Google and Arxiv so I wrote SOTAwhat, a script to query Arxiv for the latest abstracts and extract summaries from them...
Holodeck
Holodeck is a high-fidelity simulator for reinforcement learning built on top of Unreal Engine 4, with an OpenAI-Gym interface...
Jobs
Data Scientist - Pear Therapeutics - San Francisco or Boston
At Pear Therapeutics, we have the privilege of building the world’s first-ever class of prescription digital therapeutics. By nature of our therapeutics as digital applications, we have access to rich datasets and unique opportunities to drive clinical outcomes. As a Data Scientist, you will be responsible for shaping and delivering data-driven insights. We are looking for data scientists with a deep product sense, who have an innate curiosity, and are eager to dive into large, complex datasets and create actionable insights...
Training & Resources
How To Define A Sequential Neural Network Container In PyTorch
Learn how to use PyTorch's nn.Sequential and add_module operations to define a sequential neural network container, via a screencast video and full tutorial transcript...
LensKit
Today, we're presenting the next generation of LensKit: Python tools for recsys experiments...
Communication Primitives in Deep Learning Frameworks
For large-scale training of neural networks you need to specify how the work is broken up across machines. This is where communication primitives come into play. There are three common approaches used in modern frameworks: MPI-collectives, task-based, and computation-graph-based...
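To make the collective style a little more tangible, here is a rough sketch of what an all-reduce does in data-parallel training: every worker contributes its local gradient, and every worker ends up with the same element-wise sum. This is a single-process simulation under our own assumptions, not code from the linked article and not a real MPI call; the name `allreduce_sum` and the toy gradients are hypothetical.

```python
def allreduce_sum(local_vectors):
    """Simulate an all-reduce (sum) across workers in one process.

    `local_vectors` holds one equal-length list of numbers per worker.
    Returns one list per worker, each containing the element-wise sum
    of all inputs. (Hypothetical helper; a real system would exchange
    these values over the network.)
    """
    if not local_vectors:
        raise ValueError("need at least one worker")
    length = len(local_vectors[0])
    if any(len(vec) != length for vec in local_vectors):
        raise ValueError("all workers must hold vectors of equal length")
    # Reduce step: combine the i-th element from every worker.
    reduced = [sum(column) for column in zip(*local_vectors)]
    # Broadcast step: every worker receives its own copy of the result.
    return [list(reduced) for _ in local_vectors]


# Three workers, each with a locally computed two-element gradient.
worker_grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
synced = allreduce_sum(worker_grads)

# Dividing by the worker count gives the averaged gradient that each
# worker would then apply to its copy of the model.
averaged = [[g / len(worker_grads) for g in vec] for vec in synced]
```

After the call, all three workers hold identical values, which is what keeps their model replicas in step.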
Books
Data Visualization with Python and JavaScript: Scrape, Clean, Explore & Transform Your Data
Learn how to turn raw data into rich, interactive web visualizations with the powerful combination of Python and JavaScript. With this hands-on guide, author Kyran Dale teaches you how to build a basic dataviz toolchain with best-of-breed Python and JavaScript libraries (including Scrapy, Matplotlib, Pandas, Flask, and D3) for crafting engaging, browser-based visualizations...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Want to reach our audience / fellow readers? Consider sponsoring: grab a spot now; first come, first served!
All the best,
Hannah & Sebastian