Data Science Weekly - Issue 304
Issue #304 Sept 19 2019
Editor Picks
How Backpropagation Works
Backpropagation is the beating heart of neural networks. Here's how it works...
The inside story of how AI got good enough to dominate Silicon Valley
Interview with Alex Krizhevsky about the story behind AlexNet and the ImageNet competition...
Podcast: Juergen Schmidhuber
Godel Machines, Meta-Learning, and LSTMs
Juergen Schmidhuber is the co-creator of long short-term memory networks (LSTMs) which are used in billions of devices today for speech recognition, translation, and much more. Over 30 years, he has proposed a lot of interesting, out-of-the-box ideas in artificial intelligence including a formal theory of creativity...
A Message from this week's Sponsor:
Data scientists are in demand on Vettery
Vettery is an online hiring marketplace that's changing the way people hire and get hired. Ready for a bold career move? Make a free profile, name your salary, and connect with hiring managers from top employers today.
Data Science Articles & Videos
The A.I. Boom Helped This Data Cleaning Startup Collect $100 Million From Investors
Trifacta, a startup that specializes in cleaning corporate data so it can be analyzed, has raised $100 million in funding, underscoring current investor appetite for data-crunching startups amid the artificial intelligence boom...
Forecaster: A Graph Transformer for Forecasting Spatial and Time-Dependent Data
Spatial and time-dependent data is of interest in many applications. Forecasting such data is difficult due to its complex spatial dependency, long-range temporal dependency, data non-stationarity, and data heterogeneity. To address these challenges, we propose Forecaster, a graph Transformer architecture...
Cloud Text-to-Speech:
Text-to-speech conversion powered by machine learning.
Google Cloud Text-to-Speech converts text into human-like speech in more than 180 voices across 30+ languages and variants. It applies groundbreaking research in speech synthesis (WaveNet) and Google's powerful neural networks to deliver high-fidelity audio. With this easy-to-use API, you can create lifelike interactions with your users that transform customer service, device interaction, and other applications...
A Step Toward Quantifying Independently Reproducible Machine Learning Research
What makes a paper independently reproducible? Debates on reproducibility center around intuition or assumptions but lack empirical results. Our field focuses on releasing code, which is important, but is not sufficient for determining reproducibility. We take the first step toward a quantifiable answer by manually attempting to implement 255 papers published from 1984 until 2017, recording features of each paper, and performing statistical analysis of the results. For each paper, we did not look at the authors' code, if released, in order to prevent bias toward discrepancies between code and paper...
A deep learning system for differential diagnosis of skin diseases
Rash decisions can be made better using a deep learning system. Nice work from Yuan Liu and Peggy Bui and many others on the dermatology research team at Google AI...
Emergent Tool Use from Multi-Agent Interaction
We've observed AIs discovering complex tool use while competing in a simple game of hide-and-seek. They develop a series of six distinct strategies and counterstrategies, ultimately using tools in the environment to break our simulated physics...
Software engineering for machine learning: a case study
The success of ML-centric projects depends heavily on data availability, quality, and management...
101 Data Science Interview Questions, Answers, and Key Concepts
Wanted to share a list of data science interview questions that my team created from top employers like Amazon, Microsoft, and Facebook...
The Simple Process To Get Real World Data Science Experience
You have a good combination of academic background and non-traditional training (Coursera, Udacity, and several other MOOCs) under your belt. Yet, when you look at the requirements for data science jobs, you get discouraged because you feel like you don't have enough, or even any, "real world experience"...
Conference*
DWCC Machine Learning Conference: state of the art - also online
Applied machine learning & real use cases. At the DWCC2019 Conference, engineering meets business and art. Meet experts who implement cutting-edge ML solutions in areas such as finance, medicine, industry, logistics, conversations, design, and art.
The full package of the DWCC Online Conference includes a video pack (with all presentations), the post-conference publication (an extra summary written by the speakers with details such as links to the library, GitHub, or math), and online networking. With code DS_WEEKLY you get a 20% discount.
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!
Jobs
Data Scientist - Crossix - Greater NYC Area
Crossix is the market leader in delivering hard-to-come-by insights that enable healthcare marketers to plan, measure, and optimize their marketing campaigns with confidence. Using our own proprietary technology and network of health and non-health data, our analyses pinpoint the tactics, programs, and channels that improve performance and boost sales, enabling better healthcare communications. And we do it all while protecting consumer privacy.
Crossix is seeking an intellectually curious, resourceful, and collaborative Data Scientist to join our Advanced Analytics team. This is an excellent opportunity to help us build out the technology and data science products that power our business...
Want to post a job here? Email us for details >> team@datascienceweekly.org
Training & Resources
PyWarm: A cleaner way to build neural networks for PyTorch.
PyWarm is a lightweight, high-level neural network construction API for PyTorch. It enables defining all parts of NNs in a functional way...
Bayesian Neural Networks
PyTorch implementations of Bayes By Backprop, MC Dropout, SGLD, the Local Reparametrization Trick, KF-Laplace and more...
Unsupervised Adversarial Training (UAT)
This repository contains the trained model and dataset used for Unsupervised Adversarial Training (UAT) from the paper "Are Labels Required for Improving Adversarial Robustness?"...
Books
The Book of R: A First Course in Programming and Statistics
"The Book of R is a comprehensive, beginner-friendly guide to R, the world’s most popular programming language for statistical analysis. Even if you have no programming experience and little more than a grounding in the basics of mathematics, you’ll find everything you need to begin using R effectively for statistical analysis"...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :)
All the best,
Hannah & Sebastian