[in case you missed it] Data Science Weekly - Issue 311

Nov 09, 2019

Issue #311 Nov 7 2019

Editor Picks

Computers Evolve a New Path Toward Human Intelligence
Neural networks that borrow strategies from biology are making profound leaps in their abilities. Is ignoring a goal the best way to make truly intelligent machines?...

History’s message about regulating AI
As we consider artificial intelligence, we would be wise to remember the lessons of earlier technology revolutions—to focus on the technology’s effects rather than chase broad-based fears about the technology itself...

Highlights from the 2019 Google AI Residency Program
The program’s latest installment was our most successful yet, as residents advanced progress in a broad range of research fields, such as machine perception, algorithms and optimization, language understanding, healthcare and many more. Below are a handful of innovative projects from some of this year’s alumni...

A Message from this week's Sponsor:

Introducing Helix: the first dynamic data engine for data science teams

Helix is the first instant responsive data engine that creates a dual backbone of modern business intelligence and interactive data science. Now, self-serve dashboards can be built with a single query. Every report or ad hoc exploration can be visually explored and extended by stakeholders...

Data Science Articles & Videos

AI and Compute
We’re releasing an analysis showing that since 2012, the amount of compute used in the largest AI training runs has been increasing exponentially with a 3.4-month doubling time (by comparison, Moore’s Law had a 2-year doubling period)...
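
To put the two doubling times side by side, here is a minimal sketch of the arithmetic. It assumes clean exponential growth, and the helper name `growth_factor` is ours for illustration, not something from the linked analysis.

```python
def growth_factor(doubling_months: float, elapsed_months: float) -> float:
    """Multiplicative growth after elapsed_months, given a fixed doubling time."""
    return 2.0 ** (elapsed_months / doubling_months)


# Growth over one year under each doubling time quoted in the blurb.
ai_compute_per_year = growth_factor(doubling_months=3.4, elapsed_months=12)
moores_law_per_year = growth_factor(doubling_months=24, elapsed_months=12)

print(f"3.4-month doubling: about {ai_compute_per_year:.1f}x per year")
print(f"2-year doubling:    about {moores_law_per_year:.2f}x per year")
```

Under these assumptions a 3.4-month doubling time compounds to a bit more than a tenfold increase per year, while a 2-year doubling time gives a factor of the square root of two.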

Predicting Airbnb prices with machine learning and location data
A case study using data from the City of Edinburgh, Scotland...

Key challenges for delivering clinical impact with artificial intelligence
Artificial intelligence (AI) research in healthcare is accelerating rapidly, with potential applications being demonstrated across various domains of medicine. However, there are currently limited examples of such techniques being successfully deployed into clinical practice. This article explores the main challenges and limitations of AI in healthcare, and considers the steps required to translate these potentially transformative technologies from research to clinical practice...

UR-FUNNY: A Multimodal Language Dataset for Understanding Humor
The authors used TED Talk transcripts with laughter cues to create a humor dataset that can be used for humor detection and other humor analyses...

The Measure of Intelligence
I've just released a fairly lengthy paper on defining & measuring intelligence, as well as a new AI evaluation dataset, the "Abstraction and Reasoning Corpus". I've been working on this for the past 2 years, on & off...

Releasing Spleeter: Deezer Research source separation engine
Spleeter is an open-source project from Deezer (the French Spotify) that uses Deep Learning to do source separation on musical tracks. Built with Keras and TensorFlow. It runs out-of-the-box on CPU!...

The AI hiring industry is under scrutiny—but it’ll be hard to fix
An artificial-intelligence tool has already been used on over a million applicants. But critics worry that these types of algorithm are trained on limited data and so will be more likely to mark “traditional” applicants (white, male) as more employable...

Fast Transformer Decoding: One Write-Head is All You Need
Multi-head attention layers, as used in the Transformer neural sequence model, are a powerful alternative to RNNs for moving information across and between sequences. While training these layers is generally fast and simple, due to parallelizability across the length of the sequence, incremental inference (where such parallelization is impossible) is often slow, due to the memory-bandwidth cost of repeatedly loading the large "keys" and "values" tensors. We propose a variant called multi-query attention, where the keys and values are shared across all of the different attention "heads", greatly reducing the size of these tensors and hence the memory bandwidth requirements of incremental decoding...
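
A rough way to picture the idea in the abstract above is the pure-Python sketch below of a single decoding step. It is a simplified illustration, not the paper's implementation: the learned projection matrices are omitted, and the function names (`attend`, `multi_head_step`, `multi_query_step`, `cache_floats`) are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    d_v = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]


def multi_head_step(queries, keys_per_head, values_per_head):
    """Standard multi-head attention: every head has its own keys and values."""
    return [attend(q, K, V)
            for q, K, V in zip(queries, keys_per_head, values_per_head)]


def multi_query_step(queries, shared_keys, shared_values):
    """Multi-query attention: heads keep separate queries, share keys/values."""
    return [attend(q, shared_keys, shared_values) for q in queries]


def cache_floats(num_heads, seq_len, d_k, d_v, shared):
    """Number of cached key/value floats reloaded at each decoding step."""
    kv_heads = 1 if shared else num_heads
    return kv_heads * seq_len * (d_k + d_v)


# Toy example: two heads attending over three cached positions.
queries = [[1.0, 0.0], [0.0, 1.0]]
shared_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
shared_values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
outputs = multi_query_step(queries, shared_keys, shared_values)
```

In this sketch the key/value cache shrinks by a factor equal to the number of heads, which is where the reduced memory-bandwidth cost described in the abstract would come from.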

How should a Data Scientist's resume differ from an Academic CV?
Your academic CV is very coursework and research focused. You've heard business resumes need to be more action and results oriented, but you're not sure what that means for you. You're looking for advice on how to rework your academic CV and not finding much out there. To help get you started, here are some thoughts on what you'll need to do...

Webcast*

As more models go to production, managing and monitoring them becomes more onerous. Without proactive monitoring of production models, organizations are exposed to the risk of poor predictions on evolving data affecting their business outcomes. Learn how Domino helps you monitor your models at scale.

Register here to attend or receive the recording.

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

Jobs

Data Scientist - Datadog - NYC

At Datadog, we’re on a mission to build the best monitoring platform in the world. We operate at high scale—trillions of data points per day—and high availability, providing always-on alerting, visualization, and tracing for our customers' infrastructure and applications around the globe.

Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way. We need you to design and build machine learning-powered products that help our customers learn from their data and make better decisions in real-time....

Want to post a job here? Email us for details >> team@datascienceweekly.org

Training & Resources

Knowledge Graphs & NLP @ EMNLP 2019 Part I
The review post of the papers from ACL 2019 on knowledge graphs (KGs) in NLP was well-received so I thought maybe it would be beneficial for the community to look through the proceedings of EMNLP 2019 for the latest state of the art in applying knowledge graphs in NLP. Let’s start!...

Introductory GANs Course
This course covers GAN basics, and also how to use the TF-GAN library to create GANs....

2019 LookML Open-Source State of the Union
With this growth in open-source projects, and little in the way to organize and discover them, we saw a need to put together a comprehensive survey. We presented this overview at JOIN, and now bring it to you in the first (of hopefully many such) LookML Open-Source State of the Union reports...

Books

The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century
An insightful, revealing history of how mathematics transformed our world...

"I have taken courses in statistics, taught it many times and solved several statistical problems that have appeared in journals. But until I read this book, I never really thought about it in so deep and philosophical a manner..."

For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.

P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian

