Data Science Weekly - Issue 303
Issue #303 Sept 12 2019
Editor Picks
Need Some Fashion Advice? Just Ask the Algorithm
Stitch Fix is launching a new feature, driven by machine learning, that builds an outfit to suit your personal style...
An important quantum algorithm may actually be a property of nature
Evidence that quantum searches are an ordinary feature of electron behavior may explain the genetic code, one of the greatest puzzles in biology...
Dungeon crawling or lucid dreaming?
Playing a dungeon game where a neural net is DMing has a lot in common with lucid dreaming...
A Message from this week's Sponsor:
40% off at Manning
Do more with your data! If you're looking to make your data skills stand out, then be sure to check out Manning's range of books and video courses.
They're offering 40% off everything in their catalog, so there's no better time to learn something new...
Data Science Articles & Videos
AI-first biology
In this post, I explain why biology is experiencing its "AI moment". That is to say, many areas of biology, and in particular imaging, are being significantly transformed by the use of AI. I walk through several examples of AI-first imaging analysis in biology through the lens of technical developments in computer vision. Finally, I make the case for building full-stack solutions in biology, using the pharmaceutical industry as an example. I present an analogy for reasoning about where value accrues in AI-first biology...
Why American Workers Need to Be Protected From Automation
As President, I would issue a robot tax for corporations displacing humans, and create a federal agency to oversee automation...
Generative Dog Images
Experiment with creating puppy pics...
Three-D Safari: Learning to Estimate Zebra Pose, Shape, and Texture from Images "In the Wild"
Our new work in 3D vision towards animal conservation: Automatic 3D digitization of Grevy's zebra, one of the most endangered species in Africa...
DeepPrivacy: A Generative Adversarial Network for Face Anonymization
DeepPrivacy is a fully automatic anonymization technique for images. The DeepPrivacy GAN never sees any privacy-sensitive information, ensuring a fully anonymized image. It utilizes bounding box annotation to identify the privacy-sensitive area, and sparse pose information to guide the network in difficult scenarios...
AI thinks this flood photo is a toilet. Fixing that could improve disaster response.
A new data set aims to teach computer vision systems to recognize images from disasters...
Dances with HiPPOs
Decision making, especially at technology companies, is supposed to be data-driven. Unfortunately, even in this wondrous age of science, decisions often depend on what Avinash Kaushik and Ronny Kohavi call the Highest Paid Person’s Opinion, or HiPPO. How should a data-driven developer deal with the HiPPO in the room?...
Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation
Neural networks in NLP are vulnerable to adversarially crafted inputs. We show that they can be trained to become certifiably robust against input perturbations such as typos and synonym substitution in text classification...
Get Up To Speed Fast As A Junior Data Scientist
You are a new junior data scientist and you want to get started the right way. You want to make sure you don't make the same mistakes others have made early in their data scientist careers because you want to prove to your employers that they made the right choice. As such, you need to figure out how to get up to speed as fast as possible...
Webinar*
Join this technical webinar, as Domino Chief Data Scientist Josh Poduska will dive into popular open source AutoML tools such as auto-sklearn, TPOT, MLBox, and AutoKeras. Register here.
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!
Jobs
Data Scientist - Crossix - Greater NYC Area
Crossix is the market leader in delivering hard-to-come-by insights that enable healthcare marketers to plan, measure, and optimize their marketing campaigns with confidence. Using our own proprietary technology and network of health and non-health data, our analyses pinpoint the tactics, programs, and channels that improve performance and boost sales, enabling better healthcare communications. And we do it all while protecting consumer privacy.
Crossix is seeking an intellectually curious, resourceful, and collaborative Data Scientist to join our Advanced Analytics team. This is an excellent opportunity to help us build out the technology and data science products that power our business...
Want to post a job here? Email us for details >> team@datascienceweekly.org
Training & Resources
Document Embedding Techniques
A review of notable literature on the topic...
Larq
Larq is an open-source deep learning library for training neural networks with extremely low precision weights and activations, such as Binarized Neural Networks (BNNs)...
Introducing LCA: Loss Change Allocation for Neural Network Training
Uber AI Labs proposes Loss Change Allocation (LCA), a new method that provides a rich window into the neural network training process...
Books
The Book of R: A First Course in Programming and Statistics
"The Book of R is a comprehensive, beginner-friendly guide to R, the world’s most popular programming language for statistical analysis. Even if you have no programming experience and little more than a grounding in the basics of mathematics, you’ll find everything you need to begin using R effectively for statistical analysis"...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :)
All the best,
Hannah & Sebastian