[in case you missed it] Data Science Weekly - Issue 312
Issue #312 Nov 14 2019
Editor Picks
PhDs: the tortuous truth
Nature’s survey of more than 6,000 graduate students reveals the turbulent nature of doctoral research...
Scientists have found a way to decode brain signals into speech
It’s a step towards a system that would let people send texts straight from their brains....
Neurons spike back
This article retraces the history of artificial intelligence through the lens of the tension between symbolic and connectionist approaches. From a social history of science and technology perspective, it seeks to highlight how researchers, relying on the availability of massive data and the multiplication of computing power, have undertaken to reformulate the symbolic AI project by reviving the spirit of adaptive and inductive machines dating back to the era of cybernetics...
A Message from this week's Sponsor:
Data scientists are in demand on Vettery
Vettery is an online hiring marketplace that's changing the way people hire and get hired. Ready for a bold career move? Make a free profile, name your salary, and connect with hiring managers from top employers today.
Data Science Articles & Videos
Engineering Uncertainty Estimation in Neural Networks for Time Series Prediction at Uber
In this article, we introduce a new end-to-end Bayesian neural network (BNN) architecture that more accurately forecasts time series predictions and uncertainty estimations at scale. We also discuss how Uber has successfully applied this model to large-scale time series anomaly detection, enabling us to better accommodate rider demand during high-traffic intervals...
Neutrinos Lead to Unexpected Discovery in Basic Math
Three physicists wanted to calculate how neutrinos change. They ended up discovering an unexpected relationship between some of the most ubiquitous objects in math...
Exploiting GAN Internal Capacity for High-Quality Reconstruction of Natural Images
We propose to exploit the representation in intermediate layers of the generator, and we show that this leads to increased capacity. In particular, we observe that the representation after the first dense layer, present in all state-of-the-art GAN models, is expressive enough to represent natural images with high visual fidelity. It is possible to interpolate around these images obtaining a sequence of new plausible synthetic images that cannot be generated from the latent space...
Detecting Glaucoma Using 3D Convolutional Neural Network of Raw SD-OCT Optic Nerve Scans
We propose developing and validating a three-dimensional (3D) deep learning system using the entire unprocessed OCT optic nerve volumes to distinguish true glaucoma from normals in order to discover any additional imaging biomarkers within the cube through saliency mapping. The algorithm has been validated against 4 additional distinct datasets from different countries using multimodal test results to define glaucoma rather than just the OCT alone...
The never-ending issues around AI and bias. Who’s to blame when AI goes wrong?
We’ve seen it before, we’re seeing it again now with the recent Apple and Goldman Sachs alleged credit card bias issue, and we’ll very likely continue seeing it well into 2020 and beyond. Bias in AI is there, it’s usually hidden (until it comes out), and it needs a foundational fix...
How to turn the complex mathematics of vector calculus into simple pictures
Feynman diagrams revolutionized particle physics. Now mathematicians want to do the same for vector calculus...
Bridging the Patient-Physician Gap with ML and Expert Systems w/ Xavier Amatriain
With the goal of providing the world’s best primary care to patients via their smartphone, Xavier turned to machine learning and AI to bring down costs and make Curai accessible and scalable. In our conversation, we touch on the shortcomings of traditional primary care, how Curai fills that role, and some of the unique challenges his team faces in applying this use case in the healthcare space. We also discuss the use of expert systems, how they develop and train these systems with synthetic data through noise injection, and how NLP projects like BERT, Transformer, and GPT-2 fit into what Curai is building...
How 20th Century Fox uses ML to predict a movie audience
Understanding the market segmentation of the movie-going public is a core function of movie studios. Over the years, studios have invested in high-level data processes to try to map out customer segments, and to make predictions for future films. However, to date, granular predictions at the segment level, not to mention at the customer level, have remained elusive because of technological and institutional barriers...
How To Start A Data Science Project When You Are A Beginner
You know you should have some data science projects on your resume/portfolio to show what you know. The only problem is that although you've taken some intro courses at your school, gone through some MOOCs, and read a few blog posts, when you look at other people's work you think it's out of your league...
Training*
Create D3 Data Visualizations As Fast As You Can Sketch
You need to create a D3.js data visualization to communicate your insights. But... #d3BrokeAndMadeArt! This time, your data join appears to have broken and the JavaScript console shows an error you don't recognize. Last time, you got stuck trying to figure out how to make axes that didn't look like 3rd graders made them. It makes you want to strangle D3 with your bare hands. Just how steep does the D3 learning curve need to be?!
What if you could learn and master D3 quickly and deeply?
Great news! - You can ... Check out DashingD3js.com Screencasts today!
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!
Jobs
Data Scientist - Driven Brands - Charlotte, NC
The Data Scientist for Driven Brands will be responsible for providing reliable marketing, media and promotional performance analysis and reporting to Senior Executives and Business Unit Management to be used to make decisions impacting the performance of the business...
Want to post a job here? Email us for details >> team@datascienceweekly.org
Training & Resources
XLM-R: State-of-the-art cross-lingual understanding through self-supervision
A new model, called XLM-R, that uses self-supervised training techniques to achieve state-of-the-art performance in cross-lingual understanding, a task in which a model is trained in one language and then used with other languages without additional training data. Our model improves upon previous multilingual approaches by incorporating more training data and languages — including so-called low-resource languages, which lack extensive labeled and unlabeled data sets...
All you need to know about text preprocessing for NLP and Machine Learning
I thought of shedding some light around what text preprocessing really is, the different methods of text preprocessing, and a way to estimate how much preprocessing you may need...
New R Support in Azure Machine Learning
A new R package, azuremlsdk (available to install from GitHub now, and from CRAN soon), provides the interface to the Azure Machine Learning service...
Books
The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century
An insightful, revealing history of how mathematics transformed our world...
"I have taken courses in statistics, taught it many times and solved several statistical problems that have appeared in journals. But until I read this book, I never really thought about it in so deep and philosophical a manner..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian