[in case you missed it] Data Science Weekly - Issue 369
Issue #369 Dec 17 2020
Editor Picks
How to Talk When a Machine is Listening:
Corporate Disclosure in the Age of AI
This paper analyzes how corporate disclosure has been reshaped by machine processors, employed by algorithmic traders, robot investment advisors, and quantitative analysts. Our findings indicate that increasing machine and AI readership, proxied by machine downloads, motivates firms to prepare filings that are more friendly to machine parsing and processing. Moreover, firms with high expected machine downloads manage textual sentiment and audio emotion in ways catered to machine and AI readers, such as by differentially avoiding words that are perceived as negative by computational algorithms as compared to those by human readers, and by exhibiting speech emotion favored by machine learning software processors...
Production Machine Learning Monitoring: Outliers, Drift, Explainers & Statistical Performance
A practical deep dive on production monitoring architectures for machine learning at scale using real-time metrics, outlier detectors, drift detectors, metrics servers and explainers...
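To make the idea of a drift detector concrete, here is a minimal sketch that is not taken from the linked talk. It assumes drift is measured on a single numeric feature by comparing a reference sample against a live sample with the two-sample Kolmogorov-Smirnov statistic. The function names and the 0.3 threshold are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right


def ks_statistic(reference, live):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the empirical CDFs of the two samples."""
    ref = sorted(reference)
    cur = sorted(live)
    max_gap = 0.0
    # The largest gap is always reached at one of the observed values.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


def drift_detected(reference, live, threshold=0.3):
    # The threshold is an illustrative assumption, not a calibrated value.
    return ks_statistic(reference, live) > threshold


# Reference data seen at training time, and a shifted live sample.
training_values = [i / 100 for i in range(100)]
shifted_values = [x + 10 for x in training_values]
print(drift_detected(training_values, training_values))
print(drift_detected(training_values, shifted_values))
```

In practice a detector like this would run per feature on a sliding window of recent requests, with the threshold derived from a significance level rather than fixed by hand.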
Imitating Interactive Intelligence
Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. This setting nevertheless integrates a number of the central challenges of artificial intelligence (AI) research: complex visual perception and goal-directed physical control, grounded language comprehension and production, and multi-agent social interaction...we approximate the role of the human with another learned agent, and use ideas from inverse reinforcement learning to reduce the disparities between human-human and agent-agent interactive behaviour...our results in this virtual environment provide evidence that large-scale human behavioural imitation is a promising tool to create intelligent, interactive agents, and the challenge of reliably evaluating such agents is possible to surmount...
A Message from this week's Sponsor:
Data scientists are in demand on Vettery
Vettery is an online platform that connects you with thousands of actively hiring startups and Fortune 500 companies. Create a free profile, name your salary, and get discovered by hiring managers looking to grow their teams.
Get started - it’s completely free for job-seekers!
Data Science Articles & Videos
The depth-to-width interplay in self-attention
In our recent NeurIPS paper, we theoretically predict a width-dependent transition between depth-efficiency and depth-inefficiency in self-attention networks...We conduct extensive empirical ablations that clearly reveal the theoretically predicted behaviors, and provide explicit quantitative suggestions regarding the optimal depth-to-width allocation for a given self-attention network size...Informed guidelines for increasing depth and width in tandem have boosted performance of convolutional networks...The race towards beyond 1-Trillion parameter language models renders such guidelines an essential ingredient in the case of self-attention. Our guidelines elucidate the depth-to-width tradeoff in self-attention networks of sizes up to the scale of GPT3 (which is too deep for its size), and beyond...
NeurIPS 2020 Papers: Takeaways for a Deep Learning Engineer
Advances in Deep Learning research are of great utility for a Deep Learning engineer working on real-world problems, as most Deep Learning research is empirical, with validation of new techniques and theories done on datasets that closely resemble real-world datasets/tasks...Therefore, I went through all the titles of NeurIPS 2020 papers (more than 1900!), read the abstracts of 175 papers, and extracted insights relevant to DL engineers from the following papers...
AI Picture Restorer
Restore pictures with AI in seconds. Hotpot builds on the latest research to automatically remove scratches, sharpen colors, and enhance faces, transforming tattered photos into cherished memories. Color photos with faded parts are also repairable...
Unsupervised Cross-lingual Representation Learning for Speech Recognition
This paper presents XLSR which learns cross-lingual speech representations by pretraining a single model from the raw waveform of speech in multiple languages. We build on wav2vec 2.0 which is trained by solving a contrastive task over masked latent speech representations and jointly learns a quantization of the latents shared across languages. The resulting model is fine-tuned on labeled data and experiments show that cross-lingual pretraining significantly outperforms monolingual pretraining... Analysis shows that the latent discrete speech representations are shared across languages with increased sharing for related languages. We hope to catalyze research in low-resource speech understanding by releasing XLSR-53, a large model pretrained in 53 languages...
The History of Data Exchange
IBM and General Electric invented the first databases in the early 1960s. It was only by the early 1970s that enough data had accumulated in databases that the need to transfer data between databases emerged. Enter the Comma Separated Values (CSV) file format, supported by the IBM Fortran compiler in 1972. Dump the contents of a table to a CSV, import it into another database. Sound familiar? That's because it's still the most common method of data distribution today...
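The dump-and-import workflow described above can be sketched in a few lines. This is a minimal illustration using Python's standard `csv` and `sqlite3` modules with two in-memory databases. The table name, column names, and helper functions are hypothetical and do not come from the linked article.

```python
import csv
import io
import sqlite3


def dump_table_to_csv(conn, table):
    """Export every row of a table as CSV text, header row first."""
    cursor = conn.execute(f"SELECT * FROM {table}")
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return buffer.getvalue()


def load_csv_into_table(conn, table, csv_text):
    """Create a table from the CSV header and insert the remaining rows."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE {table} ({columns})")
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", reader)


# Hypothetical source database with one small table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
source.executemany("INSERT INTO readings VALUES (?, ?)", [("a", 1.5), ("b", 2.0)])

# Dump from the source, import into a separate target database.
target = sqlite3.connect(":memory:")
load_csv_into_table(target, "readings", dump_table_to_csv(source, "readings"))
rows = target.execute("SELECT sensor, value FROM readings").fetchall()
print(rows)
```

Note that the numeric column comes back as text in the target: CSV carries no type information, which is one reason the format is convenient to produce but lossy to consume.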
Neural ODEs with PyTorch Lightning and TorchDyn
Effortless, Scalable Training of Neural Differential Equations: Traditional neural network models are composed of a finite number of layers. Neural Differential Equations (NDEs), a core model class of the so-called continuous-depth learning framework, challenge this notion by defining forward inference passes as the solution of an initial value problem...TorchDyn, part of the broader DiffEqML software ecosystem, offers an intuitive access-point to model design for continuous-depth learning. The library follows core design ideals driving the success of modern deep learning frameworks; namely modular, object-oriented and with a focus on GPUs and batched operations...
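To illustrate what "forward inference as the solution of an initial value problem" means, here is a minimal plain-Python sketch. It does not use TorchDyn or its API. The vector field below is a fixed linear map standing in for a learned network, and the fixed-step Euler solver is the simplest possible choice. All names and parameters are assumptions made for illustration.

```python
import math


def vector_field(z, weight=-1.0):
    # Stand-in for a learned network f_theta(z); here just dz/dt = weight * z.
    return [weight * zi for zi in z]


def odeint_euler(f, z0, t0=0.0, t1=1.0, steps=1000):
    """Integrate dz/dt = f(z) from t0 to t1 with fixed-step Euler.

    Each step z <- z + dt * f(z) has the same form as a residual layer,
    so a continuous-depth model is the limit of many small layers.
    """
    z = list(z0)
    dt = (t1 - t0) / steps
    for _ in range(steps):
        dz = f(z)
        z = [zi + dt * dzi for zi, dzi in zip(z, dz)]
    return z


# The "forward pass": the input is the initial condition z(0),
# and the output is the state z(1).
z1 = odeint_euler(vector_field, [1.0, 2.0])
exact = [math.exp(-1.0), 2.0 * math.exp(-1.0)]
print(z1, exact)
```

For this linear field the exact solution is z(1) = z(0) * exp(-1), so the printed pair shows how closely the discretized forward pass tracks the continuous one. Real libraries replace the Euler loop with adaptive solvers and differentiate through it for training.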
Giving more tools to software engineers: the reorganization of the factory
It's a popular attitude among developers to rant about our tools and how broken things are. Maybe I'm an optimistic person, because my viewpoint is the complete opposite! I had my first job as a software engineer in 1999, and in the last two decades I've seen software engineering changing in ways that have made us orders of magnitude more productive...
Object-based attention for spatio-temporal reasoning: Outperforming neuro-symbolic models with flexible distributed architectures
Neural networks have achieved success in a wide array of perceptual tasks, but it is often stated that they are incapable of solving tasks that require higher-level reasoning. Two new task domains, CLEVRER and CATER, have recently been developed to focus on reasoning, as opposed to perception, in the context of spatio-temporal interactions between objects. Initial experiments on these domains found that neuro-symbolic approaches...substantially outperform fully-learned distributed networks...we show on the contrary that a fully-learned neural network with the right inductive biases can perform substantially better than all previous neural-symbolic models on both of these tasks, particularly on questions that most emphasize reasoning over perception...
Forgetting in Deep Learning
Neural network models suffer from the phenomenon of catastrophic forgetting: a model can drastically lose its generalization ability on a task after being trained on a new task...For realistic applications of deep learning, where continual learning can be crucial, catastrophic forgetting would need to be avoided. However, there is only limited study about catastrophic forgetting and its underlying causes. In this project, we will explore how commonly used deep learning methods mitigate or exacerbate the degree of forgetting (e.g. batch-norm, dropout, data augmentation, weight decay, etc.). Further, we would like to select one or several methods and try to learn about the cause of effects...
Training*
Become a Leading Data Scientist
The Data Incubator Spring 2021 Data Science Fellowship Is Now Open!
The Data Incubator isn't your ordinary data science bootcamp—it's immersive. Work with live code, real data sets, and experienced instructors to master the skills you need to succeed in the business world.
Attend full-time or part-time and take advantage of our career services. Attend now and pay later with Income Sharing Agreements.
Apply early to increase your chances of earning a scholarship. Early applications are due January 15, 2021.
Apply Now
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!
Jobs
Data Scientist - Apple Pay Analytics - NYC
You will play a key role improving the Apple Pay product experience. As a member of the analytics team you will be supporting a product function. You will partner with business owners, understand goals, craft KPIs and measure ongoing performance. You will initially engage with the product and engineering teams in ensuring that we have the appropriate instrumentation in place to deliver on these metrics. You will subsequently use advanced statistical, ML and analytical techniques to analyze product performance and identify key insights that inform product improvements and business strategy. The role requires a high degree of independence, ownership and collaboration working cross functionally across all levels of a highly matrixed organization...
Want to post a job here? Email us for details >> team@datascienceweekly.org
Training & Resources
Simulating the Pandemic in Python
A Step-by-Step Data Science Project: In real life, there are hundreds of factors that affect how fast a contagion spreads, both from person to person and on a broader population-wide scale. I’m no epidemiologist but I’ve done my best to set up a fairly basic simulation that can mimic how a virus can infect people and spread throughout a population...In my program, I will be using object-based programming. With this method, we could theoretically customize individual people and add in more events and factors — such as more complicated social dynamics...
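A simulation of this kind can be sketched with one small class per person. The version below is not the linked author's code. It is a minimal illustration assuming a susceptible/infected/recovered model with random daily contacts, and every parameter value (population size, contacts per day, transmission probability, recovery time) is a hypothetical default.

```python
import random


class Person:
    """One individual, tracked by infection state and days spent infected."""

    def __init__(self):
        self.state = "susceptible"
        self.days_infected = 0


def simulate(population=200, initial_infected=5, contacts_per_day=5,
             p_transmit=0.1, recovery_days=7, days=60, seed=0):
    rng = random.Random(seed)
    people = [Person() for _ in range(population)]
    for person in people[:initial_infected]:
        person.state = "infected"

    history = []
    for _ in range(days):
        # Snapshot the carriers so people infected today start spreading tomorrow.
        carriers = [p for p in people if p.state == "infected"]
        for carrier in carriers:
            for other in rng.sample(people, contacts_per_day):
                if other.state == "susceptible" and rng.random() < p_transmit:
                    other.state = "infected"
            carrier.days_infected += 1
            if carrier.days_infected >= recovery_days:
                carrier.state = "recovered"

        # Record the daily count for each state.
        counts = {"susceptible": 0, "infected": 0, "recovered": 0}
        for person in people:
            counts[person.state] += 1
        history.append(counts)
    return history


history = simulate()
print(history[-1])
```

Because each person is an object, extra attributes such as age or household could be added to `Person` and used in the contact loop without restructuring the simulation.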
Full Stack Deep Learning Course [Free, YouTube Videos]
Full Stack Deep Learning helps you bridge the gap from training machine learning models in a notebook to deploying AI systems in the real world...
Qlib
Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment...It contains the full ML pipeline of data processing, model training, back-testing; and covers the entire chain of quantitative investment: alpha seeking, risk modeling, portfolio optimization, and order execution...With Qlib, users can easily try ideas to create better Quant investment strategies...
Books
Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits
Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian