[in case you missed it] Data Science Weekly - Issue 425

Jan 16, 2022

Issue #425, January 13, 2022

Editor Picks

🚩 red flags 🚩 to look out for when considering whether to join a data / tech team
A Twitter Thread...

Navigate Through the Current AI Job Market: A Retrospect
Inspired by the fantastic talk on career paths in AI research by Rosanne Liu and the amazing blog post on landing a job at top-tier AI labs by Aleksa Gordić, I want to share my recent experience to offer a more pragmatic perspective. The position spectrum in the current AI industry can be roughly depicted in the figure below...

NeurIPS Anthology Visualization
The NeurIPS conference has been around for more than 35 years, and interest in the fields of AI/ML is still rapidly growing. A diversification of interests has birthed many sub-fields within the fields, making it harder for novices and senior researchers alike to orient themselves and their work within the historical context of research published at NeurIPS. We created an interactive visualization to investigate the papers from the last 35+ years...

A Message from this week's Sponsor:

Live Webinar | How to Align AI & BI to Business Outcomes

Wednesday, Jan 26 at 2PM ET (11AM PT)

Get practical advice from Global 1000 data leaders at Visa, Cigna, Amazon, and HCL Technologies on how they are aligning AI & BI toward business outcomes at their organizations.

Data Science Articles & Videos

Transformers
Transformer models have become the go-to models for most NLP tasks. Many transformer-based models, such as BERT, RoBERTa, and the GPT series, are considered state-of-the-art in NLP. While NLP is awash with these models, Transformers are also gaining popularity in computer vision... While transformer models are taking over the AI field, it is also important to have a low-level understanding of them. This blog aims to give an understanding of the Transformer and Transformer-based models, including the model components, training details, metrics and loss function, performance, etc...

XManager: A framework for managing machine learning experiments
XManager is a platform for packaging, running and keeping track of machine learning experiments. It currently enables one to launch experiments locally or on Google Cloud Platform (GCP)...

Online public discourse on artificial intelligence and ethics in China: context, content, and implications
The societal and ethical implications of artificial intelligence (AI) have sparked vibrant online discussions in China. This paper analyzed a large sample of these discussions which offered a valuable source for understanding the future trajectory of AI development in China as well as implications for global dialogue on AI governance...

Automated Reinforcement Learning (AutoRL): A Survey and Open Problems
In this survey, we seek to unify the field of AutoRL: we provide a common taxonomy, discuss each area in detail, and pose open problems which would be of interest to researchers going forward...

Medical AI: Why Clinicians Swipe Left
The most common reason your medical AI will be rejected by clinicians and how to overcome it...

The Use and Practice of Scientific Machine Learning [Video]
Scientific machine learning (SciML) methods allow for the automatic discovery of mechanistic models by infusing neural network training into the simulation process. In this talk we will start by showcasing some of the ways that SciML is being used, from discovery of extrapolatory epidemic models to nonlinear mixed effects models in pharmacology. From there, we will discuss some of the increasingly advanced computational techniques behind the training process, focusing on the numerical issues involved in handling differentiation of highly stiff and chaotic systems. The viewers will leave with an understanding of how compiler techniques are being infused into the simulation stack to increasingly automate the process of developing mechanistic models...

Open source projects to contribute [Reddit Discussion]
I'm at the point where I'd like to start contributing in a more meaningful way to the community. Does anyone have ideas for good open source projects related to DL (maybe even classic machine learning) that are looking for contributors?...

Why Google Treats SQL Like Code and You Should Too
For the past 2 years as a vendor working at Google, I’ve been observing the way Data Engineers at Google treat SQL the same way Software Engineers treat code. This winning mentality can be integrated into the data strategy of any company of any size. I’m going to walk through the ways that Google benefits from treating SQL like code and provide specific ways that all organizations can benefit from these principles...

Introduction to variational autoencoders
Overview of the training setup for a variational autoencoder with discrete latents trained with Gumbel-Softmax. By the end of this tutorial, this diagram should make sense!...
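For readers new to the Gumbel-Softmax trick mentioned above, here is a minimal sketch of the sampling step in plain Python. It is not taken from the linked tutorial: the function name, the example logits, and the temperature value are all illustrative assumptions. The idea is to add Gumbel noise to the logits and apply a temperature-scaled softmax, which gives a differentiable relaxation of sampling from a categorical distribution.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature, rng):
    """Draw one relaxed (soft) one-hot sample from a categorical distribution.

    Hypothetical helper for illustration; not the tutorial's own code.
    """
    perturbed = []
    for logit in logits:
        # Clamp u away from 0 so that log(u) stays finite.
        u = max(rng.random(), 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((logit + gumbel_noise) / temperature)

    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    max_value = max(perturbed)
    exps = [math.exp(value - max_value) for value in perturbed]
    total = sum(exps)
    return [value / total for value in exps]


# Example values (assumed): three categories with probabilities 0.1, 0.6, 0.3.
rng = random.Random(0)
logits = [math.log(p) for p in (0.1, 0.6, 0.3)]
sample = gumbel_softmax_sample(logits, temperature=0.5, rng=rng)
```

Lower temperatures push the sample closer to a one-hot vector, while higher temperatures make it closer to uniform. In a real training setup this step would normally be written in an autodiff framework so that gradients can flow through the sample.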

Intro to Probabilistic Programming with PyMC [Video]
In the last ten years, there have been a number of advancements in the study of Hamiltonian Monte Carlo and variational inference algorithms that have enabled effective Bayesian statistical computation for much more complicated models than were previously feasible... This talk will give an introduction to probabilistic programming with PyMC, with a particular emphasis on how open source probabilistic programming makes Bayesian inference algorithms near the frontier of academic research accessible to a wide audience...

Tools*

Free Course: Natural Language Processing (NLP) for Semantic Search

Learn how to build semantic search applications by making machines understand language as people do. This free course covers everything you need to build state-of-the-art language models, from machine translation to question-answering, and more. Brought to you by Pinecone. Start reading now.

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

Jobs

(Senior) Analytics Engineer - Fabulous - Remote

Fabulous is a mobile app helping thousands of people every day to change their lifestyles by integrating healthy habits into their lives. Fabulous is using a behavioral economics lens to help everyone achieve their fullest potential. We work closely with researchers based at Duke University and our advisor is Dan Ariely, author of NYT bestseller Predictably Irrational. We are looking for an experienced Analytics Engineer to consolidate the Data Science team and lead the development and enrichment of our Data Pipelines. We have a modern Data-Stack based on Fivetran, dbt, BigQuery, Amplitude, Metabase...

Want to post a job here? Email us for details >> team@datascienceweekly.org

Training & Resources

PlotNeuralNet: Latex code for making neural networks diagrams
LaTeX code for drawing neural networks for reports and presentations. Have a look at the examples to see how they are made...

Updated version of our free online course "Statistical Learning" is now available on edX
It features new lectures on DL, Survival Analysis and Multiple Testing (with Gareth James & Daniela Witten) in addition to the existing material...

Linear Algebra — Survival Kit for Machine Learning
A quick reference guide of the most common concepts with examples in NumPy...

Books

Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits
Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.

P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian

Data Science Weekly Newsletter