Data Science Weekly - Issue 650
Curated news, articles and jobs related to Data Science, AI, & Machine Learning
Issue #650
May 07, 2026
Hello!
Once a week, we write this email to share the links we thought were worth sharing in the Data Science, ML, AI, Data Visualization, and ML/Data Engineering worlds.
And now…let’s dive into some interesting links from this week.
Editor's Picks
Notes from inside China’s AI labs
Lessons from my trip to talk to most of the leading AI labs in China…
Learning & Exploring Survival Analysis Part 1 - A Note To Myself
A note to myself on survival analysis: KM curves, log-rank tests & Cox models 🧮 If I wrote it the way I understood it, maybe I’ll actually remember it 🤞…
Two Cartography Expert Review MOVIE MAPS
John Nelson and Peter Attwood review maps in films. What they get right, what they might get wrong, and what we think of them as cartography nerds. Includes thoughts on: Indiana Jones, The Lord of the Rings, Avatar, War Games, Prometheus, Harry Potter, The Muppets, Pirates of the Caribbean, Game of Thrones, Moonrise Kingdom, The Goonies, and The Emperor’s New Groove…
Data Science Articles & Videos
Bad Weather and the Subway
I’ve been looking at hourly ridership data from the New York City Subway. Last time we learned that people go to work in the morning and come home in the evening, for example. (All together now: “Only in New York, baby!”) Today, we’ll learn that bad weather makes people stay at home. Except, sometimes it doesn’t…
A decade of being an average Data Scientist! My personal experience. [Reddit]
Hello! I know there’s people here with PhDs, working in FAANG, on top of the newest tech, and are absolutely brilliant Data Scientists. I’m not one of them…I just wanted to share my positive experience from someone who is painfully average lol!! I wanted to show people, especially new grads and/or people pivoting into the field, that you don’t have to be the smartest person in the room to get hired. You need to drill into the solid foundations and have a drive to make change/bring value to a company…
The World Inside Neural Networks
How neural geometry will unlock understanding and control of AI…If we understand how a model carves up and represents the world conceptually (i.e., its ontology), this will unlock a far deeper understanding of both its algorithms that operate over that ontology and the intelligent behaviors produced by those algorithms. We thus need new methods to gain that understanding; this series of posts details our early efforts to develop them, building upon – and alongside – numerous related efforts from others…
Quantum Machine Learning: The Pragmatic Guide for classical ML Engineers
Part 1 of the “Quantum ML for Engineers” series: From Transformers and GPUs to QPUs and Hybrid Intelligence…
Bringing OpenTelemetry to R in production
Posit has instrumented Shiny, plumber2, mirai, httr2, ellmer, knitr, testthat, and DBI with OpenTelemetry, and created tools for you to instrument your own packages, bringing production-grade observability to R…
Data 100: Principles and Techniques of Data Science UC Berkeley, Spring 2026
This intermediate level class bridges between Data 8 and upper division computer science and statistics courses as well as methods courses in other fields. In this class, we explore key areas of data science including question formulation, data collection and cleaning, visualization, statistical inference, predictive modeling, and decision making. Through a strong emphasis on data centric computing, quantitative critical thinking, and exploratory data analysis, this class covers key principles and techniques of data science…
Inverse Probability Weighting: From Survey Sampling to Evidence Estimation
We consider the class of inverse probability weight (IPW) estimators, including the popular Horvitz–Thompson and Hájek estimators used routinely in survey sampling, causal inference and for Bayesian computation. We focus on the ‘weak paradoxes’ for these estimators due to two counterexamples by Basu (1988) and Wasserman (2004) and investigate the two natural Bayesian answers to this problem: one based on binning and smoothing: a ‘Bayesian sieve’ and the other based on a conjugate hierarchical model that allows borrowing information via exchangeability…
The Magic of In-Context Learning (ICL): When Your Model Already Knows Your Data
As an experienced data scientist, you have seen thousands of datasets in your career. When confronted with new data, your natural neural network (a.k.a. brain) simply draws on this vast library of past mathematical shapes and immediately recognizes the pattern. But what if an artificial neural network could do exactly the same thing? What if it could predict your data without actually being trained on it?…Welcome to the mind-bending world of In-Context Learning (ICL) for tabular data, brought to R via the incredible new TabPFN package (on CRAN)…
Three ways to differentiate ReLU
When a function is not differentiable in the classical sense there are multiple ways to compute a generalized derivative. This post will look at three generalizations of the classical derivative, each applied to the ReLU (rectified linear unit) function. The ReLU function is a commonly used activation function for neural networks. It’s also called the ramp function for obvious reasons…
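For readers who want a quick feel for why ReLU needs a generalized derivative at all, here is a small sketch of our own. It is not taken from the linked post and does not claim to reproduce its three approaches. It only shows that the left and right slopes of ReLU disagree at 0, and that a subgradient is one standard way to pick a value there. The function names and the `at_zero` parameter are hypothetical, chosen for illustration.

```python
from typing import Tuple


def relu(x: float) -> float:
    """Ramp function: max(0, x)."""
    return x if x > 0 else 0.0


def one_sided_slopes(x: float, h: float = 1e-6) -> Tuple[float, float]:
    """Left and right difference quotients of relu at x (step size h)."""
    left = (relu(x) - relu(x - h)) / h
    right = (relu(x + h) - relu(x)) / h
    return left, right


def relu_subgradient(x: float, at_zero: float = 0.0) -> float:
    """Return one valid subgradient of relu at x.

    Away from 0 this is the ordinary derivative (0 or 1). At exactly 0,
    every value in [0, 1] is a subgradient; `at_zero` is an illustrative
    knob for choosing one of them.
    """
    if not 0.0 <= at_zero <= 1.0:
        raise ValueError("at_zero must lie in [0, 1]")
    if x > 0:
        return 1.0
    if x < 0:
        return 0.0
    return at_zero
```

Calling `one_sided_slopes(0.0)` gives a left slope of 0 and a right slope of 1, which is exactly why the classical derivative does not exist at that point, while `relu_subgradient(0.0, at_zero=0.5)` simply picks a value between the two.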
You Don’t Need to Learn All the Weights on tabular data: The Case for rvflnet (a nonlinear expressive glmnet) on regression, classification and survival analysis
Random Vector Functional Link (RVFL) networks offer a simple yet powerful alternative to traditional neural networks for tabular data. Instead of learning hidden layers through backpropagation, RVFL generates them randomly (or not, if using a deterministic sequence of quasi-random numbers) and focuses all learning effort on a final, regularized linear model…
Comparing R’s {targets} and dbt for Data Engineering
I’m getting more and more into data engineering these days and having used R for a long time, I’m seeing a lot of problems that look nail-shaped to my R-shaped hammer. The available tools to solve those problems exist for (presumably) very good reasons, so I wanted to take some time to dig into how to use them and compare their workflows to what I would otherwise naively do in R…
DS market is kind of insane right now [Reddit]
So here’s the story: another team in my company opened an associate-level DS role last week, we got 300+ applications, and somehow there were 30+ senior-level guys applying for it. Not fake senior either. Like actually senior all with 10+ yoe…Curious whether other people & teams are seeing the same thing, or is this just a weird sample on our side?…
Visualizing History: The Polish System
For the Polish educator Antoni Jażwiński, history was best represented by an abstract grid — or at least it was for the purposes of remembering it. The so-called “Polish System” originated in the 1820s and was later brought to public attention in the 1830s and 1840s by General Józef Bem, a military engineer with a penchant for mnemonics…
Last Week's Newsletter's 3 Most Clicked Links
* Based on unique clicks.
** Please take a look at last week's issue #649 here.
Whenever you're ready, 3 ways we can help:
Go deeper each week (paid subscription)
Get 3 additional posts per week designed to help you:
Statistics → understand the math behind ML
AI Agents → build with modern AI tools
Career → become more valuable at your job
Looking to get a job?
A practical guide to landing your first (or next) data science role, based on thousands of reader questions.
👉 Check out our “Get A Data Science Job” Course
Promote your organization/project/event to ~68,500 subscribers
Sponsor this newsletter and reach a highly engaged data science audience (30–35% open rate).
👉 Reply to this email to learn more
Thank you for joining us this week! :)
Stay Data Science-y!
All our best,
Hannah & Sebastian


