Data Science Weekly - Issue 610
Curated news, articles and jobs related to Data Science, AI, & Machine Learning
July 31, 2025
Hello!
Once a week, we write this email to share the links we thought were worth sharing in the Data Science, ML, AI, Data Visualization, and ML/Data Engineering worlds.
And now…let's dive into some interesting links from this week.
Editor's Picks
Vibe code is legacy code
We already have a phrase for code that nobody understands: legacy code…Programming is fundamentally theory building, not producing lines of code. We know this. This is why we make fun of business people who try to measure developer productivity in lines of code…When you vibe code, you are incurring tech debt as fast as the LLM can spit it out. Which is why vibe coding is perfect for prototypes and throwaway projects: It's only legacy code if you have to maintain it!…
How to name files
Low-tech common sense about filenames. The holy trinity is:
machine readable
human readable
plays well with default ordering
Lightning talk for NormConf, by Jenny Bryan…
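As a rough illustration of the three properties above, here is a minimal Python sketch. The helper name `make_filename` and the date-plus-slug pattern are assumptions chosen for this example, not something taken from the talk.

```python
import re
from datetime import date


def make_filename(day: date, description: str, ext: str) -> str:
    """Build a name like 2025-07-31_sales-report.csv (hypothetical convention)."""
    # Lowercase, then collapse every run of non-alphanumeric characters into a hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    # ISO 8601 dates (YYYY-MM-DD) sort chronologically under plain string ordering.
    return f"{day.isoformat()}_{slug}.{ext}"


names = [
    make_filename(date(2025, 7, 31), "Sales Report", "csv"),
    make_filename(date(2024, 12, 1), "Sales Report", "csv"),
    make_filename(date(2025, 1, 15), "Model Fit (v2)", "csv"),
]
# Default (alphabetical) ordering is also chronological ordering.
print(sorted(names))
```

The underscore separates the date from the description so a script can split on it, the hyphenated slug stays readable to a person, and the leading ISO date keeps a directory listing in time order.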
Agentic Coding Things That Didn’t Work
Using Claude Code and other agentic coding tools has become all the rage. Not only is it getting millions of downloads, but these tools are also gaining features that help streamline workflows. As you know, I got very excited about agentic coding in May, and I’ve tried many of the new features that have been added. I’ve spent considerable time exploring everything on my plate. But oddly enough, very little of what I attempted I ended up sticking with. Most of my attempts didn’t last, and I thought it might be interesting to share what didn’t work. This doesn’t mean these approaches won’t work or are bad ideas; it just means I didn’t manage to make them work. Maybe there’s something to learn from these failures for others…
Data Science Articles & Videos
You have a drawer with some red socks and some blue socks. You know that if you pull out 2 socks randomly, the probability they're both red is 1/2…Given this information, what's the probability that the first sock you pull is red?…
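One way to explore the puzzle is to enumerate drawers that satisfy the condition. This is a sketch under the assumption that the two socks are drawn without replacement from r red socks out of n total, so that r(r-1) / (n(n-1)) = 1/2. The function name and the search bound are hypothetical, and the linked post may intend a different reading.

```python
from fractions import Fraction


def matching_drawers(max_total: int) -> list[tuple[int, int]]:
    """Return (red, total) pairs where P(both red) is exactly 1/2."""
    found = []
    for total in range(2, max_total + 1):
        # Require at least one blue sock, so red stops at total - 1.
        for red in range(2, total):
            # Integer form of red*(red-1) / (total*(total-1)) == 1/2.
            if 2 * red * (red - 1) == total * (total - 1):
                found.append((red, total))
    return found


for red, total in matching_drawers(200):
    # Under this assumption, P(first sock is red) is simply red / total.
    print(red, total, Fraction(red, total))
```

With a bound of 200 this finds drawers with 3 of 4, 15 of 21, and 85 of 120 red socks, so under this reading the stated condition alone does not pin down a single drawer.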
An Even Easier Introduction to CUDA (Updated)
This post is a super simple introduction to CUDA, the popular parallel computing platform and programming model from NVIDIA. I wrote a previous post, An Easy Introduction to CUDA in 2013 that has been popular over the years. But CUDA programming has gotten easier, and GPUs have gotten much faster, so it’s time for an updated (and even easier) introduction…
ML System Design Case Studies Repository
This repository is a comprehensive collection of 300+ case studies from over 80 leading companies, showcasing practical applications and insights into machine learning (ML) system design. Companies like Netflix, Airbnb, and Doordash have shared their experiences, providing a valuable resource for anyone interested in learning how ML is used to improve products and processes…
Public Perspectives on AI Governance: A Survey of Working Adults in California, Illinois, and New York
This report presents findings from a March 2025 survey examining public attitudes toward specific artificial intelligence (AI) policy objectives among working-class adults in three US states: California, Illinois, and New York. The survey collected responses from 300 participants, measuring their degree of support for 18 specific AI regulatory proposals…
Frequently Asked Questions (And Answers) About AI Evals
A list of every AI evals question ever…
Foundational Data Structures (and Their Weird Cousins) Explained
This article not only provides a clear explanation of the core data structures but also explores their more complex and less-known variants such as B-Trees, Radix Trees, Ropes, Bloom Filters, and Cuckoo Hashing…
What four math books had a big influence on your mathematical thinking? (X)
What four math books had a big influence on your mathematical thinking? I'll start…
TidyBot++: An Open-Source Holonomic Mobile Manipulator for Robot Learning
This paper proposes an open-source design for an inexpensive, robust, and flexible mobile manipulator that can support arbitrary arms, enabling a wide range of real-world household mobile manipulation tasks. Crucially, our design uses powered casters to enable the mobile base to be fully holonomic, able to control all planar degrees of freedom independently and simultaneously…
Animated Maps with {ggplot2} and {gganimate}
In this blog post, we are going to use data from the {gapminder} R package, along with global spatial boundaries from ‘opendatasoft’. We are going to plot the life expectancy of each country in the Americas and animate it to see the changes from 1957 to 2007…
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)…
AdamW Optimizer from Scratch in Python
In this tutorial, we will review the AdamW optimizer, which is currently the state of the art in most deep learning training regimens. AdamW is a variant of Adam which applies regularization in the form of weight decay in a very specific way that makes it much more stable than the regular Adam optimizer. We'll break down where this regularization helps and how it differs from the traditional Adam + L2 regularization…
When calibration beats metrics
Having a classifier with great metrics is good, but it is not enough for it to be useful in production. One reason it might still fail is that you may be dealing with a badly calibrated model. The predictions might be fine, but the probability estimates can be way off. In this video we talk about how to think about calibration and what it means…
Understanding ASTs
Learn about abstract syntax trees (ASTs) and how they are used in codemods. This page covers the basics of ASTs, including what they are and how they are used to represent the structure of code. We'll also discuss how to read and manipulate ASTs in your codemods to automatically refactor your codebase…
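To make the idea concrete, here is a minimal sketch of a codemod built on Python's standard `ast` module. The linked page may target a different language and toolchain; the function names `old_name` and `new_name` and the `RenameCall` class are hypothetical, chosen only for this example.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rewrite calls to old_name(...) into new_name(...) (hypothetical names)."""

    def visit_Call(self, node: ast.Call) -> ast.AST:
        # Visit children first so nested calls are rewritten too.
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id == "old_name":
            node.func = ast.Name(id="new_name", ctx=ast.Load())
        return node


source = "result = old_name(old_name(1), other(2))"
tree = ast.parse(source)            # source text -> tree of nodes
tree = RenameCall().visit(tree)     # transform the tree in place
ast.fix_missing_locations(tree)     # fill in line/column info for new nodes
rewritten = ast.unparse(tree)       # tree -> source text (Python 3.9+)
print(rewritten)
```

Working on the tree rather than on raw text means the rename only touches actual calls to `old_name`, not a string or comment that happens to contain the same characters.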
Last Week's Newsletter's 3 Most Clicked Links
* Based on unique clicks.
** Find last week's issue #609 here.
Cutting Room Floor
“Understanding Uncertainty”, a course in statistical thinking and data science
Flexflex: A typeface that responds to spatial requirements rather than imposing them
On The Role of Pretrained Language Models in General-Purpose Text Embeddings: A Survey
Whenever you're ready, 2 ways we can help:
Looking to get a job? Check out our “Get A Data Science Job” Course
It is a comprehensive course that teaches you everything related to getting a data science job based on answers to thousands of emails from readers like you. The course has 3 sections: Section 1 covers how to get started, Section 2 covers how to assemble a portfolio to showcase your experience (even if you don’t have any), and Section 3 covers how to write your resume.
Promote yourself/organization to ~68,500 subscribers by sponsoring this newsletter. 30-40% weekly open rate.
Thank you for joining us this week! :)
Stay Data Science-y!
All our best,
Hannah & Sebastian