Data Science Weekly - Issue 655
Curated news, articles and jobs related to Data Science, AI, & Machine Learning
Issue #655
June 11, 2026
Hello!
Once a week, we write this email to share the links we thought were worth sharing in the Data Science, ML, AI, Data Visualization, and ML/Data Engineering worlds.
And now…let’s dive into some interesting links from this week.
Editor's Picks
An Overview of Modern AI Robotics from First Principles
There is a deceptively simple way to describe what physical AI is all about, a way that anyone with a STEM background will intuitively understand. Like all other AI models, a model which controls a robot is also a function. It takes in observations (camera pixels, joint angles, the felt resistance of a gripper, etc.) and it outputs actions, the next set of positions and torques for its motors…If you’ve ever trained a model that maps inputs to outputs, you can already grasp the shape of the problem. The interesting part is what happens when you take this familiar shape and drop it into a moving, active world…This sounds like ordinary machine learning, and for a while you can pretend it is. But robotics introduces a third axis that classic ML never had to respect: inference time…
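A rough way to picture the "function plus inference time" framing is the Python sketch below. It is not taken from the linked article: `run_control_loop`, `toy_policy`, and the fixed time budget per step are hypothetical names and assumptions chosen purely for illustration.

```python
import time
from typing import Callable, List, Sequence, Tuple

Observation = Sequence[float]  # e.g. joint angles (assumed representation)
Action = List[float]           # e.g. target torques (assumed representation)


def toy_policy(obs: Observation) -> Action:
    """Hypothetical stand-in for a learned model: a proportional pull to zero."""
    return [-0.5 * value for value in obs]


def run_control_loop(
    policy: Callable[[Observation], Action],
    read_obs: Callable[[], Observation],
    steps: int,
    period_s: float,
) -> List[Tuple[Action, float, bool]]:
    """Call the policy once per step and record whether it met its time budget."""
    log = []
    for _ in range(steps):
        obs = read_obs()
        start = time.perf_counter()
        action = policy(obs)  # observations in, actions out
        latency = time.perf_counter() - start
        # The "third axis": a correct action that arrives late is still a problem.
        log.append((action, latency, latency <= period_s))
    return log


if __name__ == "__main__":
    for action, latency, on_time in run_control_loop(
        toy_policy, lambda: [1.0, -2.0], steps=3, period_s=0.02
    ):
        print(action, f"{latency * 1e6:.1f} us", "on time" if on_time else "late")
```

The loop only measures latency against the budget. What a real system does when the budget is missed is a design decision this sketch does not attempt to cover.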
My unvarnished guide to solution engineering
Nowadays I feel more or less comfortable interacting with customers. But I was awful at first. I know because one of the cofounders gave me harsh feedback after a call with our first serious customer. I still remember slamming the lid of my computer when we debriefed. What I perceived as harsh feedback at the time turned out to help me grow quickly…I used to be a regular data scientist assigned to internal projects. Talking to prospects and customers got me out of my comfort zone. You owe them a service, and they expect you to deliver something. If something goes wrong they’ll go above your head to your founders, at which point you start feeling the heat. It can be quite harsh. But it can also be rewarding when things go well…
Navier-Stokes fluid simulation explained with Godot game engine
Let me start with the mathematical description of what we will do in this blog post. This description might sound daunting, but don’t worry - we’ll explain everything as we go. Here goes: we will simulate fluid flow by moving a scalar density field through a vector velocity field. We’ll simulate velocity diffusion and advection as well as density diffusion and advection. Then we will add velocity projection with the goal of making the fluid obey the law of mass conservation - which will happen by balancing divergence with a pressure field. We will use bilinear interpolation and Gauss-Seidel relaxation for approximating values where needed…
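To make that description a little more concrete, here is a minimal NumPy sketch of two of the steps it names: diffusion solved with Gauss-Seidel relaxation, and advection using bilinear interpolation. It is not taken from the linked post, which uses Godot. It follows a common stable-fluids style formulation and assumes a square grid with a one-cell border that is left untouched, velocities measured in cells per unit time, and hypothetical function names `diffuse` and `advect`. The velocity projection step is left out.

```python
import numpy as np


def diffuse(x0, diff, dt, iters=20):
    """Implicit diffusion of a scalar field, solved by Gauss-Seidel relaxation.

    x0 is an (n+2, n+2) array; the outer ring of cells acts as a border and
    is never updated (an assumed, simplified boundary condition).
    """
    n = x0.shape[0] - 2
    a = dt * diff * n * n  # diffusion strength for this time step
    x = x0.copy()
    for _ in range(iters):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                neighbours = x[i - 1, j] + x[i + 1, j] + x[i, j - 1] + x[i, j + 1]
                # Gauss-Seidel: reuse values already updated in this sweep.
                x[i, j] = (x0[i, j] + a * neighbours) / (1 + 4 * a)
    return x


def advect(d0, u, v, dt):
    """Move field d0 through the velocity field (u, v) by tracing backwards."""
    n = d0.shape[0] - 2
    d = d0.copy()
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            # Where did the content of cell (i, j) come from one step ago?
            x = min(max(i - dt * u[i, j], 0.5), n + 0.5)
            y = min(max(j - dt * v[i, j], 0.5), n + 0.5)
            i0, j0 = int(x), int(y)
            s1, t1 = x - i0, y - j0
            s0, t0 = 1.0 - s1, 1.0 - t1
            # Bilinear interpolation between the four surrounding cells.
            d[i, j] = (s0 * (t0 * d0[i0, j0] + t1 * d0[i0, j0 + 1])
                       + s1 * (t0 * d0[i0 + 1, j0] + t1 * d0[i0 + 1, j0 + 1]))
    return d


if __name__ == "__main__":
    density = np.zeros((10, 10))
    density[5, 5] = 1.0
    u = np.ones((10, 10))   # uniform flow along the first axis
    v = np.zeros((10, 10))
    density = diffuse(density, diff=0.1, dt=0.1)
    density = advect(density, u, v, dt=1.0)
    print(np.round(density[1:-1, 1:-1], 3))
```

The plain Python loops keep the Gauss-Seidel update order explicit. A vectorised update would be faster but would compute a Jacobi-style sweep instead.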
Data Science Articles & Videos
The Anti-Scaling Law in Biology, and Why AI Could Make Crowding Worse Before Making Drug Development Better
One of the main reasons for the tech community’s optimism is the scaling law. Once you have demonstrated 0-1, you can do 1-100 much quicker. The internet, social media, and so on…In biology and drug development, I think there is a mirror image, the anti-scaling law. Because of that, here’s my contrarian view: AI could make crowding in drug development worse, before making it better. And that’s my perspective as a genuine believer in the transformative power of AI, and an AI practitioner who used $14,000 worth of AI tokens in the past 2 months…
What is there besides Frequentist and Bayesian stats? [Reddit]
I am wondering whether there are lesser-known statistical paradigms. Like most people, I was first acquainted with the Frequentist framework, and later got introduced to Bayesian stats. I really like the way this made me reconsider some of what I thought were basic assumptions, so now I’m wondering what the next thing could be? Are there any other branches/frameworks which are not as well known?…
Forecasting: Principles and Practice, the Pythonic Way
This textbook is based on Forecasting: Principles and Practice (3rd ed) and is intended to provide a comprehensive introduction to forecasting methods and to present just enough information about each method for readers to be able to use them sensibly. We don’t attempt to give a thorough discussion of the theoretical details behind each method, although the references at the end of each chapter will hopefully fill in many of those details…
The Simplest Learning Machine, Pt.2
In the previous article I outlined the concept of the Simplest Learning Machine. It’s an imaginary algorithm that uses one byte of persistent memory and learns to predict something about a stream of binary events…Can we actually write something like that? How would it work?…One semi-obvious thing we can learn is the rate of positive events in the stream. This would give us some predictive power, as long as that rate is different from 50%. A bit of a stretch to call this “machine learning”, sure, but I’ll get to the questions of usefulness later…
On Training Data for Bio AI Models
As we advance biological foundation models, which lessons from LLM data curation transfer, and which need rethinking?…
Our goodpractice Package Has New Superpowers
The goodpractice package has been recommended by rOpenSci since it was first started just over 10 years ago by Gábor Csárdi. We used to ask our editors to manually run goodpractice on all packages submitted to software peer-review, and then to ask authors to fix any notable issues flagged by the package…We’re really pleased to share that we’ve recently rolled out a host of updates and extensions to the package. These make it both easier to use, and more powerful…I think that is the aspect that I found most surprising: That the use of Claude made our collaboration feel less technical, and therefore somehow even more human. And that gave us the ability to work through 70 pull requests representing over 100 new checks, all ready for everybody to use…
The Warehouse
The Problem
With over 23,000 packages on CRAN alone, finding the right package for your task is overwhelming:
Searching by keywords often misses relevant packages
No easy way to compare similar packages
Quality indicators are scattered or missing
GitHub-only packages are hard to discover
The Warehouse Solution provides:
Function-first search: “estimate serial interval” → find all relevant packages
Quality scores: Automated assessment of tests, documentation, and maintenance
All sources: CRAN, GitHub, Bioconductor in one place
Community reviews: Real user experiences and recommendations
Smart categorization: Browse by what packages actually do…
Robot Learning: From Fundamentals to Foundation Models
This course provides a comprehensive introduction to modern robot learning, combining classical techniques with the latest advances in large-scale models. Students will start by learning the fundamentals of imitation learning, reinforcement learning, and policy optimization, and gradually progress to advanced topics including Vision-Language-Action (VLA) models and foundation models for robotics. The objectives of this course are:
Understand the core principles of imitation learning, reinforcement learning, and policy learning.
Implement basic robot learning systems in simulation and on real robots.
Explore state-of-the-art Vision-Language-Action and foundation models for robotics.
Design and evaluate scalable robot learning pipelines integrating perception, control, and multi-modal reasoning…
ML Job Interviews: The Ultimate Guide
How I found a Research Scientist role after a PhD in Machine Learning…My process was, overall, successful: I received offers from every company I completed interviews with including: DeepMind (which I accepted), Isomorphic Labs, Cohere, Meta, and a startup in stealth. A few caveats to the first claim: Anthropic, Mistral, and TeslaAI got back to me too late and I didn’t complete those processes. ReflectionAI, the one genuine rejection: they didn’t like me for the RS role but switched me to their Engineering track instead…
stata-mpl - Give your matplotlib and seaborn charts the Stata 19 look
Give your matplotlib and seaborn charts the look of Stata 19 (the stcolor scheme, Stata’s colorblind-friendly default). Calibrated against the official SVG files exported by Stata 18/19…
I mounted a tiny microphone on my apartment balcony to listen for any birds passing by and built a site to collage them as they’re heard
…so I’ve thrown together this short writeup for any of you who want to monitor any avian visitors that may be passing by your own place. It’s short and sweet for now in an attempt to get something out quickly, but this work is part of a longer chain of bird-tangent projects I’ll write something up about soon!…
Ask HN: What are tools you have made for yourself since the advent of AI?
Why Academics Should Use AI for Writing: A Case Study
I violently dislike the idea of AI taking over my writing. My writing is my own, and having it done by AI makes the final product lose its soul. Also, whenever I have used AI to write several paragraphs independently (which I admit to doing for bureaucratic tasks) I ended up rewriting most of it anyway. However, over the past year or so I have become increasingly impressed with what AI can do, and rather than talk about this in abstract terms I would like to present you with a concrete demonstration that shifted my opinion a great deal…
Last Week's Newsletter's 3 Most Clicked Links
* Based on unique clicks.
** Please take a look at last week's issue #654 here.
Cutting Room Floor
DiffusionBlocks: Training Neural Networks One Block at a Time
Data from 66,000+ practice sessions. How much does the typical musician actually practice? [Reddit]
What is the most common reason data science projects fail to deliver business value? [Reddit]
Whenever you're ready, 3 ways we can help:
Go deeper each week (paid subscription)
Get 3 additional posts per week designed to help you:
Statistics → understand the math behind ML
AI Agents → build with modern AI tools
Career → become more valuable at your job
Looking to get a job?
A practical guide to landing your first (or next) data science role, based on thousands of reader questions.
👉 Check out our “Get A Data Science Job” Course
Promote your organization/project/event to ~68,500 subscribers
Sponsor this newsletter and reach a highly engaged data science audience (30–35% open rate).
👉 Reply to this email to learn more
Thank you for joining us this week! :)
Stay Data Science-y!
All our best,
Hannah & Sebastian


