Data Science Weekly

Sep 02, 2021

Issue #406, September 02, 2021

Editor Picks

Intelligent carpet gives insight into human poses
The sentient Magic Carpet from Aladdin might have a new competitor. While it can’t fly or speak, a new tactile sensing carpet from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) can estimate human poses without using cameras, in a step towards improving self-powered personalized healthcare, smart homes, and gaming...

Google’s New AI Photo Upscaling Tech is Jaw-Dropping
Photo enhancing in movies and TV shows is often ridiculed for being unbelievable, but research in real photo enhancing is actually creeping more and more into the realm of science fiction. Just take a look at Google’s latest AI photo upscaling tech...

The 7 Reasons Most Machine Learning Funds Fail - Marcos Lopez de Prado [Video]
This talk, titled The 7 Reasons Most Machine Learning Funds Fail, looks at the particularly high rate of failure in financial machine learning. The few managers who succeed amass a large number of assets and deliver consistently exceptional performance to their investors. However, that is a rare outcome. This presentation goes over the 7 critical mistakes underlying most financial machine learning failures, based on Marcos López de Prado’s experiences and observations...

A Message from this week's Sponsor:

Get Retool free for up to a year and $160,000 in startup discounts

Why spend so much time on internal tooling, CRUD apps, and dashboards built from scratch? Retool is a 10x faster way to build custom internal tools, and now it's free for early-stage startups to use for up to a year. They've also created a deal book worth $160K in startup discounts to give startups access to the tools they need for great internal tools, for free. Get your discount here.

Data Science Articles & Videos

Jeff Hammerbacher — From data science to biomedicine
Jeff talks about building Facebook's early data team, founding Cloudera, and transitioning into biomedicine with Hammer Lab and Related Sciences...

Building a Vanilla Artificial Neural Network from Scratch (in R)
I’ll be walking through a simple guide for building a “vanilla” ANN from scratch (without using any built-in modules). The type of model we’ll be building will be a regressor since the predictor and response variables are numeric....

What is the secret formula for MLOps success?
Putting one foot before the other on the way to uncovering the secret, I spoke with more than 100 ML professionals to learn about their MLOps journeys. My expedition is far from over, but I saw three ‘signposts’ along the way that helped me better understand what the formula for MLOps success looks like. I hope you find them helpful on your journey too...

Deep Reinforcement Learning at the Edge of the Statistical Precipice
Deep reinforcement learning (RL) algorithms are predominantly evaluated by comparing their relative performance on a large suite of tasks. Most published results on deep RL benchmarks compare point estimates of aggregate performance such as mean and median scores across tasks, ignoring the statistical uncertainty implied by the use of a finite number of training runs. Beginning with the Arcade Learning Environment (ALE), the shift towards computationally demanding benchmarks has led to the practice of evaluating only a small number of runs per task, exacerbating the statistical uncertainty in point estimates. In this paper, we argue that reliable evaluation in the few-run deep RL regime cannot ignore the uncertainty in results without running the risk of slowing down progress in the field...
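To make the point about point estimates concrete, here is a minimal sketch of a percentile bootstrap confidence interval for a mean score over a handful of runs. It illustrates the general idea only and is not necessarily the procedure used in the paper; the function name `bootstrap_mean_ci` and the scores in `run_scores` are hypothetical.

```python
import random


def bootstrap_mean_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Return (mean, lower, upper) using a percentile bootstrap on the mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(scores)
    means = []
    for _ in range(n_resamples):
        # Resample the runs with replacement and record the resampled mean.
        sample = [scores[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(scores) / n, lower, upper


# Hypothetical normalized scores from five training runs of one agent.
run_scores = [0.61, 0.55, 0.72, 0.48, 0.66]
mean, lower, upper = bootstrap_mean_ci(run_scores)
print(f"mean={mean:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

With only five runs the interval is wide relative to the differences often reported between algorithms, which is the kind of uncertainty a bare mean or median hides.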

Multiplying Matrices Without Multiplying
Multiplying matrices is among the most fundamental and compute-intensive operations in machine learning. Consequently, there has been significant work on efficiently approximating matrix multiplies. We introduce a learning-based algorithm for this task that greatly outperforms existing methods. Experiments using hundreds of matrices from diverse domains show that it often runs 100× faster than exact matrix products and 10× faster than current approximate methods...

Nine Tools I Wish I Mastered before My PhD in Machine Learning
Whether you are building a startup or making scientific breakthroughs, these tools will bring your ML pipeline to the next level...

Music Composition with Deep Learning: A Review
Generating a complex work of art such as a musical composition requires exhibiting true creativity that depends on a variety of factors related to the hierarchy of musical language. Music generation has been approached with algorithmic methods and, more recently, with deep learning models that are also used in other fields such as computer vision. In this paper we want to put into context the existing relationships between AI-based music composition models and human musical composition and creativity processes...

11 Short Videos About AI Ethics
I made a playlist of 11 short videos (most are 6-13 mins long) on Ethics in Machine Learning. This is from my ethics lecture in Practical Deep Learning for Coders v4. I thought these short videos would be easier to watch, share, or skip around...

Autoregressive Transformer Decoder in JAX from scratch
This implementation builds a transformer decoder from the ground up. It doesn’t use any higher-level frameworks like Flax, and I have used labml for logging and experiment tracking...

A guide to Knowledge Graphs
A consolidation of notes that briefly but gently introduces knowledge graphs and shines a light on several practical aspects...

Conference*

TransformX Conference: Driving AI from Experimentation to Reality

Join Scale AI for our two-day, virtual conference featuring 100+ speakers and 60+ sessions. We’re bringing together a community of leaders, visionaries, practitioners, and researchers across industries as we explore the shift from research to reality within AI and Machine Learning. Register now to secure your free ticket...

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

Jobs

Senior Data Scientist - TikTok - LA

TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy by offering a home for creative expression and an experience that is genuine, joyful, and positive.
- Generate useful features from large amounts of data
- Apply supervised and unsupervised machine learning techniques, such as linear and logistic regression, decision trees, and k-means clustering
- Develop segmentation models, classification models, propensity models, LTV models, experimental design, optimization models
- Perform statistical analysis such as KPI deep dives, performance marketing efficiency, behavioral clustering, and user journey analytics
- Curate audiences and inform engagement tactics to enable differentiated, relevant marketing touches across channels (social, email, in app, push)
- Synthesize analytics and statistical approaches into easy-to-consume storylines, both visually and verbally, and provide indicated actions for executive audiences
- Capture business requirements for data and analytic solutions and collaborate cross-functionally (XFN) to ensure business requirements align with business needs
- Analyze creatives and surface insights that will help drive engagement and retention
- Support day-to-day collaboration with performance marketing to communicate insights and recommend data informed strategies

Want to post a job here? Email us for details >> team@datascienceweekly.org

Training & Resources

Dask and pandas: There’s No Such Thing as Too Much Data
Do you love pandas, but hate when you reach the limits of your memory or compute resources? Dask gives you the chance to use the pandas API with distributed data and computing. In this article, you’ll learn how it really works, how to use it yourself, and why it’s worth the switch...

CLabel
CLabel is a terminal-based cluster labeling tool that allows you to explore text data interactively and label clusters based on reviewing that data...

Model Training and Gradient Descent for Absolute Beginners
In this article, I will explain what is meant by machine learning, and how gradient descent works. Let's get right into it...
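As a rough illustration of the idea, the following sketch fits a straight line to a few points with batch gradient descent on mean squared error. It is not taken from the linked article; the function name `fit_line`, the learning rate, the step count, and the data are all assumptions chosen for the example.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(f"w={w:.4f}, b={b:.4f}")
```

The learning rate matters: too large a value makes the updates diverge on this data, while a much smaller one would need more steps to reach the same fit.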

Books

Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits
Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.

P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian

