[in case you missed it] Data Science Weekly - Issue 252
Issue #252 Sept 20 2018
Editor Picks
Pattern to the Seemingly Random Distribution of Prime Numbers Discovered
Often known as “the building blocks of mathematics,” prime numbers have fascinated mathematicians for centuries due to their highly unpredictable and seemingly random nature. However, a team of researchers at Princeton University has recently discovered a strange pattern in the primes’ chaos. Their novel modelling techniques revealed a surprising similarity between primes and certain naturally occurring crystalline materials...
How DeepMind's biggest AI project is fixing bad Android batteries
Google's Android Pie operating system uses DeepMind's AI in a bid to improve your phone's battery life. But is it making any difference?...
Machine learning — Is the emperor wearing clothes?:
A behind-the-scenes look at how machine learning works
Machine learning uses patterns in data to label things. Sounds magical? The core concepts are actually embarrassingly simple. I say “embarrassingly” because if someone made you think it’s mystical, they should be embarrassed. Here, let me fix that for you...
A Message from this week's Sponsor:
Mode Studio: SQL, Python, R, & charts in one platform
No more jumping between applications. Mode Studio is the analytics toolkit that brings everything together, and gets out of the way. Explore data in our SQL editor, and pass results to integrated Python or R notebooks for deeper exploration and visualization. You can also layer charts over results quickly with built-in visualization tools, and sharing is easy—just send the report URL to teammates when you're ready...
Data Science Articles & Videos
Interview with Jeff Clune, Senior Research Scientist at Uber AI Labs
A wide-ranging interview on AI, research at Uber AI Labs, Jeff's background, and thoughts on the future directions of AI, conducted at the ReWork Conference...
AI for cybersecurity is a hot new thing—and a dangerous gamble
Machine learning and artificial intelligence can help guard against cyberattacks, but hackers can foil security algorithms by targeting the data they train on and the warning flags they look for...
Patent analysis using the Google Patents Public Datasets on BigQuery
Google Patents Public Datasets is a collection of compatible BigQuery database tables from government, research and private companies for conducting statistical analysis of patent data. This is a great starting point if you need to do technical document comparison in ML...
Deep Reinforcement Learning Doesn't Work Yet
Deep reinforcement learning is surrounded by mountains and mountains of hype. And for good reasons! Reinforcement learning is an incredibly general paradigm, and in principle, a robust and performant RL system should be great at everything. Merging this paradigm with the empirical power of deep learning is an obvious fit. Deep RL is one of the closest things that looks anything like AGI, and that’s the kind of dream that fuels billions of dollars of funding. Unfortunately, it doesn’t really work yet...
The Trinity Of Errors In Financial Models:
An Introductory Analysis Using TensorFlow Probability
At Hedged Capital, an AI-first financial trading and advisory firm, we use probabilistic models to trade the financial markets. In this first blog post, we explore three types of errors inherent in all financial models, with a simple example of a model in TensorFlow Probability (TFP)...
Deep learning made easier with transfer learning
In this article, we’re going to look at transfer learning. Rather than developing an entirely customized solution to your problem, transfer learning allows you to transfer knowledge from related problems to help solve your custom problem more easily. By transferring that knowledge, you are taking advantage of the expensive resources that were used to acquire it - training data, hardware, researchers - without incurring the cost. Let’s see how and when this approach is effective...
A foundation for scikit-learn at Inria
This is an exciting turn for us, because it enables us to receive private funding. As a result, we will be able to have secure employment for some existing core contributors, and to hire more people on the team. The goal is to help sustain quality (more frequent releases?) and to tackle some ambitious features...
The Enterprise Starting Point: Data Science and Artificial Intelligence
Article outlining the key differences between RPA and AI, and when each approach should be chosen - giving a concrete example from the insurance value chain...
Jobs
Entry Level Data Scientist - IBM - Multiple locations
Entry-Level Data Scientists extract knowledge or insights from structured or unstructured data. They draw upon the practice of data analysis, using predictive analytics, data mining, pattern recognition, data modeling, machine learning and various statistical methods in order to solve large scale optimization problems and to understand the meaning behind vast data sets.
Entry-Level Data Scientists are in demand across IBM's growth areas. You'll be matched and deployed to a team in a strategic business, based on your offered location and fit...
Training & Resources
tf.random_uniform:
Create TensorFlow Tensor With Random Uniform Distribution
Learn how to use TensorFlow's random_uniform operation to create a TensorFlow Tensor with a random uniform distribution, via a screencast video and full tutorial transcript...
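As a rough illustration of the operation this tutorial covers, here is a minimal sketch. It assumes TensorFlow 2.x, where the 1.x name `tf.random_uniform` is available as `tf.random.uniform`. The helper name `make_uniform_tensor`, the shape, the bounds, and the seed are arbitrary choices for illustration and are not taken from the tutorial.

```python
import tensorflow as tf


def make_uniform_tensor(rows=2, cols=3, low=0.0, high=1.0, seed=42):
    """Return a rows x cols float32 tensor sampled uniformly from [low, high)."""
    # In TensorFlow 1.x the same operation was named tf.random_uniform.
    return tf.random.uniform(
        shape=[rows, cols],
        minval=low,
        maxval=high,
        dtype=tf.float32,
        seed=seed,
    )


sample = make_uniform_tensor()
print(sample.shape)
```

Passing a seed makes the op-level randomness repeatable within a program, although fully reproducible runs typically also need a global seed.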
Creating PDF Reports with Python, Pdfkit, and Jinja2 Templates
Once in a while as a data scientist, you may need to create PDF reports of your analyses...
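One possible shape for such a workflow, sketched under assumptions: render an HTML report from a Jinja2 template, then hand the HTML to pdfkit. The template, the field names, and the helpers `render_report_html` and `write_report_pdf` are hypothetical and not taken from the linked article. The PDF step also needs the wkhtmltopdf binary installed, so it is kept in a separate function.

```python
from jinja2 import Template

# Hypothetical template and field names, chosen only for illustration.
REPORT_TEMPLATE = Template(
    "<html><body>"
    "<h1>{{ title }}</h1>"
    "<ul>{% for name, value in metrics %}"
    "<li>{{ name }}: {{ value }}</li>"
    "{% endfor %}</ul>"
    "</body></html>"
)


def render_report_html(title, metrics):
    """Render the report as an HTML string from a title and (name, value) pairs."""
    return REPORT_TEMPLATE.render(title=title, metrics=metrics)


def write_report_pdf(title, metrics, output_path):
    """Convert the rendered HTML to a PDF file at output_path.

    Requires the pdfkit package and the wkhtmltopdf binary, so the import
    is deferred until a PDF is actually requested.
    """
    import pdfkit

    pdfkit.from_string(render_report_html(title, metrics), output_path)


html = render_report_html("Weekly Summary", [("rows", 1200), ("accuracy", 0.93)])
print(html)
```

Keeping the HTML rendering separate from the PDF conversion makes the template easy to check in a browser before any PDF is generated.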
Tabular Data in Scikit-Learn and Dask-ML
Scikit-Learn 0.20.0 will contain some nice new features for working with tabular data. This blogpost will introduce those improvements with a small demo. We'll then see how Dask-ML was able to piggyback on the work done by scikit-learn to offer a version that works well with Dask Arrays and DataFrames...
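One of the tabular-data additions in the 0.20 release is `ColumnTransformer`, which applies different preprocessing to different columns of a DataFrame. The sketch below is an assumption about the kind of demo the post describes, not a copy of it; the toy data and column names are made up.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up toy data; the column names are illustrative only.
df = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "income": [40000.0, 52000.0, 88000.0, 61000.0],
    "city": ["Paris", "Lyon", "Paris", "Nice"],
})

# Scale the numeric columns and one-hot encode the categorical one.
preprocess = ColumnTransformer([
    ("numeric", StandardScaler(), ["age", "income"]),
    ("categorical", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])

features = preprocess.fit_transform(df)
print(features.shape)
```

The result has one row per input row, with the two scaled numeric columns followed by one indicator column per distinct city.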
Books
Data Visualization with Python and JavaScript:
Scrape, Clean, Explore & Transform Your Data
Learn how to turn raw data into rich, interactive web visualizations with the powerful combination of Python and JavaScript. With this hands-on guide, author Kyran Dale teaches you how to build a basic dataviz toolchain with best-of-breed Python and JavaScript libraries, including Scrapy, Matplotlib, Pandas, Flask, and D3, for crafting engaging, browser-based visualizations...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Want to reach our audience / fellow readers? Consider sponsoring - grab a spot now; first come, first served!
All the best, Hannah & Sebastian