Data Science Weekly

Oct 26, 2017

Issue #205 Oct 26 2017

Editor Picks

AlphaGo, in context
I had a chance to talk to several people about the recent AlphaGo matches with Ke Jie and others. In particular, most of the coverage was a mix of popular science + PR so the most common questions I’ve seen were along the lines of “to what extent is AlphaGo a breakthrough?”, “How do researchers in AI see its victories?” and “what implications do the wins have?”. I thought I might as well serialize some of my thoughts into a post...

In China, a Store of the Future—No Checkout, No Staff
Wheelys tests a 24-hour store run entirely by technology...

How to unit test machine learning code.
Over the past year, I’ve spent most of my working time doing deep learning research and internships. And a lot of that year was making very big mistakes that helped me learn a lot, not just about ML, but about how to engineer these systems correctly and soundly. One of the main principles I learned during my time at Google Brain was that unit tests can make or break your algorithm and can save you weeks of debugging and training time...
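
As a rough illustration of the kind of check such a unit test might make (this is a sketch, not code from the linked post), the snippet below verifies that one training step actually moves every parameter. The toy linear model and the names train_step and changed_parameters are hypothetical, chosen only for this example.

```python
def train_step(weights, inputs, target, lr=0.1):
    """One gradient-descent step for a toy linear model with squared error."""
    prediction = sum(w * x for w, x in zip(weights, inputs))
    error = prediction - target
    # The gradient of 0.5 * error**2 with respect to each weight is error * x.
    return [w - lr * error * x for w, x in zip(weights, inputs)]


def changed_parameters(before, after, tol=1e-12):
    """Return one flag per parameter: did its value move during the step?"""
    return [abs(a - b) > tol for a, b in zip(after, before)]


# Hypothetical starting weights and a single training example.
weights_before = [0.5, -0.3, 0.8]
weights_after = train_step(weights_before, inputs=[1.0, 2.0, 3.0], target=1.0)

# The check itself: every parameter should have been updated.
all_updated = all(changed_parameters(weights_before, weights_after))
assert all_updated, "some parameters did not change after a training step"
```

A parameter that never moves often points to a wiring mistake, such as a value left out of the update, and a check like this runs in milliseconds rather than after hours of training.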

A Message from this week's Sponsor:

Transform data into something meaningful.

A master’s in business analytics from Clark University’s Graduate School of Management will equip you with the skills needed for the high-demand field of data analysis. You’ll learn how to formulate insights and communicate your knowledge effectively to data scientists, executives and peers. Our hybrid online and on-campus model allows you to earn your degree in only one year. With a strong commitment to the Principles for Responsible Management Education (PRME), ethics and corporate responsibility are integrated into our program, providing graduates a skill set that uniquely positions them among their peers.

Data Science Articles & Videos

How AI Helps The Intelligence Community Find Needles In The Haystack
Technology from the startup Primer helps analysts find even the most obscure events they need to know about in a sea of data...

Andrew Ng Has a Chatbot That Can Help with Depression
Woebot combines cognitive behavior therapy with advances in natural language to create a virtual counselor...

How Adversarial Attacks Work
Recent studies by Google Brain have shown that any machine learning classifier can be tricked into giving incorrect predictions, and with a little bit of skill, you can get them to give pretty much any result you want. This fact becomes steadily more worrisome as more and more systems are powered by artificial intelligence, and many of them are crucial for our safe and comfortable life. Lately, safety concerns about AI have revolved around ethics; today we are going to talk about more pressing and real issues...

Stop Using word2vec
Word vectors are awesome but you don’t need a neural network – and definitely don’t need deep learning – to find them. So if you’re using word vectors and aren’t gunning for state of the art or a paper publication then stop using word2vec...
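
To make the claim concrete, here is a minimal sketch of one well-known non-neural route to word vectors: count co-occurrences in a small window and reweight them with positive pointwise mutual information (PPMI). This is not necessarily the method of the linked post; the function names, the window size, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def ppmi_vectors(sentences, window=2):
    """Build sparse word vectors from co-occurrence counts weighted by PPMI."""
    pair_counts = Counter()
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pair_counts[(word, tokens[j])] += 1

    total = sum(pair_counts.values())
    word_totals = Counter()
    for (word, _context), count in pair_counts.items():
        word_totals[word] += count

    vectors = {}
    for (word, context), count in pair_counts.items():
        # PMI = log( P(word, context) / (P(word) * P(context)) ).
        # The window is symmetric, so row totals double as context totals.
        pmi = math.log(count * total / (word_totals[word] * word_totals[context]))
        if pmi > 0:  # keep only positive associations (the first "P" in PPMI)
            vectors.setdefault(word, {})[context] = pmi
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus: "cat" and "dog" appear in similar contexts.
corpus = [
    ["the", "cat", "eats", "fish"],
    ["the", "dog", "eats", "meat"],
    ["the", "cat", "drinks", "milk"],
    ["the", "dog", "drinks", "water"],
    ["stocks", "fell", "sharply", "today"],
    ["stocks", "rose", "sharply", "today"],
]
vectors = ppmi_vectors(corpus)
cat_dog = cosine(vectors["cat"], vectors["dog"])
cat_stocks = cosine(vectors["cat"], vectors["stocks"])
```

On this corpus "cat" comes out closer to "dog" than to "stocks", because the first two share context words and the last shares none. On a real corpus the sparse vectors are usually compressed further, for example with a truncated SVD, which is omitted here to keep the sketch dependency-free.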

First assessment of learning-to-rank
The Search Platform Team has been working on improving search on Wikimedia projects with machine learning. Machine-learned ranking (MLR) enables us to rank relevance of pages using a model trained on implicit and explicit judgements. In the first test of the learning-to-rank (LTR) project, we evaluated the performance of a click-based model on users searching English Wikipedia...

How Columns in Neocortex Enable Learning the Structure of the World
It is widely observed that movement affects how we sense objects in the world, but how this happens in the brain has remained a mystery. In this paper, we propose a network model that learns the structure of objects through movement. Our model is based on the known biology of cortical columns and layers, and helps explain their function...

Dissolving the Fermi Paradox
The Fermi question is not a paradox: it just looks like one if one is overconfident in how well we know the Drake equation parameters...

Switching from Unsupervised LDA to Semi-Supervised GuidedLDA
This is the story of how and why we had to write our own form of Latent Dirichlet Allocation (LDA). I also talk about why we needed to build a Guided Topic Model (GuidedLDA), and the process of open sourcing everything on GitHub...

Jobs

Data Scientist - MealPal - New York
Are you passionate about helping an organization make smart decisions in order to deliver the best product and user experience? Do you want to join a fast-paced, growing company? As a Data Scientist at MealPal, you will focus on using data to drive business strategy and take our company to the next level. You will have the opportunity to think critically and problem solve in order to drive valuable and executable insights...

Training & Resources

Word embeddings in 2017: Trends and future directions
This post will focus on the deficiencies of word embeddings and how recent approaches have tried to resolve them...

Announcing PlaidML: Open Source Deep Learning for Every Platform
Today Vertex.AI is releasing PlaidML, our open source portable deep learning engine. Our mission is to make deep learning accessible to every person on every device, and we’re building PlaidML to help make that a reality...

Lasso with d3 v4 and Canvas
Example of a lasso technique using data drawn on canvas (or really anywhere) and an SVG interaction layer for drawing the lasso...

Books

Statistics Done Wrong: The Woefully Complete Guide
"... a pithy, essential guide to statistical blunders in modern science that will show you how to keep your research blunder-free..."

For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.

P.S. Want to reach our audience / fellow readers? Consider sponsoring. We've just opened up booking for November & December. Grab a spot now; first come, first served! Email us for more details. All the best, Hannah & Sebastian
