Data Science Weekly - Issue 31
Issue #31 June 26 2014
Editor Picks
Extreme Learning Machines With Julia There is a concept known as Liquid State Machine, and a relatively better known Echo State Network which is used for training Recurrent Neural Nets. Both of them are based on reservoir computing. On the lines of reservoir computing and very similar in concept is the topic of this post, Extreme Learning Machine...
Businesses Can Now Use Same Stats Language As Universities, Thanks To "Pandas" The Python number-crunching toolkit gives programmers the statistical tools they need in a computer language that is familiar to businesses...
Square's Machine Learning Infrastructure and Applications In this talk, Dr. Rong Yan (Director of Data Science and Infrastructure, Square) gives a high-level overview of data applications at Square followed by a deep dive on how machine learning is used in our industry-leading fraud detection models...
Data Science Articles & Videos
Natural Language Processing in Investigative Journalism
Journalists frequently have far too many documents to read manually, whether it's a 10,000 page response to a Freedom of Information Request or 250,000 leaked diplomatic cables. We've spent the last three years applying NLP and visualization techniques to this problem, building a system called Overview which has now been used by journalists all over the world. In this talk I'll show you exactly how Overview's language processing pipeline works.
What is the Difference Between Artificial Intelligence, Machine Learning, Statistics, and Data Mining
I assume the author of that question is trying to get a clear picture by understanding the line of separation that distinguishes each field from the others. So here is my take, explaining it in as simple a way as I can...
Machine Learning Isn't Kaggle Competitions
Doing Kaggle problems is fun! It means you can focus on machine learning algorithm nerdery and get better at that. But it’s pretty far removed from my job...where I do (among other things) machine learning! What gives?
A Great Catchy Name: Semi-Supervised Recognition of Sarcastic Sentences in Online Product Reviews
Sarcasm is a sophisticated form of speech act widely used in online communities. Automatic recognition of sarcasm is, however, a novel task. Sarcasm recognition could contribute to the performance of review summarization and ranking systems. This paper presents SASI, a novel Semi-supervised Algorithm for Sarcasm Identification that recognizes sarcastic sentences in product review...
Can We See the Arrow of Time? Algorithm Can Determine, with 80 Percent Accuracy, Whether Video is Running Forward or Backward
At the IEEE Conference on Computer Vision and Pattern Recognition this month, an international group of computer scientists will present a new algorithm that can, with roughly 80 percent accuracy, determine whether a given snippet of video is playing backward or forward.
Neural Networks and Deep Learning Book
Neural networks and deep learning currently provide the best solutions to many problems in image recognition, speech recognition, and natural language processing. This book will teach you the core concepts behind neural networks and deep learning...
Association Rule Mining to Find Unbeatable Strategy in a Tic Tac Toe Game
Here is a short overview of what I am going to discuss today: a) the goal of our task, b) what an association rule is and the main objective of association rule mining, c) the Apriori algorithm, d) a description of our data, and e) finally, some results after we extract some association rules from the data...
Rapid User Testing with Mechanical Turk
How We Supercharged Our User Testing Using Mechanical Turk, Google Forms and Usability Hub...We’ve been experimenting with different methods for getting rapid user feedback and we’d like to share some of our explorations...We’ll start by designing the test, followed by distributing the test, and we’ll finish with organizing the data...
Probabilistic Models of Cognition
In this book, we explore the probabilistic approach to cognitive science, which models learning and reasoning as inference in complex probabilistic models. In particular, we examine how a broad range of empirical phenomena in cognitive science (including intuitive physics, concept learning, causal reasoning, social cognition, and language understanding) can be modeled using a functional probabilistic programming language called Church....
Jobs
Insight Data Engineering Fellows Today we are announcing the opening of applications for the September 2014 session of the Insight Data Engineering Fellows Program. Insight is a free, full-time, six-week program based in Silicon Valley that helps engineers and computer scientists transition to a career in big data engineering. Data engineers from Facebook, LinkedIn, Twitter, Yelp, Square, Microsoft, Intuit, AT&T, Climate Corporation, Beats Music, Jawbone, RelateIQ, and Airbnb will be mentoring and hiring from the program. Additionally, community leaders from open-source projects such as Apache Storm, Apache Spark, and Apache Cassandra will be mentoring as well...
Training & Resources
Introduction to Deep Learning on Hadoop
As the data world undergoes its Cambrian explosion phase, our data tools need to become more advanced to keep pace. Deep Learning has emerged as a key tool in the non-linear arms race of machine learning. In this session we will take a look at how we parallelize Deep Belief Networks in Deep Learning on Hadoop’s next-generation YARN framework with Iterative Reduce. We’ll also look at some real-world examples of processing data with Deep Learning such as image classification and natural language processing...
Using Python's sci-packages to Prepare Data for Machine Learning Tasks and Other Data Analyses
In this short tutorial I want to provide an overview of some of my favorite Python tools for common procedures as entry points for general pattern classification and machine learning tasks, and various other data analyses...
AI on the Web
This page links to 868 pages around the web with information on Artificial Intelligence. Some of the links will pop up additional information when you move the mouse over them. Links in Bold* followed by a star are especially useful and interesting sites...
Books
Naked Statistics: Stripping the Dread from the Data Interesting take on the importance of statistics...
"While a great measure of the book’s appeal comes from Mr. Wheelan’s fluent style—a natural comedian, he is truly the Dave Barry of the coin toss set—the rest comes from his multiple real world examples illustrating exactly why even the most reluctant mathophobe is well advised to achieve a personal understanding of the statistical underpinnings of life " - New York Times
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Did you enjoy the newsletter? Do you have friends/colleagues who might like it too? If so, please forward it along - we would love to have them onboard :)