Data Science Weekly - Issue 27
Issue #27, May 29, 2014
Editor Picks
The Flaw Lurking In Every Deep Neural Net
A recent paper, "Intriguing properties of neural networks" by Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow and Rob Fergus, a team that includes authors from Google's deep learning research project, outlines two pieces of news about the way neural networks behave that run counter to what we believed - and one of them is frankly astonishing. I'm going to tell you about both...
Benchmarking Apache Kafka: 2 Million Writes Per Second (On Three Cheap Machines)
I wrote a blog post about how LinkedIn uses Apache Kafka as a central publish-subscribe log for integrating data between applications, stream processing, and Hadoop data ingestion...
Quantum Machine Learning - Seth Lloyd at Google Tech Talks
Machine learning algorithms find patterns in big data sets. This talk presents quantum machine learning algorithms that give exponential speed-ups over their best existing classical counterparts...
Data Science Articles & Videos
Building Data Science Teams
Three roles your Data Analytics Team must have...
Twitter Increasingly Used By Hedge Funds For Trading Clues
Much as quantitative traders look at market pricing data to develop trading algorithms, hedge funds are now looking at social media traffic...
Data Mining Reveals How Wording Influences Tweet Propagation
If you’ve ever painstakingly crafted a tweet in the hope it would be retweeted around the world, only to find it flopped, then read on...
How Pinterest Developed Its New Search Engine
Jason Wilson, Pinterest’s lead designer, tells BuzzFeed in an interview how the social network’s visual search engine was born...
Companies using R in 2014
I decided to curate a list of some companies that are currently using open-source R and describe the applications they are using it for...
MyFitnessPal - Data Science to improve Health & Fitness: Chul Lee Interview
We recently caught up with Chul Lee, Director of Data Engineering & Science at MyFitnessPal. We were keen to learn more about his background, how Data Science is shaping the Health and Fitness industry and what he is working on at MyFitnessPal...
The Gigaom interview: Why synthetic biology and the Netflix model are the future of medicine
Molecular biologist and futurist Andrew Hessel envisions a world in which every individual receives pharmaceutical drugs perfectly formulated to their genetic and medical needs for a fraction of what treatment would currently cost...
What Does a Neural Network Actually Do?
To gain an intuitive understanding of what a learning algorithm does, I usually like to think about its representational power, as this provides insight into what can, if not necessarily what does, happen inside the algorithm to solve a given problem. I will do this here for the case of multilayer perceptrons. By the end of this informal discussion I hope to provide an intuitive picture of the surprisingly simple representations that NNs encode...
Algorithmic Tagging of HackerNews (or any other site)
Part of making algorithms more discoverable is creating meta-data tags to classify them. Often sites will allow users to pick their own tags but what if the content had already been generated? This is the problem we faced when trying to tag all the algorithms in our API. Each algorithm had a description page and we believed that using some simple machine learning algorithms already in our API we could generate tags for each one...
Jobs
Data Scientist - Yodle, New York NY
We are currently seeking a talented Data Scientist for our team. The ideal candidate possesses strong quantitative abilities and the capacity to express those ideas in code. This person must enjoy math and programming and be capable of using them both in a practical setting. The successful candidate will be team-oriented and feel comfortable with the dynamic pace of an Internet startup, participating in all phases of product development from research to implementation and maintenance...
Training & Resources
Deep learning from the bottom up
Detailed breakdown of the different concepts...
Where are the Deep Learning Courses?
List of classes and resources...
IPython Notebooks for StatLearning Exercises
Exercises for the Stanford Stat Learning course rewritten as IPython notebooks using scikit-learn...
Books
Understanding Machine Learning: From Theory to Algorithms
Just released!...
"This is a timely text on the mathematical foundations of machine learning, providing a treatment that is both deep and broad, not only rigorous but also with intuition and insight. It presents a wide range of classic, fundamental algorithmic and analysis techniques as well as cutting-edge research directions. This is a great book for anyone interested in the mathematical and computational underpinnings of this important and fascinating field."
- Avrim Blum, Carnegie Mellon University, Editorial Review
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Did you enjoy the newsletter? Do you have friends/colleagues who might like it too? If so, please forward it along - we would love to have them on board :)