Data Science Weekly - Issue 59
Issue #59 Jan 8 2015
Editor Picks
Nvidia's demo of real-time object recognition using deep learning
NVIDIA CEO Jen-Hsun Huang showcases the computer vision capabilities of the NVIDIA DRIVE PX auto-pilot computer, at the company's press event kicking off CES 2015...
The Perilous World of Machine Learning for Fun and Profit: Pipeline Jungles and Hidden Feedback Loops
I got inspired to write a quick post by this excellent short paper out of Google, "Machine Learning: The High Interest Credit Card of Technical Debt." Anyone who plans on building production mathematical modeling systems for a living needs to keep a copy of that paper close. And while I don't want to recap the whole paper here, I want to highlight some pieces of it that hit close to home....
The Emerging Science of Human-Data Interaction
The rapidly evolving ecosystems associated with personal data are creating an entirely new field of scientific study, say computer scientists. And this requires a much more powerful ethics-based infrastructure...
Data Science Articles & Videos
Is Deep Learning a Revolution in AI?
Can a new technique known as deep learning revolutionize artificial intelligence, as yesterday’s front-page article at the New York Times suggests?...
Weight initialization in Deep Nets
Interesting G+ discussion on weight initialization in deep nets...
Machine learning best practices we've learned from hundreds of competitions
Talk by Ben Hamner, Chief Scientist at Kaggle, who leads its data science and development teams. He is the principal architect of many of Kaggle's most advanced machine learning projects, including current work in Eagle Ford and GE's flight arrival prediction and optimization modeling...
How To Choose A Data Science Project For Your Data Science Portfolio
You want to create a data science portfolio to showcase that you can “do” data science: that you know how to take in a data set, clean it up, use various techniques to extract useful information from it, and then communicate the results. The problem is that you aren’t sure where to start, what projects to do, what languages to use, or even what techniques to use...
Talk to R
Here's a neat demo from Yihui Xie: you can talk to this R graph and customize it with voice commands...
How Software in Half of NYC Cabs Generates $5.2M a Year in Extra Tips
So a story in Businessweek caught my eye the other day. It discussed NYC taxi rider tipping habits and concluded that riders usually tip between 20% and 25% using the histogram below...
I wish I knew these things when I learned Python
I sometimes found myself wondering how I could not have known a simpler way of doing “this” thing in Python 3. When I look for a solution, I of course find much more elegant, efficient, and less bug-prone code over time. In total (not just in this post), the sum of “those” things was far more than I expected or would admit, but here is the first crop of features that were not obvious to me and that I learned later as I sought more efficient, simple, and maintainable code...
How Big Data Will Transform Our Economy And Our Lives In 2015
Let’s face it: gazing into the crystal ball is a time-honored, end-of-year parlor game. And it’s fun. So in the spirit of the season, I have identified five big data themes to watch in 2015. ...
Linear SVC example with Scikit-Learn SVM and Python
This tutorial video and sample code are part of a practical machine learning series with Python and Scikit-learn (sklearn), using stock investing based on fundamental features as an example...
Jobs
Financial Data Scientist - Bloomberg, NYC
The R&D BVAL Quant group is seeking a motivated data scientist to contribute to modeling liquidity aspects of fixed income markets. The position sits at the intersection of machine learning and finance. Knowledge of financial markets, especially fixed income, is highly desirable but not required. Key responsibilities include creating robust predictive models and explaining modeling rationale to a non-technical audience. The ideal candidate will have a strong machine learning background, experience in dealing with noisy datasets, and an interest in learning more about financial markets...
Training & Resources
Introduction to Probability, Statistics, and Random Processes
Open access peer-reviewed textbook intended for undergraduate as well as first-year graduate level courses on the subject...
PyToolz API Documentation
Toolz provides a set of utility functions for iterators, functions, and dictionaries. These functions interoperate well and form the building blocks of common data analytic operations. They extend the standard libraries itertools and functools and borrow heavily from the standard libraries of contemporary functional languages...
Deep Learning in Neural Networks: An Overview
This historical survey compactly summarises relevant work, much of it from the previous millennium...
Books
Practical Data Science with R
Explanation of basic principles with real use cases...
"A well rounded, occasionally high-level introductory text that will leave you feeling prepared to participate in the Data Science conversation at work, from earliest planning to presentation and maintenance..."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Happy New Year to all! Wishing you a wonderful 2015 :)
- All the best, Hannah & Sebastian