Data Science Weekly - Issue 23
Issue #23 May 1 2014
Editor Picks
Why becoming a Data Scientist is NOT actually easier than you think
I was just doing some late night reading and came across this article. TL;DR - You can take the ML course on Coursera and you're magically a Data Scientist, because three really intelligent people did it. I disagree...
A Weekend With Julia: An R User's Reflections
First off, I'm not going to talk much about Julia's speed. Everybody has seen the tables and graphs showing how, in one benchmark or another, Julia is ten times or a hundred times faster than R. Enough said about machine speed! Let's talk about intuitive appeal, compactness of notation, and aesthetics...
Deep Learning for Natural Language Processing
Presentation, Deep Learning for Natural Language Processing, by Stephen Pulman, University of Oxford and TheySay, at the March 6, 2014 Sentiment Analysis Symposium in New York...
Data Science Articles & Videos
How One Woman Hid Her Pregnancy From Big Data
For the past nine months, Janet Vertesi, Assistant Professor of Sociology at Princeton University, tried to hide from the Internet the fact that she's pregnant — and it wasn't easy...
Why building a Data Science Team is deceptively hard
More and more startups are looking to hire Data Scientists who can work autonomously to derive valuable insights from data. In principle, this sounds great: engineers and designers build the product, while Data Scientists crunch the numbers to gain insights. In practice, finding these Data Scientists and enabling them to be productive are very challenging tasks...
What makes an Image popular?
Hundreds of thousands of photographs are uploaded to the internet every minute through various social networking and photo sharing platforms. Even from the same users, different photographs receive different numbers of views. This begs the question: What makes a photograph popular? Can we predict the number of views a photograph will receive even before it is uploaded? These are some of the questions we address in this work...
The First Rule of Data Science
“The first rule of Data Science is: don’t ask how to define Data Science.” So says Josh Bloom, a UC Berkeley professor of astronomy and a lead principal investigator (PI) at the Berkeley Institute for Data Science (BIDS)...
Twitter Can Now Predict Crime, and This Raises Serious Questions
Police departments in New York City may soon be using geo-tagged tweets to predict crime. It sounds like a far-fetched sci-fi scenario a la Minority Report, but when I contacted Dr. Matthew Gerber, the University of Virginia researcher behind the technology, he explained that the system is far more mathematical than metaphysical...
This Software Can Write A Grade-A College Paper In Less Than A Second
If you've ever been stumped when trying to write the perfect college entrance essay, rest assured. Scientists have created software that can generate a near-perfect paper in less than a second...
Using F# and R Provider with Kaggle’s Facial Keypoints Detection
In this post, I will show how we can use F# and the R Provider with Facial Keypoints Detection. I won’t try to solve the problem yet :-), but I will follow the R tutorial, which lets me learn R as well as get familiar with the data...
Simpson's Paradox is Back
The latest issue of the American Statistician has a set of thought-provoking point/counterpoint papers on Simpson’s Paradox, with a tie-in to the controversial issue of causality. (I will not address the causality issue here.) Since I have long had my own thoughts about Simpson’s, I’ll postpone the topic I had planned to post this week, and address Simpson’s...
Jobs
Twitch: Data Scientist - San Francisco, CA
Twitch is building the biggest live video broadcasting platform and community for gamers. Twitch is 4th in peak internet traffic in the U.S., right above Hulu and below Apple. Join the team as the 3rd data scientist, and you'll get to leverage the 2.5 TB of data coming in every day...
Training & Resources
A Tutorial on Learning With Bayesian Networks
In this paper, we discuss methods for constructing Bayesian networks from prior knowledge and summarize Bayesian statistical methods for using data to improve these models...
Machine Learning in Go using GoLearn
GoLearn is a machine learning library for Golang. I couldn't find any comprehensive ML library for Go, so I decided to write one...
30 Best Online Books for Artificial Intelligence
We have prepared a list of some of the best free ebooks on AI...
A Large set of Machine Learning Resources for Beginners to Mavens
Detailed list of resources by Machine Learning sub-topic...
Books
Managerial Analytics: An Applied Guide to Principles, Methods, Tools, and Best Practices
Recommended by one of our readers, this book is also very well rated on Amazon (4.8 out of 5 stars)...
"A manager can’t be expected to learn all what a data analyst or a data scientist knows, otherwise he or she becomes one of them. But the manager probably has to work closely with them or even manage them. For managers, no matter what industries they come from, who really want to understand what analytics means to management, this is a must-read book."
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Did you enjoy the newsletter? Do you have friends/colleagues who might like it too? If so, please forward it along - we would love to have them on board :)