Data Science Weekly - Issue 14
Issue #14, February 27, 2014
Editor Picks
Training Deep Learning Models in a Browser: Andrej Karpathy Interview
We recently caught up with Andrej Karpathy, Machine Learning PhD student at Stanford and the man behind the innovative ConvNetJS - a JS library for training Deep Learning models (mainly Neural Networks) entirely in your browser. We were keen to learn more about his background, the motivation and potential applications for ConvNetJS, and his research agenda...
Machine Learning Application: Job Classification at LinkedIn
I am fascinated by Machine Learning (ML) and keep looking for case studies where ML solves real-world problems. This talk, Machine Learning: The Basics by Ron Bekkerman (video), provides a great overview of machine learning and how it is being used by LinkedIn for Job Analysis...
Machine Learning: Why isn't Supervised Machine Learning more automated?
Why doesn't ML provide a big red "Go" button that automatically applies all the commonly effective techniques (SVM, random forests, ANN, ...) with automatic feature selection, parameter tuning, validation, overfitting avoidance and whatever else, and returns a predictive function that's a weighted average of the most effective models?...
Data Science Articles & Videos
How to Interview a Data Scientist
I’ve been evaluating some opportunities in the data science field of late. For the first time in many years, I’m on the interviewee side of the table. My observation (based on an admittedly very limited sample) is that there is still ample room for improvement in the process firms undergo to evaluate analytics talent. Rather than bury the lead, here’s my punchline: Giving a prospective data scientist new-hire a challenge to solve, and a night or two (at minimum) to work and sleep on it, is usually the way to go...
10 Surprising Machine Learning Applications
You may have heard that today's tech companies are using machine learning to identify and filter email spam (Google), blacklist and penalize spam blogs so that users get good search results (also Google), recommend products specifically for you (Amazon), and fight fraud (IBM). Today's post isn't about that. It's about the new, perhaps surprising ways that companies (and non-profits) are using machine learning to make smarter, faster, better products...
Chicago's New Police Computer Predicts Crimes, but is it Racist?
Chicago police say their computers can tell who will be a violent criminal, but critics say it's nothing more than racial profiling...
Big Data Analytics and Netflix's House of Cards
"House of Cards" was so successful that arguably the busiest man in the US, President Barack Obama, had something to say when it came to the show. Now what does "Big Data" have to do with all this? A ton, actually...
Deep Learning/High-Energy Physics: Improving Search for Exotic Particles
Collisions at high-energy particle colliders are a traditionally fruitful source of exotic particle discoveries. Finding these rare exotic particles requires solving difficult signal-versus-background classification problems, hence ML approaches are often used. Recent advances in deep learning, particularly with artificial neural networks, make it possible to learn more complex functions and better discriminate between signal and background classes...
Interview with Yann LeCun, Deep Learning Expert, Facebook AI Lab
We discuss what enabled Deep Learning to achieve remarkable successes recently, his argument with Vapnik about (deep) neural nets vs kernel (support vector) machines, and what kind of AI we can expect from FB...
Deep Learning Opportunities in Fashion
Traditionally, models in fashion use handcrafted features like HOG, SIFT, SURF and other methods for ‘understanding’ the data. In contrast to other domains, fashion images are usually annotated with one or more categories (labels), since these pictures are often used directly in some kind of online shop or catalog website/app. Such an image database can be used to train models to classify unknown images, but also to train models that are able to capture the general concept of clothes (like shirts, pants, shoes and so on)...
Social Media Could Predict the World's Next Mass Protest
After social networks' presumably instrumental role in the Arab Spring came to light, researchers started digging into whether social media platforms could also be used to anticipate major social uprisings. A new study out of MIT is the latest to give it a whirl...
Got Data, Mine It Yourself: Ian Witten on Data Mining, Weka, and his MOOC
Ian Witten is a professor of Computer Science at the University of Waikato in New Zealand. He is the original creator of Weka, a popular open-source data mining tool (downloaded a total of 4.9 million times so far) which allows end users to analyze their own data. Class Central spoke with Ian about his thoughts on data mining and his MOOC...
Jobs
Quantitative Research Analyst (Data Scientist), SEC, New York
Serve as a Quantitative Research Analyst working with SEC staff in building sophisticated predictive analytics, determining proper empirical methodology, organizing data collection, writing unique programs, preparing written reports, and summarizing studies in formal and informal presentations...
Training & Resources
Ten Most Popular MOOCs starting in March, 2014
Upcoming MOOCs featuring Machine Learning, Data Mining, Python...
On the Shelf: Data Science Books
Want to know more about the business, sociology, or nitty-gritty of data science? Here are some great books on the discipline...
Python Tools for Machine Learning
This post aims to list and describe the most useful machine learning tools and libraries that are available for Python. To make this list, we did not require the library to be written in Python; it was sufficient for it to have a Python interface. We also have a small section on Deep Learning at the end as it has received a fair amount of attention recently...
Learning Data Science in Total Immersion
San Francisco-based Zipfian Academy approaches data science education the way some approach learning a new language: total immersion. The company offers a 12-week intensive advanced data science training program in a modern lab environment focusing on practical skills in the realms of machine learning, statistical analysis, software engineering, and big data...
P.S. Did you enjoy the newsletter? Do you have friends/colleagues who might like it too? If so, please forward it along - we would love to have them onboard :)