Top machine learning writers on Quora give their advice on learning machine learning, including specific resources, quotes, and personal insights, along with some extra nuggets of information.
How do I learn machine learning?
Straightforward question. Not-so-straightforward answer.
There are many ways to go about learning machine learning; books, courses, and degree programs are all great places to start. As with any topic a newcomer may delve into, however, each of these categories offers a vast number of options, and attempting to narrow one's focus alone can often prove futile.
A Quora post, aptly titled 'How Do I Learn Machine Learning?,' ends up being a robust resource. The FAQ has generated a lot of attention during the course of its life, with 93 answers and more than 468,000 views, and has contributions from a number of well-known personalities in the machine learning world. The idea of learning from others who have previously undertaken the same task has special significance for the learning of machine learning.
In this post we will take a look at advice from the top answers of the Quora post. We will find recommended courses and books relevant to learning machine learning, garner specific advice from experts, and see what other nuggets we can pick up along the way.
Our advisors today are the authors of the 3 most-upvoted FAQ answers, and come in the form of 3 well-known machine learning personalities:
Top Book Recommendations
Taken together, our advisors' recommendations compose a strong collection of introductory texts, covering statistical learning, the theoretical underpinnings of machine learning, and the practical implementation of algorithms and model-building in the most popular programming languages (Python & R) and framework (Spark).
Top MOOC Recommendations
There is near-unanimous agreement on which machine learning MOOC is best for newcomers: Andrew Ng's Coursera offering. Beyond that, 2 other Coursera courses are also given specific mention. Incidentally, all 3 MOOC recommendations come from Xavier, with Sean co-signing the Ng selection.
Here is a collection of some interesting and less-often heard pieces of advice from our advisors.
My recommended next step is the following. Get a good ML book (my list below), read the first intro chapters, and then jump to whatever chapter includes an algorithm you are interested in. Once you have found that algo, dive into it, understand all the details, and, especially, implement it. In the previous online course you would already have implemented some algorithms in Octave. But, here I am talking about implementing an algorithm from scratch in a "real" programming language. You can still start with an easy one such as L2-regularized Logistic Regression, or k-means, but you should also push yourself to implement more interesting ones such as LDA (Latent Dirichlet Allocation) or SVMs.
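To make the "implement it from scratch" advice concrete, here is a minimal sketch of k-means (one of the easy starting points named above) in plain Python with no libraries. The function names, the toy data, and the choice of random initialization are illustrative assumptions, not something taken from the Quora answer.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    # Assumption: initialize centroids by sampling k distinct data points.
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            mean_point(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, clusters


# Toy data: two well-separated groups of 2-D points (made up for illustration).
sample_points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(sample_points, k=2)
print(sorted(centroids))
```

Writing even a small version like this forces decisions (initialization, empty clusters, the stopping rule) that a library call hides, which is arguably the point of the exercise.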
Get scikit-learn or respective framework in the programming language you chose. Run algorithms for every chapter in the above book. Advantage with Scikit is it gives you some sample data too to test.
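As a rough illustration of the point about bundled sample data, the sketch below loads one of scikit-learn's built-in datasets and fits a classifier to it. The choice of the iris dataset, logistic regression, and the 75/25 split are assumptions made for the example; any chapter's algorithm could be swapped in.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# scikit-learn ships small datasets, so no download or cleaning is needed.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows to check how the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit the model on the training rows only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Score on rows the model has not seen.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Held-out accuracy: {accuracy:.3f}")
```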
Get a grip on Statistics (academic discipline) and Probability. Communities in Quora or Kaggle exercises etc will help you in getting up to the speed. Also you can get this book The Elements of Statistical Learning. I haven't seen anyone disappointed with this one. It's a bit of math but self explanatory mostly.
It is easy to get lost in all the languages and technologies that allow one to practice machine learning on real-world data. They allow us to execute our ideas and build our models. When integrated into real applications they engender software with the ability to learn and distill high-dimensional problems down to focused results. But languages and technology come and go. Knowing R or Python really well might amount to building a model faster or allow you to integrate it into software better, but it says nothing about your ability to choose the right model, or build one that truly speaks to the challenge at hand. The art of being able to do machine learning well comes from seeing the core concepts inside the algorithms and how they overlap with the pain points trying to be addressed. Great practitioners start to see interesting overlaps before ever touching a keyboard.
Extra Nuggets (TM)
In his answers, Sean mentions www.datascienceontology.com, a site that lives up to its name. With top-level categories such as learning algorithms, databases, data cleaning, and languages, a sufficiently broad and deep ontology of data science terms is presented and explained, with links to relevant resources.
Xavier points out that machine learning is about breadth and depth, and balancing the learning of both is important. He suggests surveying the basics of the most important algorithms, but also learning the low-level details of as many as possible. Xavier also links to his answer to 'What are the top 10 data mining or machine learning algorithms?' to help drive home his point about learning the most important algorithms; that question thread is a useful resource in its own right.
Sean also says to "think like a researcher," as the pursuit of a PhD trains students in the discipline of advanced research. A PhD holder is able to confidently state that they have solved an original problem and defended that solution to others in the field. According to Sean, non-PhDs can model their approach to machine learning after PhDs by embodying this mentality of 'research thought.'
To balance Sean's views on the importance of research, Raviteja stresses practice. He rightly states that all the theory in the world is useless if you can't make an educated selection between algorithms when it comes time to implement a model. He advises picking up scikit-learn (though R would also suffice) and gaining real-world experience tackling problems, choosing appropriate algorithms, and building models that have a purpose.
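One common way to make an educated selection between algorithms is to compare candidates with cross-validation. The sketch below is only an illustration of that idea: the wine dataset, the two candidate models, their hyperparameters, and the 5-fold setup are all assumptions chosen for the example, not recommendations from the Quora answers.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)

# Two candidate models; k-NN is distance-based, so it gets feature scaling.
candidates = {
    "k-nearest neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

# 5-fold cross-validation gives each model the same train/validation splits.
mean_scores = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)
    mean_scores[name] = scores.mean()
    print(f"{name}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")

best = max(mean_scores, key=mean_scores.get)
print(f"Highest mean accuracy: {best}")
```

Accuracy alone rarely settles the choice; training cost, interpretability, and how the model will be used in the application matter too, which is where the practice Raviteja describes comes in.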
This article was originally published on KDnuggets.