Machine Learning is a subfield of computer science that gives computers the ability to learn without being explicitly programmed. It is concerned with the construction of algorithms that can learn and make predictions from data.
How do I learn Machine Learning?
To master Machine Learning (ML), one has to be good at maths and programming and have domain knowledge. Domain knowledge (for example, how to deal with images, audio, financial time series, etc.) changes from one class of problem to another, so let us focus on the first two.
We need maths to understand machine learning algorithms and models, or to implement new ones. A large number of models have already been built, but even when we use existing models we need to understand the internal workings of the algorithm so that we can tune its hyperparameters. No single model gives the best results on every problem (the "no free lunch" idea), so choosing a model for the given problem is very important, and to choose the right one you need to understand its internal workings and the maths behind it.
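To make the point about tuning concrete, here is a minimal sketch in Python. It uses a hand-written k-nearest-neighbours classifier on made-up one-dimensional data (the data, the function names and the candidate values of k are all my own assumptions for illustration) and picks the hyperparameter k by checking accuracy on held-out validation points.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    # train is a list of (feature, label) pairs with a single numeric feature.
    neighbours = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def accuracy(train, held_out, k):
    """Fraction of held-out points that the k-NN rule labels correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in held_out)
    return correct / len(held_out)

# Toy data (made up): small values are mostly "a", large values mostly "b",
# with one noisy point in each group.
train = [(1.0, "a"), (1.5, "a"), (2.0, "b"), (2.5, "a"), (3.0, "a"),
         (6.0, "b"), (6.5, "b"), (7.0, "a"), (7.5, "b"), (8.0, "b")]
validation = [(1.8, "a"), (2.2, "a"), (3.5, "a"),
              (6.8, "b"), (7.2, "b"), (5.5, "b")]

# Try several values of the hyperparameter k and keep the one that does
# best on the validation points (ties go to the smallest k tried first).
scores = {k: accuracy(train, validation, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, "best k:", best_k)
```

On this toy data k = 1 is misled by the two noisy training points, while k = 3 and k = 5 are not. A different dataset could easily favour a different k, or a different model altogether, which is why the choice has to be checked against data rather than assumed.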
Thankfully you don't need all the maths but only some sub-branches:
- Linear algebra
- Probability theory
- Information theory and decision theory
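To give a feel for where these branches show up, here is a small sketch with one tiny function per branch, using only the Python standard library. The function names and the numbers are my own illustrative assumptions, not part of any particular library.

```python
import math

# Linear algebra: a linear model's prediction is the dot product of
# a weight vector and a feature vector.
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

# Probability theory: Bayes' rule, P(H | E) = P(E | H) * P(H) / P(E).
def posterior(prior, likelihood, likelihood_if_not):
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence

# Information theory: Shannon entropy (in bits) of a discrete distribution.
def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(dot([2.0, -1.0], [3.0, 4.0]))   # 2*3 + (-1)*4 = 2.0
print(posterior(0.01, 0.9, 0.05))     # rare condition, fairly accurate test: about 0.15
print(entropy([0.5, 0.5]))            # fair coin
```

The last call prints 1.0: a fair coin carries exactly one bit of uncertainty. The same entropy formula is what decision-tree learners use, in one form or another, to decide which feature to split on.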
Programming is needed to use machine learning models (or build new ones), get data from various sources, clean the data, choose the right features, and validate that the model has learned correctly. Thankfully you don't have to be an expert programmer. Some programming languages are preferred over others for machine learning because they have a large number of libraries with most machine learning models already implemented.
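The steps above can be sketched end to end in a few lines of plain Python. Everything here is hypothetical: the records are invented, the "model" is just a threshold halfway between the two class means, and in practice a library would do most of this work for you.

```python
# Hypothetical raw records: (hours_studied, hours_slept, passed).
# None marks a missing value.
raw = [(1.0, 8.0, 0), (2.0, None, 0), (3.0, 6.0, 0), (4.0, 7.0, 1),
       (5.0, 5.0, 1), (None, 7.0, 1), (6.0, 8.0, 1), (1.5, 6.0, 0),
       (4.5, 6.5, 1), (2.5, 7.5, 0)]

# 1. Clean: drop records that have a missing field.
clean = [r for r in raw if None not in r]

# 2. Choose features: keep only hours_studied (an assumption for this sketch).
data = [(studied, passed) for studied, _slept, passed in clean]

# 3. Split: hold out every third record for validation, train on the rest.
test = data[::3]
train = [d for i, d in enumerate(data) if i % 3 != 0]

# 4. Fit a very simple model: a threshold halfway between the class means.
def mean(values):
    return sum(values) / len(values)

threshold = (mean([x for x, y in train if y == 0]) +
             mean([x for x, y in train if y == 1])) / 2

def predict(x):
    return 1 if x > threshold else 0

# 5. Validate on records the model has not seen during fitting.
accuracy = mean([predict(x) == y for x, y in test])
print(f"threshold={threshold:.2f}, held-out accuracy={accuracy:.2f}")
```

The important habit is step 5: the accuracy is measured on records that played no part in choosing the threshold, so it says something about how the model behaves on new data.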
Languages suited for Machine Learning:
- Python: Best for both beginner and advanced level.
- R: Good, but slow at run time.
- Matlab: Good, but costly and slow.
- Julia: Very fast and good, but has limited libraries, as it is new.
- C++: Difficult, but very fast and used in production.
I recommend that beginners start with Python and learn only the required maths from books and online courses. There are also some good books on Machine Learning and Deep Learning.
Artificial Intelligence is a very big field that encompasses subfields such as NLP, Speech Recognition, Computer Vision, Robotics, etc. These subfields are applications of Machine Learning, which is basically nothing more than applied statistics and probability. Thus, to begin to understand any of the above areas, one has to start with the basics: probability theory, statistics, linear algebra, optimization, and information theory. Mastery of these five subjects is, in my opinion, absolutely essential to understand and appreciate the theory behind fields such as NLP, Computer Vision, and Machine Learning in general.
If you are just starting and have not taken courses in any of the subjects I mentioned, I would suggest you spend a year or two learning them, solving the problems at the end of each chapter and gaining confidence in these areas. Believe me, the first few years that you invest in getting your basics clear will pay off really well in the future. If your foundation is strong, you will not only appreciate the theory behind these fields but may also come up with innovative new ideas.
There are many books that teach probability, statistics, information theory, and so on, but I personally found the following very useful in my graduate school days.
- For probability and statistics: 'Probability and Random Processes with Applications to Signal Processing' by Stark and Woods.
- For information theory: 'Elements of Information Theory' by Cover and Thomas.
- For linear algebra: apart from many great books such as 'Matrix Analysis' by Horn and Johnson, I would also recommend Gilbert Strang's video lectures on linear algebra (courtesy of MIT OpenCourseWare).
As for programming languages, one could use any. I would, however, choose Python. Python is a well-supported high-level language with many third-party libraries written for machine learning research.