Developers Should Have a Strong Understanding of "Machine Learning Algorithms"

By Jyoti Nigania | Jul 26, 2018

What are some machine learning algorithms that you should always have a strong understanding of, and why?

Answered by Badri Narayan on Quora:
Many of the answers in this thread are very good and provide lists of specific algorithms, which is probably what the OP wants. However, I would like to advocate a different point of view: it is more important to have a solid understanding of some of the theory before getting married to individual algorithms or pet areas. Here is one opinionated list of how to go about building a strong understanding of machine learning.

Data is messy and noisy, and reality is often unknown. Probability theory is the basic tool for reasoning under uncertainty. You can get some mileage by learning to construct probabilistic models, understand Bayes' theorem, and do inference. As a bonus, you can learn about probabilistic graphical models and the art of viewing machine learning problems as latent variable models, along with algorithms for exact inference and methods of approximate Bayesian inference such as sampling and variational methods. Far too many people stop here and view everything from a strict Bayesian perspective, maybe adopting some non-parametric Bayesian techniques for extra brownie points. However, that would be missing out on a lot.
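
As a small illustration of the kind of inference mentioned above, here is a minimal sketch of Bayes' theorem for a single yes/no hypothesis. The function name and all of the numbers are assumptions made up for this example; they do not come from the original answer.

```python
def bayes_posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Return P(H | E) for a binary hypothesis H using Bayes' theorem."""
    # Law of total probability: P(E) = P(E|H) P(H) + P(E|not H) P(not H)
    evidence = (p_evidence_given_h * prior
                + p_evidence_given_not_h * (1.0 - prior))
    # Bayes' theorem: P(H|E) = P(E|H) P(H) / P(E)
    return p_evidence_given_h * prior / evidence


# Hypothetical numbers: 1% of messages are spam, a keyword appears in
# 90% of spam messages and in 5% of legitimate messages.
posterior = bayes_posterior(prior=0.01,
                            p_evidence_given_h=0.9,
                            p_evidence_given_not_h=0.05)
print(round(posterior, 4))  # 0.1538
```

Even with a fairly reliable signal, the posterior stays low because the prior is low, which is the kind of reasoning under uncertainty that probability theory makes explicit.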

Understanding generalization deeply is very important, because supervised learning is not memorizing examples but learning generalizable patterns. Some rookie mistakes in machine learning stem from an algorithm-first approach and from not clearly understanding over-fitting and generalization. I recommend learning the basics of Statistical Learning Theory, which unites various perspectives in supervised learning. Whatever your weapon is, be it Bayesian methods, convex techniques, applying SVM to every problem, neural nets, topological data analysis, etc., ultimately in every supervised machine learning problem you construct a model and balance model fitting error against model complexity. At the end of it, you will have a richer perspective when you argue about, say, Naive Bayes vs SVM, and you will actually be able to quantify how the number of samples needed scales with model complexity. As a bonus, if you learn information theory, you can also know the ultimate limits of learnability.
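
The difference between memorizing and generalizing can be sketched on a toy problem. The data, the noise pattern, and the two models below (a 1-nearest-neighbor lookup standing in for a "memorizer", and a least-squares line standing in for a low-complexity model) are all assumptions chosen for illustration, not something the original answer prescribes.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nearest_neighbor(train_x, train_y, x):
    """'Memorizing' model: return the label of the closest training point."""
    return min(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x))[1]


def mse(predictions, targets):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Toy data (assumption): the true relation is y = 2x, and the training
# labels carry alternating noise of +3 and -3.
train_x = [0, 1, 2, 3, 4, 5, 6, 7]
train_y = [2 * x + (3 if x % 2 == 0 else -3) for x in train_x]
test_x = [0.4, 1.4, 2.4, 3.4, 4.4, 5.4, 6.4]
test_y = [2 * x for x in test_x]  # noise-free targets at unseen inputs

slope, intercept = fit_line(train_x, train_y)

memo_train = mse([nearest_neighbor(train_x, train_y, x) for x in train_x], train_y)
memo_test = mse([nearest_neighbor(train_x, train_y, x) for x in test_x], test_y)
line_train = mse([slope * x + intercept for x in train_x], train_y)
line_test = mse([slope * x + intercept for x in test_x], test_y)

print("memorizer  train/test MSE:", memo_train, memo_test)
print("line model train/test MSE:", line_train, line_test)
```

On this data the memorizer has zero training error but a larger error on unseen inputs than the line, while the line accepts some training error and generalizes better. That is the fitting-error versus complexity trade-off in miniature.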

To this end, I'd suggest taking the pedagogical approach in, for example, Foundations of Machine Learning by Mehryar Mohri et al. Also, check out Learning from Data by Yaser Abu-Mostafa et al. for a gentler undergraduate introduction.

In the age of "Big Data", it is important to be able to reason about algorithmic complexity and, at the least, be able to tell scalable algorithms from more expensive ones. You may have a fantastic model, but what if it is computationally hard? Can you develop something that works most of the time? Take a course in algorithms to understand how to design and analyze algorithms, hardness and various reductions, greedy techniques, memoization, etc.
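
To make the point about algorithmic complexity concrete, here is a minimal sketch of memoization, one of the techniques named above. The Fibonacci recurrence is only a stand-in chosen for this example; the call counter is an illustrative helper.

```python
from functools import lru_cache


def fib_naive(n, counter):
    """Plain recursion: recomputes the same subproblems many times."""
    counter[0] += 1  # count every call to expose the blow-up
    if n < 2:
        return n
    return fib_naive(n - 1, counter) + fib_naive(n - 2, counter)


@lru_cache(maxsize=None)
def fib_memo(n):
    """Same recurrence, but each subproblem is computed once and cached."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)


naive_calls = [0]
naive_result = fib_naive(20, naive_calls)
memo_result = fib_memo(20)
memo_computations = fib_memo.cache_info().misses  # distinct subproblems solved

print(naive_result, naive_calls[0])      # 6765 21891
print(memo_result, memo_computations)    # 6765 21
```

Both versions return the same value, but the naive one makes an exponentially growing number of calls while the memoized one solves each of the 21 subproblems once.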

Optimization algorithms are a useful addition to your toolset, as a lot of supervised methods easily translate to minimizing an objective function which is a combination of a loss function (to encourage model fitting) and a regularization term (to control model complexity). When this objective function is convex, you have a lot of theory in Convex Optimization and efficient algorithms at your disposal, and you can design and analyze provable machine learning algorithms and provide guarantees about algorithmic and statistical complexity. When it is not convex, or is even NP-hard, convex relaxations often still work very well, and in recent years we have been gaining an increasing understanding of this phenomenon. I would especially recommend learning about sparsity, L1 minimization, atomic norms, etc.
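
A minimal sketch of how an L1 penalty produces sparsity: for the one-dimensional objective 0.5 * (w - z)^2 + lam * |w| (a squared loss plus an L1 regularizer), the minimizer is given by the soft-thresholding operator. The function names, the value of lam, and the input estimates below are assumptions made up for this illustration.

```python
def soft_threshold(z, lam):
    """Minimizer over w of 0.5 * (w - z)**2 + lam * abs(w), for lam >= 0."""
    if z > lam:
        return z - lam   # shrink positive values toward zero
    if z < -lam:
        return z + lam   # shrink negative values toward zero
    return 0.0           # small values are set exactly to zero


def objective(w, z, lam):
    """Squared loss plus L1 regularization for a single coefficient."""
    return 0.5 * (w - z) ** 2 + lam * abs(w)


# Hypothetical unregularized estimates and penalty strength.
estimates = [3.0, -0.5, 0.2, -2.0]
lam = 1.0
sparse = [soft_threshold(z, lam) for z in estimates]
print(sparse)  # [2.0, 0.0, 0.0, -1.0]
```

Coefficients whose magnitude is below lam are driven exactly to zero, which is why L1 minimization is associated with sparse models. Iterative solvers for larger L1 problems apply this same operator coordinate by coordinate.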

So far, we have been concentrating on learning from examples. Finally, unsupervised learning, or discovering patterns, is another theme in machine learning. This is basically clustering, learning mixtures, and dimensionality reduction, and includes algorithms like k-means++, topic models, matrix factorization techniques, and spectral clustering, some of which have new theoretical guarantees. I'd also recommend looking into semi-supervised methods that exploit both labeled and unlabeled examples, reinforcement learning (to learn from feedback; for example, bandit algorithms), online learning, and active learning.
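
As one concrete example from the list above, here is a minimal sketch of the seeding step of k-means++: each new center is sampled with probability proportional to its squared distance from the nearest center already chosen. The function names and the toy points are assumptions for this illustration, and the sketch assumes k is no larger than the number of distinct points.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_seeds(points, k, rng):
    """Pick k initial centers with k-means++ style D^2 sampling."""
    centers = [rng.choice(points)]  # first center: uniform at random
    while len(centers) < k:
        # Squared distance from every point to its nearest chosen center.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        # Far-away points are proportionally more likely to be picked;
        # already chosen points have weight zero.
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


# Toy data (assumption): two well-separated groups of points.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (10.0, 10.0), (10.1, 9.9), (9.9, 10.2)]
seeds = kmeans_pp_seeds(points, k=2, rng=random.Random(0))
print(seeds)
```

The seeds would then be handed to the usual k-means iterations. Spreading the initial centers out this way is what the guarantees for k-means++ rely on.
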
After learning the fundamentals and gaining a good overall perspective, it doesn't matter whether you lean towards Deep Learning, Bayesian Methods, Optimization or something else. You can do great by picking from any suite, and you can learn its methods fairly quickly.

Another answer given by Sandeep Dayananda on Quora:
For any tech enthusiast, knowing certain machine learning algorithms and their applications has now become very important. Tech giants like Google, Amazon, Facebook, and Walmart use machine learning heavily to stay competitive with their rivals. Machine learning is now being used in various fields to ease human tasks. Every major business sector, from IT to e-commerce and from banking to manufacturing, uses machine learning to forecast the business it is likely to do in the coming months or quarters. This helps companies act early to improve their business if there is any possibility of facing a crisis in the future.

Some of the popular algorithms that are widely used in industry and by data scientists, and are worth knowing, include the following:

Linear Regression: It is a linear modeling technique that is often used to predict numeric values. It comes under Supervised Machine Learning Algorithms. Based on a set of Predictors (X), Linear Regression can be used to predict the Target variable (Y). The applications of linear regression extend to a wide range of domains like predicting economic growth, sales of a certain item, expected amount of rainfall, score prediction, etc.
Logistic Regression: It is a Classification algorithm that is used to predict categorical values. It comes under Supervised Machine Learning Algorithms. Based on certain attributes, you can predict which category your Output variable falls into. Applications of Logistic Regression include credit card risk management, weather forecasting, predicting who will win an election, etc.
Naive Bayes: It is also a Supervised Classification algorithm that is used to predict categorical values. It is among the most popular algorithms and is based on Bayes' Theorem of probability. Some of its applications are spam filtering, classifying emotions in status updates, classifying the type of news articles, etc.
Support Vector Machine: It is a Supervised machine learning algorithm for classification or regression problems, where labeled training data teaches the SVM about the classes so that it can classify new data. It works by finding a hyperplane (a line, in two dimensions) that separates the training data set into classes. SVM has been used for stock market forecasting by financial institutions and for text and image classification.
K Means Clustering: It is a popularly used Unsupervised Algorithm for cluster analysis. The algorithm partitions a given data set into a pre-defined number of clusters, k. The output of the K Means algorithm is k clusters with the input data partitioned among them. For instance, let's consider K Means Clustering for Wikipedia search results. The search term "Jaguar" on Wikipedia will return all pages containing the word Jaguar, which can refer to Jaguar as a Car, Jaguar as an Animal, and so on. The K Means clustering algorithm can be applied to group the web pages that talk about similar concepts. So, the algorithm will group all web pages that talk about Jaguar as an Animal into one cluster, Jaguar as a Car into another cluster, and so on.
Apriori Algorithm: It is an Unsupervised machine learning algorithm that generates association rules from a given data set. An association rule states that if an item A occurs, then item B also occurs with a certain probability. For example, if people buy an iPad, then they also tend to buy an iPad case to protect it. For the algorithm to derive such conclusions, it first counts the number of people who bought an iPad case while purchasing an iPad. Applications of this algorithm include market basket analysis and auto-complete.
Decision Tree and Random Forest: These two algorithms are used for both Classification and Regression problems. They separate a data set into different classes based on the response variable, and are generally used when the response variable is categorical in nature. Some of their applications are classifying loan applicants by their probability of defaulting on payments and pattern recognition in remote sensing.
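
To make the association-rule idea from the Apriori item above concrete, here is a minimal sketch that counts how often an iPad case is bought together with an iPad. It computes only the support and confidence of one rule, which are the quantities Apriori builds on; it is not the full Apriori algorithm. The transactions and function names are made up for illustration.

```python
def support_count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in transactions if items <= basket)


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    both = support_count(transactions, set(antecedent) | set(consequent))
    antecedent_only = support_count(transactions, antecedent)
    support = both / len(transactions)
    # Confidence estimates P(consequent | antecedent) from the counts.
    confidence = both / antecedent_only if antecedent_only else 0.0
    return support, confidence


# Hypothetical shopping baskets.
transactions = [
    {"iPad", "iPad Case"},
    {"iPad", "iPad Case", "Charger"},
    {"iPad"},
    {"Charger"},
    {"iPad", "iPad Case"},
]

support, confidence = rule_metrics(transactions, {"iPad"}, {"iPad Case"})
print(support, confidence)  # 0.6 0.75
```

In these made-up baskets, 60% of all purchases contain both items, and 75% of the purchases that include an iPad also include a case. A rule is usually kept only if both numbers clear chosen thresholds.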

Source: HOB