If you want to get started with machine learning, you need to keep two basic things in mind.

Machine learning is built on a large amount of linear algebra, and nobody can master the entire field quickly. To reach your goal, you need to be clear on every concept of machine learning. For that, you need to learn the underlying linear algebra, go deep enough to understand each concept, and ask why a particular algorithm is used.

Here are some areas where linear algebra is used in machine learning:

Data Set and Data Files

In machine learning, you fit a model on a data set. This data set can be a vector (1×n or n×1) or a matrix (m×n). If you split the data set into inputs and outputs, you have a matrix X and a vector y. Here X holds the independent variables, with one row per sample and one column per feature, and y holds the dependent variable, with one value per sample. The number of rows in X must equal the length of y.
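As a minimal sketch of this split, assuming NumPy and a made-up table whose last column is the output (the numbers and the names `data`, `X`, and `y` are illustrative only):

```python
import numpy as np

# Hypothetical data set: 4 samples (rows), 2 input features plus 1 output column.
data = np.array([
    [50.0, 1.0, 500.0],
    [80.0, 2.0, 820.0],
    [65.0, 2.0, 640.0],
    [120.0, 3.0, 1150.0],
])

X = data[:, :-1]  # input matrix, shape (m, n) = (4, 2)
y = data[:, -1]   # output vector, shape (m,) = (4,)

print(X.shape, y.shape)  # (4, 2) (4,)
```

The row counts match: every row of X has exactly one corresponding entry in y.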

Images and Photographs

In image processing and computer vision applications, each image we work with is a table structure (or matrix) with a width and a height, holding one value per pixel for black and white images and three values per pixel (one each for red, green, and blue) for color images.

Operations on the image, such as scaling, cropping, and shearing, are all described using the notation and operations of linear algebra.
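A small sketch of these ideas, assuming NumPy arrays stand in for images (the random pixel values and the shear factor 0.5 are arbitrary choices for illustration):

```python
import numpy as np

# Hypothetical images filled with random pixel values in 0..255.
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(4, 6), dtype=np.uint8)      # height 4, width 6
color = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)  # plus 3 RGB channels

# Cropping is slicing rows and columns of the matrix.
crop = gray[1:3, 2:5]  # rows 1..2, columns 2..4

# A shear maps a pixel coordinate (x, y) through a 2x2 matrix.
shear = np.array([[1.0, 0.5],
                  [0.0, 1.0]])
corner = np.array([2.0, 4.0])    # an (x, y) coordinate
sheared_corner = shear @ corner  # x' = x + 0.5 * y, y' = y

print(gray.shape, color.shape, crop.shape)  # (4, 6) (4, 6, 3) (2, 3)
print(sheared_corner)  # [4. 4.]
```

A real image library would also resample pixel values after such a transform; this sketch only shows the coordinate arithmetic.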

Linear Regression

This is commonly used for predictive analysis in machine learning. Some examples are predicting house rent in a specific location based on size, predicting the price of oil, predicting the number of goals scored by Cristiano Ronaldo or Messi in the 2018 soccer World Cup, or predicting who will win it. Fitting a linear regression model is typically posed as a least-squares problem and solved with matrix operations.
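To make the house rent example concrete, here is a hedged sketch that fits a line with NumPy's `np.linalg.lstsq`. The sizes and rents are invented so that the data lies exactly on a line; real data would be noisy.

```python
import numpy as np

# Hypothetical data: house size in square meters and monthly rent.
size = np.array([30.0, 50.0, 70.0, 90.0])
rent = np.array([400.0, 600.0, 800.0, 1000.0])  # exactly rent = 10 * size + 100

# Design matrix with a column of ones so the model can learn an intercept.
X = np.column_stack([np.ones_like(size), size])

# Least-squares solution of X @ w ≈ rent.
w, residuals, rank, sv = np.linalg.lstsq(X, rent, rcond=None)
intercept, slope = w

predicted = intercept + slope * 60.0  # predicted rent for a 60 square meter house
print(intercept, slope, predicted)
```

Because the invented data is perfectly linear, the fit recovers an intercept close to 100 and a slope close to 10, up to floating-point error.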

Regularization

Think of a training data set that is noisy, biased, or has unnecessary columns that cause the model to overfit, and therefore leads to poor predictions. Regularization counters this by penalizing large model parameters.

Techniques of Regularization

L1 Regularization (Lasso penalization): L1 regularization shrinks some parameters to zero. Hence some variables will not play any role in the model.

L2 Regularization (Ridge penalization): L2 regularization forces the parameters to be relatively small.
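The two penalties can be sketched in NumPy as follows. This is an illustration under stated assumptions, not a full solver: `ridge_fit` uses the standard closed-form ridge solution, while `soft_threshold` is only the shrinkage step that many L1 (Lasso) solvers apply repeatedly. The data, the helper names, and the penalty strengths are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([2.0, 0.0, -1.0])  # hypothetical coefficients
y = X @ true_w + 0.1 * rng.normal(size=50)

def ridge_fit(X, y, lam):
    """Closed-form L2-regularized least squares: (X^T X + lam * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def soft_threshold(w, lam):
    """Shrink each value toward zero by lam; values within lam of zero become 0."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

w_plain = ridge_fit(X, y, lam=0.0)    # ordinary least squares
w_ridge = ridge_fit(X, y, lam=100.0)  # strong L2 penalty

# The L2 penalty makes the coefficient vector smaller overall.
print(np.linalg.norm(w_plain), np.linalg.norm(w_ridge))

# The L1 shrinkage step sets the small middle entry to exactly zero.
print(soft_threshold(np.array([2.0, 0.05, -1.0]), lam=0.1))
```

This shows the qualitative difference described above: L2 makes all parameters smaller, while the L1 step removes small ones entirely.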

Principal Component Analysis

Often a data set has many columns, perhaps tens, hundreds, thousands or more. Modeling data with many features is challenging. Methods for automatically reducing the number of columns of a data set are called dimensionality reduction and the most popular one is PCA. It has 2 goals:

identify patterns in data, and

detect the correlation between variables
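One common way to compute PCA is through the eigenvectors of the covariance matrix. The sketch below assumes NumPy and uses synthetic data in which one column is almost a multiple of another, so two components capture nearly all of the variance. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 100 samples, 3 columns, where column 1 is nearly 2 * column 0.
a = rng.normal(size=100)
X = np.column_stack([a, 2.0 * a + 0.01 * rng.normal(size=100), rng.normal(size=100)])

Xc = X - X.mean(axis=0)         # center each column
cov = np.cov(Xc, rowvar=False)  # 3x3 covariance matrix

eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]       # re-sort so the largest variance comes first
components = eigvecs[:, order[:2]]      # keep the top 2 principal directions

X_reduced = Xc @ components             # project the data: shape (100, 2)
explained = eigvals[order[:2]].sum() / eigvals.sum()
print(X_reduced.shape, explained)
```

Here `explained` is the fraction of total variance kept after dropping one column's worth of dimensions; it is close to 1 because the dropped direction is the nearly redundant one.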

Singular Value Decomposition

The Singular Value Decomposition (SVD) is a highlight of linear algebra.

It is a matrix decomposition method and another popular approach to dimensionality reduction. It can be used directly in applications such as feature selection, visualization, noise reduction, and more.
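A minimal sketch of the decomposition and of a low-rank approximation, using `np.linalg.svd` on a small example matrix chosen for illustration:

```python
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)  # A = U @ diag(s) @ Vt

# Rebuilding from all singular values recovers A.
A_full = U @ np.diag(s) @ Vt

# Keeping only the largest k singular values gives a rank-k approximation.
k = 1
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(np.allclose(A, A_full))      # True
print(np.linalg.matrix_rank(A_k))  # 1
```

Dropping the small singular values is what makes SVD useful for dimensionality reduction and noise reduction: the approximation keeps the directions that carry the most information.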

Latent Semantic Analysis

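In machine learning, working with text and language is called Natural Language Processing (NLP). We can use NLP to predict whether a review is a good one or a bad one, to predict the genre of a book, or to build a machine translator or a speech recognition system. In latent semantic analysis, documents are represented as a large matrix of word occurrences, and a matrix factorization such as SVD is applied to it.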
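As a toy illustration of latent semantic analysis, the sketch below builds a document-term count matrix by hand and reduces it with a truncated SVD. The four reviews, the vocabulary, and the choice of `k = 2` are all invented for this example; a real pipeline would use a proper tokenizer and a much larger vocabulary.

```python
import numpy as np

# Hypothetical reviews and a tiny hand-picked vocabulary.
docs = ["good great good", "great good", "bad awful", "awful bad bad bad"]
vocab = ["good", "great", "bad", "awful"]

# Document-term matrix: one row per review, one column per word count.
counts = np.array(
    [[doc.split().count(word) for word in vocab] for doc in docs], dtype=float
)

# Truncated SVD: keep the top k latent "topics".
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2
doc_topics = U[:, :k] * s[:k]  # each review as a k-dimensional vector

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(doc_topics[0], doc_topics[1]))  # reviews 0 and 1 share words
print(cosine(doc_topics[0], doc_topics[2]))  # reviews 0 and 2 share none
```

In the reduced space the two positive reviews point in nearly the same direction, while a positive and a negative review are close to orthogonal.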

Recommender System

Predictive modeling problems that involve the recommendation of products are called recommender systems, a subfield of machine learning.

Examples include Amazon recommending books based on your previous purchases and on purchases by customers like you, and Netflix recommending movies and TV shows based on your viewing history and that of subscribers like you.
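The "customers like you" idea can be sketched with a small ratings matrix and cosine similarity between user rows. This is a deliberately simplified illustration: the ratings are made up, and treating an unrated item as 0 inside the similarity is a shortcut that production systems usually avoid.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are items, 0 means "not rated".
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],  # user 0 (has not rated item 2)
    [5.0, 5.0, 4.0, 1.0],  # user 1, similar taste to user 0
    [1.0, 1.0, 2.0, 5.0],  # user 2, different taste
])

def cosine(u, v):
    """Cosine similarity between two rating vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

target = 0
others = [u for u in range(ratings.shape[0]) if u != target]
sims = np.array([cosine(ratings[target], ratings[u]) for u in others])

# Predict the target's rating for item 2 as a similarity-weighted average.
item = 2
predicted = float(sims @ ratings[others, item] / sims.sum())

most_similar = others[int(np.argmax(sims))]
print(most_similar, predicted)
```

User 1 comes out as the most similar user, so the predicted rating for item 2 is pulled toward user 1's rating of 4 rather than user 2's rating of 2.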

Deep Learning

Inspired by how the human brain processes information, deep learning is a family of nonlinear machine learning algorithms. Deep learning uses a deep neural network (DNN), which is an artificial neural network with multiple hidden layers between the input and output layers.

Deep learning methods work with vectors, matrices, and even tensors of inputs and coefficients. A tensor here is a generalization of a matrix to more than two dimensions. Google's TensorFlow is a Python library for machine learning built around tensors.
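To show how these pieces fit together, here is a hedged sketch of a forward pass through a small network with two hidden layers. It uses plain NumPy rather than TensorFlow to stay self-contained, and the layer sizes, random weights, and ReLU activation are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# A batch of 8 hypothetical inputs with 4 features each.
x = rng.normal(size=(8, 4))

# Randomly initialized weights and zero biases for two hidden layers and an output layer.
W1, b1 = rng.normal(size=(4, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 16)), np.zeros(16)
W3, b3 = rng.normal(size=(16, 1)), np.zeros(1)

def relu(z):
    """The nonlinearity applied between layers."""
    return np.maximum(z, 0.0)

h1 = relu(x @ W1 + b1)   # first hidden layer, shape (8, 16)
h2 = relu(h1 @ W2 + b2)  # second hidden layer, shape (8, 16)
out = h2 @ W3 + b3       # output layer, shape (8, 1)

# A batch of color images is a 4-dimensional tensor: (batch, height, width, channels).
images = np.zeros((8, 32, 32, 3))
print(out.shape, images.ndim)  # (8, 1) 4
```

Each layer is a matrix multiplication followed by a nonlinearity, which is why linear algebra sits at the core of deep learning.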