Most data science problems can be framed as one of five questions, and the question you are asking determines which kind of algorithm to use. The five questions are:
Is this A or B? (Classification Algorithm)
Is this A or B? Is this an apple or a pineapple, a pen or a pencil, a mouse or an elephant? Questions of this kind, where the answer is one of a fixed set of categories, are handled by classification algorithms.
Is this weird? (Anomaly Detection Algorithm)
This question deals with patterns. Whenever a data point breaks from the usual pattern, the algorithm detects it. Algorithms that deal with these kinds of problems are called anomaly detection algorithms.
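As a rough illustration of "Is this weird?", the sketch below flags values that sit far from the average, measured in standard deviations (a z-score). The function name `find_anomalies`, the threshold of 2.0, and the temperature readings are all assumptions made up for this example; real anomaly detectors are often more elaborate.

```python
from statistics import mean, stdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical daily temperatures with one reading that breaks the pattern.
readings = [21.0, 22.0, 21.5, 22.5, 21.0, 22.0, 21.5, 40.0, 22.0, 21.5]
anomalies = find_anomalies(readings)
print(anomalies)
```

Only the reading of 40.0 is far enough from the rest to be flagged. Lowering the threshold makes the check more sensitive, at the cost of more false alarms.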
How much or how many? (Regression Algorithm)
Then there are questions with a numeric answer: how much or how many? For example, what will the temperature be tomorrow or the day after, or how many days will it rain? These kinds of questions are tackled by regression algorithms.
How is this organized? (Clustering Algorithms)
Then there are questions like "How is this organized?" These questions are about grouping similar items together, and the algorithms that deal with these kinds of problems are called clustering algorithms.
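One well-known clustering method is k-means. The sketch below is a minimal one-dimensional version, assuming the number of groups and the starting centers are supplied by hand. The function name `kmeans_1d` and the age data are hypothetical and used only for illustration.

```python
def kmeans_1d(points, centers, iterations=10):
    """Group 1-D points around the given starting centers (basic k-means)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical customer ages; the starting centers are arbitrary guesses.
ages = [18, 21, 19, 45, 47, 50, 20, 48]
centers, clusters = kmeans_1d(ages, centers=[10, 60])
print(centers)
print(clusters)
```

No labels are given to the algorithm. It discovers on its own that the ages fall into a younger group and an older group.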
What should I do next? (Reinforcement Learning)
Lastly, there are questions like "What should I do next?" When a decision has to be made and you don't know in advance which action is best, the problem is handled by reinforcement learning algorithms.
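A very small instance of "What should I do next?" is the multi-armed bandit problem, sketched below with an epsilon-greedy strategy: most of the time pick the action that has looked best so far, and occasionally try a random one. The function name `run_bandit`, the action names, and the fixed reward values are assumptions invented for this example.

```python
import random

def run_bandit(rewards, steps=500, epsilon=0.2, seed=0):
    """Estimate the value of each action by trial and error (epsilon-greedy)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    actions = list(rewards)
    estimates = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)  # explore: try a random action
        else:
            action = max(actions, key=lambda a: estimates[a])  # exploit
        reward = rewards[action]  # feedback from the (hypothetical) environment
        counts[action] += 1
        # Running average of the rewards seen so far for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates

# Hypothetical environment: "right" pays more, but the learner does not know that.
estimates = run_bandit({"left": 0.2, "right": 1.0})
best_action = max(estimates, key=estimates.get)
print(best_action, estimates)
```

The learner is never told which action is better. It finds out from the rewards it receives, which is the core idea of reinforcement learning.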
Using these algorithms, we can decide what action to take next. So these are the five questions asked in data science, and these are the algorithms built to tackle them.
What is Regression?
Regression tries to establish a relationship between two variables. Regression analysis is a predictive modeling technique that estimates the relationship between a dependent variable and an independent variable.
The dependent variable is related to the independent variable: Y is the dependent variable and X is the independent variable. For example, as the value of X increases, Y may also increase, but the value of Y depends on X. X can take any value it wants, and Y changes according to X. Predicting Y from X in this way is what a regression algorithm does.
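To make the relationship between X and Y concrete, here is a minimal sketch of fitting a straight line by ordinary least squares and using it to predict a new value. The function name `fit_line` and the hours-versus-score data are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: hours studied (X) and exam score (Y).
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

intercept, slope = fit_line(hours, scores)
predicted = intercept + slope * 6  # predict Y for a new X value
print(intercept, slope, predicted)
```

Here the fitted slope says how much Y changes for each one-unit increase in X, and the prediction for a new X follows directly from the fitted line.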
Types of Regression:
Three common types of regression are:
1. Linear Regression: When there is a linear relationship between the independent and dependent variables.
2. Logistic Regression: When the dependent variable is categorical (True or False, 0 or 1).
3. Polynomial Regression: When the power of the independent variable is more than 1.
Logistic regression, as the name suggests, comes under regression algorithms. With logistic regression, however, the answer is categorical: it is either yes or no, either true or false. Because the dependent variable (the output) can take only fixed, categorical values, logistic regression is also categorized under classification algorithms.
To classify an output, the model first produces a probability, and based on that probability we decide whether the answer is yes or no, A or B. That is why logistic regression is categorized under both regression and classification algorithms.
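The step from probability to decision can be sketched as follows. Logistic regression passes a linear score through the sigmoid function to get a probability, and a cutoff turns that probability into a label. The coefficients below are hypothetical values picked by hand, not learned from data, and the 0.5 cutoff is a common default rather than a fixed rule.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, intercept, slope, threshold=0.5):
    """Return (probability, label) for a single input x."""
    probability = sigmoid(intercept + slope * x)
    label = "Yes" if probability >= threshold else "No"
    return probability, label

# Hypothetical coefficients, chosen by hand rather than learned from data.
INTERCEPT, SLOPE = -4.0, 1.0

for hours in (2, 4, 6):
    p, label = predict(hours, INTERCEPT, SLOPE)
    print(hours, round(p, 3), label)
```

The probability is the regression part and the threshold is the classification part, which is one way to see why the method sits in both families.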
When the outcome of the dependent variable (Y) is discrete, like 0 or 1, Yes or No, we use logistic regression. Logistic regression can also solve multi-class classification problems, such as deciding whether an item belongs to category A, B, C or D.
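For more than two categories, one common approach (multinomial logistic regression) replaces the sigmoid with the softmax function, which turns one raw score per category into probabilities that sum to 1. The sketch below assumes the raw scores are already available. The score values are hypothetical and stand in for what a trained model would produce.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for one item, one score per category.
categories = ["A", "B", "C", "D"]
scores = [1.0, 3.0, 0.5, 2.0]

probabilities = softmax(scores)
best = categories[probabilities.index(max(probabilities))]
print(best, [round(p, 3) for p in probabilities])
```

The predicted category is simply the one with the highest probability.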
Logistic regression, or logit regression, is a regression model where the dependent variable (DV) is categorical.
Variables that can take only a fixed set of values, such as Yes or No, Right or Wrong, are called categorical variables.
In Y = f(X), Y depends on X, so Y is called the dependent variable and X is called the independent variable.
For more insights watch the video about logistic regression.