What Is the KNN Algorithm? Understand with the Help of Examples

By Jyoti Nigania |Email | Jun 29, 2018 | 20604 Views

By now, we all know that machine learning models make predictions by learning from past data. A model takes input values, learns the relationship underlying them, and gives out a predicted output.

For example, imagine two people out walking. One of them points and asks, "Is that a dog?" but it is actually a black cat crossing the path. The other replies, "No, you can tell a cat from a dog by their characteristics: a cat has sharp claws it uses to climb, shorter ears, meows and purrs, and doesn't love to play around, while a dog has dull claws, longer ears, barks, and loves to run around."

So based on these characteristics, one can easily identify whether an animal is a dog or a cat. Since KNN is based on exactly this kind of feature similarity, we can perform such classification with a KNN classifier!
What is KNN?
KNN (k-nearest neighbors) is one of the simplest supervised machine learning algorithms, used mostly for classification: it classifies a data point based on how its neighbors are classified. KNN stores all the available cases and classifies new cases based on a similarity measure. The k in KNN is a parameter that refers to the number of nearest neighbors included in the majority voting process: a data point is assigned the class held by the majority of its nearest neighbors.
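The majority-vote idea above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function name `knn_predict` and the toy cat/dog feature values (claw sharpness, ear length) are invented for this example, and Euclidean distance is assumed as the similarity measure.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every stored training point.
    distances = [
        (math.dist(query, p), label)
        for p, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and tally their labels.
    k_nearest = sorted(distances)[:k]
    votes = Counter(label for _, label in k_nearest)
    return votes.most_common(1)[0][0]

# Toy data: (claw sharpness, ear length) as two numeric features.
points = [(9, 2), (8, 3), (2, 8), (1, 9)]
labels = ["cat", "cat", "dog", "dog"]
print(knn_predict(points, labels, query=(8, 2), k=3))  # cat
```

Note that the algorithm does no training at all; it simply stores the data and defers all computation to prediction time.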
How do we choose 'k', and when do we use the KNN algorithm?
The KNN algorithm is based on feature similarity, and choosing the right value of k, a process called parameter tuning, is important for good accuracy. A common rule of thumb is to start with k ≈ √n, where n is the total number of data points, and an odd value of k is selected to avoid ties between the two classes of data.
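The √n rule of thumb can be written as a tiny helper. This is only a heuristic starting point (the function name `choose_k` is invented here); in practice k is tuned, for example by cross-validation.

```python
import math

def choose_k(n):
    """Rule-of-thumb starting value: k ~ sqrt(n), nudged to the nearest odd number."""
    k = max(1, round(math.sqrt(n)))
    # An odd k avoids tied votes between two classes.
    return k if k % 2 == 1 else k + 1

print(choose_k(100))  # 11
print(choose_k(25))   # 5
```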

We can use the KNN algorithm when:
  • The data is labeled
  • The data is noise-free
  • The dataset is small (KNN is a lazy learner: it stores all training data and does all its work at prediction time)

Thus, the KNN algorithm is one of the simplest classification algorithms, and even with such simplicity it can give highly competitive results. KNN can also be used for regression problems; the only difference from the methodology discussed above is that the prediction is the average of the nearest neighbors' target values rather than a majority vote. KNN can be coded in a single line in R.
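The regression variant mentioned above swaps the majority vote for an average. A minimal sketch, again with an invented function name (`knn_regress`) and toy one-dimensional data, assuming Euclidean distance:

```python
import math

def knn_regress(train_points, train_targets, query, k=3):
    """Predict a numeric value as the mean target of the k nearest neighbours."""
    # Sort (distance, target) pairs so the closest neighbours come first.
    distances = sorted(
        (math.dist(query, p), t) for p, t in zip(train_points, train_targets)
    )
    neighbours = [t for _, t in distances[:k]]
    return sum(neighbours) / len(neighbours)

# 1-D toy data: predict y near x = 2.5 from the surrounding points.
xs = [(1.0,), (2.0,), (3.0,), (10.0,)]
ys = [10.0, 20.0, 30.0, 100.0]
print(knn_regress(xs, ys, query=(2.5,), k=3))  # 20.0
```

The distant outlier at x = 10 barely matters here because it is never among the k nearest neighbours, which is part of KNN's appeal for regression on locally smooth data.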

Source: HOB