K-Nearest Neighbours

The idea behind K-nearest neighbours is simple: examples with similar features are likely to have similar labels. To assign a label to a new example, we search the training set for the K examples closest to the new example and aggregate their labels, typically by majority vote for classification or by averaging for regression.
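
As a concrete illustration, here is a minimal sketch of K-nearest-neighbour classification in Python, assuming Euclidean distance and a majority vote over the labels (the function and variable names are illustrative, not taken from the notebooks):

    import numpy as np
    from collections import Counter

    def knn_predict(X_train, y_train, x_new, k=3):
        """Classify x_new by majority vote among its k nearest training examples."""
        # Euclidean distance from x_new to every training example
        distances = np.linalg.norm(X_train - x_new, axis=1)
        # Indices of the k training examples with the smallest distances
        nearest = np.argsort(distances)[:k]
        # Majority vote over the labels of those k neighbours
        return Counter(y_train[nearest]).most_common(1)[0][0]

    # Example: two clusters of points; the new example sits near the first cluster
    X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
    y_train = np.array([0, 0, 1, 1])
    print(knn_predict(X_train, y_train, np.array([0.2, 0.1])))  # prints 0

For regression the same idea applies, with the vote replaced by an average of the neighbours' labels.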

In spite of its simplicity, K-nearest neighbours often performs surprisingly well on small datasets, and you can implement it for yourself in the notebooks below.

Online resources

  • A visualisation tool made by the Stanford vision lab;
  • A Stack Exchange post which explains why K-nearest neighbours suffers from 'the curse of dimensionality', making it less effective on high-dimensional datasets (a quick numerical illustration of this effect follows below).
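
To see the curse of dimensionality in action, here is a small sketch (the setup, points drawn uniformly from the unit hypercube, is an illustrative assumption) showing that as the dimension grows, the nearest and farthest neighbours of a point become almost equidistant, so "nearest" carries less and less information:

    import numpy as np

    rng = np.random.default_rng(0)
    for d in (2, 10, 100, 1000):
        # 1000 points sampled uniformly from the unit hypercube in d dimensions
        X = rng.random((1000, d))
        # Distances from the first point to all the others
        dists = np.linalg.norm(X[1:] - X[0], axis=1)
        # This ratio approaches 1 as d grows: distances concentrate
        print(f"d={d:4d}  nearest/farthest distance ratio: {dists.min() / dists.max():.3f}")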

Click the links below to access the Jupyter Notebooks for K-Nearest Neighbours.