CS231N Lec. 2 | Image Classification pipeline

Please find the lecture reference here.


KNN classifier

K-nearest neighbor classifier

The Big-O complexity of this algorithm is:
Train: O(1), since it just memorizes (copies) all the data. Predict: O(N), since it compares against all N training examples.

Meaning, training is fast but prediction is slow. This is the opposite of what we want: users expect fast prediction, while slow training is acceptable. That makes KNN unsuitable in practice. A minimal sketch of this trade-off follows.
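Here is a minimal sketch of a nearest-neighbor classifier in NumPy, along the lines of the lecture's example (the class and method names are illustrative):

```python
import numpy as np

class NearestNeighbor:
    def train(self, X, y):
        # O(1): just memorize the training data
        self.X_train = X
        self.y_train = y

    def predict(self, X):
        # O(N) per test example: compare against every training example
        y_pred = np.zeros(X.shape[0], dtype=self.y_train.dtype)
        for i in range(X.shape[0]):
            # L1 distances from test example i to all training examples
            distances = np.sum(np.abs(self.X_train - X[i]), axis=1)
            y_pred[i] = self.y_train[np.argmin(distances)]
        return y_pred
```

Training only stores references, which is why it is constant time; all of the work is deferred to prediction.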

Then, how do we calculate distance?

L1 (Manhattan) distance: $d_1(I_1, I_2) = \sum_p \lvert I_1^p - I_2^p \rvert$. It depends on the choice of coordinate system: rotating the coordinate frame changes L1 distances.

L2 (Euclidean) distance: $d_2(I_1, I_2) = \sqrt{\sum_p (I_1^p - I_2^p)^2}$. It is invariant to rotations of the coordinate frame.
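A quick sketch of both metrics on flattened pixel vectors (NumPy, toy data):

```python
import numpy as np

I1 = np.array([10.0, 20.0, 30.0])  # flattened pixel values (toy example)
I2 = np.array([12.0, 18.0, 33.0])

d_l1 = np.sum(np.abs(I1 - I2))          # Manhattan: sum of absolute differences
d_l2 = np.sqrt(np.sum((I1 - I2) ** 2))  # Euclidean: root of summed squares

print(d_l1)  # 7.0
print(d_l2)  # ~4.12
```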


Which one is better? It's a hyperparameter, so it's best to try both.

Hyperparameters

Choices about the algorithm that we set rather than learn.
They are quite problem-dependent; in practice, try out the candidate settings and see which works best.
e.g.) in the case of KNN: the value of k and the distance metric.

How do we set hyperparameters?

  1. split data
    When setting hyperparameters, split the data into train, validation, and test sets.
    Choose hyperparameters on the validation set and evaluate only once on the test set.

  2. cross-validation
    Split the training data into folds, use each fold in turn as the validation set, and average the results.

Cross-validation is useful for small datasets, but it is not commonly used in deep learning because training is too expensive. A sketch follows.
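A hedged sketch of k-fold cross-validation for choosing k in KNN (NumPy; the fold count, candidate values, and the helper `knn_accuracy` are all illustrative, and labels are assumed to be non-negative integers):

```python
import numpy as np

def knn_accuracy(X_tr, y_tr, X_val, y_val, k):
    # classify each validation point by majority vote among its k nearest (L2) neighbors
    correct = 0
    for x, y_true in zip(X_val, y_val):
        dists = np.sqrt(np.sum((X_tr - x) ** 2, axis=1))
        nearest_labels = y_tr[np.argsort(dists)[:k]]
        correct += np.argmax(np.bincount(nearest_labels)) == y_true
    return correct / len(y_val)

def cross_validate(X, y, k_choices, num_folds=5):
    # average validation accuracy over the folds for each candidate k
    X_folds = np.array_split(X, num_folds)
    y_folds = np.array_split(y, num_folds)
    results = {}
    for k in k_choices:
        accs = []
        for i in range(num_folds):
            # fold i is validation; the remaining folds are training
            X_tr = np.concatenate(X_folds[:i] + X_folds[i + 1:])
            y_tr = np.concatenate(y_folds[:i] + y_folds[i + 1:])
            accs.append(knn_accuracy(X_tr, y_tr, X_folds[i], y_folds[i], k))
        results[k] = np.mean(accs)
    return results
```

You would then pick the k with the highest averaged accuracy and evaluate it once on the held-out test set.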

KNN is never used on images in practice

pros:

  1. simple to understand and implement
  2. training takes no time (O(1))

cons:

  1. very slow at test time
  2. distance metrics on raw pixels are not informative
  3. curse of dimensionality: the number of training examples needed to densely cover the space grows exponentially with dimension (e.g., 4 points per axis means 4 points in 1D, 16 in 2D, 64 in 3D)

More on con #2:
Even though the four images below are clearly different, they can all have the same L2 distance.
[Figure: four visually different images with the same L2 distance]
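A tiny sketch of why this happens (toy data): two perturbations that look completely different can move every pixel by the same magnitude, producing identical L2 distances.

```python
import numpy as np

rng = np.random.default_rng(0)
orig = rng.random(3072)  # a "flattened image" (toy data)

brightened = orig + 0.1                            # every pixel shifted up by 0.1
noisy = orig + rng.choice([-0.1, 0.1], size=3072)  # random +/-0.1 per pixel

# both perturbations change each pixel by magnitude 0.1,
# so the L2 distances to the original are identical
print(np.linalg.norm(brightened - orig))  # sqrt(3072) * 0.1, about 5.54
print(np.linalg.norm(noisy - orig))       # same value
```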

Linear classification

Concept of Linear classification

A linear classifier computes class scores as a linear function of the input: f(x, W) = Wx + b, where x is the flattened image, W is a weight matrix with one row per class, and b is a bias vector. The class with the highest score is the prediction.
[Figure: linear classification, mapping an image to per-class scores]
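A minimal sketch of this scoring function on a CIFAR-10-sized input (the shapes follow the lecture; the weights here are random placeholders rather than learned values):

```python
import numpy as np

num_classes, dim = 10, 32 * 32 * 3  # CIFAR-10: 10 classes, 3072-dim flattened images

W = np.random.randn(num_classes, dim) * 0.01  # weight matrix (would be learned)
b = np.zeros(num_classes)                     # bias vector (would be learned)

x = np.random.rand(dim)    # one flattened image (placeholder data)
scores = W.dot(x) + b      # one score per class
pred = np.argmax(scores)   # highest score is the predicted class
```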

Limit of Linear classification: it is hard to classify non-linear cases like the ones below.
[Figure: cases a linear classifier cannot separate]

