YOLO
https://arxiv.org/abs/1506.02640
For object detection.
Frames object detection as a regression problem to spatially separated bounding boxes and associated class probabilities, using a single neural network.

Advantages:
Extremely fast, since detection is a single regression problem and needs no complex pipeline.
Reasons globally about the image when making predictions.
Learns generalizable representations of objects.
Method/Process:
First, divide the input image into an S×S grid. If the center of an object falls into a grid cell, that cell is responsible for detecting the object. Each grid cell predicts B bounding boxes and a confidence score for each box. The confidence is defined as Pr(Object) · IOU between the predicted box and the ground truth, so it should be zero when the cell contains no object.
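The IOU (intersection over union) used in the confidence score can be sketched as below. This is a minimal illustration, not the paper's code: the function name `iou` and the corner format `(x_min, y_min, x_max, y_max)` are assumptions made here for simplicity, whereas the network itself predicts center coordinates plus width and height.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this corner format is an
    assumption of this sketch, not the parameterization YOLO predicts.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```

Identical boxes give an IOU of 1 and disjoint boxes give 0, so the training target for confidence ranges from 0 (no object or no overlap) to 1 (perfect localization).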
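Each box consists of 5 predictions: x, y, w, h, and confidence. (x, y) is the box center relative to the bounds of the grid cell; w and h are relative to the whole image.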
Each grid cell also predicts C conditional class probabilities, Pr(Class_i | Object), one set per cell regardless of the number of boxes B.
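A small sketch of the bookkeeping described above: which cell is responsible for an object, and how large the prediction tensor is. The helper names `responsible_cell` and `output_shape` are hypothetical; the defaults S=7, B=2, C=20 are the values the paper uses for PASCAL VOC, giving a 7×7×30 output.

```python
def responsible_cell(cx, cy, img_w, img_h, S=7):
    """Find the grid cell containing an object's center.

    Returns (row, col, x_off, y_off), where the offsets are the center
    position relative to the cell's top-left corner, in units of one cell.
    """
    gx = cx / img_w * S  # center x in grid units
    gy = cy / img_h * S  # center y in grid units
    # Clamp so a center exactly on the right/bottom edge stays in the grid.
    col = min(int(gx), S - 1)
    row = min(int(gy), S - 1)
    return row, col, gx - col, gy - row


def output_shape(S=7, B=2, C=20):
    """Each cell predicts B boxes of 5 values plus C class probabilities."""
    return (S, S, B * 5 + C)
```

For example, an object centered at (224, 224) in a 448×448 image falls in cell (3, 3) with offsets (0.5, 0.5).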
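At test time, we multiply the conditional class probabilities by the individual box confidence predictions. The result is a class-specific confidence score for each box: Pr(Class_i | Object) · Pr(Object) · IOU = Pr(Class_i) · IOU.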
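The test-time multiplication for a single grid cell can be sketched as follows. The function name `class_specific_scores` and the plain-list inputs are assumptions for illustration; a real implementation would do this over the whole S×S×(B·5+C) tensor at once.

```python
def class_specific_scores(class_probs, box_confidences):
    """Combine per-cell class probabilities with per-box confidences.

    class_probs: C values of Pr(Class_i | Object) for one grid cell.
    box_confidences: B box confidence predictions for the same cell.
    Returns B lists of C scores, one list per box.
    """
    return [[conf * p for p in class_probs] for conf in box_confidences]


# One cell with C=2 classes and B=2 boxes (made-up numbers).
scores = class_specific_scores([0.8, 0.2], [0.5, 0.1])
```

Each score reflects both how likely the class is to appear in the box and how well the box fits the object; low scores are typically thresholded away before non-maximum suppression.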


Training: details in paper. The final layer predicts both class probabilities and bounding box coordinates. The final layer uses a linear activation; all other layers use leaky ReLU. Dropout is also used to reduce overfitting.
Loss function:
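For reference, the multi-part sum-squared error loss as given in the paper can be written as below. Here 1_i^obj indicates that an object appears in cell i, 1_ij^obj indicates that the j-th box predictor in cell i is responsible for that object, C is the box confidence, and p_i(c) is the class probability. The paper sets λ_coord = 5 and λ_noobj = 0.5, and takes square roots of w and h so that errors in small boxes weigh more than the same errors in large boxes.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}}
    \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
&+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}}
    \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2
         + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
&+ \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}}
    \sum_{c \,\in\, \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```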
