YOLO
https://arxiv.org/abs/1506.02640
For object detection.
Frames object detection as a regression problem to spatially separated bounding boxes and associated class probabilities, using a single neural network.

Advantages:
Extremely fast, since detection is framed as a regression problem.
Reasons globally about the image when making predictions.
Learns generalizable representations of objects.
Method/Process:
First, divide the input image into an S x S grid. If the center of an object falls into a grid cell, that cell is responsible for detecting the object. Each grid cell predicts B bounding boxes and a confidence score for each box. The confidence score is Pr(Object) times the IOU between the predicted box and the ground truth.
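A minimal sketch of the two ideas above: finding the grid cell responsible for an object, and computing IOU. The function names, the corner-based box format (x_min, y_min, x_max, y_max) and the default grid size of 7 (the value used in the paper) are assumptions made for illustration, not code from the paper.

```python
def responsible_cell(cx, cy, img_w, img_h, grid_size=7):
    """Return (row, col) of the grid cell that contains the object center."""
    # Scale the center into grid units; clamp so a center on the
    # right/bottom image edge still maps to the last cell.
    col = min(int(cx / img_w * grid_size), grid_size - 1)
    row = min(int(cy / img_h * grid_size), grid_size - 1)
    return row, col


def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width/height of the overlap are zero when the boxes do not intersect.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For example, an object centered in a 448 x 448 image falls into cell (3, 3) of a 7 x 7 grid, and two 2 x 2 boxes offset by one unit in each direction have an IOU of 1/7.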
Each box consists of 5 predictions: x, y, w, h, and confidence.
Each grid cell also predicts C conditional class probabilities, Pr(Class_i | Object).
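As a quick check on the size of the prediction, here is a small sketch that counts the values the network outputs. It assumes an S x S grid, B boxes per cell with 5 values each, and C class probabilities per cell; the defaults S=7, B=2, C=20 are the paper's PASCAL VOC settings, and the function name is hypothetical.

```python
def output_size(S=7, B=2, C=20):
    """Number of predicted values per image: S * S * (B * 5 + C)."""
    per_cell = B * 5 + C  # 5 values per box, plus one probability per class
    return S * S * per_cell
```

With the defaults this gives 7 * 7 * 30 = 1470 values, matching the 7 x 7 x 30 output tensor described in the paper.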
At test time, we multiply the conditional class probabilities with the individual box confidence predictions; the result is a class-specific confidence score for each box.
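The test-time multiplication can be sketched for a single grid cell as follows. In the paper the product gives a class-specific confidence score for each box. The function name and the list-based layout are assumptions for illustration only.

```python
def class_scores(box_confidences, class_probs):
    """Class-specific confidence scores for one grid cell.

    box_confidences: one confidence value per predicted box in the cell.
    class_probs: conditional class probabilities shared by the cell.
    Returns scores[b][c] = box_confidences[b] * class_probs[c].
    """
    return [[conf * p for p in class_probs] for conf in box_confidences]
```

A box with confidence 0.5 in a cell whose class probabilities are [0.5, 0.25] gets scores [0.25, 0.125]; a box with confidence 0 gets zero for every class.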


Training: details in paper. The final layer predicts both class probabilities and bounding box coordinates. The final layer uses a linear activation; all other layers use a leaky ReLU. Dropout is also used to reduce overfitting.
Loss function:
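For reference, the sum-squared-error loss as written in the paper, transcribed here from memory of the paper rather than from these notes. Here 1_{ij}^{obj} indicates that box j in cell i is responsible for an object, 1_{i}^{obj} indicates that an object appears in cell i, C_i denotes the box confidence (not the number of classes), and the paper sets \lambda_{coord} = 5 and \lambda_{noobj} = 0.5.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
      \mathbb{1}_{ij}^{\text{obj}}
      \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
 &+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
      \mathbb{1}_{ij}^{\text{obj}}
      \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2
           + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
 &+ \sum_{i=0}^{S^2} \sum_{j=0}^{B}
      \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
 &+ \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
      \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
 &+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}}
      \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

The square roots on width and height make the same absolute error count more for small boxes than for large ones, and the two \lambda weights keep the many cells without objects from dominating the gradient.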
