你只看一次(YOLO)是最新的实时物体检测系统。 在Pascal Titan X上,它以30 FPS的速度处理图像,并且在COCO测试开发中的mAP为57.9%。
You only look once (YOLO) is a state-of-the-art, real-time object detection system. On a Pascal Titan X it processes images at 30 FPS and has a mAP of 57.9% on COCO test-dev.
How It Works
Prior detection systems repurpose classifiers or localizers to perform detection. They apply the model to an image at multiple locations and scales. High scoring regions of the image are considered detections.
We use a totally different approach. We apply a single neural network to the full image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities.
Our model has several advantages over classifier-based systems. It looks at the whole image at test time so its predictions are informed by global context in the image. It also makes predictions with a single network evaluation unlike systems like R-CNN which require thousands for a single image. This makes it extremely fast, more than 1000x faster than R-CNN and 100x faster than Fast R-CNN. See our paper for more details on the full system.
What’s New in Version 3?
YOLOv3 uses a few tricks to improve training and increase performance, including: multi-scale predictions, a better backbone classifier, and more. The full details are in our paper!
Detection Using A Pre-Trained Model
This post will guide you through detecting objects with the YOLO system using a pre-trained model. If you don’t already have Darknet installed, you should do that first. Or instead of reading all that just run:
git clone https://github.com/pjreddie/darknet
cd darknet
make