YOLOBench: How to Find the Best YOLO Model for Your Edge Device

This blog is a summary of a Medium article that you can find here.

We are excited to release YOLOBench, a latency-accuracy benchmark of over 900 YOLO-based object detectors for embedded use cases. The paper was accepted at the ICCV 2023 RCV workshop, and you can read it in full on arXiv.

Check out the interactive YOLOBench app on HuggingFace Spaces where you can find the best YOLO model for your edge device!  

YOLO (You Only Look Once) is a popular family of deep learning models for object detection, known for fast inference. Object detection is the task of locating and identifying objects, such as cars, pedestrians, and animals, in an image or video. It has many applications in fields like security, surveillance, autonomous driving, and robotics.

However, not all YOLO models are created equal. Depending on the model architecture, the dataset, and the hardware platform, some YOLO models perform better than others in terms of accuracy and speed. How can you find the best YOLO model for your edge device, such as a security camera, a smartphone, or a drone?

That's where YOLOBench by Deeplite comes in. YOLOBench is a comprehensive benchmark that evaluates over 900 YOLO-based object detection models on four different datasets (COCO, PASCAL VOC, WIDERFACE, and SKU-110K) as well as five initial embedded hardware platforms (x86 CPU, ARM CPU, Nvidia GPU, Khadas VIM3 NPU and Orange Pi NPU). You can see this for yourself through our interactive YOLOBench HuggingFace Spaces App!

YOLOBench provides a fair and controlled comparison of these models by using a fixed training environment (code and training hyperparameters). It also collects accuracy and latency numbers for each model and dataset combination. YOLOBench considers multiple dimensions of the model search space such as depth-width variations and different input resolutions. By analyzing these numbers, you can easily and quickly find the Pareto-optimal models that achieve the best trade-off between accuracy and speed for your edge device.
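To make the idea of Pareto-optimal models concrete, here is a minimal Python sketch of how such a filter could work. It is an illustration only, not the YOLOBench code: the `Candidate` type, the `pareto_front` helper, and all model names and numbers are hypothetical. A model is kept only if no other model is both at least as fast and at least as accurate.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """One (model, input resolution) entry; all fields are illustrative."""
    name: str
    latency_ms: float  # measured on the target device, lower is better
    accuracy: float    # e.g. mAP on the target dataset, higher is better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    # Sort by latency (fastest first); on ties, put the more accurate model first.
    ordered = sorted(candidates, key=lambda c: (c.latency_ms, -c.accuracy))
    front = []
    best_accuracy = float("-inf")
    for c in ordered:
        # A slower model earns its place only by being strictly more accurate
        # than every faster model seen so far.
        if c.accuracy > best_accuracy:
            front.append(c)
            best_accuracy = c.accuracy
    return front


# Hypothetical measurements, made up for this example.
models = [
    Candidate("model_a", 12.0, 0.31),
    Candidate("model_b", 20.0, 0.29),  # slower and less accurate than model_a
    Candidate("model_c", 25.0, 0.37),
    Candidate("model_d", 40.0, 0.36),  # slower and less accurate than model_c
    Candidate("model_e", 55.0, 0.42),
]

for c in pareto_front(models):
    print(f"{c.name}: {c.latency_ms} ms, accuracy {c.accuracy}")
```

In practice you would pick the point on this frontier that fits your latency budget, for example the most accurate model that runs under 30 ms on your device.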

YOLOBench also evaluates several zero-cost accuracy estimators that can predict the accuracy of a model without training it. These estimators use signals such as MAC count, parameter count, and gradient information to estimate a model's accuracy from its architecture, using no more than a single batch of input data. For example, using zero-cost proxies such as MAC count and the NWOT estimator, Deeplite was able to find a new YOLO backbone that outperforms a state-of-the-art YOLOv8 model on a Raspberry Pi 4 CPU.
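One common way to judge a zero-cost estimator is to check how well its scores rank models compared with their accuracy after training. The sketch below computes a Kendall rank correlation (the simple tau-a variant) in plain Python. It is an assumption that this mirrors how you would want to evaluate a proxy; it is not taken from the YOLOBench code, and the MAC counts and accuracy values are made up for illustration.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Kendall tau-a: (concordant pairs - discordant pairs) / all pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        # Positive product: both sequences order this pair the same way.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Tied pairs count toward neither total.
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical proxy scores (MAC count in GMACs) and trained accuracy (mAP).
macs = [1.0, 2.5, 4.0, 8.0, 16.0]
accuracy = [0.20, 0.28, 0.26, 0.35, 0.40]

print(f"Kendall tau between MAC count and accuracy: {kendall_tau(macs, accuracy):.2f}")
```

A value near 1 means the proxy orders models almost the same way as trained accuracy does, which is what makes it useful for shortlisting architectures before spending any compute on training.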

Want to add your own hardware to YOLOBench?

Our initial set of benchmark hardware is just a start! Showcase your hardware to YOLOBench users by adding benchmark data from your own devices. If you'd like to have your hardware benchmarked on over 900 YOLO-based object detectors, you can find instructions on the next steps here, and we'll get right back to you!


We hope you enjoyed this summary of the article. Please let us know if you have any questions or feedback. 😊
