January 11, 2023

A new YOLO is here - YOLOv8

Recently, Ultralytics, the creators of the famous YOLOv5 (34.5k GitHub stars and 12.5k forks at the time of writing this post), announced their new model: YOLOv8. It is the latest version of the You Only Look Once algorithm for real-time object detection, instance segmentation and classification. The new version is intended to be more accurate, faster and more flexible. It’s really easy to use; see the tutorial on Google Colab for more details.

We are still waiting for more detailed information about the improvements, a comparison with other methods and the long-awaited arXiv paper that will summarize the YOLO models released by Ultralytics. They have even said that the paper has high priority. We hope so, we believe them, and we look forward to reading it. Nevertheless, they describe YOLOv8 as a new state of the art, and for now we can look through the code and implementation and list some of the new features in YOLOv8, which include:

  • the new backbone network
  • the new anchor-free detection head
  • the new loss function
GIF: A lot of objects to detect!

Ultralytics moved the project to a new repository called ‘ultralytics’. We think the main reason is to turn it into a general framework, written in PyTorch, for training, validating and running inference with models. It is also intended to support other YOLO versions and make them easy to compare. Ultralytics provides integrations with Roboflow, ClearML, Comet and Neural Magic. Another key feature is a user-friendly API that lets users run YOLOv8 from the Command Line Interface or from Python scripts. The framework supports exporting models to formats such as ONNX, OpenVINO, TensorRT or TFLite. It is worth mentioning that the code is still under active development and every hour brings new commits.
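To give a feeling for the Python side of that API, here is a minimal sketch rather than official usage. It assumes the `ultralytics` package is installed (`pip install ultralytics`) and that the pretrained weights can be downloaded. The helper name `predict_and_export` and the `EXPORT_FORMATS` table are our own illustrative choices and are not part of the library.

```python
# Format strings passed to model.export() for the export targets named above.
EXPORT_FORMATS = {
    "ONNX": "onnx",
    "OpenVINO": "openvino",
    "TensorRT": "engine",
    "TFLite": "tflite",
}


def predict_and_export(image_path, target="ONNX", weights="yolov8n.pt"):
    """Run inference on one image file, then export the model.

    Hypothetical helper for illustration; needs `pip install ultralytics`.
    """
    # Imported lazily so this snippet can be loaded without the package.
    from ultralytics import YOLO

    model = YOLO(weights)                        # load pretrained weights
    results = model.predict(source=image_path)   # inference on an image file
    exported = model.export(format=EXPORT_FORMATS[target])  # convert the model
    return results, exported
```

Calling `predict_and_export("some_image.jpg", target="TFLite")` would run the nano detection model on that (hypothetical) file and then export it to TFLite.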

How good is the new YOLO?

The result tables below come from the Ultralytics report. The detection and segmentation models were trained on the COCO dataset, while the classification models were trained on ImageNet. YOLOv8 provides five models of different sizes - nano (yolov8n), small (yolov8s), medium (yolov8m), large (yolov8l), and extra large (yolov8x).
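The size letters map directly onto checkpoint file names. The sketch below assumes the naming we saw in the Ultralytics repository (`-seg` for segmentation and `-cls` for classification weights). `checkpoint_name` is a hypothetical helper of ours, not a library function.

```python
SIZES = ("n", "s", "m", "l", "x")  # nano, small, medium, large, extra large

# Assumed file-name suffix per task, following the repository's naming.
TASK_SUFFIX = {"detect": "", "segment": "-seg", "classify": "-cls"}


def checkpoint_name(size, task="detect"):
    """Return the assumed weights file name for a model size and task."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}, expected one of {SIZES}")
    if task not in TASK_SUFFIX:
        raise ValueError(f"unknown task {task!r}")
    return f"yolov8{size}{TASK_SUFFIX[task]}.pt"
```

For example, `checkpoint_name("x", "segment")` gives `yolov8x-seg.pt`.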

The table presents the results of the YOLOv8 detection family on the COCO val2017 dataset.

The bigger the model, the better the mAP it achieves - that’s not surprising. In addition, we are pleased that inference with the biggest model, which requires 257.8 billion floating-point operations (FLOPs) per forward pass, runs even on the relatively cheap laptop-level GPU of one of the authors of this post. In general, YOLO seems to be a good choice for robotics because it can run onboard autonomous machines. Nice!

The table presents the results of the YOLOv8 segmentation family on the COCO val2017 dataset.

Moreover, our impressions of the new Ultralytics repository are very positive. Installation takes no more than a minute, and the package is available for Python. Developers used to the YOLOv5 repository will be satisfied: the new one is organized similarly, so there is no need to learn tons of new features at the beginning. However, one thing could have knocked us down with a feather. The first thing we wanted to test was passing just an image to the model.predict() method and, guess what, it is not possible yet. We were not the first to wonder how to do that - see that issue. Knock, knock, authors. We need it!
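While in-memory images are not accepted, one possible workaround is to write the image to a temporary file and pass its path instead. This is only a sketch: it assumes `model.predict()` accepts a file path through `source` and returns its results eagerly (not as a lazy generator). Both helper names are our own.

```python
import os
import tempfile


def image_bytes_to_tempfile(image_bytes, suffix=".jpg"):
    """Write encoded image bytes to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as handle:
        handle.write(image_bytes)
    return path


def predict_from_bytes(model, image_bytes, suffix=".jpg"):
    """Run model.predict() on an in-memory encoded image via a temp file.

    `model` is any object with a predict(source=...) method, for example
    a loaded YOLOv8 model (assumption: it accepts a path as the source).
    """
    path = image_bytes_to_tempfile(image_bytes, suffix)
    try:
        return model.predict(source=path)
    finally:
        os.remove(path)  # clean up once prediction has finished
```

The round trip through the disk is wasteful, so this only makes sense as a stopgap until images can be passed directly.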

The table shows YOLOv8 family classification results on the ImageNet dataset.

GIF 2: We love animals, especially in GoPro views

Conclusion

Here, we briefly reviewed Ultralytics' new YOLOv8 model: its improvements and capabilities, its reported performance and our first impressions from using it. The project's development is definitely worth keeping an eye on, and we are waiting for a detailed comparison with other existing methods. Hey Ultralytics! Keep up the excellent work; we look forward to using YOLO in new, exciting applications!