Dog Face Detection Using YOLO Network

Alzbeta Tureckova; Tomas Holik; Zuzana Kominkova Oplatkova

doi:10.13164/mendel.2020.2.017

Alzbeta Tureckova, Tomas Bata University in Zlin, Faculty of Applied Informatics, Czech Republic
Tomas Holik, Tomas Bata University in Zlin, Faculty of Applied Informatics, Czech Republic
Zuzana Kominkova Oplatkova, Tomas Bata University in Zlin, Faculty of Applied Informatics, Czech Republic

Keywords: Deep Learning, Deep Convolution Networks, Object Detection, Dog Face Detection, YOLO, iOS Mobile Application

Abstract

This work presents a real-world application of object detection, one of the current research lines in computer vision. Researchers commonly focus on human face detection. In contrast, this paper addresses the challenging task of detecting dog faces, objects with extensive variability in appearance. The system utilises the YOLO network, a deep convolutional neural network, to predict bounding boxes and class confidences simultaneously. The paper documents an extensive dataset of dog faces gathered from two different sources, together with the training procedure of the detector. The proposed system was designed to run on mobile hardware. The resulting application, Doggie Smile, helps to photograph dogs at the moment when they face the camera. It can simultaneously evaluate the gaze directions of three dogs in a scene more than 13 times per second, measured on an iPhone XR. The average precision of the dog face detection system is 0.92.
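For readers unfamiliar with the reported metric, the following is a minimal sketch of how average precision is commonly computed for a single-class detector. It is not the authors' evaluation code. The function names, the box format (x_min, y_min, x_max, y_max), and the non-interpolated form of average precision are all assumptions made for illustration; the abstract does not state which overlap threshold or averaging variant was used.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max).

    Assumed box format; a detection is usually counted as a true positive
    when its IoU with a ground-truth box exceeds a chosen threshold.
    """
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp to zero so that non-overlapping boxes give no intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, num_ground_truth):
    """Non-interpolated average precision for one class.

    detections: list of (confidence, is_true_positive) pairs, where the
    true-positive flag has already been decided, e.g. by an IoU threshold.
    num_ground_truth: total number of annotated objects.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            # Precision measured at the rank of each true positive.
            precision_sum += true_positives / rank
    return precision_sum / num_ground_truth if num_ground_truth else 0.0
```

As a usage example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` gives 1/7, because the boxes share one unit of area out of a union of seven. Interpolated variants of average precision, such as those used in some public benchmarks, would give slightly different values for the same detections.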
