Deakin University

SAWIT: A small-sized animal wild image dataset with annotations

Version 3 2024-08-05, 02:28
Version 2 2024-06-03, 01:05
Version 1 2023-10-10, 05:17
journal contribution
posted on 2024-08-05, 02:28 authored by Thi Thu Thuy Nguyen, AC Eichholtzer, Don Driscoll, NI Semianiw, Dean Corva, Abbas Kouzani, TT Nguyen, Duc Thanh Nguyen
Abstract

Computer vision has found many applications in automatic wildlife data analytics and biodiversity monitoring. Automating tasks like animal recognition or animal detection usually requires machine learning models (e.g., deep neural networks) trained on annotated datasets. However, image datasets built for general purposes fail to capture the realistic conditions of ecological studies, and existing datasets collected with camera traps mainly focus on medium to large-sized animals. There is a lack of annotated small-sized animal datasets in the field. Small-sized animals (e.g., small mammals, frogs, lizards, arthropods) play an important role in ecosystems but are difficult to capture on camera traps. They also present additional challenges: small animals can be more difficult to identify and blend more easily with their surroundings. To fill this gap, we introduce in this paper a new dataset dedicated to ecological studies of small-sized animals, and provide benchmark results of computer vision-based wildlife monitoring. The novelty of our work lies in SAWIT (small-sized animal wild image dataset), the first real-world dataset of small-sized animals, collected from camera traps in realistic conditions. Our dataset consists of 34,434 images and is annotated by experts in the field with object-level annotations (bounding boxes), providing 34,820 annotated animals across seven animal categories. The dataset encompasses a wide range of challenging scenarios, such as occlusions, blurriness, and instances where animals blend into dense vegetation. Based on the dataset, we benchmark two prevailing object detection algorithms, Faster RCNN and YOLO, and their variants. Experimental results show that all variants of YOLO (version 5) perform similarly, with overall mean Average Precision (mAP) across all animal categories ranging from 59.3% to 62.6%. Faster RCNN with ResNet50 and HRNet backbones achieves 61.7% mAP and 58.5% mAP, respectively. Through experiments, we identify challenges and suggest research directions for computer vision-based wildlife monitoring. We provide both the dataset and the animal detection code at https://github.com/dtnguyen0304/sawit.

History

Journal

Multimedia Tools and Applications

Volume

83

Pagination

34083-34108

Location

Berlin, Germany

ISSN

1380-7501

eISSN

1573-7721

Language

en

Publication classification

C1 Refereed article in a scholarly journal

Issue

11

Publisher

Springer