BIOLOGIYA MORYA, 2026, Vol. 52, No. 1, pp. 37-47
Application of a Non-Specialized Convolutional Neural Network Model for the Automatic Detection of Macroscopic Organisms in Underwater Video Recordings from the Chukchi Sea

¹Zhirmunsky National Scientific Center of Marine Biology, Far Eastern Branch, Russian Academy of Sciences, Vladivostok, Russian Federation; E-mail:

This study analyzes the application of the general-purpose convolutional neural network model Detectron2 (COCO-Detection/faster_rcnn_R_50_FPN_3x) to the automatic detection of macroscopic organisms in underwater video footage obtained in the Chukchi Sea with remotely operated vehicles (ROVs). The work is motivated by the urgent need for non-invasive biodiversity monitoring methods in Arctic seas and by the lack of annotated underwater fauna datasets for training classification and detection models. In representative video fragments recorded over different sediment types (silty-sandy, silty-sandy-shelly, and rocky), the model processed every third frame with the detection probability threshold disabled. After minor post-processing, the neural network's detections were compared with those made by a human on the same footage. On silty-sandy and silty-sandy-shelly sediments, the neural network's results were comparable to those of a human: precision (0.8), recall (0.8), and the F1-measure, the harmonic mean of the two (0.80-0.82), were all high, and the detections of the model and the human matched for 50% of the video fragment duration. Moreover, the volume of video material requiring viewing by a specialist was reduced by 30-53%. Against a complex rocky background, detection quality dropped sharply (precision 0.12; recall 0.07; F1-measure 0.08), and the model was practically incapable of distinguishing living objects. The identified limitations and causes of detection errors (false positives and omissions), as well as the prospects for improving the efficiency of the method through the creation of specialized annotated datasets and additional model training, are discussed.

Key words: remotely operated vehicle (ROV), non-invasive monitoring, benthic fauna detection, computer vision, Detectron2, Faster Region-based Convolutional Neural Networks (R-CNN), Common Objects in Context (COCO).
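The quality metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions: the value reported alongside recall is treated here as precision, and the F1-measure is the harmonic mean of precision and recall. The function names and the detection counts in the example are hypothetical and are not taken from the study.

```python
def harmonic_f1(precision, recall):
    """F1-measure: harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def detection_metrics(true_pos, false_pos, false_neg):
    """Return (precision, recall, F1) from detection counts.

    true_pos  - model detections confirmed by the human annotator
    false_pos - model detections with no matching organism
    false_neg - organisms seen by the human but missed by the model
    """
    detected = true_pos + false_pos   # everything the model flagged
    present = true_pos + false_neg    # everything the human annotated
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / present if present else 0.0
    return precision, recall, harmonic_f1(precision, recall)


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    print(detection_metrics(true_pos=80, false_pos=20, false_neg=20))
    # F1 recomputed from the rounded rocky-background values in the abstract.
    print(round(harmonic_f1(0.12, 0.07), 3))
```

Recomputing F1 from the rounded rocky-background values (0.12 and 0.07) gives about 0.09, close to the reported 0.08. The small gap may reflect rounding of the published values or averaging over fragments.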