A deep learning-based model for automatic identification of mesopelagic organisms from in-trawl cameras
by Taraneh Westergerling, Vaneeda Allken, Webjørn Melle, Anne Gro Vea Salvanes, Shale Rosen
Mesopelagic organisms play an important role in the ocean’s carbon transport and food webs and have been regarded as a potentially harvestable resource. Their extensive aggregations in the upper thousand meters of the water column are frequently detected acoustically as deep scattering layers. However, extracting species and length composition from acoustics alone is challenging. Trawl catches, commonly used for ground-truthing acoustic data, suffer from size- and species-specific escapement and are spatially integrated along the trawl path. In-trawl cameras offer records at a finer spatial scale and are unaffected by mesh selectivity in the codend. Hence, integrating optical systems into trawling operations can enhance the validation of acoustic data without increasing sampling time. In this study, we trained a deep learning-based object detection model (YOLO11s) to automate the identification of seven mesopelagic groups common in the North Atlantic Ocean (lanternfish, silvery lightfish, barracudina, krill, pelagic shrimp, gelatinous zooplankton, and squid), along with a group of larger pelagic fishes, from in-trawl images collected under white light and under red light with two gain settings. The model generally performed better on white-light images (weighted mean average precision ~ 0.95). However, using red light did not greatly reduce the model’s ability to detect mesopelagic organisms (weighted mean average precision ~ 0.77). The model performed especially well at detecting lanternfish, silvery lightfish, and barracudina (average precision > 0.89). Object classes with average precision values under 0.80 (e.g., pelagic shrimp, krill) benefited from increasing the image resolution and expanding the training dataset. Our study demonstrates that employing the latest machine learning algorithms enables the detection of small-sized mesopelagic species from in-trawl camera images, allowing for rapid extraction of depth-stratified data and records of fragile species that are typically lost through the codend meshes.
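As a rough illustration of the training and evaluation workflow summarized above, the sketch below uses the ultralytics Python package, which distributes pretrained YOLO11 weights such as yolo11s.pt. The dataset YAML, class instance counts, image size, and epoch count are placeholders introduced for illustration and are not the study's actual configuration.

```python
# Minimal sketch: fine-tune a YOLO11s detector on in-trawl images and report
# a support-weighted mean average precision. Paths, counts, and hyperparameters
# are illustrative assumptions, not the authors' settings.
from ultralytics import YOLO
import numpy as np

# Start from pretrained YOLO11s weights and fine-tune on a custom dataset
# described by a YOLO-format YAML (train/val image folders + class names).
model = YOLO("yolo11s.pt")
model.train(
    data="mesopelagic.yaml",  # hypothetical dataset config with 8 classes
    epochs=100,
    imgsz=1280,               # larger input size can help small organisms
)

# Validate on the held-out split; metrics.box.maps holds per-class AP values.
metrics = model.val(data="mesopelagic.yaml")
per_class_ap = metrics.box.maps

# Support-weighted mean AP: weight each class by its number of ground-truth
# instances (placeholder counts for the eight object classes).
instance_counts = np.array([1200, 950, 400, 2300, 800, 300, 150, 100])
weighted_map = float(np.average(per_class_ap, weights=instance_counts))
print(f"Support-weighted mAP: {weighted_map:.3f}")
```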