Background
An overview of the process that made this a reality
The SEFSC Gulf and Caribbean Reef Fish (GCRF) Branch collects roughly 15 TB of underwater video data each year (as of 2022) during reef fish video surveys conducted throughout the Gulf of Mexico. Subject matter experts (SMEs) post-process these videos by “reading” them to identify each fish observed in the scene and generate species counts for a number of economically important species in the region. These species counts (e.g., maximum number observed, or max(N)) quantify species abundance and inform stock assessments. At existing staffing levels and current data volumes, manually post-processing a year’s worth of survey video takes 18 months on average. Thus, a given year’s video data is rarely fully processed before the next year’s surveys begin. This model is unsustainable at worst and, at best, risks processing only a subset of the available data.
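To make the max(N) metric concrete, the following is a minimal Python sketch. It assumes max(N) is the largest number of individuals of a species visible in any single frame of a video, which is a common reading of the term. The function name `max_n`, the input layout (one list of species labels per frame), and the species labels are hypothetical illustrations, not the GCRF workflow or VIAME output.

```python
from collections import Counter, defaultdict


def max_n(frame_detections):
    """Return {species: max(N)} for one video.

    frame_detections: iterable of lists, one list of species labels per frame.
    Assumption: max(N) is the highest per-frame count seen for each species.
    """
    best = defaultdict(int)
    for labels in frame_detections:
        # Count how many individuals of each species appear in this frame.
        for species, count in Counter(labels).items():
            best[species] = max(best[species], count)
    return dict(best)


# Hypothetical per-frame labels for a short clip (illustrative only).
frames = [
    ["red snapper", "red snapper", "gray triggerfish"],
    ["red snapper"],
    ["red snapper", "red snapper", "red snapper"],
    [],
]

counts = max_n(frames)
print(counts)
```

Taking the per-frame maximum rather than summing across frames avoids counting the same individual more than once as it swims in and out of view.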
A variety of tools have been developed to automate the tedious annotation process using modern artificial intelligence (AI) techniques based on deep learning (DL). One such application is the Video and Image Analytics for Marine Environments (VIAME) toolkit developed by Kitware in conjunction with NOAA Fisheries. VIAME is an open-source computer vision software platform for training and running DL algorithms for image and video processing. Models can be trained to detect fish, label species, or track individuals in videos or image sequences, reducing the processing time from around 1-2 months per video for humans to several hours. GCRF staff are currently testing the suitability and functionality of VIAME for processing reef fish video data and are actively working to ensure that promising AI technologies are robustly vetted for precision, accuracy, and reliability before being transitioned to operational use.
The volume of data requiring processing (tens of TB and growing each year) presents significant storage and management challenges, and DL algorithms generally require high-performance computing resources to be successful. Modern cloud technologies offer solutions to both the data storage and the computational resource challenges. The Southwest Fisheries Science Center (SWFSC) successfully deployed VIAME in the NOAA Fisheries cloud and documented the implementation process and lessons learned so that other regions could replicate their efforts. In 2022, the SEFSC Fisheries Assessment, Technology, and Engineering Support (FATES) Division received funding to do the same: migrate VIAME from on-premises workstations into the NOAA Fisheries cloud. The migration would allow the creation of high-performance GPU compute instances alongside scalable data storage buckets for staging large quantities of data while they are post-processed. Because the cloud is new to NOAA Fisheries, the SEFSC, and end users alike, the migration process became a learning experience for everyone involved. As of 2024, a cloud instance of VIAME has been stood up for the SEFSC and is being tested by GCRF staff.