Menhaden Ageing Model
An innovative, state-of-the-art deep learning model for automatically ageing menhaden samples using scale images
The Menhaden Ageing Model provides an innovative method for automatically estimating menhaden age from scale images. Built upon state-of-the-art deep learning algorithms, it enables rapid generation of fish age predictions by simply pointing it to a directory containing magnified images of scale samples.
About
The internal workflow is as follows:
- Raw images are first converted to grayscale so that every image pixel contains a value in [0, 255]. These grayscale images are then processed with binary thresholding, in which all pixels whose values are above a set threshold are set to 1 while the rest are set to 0. This allows the scale itself to be distinguished from the image background. The threshold value used for menhaden, determined by trial and error, is 100.
- Imperfections in the new masked images are cleaned up using morphological opening and closing techniques to remove undesired background noise and capture any missed portions of the scale.
- The contours of the masked shape are identified in order to extract the object of interest (i.e., the scale).
- The scale is then cropped out of the original image and padded to make it square.
- The new square image containing just the scale of interest is passed to a trained custom residual neural network (ResNet) deep learning classification model. Model output is saved to a CSV file.
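For orientation, the sketch below illustrates this pipeline using OpenCV. It is illustrative only: the packaged Scale_Raw_Image_Preprocessing.py script is the authoritative implementation, and the kernel size, padding fraction, and file handling shown here are assumptions.

```python
# Illustrative sketch of the pre-processing pipeline described above.
# Assumes OpenCV (cv2) and NumPy; the kernel size and padding fraction are hypothetical,
# and the bundled Scale_Raw_Image_Preprocessing.py script is the authoritative version.
import cv2
import numpy as np

def preprocess_scale(image_path, threshold=100, pad_frac=0.2):
    # 1. Load the raw image and convert it to grayscale (pixel values in [0, 255])
    gray = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)

    # 2. Binary thresholding: pixels above the threshold become foreground (the scale)
    _, mask = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)

    # 3. Morphological opening/closing to remove background noise and fill gaps
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

    # 4. Identify contours and keep the largest one as the scale
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))

    # 5. Crop the scale from the original image and pad it out to a square
    crop = gray[y:y + h, x:x + w]
    side = int(max(w, h) * (1 + pad_frac))
    top, left = (side - h) // 2, (side - w) // 2
    return cv2.copyMakeBorder(crop, top, side - h - top, left, side - w - left,
                              cv2.BORDER_CONSTANT, value=0)
```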
Implementation instructions follow. Be sure to set up and configure a Python environment before the first use.
Usage
Running the model requires two steps. First, raw images must be pre-processed in order to crop out the scale of interest from the full image, pad the cropped image to ensure the full scale is captured, and (optionally) normalize the cropped image to facilitate ageing. This is all done using the Scale_Raw_Image_Preprocessing.py Python script. To execute, run the following command in a command line terminal:
python Scale_Raw_Image_Preprocessing.py --config-file <config_file>
or
python Scale_Raw_Image_Preprocessing.py -c <config_file>
where <config_file> is the path to the configuration file containing the model settings described below. The ageing model itself is wrapped inside a second Python script called Scale_Aging_Inference_Script_Image_Only.py. To execute, run the following command in a command line terminal:
python Scale_Aging_Inference_Script_Image_Only.py --config-file <config_file>
or
python Scale_Aging_Inference_Script_Image_Only.py -c <config_file>
For more information, including an execution example, see Using the Model in the docs.
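Conceptually, the inference step loads the trained weights, classifies each preprocessed image, and writes the predictions to a CSV file. The sketch below is a rough illustration only: torchvision's stock resnet18 stands in for the custom network, and the file names, class count, and transforms are hypothetical.

```python
# Conceptual sketch of the inference step (illustrative only; the custom ResNet in
# Scale_Aging_Inference_Script_Image_Only.py may differ from the stock resnet18 used here,
# and the file names, class count, and transforms are hypothetical).
import os
import pandas as pd
import torch
from PIL import Image
from torchvision import models, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = models.resnet18(weights=None, num_classes=8)   # hypothetical number of age classes
model.load_state_dict(torch.load("best_model.pth", map_location=device))
model.to(device).eval()

to_tensor = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

image_dir = "preprocessed_images"                      # should match preprocessed_image_path
rows = []
with torch.no_grad():
    for name in sorted(os.listdir(image_dir)):
        img = Image.open(os.path.join(image_dir, name))
        logits = model(to_tensor(img).unsqueeze(0).to(device))
        rows.append({"image": name, "predicted_age": int(logits.argmax(dim=1).item())})

pd.DataFrame(rows).to_csv("predicted_ages.csv", index=False)   # corresponds to out_path
```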
Options
All user options are contained in a config YAML file (called configurations.yml by default, but it can be named anything) to allow easier control and greater reproducibility. Settings are entered as key: value pairs as described below. The first set of parameters controls the image processing routine while the second set controls the age model itself.
Pre-Processing Options
Key | Description |
---|---|
raw_image_path | Path to raw images. Best to include the full path in quotations. Example: “G:/Shared drives/NMFS SEFSC FATES Advanced Technology/BIOLOGY_LIFE_HISTORY_DATA/age_testing/images” |
preprocessed_image_path | Path to save the processed images. Best to include the full path in quotations and to use a dedicated folder. |
input_type | Input image type. |
output_type | Output image type. Should not need to be changed. |
segment | Scale segmentation method: “binary” for binary thresholding and “sam” for the Segment Anything Model (SAM). Binary thresholding should work fine if images are high contrast with light scales on a dark background. SAM is more robust to variable image conditions but requires more processing time and a GPU. |
binary_threshold | Binary threshold pixel value for differentiating between foreground (scale) and background. Default: 100 |
points_per_side | Number of points to use for automatic segmentation of scales with SAM. This should be adjusted based on the size of the object of interest relative to the entire image. In general, the number of points should be greater than the ratio of image size to object size for the smallest object of interest; too many points, however, can greatly increase processing time. Default: 16 |
stability_score_thresh | Threshold for whether to include pixels in the object mask. If the mask is too large, increase the score threshold; if it is too small, decrease it. Default: 0.93 |
downsample | Down-sample factor applied to the image before input to SAM to reduce processing time. Default: 0.5 (i.e., reduce image dimensions to 50% of the original size) |
sam_model_type | SAM model type. Options are “vit_b”, “vit_l”, and “vit_h”, in order of increasing size. Default: “vit_b” |
sam_weights_path | Path to SAM model weights. Make sure this matches the model type. Best to use the full path in quotations. |
pad | Padding for the top and sides of the cropped image, defined as a fraction of the original cropped image size. Bottom padding is controlled by bottom_pad. Default: 0.2 |
bottom_pad | Padding for the bottom of the cropped image. This is defined separately since, for scale images, the bottom is usually visually distinct from the body and may be missed in the segmentation. Default: 0.4 |
normalization | Method for optionally normalizing the image after cropping and padding. Options: “none” for no normalization, “he” for histogram equalization, and “clahe” for Contrast Limited Adaptive Histogram Equalization. |
invert | Option to invert the pixel values in grayscale before normalization. This makes dark regions light and light regions dark. |
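To illustrate how several of these options map onto common library calls, the snippet below uses OpenCV and Meta's segment-anything package. It is a sketch only; the checkpoint and image file names and the CLAHE parameters are assumptions, and the actual pre-processing script may differ.

```python
# Illustrative mapping of several pre-processing options to library calls.
# Assumes OpenCV (cv2) and the segment-anything package; file names are hypothetical.
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# SAM-based segmentation (segment: "sam")
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")   # sam_model_type / sam_weights_path
mask_generator = SamAutomaticMaskGenerator(
    sam,
    points_per_side=16,                                         # points_per_side
    stability_score_thresh=0.93,                                # stability_score_thresh
)
image = cv2.imread("raw_scale.jpg")
image = cv2.resize(image, None, fx=0.5, fy=0.5)                 # downsample: 0.5
masks = mask_generator.generate(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

# Optional normalization after cropping and padding
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
inverted = cv2.bitwise_not(gray)                                # invert: true
he = cv2.equalizeHist(gray)                                     # normalization: "he"
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(gray)  # normalization: "clahe"
```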
Age Model Options
The following options control the age inference model. They are entered into the same configuration file for convenience.
Key | Description |
---|---|
image_path | Path to the preprocessed image folder. Best to include the full path in quotations. This should match preprocessed_image_path from the pre-processing step. |
model_path | Path to model weights (directory and file name of the weights file). Best to include the full path in quotations. |
out_path | Path to save results (directory and file name for the CSV file). Best to include the full path in quotations. |
Taken together, these path settings specify:
- The directory containing the scale images to process
- The directory where you want the model output file to be written
- The directory containing the trained model weights (i.e., the best_model.pth file) and, if desired, the Segment Anything Model weights. If you simply cloned the repo and have not moved anything around, the trained model weights file will be alongside the model script in the scripts subdirectory of the cloned repository. The SAM weights will be wherever you saved them upon downloading them.

Absolute file paths are generally recommended to avoid unintended behavior but will vary from computer to computer.
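Putting the two sets of options together, a complete configuration file might look like the following. All paths and file-type values below are illustrative placeholders; substitute your own.

```yaml
# Example configurations.yml (illustrative values only; substitute your own paths)
# Pre-processing options
raw_image_path: "C:/menhaden/raw_images"
preprocessed_image_path: "C:/menhaden/preprocessed_images"
input_type: ".jpg"
output_type: ".png"
segment: "binary"
binary_threshold: 100
points_per_side: 16
stability_score_thresh: 0.93
downsample: 0.5
sam_model_type: "vit_b"
sam_weights_path: "C:/menhaden/models/sam_vit_b.pth"
pad: 0.2
bottom_pad: 0.4
normalization: "none"
invert: false

# Age model options
image_path: "C:/menhaden/preprocessed_images"
model_path: "C:/menhaden/models/best_model.pth"
out_path: "C:/menhaden/results/predicted_ages.csv"
```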
Dependencies
- Anaconda (for virtual environment implementation)
- Git CLI (for repository cloning)
- Python >= 3.9
- torch == 1.12.1
- torchvision == 0.13.1
- torchaudio == 0.12.1
- opencv-python
- pandas
- tqdm
Recommendations
- matplotlib (data plotting in Python)
- jupyter (viewing Jupyter notebook demonstrations provided in the GitHub repo)
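For orientation only, environment setup with Anaconda might look roughly like the following (the environment name is arbitrary, and the exact PyTorch install command can depend on your platform and CUDA version):

conda create -n menhaden python=3.9
conda activate menhaden
pip install torch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1
pip install opencv-python pandas tqdm matplotlib jupyter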
See Getting Started for installation and setup instructions.
Release Notes
Version History
- 2025.0.1 (BETA) (May 2025): Initial version for testing with the following functionality:
  - Run via command line with image, output, and model directories passed as required arguments
- 2025.0.2 (BETA) (July 2025):
  - Adds optional histogram normalization (simple histogram equalization and Contrast Limited Adaptive Histogram Equalization)
  - Separates image pre-processing and scale ageing into two separate scripts
  - Model settings and hyperparameters, including input and output file paths, are now listed in a configuration YAML file passed as a single command line argument to both Python scripts
Software code created by U.S. Government employees is not subject to copyright in the United States (17 U.S.C. §105). The United States/Department of Commerce reserve all rights to seek and obtain copyright protection in countries other than the United States for Software authored in its entirety by the Department of Commerce. To this end, the Department of Commerce hereby grants to Recipient a royalty-free, nonexclusive license to use, copy, and create derivative works of the Software outside of the United States.
This software is a scientific product and is not official communication of the National Oceanic and Atmospheric Administration, or the United States Department of Commerce. All NOAA software and project code is provided on an “as is” basis and the user assumes responsibility for its use. Any claims against the Department of Commerce or Department of Commerce bureaus stemming from the use of this software will be governed by all applicable Federal law. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation or favoring by the Department of Commerce. The Department of Commerce seal and logo, or the seal and logo of a DOC bureau, shall not be used in any manner to imply endorsement of any commercial product or activity by DOC or the United States Government.