Menhaden Ageing Model
An innovative, state-of-the-art deep learning model for automatically ageing menhaden samples using scale images
The Menhaden Ageing Model provides an innovative method for automatically estimating menhaden age from scale images. Built upon state-of-the-art deep learning algorithms, it enables rapid generation of fish age predictions by simply pointing it to a directory containing magnified images of scale samples.
About
The internal workflow is as follows:
- Raw images are first converted to grayscale so that every image pixel contains a value in [0, 255]. These grayscale images are then processed with binary thresholding, in which all pixels whose values are above a set threshold are set to 1 while the rest are set to 0. This allows the scale itself to be distinguished from the image background. The threshold value used for menhaden, determined by trial and error, is 100.
- Imperfections in the new masked images are cleaned up using morphological opening and closing techniques to remove undesired background noise and capture any missed portions of the scale.
- The contours of the masked shape are identified in order to extract the object of interest (i.e., the scale).
- The scale is then cropped out of the original image and padded to make it square.
- The new square image containing just the scale of interest is passed to a trained custom residual neural network (ResNet) deep learning classification model. Model output is saved to a CSV file.
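For orientation, the sketch below illustrates this pipeline using OpenCV. It is illustrative only: the packaged Scale_Raw_Image_Preprocessing.py script is the authoritative implementation, and the kernel size, padding fraction, and file handling shown here are assumptions.

```python
# Illustrative sketch of the pre-processing pipeline described above.
# Assumes OpenCV (cv2) and NumPy; the kernel size and padding fraction are hypothetical,
# and the bundled Scale_Raw_Image_Preprocessing.py script is the authoritative version.
import cv2
import numpy as np

def preprocess_scale(image_path, threshold=100, pad_frac=0.2):
    # 1. Load the raw image and convert it to grayscale (pixel values in [0, 255])
    gray = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)

    # 2. Binary thresholding: pixels above the threshold become foreground (the scale)
    _, mask = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)

    # 3. Morphological opening/closing to remove background noise and fill gaps
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

    # 4. Identify contours and keep the largest one as the scale
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))

    # 5. Crop the scale from the original image and pad it out to a square
    crop = gray[y:y + h, x:x + w]
    side = int(max(w, h) * (1 + pad_frac))
    top, left = (side - h) // 2, (side - w) // 2
    return cv2.copyMakeBorder(crop, top, side - h - top, left, side - w - left,
                              cv2.BORDER_CONSTANT, value=0)
```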
Implementation instructions follow. Be sure to set up and configure a Python environment before the first use.
Usage
Running the model requires two steps. First, raw images must be pre-processed in order to crop out the scale of interest from the full image, pad the cropped image to ensure the full scale is captured, and (optionally) normalize the cropped image to facilitate ageing. This is all done using the Scale_Raw_Image_Preprocessing.py Python script. To execute, run the following command in a command line terminal:
python Scale_Raw_Image_Preprocessing.py --config-file <config_file>
or
python Scale_Raw_Image_Preprocessing.py -c <config_file>
where <config_file> is the path to the configuration file containing the model settings described below. The ageing model itself is wrapped inside a second Python script called Scale_Aging_Inference_Script_Image_Only.py. To execute, run the following command in a command line terminal:
python Scale_Aging_Inference_Script_Image_Only.py --config-file <config_file>
or
python Scale_Aging_Inference_Script_Image_Only.py -c <config_file>
For more information, including an execution example, see Using the Model in the docs.
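Conceptually, the inference step loads the trained weights, classifies each preprocessed image, and writes the predictions to a CSV file. The sketch below is a rough illustration only: torchvision's stock resnet18 stands in for the custom network, and the file names, class count, and transforms are hypothetical.

```python
# Conceptual sketch of the inference step (illustrative only; the custom ResNet in
# Scale_Aging_Inference_Script_Image_Only.py may differ from the stock resnet18 used here,
# and the file names, class count, and transforms are hypothetical).
import os
import pandas as pd
import torch
from PIL import Image
from torchvision import models, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = models.resnet18(weights=None, num_classes=8)   # hypothetical number of age classes
model.load_state_dict(torch.load("best_model.pth", map_location=device))
model.to(device).eval()

to_tensor = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

image_dir = "preprocessed_images"                      # should match preprocessed_image_path
rows = []
with torch.no_grad():
    for name in sorted(os.listdir(image_dir)):
        img = Image.open(os.path.join(image_dir, name))
        logits = model(to_tensor(img).unsqueeze(0).to(device))
        rows.append({"image": name, "predicted_age": int(logits.argmax(dim=1).item())})

pd.DataFrame(rows).to_csv("predicted_ages.csv", index=False)   # corresponds to out_path
```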
Options
All user options are contained in a config YAML file (called configurations.yml by default, but it can be named anything) to allow easier control and greater reproducibility. Settings are entered as key: value pairs as described below. The first set of parameters controls the image processing routine while the second set controls the age model itself.
Pre-Processing Options
Key | Description |
---|---|
raw_image_path | Path to raw images. Best to include the full path in quotations. Example: “G:/Shared drives/NMFS SEFSC FATES Advanced Technology/BIOLOGY_LIFE_HISTORY_DATA/age_testing/images” |
preprocessed_image_path | Path to save the processed images. Best to include the full path in quotations and to use a dedicated folder. |
input_type | Input image type. |
output_type | Output image type. Should not need to be changed. |
segment | Scale segmentation method: “binary” for binary thresholding and “sam” for the Segment Anything Model (SAM). Binary thresholding should work fine if images are high contrast with light scales on a dark background. SAM is more robust to variable image conditions but requires more processing time and a GPU. |
binary_threshold | Binary threshold pixel value for differentiating between foreground (scale) and background. Default: 100 |
points_per_side | Number of points to use for automatic segmentation of scales with SAM. This should be adjusted based on the size of the object of interest relative to the entire image. In general, the number of points should be greater than the ratio of image size to object size for the smallest object of interest; too many points, however, can greatly increase processing time. Default: 16 |
stability_score_thresh | Threshold for whether to include pixels in the object mask. If the mask is too large, increase the score threshold; if it is too small, decrease it. Default: 0.93 |
downsample | Down-sample factor applied to the image before input to SAM to reduce processing time. Default: 0.5 (i.e., reduce image dimensions to 50% of the original size) |
sam_model_type | SAM model type. Options are “vit_b”, “vit_l”, and “vit_h”, in order of increasing size. Default: “vit_b” |
sam_weights_path | Path to SAM model weights. Make sure this matches the model type. Best to use the full path in quotations. |
pad | Padding for the top and sides of the cropped image, defined as a fraction of the original cropped image size. Bottom padding is controlled by bottom_pad. Default: 0.2 |
bottom_pad | Padding for the bottom of the cropped image. This is defined separately since, for scale images, the bottom is usually visually distinct from the body and may be missed in the segmentation. Default: 0.4 |
normalization | Method for optionally normalizing the image after cropping and padding. Options: “none” for no normalization, “he” for histogram equalization, and “clahe” for Contrast Limited Adaptive Histogram Equalization. |
invert | Option to invert the pixel values in grayscale before normalization. This makes dark regions light and light regions dark. |
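To illustrate how several of these options map onto common library calls, the snippet below uses OpenCV and Meta's segment-anything package. It is a sketch only; the checkpoint and image file names and the CLAHE parameters are assumptions, and the actual pre-processing script may differ.

```python
# Illustrative mapping of several pre-processing options to library calls.
# Assumes OpenCV (cv2) and the segment-anything package; file names are hypothetical.
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# SAM-based segmentation (segment: "sam")
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")   # sam_model_type / sam_weights_path
mask_generator = SamAutomaticMaskGenerator(
    sam,
    points_per_side=16,                                         # points_per_side
    stability_score_thresh=0.93,                                # stability_score_thresh
)
image = cv2.imread("raw_scale.jpg")
image = cv2.resize(image, None, fx=0.5, fy=0.5)                 # downsample: 0.5
masks = mask_generator.generate(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

# Optional normalization after cropping and padding
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
inverted = cv2.bitwise_not(gray)                                # invert: true
he = cv2.equalizeHist(gray)                                     # normalization: "he"
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(gray)  # normalization: "clahe"
```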
Age Model Options
The following options control the age inference model. They are entered into the same configuration file for convenience.
Key | Description |
---|---|
image_path | Path to the preprocessed image folder. Best to include the full path in quotations. This should match preprocessed_image_path from the pre-processing step. |
model_path | Path to model weights (directory and file name of the weights file). Best to include the full path in quotations. |
out_path | Path to save results (directory and file name for the CSV file). Best to include the full path in quotations. |
Taken together, these path settings specify:
- The directory containing the scale images to process
- The directory where you want the model output file to be written
- The directory containing the trained model weights (i.e., the best_model.pth file) and, if desired, the Segment Anything Model weights. If you simply cloned the repo and have not moved anything around, the trained model weights file will be alongside the model script in the scripts subdirectory of the cloned repository. The SAM weights will be wherever you saved them upon downloading them.

Absolute file paths are generally recommended to avoid unintended behavior but will vary from computer to computer.
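Putting the two sets of options together, a complete configuration file might look like the following. All paths and file-type values below are illustrative placeholders; substitute your own.

```yaml
# Example configurations.yml (illustrative values only; substitute your own paths)
# Pre-processing options
raw_image_path: "C:/menhaden/raw_images"
preprocessed_image_path: "C:/menhaden/preprocessed_images"
input_type: ".jpg"
output_type: ".png"
segment: "binary"
binary_threshold: 100
points_per_side: 16
stability_score_thresh: 0.93
downsample: 0.5
sam_model_type: "vit_b"
sam_weights_path: "C:/menhaden/models/sam_vit_b.pth"
pad: 0.2
bottom_pad: 0.4
normalization: "none"
invert: false

# Age model options
image_path: "C:/menhaden/preprocessed_images"
model_path: "C:/menhaden/models/best_model.pth"
out_path: "C:/menhaden/results/predicted_ages.csv"
```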
Dependencies
- Anaconda (for virtual environment implementation)
- Git CLI (for repository cloning)
- Python >= 3.9
- torch == 1.12.1
- torchvision == 0.13.1
- torchaudio == 0.12.1
- opencv-python
- pandas
- tqdm
Recommendations
- matplotlib (data plotting in Python)
- jupyter (viewing Jupyter notebook demonstrations provided in the GitHub repo)
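For orientation only, environment setup with Anaconda might look roughly like the following (the environment name is arbitrary, and the exact PyTorch install command can depend on your platform and CUDA version):

conda create -n menhaden python=3.9
conda activate menhaden
pip install torch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1
pip install opencv-python pandas tqdm matplotlib jupyter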
See Getting Started for installation and setup instructions.
Release Notes
Version History
- 2025.0.1 (BETA) (May 2025): Initial version for testing with the following functionality:
  - Run via command line with image, output, and model directories passed as required arguments
- 2025.0.2 (BETA) (July 2025):
  - Adds optional histogram normalization (simple histogram equalization and Contrast Limited Adaptive Histogram Equalization)
  - Separates image pre-processing and scale ageing into two separate scripts
  - Model settings and hyperparameters, including input and output file paths, are now listed in a configuration YAML file passed as a single command line argument to both Python scripts
Software code created by U.S. Government employees is not subject to copyright in the United States (17 U.S.C. §105). The United States/Department of Commerce reserve all rights to seek and obtain copyright protection in countries other than the United States for Software authored in its entirety by the Department of Commerce. To this end, the Department of Commerce hereby grants to Recipient a royalty-free, nonexclusive license to use, copy, and create derivative works of the Software outside of the United States.
This software is a scientific product and is not official communication of the National Oceanic and Atmospheric Administration, or the United States Department of Commerce. All NOAA software and project code is provided on an “as is” basis and the user assumes responsibility for its use. Any claims against the Department of Commerce or Department of Commerce bureaus stemming from the use of this software will be governed by all applicable Federal law. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation or favoring by the Department of Commerce. The Department of Commerce seal and logo, or the seal and logo of a DOC bureau, shall not be used in any manner to imply endorsement of any commercial product or activity by DOC or the United States Government.