
API: Utils

hydra_suite.utils

Utility modules for the HYDRA Suite.

This package contains various utility functions and classes for image processing, CSV writing, ROI handling, video processing, GPU acceleration, and other common operations.

FramePrefetcher

Asynchronous frame prefetcher for video processing.

Uses a background thread to read frames ahead of time, reducing I/O blocking in the main processing loop. Maintains a small buffer of pre-read frames.

Example:

    cap = cv2.VideoCapture("video.mp4")
    prefetcher = FramePrefetcher(cap, buffer_size=2)
    prefetcher.start()

    while True:
        ret, frame = prefetcher.read()
        if not ret:
            break
        # Process frame...

    prefetcher.stop()

__init__(video_capture, buffer_size=2, read_timeout=30.0)

Initialize frame prefetcher.

Args:
    video_capture: OpenCV VideoCapture object.
    buffer_size (int): Number of frames to buffer (default: 2). Higher values use more memory but tolerate I/O stalls better.
    read_timeout (float): Seconds to wait for a frame before declaring a stall (default: 30.0). Increase when the decode backend shares resources with GPU inference.

start()

Start the background prefetching thread.

read()

Read the next frame (from prefetch buffer).

Returns:
    tuple: (ret, frame), where ret is a bool indicating success and frame is a numpy array (None on failure).

stop()

Stop the prefetching thread and clean up.

__enter__()

Context manager support.

__exit__(_exc_type, _exc_val, _exc_tb)

Context manager support.
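The docs only say "context manager support", so the exact mapping is not spelled out; presumably `__enter__` calls `start()` and `__exit__` calls `stop()`. The stub below is hypothetical and only sketches that resource-management shape — it is not the library's code:

```python
class PrefetcherStub:
    """Hypothetical stand-in illustrating the start/stop-on-enter/exit
    pattern FramePrefetcher presumably follows."""

    def __init__(self):
        self.running = False

    def start(self):
        self.running = True   # real class: spawn the reader thread

    def stop(self):
        self.running = False  # real class: join the thread, drain buffer

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, _exc_type, _exc_val, _exc_tb):
        self.stop()
        return False  # do not swallow exceptions raised in the body


with PrefetcherStub() as pf:
    assert pf.running       # frames would be prefetched here
assert not pf.running        # stopped even if the body raised
```

The benefit of the context-manager form is that the background thread is stopped even when frame processing raises.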

FramePrefetcherBackward

Bases: FramePrefetcher

Frame prefetcher for backward (reverse) video iteration.

Extends FramePrefetcher to support reading frames in reverse order by seeking backward through the video.

__init__(video_capture, buffer_size=2, total_frames=None)

Initialize backward frame prefetcher.

Args:
    video_capture: OpenCV VideoCapture object.
    buffer_size (int): Number of frames to buffer.
    total_frames (int): Total frames in the video (required for backward seeking).

fit_circle_to_points(points)

Fit a circle to a set of points using least squares optimization.

Uses algebraic circle fitting method for robust estimation from 3+ points.

Args: points (list): List of (x, y) coordinate tuples

Returns: tuple: (center_x, center_y, radius) or None if fitting fails
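The "algebraic circle fitting" named above is commonly the Kåsa formulation: substitute each point into x² + y² + Dx + Ey + F = 0 and solve for (D, E, F) by linear least squares. A self-contained sketch of that method (not necessarily the library's exact code):

```python
import numpy as np

def fit_circle_kasa(points):
    """Algebraic (Kasa) least-squares circle fit.
    Returns (center_x, center_y, radius) or None on failure."""
    pts = np.asarray(points, dtype=np.float64)
    if len(pts) < 3:
        return None
    x, y = pts[:, 0], pts[:, 1]
    # Solve x^2 + y^2 + D*x + E*y + F = 0 for (D, E, F) in least squares.
    A = np.column_stack([x, y, np.ones_like(x)])
    b = -(x**2 + y**2)
    (D, E, F), *_ = np.linalg.lstsq(A, b, rcond=None)
    cx, cy = -D / 2.0, -E / 2.0
    r2 = cx**2 + cy**2 - F
    if r2 <= 0:
        return None  # degenerate configuration
    return cx, cy, float(np.sqrt(r2))
```

For points lying exactly on a circle the residual is zero, so the fit recovers the true center and radius; with noisy points it minimizes the algebraic (not geometric) error, which is what makes it fast and robust for 3+ points.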

wrap_angle_degs(deg)

Normalize angle to [-180, 180] degree range.

This function is crucial for orientation tracking to ensure smooth angle transitions and prevent discontinuities at the 0/360 boundary.

Args: deg (float): Input angle in degrees

Returns: float: Normalized angle in range [-180, 180] degrees
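The normalization described above is typically a one-liner; this variant maps exactly 180 to -180 (both ends of the documented range), which keeps the output well-defined at the boundary:

```python
def wrap_angle_degs(deg):
    """Map any angle in degrees into [-180, 180) via modular arithmetic."""
    return ((deg + 180.0) % 360.0) - 180.0
```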

get_device_info()

Get information about available compute devices.

Returns: dict: Device availability information

log_device_info()

Log available compute devices to help with debugging.

apply_image_adjustments(gray, brightness, contrast, gamma, use_gpu=False)

Apply brightness, contrast, and gamma corrections to grayscale image.

Optimized with:
- Cached gamma LUT generation (avoids Python loops)
- CuPy GPU acceleration when available
- Numba JIT for the CPU path
- Vectorized operations

Args:
    gray (np.ndarray): Input grayscale image.
    brightness (float): Brightness adjustment (-255 to +255).
    contrast (float): Contrast multiplier (0.0 to 3.0+).
    gamma (float): Gamma correction factor (0.1 to 3.0+).
    use_gpu (bool): Use GPU acceleration if available.

Returns:
    np.ndarray: Adjusted grayscale image.

Note:
- Brightness: additive adjustment (linear shift)
- Contrast: multiplicative adjustment (scaling)
- Gamma: power-law transformation for non-linear luminance correction
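The three corrections compose into a single pipeline. The exact application order inside the library is not documented here, so this sketch assumes brightness → contrast → gamma, with a cached power-law LUT standing in for the "cached gamma LUT generation" mentioned above:

```python
import numpy as np
from functools import lru_cache

@lru_cache(maxsize=32)
def _gamma_lut(gamma):
    # Power-law LUT: out = 255 * (in/255) ** gamma, built once per gamma value.
    lut = 255.0 * (np.arange(256) / 255.0) ** gamma
    return np.clip(np.rint(lut), 0, 255).astype(np.uint8)

def adjust(gray, brightness=0.0, contrast=1.0, gamma=1.0):
    """Brightness (additive), contrast (multiplicative), gamma (power-law).
    The application order here is an assumption; the library may differ."""
    out = np.clip(gray.astype(np.float32) * contrast + brightness, 0, 255)
    return _gamma_lut(gamma)[out.astype(np.uint8)]
```

Because the gamma table is indexed rather than recomputed per pixel, the power-law step costs a single fancy-indexing pass regardless of image size.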

stabilize_lighting(frame, reference_intensity, current_intensity_history, alpha=0.95, roi_mask=None, median_window=5, lighting_state=None, use_gpu=False)

Stabilize lighting conditions by normalizing frame intensity to a reference level.

Optimized with:
- Numba JIT for percentile and robust-mean calculations
- CuPy GPU acceleration for array operations
- Efficient vectorized operations
- Reduced Python overhead

This function compensates for gradual lighting changes by:
1. Computing the frame's global intensity statistics (within the ROI if provided)
2. Comparing to the reference intensity established during background priming
3. Applying a smooth intensity correction to maintain consistent illumination
4. Using a rolling history with median filtering to suppress high-frequency noise

Args:
    frame (np.ndarray): Input grayscale frame.
    reference_intensity (float): Target intensity level from background priming.
    current_intensity_history (deque): Rolling history of recent frame intensities.
    alpha (float): Smoothing factor for intensity adaptation (0.9-0.99).
    roi_mask (np.ndarray, optional): Binary mask defining the region of interest.
    median_window (int): Window size for median filtering (3-15).
    lighting_state (dict, optional): Dictionary to store smoothing state.
    use_gpu (bool): Use GPU acceleration if available.

Returns:
    tuple: (stabilized_frame, updated_intensity_history, current_mean_intensity)
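The four numbered steps above can be sketched in a simplified form. This is an illustrative re-implementation under assumptions (plain mean instead of robust statistics, no ROI or GPU path), not the library's code:

```python
import numpy as np
from collections import deque

def stabilize(frame, reference, history, alpha=0.95, median_window=5, state=None):
    """Simplified sketch of the stabilization steps: measure intensity,
    median-filter the history, compare to the reference, smooth the gain."""
    history.append(float(frame.mean()))            # 1. frame intensity statistic
    recent = list(history)[-median_window:]
    current = float(np.median(recent))             # 4. median-filtered history
    target_gain = reference / max(current, 1e-6)   # 2. compare to reference
    state = state if state is not None else {}
    gain = alpha * state.get("gain", 1.0) + (1 - alpha) * target_gain  # 3. smooth
    state["gain"] = gain
    out = np.clip(frame.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    return out, history, current
```

With alpha near 1 the gain adapts slowly, so a single flickering frame barely moves the correction while a sustained lighting drift is gradually compensated.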

hydra_suite.utils.gpu_utils

GPU utilities and device detection for the HYDRA Suite.

This module provides centralized GPU availability detection and utilities that can be used throughout the codebase. Supports: - CUDA (NVIDIA GPUs via CuPy) - MPS (Apple Silicon via PyTorch) - Automatic fallback to CPU

Import this module to check GPU availability:

    from hydra_suite.utils.gpu_utils import CUDA_AVAILABLE, MPS_AVAILABLE, GPU_AVAILABLE

get_device_info()

Get information about available compute devices.

Returns: dict: Device availability information

log_device_info()

Log available compute devices to help with debugging.

get_optimal_device(enable_gpu=True, prefer_cuda=True)

Get the optimal compute device based on availability.

Args:
    enable_gpu: Whether to use GPU if available.
    prefer_cuda: Prefer CUDA over MPS when both are available.

Returns:
    tuple: (device_type, device_object), where device_type is 'cuda', 'mps', or 'cpu', and device_object is the GPU device object (None for CPU).
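The selection policy implied by the signature (CUDA first when preferred, then MPS, then CPU) can be sketched as follows. This is a hypothetical re-implementation with availability passed in explicitly, not the library's function:

```python
def pick_device(enable_gpu=True, prefer_cuda=True,
                cuda_available=False, mps_available=False):
    """Hypothetical sketch of the device-selection policy:
    CUDA when preferred and present, else MPS, else CPU fallback."""
    if enable_gpu:
        if cuda_available and (prefer_cuda or not mps_available):
            return "cuda"
        if mps_available:
            return "mps"
        if cuda_available:
            return "cuda"  # CUDA present but not preferred, no MPS
    return "cpu"
```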

get_pose_runtime_options(backend_family='yolo')

Return runtime options for pose inference as list[(label, value)].

Values are normalized ids consumed by runtime_api, e.g.:
- auto
- cpu / mps / cuda / rocm
- onnx_cpu / onnx_cuda
- tensorrt_cuda

hydra_suite.utils.image_processing

Utility functions for image processing in the HYDRA Suite.

Optimized with Numba JIT and GPU acceleration (CuPy/PyTorch) where available.

apply_image_adjustments(gray, brightness, contrast, gamma, use_gpu=False)

Apply brightness, contrast, and gamma corrections to grayscale image.

Optimized with:
- Cached gamma LUT generation (avoids Python loops)
- CuPy GPU acceleration when available
- Numba JIT for the CPU path
- Vectorized operations

Args:
    gray (np.ndarray): Input grayscale image.
    brightness (float): Brightness adjustment (-255 to +255).
    contrast (float): Contrast multiplier (0.0 to 3.0+).
    gamma (float): Gamma correction factor (0.1 to 3.0+).
    use_gpu (bool): Use GPU acceleration if available.

Returns:
    np.ndarray: Adjusted grayscale image.

Note:
- Brightness: additive adjustment (linear shift)
- Contrast: multiplicative adjustment (scaling)
- Gamma: power-law transformation for non-linear luminance correction

stabilize_lighting(frame, reference_intensity, current_intensity_history, alpha=0.95, roi_mask=None, median_window=5, lighting_state=None, use_gpu=False)

Stabilize lighting conditions by normalizing frame intensity to a reference level.

Optimized with:
- Numba JIT for percentile and robust-mean calculations
- CuPy GPU acceleration for array operations
- Efficient vectorized operations
- Reduced Python overhead

This function compensates for gradual lighting changes by:
1. Computing the frame's global intensity statistics (within the ROI if provided)
2. Comparing to the reference intensity established during background priming
3. Applying a smooth intensity correction to maintain consistent illumination
4. Using a rolling history with median filtering to suppress high-frequency noise

Args:
    frame (np.ndarray): Input grayscale frame.
    reference_intensity (float): Target intensity level from background priming.
    current_intensity_history (deque): Rolling history of recent frame intensities.
    alpha (float): Smoothing factor for intensity adaptation (0.9-0.99).
    roi_mask (np.ndarray, optional): Binary mask defining the region of interest.
    median_window (int): Window size for median filtering (3-15).
    lighting_state (dict, optional): Dictionary to store smoothing state.
    use_gpu (bool): Use GPU acceleration if available.

Returns:
    tuple: (stabilized_frame, updated_intensity_history, current_mean_intensity)

compute_median_color_from_frame(frame)

Compute the median color (BGR) from a frame.

Useful for setting background color to match the input video's color profile.

Args: frame: Input frame (BGR, shape: H x W x 3)

Returns: Tuple of (B, G, R) median values
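A per-channel median over all pixels is the natural reading of this description; a minimal sketch (the library's exact implementation may differ):

```python
import numpy as np

def median_color_bgr(frame):
    """Per-channel median over every pixel of an H x W x 3 BGR frame."""
    b, g, r = np.median(frame.reshape(-1, 3), axis=0)
    return int(b), int(g), int(r)
```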

hydra_suite.utils.geometry

Utility functions for geometry operations in the HYDRA Suite.

fit_circle_to_points(points)

Fit a circle to a set of points using least squares optimization.

Uses algebraic circle fitting method for robust estimation from 3+ points.

Args: points (list): List of (x, y) coordinate tuples

Returns: tuple: (center_x, center_y, radius) or None if fitting fails

wrap_angle_degs(deg)

Normalize angle to [-180, 180] degree range.

This function is crucial for orientation tracking to ensure smooth angle transitions and prevent discontinuities at the 0/360 boundary.

Args: deg (float): Input angle in degrees

Returns: float: Normalized angle in range [-180, 180] degrees

estimate_detection_crop_quality(shape, reference_body_size)

Estimate crop quality from detection geometry.

Returns a float in [0, 1] measuring how well the detection's minor axis matches the reference body size.

apply_foreign_obb_mask(crop, x_offset, y_offset, other_corners_list, background_color=128)

Fill pixels in crop that belong to other animals' OBB regions.

Shifts each foreign OBB from frame coordinates into crop-local coordinates and fills the polygon with background_color using cv2.fillPoly.

Args:
    crop: BGR (or grayscale) image crop extracted from the full frame.
    x_offset: Horizontal offset of the crop's top-left corner in frame coords.
    y_offset: Vertical offset of the crop's top-left corner in frame coords.
    other_corners_list: Sequence of (4, 2) float32 arrays of OBB corners in frame coordinates for every other detected animal.
    background_color: Fill value, either a scalar (0-255) applied to all channels or a (B, G, R) tuple for colour crops.

Returns: Modified copy of crop with foreign-animal regions filled.

filter_keypoints_by_foreign_obbs(keypoints, all_corners_list, target_idx)

Zero confidence of keypoints that fall inside another animal's OBB.

Operates on global frame coordinates (after crop back-projection).

Args:
    keypoints: [K, 3] float32 array of (x, y, conf) in frame coordinates.
    all_corners_list: List of (4, 2) float32 OBB corner arrays for every detection in the frame (including the target).
    target_idx: Index into all_corners_list identifying the current animal; its own OBB is skipped.

Returns: Modified copy of keypoints with contaminated entries having conf=0. X/Y coordinates are preserved.
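The core operation is a point-in-OBB test against every foreign detection. The library likely uses OpenCV for this, but the same logic can be sketched in pure NumPy with a cross-product sign test on the convex quad (illustrative code, not the library's):

```python
import numpy as np

def point_in_convex_quad(pt, corners):
    """True if pt lies inside the convex quad given by corners (4, 2),
    ordered around the perimeter. Same-sign cross products => inside."""
    signs = []
    for i in range(4):
        a, b = corners[i], corners[(i + 1) % 4]
        edge = b - a
        to_pt = np.asarray(pt, dtype=np.float64) - a
        signs.append(np.sign(edge[0] * to_pt[1] - edge[1] * to_pt[0]))
    nonzero = [s for s in signs if s != 0]
    return all(s == nonzero[0] for s in nonzero) if nonzero else True

def filter_keypoints(keypoints, all_corners, target_idx):
    """Zero the confidence of keypoints inside any foreign OBB;
    x/y coordinates are preserved and the input array is not mutated."""
    out = keypoints.copy()
    for j, corners in enumerate(all_corners):
        if j == target_idx:
            continue  # skip the animal's own OBB
        for k in range(len(out)):
            if point_in_convex_quad(out[k, :2], corners):
                out[k, 2] = 0.0
    return out
```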

hydra_suite.utils.batch_optimizer

Batch size optimizer for YOLO detection based on available device memory.

BatchOptimizer

Optimize batch size for YOLO inference based on device capabilities.

__init__(advanced_config=None)

Initialize batch optimizer.

Args: advanced_config: Dictionary with memory allocation settings

detect_device()

Detect available compute device and its memory.

Returns: tuple: (device_type, device_name, available_memory_mb)

estimate_batch_size(frame_width, frame_height, model_name='yolo26s-obb.pt')

Estimate optimal batch size for YOLO inference.

Args:
    frame_width: Video frame width.
    frame_height: Video frame height.
    model_name: YOLO model name (for memory estimation).

Returns: int: Recommended batch size (1 if batching not recommended)
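The estimate presumably boils down to budgeting a fraction of device memory and dividing by a per-frame cost derived from the resolution. The sketch below is a hypothetical version of that arithmetic; `mem_fraction` and `bytes_per_pixel` are illustrative assumptions, not the library's constants:

```python
def estimate_batch(frame_w, frame_h, available_mb,
                   mem_fraction=0.5, bytes_per_pixel=12.0):
    """Hypothetical memory-based batch sizing: spend a fraction of
    available memory, divided by an estimated per-frame footprint."""
    per_frame_mb = frame_w * frame_h * bytes_per_pixel / (1024 * 1024)
    budget_mb = available_mb * mem_fraction
    return max(1, int(budget_mb / per_frame_mb))  # never below 1
```

The `max(1, ...)` floor matches the documented behaviour of returning 1 when batching is not recommended.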

get_device_info()

Get human-readable device information.

Returns: dict: Device information for display

hydra_suite.utils.frame_prefetcher

Frame prefetching utility for asynchronous video frame loading.

This module provides a thread-based frame prefetcher that reads video frames in the background while the main tracking thread processes the current frame, reducing I/O wait times and improving overall throughput.

FramePrefetcher

Asynchronous frame prefetcher for video processing.

Uses a background thread to read frames ahead of time, reducing I/O blocking in the main processing loop. Maintains a small buffer of pre-read frames.

Example:

    cap = cv2.VideoCapture("video.mp4")
    prefetcher = FramePrefetcher(cap, buffer_size=2)
    prefetcher.start()

    while True:
        ret, frame = prefetcher.read()
        if not ret:
            break
        # Process frame...

    prefetcher.stop()

__init__(video_capture, buffer_size=2, read_timeout=30.0)

Initialize frame prefetcher.

Args:
    video_capture: OpenCV VideoCapture object.
    buffer_size (int): Number of frames to buffer (default: 2). Higher values use more memory but tolerate I/O stalls better.
    read_timeout (float): Seconds to wait for a frame before declaring a stall (default: 30.0). Increase when the decode backend shares resources with GPU inference.

start()

Start the background prefetching thread.

read()

Read the next frame (from prefetch buffer).

Returns:
    tuple: (ret, frame), where ret is a bool indicating success and frame is a numpy array (None on failure).

stop()

Stop the prefetching thread and clean up.

__enter__()

Context manager support.

__exit__(_exc_type, _exc_val, _exc_tb)

Context manager support.

SparseFramePrefetcher

Prefetcher for a pre-determined list of sparse frame indices.

Reads frames in a background thread using seek-then-read, skipping the seek when frames are contiguous. The main thread calls read() to get (frame_idx, ret, frame) tuples in the same order as the supplied frame_indices list.

read()

Return (frame_idx, ret, frame) or None at end.

SequentialScanPrefetcher

Prefetcher that does a single sequential forward pass through a frame range.

Instead of seeking to each needed frame individually (expensive with H.264/H.265 codecs), this reads every frame from min(frame_indices) to max(frame_indices) sequentially and only queues frames that appear in the frame_indices set. Frames not in the set are decoded but immediately discarded.

This is dramatically faster than SparseFramePrefetcher when the needed frames are spread across a large portion of the video range, because sequential cap.read() avoids the per-frame seek cost (~5–50 ms each on compressed codecs).
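The decode-everything-keep-some idea described above reduces to a simple loop. A minimal sketch with `read_frame` standing in for the sequential `cap.read()` call (the real class runs this in a background thread and queues results):

```python
def sequential_scan(read_frame, frame_indices):
    """Decode every frame from min to max of frame_indices sequentially,
    keeping only the requested ones; the rest are decoded and discarded."""
    needed = set(frame_indices)
    kept = {}
    for idx in range(min(needed), max(needed) + 1):
        frame = read_frame(idx)   # sequential decode, no seeking
        if idx in needed:
            kept[idx] = frame     # would be queued for the consumer
    return kept
```

The trade-off is clear from the loop: every frame in the span pays a decode cost, but none pays a seek cost, which wins whenever the requested indices cover a large fraction of the range.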

read()

Return (frame_idx, ret, frame) or None at end.

FramePrefetcherBackward

Bases: FramePrefetcher

Frame prefetcher for backward (reverse) video iteration.

Extends FramePrefetcher to support reading frames in reverse order by seeking backward through the video.

__init__(video_capture, buffer_size=2, total_frames=None)

Initialize backward frame prefetcher.

Args:
    video_capture: OpenCV VideoCapture object.
    buffer_size (int): Number of frames to buffer.
    total_frames (int): Total frames in the video (required for backward seeking).