API: Utils
hydra_suite.utils
Utility modules for the HYDRA Suite.
This package contains various utility functions and classes for image processing, CSV writing, ROI handling, video processing, GPU acceleration, and other common operations.
FramePrefetcher
Asynchronous frame prefetcher for video processing.
Uses a background thread to read frames ahead of time, reducing I/O blocking in the main processing loop. Maintains a small buffer of pre-read frames.
Example:
    cap = cv2.VideoCapture("video.mp4")
    prefetcher = FramePrefetcher(cap, buffer_size=2)
    prefetcher.start()
    while True:
        ret, frame = prefetcher.read()
        if not ret:
            break
        # Process frame...
    prefetcher.stop()
__init__(video_capture, buffer_size=2, read_timeout=30.0)
Initialize frame prefetcher.
Args:
    video_capture: OpenCV VideoCapture object
    buffer_size (int): Number of frames to buffer (default: 2). Higher = more memory, but better I/O tolerance.
    read_timeout (float): Seconds to wait for a frame before declaring a stall (default: 30). Increase when the decode backend shares resources with GPU inference.
start()
Start the background prefetching thread.
read()
Read the next frame (from prefetch buffer).
Returns:
    tuple: (ret, frame), where ret is a bool indicating success and frame is a numpy array or None
stop()
Stop the prefetching thread and clean up.
__enter__()
Context manager support.
__exit__(_exc_type, _exc_val, _exc_tb)
Context manager support.
FramePrefetcherBackward
Bases: FramePrefetcher
Frame prefetcher for backward (reverse) video iteration.
Extends FramePrefetcher to support reading frames in reverse order by seeking backward through the video.
__init__(video_capture, buffer_size=2, total_frames=None)
Initialize backward frame prefetcher.
Args:
    video_capture: OpenCV VideoCapture object
    buffer_size (int): Number of frames to buffer
    total_frames (int): Total frames in video (required for backward seeking)
fit_circle_to_points(points)
Fit a circle to a set of points using least squares optimization.
Uses algebraic circle fitting method for robust estimation from 3+ points.
Args: points (list): List of (x, y) coordinate tuples
Returns: tuple: (center_x, center_y, radius) or None if fitting fails
wrap_angle_degs(deg)
Normalize angle to [-180, 180] degree range.
This function is crucial for orientation tracking to ensure smooth angle transitions and prevent discontinuities at the 0/360 boundary.
Args: deg (float): Input angle in degrees
Returns: float: Normalized angle in range [-180, 180] degrees
get_device_info()
Get information about available compute devices.
Returns: dict: Device availability information
log_device_info()
Log available compute devices to help with debugging.
apply_image_adjustments(gray, brightness, contrast, gamma, use_gpu=False)
Apply brightness, contrast, and gamma corrections to grayscale image.
Optimized with:
- Cached gamma LUT generation (avoids Python loops)
- CuPy GPU acceleration when available
- Numba JIT for CPU path
- Vectorized operations
Args:
    gray (np.ndarray): Input grayscale image
    brightness (float): Brightness adjustment (-255 to +255)
    contrast (float): Contrast multiplier (0.0 to 3.0+)
    gamma (float): Gamma correction factor (0.1 to 3.0+)
    use_gpu (bool): Use GPU acceleration if available
Returns: np.ndarray: Adjusted grayscale image
Note:
- Brightness: additive adjustment (linear shift)
- Contrast: multiplicative adjustment (scaling)
- Gamma: power-law transformation for non-linear luminance correction
stabilize_lighting(frame, reference_intensity, current_intensity_history, alpha=0.95, roi_mask=None, median_window=5, lighting_state=None, use_gpu=False)
Stabilize lighting conditions by normalizing frame intensity to a reference level.
Optimized with:
- Numba JIT for percentile and robust mean calculations
- CuPy GPU acceleration for array operations
- Efficient vectorized operations
- Reduced Python overhead
This function compensates for gradual lighting changes by:
1. Computing the frame's global intensity statistics (within the ROI if provided)
2. Comparing to the reference intensity established during background priming
3. Applying a smooth intensity correction to maintain consistent illumination
4. Using a rolling history with median filtering to suppress high-frequency noise
Args:
    frame (np.ndarray): Input grayscale frame
    reference_intensity (float): Target intensity level from background priming
    current_intensity_history (deque): Rolling history of recent frame intensities
    alpha (float): Smoothing factor for intensity adaptation (0.9-0.99)
    roi_mask (np.ndarray, optional): Binary mask defining region of interest
    median_window (int): Window size for median filtering (3-15)
    lighting_state (dict, optional): Dictionary to store smoothing state
    use_gpu (bool): Use GPU acceleration if available
Returns: tuple: (stabilized_frame, updated_intensity_history, current_mean_intensity)
hydra_suite.utils.gpu_utils
GPU utilities and device detection for the HYDRA Suite.
This module provides centralized GPU availability detection and utilities that can be used throughout the codebase. Supports:
- CUDA (NVIDIA GPUs via CuPy)
- MPS (Apple Silicon via PyTorch)
- Automatic fallback to CPU
Import this module to check GPU availability:
    from hydra_suite.utils.gpu_utils import CUDA_AVAILABLE, MPS_AVAILABLE, GPU_AVAILABLE
get_device_info()
Get information about available compute devices.
Returns: dict: Device availability information
log_device_info()
Log available compute devices to help with debugging.
get_optimal_device(enable_gpu=True, prefer_cuda=True)
Get the optimal compute device based on availability.
Args:
    enable_gpu: Whether to use GPU if available
    prefer_cuda: Prefer CUDA over MPS if both are available
Returns:
    tuple: (device_type, device_object)
        device_type: 'cuda', 'mps', or 'cpu'
        device_object: GPU device object, or None for CPU
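The fallback order is easy to misread, so the decision tree can be restated as a pure function. This is a sketch, not the module's code: the cuda_available/mps_available parameters are illustrative stand-ins for the module's detection flags, and the device strings are placeholders for real device objects.

```python
def pick_device(enable_gpu=True, prefer_cuda=True,
                cuda_available=False, mps_available=False):
    # Hypothetical re-statement of the selection logic; availability
    # flags are passed in rather than probed from CuPy/PyTorch.
    if not enable_gpu:
        return ("cpu", None)
    if cuda_available and (prefer_cuda or not mps_available):
        return ("cuda", "cuda:0")   # placeholder for the real device object
    if mps_available:
        return ("mps", "mps")
    return ("cpu", None)            # automatic fallback

print(pick_device(enable_gpu=True, prefer_cuda=False,
                  cuda_available=True, mps_available=True))  # ('mps', 'mps')
```

With prefer_cuda=False and both backends present, MPS wins; with no GPU available or enable_gpu=False, the function always degrades to ('cpu', None).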
get_pose_runtime_options(backend_family='yolo')
Return runtime options for pose inference as list[(label, value)].
Values are normalized ids consumed by runtime_api, e.g.:
- auto
- cpu / mps / cuda / rocm
- onnx_cpu / onnx_cuda
- tensorrt_cuda
hydra_suite.utils.image_processing
Utility functions for image processing in the HYDRA Suite.
Optimized with Numba JIT and GPU acceleration (CuPy/PyTorch) where available.
apply_image_adjustments(gray, brightness, contrast, gamma, use_gpu=False)
Apply brightness, contrast, and gamma corrections to grayscale image.
Optimized with:
- Cached gamma LUT generation (avoids Python loops)
- CuPy GPU acceleration when available
- Numba JIT for CPU path
- Vectorized operations
Args:
    gray (np.ndarray): Input grayscale image
    brightness (float): Brightness adjustment (-255 to +255)
    contrast (float): Contrast multiplier (0.0 to 3.0+)
    gamma (float): Gamma correction factor (0.1 to 3.0+)
    use_gpu (bool): Use GPU acceleration if available
Returns: np.ndarray: Adjusted grayscale image
Note:
- Brightness: additive adjustment (linear shift)
- Contrast: multiplicative adjustment (scaling)
- Gamma: power-law transformation for non-linear luminance correction
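Because the input is 8-bit, all three corrections collapse into a single 256-entry lookup table, which is why the cached-LUT optimization works. A pure-Python sketch follows; the operation order (contrast and brightness first, then gamma) and the 1/gamma exponent are assumptions about the pipeline, not the package's exact code:

```python
def build_adjust_lut(brightness, contrast, gamma):
    # One entry per possible 8-bit value; real code would cache this
    # per (brightness, contrast, gamma) triple and apply it via cv2.LUT.
    lut = []
    for v in range(256):
        out = v * contrast + brightness                   # linear part
        out = max(0.0, min(255.0, out))                   # clip to 8-bit range
        out = 255.0 * (out / 255.0) ** (1.0 / gamma)      # power-law gamma
        lut.append(int(round(max(0.0, min(255.0, out)))))
    return lut

lut = build_adjust_lut(brightness=10, contrast=1.2, gamma=1.0)
print([lut[p] for p in (0, 64, 128, 255)])  # [10, 87, 164, 255]
```

Applying the LUT per pixel then costs one table lookup instead of two multiplies and a power, regardless of image size.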
stabilize_lighting(frame, reference_intensity, current_intensity_history, alpha=0.95, roi_mask=None, median_window=5, lighting_state=None, use_gpu=False)
Stabilize lighting conditions by normalizing frame intensity to a reference level.
Optimized with:
- Numba JIT for percentile and robust mean calculations
- CuPy GPU acceleration for array operations
- Efficient vectorized operations
- Reduced Python overhead
This function compensates for gradual lighting changes by:
1. Computing the frame's global intensity statistics (within the ROI if provided)
2. Comparing to the reference intensity established during background priming
3. Applying a smooth intensity correction to maintain consistent illumination
4. Using a rolling history with median filtering to suppress high-frequency noise
Args:
    frame (np.ndarray): Input grayscale frame
    reference_intensity (float): Target intensity level from background priming
    current_intensity_history (deque): Rolling history of recent frame intensities
    alpha (float): Smoothing factor for intensity adaptation (0.9-0.99)
    roi_mask (np.ndarray, optional): Binary mask defining region of interest
    median_window (int): Window size for median filtering (3-15)
    lighting_state (dict, optional): Dictionary to store smoothing state
    use_gpu (bool): Use GPU acceleration if available
Returns: tuple: (stabilized_frame, updated_intensity_history, current_mean_intensity)
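The history-and-smoothing part of the recipe can be isolated from the image arithmetic. The sketch below computes only the scalar correction gain that would multiply the frame; the exact update rule and state keys are assumptions, with lighting_state mirrored by a plain dict:

```python
from collections import deque

def stabilize_gain(mean_intensity, reference_intensity, history,
                   alpha=0.95, median_window=5, state=None):
    # Median-filter the rolling history to reject one-frame flicker,
    # then EMA-smooth the multiplicative gain toward the reference.
    state = state if state is not None else {}
    history.append(mean_intensity)
    recent = sorted(list(history)[-median_window:])
    median = recent[len(recent) // 2]
    raw_gain = reference_intensity / max(median, 1e-6)
    gain = alpha * state.get("gain", 1.0) + (1.0 - alpha) * raw_gain
    state["gain"] = gain
    return gain

hist, state = deque(maxlen=30), {}
for m in (100, 100, 60, 100, 100, 100):   # one flickery frame at 60
    gain = stabilize_gain(m, reference_intensity=100, history=hist, state=state)
print(round(gain, 3))  # 1.0 -- the outlier never reaches the gain
```

The median window absorbs single-frame flicker entirely, while a sustained intensity drift changes the median and lets the gain adapt at the rate set by alpha.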
compute_median_color_from_frame(frame)
Compute the median color (BGR) from a frame.
Useful for setting background color to match the input video's color profile.
Args: frame: Input frame (BGR, shape: H x W x 3)
Returns: Tuple of (B, G, R) median values
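A per-channel median is more robust than a mean against bright outliers such as glare or markers. A pure-Python stand-in (nested lists instead of a numpy frame, upper-middle element for even-length channels):

```python
def median_color(frame):
    # frame: nested lists, H x W x 3 in BGR order.
    channels = ([], [], [])
    for row in frame:
        for b, g, r in row:
            channels[0].append(b)
            channels[1].append(g)
            channels[2].append(r)
    def med(vals):
        s = sorted(vals)
        return s[len(s) // 2]   # upper middle for even-length lists
    return tuple(med(c) for c in channels)

frame = [[(10, 20, 30), (12, 22, 32)],
         [(11, 21, 31), (200, 200, 200)]]   # one glare pixel
print(median_color(frame))  # (12, 22, 32) -- glare ignored
```

A mean over the same pixels would be pulled far toward the glare pixel, which is exactly what the median avoids when matching a background color to the video.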
hydra_suite.utils.geometry
Utility functions for geometry operations in the HYDRA Suite.
fit_circle_to_points(points)
Fit a circle to a set of points using least squares optimization.
Uses algebraic circle fitting method for robust estimation from 3+ points.
Args: points (list): List of (x, y) coordinate tuples
Returns: tuple: (center_x, center_y, radius) or None if fitting fails
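The algebraic ("Kasa") fit linearizes the circle equation x^2 + y^2 + Dx + Ey + F = 0 and solves the 3x3 normal equations, so no iterative optimizer is needed. A self-contained sketch of that method (not the package's exact code):

```python
import math

def fit_circle_to_points(points):
    """Algebraic (Kasa) least-squares circle fit; returns (cx, cy, r) or None."""
    if len(points) < 3:
        return None
    # Normal equations for  x^2 + y^2 + D*x + E*y + F = 0
    ata = [[0.0] * 3 for _ in range(3)]
    atb = [0.0] * 3
    for x, y in points:
        row = (x, y, 1.0)
        rhs = -(x * x + y * y)
        for i in range(3):
            atb[i] += row[i] * rhs
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    # Gauss-Jordan elimination with partial pivoting on the 3x3 system
    m = [ata[i] + [atb[i]] for i in range(3)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            return None  # degenerate input (e.g. collinear points)
        m[col], m[piv] = m[piv], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, 4):
                    m[r][c] -= f * m[col][c]
    d, e, f = (m[i][3] / m[i][i] for i in range(3))
    cx, cy = -d / 2.0, -e / 2.0
    r2 = cx * cx + cy * cy - f
    return (cx, cy, math.sqrt(r2)) if r2 > 0 else None

pts = [(7, 3), (2, 8), (-3, 3), (2, -2)]   # points on a circle at (2, 3), r = 5
print(tuple(round(v, 6) for v in fit_circle_to_points(pts)))  # (2.0, 3.0, 5.0)
```

Collinear points make the system singular, which is why the pivot check returns None rather than producing a meaningless "circle".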
wrap_angle_degs(deg)
Normalize angle to [-180, 180] degree range.
This function is crucial for orientation tracking to ensure smooth angle transitions and prevent discontinuities at the 0/360 boundary.
Args: deg (float): Input angle in degrees
Returns: float: Normalized angle in range [-180, 180] degrees
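The wrap reduces to one line of modular arithmetic. Note this sketch yields the half-open range [-180, 180) (so exactly +180 maps to -180), which may differ from the package at that single boundary value:

```python
def wrap_angle_degs(deg):
    # Shift by 180, wrap into [0, 360), shift back.
    return (deg + 180.0) % 360.0 - 180.0

print(wrap_angle_degs(270.0))   # -90.0
print(wrap_angle_degs(-190.0))  # 170.0
print(wrap_angle_degs(720.0))   # 0.0
```

Python's % always returns a non-negative result for a positive modulus, so the same expression handles arbitrarily large positive or negative inputs.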
estimate_detection_crop_quality(shape, reference_body_size)
Estimate crop quality from detection geometry.
Returns a float in [0, 1] measuring how well the detection's minor axis matches the reference body size.
apply_foreign_obb_mask(crop, x_offset, y_offset, other_corners_list, background_color=128)
Fill pixels in crop that belong to other animals' OBB regions.
Shifts each foreign OBB from frame coordinates into crop-local coordinates and fills the polygon with background_color using cv2.fillPoly.
Args:
    crop: BGR (or grayscale) image crop extracted from the full frame.
    x_offset: Horizontal offset of the crop's top-left corner in frame coords.
    y_offset: Vertical offset of the crop's top-left corner in frame coords.
    other_corners_list: Sequence of (4, 2) float32 arrays of OBB corners in frame coordinates for every other detected animal.
    background_color: Fill value, either a scalar (0-255) applied to all channels or a (B, G, R) tuple for colour crops.
Returns: Modified copy of crop with foreign-animal regions filled.
filter_keypoints_by_foreign_obbs(keypoints, all_corners_list, target_idx)
Zero confidence of keypoints that fall inside another animal's OBB.
Operates on global frame coordinates (after crop back-projection).
Args:
    keypoints: [K, 3] float32 array of (x, y, conf) in frame coordinates.
    all_corners_list: List of (4, 2) float32 OBB corner arrays for every detection in the frame (including the target).
    target_idx: Index into all_corners_list identifying the current animal; its own OBB is skipped.
Returns: Modified copy of keypoints with contaminated entries having conf=0. X/Y coordinates are preserved.
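The geometric core of this filter is a point-in-convex-quad test. A pure-Python stand-in (list-of-tuples keypoints instead of a float32 array; the cross-product sign test assumes the OBB corners are given in order around the quad):

```python
def point_in_convex_quad(px, py, corners):
    """True if (px, py) lies inside a convex quad given as 4 ordered corners."""
    sign = 0
    for i in range(4):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % 4]
        # Cross product of edge vector and corner-to-point vector:
        # a consistent sign across all four edges means "inside".
        cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        if cross != 0:
            s = 1 if cross > 0 else -1
            if sign == 0:
                sign = s
            elif s != sign:
                return False
    return True

def zero_contaminated(keypoints, all_corners, target_idx):
    """Set conf to 0 for keypoints inside any *other* animal's OBB."""
    out = []
    for x, y, conf in keypoints:
        hit = any(point_in_convex_quad(x, y, c)
                  for i, c in enumerate(all_corners) if i != target_idx)
        out.append((x, y, 0.0 if hit else conf))   # x/y preserved
    return out

own = [(0, 0), (10, 0), (10, 10), (0, 10)]
other = [(20, 0), (30, 0), (30, 10), (20, 10)]
kps = [(5, 5, 0.9), (25, 5, 0.8)]
print(zero_contaminated(kps, [own, other], target_idx=0))
# [(5, 5, 0.9), (25, 5, 0.0)]
```

Keeping x/y intact while zeroing conf lets downstream consumers decide whether to interpolate across the contaminated frames.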
hydra_suite.utils.batch_optimizer
Batch size optimizer for YOLO detection based on available device memory.
BatchOptimizer
Optimize batch size for YOLO inference based on device capabilities.
__init__(advanced_config=None)
Initialize batch optimizer.
Args: advanced_config: Dictionary with memory allocation settings
detect_device()
Detect available compute device and its memory.
Returns: tuple: (device_type, device_name, available_memory_mb)
estimate_batch_size(frame_width, frame_height, model_name='yolo26s-obb.pt')
Estimate optimal batch size for YOLO inference.
Args:
    frame_width: Video frame width
    frame_height: Video frame height
    model_name: YOLO model name (for memory estimation)
Returns: int: Recommended batch size (1 if batching not recommended)
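The estimate boils down to dividing post-overhead memory by an assumed per-frame working set. The sketch below illustrates that shape of calculation only; the constants (12 bytes per pixel, 1500 MB model overhead, cap of 32) are invented for illustration and are not the optimizer's real numbers:

```python
def estimate_batch_size(frame_w, frame_h, available_mb,
                        bytes_per_pixel=12.0, model_overhead_mb=1500,
                        max_batch=32):
    # Memory left after the model is loaded, divided by an assumed
    # per-frame working-set cost, clamped to a sane range.
    per_frame_mb = frame_w * frame_h * bytes_per_pixel / (1024 * 1024)
    free_mb = available_mb - model_overhead_mb
    if free_mb <= per_frame_mb:
        return 1                      # batching not recommended
    return max(1, min(max_batch, int(free_mb / per_frame_mb)))

print(estimate_batch_size(1920, 1080, available_mb=8192))  # 32
```

Returning 1 when memory is tight matches the documented contract that 1 means "batching not recommended".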
get_device_info()
Get human-readable device information.
Returns: dict: Device information for display
hydra_suite.utils.frame_prefetcher
Frame prefetching utility for asynchronous video frame loading.
This module provides a thread-based frame prefetcher that reads video frames in the background while the main tracking thread processes the current frame, reducing I/O wait times and improving overall throughput.
FramePrefetcher
Asynchronous frame prefetcher for video processing.
Uses a background thread to read frames ahead of time, reducing I/O blocking in the main processing loop. Maintains a small buffer of pre-read frames.
Example:
    cap = cv2.VideoCapture("video.mp4")
    prefetcher = FramePrefetcher(cap, buffer_size=2)
    prefetcher.start()
    while True:
        ret, frame = prefetcher.read()
        if not ret:
            break
        # Process frame...
    prefetcher.stop()
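The read/sentinel contract in the example can be reproduced with the standard library alone. The class below is a minimal sketch of the background-thread pattern in which any iterator stands in for cv2.VideoCapture; it is not the package's implementation:

```python
import queue
import threading

class Prefetcher:
    """Minimal sketch: a bounded queue decouples decoding from processing."""

    def __init__(self, source, buffer_size=2):
        self._source = source                    # any iterator of frames
        self._q = queue.Queue(maxsize=buffer_size)
        self._t = threading.Thread(target=self._worker, daemon=True)

    def start(self):
        self._t.start()

    def _worker(self):
        # Blocks on put() when the buffer is full, so memory stays bounded.
        for frame in self._source:
            self._q.put((True, frame))
        self._q.put((False, None))               # end-of-stream sentinel

    def read(self):
        return self._q.get()

    def stop(self):
        self._t.join(timeout=5)

pf = Prefetcher(iter(range(3)), buffer_size=2)
pf.start()
frames = []
while True:
    ret, f = pf.read()
    if not ret:
        break
    frames.append(f)
pf.stop()
print(frames)  # [0, 1, 2]
```

The bounded queue is the key design point: the reader can run at most buffer_size frames ahead, so a slow consumer throttles the decoder instead of exhausting memory.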
__init__(video_capture, buffer_size=2, read_timeout=30.0)
Initialize frame prefetcher.
Args:
    video_capture: OpenCV VideoCapture object
    buffer_size (int): Number of frames to buffer (default: 2). Higher = more memory, but better I/O tolerance.
    read_timeout (float): Seconds to wait for a frame before declaring a stall (default: 30). Increase when the decode backend shares resources with GPU inference.
start()
Start the background prefetching thread.
read()
Read the next frame (from prefetch buffer).
Returns:
    tuple: (ret, frame), where ret is a bool indicating success and frame is a numpy array or None
stop()
Stop the prefetching thread and clean up.
__enter__()
Context manager support.
__exit__(_exc_type, _exc_val, _exc_tb)
Context manager support.
SparseFramePrefetcher
Prefetcher for a pre-determined list of sparse frame indices.
Reads frames in a background thread using seek-then-read, skipping the seek when frames are contiguous. The main thread calls read() to get (frame_idx, ret, frame) tuples in the same order as the supplied frame_indices list.
read()
Return (frame_idx, ret, frame) or None at end.
SequentialScanPrefetcher
Prefetcher that does a single sequential forward pass through a frame range.
Instead of seeking to each needed frame individually (expensive with H.264/H.265 codecs), this reads every frame from min(frame_indices) to max(frame_indices) sequentially and only queues frames that appear in the frame_indices set. Frames not in the set are decoded but immediately discarded.
This is dramatically faster than SparseFramePrefetcher when the needed frames are spread across a large portion of the video range, because sequential cap.read() avoids the per-frame seek cost (~5–50 ms each on compressed codecs).
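The decode-everything-keep-some idea can be sketched independently of OpenCV; read_frame below is a hypothetical stand-in for sequential cap.read():

```python
def sequential_scan(read_frame, frame_indices):
    # Decode every frame from min to max once; yield only the wanted ones.
    wanted = set(frame_indices)
    for idx in range(min(wanted), max(wanted) + 1):
        frame = read_frame(idx)        # sequential decode, no seeking
        if idx in wanted:
            yield idx, frame           # frames outside `wanted` are discarded

decoded = []
def fake_read(idx):
    decoded.append(idx)                # record every decode for the demo
    return "frame%d" % idx

kept = list(sequential_scan(fake_read, [3, 7, 5]))
print([i for i, _ in kept])  # [3, 5, 7]
print(len(decoded))          # 5 -- frames 3..7 decoded, only 3 kept
```

Wanted frames always come out in index order regardless of the order of frame_indices, matching the single forward pass described above; the trade-off is the wasted decodes between wanted frames, which sequential reading makes cheap.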
read()
Return (frame_idx, ret, frame) or None at end.
FramePrefetcherBackward
Bases: FramePrefetcher
Frame prefetcher for backward (reverse) video iteration.
Extends FramePrefetcher to support reading frames in reverse order by seeking backward through the video.
__init__(video_capture, buffer_size=2, total_frames=None)
Initialize backward frame prefetcher.
Args:
    video_capture: OpenCV VideoCapture object
    buffer_size (int): Number of frames to buffer
    total_frames (int): Total frames in video (required for backward seeking)