API: Core¶
hydra_suite.core
¶
Core tracking algorithms and components for the HYDRA Suite.
This package contains the tracking worker and supporting components for multi-object tracking, including Kalman filters, background models, object detection, and track assignment.
TrackAssigner
¶
Handles assignment of detections to tracks with optimizations.
compute_cost_matrix(N, measurements, predictions, shapes, kf_manager, last_shape_info, meas_ori_directed=None, association_data=None)
¶
Computes cost matrix. Compatible with Vectorized Kalman Filter.
compute_assignment_confidence(cost, matched_pairs)
¶
Compute confidence scores for assignments.
assign_tracks(cost, N, M, meas, track_states, tracking_continuity, kf_manager, spatial_candidates=None, association_data=None, committed_slot_identities=None, missed_frames=None)
¶
Drop-in replacement for track assignment logic. Compatible with kf_manager.X state access.
Returns (rows, cols, free_dets, identity_rejoin_pairs) where
identity_rejoin_pairs is a list of (slot_index, det_index)
tuples from the identity-only rejoin path for committed-lost slots.
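The core matching step can be sketched with SciPy's Hungarian solver. This is a minimal stand-in, not the library's logic: the `gate` cutoff is an illustrative assumption, and the identity-rejoin path is omitted, so only the simplified `(rows, cols, free_dets)` part of the return is shown.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign(cost, gate=50.0):
    """Toy assignment sketch: Hungarian matching plus a gating threshold.

    `gate` is a hypothetical cost cutoff, not a HYDRA parameter. Returns
    (rows, cols, free_dets) like a simplified assign_tracks.
    """
    rows, cols = linear_sum_assignment(cost)
    keep = cost[rows, cols] < gate          # reject implausible matches
    rows, cols = rows[keep], cols[keep]
    free_dets = [j for j in range(cost.shape[1]) if j not in set(cols)]
    return list(rows), list(cols), free_dets

cost = np.array([[1.0, 90.0],
                 [80.0, 2.0],
                 [70.0, 60.0]])   # 3 track slots x 2 detections
rows, cols, free = assign(cost)
```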
BackgroundModel
¶
Manages background models for foreground detection in tracking. Supports GPU acceleration via:
- CUDA (NVIDIA GPUs) using CuPy
- MPS (Apple Silicon) using PyTorch
- CPU fallback with Numba JIT optimization
prime_background(cap)
¶
Initialize background model using "lightest pixel" method with lighting reference.
This is an exact port of the original prime_lightest_background method.
update_and_get_background(gray, roi_mask, tracking_stabilized)
¶
Update the background model and return the active subtraction background.
generate_foreground_mask(gray, background)
¶
Generates the foreground mask from the gray frame and background.
Uses GPU acceleration if available for significant speedup on large frames. Supports both CUDA (NVIDIA) and MPS (Apple Silicon) GPUs. Falls back to CPU if GPU operations fail (e.g., CuPy compilation errors).
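The CPU core of this operation is a thresholded absolute difference between frame and background. A minimal sketch, with an illustrative threshold and the GPU paths omitted:

```python
import numpy as np

def foreground_mask(gray, background, thresh=25):
    """Minimal CPU sketch of background subtraction (thresh is illustrative).

    The real method adds GPU paths (CuPy/MPS) and Numba acceleration; this
    shows only the core absolute-difference threshold all paths share.
    """
    diff = np.abs(gray.astype(np.int16) - background.astype(np.int16))
    return (diff > thresh).astype(np.uint8) * 255

bg = np.full((4, 4), 200, np.uint8)
frame = bg.copy()
frame[1:3, 1:3] = 120                     # dark "animal" on light background
mask = foreground_mask(frame, bg)
```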
ObjectDetector
¶
Detects objects in foreground masks and extracts measurements.
apply_conservative_split(fg_mask, gray=None, background=None)
¶
Split merged blobs by locally raising the threshold.
If gray and background are provided, the split uses a tighter threshold on the raw difference image inside suspicious regions, which preserves animal shape better than erosion. Falls back to simple erosion when the raw images are unavailable (e.g., cached detection replays).
detect_objects(fg_mask, frame_count)
¶
Detects and measures objects from the final foreground mask.
Returns:
- meas: List of measurements [cx, cy, angle]
- sizes: List of detection areas
- shapes: List of (area, aspect_ratio) tuples
- yolo_results: None (for compatibility with YOLO detector)
- confidences: List of detection confidence scores (0-1)
KalmanFilterManager
¶
Manages a batch of biologically-constrained Kalman Filters.
initialize_filter(track_idx, initial_state)
¶
Reset one track slot with a new initial state estimate.
predict()
¶
Predict next measurement-space states for all active track slots.
get_predictions()
¶
Compatibility wrapper returning predict() output.
correct(track_idx, measurement, theta_r_scale=1.0)
¶
Correct a track with one measurement update.
Parameters¶
theta_r_scale : float
Multiplier applied to R[2,2] (theta measurement noise) for this correction. Values > 1 make the filter trust its own heading prediction more than the measurement; used when heading confidence is low.
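A scalar sketch of how scaling the heading measurement noise shifts the update toward the prediction. All values are illustrative, and the real filter is vectorized over tracks; only the R-scaling and the circular innovation wrap are taken from the description above.

```python
import numpy as np

def correct_theta(theta_pred, P, theta_meas, R=0.1, theta_r_scale=1.0):
    """1-D sketch of the heading update with theta_r_scale.

    Scaling R inflates measurement noise, so the posterior leans toward
    the prediction. The innovation is wrapped to [-pi, pi) as in circular
    angle handling. All numbers are illustrative.
    """
    Rs = R * theta_r_scale
    innov = (theta_meas - theta_pred + np.pi) % (2 * np.pi) - np.pi
    K = P / (P + Rs)                       # scalar Kalman gain
    theta_new = theta_pred + K * innov
    P_new = (1 - K) * P
    return theta_new, P_new

# Low heading confidence: theta_r_scale > 1 keeps the estimate closer
# to the prediction than a normal update would.
t_norm, _ = correct_theta(0.0, 0.2, 1.0, theta_r_scale=1.0)
t_weak, _ = correct_theta(0.0, 0.2, 1.0, theta_r_scale=10.0)
```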
get_mahalanobis_matrices()
¶
Return inverse innovation covariance matrices used by assignment.
get_position_uncertainties()
¶
Return per-track positional uncertainty summary values.
IndividualDatasetGenerator
¶
Real-time individual dataset generator that runs during tracking.
Exports OBB-masked crops for each detection, where only the detected animal's OBB region is visible and the rest is masked. This provides clean, isolated training data for individual-level analysis.
Key features:
- Runs in parallel with tracking (called per-frame)
- Uses actual OBB polygon to mask out other animals
- Saves crops with track/trajectory ID labels
- Uses already-filtered detections (ROI + size filtering done by tracking)
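The OBB-masking idea can be sketched as a convex half-plane test in NumPy. `obb_mask` is a hypothetical helper, not the generator's actual code; it only illustrates how a (4, 2) corner polygon selects the visible region.

```python
import numpy as np

def obb_mask(height, width, corners):
    """Boolean mask of pixels inside a convex OBB (illustrative sketch).

    `corners` is a (4, 2) array of (x, y) points in consistent winding
    order; the real generator uses the detector's OBB polygon similarly
    to hide other animals in each crop.
    """
    yy, xx = np.mgrid[0:height, 0:width]
    inside = np.ones((height, width), bool)
    for i in range(4):
        ax, ay = corners[i]
        bx, by = corners[(i + 1) % 4]
        # Half-plane test: a pixel is inside iff it lies on the same
        # side of every edge of the polygon.
        cross = (bx - ax) * (yy - ay) - (by - ay) * (xx - ax)
        inside &= cross >= 0
    return inside

corners = np.array([[1, 1], [3, 1], [3, 3], [1, 3]], float)
m = obb_mask(5, 5, corners)
```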
__init__(params, output_dir, video_name, dataset_name=None)
¶
Initialize the individual dataset generator.
Args:
- params: Parameter dictionary
- output_dir: Base directory for saving crops
- video_name: Name of the source video (for organizing output)
- dataset_name: Optional custom name for the dataset
process_frame(frame, frame_id, meas, obb_corners=None, ellipse_params=None, confidences=None, track_ids=None, trajectory_ids=None, coord_scale_factor=1.0, detection_ids=None, heading_hints=None, directed_mask=None, velocities=None, canonical_affines=None, canonical_canvas_dims=None, canonical_M_inverse=None)
¶
Process a frame and save masked crops for each detection.
Supports both YOLO OBB detections and ellipse detections from background subtraction. For ellipses, OBB corners are computed from the ellipse parameters.
Detections passed here are already filtered by ROI and size filtering in the tracking pipeline - no additional filtering needed.
Args:
- frame: Input frame (BGR); should be ORIGINAL resolution
- frame_id: Current frame number
- meas: List of measurements [cx, cy, theta, ...] for each detection (in detection resolution)
- obb_corners: List of OBB corner arrays (4 points each) for YOLO detections (in detection resolution)
- ellipse_params: List of ellipse params [major_axis, minor_axis] for BG sub detections (in detection resolution; center and theta are taken from meas)
- confidences: Optional list of confidence scores
- track_ids: Optional list of track IDs for each detection
- trajectory_ids: Optional list of trajectory IDs for each detection
- coord_scale_factor: Scale factor to convert detection coords to original resolution (1/resize_factor)
- detection_ids: Optional list of unique Detection IDs for each detection
- heading_hints: Optional directed heading angles (radians) per detection from head-tail model
- directed_mask: Optional boolean mask (0/1) per detection indicating whether heading_hints is valid
- velocities: Optional (vx, vy) tuples per detection for motion-based fallback orientation
- canonical_affines: Optional M_canonical (2x3) affine matrices per detection. When provided and canonical crop dimensions > 0, extract rotation-normalised canonical crops instead of AABB-masked crops. Falls back to legacy AABB extraction when None or when a specific detection has no affine.
Returns: num_saved: Number of crops saved from this frame
save_interpolated_crop(frame, frame_id, cx, cy, w, h, theta, traj_id, interp_from, interp_index, interp_total, heading_angle=None, heading_directed=False, canonical_affine=None)
¶
Save one interpolated crop for trajectory gap-filling supervision.
finalize()
¶
Finalize the dataset and save metadata. Called when tracking completes.
Returns: str: Path to the dataset directory, or None if not enabled
TrackingWorker
¶
Bases: QThread
Core tracking engine. Orchestrates tracking components to be functionally identical to the original monolithic implementation.
set_parameters(p)
¶
Set full tracking parameter dictionary in a thread-safe way.
update_parameters(new_params)
¶
Slot to safely update parameters from the GUI thread.
get_current_params()
¶
Return a shallow copy of current tracking parameters.
stop()
¶
Request cooperative stop for current processing loop.
emit_frame(bgr)
¶
Emit current frame to GUI in RGB format.
run()
¶
Execute tracking pipeline for the configured video and parameters.
interpolate_trajectories(trajectories_df, method='linear', max_gap=10, heading_flip_max_burst=5, directed_heading_posthoc=False)
¶
Interpolate missing values in trajectories using various methods.
Args:
- trajectories_df: DataFrame with trajectory data (must have X, Y, Theta, FrameID columns)
- method: Interpolation method, one of 'linear', 'cubic', 'spline', or 'none'
- max_gap: Maximum gap size to interpolate (frames). Larger gaps are left as NaN.
- heading_flip_max_burst: Maximum length of an isolated heading-flip burst to correct in post-processing. Longer segments are assumed genuine. Ignored when directed_heading_posthoc is True.
- directed_heading_posthoc: When True (head-tail or pose model was used), apply global heading consistency via dynamic programming instead of the local burst-based flip correction. This resolves the minimum number of per-frame flips needed to make the entire track consistently directed.
Returns: DataFrame with interpolated values
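The max_gap behavior can be sketched with pandas on a single column. This is a simplified stand-in, not the library's implementation: it shows only gap-limited linear interpolation, without the X/Y/Theta or heading-flip handling.

```python
import numpy as np
import pandas as pd

def interpolate_with_max_gap(series, max_gap=2):
    """Sketch of gap-limited linear interpolation (max_gap in frames).

    Gaps longer than max_gap stay NaN, mirroring the documented behavior.
    """
    filled = series.interpolate(method="linear", limit_area="inside")
    # Measure each NaN run in the original and re-mask runs that are
    # longer than max_gap.
    isna = series.isna()
    run_id = (isna != isna.shift()).cumsum()
    run_len = isna.groupby(run_id).transform("sum")
    filled[isna & (run_len > max_gap)] = np.nan
    return filled

x = pd.Series([0.0, np.nan, 2.0, np.nan, np.nan, np.nan, 6.0])
out = interpolate_with_max_gap(x, max_gap=2)
```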
process_trajectories(trajectories_full, params)
¶
Cleans and refines raw trajectory data.
This function performs several steps:
- Removes trajectories that are too short.
- Breaks trajectories at points of impossibly high velocity or large jumps, which often indicate identity switches or tracking errors.

Args:
- trajectories_full (list of lists): The raw trajectory data from the tracker.
- params (dict): The dictionary of tracking parameters.
Returns: tuple: (final_trajectories, statistics_dict)
process_trajectories_from_csv(csv_path, params)
¶
Cleans and refines trajectory data from CSV file, preserving all columns including confidence metrics.
This function performs several steps:
- Removes trajectories that are too short.
- Breaks trajectories at points of impossibly high velocity or large jumps, which often indicate identity switches or tracking errors.
- Preserves all columns from the input CSV (including confidence metrics).

Args:
- csv_path (str): Path to the raw CSV file from tracking
- params (dict): The dictionary of tracking parameters.
Returns: tuple: (final_trajectories_df, statistics_dict)
resolve_trajectories(forward_trajs, backward_trajs, params=None)
¶
Merges forward and backward trajectories using conservative consensus-based merging.
This function prioritizes identity confidence over trajectory completeness:
1. Only considers trajectory pairs as merge candidates if they have sufficient overlapping frames where positions agree (within AGREEMENT_DISTANCE)
2. Merges only the agreeing segments; disagreeing frames cause trajectory splits
3. Results in more trajectory fragments but higher confidence in identity

Algorithm:
- For each forward/backward pair, count frames where both have valid positions within AGREEMENT_DISTANCE of each other
- If count >= MIN_OVERLAP_FRAMES, they are merge candidates
- During merge, agreeing frames are averaged; disagreeing frames cause splits into separate trajectory segments

Args:
- forward_trajs (list): List of forward trajectory DataFrames or lists of tuples
- backward_trajs (list): List of backward trajectory DataFrames or lists of tuples
- params (dict, optional): Parameters for merging thresholds
Returns: list: Final merged trajectories as list of DataFrames
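The agreement-counting step of the consensus merge can be sketched as follows. The frame-to-position maps are hypothetical inputs, and the default stands in for the documented AGREEMENT_DISTANCE threshold.

```python
import numpy as np

def count_agreeing_frames(fwd, bwd, agreement_distance=5.0):
    """Count overlapping frames where forward/backward positions agree.

    fwd/bwd are hypothetical dicts mapping frame_id -> (x, y);
    agreement_distance stands in for AGREEMENT_DISTANCE. A pair whose
    count reaches MIN_OVERLAP_FRAMES would become a merge candidate.
    """
    n = 0
    for frame_id in set(fwd) & set(bwd):
        d = np.hypot(fwd[frame_id][0] - bwd[frame_id][0],
                     fwd[frame_id][1] - bwd[frame_id][1])
        if d <= agreement_distance:
            n += 1
    return n

fwd = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0)}
bwd = {1: (1.0, 3.0), 2: (40.0, 0.0), 3: (3.0, 0.0)}
n = count_agreeing_frames(fwd, bwd)   # frame 1 agrees, frame 2 does not
```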
hydra_suite.core.tracking.worker
¶
Core tracking engine running in a separate thread for real-time performance. This is the main orchestrator, functionally identical to the original.
TrackingWorker
¶
Bases: QThread
Core tracking engine. Orchestrates tracking components to be functionally identical to the original monolithic implementation.
set_parameters(p)
¶
Set full tracking parameter dictionary in a thread-safe way.
update_parameters(new_params)
¶
Slot to safely update parameters from the GUI thread.
get_current_params()
¶
Return a shallow copy of current tracking parameters.
stop()
¶
Request cooperative stop for current processing loop.
emit_frame(bgr)
¶
Emit current frame to GUI in RGB format.
run()
¶
Execute tracking pipeline for the configured video and parameters.
hydra_suite.core.detectors
¶
Object detection engines.
ObjectDetector
¶
Detects objects in foreground masks and extracts measurements.
apply_conservative_split(fg_mask, gray=None, background=None)
¶
Split merged blobs by locally raising the threshold.
If gray and background are provided, the split uses a tighter threshold on the raw difference image inside suspicious regions, which preserves animal shape better than erosion. Falls back to simple erosion when the raw images are unavailable (e.g., cached detection replays).
detect_objects(fg_mask, frame_count)
¶
Detects and measures objects from the final foreground mask.
Returns:
- meas: List of measurements [cx, cy, angle]
- sizes: List of detection areas
- shapes: List of (area, aspect_ratio) tuples
- yolo_results: None (for compatibility with YOLO detector)
- confidences: List of detection confidence scores (0-1)
DetectionFilter
¶
Bases: OBBGeometryMixin
Lightweight post-hoc filter for cached raw YOLO detections.
Contains only confidence thresholding and OBB IOU NMS — the exact same logic used by YOLOOBBDetector.filter_raw_detections — with no model loading. Safe to instantiate cheaply inside inner optimizer loops.
Usage::

    filt = DetectionFilter(params)
    meas, sizes, shapes, confs, corners, *_ = filt.filter_raw_detections(
        raw_meas, raw_sizes, raw_shapes, raw_confidences, raw_obb_corners
    )
YOLOOBBDetector
¶
Bases: OBBGeometryMixin, RuntimeArtifactMixin
Detects objects using a pretrained YOLO OBB (Oriented Bounding Box) model. Compatible interface with ObjectDetector for seamless integration.
detect_objects(frame, frame_count, return_raw=False, profiler=None)
¶
Detects objects in a frame using YOLO OBB.
Args:
- frame: Input frame (grayscale or BGR)
- frame_count: Current frame number for logging

Returns:
- Default mode: meas, sizes, shapes, yolo_results, confidences
- If return_raw=True: raw_meas, raw_sizes, raw_shapes, yolo_results, raw_confidences, raw_obb_corners, raw_heading_hints, raw_heading_confidences, raw_directed_mask, raw_canonical_affines
detect_objects_batched(frames, start_frame_idx, progress_callback=None, return_raw=False, profiler=None)
¶
Detect objects in a batch of frames using YOLO OBB.
Args:
- frames: List of frames (numpy arrays)
- start_frame_idx: Starting frame index for this batch
- progress_callback: Optional callback(current, total, message) for progress updates

Returns: List of tuples per frame:
- return_raw=False: (meas, sizes, shapes, confidences, obb_corners)
- return_raw=True: (raw_meas, raw_sizes, raw_shapes, raw_confidences, raw_obb_corners, raw_heading_hints, raw_heading_confidences, raw_directed_mask, raw_canonical_affines)
apply_conservative_split(fg_mask, gray=None, background=None)
¶
Placeholder method for compatibility with ObjectDetector interface. YOLO doesn't use foreground masks, so this is a no-op.
create_detector(params)
¶
Factory function to create the appropriate detector based on configuration.
Args: params: Configuration dictionary
Returns: ObjectDetector or YOLOOBBDetector instance
hydra_suite.core.background.model
¶
Background modeling utilities for multi-object tracking. Functionally identical to the original implementation's background logic.
BackgroundModel
¶
Manages background models for foreground detection in tracking. Supports GPU acceleration via:
- CUDA (NVIDIA GPUs) using CuPy
- MPS (Apple Silicon) using PyTorch
- CPU fallback with Numba JIT optimization
prime_background(cap)
¶
Initialize background model using "lightest pixel" method with lighting reference.
This is an exact port of the original prime_lightest_background method.
update_and_get_background(gray, roi_mask, tracking_stabilized)
¶
Update the background model and return the active subtraction background.
generate_foreground_mask(gray, background)
¶
Generates the foreground mask from the gray frame and background.
Uses GPU acceleration if available for significant speedup on large frames. Supports both CUDA (NVIDIA) and MPS (Apple Silicon) GPUs. Falls back to CPU if GPU operations fail (e.g., CuPy compilation errors).
hydra_suite.core.filters.kalman
¶
Biologically-Constrained Vectorized Kalman Filter. Features:
1. Anisotropic Process Noise (longitudinal vs. lateral uncertainty)
2. Velocity Damping (friction) for stop-and-go behavior
3. Joseph-Form Numerical Stability
4. Circular Angle Wrap-around
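Two of the listed features can be sketched in a few lines: velocity damping enters the state-transition matrix, and the Joseph form keeps the covariance update symmetric and positive-definite. All matrices and constants below are illustrative, not the filter's actual values.

```python
import numpy as np

def joseph_update(P, H, R, K):
    """Joseph-form covariance update: numerically stable and symmetric.

    P' = (I - K H) P (I - K H)^T + K R K^T
    """
    I = np.eye(P.shape[0])
    A = I - K @ H
    return A @ P @ A.T + K @ R @ K.T

# Tiny 1-D constant-velocity example with velocity damping (friction).
damping = 0.9                                   # illustrative value
F = np.array([[1.0, 1.0], [0.0, damping]])      # x' = x + v, v' = damping * v
P = np.diag([1.0, 1.0])
Q = np.diag([0.01, 0.01])
P_pred = F @ P @ F.T + Q                        # predict step

H = np.array([[1.0, 0.0]])                      # observe position only
R = np.array([[0.5]])
S = H @ P_pred @ H.T + R                        # innovation covariance
K = P_pred @ H.T @ np.linalg.inv(S)             # Kalman gain
P_post = joseph_update(P_pred, H, R, K)         # correct step
```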
KalmanFilterManager
¶
Manages a batch of biologically-constrained Kalman Filters.
initialize_filter(track_idx, initial_state)
¶
Reset one track slot with a new initial state estimate.
predict()
¶
Predict next measurement-space states for all active track slots.
get_predictions()
¶
Compatibility wrapper returning predict() output.
correct(track_idx, measurement, theta_r_scale=1.0)
¶
Correct a track with one measurement update.
Parameters¶
theta_r_scale : float
Multiplier applied to R[2,2] (theta measurement noise) for this correction. Values > 1 make the filter trust its own heading prediction more than the measurement; used when heading confidence is low.
get_mahalanobis_matrices()
¶
Return inverse innovation covariance matrices used by assignment.
get_position_uncertainties()
¶
Return per-track positional uncertainty summary values.
hydra_suite.core.assigners.hungarian
¶
Optimized Track Assigner. Compatible with Vectorized Kalman Filter. Uses batch Mahalanobis distance and Numba-accelerated spatial assignment.
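The batched Mahalanobis distance can be computed for all track/detection pairs with a single einsum. Shapes and values here are illustrative; the real assigner gets its inverse innovation covariances from the Kalman filter manager.

```python
import numpy as np

def batch_mahalanobis_sq(innovations, S_inv):
    """Squared Mahalanobis distances for all track/detection pairs at once.

    innovations: (N, M, D) residuals; S_inv: (N, D, D) inverse innovation
    covariance per track. d2[n, m] = innov[n, m] @ S_inv[n] @ innov[n, m].
    """
    return np.einsum("nmd,nde,nme->nm", innovations, S_inv, innovations)

innov = np.zeros((2, 2, 2))
innov[0, 1] = [3.0, 0.0]
innov[1, 0] = [0.0, 2.0]
S_inv = np.stack([np.eye(2), np.eye(2) * 0.25])
d2 = batch_mahalanobis_sq(innov, S_inv)
```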
TrackAssigner
¶
Handles assignment of detections to tracks with optimizations.
compute_cost_matrix(N, measurements, predictions, shapes, kf_manager, last_shape_info, meas_ori_directed=None, association_data=None)
¶
Computes cost matrix. Compatible with Vectorized Kalman Filter.
compute_assignment_confidence(cost, matched_pairs)
¶
Compute confidence scores for assignments.
assign_tracks(cost, N, M, meas, track_states, tracking_continuity, kf_manager, spatial_candidates=None, association_data=None, committed_slot_identities=None, missed_frames=None)
¶
Drop-in replacement for track assignment logic. Compatible with kf_manager.X state access.
Returns (rows, cols, free_dets, identity_rejoin_pairs) where
identity_rejoin_pairs is a list of (slot_index, det_index)
tuples from the identity-only rejoin path for committed-lost slots.
hydra_suite.core.post.processing
¶
Trajectory post-processing utilities for cleaning and refining tracking data.
Optimizations:
- NumPy vectorization for distance calculations
- Numba JIT compilation for inner loops (if available)
- Parallel processing for independent trajectory operations
process_trajectories_from_csv(csv_path, params)
¶
Cleans and refines trajectory data from CSV file, preserving all columns including confidence metrics.
This function performs several steps:
- Removes trajectories that are too short.
- Breaks trajectories at points of impossibly high velocity or large jumps, which often indicate identity switches or tracking errors.
- Preserves all columns from the input CSV (including confidence metrics).

Args:
- csv_path (str): Path to the raw CSV file from tracking
- params (dict): The dictionary of tracking parameters.
Returns: tuple: (final_trajectories_df, statistics_dict)
process_trajectories(trajectories_full, params)
¶
Cleans and refines raw trajectory data.
This function performs several steps:
- Removes trajectories that are too short.
- Breaks trajectories at points of impossibly high velocity or large jumps, which often indicate identity switches or tracking errors.

Args:
- trajectories_full (list of lists): The raw trajectory data from the tracker.
- params (dict): The dictionary of tracking parameters.
Returns: tuple: (final_trajectories, statistics_dict)
resolve_trajectories(forward_trajs, backward_trajs, params=None)
¶
Merges forward and backward trajectories using conservative consensus-based merging.
This function prioritizes identity confidence over trajectory completeness:
1. Only considers trajectory pairs as merge candidates if they have sufficient overlapping frames where positions agree (within AGREEMENT_DISTANCE)
2. Merges only the agreeing segments; disagreeing frames cause trajectory splits
3. Results in more trajectory fragments but higher confidence in identity

Algorithm:
- For each forward/backward pair, count frames where both have valid positions within AGREEMENT_DISTANCE of each other
- If count >= MIN_OVERLAP_FRAMES, they are merge candidates
- During merge, agreeing frames are averaged; disagreeing frames cause splits into separate trajectory segments

Args:
- forward_trajs (list): List of forward trajectory DataFrames or lists of tuples
- backward_trajs (list): List of backward trajectory DataFrames or lists of tuples
- params (dict, optional): Parameters for merging thresholds
Returns: list: Final merged trajectories as list of DataFrames
resolve_simultaneous_identity_conflicts(result_dfs, params=None)
¶
Demote the weaker of two tracks that simultaneously claim the same identity.
Enforces the physical constraint that a given identity can belong to at most
one trajectory at any point in time. For each pair of trajectories with the
same majority IdentityAssignedLabel and at least one shared frame, the
lower-scoring one has its identity columns cleared and
IdentityConflictResolved set to True.
Scoring follows the same shape as the iterative fragment solver: a unary
quality term agreement × mean_conf × length_factor plus an additive
AprilTag bonus, with the forward-pass flag as the lex tiebreaker. A long
track with consistent labels and a clear margin therefore wins over a
short, jittery, or low-confidence one — the loser is the one that gets
cleared to Unknown.
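The scoring shape described above can be sketched directly. The helper name and all numbers are illustrative; only the unary-term structure, the additive bonus, and the lexicographic tiebreak follow the description.

```python
def track_score(agreement, mean_conf, length_factor,
                apriltag_bonus=0.0, is_forward=False):
    """Sketch of the conflict-resolution score (values illustrative).

    Unary quality = agreement * mean_conf * length_factor, plus an
    additive AprilTag bonus; the forward-pass flag breaks exact ties
    lexicographically via tuple comparison.
    """
    return (agreement * mean_conf * length_factor + apriltag_bonus,
            is_forward)

long_clean = track_score(agreement=0.95, mean_conf=0.9, length_factor=1.0)
short_jittery = track_score(agreement=0.5, mean_conf=0.6, length_factor=0.2)
winner = max([long_clean, short_jittery])   # the loser would be cleared
```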
interpolate_trajectories(trajectories_df, method='linear', max_gap=10, heading_flip_max_burst=5, directed_heading_posthoc=False)
¶
Interpolate missing values in trajectories using various methods.
Args:
- trajectories_df: DataFrame with trajectory data (must have X, Y, Theta, FrameID columns)
- method: Interpolation method, one of 'linear', 'cubic', 'spline', or 'none'
- max_gap: Maximum gap size to interpolate (frames). Larger gaps are left as NaN.
- heading_flip_max_burst: Maximum length of an isolated heading-flip burst to correct in post-processing. Longer segments are assumed genuine. Ignored when directed_heading_posthoc is True.
- directed_heading_posthoc: When True (head-tail or pose model was used), apply global heading consistency via dynamic programming instead of the local burst-based flip correction. This resolves the minimum number of per-frame flips needed to make the entire track consistently directed.
Returns: DataFrame with interpolated values
relink_trajectories_with_pose(trajectories_df, params)
¶
Greedily relink short trajectory fragments using motion and optional pose continuity.
hydra_suite.core.identity
¶
Identity and individual-level analysis.
Sub-packages:
- pose/ — Pose inference backends, types, utilities, quality
- classification/ — Per-detection classifiers (AprilTag, CNN, head-tail)
- properties/ — Properties caching and CSV export aggregation
- dataset/ — Crop generation and video export
AprilTagConfig
dataclass
¶
All tunables for the AprilTag detection step.
from_params(params)
classmethod
¶
Build from the MAT tracking-parameters dictionary.
AprilTagDetector
¶
Detect AprilTags inside OBB/bbox crops using composite-strip decoding.
Usage::

    detector = AprilTagDetector(AprilTagConfig.from_params(params))
    observations = detector.detect_in_crops(crops, offsets_xy)
    detector.close()  # optional, releases C resources
detect_in_crops(crops, offsets_xy, det_indices=None)
¶
Detect tags in pre-extracted crops.
Parameters¶
crops:
List of BGR crops already cut from the frame.
offsets_xy:
(x_offset, y_offset) of each crop's top-left corner in the
original frame.
det_indices:
Explicit detection-index per crop.
close()
¶
Release native detector resources.
ClassPrediction
dataclass
¶
Single detection's classifier output.
For flat models factor_names has length 1 and the class_name /
confidence properties give the scalar view. For multi-head models
each tuple index is a distinct factor.
CNNIdentityBackend
¶
High-level wrapper around ClassifierBackend that adds CNN identity
semantics: per-factor confidence thresholding, class-name lookup, and
scoring-mode validation.
predict_batch(crops)
¶
Run inference and return per-crop ClassPrediction instances with
per-factor confidence thresholding applied.
predict_batch_cuda(crops)
¶
GPU-native batch prediction path (Streaming Phase 2).
Delegates to ClassifierBackend.predict_batch_cuda() when the
underlying backend supports it. Falls back transparently to the CPU
path when GPU execution is not available.
Parameters¶
crops:
Either a list of CPU np.ndarray crops or a stacked CUDA tensor
(B, C, H, W). The underlying backend selects the appropriate
execution path based on input type and the configured runtime.
Returns¶
list[ClassPrediction]
Same contract as predict_batch().
predict_batch_posteriors(crops, calibration=None)
¶
Calibrated posterior output hook (Streaming Phase 2 / Identity Phase 0).
Runs the same batch inference as predict_batch() but additionally
returns the full calibrated probability distribution over every class in
every factor, enabling the identity overhaul to build
IdentityEvidence objects without re-running inference.
Parameters¶
crops:
List of np.ndarray crops (same contract as predict_batch).
calibration:
Optional CalibrationModel from identity.calibration.
When None, raw softmax probabilities are returned as-is.
Returns¶
predictions: list[ClassPrediction]
Hard predictions (same as predict_batch()).
posteriors: list[list[np.ndarray]]
posteriors[det_index][factor_index] is a shape (K_f,)
float64 array of calibrated probabilities over the factor's
class list. The caller maps these to catalog log-priors via
IdentityCatalog.cnn_log_prior().
CNNIdentityCache
¶
Persistent .npz cache of per-frame CNN identity predictions.
Data is accumulated in memory via save() and written to disk in a
single compressed write via flush(). Call load() during the
tracking loop to retrieve per-frame predictions.
Supports two on-disk formats:
- Legacy (no factor_names key in the .npz): flat single-factor, keys f{N}_det, f{N}_cls, f{N}_conf. Reconstructed on load as factor_names=("flat",).
- v2 (cache_schema_version == 2 and factor_names present): multi-factor, keys f{N}_det, f{N}_cls_k{K}, f{N}_conf_k{K} for each factor index K.
The factor_names constructor argument is used only when writing new
caches. When loading an existing file the stored factor_names always wins.
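The v2 key layout can be illustrated with a plain NumPy .npz round trip. Only the key names follow the format description; the values, factor name, and in-memory buffer are illustrative, and the real cache adds its own bookkeeping.

```python
import io
import numpy as np

# Minimal sketch of the documented v2 multi-factor key layout.
frame, factor = 7, 0
payload = {
    "cache_schema_version": np.array(2),
    "factor_names": np.array(["identity"]),             # illustrative name
    f"f{frame}_det": np.array([0, 1]),                  # detection indices
    f"f{frame}_cls_k{factor}": np.array([3, 5]),        # class ids per det
    f"f{frame}_conf_k{factor}": np.array([0.91, 0.44]), # confidences
}
buf = io.BytesIO()
np.savez_compressed(buf, **payload)                     # one compressed write
buf.seek(0)
loaded = np.load(buf)                                   # read it back
```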
class_names_per_factor
property
¶
Per-factor class name lists, or None when not stored.
exists()
¶
Return True if the cache file exists on disk.
save(frame_idx, predictions, posteriors=None)
¶
Update in-memory cache for frame_idx. Call flush() when done.
posteriors is an optional per-detection list of per-factor probability
vectors (one np.ndarray of shape (n_classes,) per factor). When provided
the full distributions are persisted alongside the top-1 predictions so
that augmented exports can include per-class probability columns.
flush()
¶
Write all in-memory predictions to disk (v3 format when probs present).
load_probs(frame_idx)
¶
Return per-detection per-factor probability vectors for frame_idx.
Returns None when no probability data was stored. Otherwise returns
a list aligned with load(frame_idx): each entry is either None
(no probs for that detection) or a list of K np.ndarray prob vectors.
load(frame_idx)
¶
Return saved predictions for frame_idx, or [] if not found.
get_cached_frames()
¶
Return sorted frame indices present in the cache.
CNNIdentityConfig
dataclass
¶
Configuration for CNN Classifier identity method.
TrackCNNHistory
¶
Sliding-window per-track history of multi-factor classifier predictions.
Per-factor majority vote excludes None and "unknown" observations.
Ties return None for that factor.
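The vote rule can be sketched with collections.Counter. `majority_vote` is a hypothetical helper and the window contents are illustrative; only the exclusion of None/"unknown" and the tie-to-None behavior follow the description above.

```python
from collections import Counter, deque

def majority_vote(window):
    """Per-factor majority vote: None/'unknown' excluded, ties -> None."""
    votes = Counter(v for v in window if v not in (None, "unknown"))
    if not votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None                      # tie between top classes
    return ranked[0][0]

# Sliding window of recent per-track predictions for one factor.
history = deque(["a", "unknown", "a", None, "b"], maxlen=5)
```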
HeadTailAnalyzer
¶
Classifier-agnostic head-tail direction analyzer.
Wraps ClassifierBackend for v2 classifier artifacts and enforces:
- flat model (not multi-head) — raises HeadTailFormatError
- labels subset of {up, down, left, right, unknown} after alias normalization
Consumers use analyze_crops(frames, per_frame_obb_corners) for the
full frame-based pipeline. New callers may use predict_labels(crops)
or analyze_detections(crops, obb_major_axes) for simpler crop lists.
canonical_labels
property
¶
Normalized labels in checkpoint order.
is_available
property
¶
True if a model is loaded and the backend is not 'none'.
backend
property
¶
Name of the active inference backend ('backend_v2' or 'none').
class_names
property
¶
Class names reported by the loaded model, or None if unavailable.
input_size
property
¶
Expected (height, width) crop input size for the loaded model, or None.
model
property
¶
Retained for compatibility; v2-backed analyzers expose no raw model.
__init__(model_path='', device='cpu', conf_threshold=0.5, batch_size=64, reference_aspect_ratio=2.0, canonical_margin=1.3, predict_device=None, *, compute_runtime=None)
¶
Construct a HeadTailAnalyzer from a classifier artifact path.
Accepts both the legacy device= parameter (maps to torch device)
and the new compute_runtime= parameter (ClassifierBackend runtime).
When compute_runtime is provided it takes precedence.
Raises: HeadTailFormatError: model is multi-head or labels are not a subset of the canonical head-tail set.
valid_output_labels()
classmethod
¶
Return the frozenset of allowed canonical head-tail labels.
is_loaded()
¶
True when the analyzer has a model ready for inference.
predict_labels(crops)
¶
Return (canonical_label, confidence) per crop.
Labels below conf_threshold are collapsed to "unknown" with
their original confidence.
analyze_detections(crops, obb_major_axes)
¶
Return (heading_radians, confidence, directed_flag) per crop.
analyze_crops(frames, per_frame_obb_corners, profiler=None)
¶
Run head-tail analysis on multiple frames.
Args:
- frames: BGR video frames.
- per_frame_obb_corners: For each frame, a list of (4,2) OBB corners.
Returns:
Per-frame list of (heading_radians, confidence, directed_flag)
tuples. heading_radians is nan when direction is ambiguous.
directed_flag is 1 when classifier was confident, 0 otherwise.
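The output contract can be sketched as a label-to-angle mapping. The angle convention is a hypothetical crop-frame choice (the real analyzer derives the final heading from the OBB major axis); only the (heading, confidence, directed_flag) shape and the threshold-to-unknown collapse follow the documentation.

```python
import math

# Hypothetical mapping from canonical head-tail labels to crop-frame angles.
LABEL_ANGLES = {"right": 0.0, "up": math.pi / 2,
                "left": math.pi, "down": -math.pi / 2}

def heading_tuple(label, confidence, conf_threshold=0.5):
    """Return (heading_radians, confidence, directed_flag).

    heading is nan and directed_flag is 0 when the label is 'unknown'
    or confidence falls below the threshold, per the documented contract.
    """
    if label == "unknown" or confidence < conf_threshold:
        return (math.nan, confidence, 0)
    return (LABEL_ANGLES[label], confidence, 1)

h_confident = heading_tuple("left", 0.9)
h_ambiguous = heading_tuple("up", 0.3)
```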
analyze_crops_cuda(frames_hwc, per_frame_obb_corners, profiler=None, *, input_is_bgr=False)
¶
GPU-native head-tail analysis for CUDA-resident frames.
Mirrors analyze_crops() but accepts CUDA tensors instead of
numpy arrays. The affine warp, resize, and classifier forward pass
all stay on-device; only the final (tiny) probability vectors are
moved to CPU.
Parameters¶
frames_hwc:
List of (H, W, C) CUDA tensors (uint8 or float32). Typically
produced by NVDec (RGB, uint8) or the sequential GPU pipeline.
per_frame_obb_corners:
Same format as analyze_crops().
input_is_bgr:
Set True when frames use BGR channel ordering (cv2 convention).
Default False assumes RGB (NVDec output).
close()
¶
Release the loaded model and reset the backend to 'none'.
IndividualDatasetGenerator
¶
Real-time individual dataset generator that runs during tracking.
Exports OBB-masked crops for each detection, where only the detected animal's OBB region is visible and the rest is masked. This provides clean, isolated training data for individual-level analysis.
Key features:
- Runs in parallel with tracking (called per-frame)
- Uses actual OBB polygon to mask out other animals
- Saves crops with track/trajectory ID labels
- Uses already-filtered detections (ROI + size filtering done by tracking)
__init__(params, output_dir, video_name, dataset_name=None)
¶
Initialize the individual dataset generator.
Args:
- params: Parameter dictionary
- output_dir: Base directory for saving crops
- video_name: Name of the source video (for organizing output)
- dataset_name: Optional custom name for the dataset
process_frame(frame, frame_id, meas, obb_corners=None, ellipse_params=None, confidences=None, track_ids=None, trajectory_ids=None, coord_scale_factor=1.0, detection_ids=None, heading_hints=None, directed_mask=None, velocities=None, canonical_affines=None, canonical_canvas_dims=None, canonical_M_inverse=None)
¶
Process a frame and save masked crops for each detection.
Supports both YOLO OBB detections and ellipse detections from background subtraction. For ellipses, OBB corners are computed from the ellipse parameters.
Detections passed here are already filtered by ROI and size filtering in the tracking pipeline - no additional filtering needed.
Args:
- frame: Input frame (BGR); should be ORIGINAL resolution
- frame_id: Current frame number
- meas: List of measurements [cx, cy, theta, ...] for each detection (in detection resolution)
- obb_corners: List of OBB corner arrays (4 points each) for YOLO detections (in detection resolution)
- ellipse_params: List of ellipse params [major_axis, minor_axis] for BG sub detections (in detection resolution; center and theta are taken from meas)
- confidences: Optional list of confidence scores
- track_ids: Optional list of track IDs for each detection
- trajectory_ids: Optional list of trajectory IDs for each detection
- coord_scale_factor: Scale factor to convert detection coords to original resolution (1/resize_factor)
- detection_ids: Optional list of unique Detection IDs for each detection
- heading_hints: Optional directed heading angles (radians) per detection from head-tail model
- directed_mask: Optional boolean mask (0/1) per detection indicating whether heading_hints is valid
- velocities: Optional (vx, vy) tuples per detection for motion-based fallback orientation
- canonical_affines: Optional M_canonical (2x3) affine matrices per detection. When provided and canonical crop dimensions > 0, extract rotation-normalised canonical crops instead of AABB-masked crops. Falls back to legacy AABB extraction when None or when a specific detection has no affine.
Returns: num_saved: Number of crops saved from this frame
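The coord_scale_factor conversion can be illustrated with a minimal sketch (the helper name is hypothetical, not part of the API): only the spatial components of a [cx, cy, theta, ...] measurement are scaled back to original resolution, while the angle is left unchanged.

```python
# Hypothetical helper: map a detection-resolution measurement back to the
# original frame resolution using coord_scale_factor (1 / resize_factor).
def scale_measurement(meas, coord_scale_factor):
    """Scale cx and cy of a [cx, cy, theta, ...] measurement.

    theta is an angle, so it is resolution-independent and left as-is.
    """
    cx, cy, theta = meas[0], meas[1], meas[2]
    return [cx * coord_scale_factor, cy * coord_scale_factor, theta, *meas[3:]]
```

For example, if frames were resized by 0.5 for detection, coord_scale_factor is 2.0 and a detection at (100, 50) maps back to (200, 100) in the original frame.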
save_interpolated_crop(frame, frame_id, cx, cy, w, h, theta, traj_id, interp_from, interp_index, interp_total, heading_angle=None, heading_directed=False, canonical_affine=None)
¶
Save one interpolated crop for trajectory gap-filling supervision.
finalize()
¶
Finalize the dataset and save metadata. Called when tracking completes.
Returns: str: Path to the dataset directory, or None if not enabled
OrientedTrackVideoExporter
¶
Build per-track orientation-fixed videos directly from cached geometry.
export(progress_callback=None, should_stop=None)
¶
Build orientation-corrected MP4 videos for every trajectory in the final CSV.
Streams the source video once, warps each frame into a canonical per-animal crop, and writes one output file per track to the configured output sub-directory.
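The warp step can be sketched in pure NumPy, assuming a standard rotate-about-center affine (this is a minimal illustration, not the exporter's actual implementation): the 2x3 matrix rotates the frame by -angle around the track center and translates that center to the middle of the output canvas, so the animal's axis lands horizontal in the crop.

```python
import numpy as np

def canonical_affine(cx, cy, angle_rad, out_w, out_h):
    """Build a 2x3 affine mapping frame point (cx, cy) to the centre of an
    out_w x out_h canvas while rotating the frame by -angle_rad about it."""
    c, s = np.cos(-angle_rad), np.sin(-angle_rad)
    # Rotation about the origin, then a translation chosen so that the
    # rotated (cx, cy) lands exactly on the canvas centre.
    tx = out_w / 2.0 - (c * cx - s * cy)
    ty = out_h / 2.0 - (s * cx + c * cy)
    return np.array([[c, -s, tx], [s, c, ty]], dtype=np.float64)
```

A matrix of this shape is what e.g. cv2.warpAffine consumes to produce the per-track canonical crop.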
PoseInferenceBackend
¶
Bases: Protocol
Protocol for all runtime backends.
PoseResult
dataclass
¶
Canonical pose output for one crop.
PoseRuntimeConfig
dataclass
¶
Configuration for pose runtime backend selection.
RuntimeMetrics
dataclass
¶
Timing metrics for pose runtime lifecycle.
IndividualPropertiesCache
¶
NPZ-backed cache for per-detection properties keyed by frame and detection ID.
is_compatible()
¶
Return True if the loaded cache matches the expected schema version.
get_cached_frames()
¶
Return a sorted iterable of frame indices present in the cache.
add_frame(frame_idx, detection_ids, pose_mean_conf=None, pose_valid_fraction=None, pose_num_valid=None, pose_num_keypoints=None, pose_keypoints=None)
¶
Add a frame of detection data to cache.
Note: pose_mean_conf, pose_valid_fraction, pose_num_valid, and pose_num_keypoints are deprecated and ignored. Only raw keypoints are stored. Summary statistics are computed on-demand when reading.
save(metadata=None)
¶
Flush all accumulated frame data to a compressed .npz file on disk.
get_frame(frame_idx, min_valid_conf=0.2)
¶
Get frame data with summary statistics computed on-demand.
Args:
frame_idx: Frame index to retrieve
min_valid_conf: Minimum confidence threshold for keypoint validity (default: 0.2)
Returns: Dict with detection_ids, pose_mean_conf, pose_valid_fraction, pose_num_valid, pose_num_keypoints, and pose_keypoints
get_detection(frame_idx, detection_id)
¶
Get per-detection data for a specific detection ID in a frame.
Returns a dict of pose properties for the matching detection, or None if the detection ID is not found in that frame.
close()
¶
Close the underlying NpzFile handle and release all in-memory cache data.
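The add_frame / save / read-back lifecycle follows a common NPZ pattern, sketched below with hypothetical key names (the real cache's internal layout and schema are not shown here): per-frame arrays are accumulated in memory, flushed once with np.savez_compressed, and read back by key.

```python
import numpy as np

class TinyNpzCache:
    """Toy NPZ-backed cache; key layout "f{frame_idx}/<field>" is assumed."""

    def __init__(self):
        self._pending = {}

    def add_frame(self, frame_idx, detection_ids, pose_keypoints):
        # Only raw keypoints are stored; statistics are derived at read time.
        self._pending[f"f{frame_idx}/ids"] = np.asarray(detection_ids)
        self._pending[f"f{frame_idx}/kps"] = np.asarray(pose_keypoints)

    def save(self, path):
        # Flush everything to one compressed archive on disk.
        np.savez_compressed(path, **self._pending)

    @staticmethod
    def load_frame(path, frame_idx):
        with np.load(path) as npz:
            return {
                "detection_ids": npz[f"f{frame_idx}/ids"],
                "pose_keypoints": npz[f"f{frame_idx}/kps"],
            }
```

Computing summary statistics (mean confidence, valid fraction) lazily at read time, as the real cache does, keeps the archive small and avoids baking a confidence threshold into the stored data.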
apply_cnn_identity_cost(*, track_identity, det, match_bonus, mismatch_penalty, scoring_mode)
¶
Compute the cost delta contributed by a CNN identity classifier for a (track, detection) pair under the given scoring mode.
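The general shape of such a cost delta can be sketched as follows; the function name, confidence gate, and default constants here are assumptions for illustration, not the library's actual scoring modes. An agreeing identity makes the assignment cheaper, a confident disagreement makes it more expensive, and a low-confidence prediction abstains.

```python
def identity_cost_delta(track_identity, det_identity, det_confidence,
                        match_bonus=5.0, mismatch_penalty=10.0,
                        min_confidence=0.5):
    """Illustrative identity-based adjustment to one cost-matrix entry."""
    if det_identity is None or det_confidence < min_confidence:
        return 0.0  # classifier abstains: leave the cost unchanged
    if det_identity == track_identity:
        return -match_bonus  # identities agree: cheaper to assign
    return mismatch_penalty  # identities disagree: more expensive
```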
ellipse_axes_from_area(area, aspect_ratio)
¶
Compute ellipse major/minor axes from area and aspect ratio.
Returns: (major_axis, minor_axis)
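The underlying geometry is simple to verify: with full axis lengths (major, minor), the ellipse area is pi * major * minor / 4 and aspect_ratio = major / minor, which inverts to the closed form below (a sketch of the math, assuming the full-axis convention documented for ellipse_to_obb_corners).

```python
import math

def ellipse_axes_from_area(area, aspect_ratio):
    """Invert area = pi * major * minor / 4 with major = aspect_ratio * minor."""
    major = math.sqrt(4.0 * area * aspect_ratio / math.pi)
    minor = major / aspect_ratio
    return major, minor
```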
ellipse_to_obb_corners(cx, cy, major_axis, minor_axis, theta)
¶
Convert ellipse parameters to OBB corner points.
The OBB is the oriented bounding box that exactly fits the ellipse, which is a rotated rectangle with dimensions (major_axis x minor_axis).
Args:
cx, cy: Center coordinates of the ellipse
major_axis: Full length of the major axis (not semi-axis)
minor_axis: Full length of the minor axis (not semi-axis)
theta: Rotation angle in radians (orientation of major axis)
Returns: corners: numpy array of shape (4, 2) with corner coordinates
resolve_directed_angle(theta, heading_hint=None, heading_directed=False, vx=None, vy=None)
¶
Resolve the best directed orientation angle for a crop.
Priority:
1. Head-tail model heading (heading_directed=True and finite heading_hint).
2. Motion velocity (vx, vy non-negligible); disambiguates theta ± π.
3. OBB axis angle (undirected, 180° ambiguity retained).
The returned angle points tail → head so that the affine-warp canonicalization places the head on the right side (+x) of the canonical crop.
Returns: (angle_rad, is_directed, source_str)
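The priority chain above can be sketched as follows; the minimum-speed threshold and source strings are illustrative assumptions, not the library's exact values. The motion fallback picks whichever of theta or theta + π lies closer to the velocity direction.

```python
import math

def resolve_directed_angle(theta, heading_hint=None, heading_directed=False,
                           vx=None, vy=None, min_speed=0.5):
    """Illustrative priority chain: head-tail, then motion, then OBB axis."""
    # 1. Trust the head-tail model when it flags a finite directed heading.
    if heading_directed and heading_hint is not None and math.isfinite(heading_hint):
        return heading_hint, True, "head_tail"
    # 2. Use motion to disambiguate theta vs theta + pi when speed is non-negligible.
    if vx is not None and vy is not None and math.hypot(vx, vy) >= min_speed:
        motion = math.atan2(vy, vx)
        diff = (theta - motion + math.pi) % (2 * math.pi) - math.pi  # wrapped to [-pi, pi)
        angle = theta if abs(diff) <= math.pi / 2 else theta + math.pi
        return angle % (2 * math.pi), True, "motion"
    # 3. Fall back to the undirected OBB axis angle.
    return theta, False, "obb_axis"
```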
build_runtime_config(params, out_root, keypoint_names_override=None, skeleton_edges_override=None)
¶
Construct a PoseRuntimeConfig from a tracking params dict.
Resolves backend family, runtime flavor, device, batch size, skeleton, and all model-path fields, applying compute_runtime-derived overrides where appropriate.
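The override precedence can be sketched with a toy config (the dataclass fields, param keys, and defaults below are assumptions for illustration, not PoseRuntimeConfig's real schema): values derived from a compute_runtime sub-dict take precedence over the plain params entries.

```python
from dataclasses import dataclass

@dataclass
class DemoRuntimeConfig:
    backend: str
    device: str
    batch_size: int

def build_demo_config(params):
    """Resolve a runtime config from a params dict with override precedence."""
    compute = params.get("compute_runtime", {})
    return DemoRuntimeConfig(
        backend=params.get("pose_backend", "onnx"),
        # compute_runtime-derived values override the plain params entry.
        device=compute.get("device", params.get("device", "cpu")),
        batch_size=int(params.get("pose_batch_size", 8)),
    )
```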