API: Data

hydra_suite.data

Data I/O and dataset preparation utilities.


hydra_suite.data.csv_writer

Utility functions for asynchronous CSV writing in the HYDRA Suite.

CSVWriterThread

Bases: Thread

Asynchronous CSV writer for high-performance data logging.

This thread-safe class handles CSV writing in the background to prevent I/O operations from blocking the main tracking loop. It uses a queue-based system to buffer data and ensures data integrity during high-frequency writes.

The CSV format includes:
- TrackID: Track slot identifier (0-based index, reused across track losses)
- TrajectoryID: Persistent trajectory identifier (increments when tracks are reassigned)
- Index: Sequential count of detections for this track slot
- X, Y: Pixel coordinates of object center
- Theta: Orientation angle in radians
- FrameID: Video frame number (1-based)
- State: Tracking state ('active', 'occluded', 'lost')

__init__(path, header=None)

Initialize CSV writer thread.

Args:
- path (str): Output CSV file path
- header (list, optional): Column names for CSV header

run()

Main thread loop for processing queued data.

Continuously processes queued data rows until the stop signal is received, then flushes all remaining data.

enqueue(row)

Add a data row to the write queue.

Args:
- row (list): Data row to write to CSV

stop()

Signal the thread to stop processing and shut down gracefully.
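The queue-based pattern this class describes can be sketched with the standard library alone. `MiniCSVWriter` below is an illustrative stand-in, not the actual `CSVWriterThread` implementation; its sentinel-based shutdown is one reasonable way to realize the documented `enqueue`/`stop` behavior:

```python
import csv
import queue
import threading

class MiniCSVWriter(threading.Thread):
    """Illustrative stand-in: queue-buffered background CSV writer."""

    _STOP = object()  # sentinel signalling graceful shutdown

    def __init__(self, path, header=None):
        super().__init__(daemon=True)
        self.path = path
        self.header = header
        self._queue = queue.Queue()

    def run(self):
        with open(self.path, "w", newline="") as f:
            writer = csv.writer(f)
            if self.header:
                writer.writerow(self.header)
            while True:
                row = self._queue.get()
                if row is self._STOP:
                    break  # all rows enqueued before stop() are already written (FIFO)
                writer.writerow(row)

    def enqueue(self, row):
        self._queue.put(row)

    def stop(self):
        self._queue.put(self._STOP)
        self.join()
```

Because the queue is FIFO, every row enqueued before `stop()` is flushed to disk before the file is closed, which is the integrity guarantee the class description promises.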

hydra_suite.data.detection_cache

Detection caching for efficient bidirectional tracking.

This module provides a memory-efficient way to cache detection data from the forward tracking pass and reuse it during the backward pass, eliminating the need for RAM-intensive video reversal.

DetectionCache

Efficient frame-by-frame detection cache using NPZ format.

This class provides streaming write during forward pass and streaming read during backward pass, minimizing memory footprint while enabling detection reuse.

__init__(cache_path, mode='w', start_frame=0, end_frame=None)

Initialize detection cache.

Args:
- cache_path: Path to the .npz cache file
- mode: 'w' for writing (forward pass), 'r' for reading (backward pass)
- start_frame: Starting frame index for writing
- end_frame: Ending frame index for writing

add_frame(frame_idx, meas, sizes, shapes, confidences, obb_corners=None, detection_ids=None, heading_hints=None, directed_mask=None, canonical_affines=None, canonical_canvas_dims=None, canonical_M_inverse=None)

Add detection data for a single frame (forward pass).

Args:
- frame_idx: Frame number (0-based)
- meas: List of numpy arrays, each [cx, cy, theta]
- sizes: List of detection areas
- shapes: List of tuples (ellipse_area, aspect_ratio)
- confidences: List of confidence scores (float or nan)
- obb_corners: Optional list of OBB corner arrays for YOLO
- detection_ids: Optional list of detection IDs (FrameID * 10000 + detection_index)
- heading_hints: Optional list of directed heading hints (radians)
- directed_mask: Optional list indicating whether heading_hints are directed
- canonical_affines: Optional list of (2, 3) affine matrices (M_align per detection)
- canonical_canvas_dims: Optional list of (width, height) int tuples per detection
- canonical_M_inverse: Optional list of (2, 3) inverse affine matrices per detection
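The `detection_ids` encoding documented above (`FrameID * 10000 + detection_index`) is easy to demonstrate directly; the helper names below are hypothetical, not part of the package API:

```python
def encode_detection_id(frame_id, detection_index):
    """Pack a frame ID and per-frame detection index into one integer ID."""
    return frame_id * 10000 + detection_index

def decode_detection_id(det_id):
    """Recover (frame_id, detection_index) from a packed detection ID."""
    return divmod(det_id, 10000)
```

Note the scheme implies at most 10000 detections per frame, since larger indices would collide with the next frame's ID range.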

save()

Save cached detections to disk (call at end of forward pass).

get_frame(frame_idx)

Get detection data for a single frame (backward pass).

Args:
- frame_idx: Frame number (0-based)

Returns:
Tuple of (meas, sizes, shapes, confidences, obb_corners, detection_ids, heading_hints, directed_mask), where meas is a list of numpy arrays to match the tracking worker API.

get_total_frames()

Get total number of frames in cache.

is_compatible()

Return whether the loaded cache format is supported by current code.

get_frame_range()

Get the frame range stored in cache.

covers_frame_range(start_frame, end_frame)

Check if cache fully covers the requested frame range.

matches_frame_range(start_frame, end_frame)

Return whether the cache metadata exactly matches the requested range.

get_missing_frames(start_frame, end_frame, max_report=10)

Return a list of missing frame indices (up to max_report).

close()

Close and clean up cache resources.
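The per-frame NPZ write/read round trip this class describes can be sketched as follows. `MiniDetectionCache` is an illustrative stand-in that handles only the `meas` arrays; the key naming scheme (`meas_<frame>`) is an assumption, not the real cache format:

```python
import numpy as np

class MiniDetectionCache:
    """Illustrative stand-in: per-frame arrays stored under keys in one .npz file."""

    def __init__(self, cache_path, mode="w"):
        self.cache_path = cache_path
        self.mode = mode
        if mode == "w":
            self._buffer = {}          # accumulated during the forward pass
        else:
            self._npz = np.load(cache_path)  # lazy, streaming read for the backward pass

    def add_frame(self, frame_idx, meas):
        # one array per frame, keyed by frame index (hypothetical key scheme)
        self._buffer[f"meas_{frame_idx}"] = np.asarray(meas, dtype=np.float64)

    def save(self):
        # single compressed archive on disk; call at end of the forward pass
        np.savez_compressed(self.cache_path, **self._buffer)

    def get_frame(self, frame_idx):
        return self._npz[f"meas_{frame_idx}"]
```

Because `np.load` on an `.npz` file decompresses members on access rather than all at once, the backward pass only materializes one frame's arrays at a time, which is the memory-footprint benefit the class description claims.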

hydra_suite.data.dataset_generation

Dataset generation utilities for active learning. Identifies challenging frames and exports them for annotation.

FrameQualityScorer

Scores frames based on various quality metrics to identify challenging frames that would benefit from additional training data.

score_frame(frame_id, detection_data=None, tracking_data=None)

Score a single frame based on enabled quality metrics.

Args:
- frame_id: Frame number
- detection_data: Dict with detection information
  - confidences: List of detection confidence scores
  - count: Number of detections
- tracking_data: Dict with tracking information
  - assignment_costs: List of assignment costs
  - lost_tracks: Number of lost tracks
  - uncertainties: List of position uncertainties

Returns:
- score: Higher score indicates a more problematic frame (0.0 to 1.0+)
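The scoring idea can be illustrated with a minimal sketch. The metric choice and weights below are assumptions for illustration; the real `FrameQualityScorer` combines its own configurable set of enabled metrics:

```python
def score_frame_sketch(detection_data=None, tracking_data=None):
    """Illustrative frame score: low detection confidence and lost tracks
    both push the score up (weights are assumed, not the library's)."""
    score = 0.0
    if detection_data and detection_data.get("confidences"):
        confs = detection_data["confidences"]
        # low mean confidence -> higher (worse) score
        score += 1.0 - sum(confs) / len(confs)
    if tracking_data:
        # each lost track adds a fixed penalty (0.1 chosen arbitrarily here)
        score += 0.1 * tracking_data.get("lost_tracks", 0)
    return score
```

This also shows why the documented range is "0.0 to 1.0+": additive penalty terms can push a genuinely bad frame above 1.0.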

get_worst_frames(max_frames, diversity_window=30, probabilistic=True)

Select the worst N frames with visual diversity constraint.

Args:
- max_frames: Maximum number of frames to select
- diversity_window: Minimum frame separation to ensure diversity
- probabilistic: If True, use rank-based probabilistic sampling; if False, use greedy selection (worst frames first)

Returns:
- selected_frames: List of frame IDs sorted by score (worst first)
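Both selection modes and the diversity window can be sketched as below. This is an illustrative re-implementation under assumed details (rank-based linear weights, scores passed in as a dict), not the library's actual algorithm:

```python
import random

def pick_worst_frames(scores, max_frames, diversity_window=30,
                      probabilistic=True, seed=0):
    """Illustrative diversity-constrained worst-frame selection.

    scores: dict {frame_id: score}; the real scorer keeps this state internally.
    """
    rng = random.Random(seed)
    candidates = sorted(scores, key=scores.get, reverse=True)  # worst first
    selected = []
    while candidates and len(selected) < max_frames:
        if probabilistic:
            # rank-based weights: worse-ranked frames are more likely to be drawn
            weights = [len(candidates) - i for i in range(len(candidates))]
            frame = rng.choices(candidates, weights=weights, k=1)[0]
        else:
            frame = candidates[0]  # greedy: take the worst remaining frame
        selected.append(frame)
        # enforce diversity: drop every frame closer than diversity_window to the pick
        candidates = [f for f in candidates if abs(f - frame) >= diversity_window]
    return sorted(selected, key=scores.get, reverse=True)
```

The probabilistic mode trades a little selection quality for variety across repeated exports, which matters when the same video is mined for training data more than once.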

export_dataset(video_path, csv_path, frame_ids, output_dir, dataset_name, class_name, params, include_context=True, _yolo_results_dict=None)

Export selected frames and annotations as a training dataset.

Args:
- video_path: Path to source video
- csv_path: Path to tracking CSV (for reading annotations)
- frame_ids: List of frame IDs to export
- output_dir: Directory to save dataset
- dataset_name: Name for the dataset
- class_name: Name of the object class (for classes.txt file)
- params: Parameters dict (for accessing RESIZE_FACTOR and REFERENCE_BODY_SIZE)
- include_context: Include ±1 frames around each selected frame
- _yolo_results_dict: Optional dict of {frame_id: yolo_detections} for YOLO format export

Returns:
- zip_path: Path to created zip file
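The `include_context` behavior (exporting ±1 frames around each selection) can be shown with a small helper. This is a hypothetical illustration of the expansion step only, assuming 1-based frame IDs as in the CSV format:

```python
def expand_with_context(frame_ids, total_frames):
    """Illustrative: expand each selected frame to include its +-1 neighbours,
    clamped to the valid 1-based frame range and deduplicated."""
    out = set()
    for f in frame_ids:
        for g in (f - 1, f, f + 1):
            if 1 <= g <= total_frames:
                out.add(g)
    return sorted(out)
```

Deduplication matters here: adjacent selections like frames 5 and 6 share context frames, and each frame should be exported only once.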

hydra_suite.data.dataset_merge

Utilities to validate, normalize, and merge YOLO-OBB datasets.

The functions in this module are used by the GUI dataset builder to combine multiple sources (including converted X-AnyLabeling projects) into a single, train/val-ready output directory.

detect_dataset_layout(root_dir)

Detect supported dataset folder layout.

Args:
- root_dir: Dataset root directory.

Returns: Mapping of split name to (images_dir, labels_dir).

Raises: RuntimeError: If no recognized dataset structure is found.
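A layout probe of this kind can be sketched with `pathlib`. The sketch below checks only the two most common YOLO layouts (`images/<split>` + `labels/<split>` and `<split>/images` + `<split>/labels`); the real function may recognize more variants:

```python
from pathlib import Path

def detect_layout_sketch(root_dir):
    """Illustrative: map split name -> (images_dir, labels_dir) for common YOLO layouts."""
    root = Path(root_dir)
    layout = {}
    for split in ("train", "val", "test"):
        for img_dir, lbl_dir in (
            (root / "images" / split, root / "labels" / split),   # images/train, labels/train
            (root / split / "images", root / split / "labels"),   # train/images, train/labels
        ):
            if img_dir.is_dir() and lbl_dir.is_dir():
                layout[split] = (img_dir, lbl_dir)
                break
    if not layout:
        raise RuntimeError(f"No recognized dataset structure under {root}")
    return layout
```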

get_dataset_class_name(root_dir)

Read class name from dataset.yaml or classes.txt if available.

update_dataset_class_name(root_dir, class_name)

Update dataset class metadata in yaml/txt files.

validate_labels(labels_dir)

Validate YOLO-OBB labels and return discovered class IDs and file count.

rewrite_labels_to_single_class(labels_dir, class_id=0)

Rewrite all label files so every object uses the same class ID.
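Per line, the rewrite amounts to replacing the leading class ID while leaving the eight corner coordinates untouched. A minimal sketch of that single-line operation, assuming the standard YOLO-OBB line format (`cls x1 y1 x2 y2 x3 y3 x4 y4` with normalized coordinates):

```python
def rewrite_line_to_class(line, class_id=0):
    """Illustrative: replace the leading class ID of one YOLO-OBB label line."""
    parts = line.split()
    if not parts:
        return line  # preserve blank lines unchanged
    parts[0] = str(class_id)
    return " ".join(parts)
```

The real function applies this across every `.txt` file in `labels_dir`; forcing a single class ID is what lets datasets with mismatched class indices be merged safely.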

write_classes_txt(root_dir, class_name)

Write classes.txt with a single class name.

merge_datasets(sources, output_dir, class_name, split_cfg, seed=42, dedup=True)

Merge multiple YOLO-OBB datasets.

Args:
- sources: List of dicts {"name": str, "path": str}
- output_dir: Base output directory
- split_cfg: Dict with train/val/test ratios

Returns merged dataset path.
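The seeded train/val/test split implied by `split_cfg` and `seed` can be sketched as below. This is an illustrative version of the splitting step only (the key names `train`/`val`/`test` and remainder-to-test rule are assumptions), not the full merge logic:

```python
import random

def split_files(files, split_cfg, seed=42):
    """Illustrative: deterministically shuffle files and cut them into
    train/val/test lists according to the configured ratios."""
    rng = random.Random(seed)          # fixed seed -> reproducible splits
    shuffled = list(files)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * split_cfg.get("train", 0.0))
    n_val = int(n * split_cfg.get("val", 0.0))
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],  # remainder goes to test
    }
```

Shuffling before cutting ensures each split draws from all source datasets rather than concentrating one source in one split.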