HYDRA Suite Docs

neurorishika/hydra-suite

Dataset Generation¶

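Dataset generation supports active learning loops for improving detector models: it surfaces the frames where the detector and tracker struggle so they can be annotated and used for retraining.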

What It Does¶

Scores frames using configurable quality metrics.
Selects challenging frames for annotation.
Exports image/label artifacts for downstream training.
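
As an illustration of what an exported label might look like, the sketch below writes one line in the standard YOLO detection text format (`class x_center y_center width height`, all normalized to the image size). It assumes axis-aligned pixel boxes; the function name `yolo_label_line` is hypothetical, and the suite's actual export format may differ.

```python
def yolo_label_line(class_id: int,
                    box: tuple[float, float, float, float],
                    img_w: int,
                    img_h: int) -> str:
    """Format one pixel-space box (x_min, y_min, x_max, y_max) as a YOLO label line.

    Hypothetical helper for illustration; not part of the HYDRA Suite API.
    """
    x_min, y_min, x_max, y_max = box
    # YOLO labels store the box center and size as fractions of the image size.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


line = yolo_label_line(0, (100, 50, 300, 150), img_w=400, img_h=200)
print(line)
```

Each image gets one text file with one such line per annotated object.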

Quality Metrics (Conceptual)¶

Low-confidence detections
Detection count that does not match the expected target count
High assignment costs
Track loss events
Optional uncertainty-based triggers
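
The metrics above could be combined into a single per-frame score along the following lines. This is a minimal sketch, not the suite's implementation: the `FrameStats` structure, the thresholds, and the equal weighting of the terms are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FrameStats:
    """Per-frame tracking summary (hypothetical structure, not the HYDRA API)."""
    confidences: list[float]       # detector confidence of each detection
    expected_count: int            # number of targets expected in the frame
    assignment_costs: list[float]  # cost of each detection-to-track assignment
    tracks_lost: int               # tracks that lost their detection in this frame


def frame_score(stats: FrameStats,
                conf_threshold: float = 0.5,
                cost_threshold: float = 50.0) -> float:
    """Return a difficulty score; higher means more worth annotating."""
    n = len(stats.confidences)
    expected = max(stats.expected_count, 1)

    # Fraction of detections below the confidence threshold.
    low_conf = sum(c < conf_threshold for c in stats.confidences) / n if n else 0.0
    # Relative difference between detected and expected target counts.
    count_mismatch = abs(n - stats.expected_count) / expected
    # Fraction of assignments whose cost exceeds the threshold.
    m = len(stats.assignment_costs)
    high_cost = sum(c > cost_threshold for c in stats.assignment_costs) / m if m else 0.0
    # Fraction of expected targets whose track was lost, capped at 1.
    lost = min(stats.tracks_lost / expected, 1.0)

    # Equal weights are an arbitrary choice for this sketch.
    return low_conf + count_mismatch + high_cost + lost


easy = FrameStats([0.9, 0.95], 2, [5.0, 8.0], 0)
hard = FrameStats([0.3, 0.9, 0.4], 4, [80.0, 10.0, 60.0], 1)
print(frame_score(easy), frame_score(hard))
```

Normalizing each term to roughly the 0 to 1 range keeps any single metric from dominating the score.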

When to Use¶

The detector underperforms under specific lighting conditions or during specific behaviors.
You need focused retraining data instead of random frame sampling.

Tradeoffs¶

Aggressive selection can bias the training set toward outliers.
Conservative selection may miss edge cases.
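
One way to balance these two failure modes is to spend most of the annotation budget on the highest-scoring frames and reserve a small share for randomly sampled ones. The sketch below assumes per-frame scores are already available as a dictionary; `select_frames` and all of its parameters are hypothetical names chosen for illustration, not options exposed by the suite.

```python
import random


def select_frames(scores: dict[int, float],
                  budget: int,
                  min_score: float = 0.5,
                  min_gap: int = 30,
                  random_fraction: float = 0.2,
                  seed: int = 0) -> list[int]:
    """Pick up to `budget` frame indices for annotation (illustrative sketch)."""
    n_random = round(budget * random_fraction)
    n_hard = budget - n_random

    # Hardest frames first, skipping near-duplicates within `min_gap` frames.
    chosen: list[int] = []
    for idx in sorted(scores, key=scores.get, reverse=True):
        if len(chosen) >= n_hard or scores[idx] < min_score:
            break  # scores are sorted, so no later frame can qualify
        if all(abs(idx - c) >= min_gap for c in chosen):
            chosen.append(idx)

    # Add random frames so the set is not made up of outliers only.
    remaining = [i for i in scores if i not in chosen]
    rng = random.Random(seed)
    chosen += rng.sample(remaining, min(n_random, len(remaining)))
    return sorted(chosen)


scores = {0: 0.1, 10: 2.0, 12: 1.9, 100: 1.5, 200: 0.2, 300: 0.0, 400: 0.3}
picked = select_frames(scores, budget=5)
print(picked)
```

Raising `min_score` or lowering `budget` makes selection more conservative; raising `random_fraction` trades some hard examples for a more representative sample.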

Practical Loop¶

Run MAT and generate candidate frames.
Validate and annotate the selected frames.
Retrain the YOLO model.
Re-run MAT on representative videos.
Compare confidence and identity-continuity metrics against the previous run.
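
The final comparison step might be sketched as follows. It assumes each run can be summarized by its detection confidences and the lengths (in frames) of the tracks it produced; using mean track length and track count as proxies for identity continuity is an assumption of this sketch, and `summarize_run` and `compare_runs` are hypothetical helpers.

```python
from statistics import mean


def summarize_run(confidences: list[float],
                  track_lengths: list[int]) -> dict[str, float]:
    """Summarize one tracking run (hypothetical helper, not the HYDRA API)."""
    return {
        "mean_confidence": mean(confidences),
        # Longer tracks suggest identities were held for longer.
        "mean_track_length": mean(track_lengths),
        # Fewer tracks for the same animals suggests fewer identity breaks.
        "num_tracks": float(len(track_lengths)),
    }


def compare_runs(before: dict[str, float],
                 after: dict[str, float]) -> dict[str, float]:
    """Return the change in each metric (after minus before)."""
    return {key: after[key] - before[key] for key in before}


before = summarize_run([0.6, 0.7, 0.8], [100, 50, 50])
after = summarize_run([0.8, 0.9, 0.85], [200])
delta = compare_runs(before, after)
print(delta)
```

Both runs should use the same representative videos so that the differences reflect the retrained model rather than the footage.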