From ac6d368810857ea65e6a49c2e6113de682eea513 Mon Sep 17 00:00:00 2001
From: Yoilun
Date: Fri, 29 May 2026 15:13:48 +0800
Subject: [PATCH] docs: design yolo-ready trajectory evidence

---
 ...ightweight-trajectory-yolo-ready-design.md | 278 ++++++++++++++++++
 1 file changed, 278 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-05-29-lightweight-trajectory-yolo-ready-design.md

diff --git a/docs/superpowers/specs/2026-05-29-lightweight-trajectory-yolo-ready-design.md b/docs/superpowers/specs/2026-05-29-lightweight-trajectory-yolo-ready-design.md
new file mode 100644
index 0000000..6d592a4
--- /dev/null
+++ b/docs/superpowers/specs/2026-05-29-lightweight-trajectory-yolo-ready-design.md
@@ -0,0 +1,278 @@
+# Lightweight Trajectory Tracking With YOLO-Ready Evidence
+
+Date: 2026-05-29
+Branch: `lightweight-trajectory-tracking`
+
+## Summary
+
+The runtime currently confirms disposal by matching a zone becoming empty with generic trash-bin motion. That produces false matches when several zones change close together, when the trash ROI moves for an unrelated reason, or when reflection changes look like motion.
+
+This design adds a trajectory evidence layer. Version 1 uses lightweight motion tracking to infer "source zone -> trash ROI" during a short window after a zone becomes empty. Version 2 can add a trained YOLO backend later without changing the event engine contract.
+
+The first implementation must not require YOLO, PyTorch, ONNX Runtime, or OpenVINO. It must keep the current ROI occupancy timer and add a stronger disposal confirmation path.
+
+## Goals
+
+- Confirm disposal by source zone, not by FIFO matching alone.
+- Reduce cases where zone 1 or zone 4 removal is incorrectly matched to another zone.
+- Suppress reflection-only and trash-bin-only movement from confirming disposal.
+- Keep CPU load low by activating trajectory analysis only after a zone becomes empty.
+- Preserve a stable data contract that a future trained YOLO model can enrich.
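
A minimal sketch of the low-CPU goal above, assuming `zone_counts` is a mapping from zone id to an integer count. The helper names `emptied_zones` and `next_sample_interval` and the 5-second normal interval are hypothetical; only the idea of doing extra work after a zone becomes empty comes from this design.

```python
from typing import Dict, List


def emptied_zones(previous: Dict[str, int], current: Dict[str, int]) -> List[str]:
    """Return ids of zones that went from occupied (> 0) to empty (0)."""
    return sorted(
        zone_id
        for zone_id, count in current.items()
        if count == 0 and previous.get(zone_id, 0) > 0
    )


def next_sample_interval(
    active_candidates: List[str],
    normal_seconds: float,
    trajectory_seconds: float,
) -> float:
    """Sample faster only while at least one trajectory candidate is active."""
    return trajectory_seconds if active_candidates else normal_seconds


# Hypothetical counts: zone 1 was occupied and is now empty.
previous_counts = {"1": 1, "2": 0, "4": 1}
current_counts = {"1": 0, "2": 0, "4": 1}
candidates = emptied_zones(previous_counts, current_counts)
interval = next_sample_interval(candidates, normal_seconds=5.0, trajectory_seconds=1.0)
```

When no zone has just emptied, `candidates` is empty and the normal interval is returned, so trajectory analysis adds no sampling cost during steady-state monitoring.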
+
+## Non-Goals
+
+- Do not convert the whole project to YOLO in the first trajectory version.
+- Do not train or bundle a model in this branch.
+- Do not replace ROI occupancy timing; it remains the authority for zone occupied/empty state.
+- Do not require visual access inside the trash bin. Confirmation is based on motion entering the trash mouth ROI.
+
+## Current Architecture
+
+`main.py` captures one RTSP frame per sample interval with `ffmpeg`, passes it to `ZoneOccupancyDetector.observe()`, creates an `Observation`, and sends it to `BatchEngine.process()`.
+
+`vision.py` currently outputs:
+
+- `zone_counts`: stable occupied/empty state per configured zone.
+- `trash_deposit_count`: count of generic trash ROI motion events.
+- `diagnostics`: metrics for zones and trash motion.
+
+`engine.py` currently consumes:
+
+- `Observation.zone_counts`
+- `Observation.trash_deposit_count`
+
+When a timed-out batch is removed, it becomes pending disposal. A later trash motion can close pending batches, using FIFO order when source-zone evidence is missing.
+
+## Proposed Architecture
+
+Add a trajectory evidence path between vision and engine:
+
+1. Zone occupancy still runs first.
+2. When a zone transitions from occupied to empty, vision opens a short tracking window for that zone.
+3. While any tracking window is active, the runtime temporarily shortens the capture delay so movement is sampled densely enough for a path.
+4. During the window, a lightweight motion backend tracks moving blobs across the source zone, the path/corridor, and the trash mouth ROI.
+5. If the path is coherent, vision emits a zone-scoped disposal evidence item.
+6. The engine applies zone-scoped disposal evidence before using generic trash motion fallback.
+
+The engine should depend on a neutral evidence format, not on YOLO or any specific tracking backend.
+
+## Data Contract
+
+Add `disposal_evidence` to `Observation`.
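
One way a single `disposal_evidence` item could be modelled and normalized is sketched here. The class name `DisposalEvidence`, its defaults, and the choice to clamp `confidence` into [0, 1] are assumptions for illustration; the field names follow this design's evidence contract.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional, Tuple


@dataclass(frozen=True)
class DisposalEvidence:
    """Backend-neutral evidence item (hypothetical shape, not the final API)."""

    source_zone_id: str
    confidence: float
    method: str = "motion"
    target: str = "trash"
    started_at: Optional[str] = None
    ended_at: Optional[str] = None
    track_points: Tuple[Tuple[int, int], ...] = ()
    item_class: Optional[str] = None        # filled only by a later YOLO backend
    detector_score: Optional[float] = None  # filled only by a later YOLO backend

    @classmethod
    def from_dict(cls, raw: Mapping[str, Any]) -> "DisposalEvidence":
        """Normalize loosely typed input; clamping is an assumption."""
        score = raw.get("detector_score")
        return cls(
            source_zone_id=str(raw["source_zone_id"]),
            confidence=min(1.0, max(0.0, float(raw["confidence"]))),
            method=str(raw.get("method", "motion")),
            target=str(raw.get("target", "trash")),
            started_at=raw.get("started_at"),
            ended_at=raw.get("ended_at"),
            track_points=tuple(
                (int(x), int(y)) for x, y in raw.get("track_points", [])
            ),
            item_class=raw.get("item_class"),
            detector_score=None if score is None else float(score),
        )


evidence = DisposalEvidence.from_dict(
    {"source_zone_id": 1, "confidence": 0.86, "track_points": [[152, 210], [181, 219]]}
)
```

Keeping the YOLO-only fields optional lets a motion-only producer and a later enriched producer share the same type.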
+
+Example V1 evidence:
+
+```json
+{
+  "source_zone_id": "1",
+  "target": "trash",
+  "confidence": 0.86,
+  "method": "motion",
+  "started_at": "2026-05-29T14:03:20+08:00",
+  "ended_at": "2026-05-29T14:03:25+08:00",
+  "track_points": [[152, 210], [181, 219], [226, 235], [275, 252]],
+  "item_class": null,
+  "detector_score": null
+}
+```
+
+Example later YOLO-enriched evidence:
+
+```json
+{
+  "source_zone_id": "1",
+  "target": "trash",
+  "confidence": 0.94,
+  "method": "motion+yolo",
+  "started_at": "2026-05-29T14:03:20+08:00",
+  "ended_at": "2026-05-29T14:03:25+08:00",
+  "track_points": [[152, 210], [181, 219], [226, 235], [275, 252]],
+  "item_class": "trained_product_a",
+  "detector_score": 0.91
+}
+```
+
+`trash_deposit_count` remains for compatibility and fallback, but zone-scoped `disposal_evidence` takes priority.
+
+## Components
+
+### `TrajectoryTracker`
+
+Owns active tracking windows. It receives current frame, timestamp, zone counts, region polygons, and trash ROI.
+
+Responsibilities:
+
+- Detect occupied-to-empty transitions.
+- Start a per-zone candidate window.
+- Keep recent motion observations for each active candidate.
+- Decide whether a candidate has enough evidence to emit disposal evidence.
+- Expire weak candidates without closing a batch.
+- Report whether any candidate is active so `main.py` can use the faster trajectory sample interval.
+
+### `MotionTrajectoryBackend`
+
+The default V1 backend. It uses frame-to-frame differences and connected motion regions.
+
+Responsibilities:
+
+- Compute motion mask from the current and previous frame.
+- Filter out tiny, static, and reflection-like changes.
+- Extract moving blob centroids and bounding boxes.
+- Associate centroids over time into a short track.
+- Return backend-neutral track observations.
+
+The backend must work without external model dependencies.
+
+### `YoloDetectionBackend`
+
+An optional future backend. It is not implemented in V1 but the interface is reserved.
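
A reserved backend interface might look like the sketch below, with a dependency-free frame-difference backend as one conforming implementation. `TrajectoryBackend`, `TrackObservation`, and `FrameDiffBackend` are hypothetical names, frames are assumed to be grayscale rows of integers, and all changed pixels are treated as one blob rather than running connected-region analysis.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol, Sequence, Tuple

Frame = Sequence[Sequence[int]]  # assumed grayscale frame: rows of pixel values


@dataclass(frozen=True)
class TrackObservation:
    timestamp: float
    centroid: Tuple[float, float]
    bbox: Tuple[int, int, int, int]         # x_min, y_min, x_max, y_max
    item_class: Optional[str] = None        # reserved for a later YOLO backend
    detector_score: Optional[float] = None  # reserved for a later YOLO backend


class TrajectoryBackend(Protocol):
    """Hypothetical contract shared by motion and future YOLO backends."""

    def observe(self, frame: Frame, timestamp: float) -> List[TrackObservation]: ...


class FrameDiffBackend:
    """Simplified motion backend: one blob made of all changed pixels."""

    def __init__(self, motion_delta: float = 20.0, min_blob_area: int = 12) -> None:
        self.motion_delta = motion_delta
        self.min_blob_area = min_blob_area
        self._previous: Optional[Frame] = None

    def observe(self, frame: Frame, timestamp: float) -> List[TrackObservation]:
        previous, self._previous = self._previous, frame
        if previous is None:
            return []  # no previous frame yet, so nothing can be compared
        changed = [
            (x, y)
            for y, (row, old_row) in enumerate(zip(frame, previous))
            for x, (value, old) in enumerate(zip(row, old_row))
            if abs(value - old) >= self.motion_delta
        ]
        if len(changed) < self.min_blob_area:
            return []  # too small to count as a moving object
        xs = [x for x, _ in changed]
        ys = [y for _, y in changed]
        centroid = (sum(xs) / len(xs), sum(ys) / len(ys))
        return [TrackObservation(timestamp, centroid, (min(xs), min(ys), max(xs), max(ys)))]


blank = [[0] * 8 for _ in range(8)]
moved = [[255 if 2 <= x <= 5 and 1 <= y <= 4 else 0 for x in range(8)] for y in range(8)]
backend: TrajectoryBackend = FrameDiffBackend()
first = backend.observe(blank, 0.0)
second = backend.observe(moved, 1.0)
```

Because the tracker would only see `TrackObservation` values, a detector-based backend could later fill `item_class` and `detector_score` without the caller changing.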
+
+Responsibilities when enabled later:
+
+- Run only during active tracking windows or on configured path crops.
+- Detect trained product classes and optionally hands/person keypoints.
+- Attach `item_class`, `detector_score`, and bounding boxes to the same evidence contract.
+- Never bypass trajectory validation. YOLO detections enrich confidence; they do not directly close events.
+
+### `EvidenceFusion`
+
+Combines backend output into final evidence.
+
+V1 uses motion-only signals:
+
+- Origin score: first meaningful motion is near or inside the source zone.
+- Direction score: track generally moves from source zone toward trash ROI.
+- Target score: final track points intersect or approach the trash mouth ROI.
+- Stability score: track persists across enough frames and is not a one-frame flash.
+
+V2 can add YOLO class and detector confidence into the same confidence calculation.
+
+### `BatchEngine`
+
+The engine should process evidence in this order:
+
+1. Expire old pending disposal records.
+2. Apply zone-scoped `disposal_evidence` to matching pending batches first.
+3. Process zone transitions.
+4. Apply any evidence created in the same observation to newly pending batches.
+5. Use remaining generic `trash_deposit_count` as fallback for older behavior.
+
+Zone-scoped evidence should only discard the pending batch from `source_zone_id`. It must not close a different zone when the source zone has no pending disposal.
+
+## Runtime Flow
+
+1. A zone is occupied long enough to create an active batch.
+2. The batch reaches the dwell alarm threshold and emits `time_alarm`.
+3. The item is removed from the zone.
+4. Occupancy confirms the zone is empty, and the tracker opens or continues a short candidate window for that zone.
+5. The engine emits `batch_pending_disposal` for that zone.
+6. While the candidate is active, the runtime samples faster than the normal dwell timer interval.
+7. Motion backend observes a track from source zone toward trash ROI.
+8. If the track enters the trash mouth ROI with enough confidence, `disposal_evidence` is emitted.
+9. The engine emits `batch_discarded` for that same zone. If evidence is emitted in the same observation that created pending disposal, the engine applies it after processing the zone-empty transition.
+10. If no evidence arrives before the pending deadline, the current warning escalation behavior remains.
+
+## Configuration
+
+Add runtime settings with conservative defaults:
+
+```toml
+[runtime]
+trajectory_enabled = true
+trajectory_window_seconds = 8
+trajectory_sample_interval_seconds = 1.0
+trajectory_min_points = 3
+trajectory_min_confidence = 0.72
+trajectory_motion_delta = 20.0
+trajectory_min_blob_area = 12
+trajectory_max_blob_area_fraction = 0.35
+trajectory_trash_entry_margin = 0.04
+trajectory_backend = "motion"
+yolo_enabled = false
+yolo_model_path = ""
+yolo_min_confidence = 0.65
+```
+
+`yolo_enabled = false` is the only valid first implementation mode. The config keys are included so deployment files and UI can evolve without changing the observation contract.
+
+`trajectory_sample_interval_seconds` applies only while at least one trajectory candidate is active. Normal monitoring keeps using the existing `sample_interval_seconds`.
+
+## Diagnostics
+
+Append trajectory diagnostics to each runtime diagnostics row:
+
+```json
+{
+  "trajectory": {
+    "active_candidates": ["1"],
+    "emitted_evidence": [
+      {
+        "source_zone_id": "1",
+        "confidence": 0.86,
+        "method": "motion"
+      }
+    ],
+    "expired_candidates": [],
+    "rejected_candidates": [
+      {
+        "source_zone_id": "4",
+        "reason": "target_not_reached"
+      }
+    ]
+  }
+}
+```
+
+Diagnostics should explain why a candidate was accepted, expired, or rejected. This is required for tuning the live camera.
+
+## Error Handling
+
+- If tracking cannot run because there is no previous frame, no evidence is emitted.
+- If trash ROI is not configured, trajectory evidence is disabled and current generic behavior remains.
+- If faster sampling cannot keep up with RTSP capture time, runtime should continue at the achievable rate and record capture timing in diagnostics.
+- If multiple zones become empty at once, keep independent candidates. A track can confirm only one source zone unless future YOLO tracking explicitly supports multiple objects.
+- If evidence confidence is below threshold, do not close pending disposal.
+- If YOLO is enabled later but the model fails to load, runtime should fall back to motion-only tracking and record a diagnostic error.
+
+## Testing Strategy
+
+Unit tests:
+
+- `Observation.from_dict()` accepts and normalizes `disposal_evidence`.
+- Engine discards a pending batch from the matching source zone when evidence arrives.
+- Engine does not discard zone 1 when evidence says source zone 4.
+- Same-observation zone removal plus disposal evidence closes the newly pending batch.
+- Generic `trash_deposit_count` still works as fallback.
+- Low-confidence evidence is ignored.
+
+Vision tests:
+
+- Motion track from zone polygon to trash ROI emits evidence.
+- Motion that starts away from the source zone is rejected.
+- Motion that never reaches trash ROI is rejected.
+- One-frame reflection flash is rejected.
+- Multiple active candidates do not cross-close each other.
+
+Runtime tests:
+
+- Diagnostics include trajectory status.
+- Config defaults load with trajectory enabled and YOLO disabled.
+- Existing tests for zone occupancy, trash motion, restore state, API summary, and web zone rendering keep passing.
+
+## Rollout Plan
+
+1. Implement the data contract and engine evidence handling behind config.
+2. Add motion trajectory backend and diagnostics.
+3. Keep generic trash motion fallback enabled during rollout.
+4. Deploy to the remote runtime and observe diagnostics for zones 1, 2, 4, 5, 6, and trash ROI.
+5. Tune thresholds from live diagnostics.
+6. Later, add YOLO backend as a separate implementation that feeds the same evidence contract.
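
Before thresholds are tuned on the live camera, the vision test cases listed above could be exercised against a small fusion sketch like this one. It assumes axis-aligned boxes instead of zone polygons, hypothetical box coordinates, and a conservative rule where the weakest of the four scores becomes the confidence; the design does not specify how the scores are combined.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # x_min, y_min, x_max, y_max


def inside(point: Point, box: Box) -> bool:
    x, y = point
    return box[0] <= x <= box[2] and box[1] <= y <= box[3]


def fuse_confidence(
    points: Sequence[Point],
    source_box: Box,
    trash_box: Box,
    min_points: int = 3,
) -> float:
    """Combine origin, direction, target and stability scores (assumed rule: min)."""
    if not points:
        return 0.0
    cx = (trash_box[0] + trash_box[2]) / 2
    cy = (trash_box[1] + trash_box[3]) / 2
    distances = [math.hypot(x - cx, y - cy) for x, y in points]
    steps = list(zip(distances, distances[1:]))

    origin = 1.0 if inside(points[0], source_box) else 0.0
    # Fraction of steps that move closer to the trash ROI centre.
    direction = sum(b < a for a, b in steps) / len(steps) if steps else 0.0
    target = 1.0 if inside(points[-1], trash_box) else 0.0
    stability = min(1.0, len(points) / min_points)
    return min(origin, direction, target, stability)


SOURCE_ZONE = (140, 200, 170, 225)  # hypothetical box for zone 1
TRASH_ROI = (260, 240, 300, 280)    # hypothetical trash mouth box

full_track = [(152, 210), (181, 219), (226, 235), (275, 252)]
flash = [(275, 252)]
never_reaches = [(152, 210), (181, 219), (200, 225)]
starts_away = [(220, 230), (250, 245), (275, 252)]

scores = {
    name: fuse_confidence(track, SOURCE_ZONE, TRASH_ROI)
    for name, track in [
        ("full_track", full_track),
        ("flash", flash),
        ("never_reaches", never_reaches),
        ("starts_away", starts_away),
    ]
}
```

Taking the minimum means a track that never reaches the trash ROI cannot be rescued by strong origin and direction scores, which a plain average would allow.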
+
+## Acceptance Criteria
+
+- Removing an alarmed item from zone 1 and moving it visibly to the trash mouth closes zone 1, not another zone.
+- Removing alarmed items from multiple zones close together does not rely on FIFO when trajectory evidence identifies the source zone.
+- Motion inside trash ROI alone does not confirm disposal if no source-zone trajectory exists.
+- Reflection-only changes do not emit disposal evidence.
+- The runtime works without YOLO dependencies installed.
+- The future YOLO path can be added by implementing the reserved backend without changing `BatchEngine` event semantics.
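
The zone-matching criteria above can be illustrated with a small sketch of how pending disposals might be resolved. The function `resolve_disposals`, the `generic_fallback` switch, and the rule that each applied evidence item uses up one generic trash motion count are assumptions, not the actual `BatchEngine` code.

```python
from typing import Any, Dict, List, Sequence, Tuple


def resolve_disposals(
    pending: Sequence[str],
    evidence: Sequence[Dict[str, Any]],
    trash_deposit_count: int,
    min_confidence: float = 0.72,
    generic_fallback: bool = True,
) -> Tuple[List[str], List[str]]:
    """Return (discarded_zone_ids, still_pending_zone_ids).

    `pending` lists zone ids in FIFO order, oldest first.
    """
    remaining = list(pending)
    discarded: List[str] = []

    # Zone-scoped evidence first: it may only close its own source zone.
    for item in evidence:
        zone_id = str(item["source_zone_id"])
        if float(item["confidence"]) < min_confidence:
            continue  # low-confidence evidence is ignored
        if zone_id in remaining:
            remaining.remove(zone_id)
            discarded.append(zone_id)

    # Generic fallback: leftover trash motion closes the oldest pending zones.
    if generic_fallback:
        leftover = max(0, trash_deposit_count - len(discarded))
        for _ in range(leftover):
            if not remaining:
                break
            discarded.append(remaining.pop(0))

    return discarded, remaining


# Zone 4 is older in the queue, but the evidence names zone 1.
closed, still_pending = resolve_disposals(
    pending=["4", "1"],
    evidence=[{"source_zone_id": "1", "confidence": 0.86}],
    trash_deposit_count=1,
)
```

With `generic_fallback=False`, trash ROI motion alone closes nothing, which is one way the fallback used during rollout and the stricter criterion about trash-only motion could coexist.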