docs: design yolo-ready trajectory evidence
# Lightweight Trajectory Tracking With YOLO-Ready Evidence

Date: 2026-05-29
Branch: `lightweight-trajectory-tracking`

## Summary

The runtime currently confirms disposal by matching a zone becoming empty with generic trash-bin motion. That produces false matches when several zones change close together, when the trash ROI moves for an unrelated reason, or when reflection changes look like motion.

This design adds a trajectory evidence layer. Version 1 uses lightweight motion tracking to infer "source zone -> trash ROI" during a short window after a zone becomes empty. Version 2 can add a trained YOLO backend later without changing the event engine contract.

The first implementation must not require YOLO, PyTorch, ONNX Runtime, or OpenVINO. It must keep the current ROI occupancy timer and add a stronger disposal confirmation path.

## Goals

- Confirm disposal by source zone, not by FIFO matching alone.
- Reduce cases where zone 1 or zone 4 removal is incorrectly matched to another zone.
- Suppress reflection-only and trash-bin-only movement from confirming disposal.
- Keep CPU load low by activating trajectory analysis only after a zone becomes empty.
- Preserve a stable data contract that a future trained YOLO model can enrich.

## Non-Goals

- Do not convert the whole project to YOLO in the first trajectory version.
- Do not train or bundle a model in this branch.
- Do not replace ROI occupancy timing; it remains the authority for zone occupied/empty state.
- Do not require visual access inside the trash bin. Confirmation is based on motion entering the trash mouth ROI.

## Current Architecture

`main.py` captures one RTSP frame per sample interval with `ffmpeg`, passes it to `ZoneOccupancyDetector.observe()`, creates an `Observation`, and sends it to `BatchEngine.process()`.

`vision.py` currently outputs:

- `zone_counts`: stable occupied/empty state per configured zone.
- `trash_deposit_count`: count of generic trash ROI motion events.
- `diagnostics`: metrics for zones and trash motion.

`engine.py` currently consumes:

- `Observation.zone_counts`
- `Observation.trash_deposit_count`

When a timed-out batch is removed, it becomes pending disposal. A later trash motion can close pending batches, using FIFO order when source-zone evidence is missing.

## Proposed Architecture

Add a trajectory evidence path between vision and engine:

1. Zone occupancy still runs first.
2. When a zone transitions from occupied to empty, vision opens a short tracking window for that zone.
3. While any tracking window is active, the runtime temporarily shortens the capture delay so movement is sampled densely enough for a path.
4. During the window, a lightweight motion backend tracks moving blobs across the source zone, the path/corridor, and the trash mouth ROI.
5. If the path is coherent, vision emits a zone-scoped disposal evidence item.
6. The engine applies zone-scoped disposal evidence before using generic trash motion fallback.

The engine should depend on a neutral evidence format, not on YOLO or any specific tracking backend.

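One way to keep the engine independent of the tracking backend is a small structural interface. The sketch below is illustrative only: `TrackObservation`, `TrajectoryBackend`, `StaticBackend`, and `collect` are hypothetical names, not part of the existing code, and the field layout is an assumption.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class TrackObservation:
    """One backend-neutral motion sample (hypothetical shape)."""
    timestamp: float
    centroid: tuple[float, float]
    bbox: tuple[int, int, int, int]  # x, y, width, height


class TrajectoryBackend(Protocol):
    """Anything that turns a frame into track observations."""

    def observe(self, frame: object, timestamp: float) -> Sequence[TrackObservation]:
        ...


class StaticBackend:
    """Stand-in backend that reports one fixed blob per frame."""

    def observe(self, frame: object, timestamp: float) -> Sequence[TrackObservation]:
        return [TrackObservation(timestamp, (1.0, 2.0), (0, 0, 2, 2))]


def collect(backend: TrajectoryBackend,
            frames: list[tuple[object, float]]) -> list[TrackObservation]:
    """Run any backend over timestamped frames without knowing its internals."""
    observations: list[TrackObservation] = []
    for frame, timestamp in frames:
        observations.extend(backend.observe(frame, timestamp))
    return observations
```

Because the caller only sees `TrackObservation` values, a motion backend and a later YOLO backend could be swapped without touching downstream code.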
## Data Contract

Add `disposal_evidence` to `Observation`.

Example V1 evidence:

```json
{
  "source_zone_id": "1",
  "target": "trash",
  "confidence": 0.86,
  "method": "motion",
  "started_at": "2026-05-29T14:03:20+08:00",
  "ended_at": "2026-05-29T14:03:25+08:00",
  "track_points": [[152, 210], [181, 219], [226, 235], [275, 252]],
  "item_class": null,
  "detector_score": null
}
```

Example later YOLO-enriched evidence:

```json
{
  "source_zone_id": "1",
  "target": "trash",
  "confidence": 0.94,
  "method": "motion+yolo",
  "started_at": "2026-05-29T14:03:20+08:00",
  "ended_at": "2026-05-29T14:03:25+08:00",
  "track_points": [[152, 210], [181, 219], [226, 235], [275, 252]],
  "item_class": "trained_product_a",
  "detector_score": 0.91
}
```

`trash_deposit_count` remains for compatibility and fallback, but zone-scoped `disposal_evidence` takes priority.

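The evidence item could be normalized into a typed record when an `Observation` is parsed. This is a minimal sketch under assumptions: the `DisposalEvidence` class name, its defaults, and the clamping of `confidence` to the range 0 to 1 are illustrative choices, not confirmed behavior of `Observation.from_dict()`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DisposalEvidence:
    """Typed view of one `disposal_evidence` item (hypothetical class)."""
    source_zone_id: str
    target: str = "trash"
    confidence: float = 0.0
    method: str = "motion"
    started_at: Optional[str] = None
    ended_at: Optional[str] = None
    track_points: tuple = ()
    item_class: Optional[str] = None        # stays None for motion-only evidence
    detector_score: Optional[float] = None  # stays None for motion-only evidence

    @classmethod
    def from_dict(cls, raw: dict) -> "DisposalEvidence":
        score = raw.get("detector_score")
        return cls(
            # Zone ids are compared as strings, so coerce numeric ids.
            source_zone_id=str(raw["source_zone_id"]),
            target=str(raw.get("target", "trash")),
            # Assumption: out-of-range confidence is clamped, not rejected.
            confidence=min(1.0, max(0.0, float(raw.get("confidence", 0.0)))),
            method=str(raw.get("method", "motion")),
            started_at=raw.get("started_at"),
            ended_at=raw.get("ended_at"),
            track_points=tuple(
                (int(x), int(y)) for x, y in raw.get("track_points", [])
            ),
            item_class=raw.get("item_class"),
            detector_score=None if score is None else float(score),
        )
```

Keeping `item_class` and `detector_score` optional lets the V1 and YOLO-enriched payloads share one type.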
## Components

### `TrajectoryTracker`

Owns active tracking windows. It receives current frame, timestamp, zone counts, region polygons, and trash ROI.

Responsibilities:

- Detect occupied-to-empty transitions.
- Start a per-zone candidate window.
- Keep recent motion observations for each active candidate.
- Decide whether a candidate has enough evidence to emit disposal evidence.
- Expire weak candidates without closing a batch.
- Report whether any candidate is active so `main.py` can use the faster trajectory sample interval.

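The window bookkeeping part of these responsibilities can be sketched in a few lines. `CandidateWindows` is a hypothetical helper, not the actual `TrajectoryTracker`; it covers only transition detection, expiry, and the "any candidate active" signal, and it assumes `zone_counts` maps zone ids to integer counts.

```python
class CandidateWindows:
    """Per-zone candidate windows opened on occupied-to-empty transitions."""

    def __init__(self, window_seconds: float = 8.0) -> None:
        self.window_seconds = window_seconds
        self._previous: dict[str, int] = {}
        self._deadlines: dict[str, float] = {}

    def update(self, zone_counts: dict[str, int], now: float) -> list[str]:
        """Open windows for newly emptied zones; return zones that expired."""
        for zone_id, count in zone_counts.items():
            was_occupied = self._previous.get(zone_id, 0) > 0
            if was_occupied and count == 0:
                self._deadlines[zone_id] = now + self.window_seconds
        self._previous = dict(zone_counts)

        expired = [z for z, deadline in self._deadlines.items() if now >= deadline]
        for zone_id in expired:
            # Expiring a weak candidate never closes a batch; it only drops the window.
            del self._deadlines[zone_id]
        return expired

    def has_active(self) -> bool:
        """True while at least one candidate window is open."""
        return bool(self._deadlines)

    def active_zones(self) -> list[str]:
        return sorted(self._deadlines)
```

A caller such as `main.py` could poll `has_active()` after each update to decide which sample interval to use.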
### `MotionTrajectoryBackend`

The default V1 backend. It uses frame-to-frame differences and connected motion regions.

Responsibilities:

- Compute motion mask from the current and previous frame.
- Filter out tiny, static, and reflection-like changes.
- Extract moving blob centroids and bounding boxes.
- Associate centroids over time into a short track.
- Return backend-neutral track observations.

The backend must work without external model dependencies.

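To illustrate frame differencing plus connected regions without any external dependency, here is a dependency-free sketch. It assumes small grayscale frames stored as nested lists, uses 4-connectivity, and borrows parameter names from the configuration keys; `motion_mask` and `blobs` are hypothetical helpers, and a real implementation would likely work on image arrays instead.

```python
from collections import deque

Frame = list[list[int]]  # grayscale rows, pixel values 0..255


def motion_mask(prev: Frame, curr: Frame, delta: float = 20.0) -> list[list[bool]]:
    """Mark pixels whose brightness changed by more than `delta`."""
    return [
        [abs(c - p) > delta for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def blobs(mask: list[list[bool]], min_area: int = 2,
          max_area_fraction: float = 0.35) -> list[tuple[float, float, int]]:
    """Return (centroid_x, centroid_y, area) for each 4-connected motion region."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    max_area = max_area_fraction * width * height
    seen = [[False] * width for _ in range(height)]
    found = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill collects one connected region.
            queue = deque([(x, y)])
            seen[y][x] = True
            pixels = []
            while queue:
                px, py = queue.popleft()
                pixels.append((px, py))
                for nx, ny in ((px + 1, py), (px - 1, py), (px, py + 1), (px, py - 1)):
                    if 0 <= nx < width and 0 <= ny < height \
                            and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            area = len(pixels)
            # Drop tiny specks and frame-wide changes such as lighting shifts.
            if min_area <= area <= max_area:
                cx = sum(p[0] for p in pixels) / area
                cy = sum(p[1] for p in pixels) / area
                found.append((cx, cy, area))
    return found
```

The area bounds are a crude stand-in for the "tiny" and "reflection-like" filters; associating centroids across frames into a track is left out of this sketch.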
### `YoloDetectionBackend`

An optional future backend. It is not implemented in V1 but the interface is reserved.

Responsibilities when enabled later:

- Run only during active tracking windows or on configured path crops.
- Detect trained product classes and optionally hands/person keypoints.
- Attach `item_class`, `detector_score`, and bounding boxes to the same evidence contract.
- Never bypass trajectory validation. YOLO detections enrich confidence; they do not directly close events.

### `EvidenceFusion`

Combines backend output into final evidence.

V1 uses motion-only signals:

- Origin score: first meaningful motion is near or inside the source zone.
- Direction score: track generally moves from source zone toward trash ROI.
- Target score: final track points intersect or approach the trash mouth ROI.
- Stability score: track persists across enough frames and is not a one-frame flash.

V2 can add YOLO class and detector confidence into the same confidence calculation.

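The four motion-only signals could be combined roughly as follows. This is a sketch under several assumptions that the design does not specify: zones and the trash ROI are simplified to axis-aligned boxes, the scores are averaged with equal weights, and `reach` (the pixel distance at which proximity drops to zero) is an invented parameter. `fuse_confidence` is a hypothetical name.

```python
import math

Box = tuple[float, float, float, float]  # x_min, y_min, x_max, y_max
Point = tuple[float, float]


def _center(box: Box) -> Point:
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def _distance_to_box(point: Point, box: Box) -> float:
    """Euclidean distance from a point to a box; 0.0 when the point is inside."""
    dx = max(box[0] - point[0], 0.0, point[0] - box[2])
    dy = max(box[1] - point[1], 0.0, point[1] - box[3])
    return math.hypot(dx, dy)


def _proximity(point: Point, box: Box, reach: float) -> float:
    """1.0 inside the box, falling linearly to 0.0 at `reach` pixels away."""
    return max(0.0, 1.0 - _distance_to_box(point, box) / reach)


def fuse_confidence(track: list[Point], zone: Box, trash: Box,
                    reach: float = 50.0, min_points: int = 3) -> float:
    """Average origin, direction, target, and stability scores into 0..1."""
    if len(track) < 2:
        return 0.0  # a single point cannot describe a path

    origin = _proximity(track[0], zone, reach)
    target = _proximity(track[-1], trash, reach)

    # Direction: cosine between net displacement and the zone-to-trash vector.
    zone_c, trash_c = _center(zone), _center(trash)
    want = (trash_c[0] - zone_c[0], trash_c[1] - zone_c[1])
    moved = (track[-1][0] - track[0][0], track[-1][1] - track[0][1])
    norm = math.hypot(*want) * math.hypot(*moved)
    cosine = (want[0] * moved[0] + want[1] * moved[1]) / norm if norm else 0.0
    direction = max(0.0, cosine)

    # Stability: saturates once the track has at least `min_points` samples.
    stability = min(1.0, len(track) / min_points)

    return (origin + direction + target + stability) / 4.0
```

With equal weights, a long-lived track that starts far from the source zone and ends far from the trash ROI scores at most 0.5 (direction and stability only), which sits below the configured `trajectory_min_confidence` of 0.72. A V2 fusion could add detector confidence as a fifth term.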
### `BatchEngine`

The engine should process evidence in this order:

1. Expire old pending disposal records.
2. Apply zone-scoped `disposal_evidence` to matching pending batches first.
3. Process zone transitions.
4. Apply any evidence created in the same observation to newly pending batches.
5. Use remaining generic `trash_deposit_count` as fallback for older behavior.

Zone-scoped evidence should only discard the pending batch from `source_zone_id`. It must not close a different zone when the source zone has no pending disposal.

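The priority of zone-scoped evidence over the FIFO fallback might look like the sketch below. It is not the real `BatchEngine`: `close_pending` is a hypothetical function, pending batches are modeled as `(batch_id, zone_id)` pairs ordered oldest first, and the rule that each applied evidence item consumes one generic trash motion is an assumption about what "remaining" means in step 5.

```python
def close_pending(pending: list[tuple[str, str]],
                  evidence: list[dict],
                  trash_deposit_count: int,
                  min_confidence: float = 0.72):
    """Return (closed, remaining), where closed holds (batch_id, reason) pairs."""
    remaining = list(pending)
    closed: list[tuple[str, str]] = []

    # Zone-scoped evidence first: it may only close a batch from its own zone.
    for item in evidence:
        if item["confidence"] < min_confidence:
            continue  # low-confidence evidence never closes a batch
        for entry in remaining:
            if entry[1] == item["source_zone_id"]:
                remaining.remove(entry)
                closed.append((entry[0], "trajectory"))
                break  # one evidence item closes at most one batch

    # Assumption: each applied evidence item accounts for one generic motion.
    leftover = max(0, trash_deposit_count - len(closed))

    # Generic fallback: close the oldest pending batches in FIFO order.
    for _ in range(leftover):
        if not remaining:
            break
        closed.append((remaining.pop(0)[0], "fifo_fallback"))

    return closed, remaining
```

Evidence whose source zone has no pending batch is simply dropped here, so it cannot close a different zone.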
## Runtime Flow

1. A zone is occupied long enough to create an active batch.
2. The batch reaches the dwell alarm threshold and emits `time_alarm`.
3. The item is removed from the zone.
4. Occupancy confirms the zone is empty, and the tracker opens or continues a short candidate window for that zone.
5. The engine emits `batch_pending_disposal` for that zone.
6. While the candidate is active, the runtime samples faster than the normal dwell timer interval.
7. The motion backend observes a track from the source zone toward the trash ROI.
8. If the track enters the trash mouth ROI with enough confidence, `disposal_evidence` is emitted.
9. The engine emits `batch_discarded` for that same zone. If evidence is emitted in the same observation that created pending disposal, the engine applies it after processing the zone-empty transition.
10. If no evidence arrives before the pending deadline, the current warning escalation behavior remains.

## Configuration

Add runtime settings with conservative defaults:

```toml
[runtime]
trajectory_enabled = true
trajectory_window_seconds = 8
trajectory_sample_interval_seconds = 1.0
trajectory_min_points = 3
trajectory_min_confidence = 0.72
trajectory_motion_delta = 20.0
trajectory_min_blob_area = 12
trajectory_max_blob_area_fraction = 0.35
trajectory_trash_entry_margin = 0.04
trajectory_backend = "motion"
yolo_enabled = false
yolo_model_path = ""
yolo_min_confidence = 0.65
```

`yolo_enabled = false` is the only valid mode in the first implementation. The config keys are included so deployment files and UI can evolve without changing the observation contract.

`trajectory_sample_interval_seconds` applies only while at least one trajectory candidate is active. Normal monitoring keeps using the existing `sample_interval_seconds`.

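The interval switch described above reduces to a small selection function. This is a sketch: `next_sample_interval` is a hypothetical helper, and taking the minimum of the two intervals (so the trajectory setting can never slow sampling down) is an assumption rather than stated behavior.

```python
def next_sample_interval(has_active_candidate: bool,
                         sample_interval_seconds: float,
                         trajectory_sample_interval_seconds: float = 1.0,
                         trajectory_enabled: bool = True) -> float:
    """Pick the delay before the next capture."""
    if trajectory_enabled and has_active_candidate:
        # Assumption: never sample slower than normal monitoring would.
        return min(sample_interval_seconds, trajectory_sample_interval_seconds)
    return sample_interval_seconds
```

For example, with a normal interval of 5 seconds the function would return 1.0 while a candidate is active and 5.0 otherwise.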
## Diagnostics

Append trajectory diagnostics to each runtime diagnostics row:

```json
{
  "trajectory": {
    "active_candidates": ["1"],
    "emitted_evidence": [
      {
        "source_zone_id": "1",
        "confidence": 0.86,
        "method": "motion"
      }
    ],
    "expired_candidates": [],
    "rejected_candidates": [
      {
        "source_zone_id": "4",
        "reason": "target_not_reached"
      }
    ]
  }
}
```

Diagnostics should explain why a candidate was accepted, expired, or rejected. This is required for tuning the live camera.

## Error Handling

- If tracking cannot run because there is no previous frame, no evidence is emitted.
- If the trash ROI is not configured, trajectory evidence is disabled and current generic behavior remains.
- If faster sampling cannot keep up with RTSP capture time, the runtime should continue at the achievable rate and record capture timing in diagnostics.
- If multiple zones become empty at once, keep independent candidates. A track can confirm only one source zone unless future YOLO tracking explicitly supports multiple objects.
- If evidence confidence is below threshold, do not close pending disposal.
- If YOLO is enabled later but the model fails to load, the runtime should fall back to motion-only tracking and record a diagnostic error.

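The YOLO fallback rule in the last bullet could be handled at startup along these lines. Everything here is hypothetical: `select_backend`, the `load_model` callable, and the diagnostic message format are placeholders for whatever loader a future YOLO backend provides.

```python
from typing import Callable


def select_backend(yolo_enabled: bool,
                   load_model: Callable[[], object]) -> tuple[str, list[str]]:
    """Return (method, diagnostic_errors), degrading to motion-only on failure."""
    errors: list[str] = []
    if yolo_enabled:
        try:
            load_model()  # placeholder for the future model loader
            return "motion+yolo", errors
        except Exception as exc:  # any load failure degrades to motion-only
            errors.append(f"yolo_load_failed: {exc}")
    return "motion", errors
```

The returned method string matches the `method` values used in the evidence examples, so diagnostics can show which path was actually active.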
## Testing Strategy

Unit tests:

- `Observation.from_dict()` accepts and normalizes `disposal_evidence`.
- Engine discards a pending batch from the matching source zone when evidence arrives.
- Engine does not discard zone 1 when evidence says source zone 4.
- Same-observation zone removal plus disposal evidence closes the newly pending batch.
- Generic `trash_deposit_count` still works as fallback.
- Low-confidence evidence is ignored.

Vision tests:

- Motion track from zone polygon to trash ROI emits evidence.
- Motion that starts away from the source zone is rejected.
- Motion that never reaches trash ROI is rejected.
- One-frame reflection flash is rejected.
- Multiple active candidates do not cross-close each other.

Runtime tests:

- Diagnostics include trajectory status.
- Config defaults load with trajectory enabled and YOLO disabled.
- Existing tests for zone occupancy, trash motion, restore state, API summary, and web zone rendering keep passing.

## Rollout Plan

1. Implement the data contract and engine evidence handling behind config.
2. Add motion trajectory backend and diagnostics.
3. Keep generic trash motion fallback enabled during rollout.
4. Deploy to the remote runtime and observe diagnostics for zones 1, 2, 4, 5, 6, and trash ROI.
5. Tune thresholds from live diagnostics.
6. Later, add YOLO backend as a separate implementation that feeds the same evidence contract.

## Acceptance Criteria

- Removing an alarmed item from zone 1 and moving it visibly to the trash mouth closes zone 1, not another zone.
- Removing alarmed items from multiple zones close together does not rely on FIFO when trajectory evidence identifies the source zone.
- Motion inside trash ROI alone does not confirm disposal if no source-zone trajectory exists.
- Reflection-only changes do not emit disposal evidence.
- The runtime works without YOLO dependencies installed.
- The future YOLO path can be added by implementing the reserved backend without changing `BatchEngine` event semantics.