docs: design yolo-ready trajectory evidence

Yoilun
2026-05-29 15:13:48 +08:00
parent 8b5bbff364
commit ac6d368810


@@ -0,0 +1,278 @@
# Lightweight Trajectory Tracking With YOLO-Ready Evidence
Date: 2026-05-29
Branch: `lightweight-trajectory-tracking`
## Summary
The runtime currently confirms disposal by matching a zone becoming empty with generic trash-bin motion. That produces false matches when several zones change close together, when the trash ROI moves for an unrelated reason, or when reflection changes look like motion.
This design adds a trajectory evidence layer. Version 1 uses lightweight motion tracking to infer "source zone -> trash ROI" during a short window after a zone becomes empty. Version 2 can add a trained YOLO backend later without changing the event engine contract.
The first implementation must not require YOLO, PyTorch, ONNX Runtime, or OpenVINO. It must keep the current ROI occupancy timer and add a stronger disposal confirmation path.
## Goals
- Confirm disposal by source zone, not by FIFO matching alone.
- Reduce cases where zone 1 or zone 4 removal is incorrectly matched to another zone.
- Suppress reflection-only and trash-bin-only movement from confirming disposal.
- Keep CPU load low by activating trajectory analysis only after a zone becomes empty.
- Preserve a stable data contract that a future trained YOLO model can enrich.
## Non-Goals
- Do not convert the whole project to YOLO in the first trajectory version.
- Do not train or bundle a model in this branch.
- Do not replace ROI occupancy timing; it remains the authority for zone occupied/empty state.
- Do not require visual access inside the trash bin. Confirmation is based on motion entering the trash mouth ROI.
## Current Architecture
`main.py` captures one RTSP frame per sample interval with `ffmpeg`, passes it to `ZoneOccupancyDetector.observe()`, creates an `Observation`, and sends it to `BatchEngine.process()`.
`vision.py` currently outputs:
- `zone_counts`: stable occupied/empty state per configured zone.
- `trash_deposit_count`: count of generic trash ROI motion events.
- `diagnostics`: metrics for zones and trash motion.
`engine.py` currently consumes:
- `Observation.zone_counts`
- `Observation.trash_deposit_count`
When the item behind a timed-out batch is removed from its zone, the batch becomes pending disposal. A later trash motion event can close pending batches, falling back to FIFO order when source-zone evidence is missing.
## Proposed Architecture
Add a trajectory evidence path between vision and engine:
1. Zone occupancy still runs first.
2. When a zone transitions from occupied to empty, vision opens a short tracking window for that zone.
3. While any tracking window is active, the runtime temporarily shortens the capture delay so movement is sampled densely enough for a path.
4. During the window, a lightweight motion backend tracks moving blobs across the source zone, the path/corridor, and the trash mouth ROI.
5. If the path is coherent, vision emits a zone-scoped disposal evidence item.
6. The engine applies zone-scoped disposal evidence before using generic trash motion fallback.
The engine should depend on a neutral evidence format, not on YOLO or any specific tracking backend.
## Data Contract
Add `disposal_evidence` to `Observation`.
Example V1 evidence:
```json
{
"source_zone_id": "1",
"target": "trash",
"confidence": 0.86,
"method": "motion",
"started_at": "2026-05-29T14:03:20+08:00",
"ended_at": "2026-05-29T14:03:25+08:00",
"track_points": [[152, 210], [181, 219], [226, 235], [275, 252]],
"item_class": null,
"detector_score": null
}
```
Example later YOLO-enriched evidence:
```json
{
"source_zone_id": "1",
"target": "trash",
"confidence": 0.94,
"method": "motion+yolo",
"started_at": "2026-05-29T14:03:20+08:00",
"ended_at": "2026-05-29T14:03:25+08:00",
"track_points": [[152, 210], [181, 219], [226, 235], [275, 252]],
"item_class": "trained_product_a",
"detector_score": 0.91
}
```
`trash_deposit_count` remains for compatibility and fallback, but zone-scoped `disposal_evidence` takes priority.
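A minimal sketch of how one evidence item might be represented and normalized in Python. The field names come from the JSON examples above; the class name `DisposalEvidence`, the default values, and the normalization rules (string zone ids, clamped confidence, integer point tuples) are assumptions for illustration, not the contract itself.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class DisposalEvidence:
    """Backend-neutral evidence item; field names follow the JSON examples."""

    source_zone_id: str
    target: str
    confidence: float
    method: str
    started_at: str
    ended_at: str
    track_points: tuple = ()
    item_class: Optional[str] = None        # filled only by a detector backend
    detector_score: Optional[float] = None  # filled only by a detector backend

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "DisposalEvidence":
        # Assumed normalization: zone ids become strings, confidence is
        # clamped to [0, 1], and track points become (x, y) integer tuples.
        score = raw.get("detector_score")
        return cls(
            source_zone_id=str(raw["source_zone_id"]),
            target=str(raw.get("target", "trash")),
            confidence=min(1.0, max(0.0, float(raw.get("confidence", 0.0)))),
            method=str(raw.get("method", "motion")),
            started_at=str(raw.get("started_at", "")),
            ended_at=str(raw.get("ended_at", "")),
            track_points=tuple(
                (int(x), int(y)) for x, y in raw.get("track_points", [])
            ),
            item_class=raw.get("item_class"),
            detector_score=None if score is None else float(score),
        )
```

Because `item_class` and `detector_score` default to `None`, a V1 motion-only item and a later YOLO-enriched item can share the same type.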
## Components
### `TrajectoryTracker`
Owns active tracking windows. It receives current frame, timestamp, zone counts, region polygons, and trash ROI.
Responsibilities:
- Detect occupied-to-empty transitions.
- Start a per-zone candidate window.
- Keep recent motion observations for each active candidate.
- Decide whether a candidate has enough evidence to emit disposal evidence.
- Expire weak candidates without closing a batch.
- Report whether any candidate is active so `main.py` can use the faster trajectory sample interval.
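The window bookkeeping in the list above could look roughly like the following sketch. `WindowTracker` and `CandidateWindow` are hypothetical names, timestamps are assumed to be plain seconds, and the motion analysis itself is left out; only transition detection, expiry, and the "any candidate active" flag are shown.

```python
from dataclasses import dataclass, field


@dataclass
class CandidateWindow:
    zone_id: str
    opened_at: float  # seconds on whatever clock the caller uses
    deadline: float


@dataclass
class WindowTracker:
    """Hypothetical stand-in for the window bookkeeping in TrajectoryTracker."""

    window_seconds: float = 8.0
    _previous: dict = field(default_factory=dict)
    _active: dict = field(default_factory=dict)

    def update(self, zone_counts: dict, now: float) -> list:
        """Open windows on occupied-to-empty transitions; return expired zone ids."""
        # Expire candidates whose window has run out. Expiry closes no batch.
        expired = [z for z, w in self._active.items() if now >= w.deadline]
        for zone_id in expired:
            del self._active[zone_id]
        # Open a window for every zone that was occupied and is now empty.
        for zone_id, count in zone_counts.items():
            was_occupied = self._previous.get(zone_id, 0) > 0
            if was_occupied and count == 0:
                self._active[zone_id] = CandidateWindow(
                    zone_id, now, now + self.window_seconds
                )
        self._previous = dict(zone_counts)
        return expired

    @property
    def has_active_candidates(self) -> bool:
        # main.py could poll this to switch to the faster sample interval.
        return bool(self._active)
```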
### `MotionTrajectoryBackend`
The default V1 backend. It uses frame-to-frame differences and connected motion regions.
Responsibilities:
- Compute motion mask from the current and previous frame.
- Filter out tiny, static, and reflection-like changes.
- Extract moving blob centroids and bounding boxes.
- Associate centroids over time into a short track.
- Return backend-neutral track observations.
The backend must work without external model dependencies.
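To illustrate that frame differencing and connected regions need no external dependencies, here is a pure-Python sketch. It assumes frames are equal-sized grayscale images given as lists of rows, that blob area is measured in pixels, and that 4-connectivity is sufficient; the function name `motion_blobs` is hypothetical, and a real backend would likely work on downscaled frames for speed.

```python
from collections import deque


def motion_blobs(prev, curr, delta=20.0, min_area=12, max_area_fraction=0.35):
    """Return [(centroid_x, centroid_y, area)] for regions that changed
    between two equal-sized grayscale frames (lists of rows)."""
    height, width = len(curr), len(curr[0])
    # Motion mask: pixels whose brightness changed by more than `delta`.
    mask = [
        [abs(curr[y][x] - prev[y][x]) > delta for x in range(width)]
        for y in range(height)
    ]
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill over 4-connected changed pixels.
            queue, pixels = deque([(x0, y0)]), []
            seen[y0][x0] = True
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            area = len(pixels)
            # Drop specks and near-full-frame changes such as lighting shifts.
            if area < min_area or area > max_area_fraction * width * height:
                continue
            cx = sum(p[0] for p in pixels) / area
            cy = sum(p[1] for p in pixels) / area
            blobs.append((cx, cy, area))
    return blobs
```

Associating the returned centroids across frames, for example by nearest-neighbour matching, would then produce the short track that the tracker evaluates.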
### `YoloDetectionBackend`
An optional future backend. It is not implemented in V1, but its interface is reserved so that it can be added later without changing the evidence contract.
Responsibilities when enabled later:
- Run only during active tracking windows or on configured path crops.
- Detect trained product classes and optionally hands/person keypoints.
- Attach `item_class`, `detector_score`, and bounding boxes to the same evidence contract.
- Never bypass trajectory validation. YOLO detections enrich confidence; they do not directly close events.
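One way to reserve the interface is a structural protocol that both backends satisfy. The names below (`TrackObservation`, `TrajectoryBackend`, `observe`, `NullBackend`) are assumptions made for this sketch; the design only requires that backends return backend-neutral track observations.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, Sequence, runtime_checkable


@dataclass(frozen=True)
class TrackObservation:
    """One moving object seen in one frame, independent of the backend."""

    centroid: tuple                         # (x, y) in frame coordinates
    bbox: tuple                             # (x0, y0, x1, y1)
    item_class: Optional[str] = None        # set only by a detector backend
    detector_score: Optional[float] = None  # set only by a detector backend


@runtime_checkable
class TrajectoryBackend(Protocol):
    """Hypothetical interface shared by the motion and YOLO backends."""

    name: str

    def observe(self, frame, timestamp: float) -> Sequence[TrackObservation]:
        ...


class NullBackend:
    """Placeholder backend that never reports motion."""

    name = "null"

    def observe(self, frame, timestamp):
        return []
```

With this shape, a detector backend only fills the two optional fields; the tracker still decides whether a path is coherent.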
### `EvidenceFusion`
Combines backend output into final evidence.
V1 uses motion-only signals:
- Origin score: first meaningful motion is near or inside the source zone.
- Direction score: track generally moves from source zone toward trash ROI.
- Target score: final track points intersect or approach the trash mouth ROI.
- Stability score: track persists across enough frames and is not a one-frame flash.
V2 can add YOLO class and detector confidence into the same confidence calculation.
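The four V1 signals could be combined as in the sketch below. It makes several simplifying assumptions that the design does not specify: zones and the trash mouth are axis-aligned boxes `(x0, y0, x1, y1)` rather than polygons, `margin` is in pixels, and the four scores are weighted equally. The function name `fuse_motion_scores` is hypothetical.

```python
import math


def _inside(point, box, margin=0.0):
    x, y = point
    x0, y0, x1, y1 = box
    return x0 - margin <= x <= x1 + margin and y0 - margin <= y <= y1 + margin


def _center(box):
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def fuse_motion_scores(track, source_box, trash_box, min_points=3, margin=0.0):
    """Return a confidence in [0, 1] from four equally weighted signals."""
    if len(track) < 2:
        return 0.0  # a one-frame flash cannot form a path
    # Origin: the first point lies in or near the source zone.
    origin = 1.0 if _inside(track[0], source_box, margin) else 0.0
    # Target: the last point lies in or near the trash mouth ROI.
    target = 1.0 if _inside(track[-1], trash_box, margin) else 0.0
    # Direction: fraction of steps that reduce distance to the trash center.
    goal = _center(trash_box)
    dists = [math.dist(p, goal) for p in track]
    closer = sum(1 for a, b in zip(dists, dists[1:]) if b < a)
    direction = closer / (len(track) - 1)
    # Stability: the track persists for at least `min_points` samples.
    stability = min(1.0, len(track) / min_points)
    return (origin + direction + target + stability) / 4
```

A caller would compare the result against `trajectory_min_confidence` before emitting evidence. A V2 backend could add a detector term to the same average without changing the callers.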
### `BatchEngine`
The engine should process evidence in this order:
1. Expire old pending disposal records.
2. Apply zone-scoped `disposal_evidence` to matching pending batches first.
3. Process zone transitions.
4. Apply any evidence created in the same observation to newly pending batches.
5. Use any remaining generic `trash_deposit_count` as a fallback that preserves the existing FIFO behavior.
Zone-scoped evidence should only discard the pending batch from `source_zone_id`. It must not close a different zone when the source zone has no pending disposal.
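The priority between zone-scoped evidence and the generic fallback (steps 2 and 5) can be sketched as follows. The data shapes are assumptions: `pending` is a deque of `(zone_id, batch_id)` pairs in the order batches became pending, and evidence items are plain dicts. Deducting evidence-confirmed closures from `trash_deposit_count` before the fallback is also an assumption about what "remaining" means; expiry and zone transitions (steps 1, 3, and 4) are omitted.

```python
from collections import deque


def close_pending(pending, evidence, trash_deposit_count, min_confidence=0.72):
    """Return [(zone_id, batch_id, reason)] for batches closed this observation.

    `pending` is mutated in place.
    """
    closed = []
    for item in evidence:
        if item["confidence"] < min_confidence:
            continue  # low-confidence evidence never closes a batch
        match = next(
            (p for p in pending if p[0] == item["source_zone_id"]), None
        )
        if match is None:
            continue  # never redirect evidence to a different zone
        pending.remove(match)
        closed.append((match[0], match[1], "trajectory"))
    # Assumption: each evidence-confirmed disposal also produced one generic
    # trash motion event, so only the surplus is left for the FIFO fallback.
    remaining = max(0, trash_deposit_count - len(closed))
    for _ in range(remaining):
        if not pending:
            break
        zone_id, batch_id = pending.popleft()
        closed.append((zone_id, batch_id, "fifo"))
    return closed
```

For example, with zones 1, 4, and 2 pending in that order, evidence for zone 4 closes zone 4 first; a second generic trash motion then falls back to the oldest remaining batch, zone 1.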
## Runtime Flow
1. A zone is occupied long enough to create an active batch.
2. The batch reaches the dwell alarm threshold and emits `time_alarm`.
3. The item is removed from the zone.
4. Occupancy confirms the zone is empty, and the tracker opens or continues a short candidate window for that zone.
5. The engine emits `batch_pending_disposal` for that zone.
6. While the candidate is active, the runtime samples faster than the normal dwell timer interval.
7. Motion backend observes a track from source zone toward trash ROI.
8. If the track enters the trash mouth ROI with enough confidence, `disposal_evidence` is emitted.
9. The engine emits `batch_discarded` for that same zone. If evidence is emitted in the same observation that created pending disposal, the engine applies it after processing the zone-empty transition.
10. If no evidence arrives before the pending deadline, the current warning escalation behavior remains.
## Configuration
Add runtime settings with conservative defaults:
```toml
[runtime]
trajectory_enabled = true
trajectory_window_seconds = 8
trajectory_sample_interval_seconds = 1.0
trajectory_min_points = 3
trajectory_min_confidence = 0.72
trajectory_motion_delta = 20.0
trajectory_min_blob_area = 12
trajectory_max_blob_area_fraction = 0.35
trajectory_trash_entry_margin = 0.04
trajectory_backend = "motion"
yolo_enabled = false
yolo_model_path = ""
yolo_min_confidence = 0.65
```
`yolo_enabled = false` is the only valid setting in the first implementation. The YOLO keys are included now so that deployment files and the UI can evolve without changing the observation contract.
`trajectory_sample_interval_seconds` applies only while at least one trajectory candidate is active. Normal monitoring keeps using the existing `sample_interval_seconds`.
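The interval selection could be as small as the sketch below. The function name `next_capture_delay` is hypothetical; taking the minimum of the two intervals is an added safeguard, assumed here so that a deployment whose normal interval is already shorter is never slowed down.

```python
def next_capture_delay(
    active_candidates,
    sample_interval_seconds,
    trajectory_sample_interval_seconds=1.0,
    trajectory_enabled=True,
):
    """Pick the delay before the next frame capture, in seconds."""
    if trajectory_enabled and active_candidates:
        # Sample densely only while a trajectory window is open.
        return min(sample_interval_seconds, trajectory_sample_interval_seconds)
    return sample_interval_seconds
```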
## Diagnostics
Append trajectory diagnostics to each runtime diagnostics row:
```json
{
"trajectory": {
"active_candidates": ["1"],
"emitted_evidence": [
{
"source_zone_id": "1",
"confidence": 0.86,
"method": "motion"
}
],
"expired_candidates": [],
"rejected_candidates": [
{
"source_zone_id": "4",
"reason": "target_not_reached"
}
]
}
}
```
Diagnostics should explain why a candidate was accepted, expired, or rejected. This is required for tuning the live camera.
## Error Handling
- If tracking cannot run because there is no previous frame, no evidence is emitted.
- If trash ROI is not configured, trajectory evidence is disabled and current generic behavior remains.
- If faster sampling cannot keep up with RTSP capture time, runtime should continue at the achievable rate and record capture timing in diagnostics.
- If multiple zones become empty at once, keep independent candidates. A track can confirm only one source zone unless future YOLO tracking explicitly supports multiple objects.
- If evidence confidence is below threshold, do not close pending disposal.
- If YOLO is enabled later but the model fails to load, runtime should fall back to motion-only tracking and record a diagnostic error.
## Testing Strategy
Unit tests:
- `Observation.from_dict()` accepts and normalizes `disposal_evidence`.
- Engine discards a pending batch from the matching source zone when evidence arrives.
- Engine does not discard zone 1 when evidence says source zone 4.
- Same-observation zone removal plus disposal evidence closes the newly pending batch.
- Generic `trash_deposit_count` still works as fallback.
- Low-confidence evidence is ignored.
Vision tests:
- Motion track from zone polygon to trash ROI emits evidence.
- Motion that starts away from the source zone is rejected.
- Motion that never reaches trash ROI is rejected.
- One-frame reflection flash is rejected.
- Multiple active candidates do not cross-close each other.
Runtime tests:
- Diagnostics include trajectory status.
- Config defaults load with trajectory enabled and YOLO disabled.
- Existing tests for zone occupancy, trash motion, restore state, API summary, and web zone rendering keep passing.
## Rollout Plan
1. Implement the data contract and engine evidence handling behind config.
2. Add motion trajectory backend and diagnostics.
3. Keep generic trash motion fallback enabled during rollout.
4. Deploy to the remote runtime and observe diagnostics for zones 1, 2, 4, 5, 6, and trash ROI.
5. Tune thresholds from live diagnostics.
6. Later, add YOLO backend as a separate implementation that feeds the same evidence contract.
## Acceptance Criteria
- Removing an alarmed item from zone 1 and moving it visibly to the trash mouth closes zone 1, not another zone.
- Removing alarmed items from multiple zones close together does not rely on FIFO when trajectory evidence identifies the source zone.
- Motion inside trash ROI alone does not confirm disposal if no source-zone trajectory exists.
- Reflection-only changes do not emit disposal evidence.
- The runtime works without YOLO dependencies installed.
- The future YOLO path can be added by implementing the reserved backend without changing `BatchEngine` event semantics.