# Lessons

## 2026-05-12

- Trigger: the user corrected the execution workflow for non-trivial tasks and required persistent task tracking.
- Rule: for any non-trivial task, create or update `tasks/todo.md` before substantive implementation, keep progress current, and do not mark done without review evidence.
- Preventive action: check for `tasks/todo.md`, `tasks/lessons.md`, and repository guidance files before editing code; if the user corrects process expectations, record the rule immediately.

- Trigger: the user required corrections to be persisted for future sessions.
- Rule: any user correction must be recorded in `tasks/lessons.md` as `trigger -> rule -> preventive action`.
- Preventive action: after any correction, update lessons before closing the task and mention the recorded rule in the final verification summary.

- Trigger: the user clarified that this repository is meant to run in mainland China environments.
- Rule: future code, build, deployment, and integration changes must consider mainland China network accessibility and should prefer China-friendly defaults where practical.
- Preventive action: when adding dependencies, mirrors, external endpoints, or download flows, explicitly check whether the default path works reliably in mainland China and add configuration or fallback when needed.

- Trigger: the user required deployment to use `docker compose` only and explicitly disallowed host environment changes.
- Rule: for remote rollout tasks in this repo, prefer repository-contained `docker compose` changes and do not install packages, edit host configs, or mutate global environment state unless the user explicitly approves it.
- Preventive action: when a deployment is blocked, first fix Dockerfiles, compose files, env files, and mounted paths inside the repo before considering any host-level workaround.

## 2026-05-15

- Trigger: the `.11` OTA bundle host did not have a `zip` executable, and the current Python containers no longer exposed the historical `lap` overlay paths.
- Rule: OTA bundle publication must not assume host archive tools or historical runtime overlay paths are present.
- Preventive action: when cutting a release, package the ZIP with Python stdlib if `zip` is unavailable, treat overlay extraction as optional unless the paths are verified live in containers, and validate the final archive contents before upload.
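
The packaging fallback in the preventive action above can be sketched with the standard-library `zipfile` module. This is a minimal illustration, not the repository's release script: the function names `build_bundle`, `list_bundle`, and `missing_members` are hypothetical, and the list of required members is an assumption the caller supplies.

```python
import zipfile
from pathlib import Path


def build_bundle(source_dir: Path, archive_path: Path) -> list[str]:
    """Zip every file under source_dir without calling a host `zip` binary."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source_dir.rglob("*")):
            # Skip directories and the archive itself if it lives under source_dir.
            if path.is_file() and path.resolve() != archive_path.resolve():
                zf.write(path, path.relative_to(source_dir).as_posix())
    return list_bundle(archive_path)


def list_bundle(archive_path: Path) -> list[str]:
    """Re-open the finished archive, check member CRCs, and return sorted names."""
    with zipfile.ZipFile(archive_path) as zf:
        bad_member = zf.testzip()  # returns the first corrupt member name, or None
        if bad_member is not None:
            raise ValueError(f"corrupt archive member: {bad_member}")
        return sorted(zf.namelist())


def missing_members(members: list[str], required: list[str]) -> list[str]:
    """Return the required member names (caller-supplied) absent from the archive."""
    return [name for name in required if name not in members]
```

Running `missing_members(build_bundle(src, out), required)` before upload gives a simple gate: an empty list means every expected file made it into the archive.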

## 2026-05-18

- Trigger: the user clarified that OTA installer updates should not keep repackaging and uploading the whole repository tree or fixed `people_flow_project` weights.
- Rule: managed-portal OTA releases should ship a minimal ZIP with deploy metadata and managed config only; `people_flow_project` weights should be reused from a stable host location unless the weights themselves changed or the host is new.
- Preventive action: when preparing OTA artifacts, use the minimal packaging script, exclude `managed/people_flow_project/weights` by default, and only publish a weights-bearing bundle for first-time installs or actual weight updates.
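
The default exclusion described above could look roughly like the filter below. It is a sketch only and does not reproduce the actual minimal packaging script; `select_bundle_members` and its `include_weights` flag are hypothetical names, and paths are assumed to be POSIX-style and relative to the repository root.

```python
from pathlib import PurePosixPath

# Directory named in the lesson above; everything under it is skipped by default.
WEIGHTS_DIR = PurePosixPath("managed/people_flow_project/weights")


def select_bundle_members(relative_paths: list[str], include_weights: bool = False) -> list[str]:
    """Keep bundle members, dropping anything under WEIGHTS_DIR unless asked not to."""
    selected = []
    for raw in relative_paths:
        # is_relative_to compares whole path components, so a sibling such as
        # "weights_notes.txt" is not mistaken for the weights directory.
        is_weight = PurePosixPath(raw).is_relative_to(WEIGHTS_DIR)
        if is_weight and not include_weights:
            continue
        selected.append(raw)
    return selected
```

Passing `include_weights=True` would correspond to the first-install or weight-update case, where a weights-bearing bundle is intended.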

## 2026-05-19

- Trigger: the user corrected the OTA publication login for `10.8.0.1`.
- Rule: the OTA web host `10.8.0.1` must be published with `root`, not `xiaozheng`.
- Preventive action: for future managed-portal OTA rollouts, verify publication access against `root@10.8.0.1:/var/www/html/ai_deploy` before treating upload as blocked.

- Trigger: the user clarified that all new installation targets are Ubuntu machines and asked for missing `unzip` to be handled automatically, with weights delivered separately.
- Rule: the managed-portal OTA installer should treat Ubuntu as the first-install baseline, auto-install `unzip` via `apt-get` when needed, and use a separate people-flow weights archive instead of forcing weights into the main ZIP.
- Preventive action: keep the main OTA ZIP minimal, publish `people-flow-weights-<RELEASE_VERSION>.tar.gz` alongside each release when weights are available, and validate that the installer still reuses shared weights on upgrades.

- Trigger: the user corrected the YOLO weight repair strategy after a host had DeepFace weights but lacked only `yolo11n.pt`.
- Rule: OTA recovery for a missing small model must not force a full 1GB+ weights archive download or fall back to public GitHub downloads.
- Preventive action: publish a small `people-flow-yolo11n-<RELEASE_VERSION>.tar.gz` artifact and make the installer download it when only `people_flow_project/weights/yolo11n.pt` is missing.
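
The artifact choice implied by the last two lessons can be expressed as a small decision function. This is an illustrative sketch, not the installer's logic: `pick_weights_artifact` is a hypothetical name, and the set of required weight file names is an assumption the caller must supply, since the full list is not recorded here.

```python
from typing import Optional

YOLO_WEIGHT = "yolo11n.pt"


def pick_weights_artifact(required: set[str], present: set[str], release_version: str) -> Optional[str]:
    """Return the archive name to download, or None when nothing is missing."""
    missing = required - present
    if not missing:
        # Upgrade path: shared weights are already on the host, reuse them.
        return None
    if missing == {YOLO_WEIGHT}:
        # Only the small YOLO model is absent: fetch the small artifact.
        return f"people-flow-yolo11n-{release_version}.tar.gz"
    # Anything else missing (for example a new host): fetch the full archive.
    return f"people-flow-weights-{release_version}.tar.gz"
```

Both archive names come from the OTA host, so neither branch falls back to a public GitHub download.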

## 2026-06-04

- Trigger: the user corrected the OTA Docker registry address for the video-recognition rollout on `10.8.0.14`.
- Rule: when updating OTA-hosted Docker images, use the exact registry host and port provided by the user; `ota.zhengxinshipin.com` and `ota.zhengxinshipin.com:5443` are not interchangeable.
- Preventive action: before concluding a remote image reference is missing, verify whether the intended registry includes a non-default port and test the exact `host:port/repo:tag` reference.
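
To make the `host:port/repo:tag` distinction concrete, the sketch below splits a fully qualified image reference into its parts so the registry component can be compared exactly. `split_image_reference` is a hypothetical helper; it assumes the first path component is always the registry and does not handle digest (`@sha256:...`) references. The repository name and tag in the usage line are placeholders, not the real image path.

```python
def split_image_reference(reference: str) -> tuple[str, str, str]:
    """Split 'host[:port]/repo[:tag]' into (registry, repository, tag)."""
    registry, _, remainder = reference.partition("/")
    if not remainder:
        raise ValueError("expected a reference of the form host[:port]/repo[:tag]")
    # The tag is whatever follows the last colon of the repository part;
    # a colon inside the registry (the port) was already split off above.
    repository, sep, tag = remainder.rpartition(":")
    if not sep or "/" in tag:
        # No explicit tag: Docker treats this as the 'latest' tag.
        repository, tag = remainder, "latest"
    return registry, repository, tag
```

For example, `split_image_reference("ota.zhengxinshipin.com:5443/video-recognition:v1")[0]` yields `ota.zhengxinshipin.com:5443`, which is a different registry string from `ota.zhengxinshipin.com`.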

- Trigger: the user clarified that the managed-portal four-service rollout must follow the published installer on `root@10.8.0.1:/var/www/html/ai_deploy`.
- Rule: for managed-portal release updates, treat the published installer bundle and its embedded Compose/env files as the deployment source of truth instead of reverse-engineering the current host state.
- Preventive action: before updating the managed-portal stack on a target host, inspect `install-managed-portal-*.sh`, `release-manifest.env`, and the bundled `docker-compose.ota-release.yml` under `/var/www/html/ai_deploy`.

- Trigger: the user redirected a live service investigation from `10.8.0.14` to `10.8.0.15`.
- Rule: when continuing operational debugging across multiple hosts, do not assume the previously investigated host is still the active target after the user switches machines.
- Preventive action: restate the target host before diagnosis or remediation, and refresh runtime evidence from that exact machine instead of carrying over prior-host conclusions.

## 2026-06-09

- Trigger: the user corrected the intended people-flow RTSP source on `10.8.0.22`.
- Rule: when validating or repairing managed child-service deployments, treat the user-provided live RTSP URL as the source of truth and verify that the running container environment matches it exactly.
- Preventive action: after any host-specific stream correction, inspect both the release env file and the container's effective `RTSP_URL`; if they differ, recreate only the affected service with the repository Compose/env inputs and record the exact URL used.

- Trigger: the user corrected the intended `store_dwell_alert` RTSP source on `10.8.0.15`.
- Rule: for host-specific `store_dwell_alert` stream changes, verify both `RTSP_URL` and any derived identifiers such as `CAMERA_ID` in the deployed release env and the running container before concluding the rollout is correct.
- Preventive action: after changing a `store_dwell_alert` stream on a target host, inspect the release env, render `docker compose config`, and recreate only `store-dwell-alert` so the effective `RTSP_URL` and `CAMERA_ID` match the intended source.

- Trigger: the user corrected the intended `store_dwell_alert` RTSP source on `10.8.0.22`.
- Rule: even when the deployed release env on a host already has the intended `store_dwell_alert` stream, do not assume the running container picked it up; verify the live container environment separately.
- Preventive action: on host-specific `store_dwell_alert` changes, compare `deploy/managed-portal.release.env` with `docker inspect store-dwell-alert`; if the env is already correct but the container is stale, force-recreate only `store-dwell-alert`.
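
The env-versus-container comparison in the lessons above can be sketched as three small functions working on text already collected from the host. The helper names are hypothetical, and two assumptions apply: the release env file uses plain unquoted `KEY=VALUE` lines, and the container sees the same variable names (`RTSP_URL`, `CAMERA_ID`) that the release env defines. The JSON shape relied on is the `Config.Env` list that `docker inspect` reports for a container.

```python
import json


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and '#' comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def container_env(inspect_json: str) -> dict[str, str]:
    """Extract the environment from `docker inspect <container>` JSON output."""
    config_env = json.loads(inspect_json)[0]["Config"]["Env"]
    # Each entry is a 'KEY=VALUE' string; split on the first '=' only.
    return dict(item.partition("=")[::2] for item in config_env)


def stale_keys(expected: dict, actual: dict, keys=("RTSP_URL", "CAMERA_ID")) -> dict:
    """Map each differing key to its (expected, actual) pair."""
    return {
        key: (expected.get(key), actual.get(key))
        for key in keys
        if expected.get(key) != actual.get(key)
    }
```

An empty result from `stale_keys` suggests the running container matches the release env; a non-empty result is the case where only `store-dwell-alert` needs to be force-recreated.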

## 2026-06-10

- Trigger: the user clarified during the `.14` webhook repair that `video-recognition` `input_mode` is dedicated to the RTSP recognition path and must not be changed for webhook integration.
- Rule: when repairing `store-dwell-alert` to `video-recognition` webhook delivery on a host that already runs RTSP recognition, keep the main `video-recognition` `input_mode` unchanged unless the user explicitly requests a recognition-mode switch.
- Preventive action: before mirroring a reference host's webhook setup, check whether that host's `input_mode` differs from the target and, if it does, design the fix around a separate receiver path or image rather than changing the target's main recognition mode.

- Trigger: the user redirected the `.11` image reuse plan to go through the shared OTA registry tag instead of a host-local sidecar-only image.
- Rule: when a working image on one host needs to be reused by other machines, publish the exact validated image content to the user-specified OTA registry tag first, then update targets by pulling that registry tag rather than relying on host-local image transfer alone.
- Preventive action: before rolling a host-specific image fix to a single machine, check whether the user expects the image to become the shared registry baseline; if yes, validate the source image digest and publish it to the exact registry path before updating consumers.

- Trigger: the user clarified that the live `.14` deployment fix may use `sudo` on the target host.
- Rule: when host-owned deployment files block a required live fix and the user explicitly grants `sudo`, prefer the direct `sudo` path over indirect container-side file mutation.
- Preventive action: if a remote deployment edit fails on file ownership, check whether the user has authorized `sudo`; when authorized, switch to `sudo` for the host-side config edit and service recreation commands.