# Lessons

## 2026-05-12

- Trigger: the user corrected the execution workflow for non-trivial tasks and required persistent task tracking.
- Rule: for any non-trivial task, create or update `tasks/todo.md` before substantive implementation, keep progress current, and do not mark done without review evidence.
- Preventive action: check for `tasks/todo.md`, `tasks/lessons.md`, and repository guidance files before editing code; if the user corrects process expectations, record the rule immediately.

- Trigger: the user required corrections to be persisted for future sessions.
- Rule: any user correction must be recorded in `tasks/lessons.md` as `trigger -> rule -> preventive action`.
- Preventive action: after any correction, update lessons before closing the task and mention the recorded rule in the final verification summary.

- Trigger: the user clarified that this repository is meant to run in mainland China environments.
- Rule: future code, build, deployment, and integration changes must consider mainland China network accessibility and should prefer China-friendly defaults where practical.
- Preventive action: when adding dependencies, mirrors, external endpoints, or download flows, explicitly check whether the default path works reliably in mainland China and add configuration or fallback when needed.

- Trigger: the user required deployment to use `docker compose` only and explicitly disallowed host environment changes.
- Rule: for remote rollout tasks in this repo, prefer repository-contained `docker compose` changes and do not install packages, edit host configs, or mutate global environment state unless the user explicitly approves it.
- Preventive action: when a deployment is blocked, first fix Dockerfiles, compose files, env files, and mounted paths inside the repo before considering any host-level workaround.
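
The `trigger -> rule -> preventive action` recording rule above could be automated with a small helper. This is only a sketch: the function names `format_lesson` and `append_lesson` are hypothetical, and it assumes new lessons are always appended under the most recent date heading, as this file appears to do.

```python
from datetime import date
from pathlib import Path


def format_lesson(trigger: str, rule: str, preventive_action: str) -> str:
    """Render one correction as the three-bullet entry used in this file."""
    return (
        f"- Trigger: {trigger}\n"
        f"- Rule: {rule}\n"
        f"- Preventive action: {preventive_action}\n"
    )


def append_lesson(path: Path, day: date, trigger: str, rule: str,
                  preventive_action: str) -> None:
    """Append a lesson under the `## YYYY-MM-DD` heading for `day`.

    Creates the file and the date heading when they are missing.
    Assumption: entries are only added for the latest date, so the new
    entry can always go at the end of the file.
    """
    heading = f"## {day.isoformat()}"
    text = path.read_text(encoding="utf-8") if path.exists() else "# Lessons\n"
    if heading not in text.splitlines():
        text = text.rstrip("\n") + f"\n\n{heading}\n"
    text = text.rstrip("\n") + "\n\n" + format_lesson(trigger, rule, preventive_action)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
```

Calling `append_lesson(Path("tasks/lessons.md"), date.today(), ...)` right after a correction would keep the entry format consistent; the final verification summary still has to mention the recorded rule by hand.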
## 2026-05-15

- Trigger: the `.11` OTA bundle host did not have a `zip` executable, and the current Python containers no longer exposed the historical `lap` overlay paths.
- Rule: OTA bundle publication must not assume host archive tools or historical runtime overlay paths are present.
- Preventive action: when cutting a release, package the ZIP with Python stdlib if `zip` is unavailable, treat overlay extraction as optional unless the paths are verified live in containers, and validate the final archive contents before upload.

## 2026-05-18

- Trigger: the user clarified that OTA installer updates should not keep repackaging and uploading the whole repository tree or fixed `people_flow_project` weights.
- Rule: managed-portal OTA releases should ship a minimal ZIP with deploy metadata and managed config only; `people_flow_project` weights should be reused from a stable host location unless the weights themselves changed or the host is new.
- Preventive action: when preparing OTA artifacts, use the minimal packaging script, exclude `managed/people_flow_project/weights` by default, and only publish a weights-bearing bundle for first-time installs or actual weight updates.

## 2026-05-19

- Trigger: the user corrected the OTA publication login for `10.8.0.1`.
- Rule: the OTA web host `10.8.0.1` must be published with `root`, not `xiaozheng`.
- Preventive action: for future managed-portal OTA rollouts, verify publication access against `root@10.8.0.1:/var/www/html/ai_deploy` before treating upload as blocked.

- Trigger: the user clarified that all new installation targets are Ubuntu machines and asked for missing `unzip` to be handled automatically, with weights delivered separately.
- Rule: the managed-portal OTA installer should treat Ubuntu as the first-install baseline, auto-install `unzip` via `apt-get` when needed, and use a separate people-flow weights archive instead of forcing weights into the main ZIP.
- Preventive action: keep the main OTA ZIP minimal, publish `people-flow-weights-.tar.gz` alongside each release when weights are available, and validate that the installer still reuses shared weights on upgrades.

- Trigger: the user corrected the YOLO weight repair strategy after a host had DeepFace weights but lacked only `yolo11n.pt`.
- Rule: OTA recovery for a missing small model must not force a full 1GB+ weights archive download or fall back to public GitHub downloads.
- Preventive action: publish a small `people-flow-yolo11n-.tar.gz` artifact and make the installer download it when only `people_flow_project/weights/yolo11n.pt` is missing.

## 2026-06-04

- Trigger: the user corrected the OTA Docker registry address for the video-recognition rollout on `10.8.0.14`.
- Rule: when updating OTA-hosted Docker images, use the exact registry host and port provided by the user; `ota.zhengxinshipin.com` and `ota.zhengxinshipin.com:5443` are not interchangeable.
- Preventive action: before concluding a remote image reference is missing, verify whether the intended registry includes a non-default port and test the exact `host:port/repo:tag` reference.

- Trigger: the user clarified that the managed-portal four-service rollout must follow the published installer on `root@10.8.0.1:/var/www/html/ai_deploy`.
- Rule: for managed-portal release updates, treat the published installer bundle and its embedded Compose/env files as the deployment source of truth instead of reverse-engineering the current host state.
- Preventive action: before updating the managed-portal stack on a target host, inspect `install-managed-portal-*.sh`, `release-manifest.env`, and the bundled `docker-compose.ota-release.yml` under `/var/www/html/ai_deploy`.

- Trigger: the user redirected a live service investigation from `10.8.0.14` to `10.8.0.15`.
- Rule: when continuing operational debugging across multiple hosts, do not assume the previously investigated host is still the active target after the user switches machines.
- Preventive action: restate the target host before diagnosis or remediation, and refresh runtime evidence from that exact machine instead of carrying over prior-host conclusions.

## 2026-06-09

- Trigger: the user corrected the intended people-flow RTSP source on `10.8.0.22`.
- Rule: when validating or repairing managed child-service deployments, treat the user-provided live RTSP URL as the source of truth and verify that the running container environment matches it exactly.
- Preventive action: after any host-specific stream correction, inspect both the release env file and the container's effective `RTSP_URL`; if they differ, recreate only the affected service with the repository Compose/env inputs and record the exact URL used.

- Trigger: the user corrected the intended `store_dwell_alert` RTSP source on `10.8.0.15`.
- Rule: for host-specific `store_dwell_alert` stream changes, verify both `RTSP_URL` and any derived identifiers such as `CAMERA_ID` in the deployed release env and the running container before concluding the rollout is correct.
- Preventive action: after changing a `store_dwell_alert` stream on a target host, inspect the release env, render `docker compose config`, and recreate only `store-dwell-alert` so the effective `RTSP_URL` and `CAMERA_ID` match the intended source.

- Trigger: the user corrected the intended `store_dwell_alert` RTSP source on `10.8.0.22`.
- Rule: even when the deployed release env on a host already has the intended `store_dwell_alert` stream, do not assume the running container picked it up; verify the live container environment separately.
- Preventive action: on host-specific `store_dwell_alert` changes, compare `deploy/managed-portal.release.env` with `docker inspect store-dwell-alert`; if the env is already correct but the container is stale, force-recreate only `store-dwell-alert`.

## 2026-06-10

- Trigger: the user clarified during the `.14` webhook repair that `video-recognition` `input_mode` is dedicated to the RTSP recognition path and must not be changed for webhook integration.
- Rule: when repairing `store-dwell-alert` to `video-recognition` webhook delivery on a host that already runs RTSP recognition, keep the main `video-recognition` `input_mode` unchanged unless the user explicitly requests a recognition-mode switch.
- Preventive action: before mirroring a reference host's webhook setup, check whether that host's `input_mode` differs from the target and, if it does, design the fix around a separate receiver path or image rather than changing the target's main recognition mode.

- Trigger: the user redirected the `.11` image reuse plan to go through the shared OTA registry tag instead of a host-local sidecar-only image.
- Rule: when a working image on one host needs to be reused by other machines, publish the exact validated image content to the user-specified OTA registry tag first, then update targets by pulling that registry tag rather than relying on host-local image transfer alone.
- Preventive action: before rolling a host-specific image fix to a single machine, check whether the user expects the image to become the shared registry baseline; if yes, validate the source image digest and publish it to the exact registry path before updating consumers.

- Trigger: the user clarified that the live `.14` deployment fix may use `sudo` on the target host.
- Rule: when host-owned deployment files block a required live fix and the user explicitly grants `sudo`, prefer the direct `sudo` path over indirect container-side file mutation.
- Preventive action: if a remote deployment edit fails on file ownership, check whether the user has authorized `sudo`; when authorized, switch to `sudo` for the host-side config edit and service recreation commands.
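
The 2026-06-09 lessons about stale container environments (comparing `deploy/managed-portal.release.env` with `docker inspect store-dwell-alert`) might be checked with a helper along these lines. It is a sketch, not the repository's actual tooling: the function names are hypothetical, the env-file parser only handles plain `KEY=VALUE` lines, and the JSON is assumed to be the array that `docker inspect <container>` prints.

```python
import json


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    Simplification: surrounding quotes are stripped naively; this is not
    a full implementation of Compose env-file syntax.
    """
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def container_env(inspect_json: str) -> dict[str, str]:
    """Extract Config.Env from the JSON array printed by `docker inspect`."""
    data = json.loads(inspect_json)
    entries = data[0]["Config"]["Env"] or []
    return dict(entry.split("=", 1) for entry in entries if "=" in entry)


def stale_keys(expected: dict[str, str], actual: dict[str, str],
               keys: tuple[str, ...] = ("RTSP_URL", "CAMERA_ID")) -> list[str]:
    """Return the keys whose release-env value differs from the live container."""
    return [k for k in keys if k in expected and expected[k] != actual.get(k)]
```

If `stale_keys` returns a non-empty list while the release env already holds the intended values, the container is stale, and only `store-dwell-alert` needs recreating, for example with something like `docker compose up -d --force-recreate store-dwell-alert` run against the repository Compose and env inputs.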