Initial video AI analysis project
This commit is contained in:
190
docs/superpowers/plans/2026-06-16-hik-cloud-download-analysis.md
Normal file
190
docs/superpowers/plans/2026-06-16-hik-cloud-download-analysis.md
Normal file
@@ -0,0 +1,190 @@
|
||||
# Hik Cloud Download Analysis Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Add Hik Cloud Storage recording download as a configurable multi-device source, then feed downloaded videos into the existing model analysis pipeline.
|
||||
|
||||
**Architecture:** Keep the current local-folder pipeline intact. Add a cloud acquisition module that plans one-hour chunks, calls the Hik download-address API, downloads videos to local output storage, records a download manifest, and returns local file records for the existing probe/frame/clip/inference/aggregate stages.
|
||||
|
||||
**Tech Stack:** Python standard library, existing `unittest` suite, existing JSONL manifest helpers, FFmpeg/vLLM pipeline already in `video_ai_analysis_poc`.
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Config Schema And Time Chunking
|
||||
|
||||
**Files:**
|
||||
- Modify: `video_ai_analysis_poc/config.py`
|
||||
- Create: `video_ai_analysis_poc/hik_cloud.py`
|
||||
- Modify: `tests/test_config.py`
|
||||
- Create: `tests/test_hik_cloud.py`
|
||||
|
||||
- [ ] **Step 1: Write failing config tests**
|
||||
|
||||
Add tests that load:
|
||||
|
||||
```yaml
|
||||
source:
|
||||
mode: hik_cloud
|
||||
hik_cloud:
|
||||
access_token_env: HIK_CLOUD_ACCESS_TOKEN
|
||||
devices:
|
||||
- device_serial: EXAMPLE_DEVICE_SERIAL
|
||||
channel_no: 1
|
||||
name: front
|
||||
time_ranges:
|
||||
- begin: "2026-02-03 09:00:00"
|
||||
end: "2026-02-03 10:30:00"
|
||||
```
|
||||
|
||||
Expected: `source.mode == "hik_cloud"`, `devices` is a list of dicts, and `time_ranges` is a list of dicts.
|
||||
|
||||
- [ ] **Step 2: Write failing chunk tests**
|
||||
|
||||
Test that `build_download_chunks(...)` converts the range above into chunks with `timeEnd - timeBegin <= 3600`.
|
||||
|
||||
- [ ] **Step 3: Run red tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
python3 -B -m unittest tests.test_config tests.test_hik_cloud -v
|
||||
```
|
||||
|
||||
Expected: fail because list-of-mapping parsing and `hik_cloud.py` do not exist yet.
|
||||
|
||||
- [ ] **Step 4: Implement minimal parser/defaults/chunking**
|
||||
|
||||
Extend the simple YAML parser only enough for list items shaped as mappings. Add defaults for `source` and `hik_cloud`. Implement date-time parsing with `zoneinfo.ZoneInfo`.
|
||||
|
||||
- [ ] **Step 5: Run green tests**
|
||||
|
||||
Run the same unittest command. Expected: pass.
|
||||
|
||||
### Task 2: Hik Download Address API Client
|
||||
|
||||
**Files:**
|
||||
- Modify: `video_ai_analysis_poc/hik_cloud.py`
|
||||
- Modify: `tests/test_hik_cloud.py`
|
||||
|
||||
- [ ] **Step 1: Write failing API client tests**
|
||||
|
||||
Mock the HTTP function and verify:
|
||||
|
||||
- URL is `api_base_url.rstrip("/") + download_path`.
|
||||
- Headers include `Authorization: bearer TOKEN`.
|
||||
- JSON body includes `deviceSerial`, `channelNo`, `timeBegin`, `timeEnd`.
|
||||
- Success returns URL and actual begin/end.
|
||||
- Code `80438027` returns a structured `no_recording` result.
|
||||
- Other non-zero codes return `address_failed`.
|
||||
|
||||
- [ ] **Step 2: Run red tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
python3 -B -m unittest tests.test_hik_cloud -v
|
||||
```
|
||||
|
||||
Expected: fail because the client is missing.
|
||||
|
||||
- [ ] **Step 3: Implement client**
|
||||
|
||||
Use `urllib.request` and injectable callables for tests. Do not log or persist the token.
|
||||
|
||||
- [ ] **Step 4: Run green tests**
|
||||
|
||||
Run the same command. Expected: pass.
|
||||
|
||||
### Task 3: Download Files And Manifest
|
||||
|
||||
**Files:**
|
||||
- Modify: `video_ai_analysis_poc/hik_cloud.py`
|
||||
- Modify: `video_ai_analysis_poc/paths.py`
|
||||
- Modify: `tests/test_hik_cloud.py`
|
||||
|
||||
- [ ] **Step 1: Write failing downloader tests**
|
||||
|
||||
Mock address results and download bytes. Verify downloaded files are written under `downloads/hik_cloud/<device>/ch<channel>/`, filenames contain requested timestamps, manifest rows are written, token/query signatures are not in filenames, and resume skips already downloaded files.
|
||||
|
||||
- [ ] **Step 2: Run red tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
python3 -B -m unittest tests.test_hik_cloud -v
|
||||
```
|
||||
|
||||
Expected: fail because downloader/manifest behavior is missing.
|
||||
|
||||
- [ ] **Step 3: Implement downloader**
|
||||
|
||||
Write `download_hik_cloud_recordings(config, output_dir, *, address_client=None, download_url=None)` returning downloaded video records with cloud metadata.
|
||||
|
||||
- [ ] **Step 4: Run green tests**
|
||||
|
||||
Run the same command. Expected: pass.
|
||||
|
||||
### Task 4: CLI Cloud Source Integration
|
||||
|
||||
**Files:**
|
||||
- Modify: `video_ai_analysis_poc/cli.py`
|
||||
- Modify: `tests/test_cli.py`
|
||||
|
||||
- [ ] **Step 1: Write failing CLI tests**
|
||||
|
||||
Add tests that:
|
||||
|
||||
- `source.mode: local` still uses `discover_videos`.
|
||||
- `source.mode: hik_cloud` calls the cloud downloader and probes returned downloaded paths.
|
||||
- `--dry-run` in cloud mode requests download addresses and writes the download manifest, but does not download video files, probe, call FFmpeg, call VLM, or aggregate.
|
||||
- `--until clips` in cloud mode produces video/frame/clip manifests from mocked downloaded video records.
|
||||
|
||||
- [ ] **Step 2: Run red tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
python3 -B -m unittest tests.test_cli -v
|
||||
```
|
||||
|
||||
Expected: fail because CLI has no source mode branch.
|
||||
|
||||
- [ ] **Step 3: Implement CLI branch**
|
||||
|
||||
Keep local behavior unchanged. In cloud mode, call downloader before probe and carry cloud metadata into `video_manifest.jsonl`.
|
||||
|
||||
- [ ] **Step 4: Run green tests**
|
||||
|
||||
Run the same command. Expected: pass.
|
||||
|
||||
### Task 5: Docs, Example Config, And Full Verification
|
||||
|
||||
**Files:**
|
||||
- Modify: `config/local_batch.yaml`
|
||||
- Modify: `docs/project.md`
|
||||
- Modify: `findings.md`
|
||||
- Modify: `progress.md`
|
||||
- Modify: `memories.md`
|
||||
|
||||
- [ ] **Step 1: Update docs/config**
|
||||
|
||||
Add a commented or safe example for `source.mode: hik_cloud`, token env var, devices, and time ranges. Do not include a real token.
|
||||
|
||||
- [ ] **Step 2: Run full tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
python3 -B -m unittest discover -s tests -v
|
||||
python3 -B -m py_compile video_ai_analysis_poc/*.py
|
||||
```
|
||||
|
||||
Expected: all pass.
|
||||
|
||||
- [ ] **Step 3: Run local mock smoke**
|
||||
|
||||
Use test mocks or a temporary local HTTP fixture to verify cloud mode can produce downloaded files and continue to `--until clips` without a real Hik token.
|
||||
|
||||
- [ ] **Step 4: Record results**
|
||||
|
||||
Update `progress.md` with commands, results, files changed, and remaining risk. Real Hik API verification is skipped until a real AccessToken/device/time range is provided.
|
||||
@@ -0,0 +1,151 @@
|
||||
# Hik Cloud Download Analysis Design
|
||||
|
||||
## Goal
|
||||
|
||||
Add Hik Cloud Storage recording download as a first-class video source for the existing video analysis pipeline. The implementation must support configurable AccessToken, multiple devices, configurable date-time ranges, one-hour API slicing, video downloads, and reuse the existing local analysis pipeline.
|
||||
|
||||
## Source Model
|
||||
|
||||
The pipeline keeps the existing local mode and adds a cloud mode:
|
||||
|
||||
```yaml
|
||||
source:
|
||||
mode: local # local | hik_cloud
|
||||
```
|
||||
|
||||
`local` keeps the current folder discovery behavior. `hik_cloud` runs a download stage first, then analyzes the downloaded files exactly like local files.
|
||||
|
||||
## Hik Cloud Configuration
|
||||
|
||||
The config should allow a literal token for controlled testing and an environment variable for normal use:
|
||||
|
||||
```yaml
|
||||
hik_cloud:
|
||||
api_base_url: https://api2.hik-cloud.com
|
||||
download_path: /v1/carrier/cstorage/open/play/download
|
||||
access_token: null
|
||||
access_token_env: HIK_CLOUD_ACCESS_TOKEN
|
||||
chunk_seconds: 600
|
||||
timeout_seconds: 60
|
||||
download_timeout_seconds: 600
|
||||
devices:
|
||||
- device_serial: EXAMPLE_DEVICE_SERIAL
|
||||
channel_no: 1
|
||||
name: store-front
|
||||
time_ranges:
|
||||
- begin: "2026-02-03 09:00:00"
|
||||
end: "2026-02-03 11:30:00"
|
||||
```
|
||||
|
||||
The implementation must not print or persist the token. Manifest entries may record the API URL path, device serial, channel, requested times, actual times, and status, but not the Authorization header.
|
||||
|
||||
## Time Handling
|
||||
|
||||
The user-facing time range includes year, month, day, hour, minute, and second. The config supports both `YYYY-MM-DD HH:MM:SS` strings and integer epoch seconds. String parsing uses `runtime.timezone`, defaulting to `Asia/Shanghai`, and converts to Unix seconds for `timeBegin` and `timeEnd`.
|
||||
|
||||
Ranges are split into chunks with `end - begin <= 3600` because the PDF documents error `80430002` when the requested interval exceeds 3600 seconds. The example default uses 600 seconds because real remote smoke found that shorter chunks produced valid, probeable MP4 files for the provided test range.
|
||||
|
||||
## API Contract
|
||||
|
||||
Use the PDF section “2、获取录像下载地址”:
|
||||
|
||||
```text
|
||||
POST https://api2.hik-cloud.com/v1/carrier/cstorage/open/play/download
|
||||
Authorization: bearer <AccessToken>
|
||||
Content-Type: application/json
|
||||
```
|
||||
|
||||
Request body:
|
||||
|
||||
```json
|
||||
{
|
||||
"deviceSerial": "EXAMPLE_DEVICE_SERIAL",
|
||||
"channelNo": 1,
|
||||
"timeBegin": 1764856787,
|
||||
"timeEnd": 1764856978
|
||||
}
|
||||
```
|
||||
|
||||
Successful response:
|
||||
|
||||
```json
|
||||
{
|
||||
"code": 0,
|
||||
"data": {
|
||||
"url": "https://...",
|
||||
"actualBeginTime": "1764856787",
|
||||
"actualEndTime": "1764856978"
|
||||
},
|
||||
"success": true
|
||||
}
|
||||
```
|
||||
|
||||
Non-zero codes become structured failures. `80438027` is treated as `no_recording` so one empty chunk does not stop the batch.
|
||||
|
||||
## Output Contract
|
||||
|
||||
Cloud downloads write a dedicated manifest:
|
||||
|
||||
```text
|
||||
<output.dir>/hik_cloud_download_manifest.jsonl
|
||||
```
|
||||
|
||||
Each row contains:
|
||||
|
||||
- `source: hik_cloud`
|
||||
- `device_serial`
|
||||
- `channel_no`
|
||||
- `requested_begin`, `requested_end`
|
||||
- `actual_begin`, `actual_end`
|
||||
- `download_url_host` or no URL at all if avoiding host persistence is preferred
|
||||
- `path` for downloaded video
|
||||
- `status`: `address_ok`, `downloaded`, `no_recording`, `address_failed`, `download_failed`
|
||||
- `retry_count`, `last_error`
|
||||
|
||||
Downloaded videos go under:
|
||||
|
||||
```text
|
||||
<output.dir>/downloads/hik_cloud/<device_serial>/ch<channel_no>/
|
||||
```
|
||||
|
||||
Filenames use device/channel/requested timestamps and never include URL query signatures or tokens.
|
||||
|
||||
## Pipeline Integration
|
||||
|
||||
`cli.py` should branch only at source acquisition:
|
||||
|
||||
```text
|
||||
local mode:
|
||||
discover local videos -> probe -> frames -> clips -> inference -> aggregate
|
||||
|
||||
hik_cloud mode:
|
||||
build chunks -> request download URLs -> download videos -> probe -> frames -> clips -> inference -> aggregate
|
||||
```
|
||||
|
||||
After downloads complete, the rest of the pipeline should consume downloaded file paths and preserve cloud metadata in `video_manifest.jsonl`.
|
||||
|
||||
FFmpeg sampling caps output frames from the requested/actual cloud chunk duration. This prevents malformed or irregular Hik MP4 timestamps from making the `fps=1` filter duplicate tens of thousands of frames for a 10-minute chunk.
|
||||
|
||||
Cloud `--dry-run` stops at download-address planning: it requests addresses and writes `hik_cloud_download_manifest.jsonl`, but does not download video files, run ffprobe, sample frames, infer, or aggregate.
|
||||
|
||||
## Error Handling
|
||||
|
||||
- Missing token: fail fast with a clear config error in `hik_cloud` mode.
|
||||
- Invalid range: fail fast if `end <= begin`.
|
||||
- API code 80438027: record `no_recording`, continue.
|
||||
- Other API non-zero code: record `address_failed`, continue other chunks.
|
||||
- Download HTTP/IO failure: record `download_failed`, continue other chunks.
|
||||
- Existing downloaded file with manifest status `downloaded`: skip on resume.
|
||||
|
||||
## Testing
|
||||
|
||||
Use TDD with standard-library mocks:
|
||||
|
||||
- config parser loads `devices` as list of dicts.
|
||||
- time parser accepts date-time strings and epoch integers.
|
||||
- splitter produces max-3600-second chunks.
|
||||
- API client builds correct URL, body, bearer header, and parses success/failure.
|
||||
- downloader writes bytes and manifest without persisting token.
|
||||
- CLI cloud mode uses downloaded files and keeps local mode unchanged.
|
||||
|
||||
Real Hik API smoke uses the sensitive `access_token.md` file provided by the user on the remote test environment. Do not copy values from that file into docs, tests, logs, or final responses.
|
||||
Reference in New Issue
Block a user