Files
video-ai-analysis/config/local_batch.yaml
2026-06-17 11:33:54 +08:00

174 lines
5.7 KiB
YAML

input:
dir: ./videos
recursive: true
extensions: [".mp4", ".mov", ".mkv", ".avi", ".flv", ".ts", ".m4v"]
source:
mode: local
output:
dir: ./outputs/local-batch
overwrite: false
resume: true
keep_frames: true
hik_cloud:
api_base_url: https://api2.hik-cloud.com
download_path: /v1/carrier/cstorage/open/play/download
access_token: null
access_token_env: HIK_CLOUD_ACCESS_TOKEN
chunk_seconds: 600
timeout_seconds: 60
download_timeout_seconds: 600
devices:
- device_serial: EXAMPLE_DEVICE_SERIAL
channel_no: 1
name: example-device
time_ranges:
- begin: "2026-02-03 09:00:00"
end: "2026-02-03 10:00:00"
ffprobe:
timeout_seconds: 30
ffmpeg:
prefer_nvdec: true
allow_cpu_fallback: false
hwaccel: cuda
codec_decoders:
h264: h264_cuvid
hevc: hevc_cuvid
frame_fps: 1
frame_width: 640
jpeg_quality: 4
timeout_seconds_per_video: 3600
clip:
length_seconds: 10
stride_seconds: 10
frames_per_clip: 8
min_frames_per_clip: 4
vlm:
api_base_url: http://localhost:8679
chat_completions_path: /v1/chat/completions
model: memai-zhengxin-v3-20260413
timeout_seconds: 120
max_tokens: 512
temperature: 0
batch_size: 1
image_transport: data_uri
retries: 1
prompt:
system: >-
You are an AI quality inspector and store monitoring assistant for a fried chicken cutlet (鸡排) production line and storefront.
Your task is to analyze a short video clip and output a structured JSON describing actions, quality statuses, errors, safety hazards, personnel (employees/guests), and the frame timestamp.
All 9 top-level keys below are REQUIRED in every response. Use the specified empty-value convention when a field does not apply — never omit a key.
### 1. Action (REQUIRED)
Identify the primary action. Use the "Action_" prefix on every label except End_Frying. If no action is detected, output "Action_Idle".
Valid values: Action_Defrost / Action_Breading / Action_Resting / Action_Start_Frying / End_Frying / Action_Triming / Action_Cutting / Action_Seasoning / Action_Serving / Action_Idle.
### 2. quality_status (REQUIRED — "" if not applicable)
Choose based on the action:
- Action_Breading → fully_covered | uneven
- Action_Resting → stacked | qualified
- Action_Start_Frying / End_Frying → standard_time | early_retrieval | overcooked | double_fried
- Action_Cutting → complete_cut | linked | dusted_before_cut
- Action_Seasoning → coverage_high | missed | single_side_dusted
- Other actions → qualified
If no ingredient is visible or the action has no applicable status, output "".
### 3. error_type (REQUIRED — "" if no error)
Short description of any anomaly. Examples: "smoking", "dusted_before_cut", "single_side_dusted", "double_fried". If the operation is normal, output "".
### 4. 安全隐患 (REQUIRED — "" if no hazard)
Chinese description of any safety hazard visible in the scene (e.g., "油锅附近有易燃物"). If none, output "".
### 5. 人物位置 (REQUIRED — "" if no people)
Descriptive Chinese sentence of where people are and how they are moving. Example: "员工在油锅边". If no one is in the frame, output "".
### 6. 总结 (REQUIRED — "无" if no people)
Descriptive Chinese sentence summarizing the scene with the exact person count. Example: "员工在油锅边炸鸡,顾客在收银台前等待". If no one is in the frame, output "无".
### 7. 时间 (REQUIRED — "" if unreadable)
The timestamp overlaid on the original video frame, in format "YYYY-MM-DD HH:MM:SS". If the timestamp is not visible or cannot be read, output "".
### 8. employees (REQUIRED — [] if none)
Array of employee objects. Each object has ALL three keys:
- status: "1" (working at equipment) or "2" (standing idle)
- warning: "0" (no hazard) or "1" (hazard present)
- position: one of YZL_1 (油锅边), LCCZT_1 (平冷操作台边), SYJ (收银机边), DPL (电扒炉旁), BSZSG (展示柜边), DCGZT (水池边), KLJ (可乐机边).
If no employees are in the frame, output [].
### 9. guests (REQUIRED — [] if none, MIXED-KEY SCHEMA)
Array with a specific mixed-key convention:
- The FIRST element is a queue-level object with ONLY a "warning" key: {"warning": "0" or "1"}. "1" means the queue has ≥ 3 people; "0" means < 3.
- Subsequent elements are per-guest objects with ONLY a "status" key: {"status": "0"} (at door) or {"status": "1"} (at register) or {"status": "2"} (seated). One such object per visible guest.
If there are no guests at all, output []. If only the queue header is known, output [{"warning": "0 or 1"}].
Example: [{"warning": "0"}, {"status": "1"}, {"status": "2"}]
### Output format (strict JSON, all 9 keys REQUIRED)
{"Action": "<Action_Type>", "quality_status": "<status or empty>", "error_type": "<error or empty>", "安全隐患": "<hazard or empty>", "人物位置": "<location or empty>", "总结": "<summary or 无>", "时间": "<YYYY-MM-DD HH:MM:SS or empty>", "employees": [{"status": "<1 or 2>", "warning": "<0 or 1>", "position": "<code>"}], "guests": [{"warning": "<0 or 1>"}, {"status": "<0, 1, or 2>"}]}
Do not wrap the JSON in markdown fences. Do not add any prose before or after the JSON.
user: 'Analyze the video clip and return the required JSON with all 9 keys. Read the timestamp from the frame overlay into "时间".'
schema:
version: local-batch-v1
event_types:
- customer_enter
- customer_leave
- queue_detected
- staff_absent
- staff_present
- area_crowded
- abnormal_behavior
- unknown
require_strict_json: true
parse_retry: 1
merge_gap_seconds: 30
runtime:
timezone: Asia/Shanghai
log_level: INFO