Webhook Case Management Design
Goal: Add outbound webhooks plus a local case-management layer so the project can both push runtime facts to external systems and independently track pending/handled cases in the local management console.
Architecture: Keep the existing runtime event stream as the source of operational facts. Add a separate case-state layer that consumes selected runtime events, persists case state transitions, exposes management APIs, and emits case webhooks without mutating the underlying batch facts. Integrate manual handling and external callback handling through the same case-state model.
Tech Stack: Python 3.11+ standard library backend, JSONL persistence, Vite + vanilla JavaScript frontend, existing unittest and Node test suites.
Scope
This design extends the current project in four focused areas:
- Add outbound webhook delivery for runtime batch events.
- Add a local case model for operator workflow.
- Add management APIs for listing, summarizing, manually handling, and externally updating cases.
- Add frontend views and actions for local case operations.
The runtime batch engine remains the producer of factual detection events. Case handling is a downstream interpretation layer.
Current Constraints
- The current runtime writes facts to logs/events.jsonl and diagnostics to logs/runtime_diagnostics.jsonl.
- The management API is a small standard-library HTTP server and should stay that way.
- The frontend already renders runtime metrics and runtime events and should continue to do so.
- The user-selected workflow requires both manual handling and external callback handling.
- The user-selected workflow requires both event webhooks and case webhooks.
- The events that should enter the local pending-case flow are time_alarm, batch_pending_disposal, and warning_escalated.
Design Summary
The system is split into three cooperating layers:
- Batch event layer
  Produces facts such as batch_started, time_alarm, batch_pending_disposal, batch_discarded, and warning_escalated. These remain append-only runtime facts.
- Case state layer
  Consumes selected batch events and maintains a separate per-batch local case state. The case layer owns pending/handled workflow and does not rewrite prior runtime facts.
- Integration layer
  Delivers outbound event and case webhooks, accepts external case callbacks, and records webhook delivery attempts for audit and debugging.
Persistence Model
- logs/events.jsonl
  Existing runtime fact log. No schema removals.
- logs/cases.jsonl
  New append-only case transition log. Each line records a case snapshot after a state change.
- logs/webhook_delivery.jsonl
  New append-only webhook delivery audit log. Each line records an attempted outbound delivery result.
events.jsonl remains the source of factual batch history. cases.jsonl is the source of case workflow state. webhook_delivery.jsonl is operational telemetry only.
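As an illustration of the append-only model, the following is a minimal sketch of writing and restoring case snapshots. It assumes one JSON object per line and that the last snapshot written for a given case_id is the current state; the helper names append_snapshot and restore_cases are hypothetical and not taken from the project.

```python
import json
from pathlib import Path


def append_snapshot(path: Path, snapshot: dict) -> None:
    """Append one case snapshot as a single JSON line (hypothetical helper)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(snapshot, ensure_ascii=False) + "\n")


def restore_cases(path: Path) -> dict[str, dict]:
    """Replay the log in order; the last snapshot per case_id wins."""
    cases: dict[str, dict] = {}
    if not path.exists():
        return cases
    with path.open("r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            snapshot = json.loads(line)
            cases[snapshot["case_id"]] = snapshot
    return cases
```

Because restore only replays snapshots, a case that was written as handled stays handled after a restart, which is the property the restore logic needs.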
Case Model
Each batch can own at most one local case. A case is created or updated from selected batch events and then independently handled by a local operator or external callback.
Case fields
- case_id
- batch_id
- camera_id
- zone_id
- zone_label
- case_type
- case_status
- source_event
- created_at
- updated_at
- handled_at
- handled_by
- handled_source
- last_event_ts
- payload
Case type values
- time_alarm
- pending_disposal
- warning_escalated
Case status values
- open
- handled
Handled source values
- manual
- webhook_callback
- auto_closed
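The field and value lists above could be captured in a small dataclass. This is only a sketch: the field names and allowed values come from this design, but the Python types (epoch-second floats for timestamps, a free-form dict for payload), the defaults, and the validation in __post_init__ are assumptions.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Optional

CASE_TYPES = ("time_alarm", "pending_disposal", "warning_escalated")
CASE_STATUSES = ("open", "handled")
HANDLED_SOURCES = ("manual", "webhook_callback", "auto_closed")


@dataclass
class Case:
    case_id: str
    batch_id: str
    camera_id: str
    zone_id: str
    zone_label: str
    case_type: str
    case_status: str
    source_event: str
    created_at: float  # assumed: epoch seconds
    updated_at: float
    last_event_ts: float
    handled_at: Optional[float] = None
    handled_by: Optional[str] = None
    handled_source: Optional[str] = None
    payload: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject values outside the enumerations listed in this design.
        if self.case_type not in CASE_TYPES:
            raise ValueError(f"unknown case_type: {self.case_type!r}")
        if self.case_status not in CASE_STATUSES:
            raise ValueError(f"unknown case_status: {self.case_status!r}")
        if self.handled_source is not None and self.handled_source not in HANDLED_SOURCES:
            raise ValueError(f"unknown handled_source: {self.handled_source!r}")
```

Calling asdict(case) on an instance yields a plain dict that can be written as one line of cases.jsonl.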
Case State Flow
- time_alarm
  Create a case if one does not exist for the batch. If a case already exists, keep it open and refresh timestamps.
- batch_pending_disposal
  Create a case if one does not exist. If one exists, update it in place and upgrade case_type to pending_disposal.
- warning_escalated
  Update the same case in place and upgrade case_type to warning_escalated.
- Manual handling
  Mark the case as handled, set handled_source=manual, record handled_by, and append the new snapshot to cases.jsonl.
- External callback handling
  Mark the case as handled, set handled_source=webhook_callback, optionally record handled_by and source_ref, and append the new snapshot to cases.jsonl.
- batch_discarded
  If the related case is still open, close it automatically with handled_source=auto_closed.
Handled cases must not reopen when stale older events are replayed or re-read. Only new event processing in forward time may mutate an existing case. Restore logic must preserve handled status across runtime/API restarts.
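The transition rules can be sketched as a pure function over case snapshots. Several points here are assumptions rather than settled design: timestamps are epoch-second floats, the case_id scheme is invented for illustration, warning_escalated creates a case when none exists yet, and a handled case is left untouched by every later batch event (a conservative reading of the non-reopen rule). The names apply_event and mark_handled are hypothetical.

```python
from typing import Optional

# Batch event name -> case_type, as listed in this design.
EVENT_TO_CASE_TYPE = {
    "time_alarm": "time_alarm",
    "batch_pending_disposal": "pending_disposal",
    "warning_escalated": "warning_escalated",
}


def mark_handled(case: dict, handled_source: str, now: float,
                 handled_by: Optional[str] = None) -> dict:
    """Close an open case; an already handled case is returned unchanged."""
    if case["case_status"] == "handled":
        return case
    return {**case, "case_status": "handled", "handled_source": handled_source,
            "handled_by": handled_by, "handled_at": now, "updated_at": now}


def apply_event(case: Optional[dict], event: dict) -> Optional[dict]:
    """Return the case snapshot after the event, or the input if nothing changes."""
    name, ts = event["event"], event["ts"]
    if case is not None:
        # Handled cases never reopen; stale or replayed events change nothing.
        if case["case_status"] == "handled" or ts <= case["last_event_ts"]:
            return case
    if name == "batch_discarded":
        if case is None:
            return None
        return {**mark_handled(case, "auto_closed", ts), "last_event_ts": ts}
    if name not in EVENT_TO_CASE_TYPE:
        return case  # events outside the case flow are ignored
    if case is None:
        return {
            "case_id": f"case-{event['batch_id']}",  # hypothetical id scheme
            "batch_id": event["batch_id"],
            "case_type": EVENT_TO_CASE_TYPE[name],
            "case_status": "open",
            "source_event": name,
            "created_at": ts, "updated_at": ts, "last_event_ts": ts,
            "handled_at": None, "handled_by": None, "handled_source": None,
        }
    updated = {**case, "updated_at": ts, "last_event_ts": ts}
    if name != "time_alarm":
        # time_alarm only refreshes timestamps; the other events upgrade the type.
        updated["case_type"] = EVENT_TO_CASE_TYPE[name]
    return updated
```

Returning the same object when nothing changes lets the caller skip appending a redundant snapshot to cases.jsonl.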
Backend Components
- Create src/cold_display_guard/cases.py for case transition logic, persistence, restore, and summary helpers.
- Create src/cold_display_guard/webhooks.py for webhook config parsing, payload building, synchronous delivery, and delivery audit logging.
- Extend src/cold_display_guard/config.py for webhook configuration and case/log sink paths.
- Extend src/cold_display_guard/main.py to feed runtime events into case persistence and webhook delivery.
- Extend src/cold_display_guard/manage_api.py to expose case listing, case summary, manual handling, and token-protected callback handling.
API Design
All new endpoints stay under /api/manage/*.
- GET /api/manage/cases
  Query: status=open|handled optional, limit optional.
- GET /api/manage/cases/summary
  Returns case counts and latest update time.
- POST /api/manage/cases/{case_id}/handle
  Body: handled_by required, note optional.
- POST /api/manage/webhooks/case-update
  Body: case_id required, status required and must equal handled, handled_by optional, source_ref optional.
The callback endpoint must require the configured shared token in the X-Webhook-Token header and must reject unauthenticated updates.
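A minimal sketch of the callback checks, using only the standard library. The function names are hypothetical, and rejecting every request when no token is configured is an assumed policy. hmac.compare_digest is used so the comparison does not leak timing information about the token.

```python
import hmac
from typing import Mapping, Optional


def is_authorized(headers: Mapping[str, str], expected_token: str) -> bool:
    """Return True only when X-Webhook-Token matches the configured token."""
    if not expected_token:
        return False  # assumed policy: no configured token means no callbacks
    supplied = headers.get("X-Webhook-Token") or ""
    return hmac.compare_digest(supplied.encode("utf-8"), expected_token.encode("utf-8"))


def validate_callback_body(body: dict) -> Optional[str]:
    """Return an error message, or None when the body is acceptable."""
    if not body.get("case_id"):
        return "case_id is required"
    if body.get("status") != "handled":
        return "status must equal 'handled'"
    return None
```

In a standard-library http.server handler, self.headers can be passed directly as headers; its get method is case-insensitive, whereas a plain dict is not.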
Webhook Configuration
[webhooks]
enabled = true
event_url = "https://example.com/runtime-events"
case_url = "https://example.com/case-events"
callback_token = "shared-secret"
connect_timeout_seconds = 3
read_timeout_seconds = 5
Outbound Webhook Delivery
Event webhook payload core fields:
- kind = "batch_event"
- event
- ts
- batch_id
- camera_id
- zone_id
- zone_label
- severity
- state
Case webhook payload core fields:
- kind = "case_event"
- action = "created" | "updated" | "handled"
- case_id
- case_type
- case_status
- batch_id
- source_event
- handled_source
- updated_at
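The two payload shapes could be built by copying the listed fields out of the stored event or case record. This is a sketch: the builder names are hypothetical, and emitting None for a field missing from the source record is an assumption.

```python
EVENT_FIELDS = ("event", "ts", "batch_id", "camera_id", "zone_id",
                "zone_label", "severity", "state")
CASE_FIELDS = ("case_id", "case_type", "case_status", "batch_id",
               "source_event", "handled_source", "updated_at")
CASE_ACTIONS = ("created", "updated", "handled")


def build_event_payload(event: dict) -> dict:
    """Project a runtime event onto the event webhook core fields."""
    payload = {"kind": "batch_event"}
    payload.update({name: event.get(name) for name in EVENT_FIELDS})
    return payload


def build_case_payload(case: dict, action: str) -> dict:
    """Project a case snapshot onto the case webhook core fields."""
    if action not in CASE_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    payload = {"kind": "case_event", "action": action}
    payload.update({name: case.get(name) for name in CASE_FIELDS})
    return payload
```

Copying an explicit field list keeps internal-only fields out of outbound payloads even if the stored records grow later.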
Delivery rules:
- Local runtime facts and case state must be persisted before webhook failure can affect control flow.
- Webhook failure must append a line to logs/webhook_delivery.jsonl.
- Webhook failure must not stop local event persistence or local case persistence.
- This iteration of work does not add a retry queue.
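The delivery rules can be sketched as a synchronous POST that never raises and always records the attempt. The function names and the audit record fields (ts, url, kind, ok, status, error) are hypothetical. Note that urllib accepts a single timeout value, so this sketch does not apply the connect and read timeouts separately.

```python
import json
import time
import urllib.request
from pathlib import Path
from typing import Callable


def http_post_json(url: str, payload: dict, timeout: float) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def deliver(url: str, payload: dict, audit_path: Path, timeout: float = 5.0,
            send: Callable[[str, dict, float], int] = http_post_json) -> bool:
    """Attempt one delivery, append the result to the audit log, never raise."""
    record = {"ts": time.time(), "url": url, "kind": payload.get("kind"), "ok": False}
    try:
        record["status"] = send(url, payload, timeout)
        record["ok"] = 200 <= record["status"] < 300
    except Exception as exc:  # broad on purpose: delivery must not break the caller
        record["error"] = repr(exc)
    audit_path.parent.mkdir(parents=True, exist_ok=True)
    with audit_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record["ok"]
```

Calling deliver only after the event and case snapshot have been written locally keeps a webhook outage from affecting persistence. The send parameter exists so tests can substitute a fake transport instead of opening a socket.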
Frontend Changes
- Keep the current runtime event table for factual runtime events only.
- Add a separate case table with:
  - case_id
  - case_type
  - case_status
  - zone_label
  - batch_id
  - created_at
  - updated_at
  - handled_source
- Add manual-handle UI for open cases with handled_by required and note optional.
- Add summary cards for:
  - open_case_count
  - handled_case_count
  - time_alarm_case_count
  - pending_disposal_case_count
  - warning_escalated_case_count
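On the backend side, the values behind these summary cards could come from a helper like the following sketch. The function name and the latest_updated_at key are hypothetical, and counting the per-type totals across all cases (rather than open cases only) is an assumption this design does not settle.

```python
from collections import Counter


def summarize_cases(cases: list[dict]) -> dict:
    """Count cases by status and type and report the latest update time."""
    by_status = Counter(c["case_status"] for c in cases)
    by_type = Counter(c["case_type"] for c in cases)
    return {
        "open_case_count": by_status["open"],
        "handled_case_count": by_status["handled"],
        "time_alarm_case_count": by_type["time_alarm"],
        "pending_disposal_case_count": by_type["pending_disposal"],
        "warning_escalated_case_count": by_type["warning_escalated"],
        # None when there are no cases yet
        "latest_updated_at": max((c["updated_at"] for c in cases), default=None),
    }
```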
Testing Plan
- Preserve existing batch engine behavior tests.
- Add case tests for create, escalate, manual handle, callback handle, auto-close, and non-reopen behavior.
- Add webhook tests for payloads, delivery success, and failure audit logging.
- Add API tests for new case and callback endpoints.
- Add frontend tests for case rendering, case summary mapping, and manual-handle request flow.
Verification commands:
eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest discover -s tests -v
node --test web/test/zone-state.test.js
cd web && pnpm build