feat: add webhook retry queue

2026-06-09 11:32:34 +08:00
parent 81f170924c
commit 8f516fdc01
12 changed files with 940 additions and 74 deletions

@@ -1,51 +1,31 @@
# Task Todo
- [x] Review the current project instructions and check for task-relevant lessons.
- [x] Check repository status before writing the implementation plan.
- [x] Inspect existing engine, CLI, docs, and frontend event handling for disposal-tracking impact.
- [x] Write the design spec for webhook case management in an isolated worktree.
- [x] Confirm the design with the user before implementation.
- [x] Check repository status before starting retry-queue work.
- [x] Re-verify that `main` includes webhook case management before layering retries on top.
- [x] Inspect the current webhook delivery path, config surface, runtime integration point, and manage API hooks.
- [x] Write the detailed retry-queue implementation plan to `docs/superpowers/plans/2026-06-09-webhook-retry-queue.md`.
- [x] Execute webhook retry queue backend TDD cycle.
- [x] Execute runtime/manage API retry integration TDD cycle.
- [x] Update documentation/config formatting for retry queue settings and sinks.
- [x] Run targeted verification and final full verification.
## Design Review
## Notes
- Spec path: `docs/superpowers/specs/2026-06-09-webhook-case-management-design.md`
- Scope fixed to local case management plus outbound and inbound webhook integration.
- Confirmed behaviors:
- manual handling and external callback handling are both supported
- cases are created from `time_alarm`, `batch_pending_disposal`, and `warning_escalated`
- both batch-event webhooks and case-state webhooks are required
- callback `status` is exactly `handled`
- callback-applied case handling must emit a `case_event` webhook
- `tasks/lessons.md` is absent in this repository/worktree, so there were no prior session lessons to review.
- Main branch merge result is available locally at `81f1709`; retry-queue work continues from branch `feat/webhook-retry-queue`.
## 2026-06-09 Implementation Plan
## Review
- [x] Create isolated worktree for implementation on branch `feat/webhook-case-management`.
- [x] Re-check runtime baseline in the worktree and note the local Python environment requirement.
- [x] Write the detailed implementation plan to `docs/superpowers/plans/2026-06-09-webhook-case-management-implementation.md`.
- [x] Execute backend case-state TDD cycle.
- [x] Execute webhook integration TDD cycle.
- [x] Execute management API TDD cycle.
- [x] Execute frontend case-management TDD cycle.
- [x] Run full verification and record outcomes.
## 2026-06-09 Implementation Review
- Worktree path: `/Users/glo/.config/superpowers/worktrees/cold_display_guard/webhook-case-management`
- Baseline note: the default `python3` in this shell resolves to macOS system Python 3.9, which cannot import the repo's `dataclass(..., slots=True)` code because the `slots` parameter requires Python 3.10 or later. Python verification in this worktree must run through `eval "$(/opt/homebrew/bin/pyenv init -)" && python ...`, which resolves to Python 3.12.11.
- Frontend baseline check in the worktree passed with `node --test web/test/zone-state.test.js`.
- Implemented:
- `src/cold_display_guard/cases.py` for case lifecycle and JSONL persistence
- `src/cold_display_guard/webhooks.py` for outbound event/case webhook delivery and audit logging
- runtime integration in `src/cold_display_guard/main.py`
- case listing/summary/manual-handle/callback routes in `src/cold_display_guard/manage_api.py`
- frontend case summary and manual-handle flow in `web/src/main.js` and `web/src/zone-state.js`
- Targeted verification passed during implementation:
- `eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_cases.py -v`
- Plan saved to `docs/superpowers/plans/2026-06-09-webhook-retry-queue.md`.
- Chosen scope keeps the first outbound webhook attempt synchronous, then persists failures into a JSONL-backed retry queue with bounded backoff and dead-letter cutoff.
- Retry queue observability and manual compensation will be exposed through the management API rather than the frontend in this phase.
- Implemented queue-aware webhook delivery in `src/cold_display_guard/webhooks.py`, runtime retry draining in `src/cold_display_guard/main.py`, manage API retry list/drain endpoints in `src/cold_display_guard/manage_api.py`, and config/doc updates in `src/cold_display_guard/config.py`, `config/example.toml`, and `README_zh.md`.
- Targeted verification passed:
- `eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_webhooks.py -v`
- `eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_main.py -v`
- `eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_manage_api.py -v`
- `node --test web/test/zone-state.test.js`
- `eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_config.py -v`
- Final verification passed:
- `eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest discover -s tests -v`
- `cd web && pnpm build`
- Frontend build note: the isolated worktree needed `cd web && pnpm install --frozen-lockfile` before `pnpm build` because `node_modules` is not shared into new worktrees.
- `cd web && pnpm install --frozen-lockfile && pnpm build`