cold_display_guard/docs/superpowers/plans/2026-06-09-webhook-retry-queue.md

Webhook Retry Queue Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Add persistent webhook retry queue handling so failed outbound webhook deliveries are retried with backoff instead of being recorded only as one-shot failures.

Architecture: Keep the current synchronous direct-send path as the first attempt, but persist failed outbound deliveries into a separate append-only retry-state JSONL log. Reconstruct the latest retry state from that log, retry due items from runtime and management API entry points, and expose queue visibility plus manual drain control through the existing management API.
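The append-only log and "latest state wins" reconstruction described above could look roughly like the following. This is a minimal sketch, not the project's implementation: the helper names `append_snapshot` and `load_latest_state`, the `delivery_id` key, the snapshot fields, and the file name `webhook_retries.jsonl` are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def append_snapshot(log_path: Path, snapshot: dict) -> None:
    """Append one retry snapshot as a single JSON line (hypothetical helper)."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(snapshot, sort_keys=True) + "\n")


def load_latest_state(log_path: Path) -> dict[str, dict]:
    """Replay the log in order; the last snapshot per delivery_id wins."""
    latest: dict[str, dict] = {}
    if not log_path.exists():
        return latest
    with log_path.open("r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                snapshot = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip a torn or corrupt line instead of aborting
            if not isinstance(snapshot, dict) or "delivery_id" not in snapshot:
                continue  # ignore lines that are not retry snapshots
            latest[snapshot["delivery_id"]] = snapshot
    return latest


def demo() -> dict[str, dict]:
    """Write three snapshots for two deliveries and rebuild the state."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "webhook_retries.jsonl"  # assumed file name
        append_snapshot(log_path, {"delivery_id": "d1", "status": "pending", "attempts": 1})
        append_snapshot(log_path, {"delivery_id": "d2", "status": "pending", "attempts": 1})
        append_snapshot(log_path, {"delivery_id": "d1", "status": "delivered", "attempts": 2})
        return load_latest_state(log_path)
```

Because every state change is a new line rather than an in-place update, a crash mid-write can at worst leave one unreadable trailing line, which the loader skips.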

Tech Stack: Python 3.12 standard library backend, JSONL persistence, unittest, existing Vite frontend left unchanged for this phase.


Task 1: Retry Queue Model And Delivery Semantics

Files:

  • Modify: src/cold_display_guard/webhooks.py

  • Test: tests/test_webhooks.py

  • Step 1: Write failing retry-queue tests

    Add tests for:

    • non-2xx direct delivery is treated as failure rather than success
    • failed direct delivery appends a pending retry snapshot
    • due retry success marks the queued item delivered
    • repeated retry failure increments attempts and eventually becomes dead_letter
  • Step 2: Run test to verify it fails

    Run: eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_webhooks.py -v

    Expected: FAIL because retry queue helpers and non-2xx handling do not exist yet.

  • Step 3: Implement minimal retry queue support

    In src/cold_display_guard/webhooks.py:

    • add webhook retry settings parsing
    • add retry snapshot load/append helpers
    • add in-memory retry store operations
    • treat only HTTP 2xx as successful delivery
    • enqueue failed direct deliveries
    • retry due queued deliveries with bounded exponential backoff and dead-letter cutoff
  • Step 4: Run test to verify it passes

    Run: eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_webhooks.py -v

    Expected: PASS

Task 2: Runtime And Manage API Integration

Files:

  • Modify: src/cold_display_guard/main.py

  • Modify: src/cold_display_guard/manage_api.py

  • Test: tests/test_main.py

  • Test: tests/test_manage_api.py

  • Step 1: Write failing integration tests

    Add tests for:

    • runtime delivery enqueues failed outbound webhooks and drains due retries
    • manual case handling uses the queue-aware sender
    • management API can list queued retry items
    • management API can manually trigger a retry drain and report results
  • Step 2: Run tests to verify they fail

    Run:

    • eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_main.py -v
    • eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_manage_api.py -v

    Expected: FAIL because the runtime and management API do not know about queue paths or drain actions yet.
  • Step 3: Implement minimal integration

    • add retry-queue path resolution to runtime and management API
    • make runtime direct sends queue-aware and drain due items each cycle
    • make case-handle callbacks/manual operations queue-aware
    • add GET /api/manage/webhooks/retries
    • add POST /api/manage/webhooks/retries/drain
  • Step 4: Run tests to verify they pass

    Run:

    • eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_main.py -v
    • eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_manage_api.py -v

    Expected: PASS

Task 3: Config Surface, Docs, And Final Verification

Files:

  • Modify: src/cold_display_guard/config.py

  • Modify: config/example.toml

  • Modify: README_zh.md

  • Test: tests/test_config.py

  • Step 1: Write failing config/doc tests

    Extend the config tests so that saved config output includes the retry queue sink path and retry policy settings.

  • Step 2: Run test to verify it fails

    Run: eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_config.py -v

    Expected: FAIL because retry queue config formatting does not exist yet.

  • Step 3: Implement config and docs updates

    • add defaults for retry queue sink path and retry policy settings
    • expose the non-secret retry config in manage config payload
    • document retry queue behavior, new log file, and manual drain/list endpoints
  • Step 4: Run targeted and full verification

    Run:

    • eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_config.py -v
    • eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_webhooks.py -v
    • eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_main.py -v
    • eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest tests/test_manage_api.py -v
    • eval "$(/opt/homebrew/bin/pyenv init -)" && PYTHONPATH=src python -m unittest discover -s tests -v

    Expected: PASS