Retries & delivery log
The gateway retries failed deliveries on a fixed, roughly exponential schedule. If the seventh attempt also fails, the delivery is marked permanently failed and must be requeued manually if you still want it.
What counts as success / failure
| Outcome from your endpoint | Treated as |
|---|---|
| HTTP 2xx within 10 seconds | Success |
| HTTP non-2xx (any 3xx / 4xx / 5xx) | Failure |
| Request times out at 10 seconds | Failure |
| TLS handshake error, DNS failure, connection refused | Failure |
The transport-level timeout is hard-coded at 10 seconds. Plan your receiver to acknowledge fast — persist the event to your own queue and process asynchronously, then return 200.
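One way to structure an acknowledge-fast receiver is sketched below. It is a minimal illustration, not a reference implementation: the in-process `queue.Queue` stands in for your own durable queue, and `handle_delivery`, `process_event`, and `worker` are hypothetical names that are not part of the gateway's API.

```python
import json
import queue
import threading

# Stand-in for a durable queue (a database table, a message broker, ...).
# An in-memory queue loses events on restart, so treat it as illustration only.
events: "queue.Queue[bytes]" = queue.Queue()
processed: list = []


def handle_delivery(body: bytes) -> int:
    """Called by your HTTP framework for each webhook POST.

    Does no business logic: it stores the raw body and returns right away,
    well inside the gateway's 10-second timeout.
    """
    events.put(body)
    return 200


def process_event(event: dict) -> None:
    """Hypothetical slow business logic (crediting a user, sending mail, ...)."""
    processed.append(event)


def worker() -> None:
    """Drains the queue in the background, independent of the HTTP response."""
    while True:
        body = events.get()
        try:
            process_event(json.loads(body))
        finally:
            events.task_done()


threading.Thread(target=worker, daemon=True).start()
```

The point of the split is that a slow or failing `process_event` can never turn into a timeout on the gateway side, since the 200 is returned before any processing starts.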
Backoff schedule
Every failure schedules the next attempt at a fixed delay from the previous one:
| Attempt | Delay since previous | Cumulative time after first attempt |
|---|---|---|
| 1 | — (initial delivery) | 0 |
| 2 | + 1 minute | 1 m |
| 3 | + 5 minutes | 6 m |
| 4 | + 30 minutes | 36 m |
| 5 | + 2 hours | 2 h 36 m |
| 6 | + 12 hours | 14 h 36 m |
| 7 | + 24 hours | 38 h 36 m |
After attempt 7 fails, the delivery moves to status='failed' (dead). The gateway does not auto-revive it — you'd need to either ship a fix to your receiver and manually requeue, or accept the loss.
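The schedule can be reproduced with a few lines of arithmetic, which is handy when estimating how long a receiver outage can last before deliveries go dead. The sketch below uses only the delays from the table; the names `RETRY_DELAYS`, `next_retry_at`, and `cumulative_delay` are hypothetical, and it assumes each delay is measured from the moment of the previous attempt, ignoring time spent waiting on the request itself.

```python
from datetime import datetime, timedelta
from typing import Optional

# Delay before attempts 2..7, taken from the schedule table.
RETRY_DELAYS = [
    timedelta(minutes=1),
    timedelta(minutes=5),
    timedelta(minutes=30),
    timedelta(hours=2),
    timedelta(hours=12),
    timedelta(hours=24),
]
MAX_ATTEMPTS = len(RETRY_DELAYS) + 1  # 7


def next_retry_at(attempts_made: int, last_attempt: datetime) -> Optional[datetime]:
    """Return when the next attempt is due, or None once all 7 have failed."""
    if not 1 <= attempts_made <= MAX_ATTEMPTS:
        raise ValueError("attempts_made must be between 1 and 7")
    if attempts_made == MAX_ATTEMPTS:
        return None  # delivery is dead; only a manual requeue revives it
    return last_attempt + RETRY_DELAYS[attempts_made - 1]


def cumulative_delay(attempt: int) -> timedelta:
    """Total scheduled time between attempt 1 and the given attempt."""
    return sum(RETRY_DELAYS[: attempt - 1], timedelta())
```

For example, `cumulative_delay(7)` equals `timedelta(hours=38, minutes=36)`, matching the last row of the table.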
Why this schedule
The pattern is "fast then slow" on purpose:
- The first two retries (attempts 2 and 3, within 6 minutes of the initial delivery) catch transient network blips and deploys.
- The mid-range delays (30 m, 2 h) handle longer outages or partial deploys.
- The long tail (12 h, 24 h) gives someone on-call enough time to actually notice and fix a receiver before the delivery is lost forever.
Dedupe by envelope id
Every retry of the same delivery uses the same envelope id (the delivery record's id) and the same X-CPG-Delivery header value. The body bytes are byte-for-byte identical across retries — only the X-CPG-Timestamp and signature change (since the signature includes the timestamp).
Your receiver should treat id as a primary key: process each id once, return 2xx immediately on duplicates. Otherwise you risk double-crediting users when a retry arrives after you've already processed the original.
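A minimal sketch of the primary-key approach, assuming a SQLite table as the dedupe store. The table name `seen_deliveries` and the functions `claim_delivery` and `handle_once` are hypothetical; `delivery_id` is the envelope id read from the incoming request.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # use a file or your real database in practice
db.execute("CREATE TABLE IF NOT EXISTS seen_deliveries (id TEXT PRIMARY KEY)")


def claim_delivery(delivery_id: str) -> bool:
    """Return True the first time an id is seen, False on every retry.

    The PRIMARY KEY constraint makes check-and-insert a single statement,
    so two overlapping retries cannot both claim the same id.
    """
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "INSERT OR IGNORE INTO seen_deliveries (id) VALUES (?)",
            (delivery_id,),
        )
    return cur.rowcount == 1


def handle_once(delivery_id: str, credit_user) -> int:
    """credit_user is a hypothetical callback holding your business logic."""
    if claim_delivery(delivery_id):
        credit_user()
    return 200  # duplicates are acknowledged too, so the gateway stops retrying
```

In a real receiver you would likely record the id in the same database transaction as the side effect itself, so that a crash between the two cannot leave an id claimed but unprocessed.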
Delivery log UI
Your dashboard's Webhook deliveries page shows every attempt in real time:
- Pending / Delivering / Delivered / Failed (dead) — current status. A row waiting for a retry shows as Pending (with a non-zero attempt count and a future retry time); there is no distinct "retrying" status.
- Event type and timestamp — what fired and when.
- Most recent response — the HTTP status code of the last attempt (e.g. 500), or blank when the request never received a response (DNS / TLS / connection / timeout). The response body is not stored — only the status code is retained.
- Attempt count — 0/7 for a fresh delivery awaiting its first attempt (the counter increments after each attempt); 7/7 for a permanently-failed one.
- Retry button — visible on failed and dead rows. Clicking it resets the row to pending with attempt_count=0, clears the previous response, and sets an immediate next_retry_at. The next worker tick (~10 s) picks it up.
A "manual" retry uses the same body bytes as the original — so if your receiver has already processed that id, it will dedupe correctly without any extra work on your end.
When to manually requeue
The retry button is most useful in these scenarios:
- You shipped a fix to your receiver after the delivery had already exhausted automatic retries. Click retry and the original event lands again.
- You wiped your local event store and want to re-process recent events. Delivery rows are not automatically pruned, so you can requeue an old delivery as long as its webhook still exists.
- You're debugging during development and want to repeatedly receive the same event for testing.
When a delivery is not delivered
If a transient infrastructure issue prevents an event from being enqueued at the moment it fires, that event may be dropped — the gateway favours availability over delivery guarantees. The transaction itself is always recorded, so your /v1/transactions poll will still surface the deposit/withdrawal. For mission-critical accounting, treat webhook events as advisories and reconcile against /v1/transactions (or the per-resource GETs) on a slow loop.
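The reconciliation loop can be as simple as the sketch below. The response shape of /v1/transactions is not described in this section, so the code assumes a hypothetical `fetch_transactions` callable that wraps the GET and yields dicts carrying a unique `"id"` key; `reconcile` and `local_ledger` are likewise illustrative names.

```python
from typing import Callable, Dict, Iterable, List


def reconcile(
    fetch_transactions: Callable[[], Iterable[dict]],
    local_ledger: Dict[str, dict],
) -> List[dict]:
    """Add any transaction the ledger lacks and return the ones that were missing.

    fetch_transactions is assumed to wrap a GET to /v1/transactions;
    the "id" key on each transaction is an assumption, not documented here.
    """
    missed = []
    for tx in fetch_transactions():
        if tx["id"] not in local_ledger:
            local_ledger[tx["id"]] = tx
            missed.append(tx)
    return missed
```

Run on a timer, this catches any deposit or withdrawal whose webhook was dropped or went dead, and it is safe to repeat because transactions already in the ledger are skipped.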