Webhooks: How to Design for Missed Events

Shop Integrations Team

Webhooks are the backbone of real-time integrations, but they are not perfectly reliable. Network failures, server downtime, and rate limiting mean webhooks can be missed, duplicated, or delivered out of order. Here is how to design systems that stay consistent even when webhooks fail.

Why Shopify Webhooks Can Fail

Shopify webhooks are reliable in practice, but delivery is best-effort and not guaranteed. These are the common failure modes:

1. Endpoint Downtime

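Your webhook endpoint might be down during deployments, infrastructure failures, or scaling events. Shopify retries a failed delivery a limited number of times with increasing delays between attempts. The exact retry count and window are set by Shopify and have changed over time, so check the current webhook documentation. If every attempt within the retry window fails, the event is not delivered again and you have to recover it yourself.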

2. Network Timeouts

Shopify waits only a limited time for a response to each delivery (typically 5 seconds). If your endpoint does not respond within this window, the delivery counts as failed. It is retried only if attempts remain in the retry cycle; a timeout on the final attempt means the event is lost.
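One common way to stay inside the timeout is to do almost nothing during the request: store the raw body, acknowledge, and process afterwards. The sketch below assumes an in-memory queue purely for illustration (a production system would more likely use a durable queue or table), and `handle_delivery` and `drain` are hypothetical names, not part of any Shopify library.

```python
import queue

# Hypothetical in-memory buffer; swap for a durable queue in real deployments.
pending = queue.Queue()


def handle_delivery(raw_body: bytes) -> int:
    """Called inside the HTTP request: enqueue the body and acknowledge at once."""
    pending.put(raw_body)
    return 200  # status code to send back immediately


def drain(process) -> int:
    """Called by a background worker: process everything currently queued.

    Returns the number of deliveries handled.
    """
    count = 0
    while True:
        try:
            body = pending.get_nowait()
        except queue.Empty:
            return count
        process(body)
        count += 1
```

The slow work (parsing, database writes, calls to other services) happens in `drain`, so it no longer competes with the delivery timeout.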

3. Rate Limiting

If your endpoint receives more webhooks than it can handle, it might return 429 (Too Many Requests) or drop connections. Shopify treats either as a delivery failure and retries, and those retries add load to an endpoint that is already saturated, which can compound the problem.

4. Out-of-Order Delivery

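Webhooks can arrive out of order. An order-updated webhook might arrive before the order-created webhook for the same order. This is especially common during high traffic or when retries are involved, because a retried delivery can land after newer events have already been processed.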

Design Pattern 1: Webhook + Polling Hybrid

Do not rely solely on webhooks. Use webhooks for real-time notifications, but have a polling mechanism as a fallback to catch missed events.
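A minimal sketch of the polling side, assuming each record carries an `id` and an `updated_at` datetime. The `fetch_updated_since` callable is a hypothetical stand-in for whatever API query you use to list recently changed records; it is not a real client method.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable


def reconcile(
    fetch_updated_since: Callable[[datetime], Iterable[dict]],
    local_store: dict,
    last_poll: datetime,
    overlap: timedelta = timedelta(minutes=5),
) -> list:
    """Pull records changed since the last poll and repair what webhooks missed.

    The overlap re-reads a small window before `last_poll` so that records
    updated right around the previous poll are not skipped.
    Returns the ids that had to be repaired.
    """
    repaired = []
    for record in fetch_updated_since(last_poll - overlap):
        known = local_store.get(record["id"])
        # Missing locally, or local copy is older than the source of truth.
        if known is None or known["updated_at"] < record["updated_at"]:
            local_store[record["id"]] = record
            repaired.append(record["id"])
    return repaired
```

Run on a schedule, this keeps webhooks as the fast path while the poll bounds how long a missed event can go unnoticed. The list of repaired ids is also a useful signal: if it is regularly non-empty, webhooks are being dropped.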

Design Pattern 2: Event Replay

Store all webhook payloads in an event log before processing. If you discover missed events or processing errors, you can replay events from the log.
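As an illustration, an event log can be as simple as an append-only table with a processed flag. The sketch below uses SQLite from the Python standard library; the table name, column names, and the `open_log`, `append_event`, and `replay` helpers are all assumptions made for this example.

```python
import json
import sqlite3


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the event log table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS webhook_events ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " topic TEXT NOT NULL,"
        " payload TEXT NOT NULL,"
        " processed INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def append_event(conn: sqlite3.Connection, topic: str, payload: dict) -> int:
    """Persist the payload before any processing; returns its sequence number."""
    cur = conn.execute(
        "INSERT INTO webhook_events (topic, payload) VALUES (?, ?)",
        (topic, json.dumps(payload)),
    )
    conn.commit()
    return cur.lastrowid


def replay(conn: sqlite3.Connection, handler, only_unprocessed: bool = True) -> int:
    """Feed logged events to `handler` in arrival order; returns how many ran."""
    where = "WHERE processed = 0" if only_unprocessed else ""
    rows = conn.execute(
        f"SELECT seq, topic, payload FROM webhook_events {where} ORDER BY seq"
    ).fetchall()
    for seq, topic, payload in rows:
        handler(topic, json.loads(payload))
        # Mark and commit per event so a crash leaves later events unprocessed.
        conn.execute("UPDATE webhook_events SET processed = 1 WHERE seq = ?", (seq,))
        conn.commit()
    return len(rows)
```

Passing `only_unprocessed=False` replays the full history, which is only safe if the handler is idempotent.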

Design Pattern 3: Idempotency Keys

Process every webhook payload with an idempotency key to ensure duplicate deliveries or replays do not cause duplicate side effects.
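A minimal sketch of the idea, assuming the key is a stable identifier for the delivery. Shopify sends an `X-Shopify-Webhook-Id` header that is commonly used for this, but verify which identifier suits your case. The `processed_keys` table and the `process_once` helper are hypothetical names for this example.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the table of idempotency keys already handled."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS processed_keys (key TEXT PRIMARY KEY)")
    return conn


def process_once(conn: sqlite3.Connection, key: str, side_effect) -> bool:
    """Run `side_effect` at most once per key. Returns True if it ran.

    The key insert and the side effect share one transaction: if the side
    effect raises, the key is rolled back so a later retry can succeed.
    """
    with conn:  # commits on normal exit, rolls back on exception
        try:
            conn.execute("INSERT INTO processed_keys (key) VALUES (?)", (key,))
        except sqlite3.IntegrityError:
            return False  # key already recorded: duplicate delivery or replay
        side_effect()
    return True
```

The primary-key constraint does the deduplication, so two workers racing on the same key cannot both record it. Note that the rollback only protects side effects written through the same database; calls to external services need their own idempotency handling.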

Design Pattern 4: Delivery Receipts

Track which webhooks you have successfully processed and alert on gaps.
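One way to detect gaps, sketched below, is to compare the ids you have receipts for against the ids the source of truth says exist (for example, from a periodic listing). `ReceiptLedger` is a hypothetical in-memory structure for illustration; a real ledger would be persisted.

```python
from dataclasses import dataclass, field


@dataclass
class ReceiptLedger:
    """Records which entity ids have had a webhook successfully processed."""

    processed: set = field(default_factory=set)

    def record(self, entity_id: int) -> None:
        """Call after a webhook for this entity has been fully processed."""
        self.processed.add(entity_id)

    def find_gaps(self, expected_ids) -> list:
        """Return expected ids with no receipt, sorted for stable alerting."""
        return sorted(set(expected_ids) - self.processed)
```

For example, after recording receipts for 101, 102, and 104, calling `find_gaps([101, 102, 103, 104])` reports 103 as missing. Any non-empty result is what you would send to your alerting system.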

Design Pattern 5: Ordering Guarantees

Handle out-of-order webhooks gracefully by tracking entity versions or timestamps.
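A minimal sketch of timestamp-based ordering, assuming each payload has an `id` and an ISO 8601 `updated_at` string with a numeric UTC offset. The `apply_if_newer` helper and the in-memory `store` are hypothetical.

```python
from datetime import datetime


def apply_if_newer(store: dict, payload: dict) -> bool:
    """Apply the payload only if it is newer than what is already stored.

    `store` maps entity id -> (timestamp, payload).
    Returns True if the payload was applied, False if it was stale or a duplicate.
    """
    incoming = datetime.fromisoformat(payload["updated_at"])
    current = store.get(payload["id"])
    if current is not None and current[0] >= incoming:
        return False  # an equal or newer version has already been applied
    store[payload["id"]] = (incoming, payload)
    return True
```

With this check, an update that arrives late is dropped instead of overwriting newer state, and an update that arrives before its create event simply becomes the first stored version. Timestamps with coarse resolution can tie, so a monotonically increasing version number is a stronger choice when one is available.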

Conclusion

Webhooks are powerful but imperfect. Production systems must assume that some webhooks will be missed, duplicated, or delivered out of order, and design accordingly. The patterns above are not optional extras; they are reliability requirements. Implement them from day one, not after your first production incident.
