
How To Ensure Systems Do What We Want And Take Care Of Themselves

talk 22 connections

Michał Zajączkowski de Mezer's wroclove.rb 2022 talk. Language-agnostic with no code; a message-passing abstraction is used throughout. Core recipe: 'at-least-once + idempotence' enables components to self-heal and resume workflows after failure. Walks through processing guarantees / delivery semantics (at most once, at least once, exactly once) and argues that exactly-once is what we almost always want but is hard to achieve without the recipe.
Retry guidance: use exponential backoff so retries don't kill overloaded receivers; distinguish expected from unexpected errors and alert only on the unexpected, using metrics/alarms to detect tendencies in the expected ones; decide retry duration based on the actor (for humans, ~1–5 fast retries; machines can retry for days or weeks, so systems heal while you're on vacation). Mentions timeouts, fail-fast, circuit breaker, back pressure, and rate limiting as further patterns.
On idempotence, exercises 'protocol thinking' against HTTP: read operations are naturally idempotent; deletes are idempotent, but receivers must recognize already-processed messages; PUT is idempotent only under a single-sender assumption, because multi-sender races (lost update / mid-air collision) require optimistic locking with a version parameter; POST creation requires idempotency keys (unique tokens generated by the sender and indexed by the receiver) to avoid duplicates on retry. Also covers sender-chosen IDs via PUT with UUIDv4 as an alternative, and the fallback 'find-or-create' (read-check-write) pattern, with its race and consistency caveats, for when the receiver is a third party without idempotency-key support; sometimes the best remedy is to negotiate a feature request or switch API providers.
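The talk itself contains no code, so the following is only a minimal Python sketch of the 'at-least-once + idempotence' recipe, not the speaker's implementation. The `Receiver` class, `create_order`, `send_with_retry`, and the delay values are all hypothetical names and numbers chosen for illustration. The sender generates one idempotency key and reuses it on every retry, waiting with exponential backoff between attempts; the receiver indexes processed keys so a retried creation produces no duplicate.

```python
import time
import uuid


class Receiver:
    """Hypothetical in-memory receiver that indexes requests by idempotency key."""

    def __init__(self, lost_responses=0):
        self.orders = []                      # resources actually created
        self.seen = {}                        # idempotency key -> stored response
        self.lost_responses = lost_responses  # how many responses to "lose"

    def create_order(self, idempotency_key, payload):
        # Only the first message with a given key creates anything.
        if idempotency_key not in self.seen:
            self.orders.append(payload)
            self.seen[idempotency_key] = {"order_id": len(self.orders)}
        # Simulate the hard case: the work was done but the reply never arrived.
        if self.lost_responses > 0:
            self.lost_responses -= 1
            raise TimeoutError("response lost after processing")
        return self.seen[idempotency_key]


def backoff_delays(base=0.1, factor=2.0, cap=30.0, attempts=5):
    """Exponentially growing delays in seconds, limited by `cap`."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def send_with_retry(receiver, payload, sleep=time.sleep, attempts=5):
    """At-least-once delivery: retry until a response arrives or attempts run out."""
    key = str(uuid.uuid4())  # generated once by the sender, reused on every retry
    last_error = None
    for delay in backoff_delays(attempts=attempts):
        try:
            return receiver.create_order(key, payload)
        except TimeoutError as err:
            last_error = err
            sleep(delay)  # back off so an overloaded receiver can recover
    raise last_error
```

In this sketch the sender cannot tell a lost request from a lost response, which is why it simply retries; the duplicate check on the receiver is what turns at-least-once delivery into an effectively exactly-once outcome.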
Q&A covers: retrying a POST whose source-of-truth entity changed in the meantime (apply optimistic locking with version 0); working with third-party APIs that support neither idempotency keys nor find-or-create (negotiate or accept duplicates); and why overworked receivers at scale make idempotency hard — addressed via CAP-theorem-style sharding with a deterministic hash-based partitioner so concurrent writes on the same entity land on the same partition.
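The deterministic hash-based partitioner mentioned in the Q&A can be sketched roughly as below. This is an illustrative assumption rather than anything shown in the talk: `partition_for`, the choice of SHA-256, and the partition count are all hypothetical. The point is only that the same entity ID always maps to the same partition, so concurrent writes to one entity are handled in one place.

```python
import hashlib


def partition_for(entity_id: str, partitions: int) -> int:
    """Map an entity ID to a partition index in the range [0, partitions)."""
    if partitions <= 0:
        raise ValueError("partitions must be positive")
    # Use a stable digest: Python's built-in hash() is salted per process for
    # strings, so it would not give the same answer on every node.
    digest = hashlib.sha256(entity_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


# Two writers touching the same entity are routed to the same partition.
writer_a = partition_for("order-42", partitions=8)
writer_b = partition_for("order-42", partitions=8)
```

Note that a plain modulo scheme like this remaps most entities when the partition count changes; how resharding is handled is outside what the summary describes.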

date
2022-03-11
type
talk
talk How To Ensure Systems Do What We Want And Take Care Of Themselves
about
Recipe is the central thesis of the talk.
talk How To Ensure Systems Do What We Want And Take Care Of Themselves
about
Introduces at-most-once, at-least-once, and exactly-once as the framing vocabulary.
talk How To Ensure Systems Do What We Want And Take Care Of Themselves
about
Recommended default backoff strategy for retries.
talk How To Ensure Systems Do What We Want And Take Care Of Themselves
about
Argues for splitting error reporting into expected (metrics/alarms) vs unexpected (team alerts).
talk How To Ensure Systems Do What We Want And Take Care Of Themselves
about
The discipline used to exercise HTTP verbs against failure scenarios.
talk How To Ensure Systems Do What We Want And Take Care Of Themselves
about
Idempotence concept
Second half of the talk is dedicated to idempotence at the protocol level.
talk How To Ensure Systems Do What We Want And Take Care Of Themselves
about
Walks through GET, DELETE, PUT, POST under the protocol-thinking lens.
talk How To Ensure Systems Do What We Want And Take Care Of Themselves
about
Presented as the protocol-level solution to lost-update races on PUT/DELETE.
talk How To Ensure Systems Do What We Want And Take Care Of Themselves
about
Race condition motivating optimistic locking.
talk How To Ensure Systems Do What We Want And Take Care Of Themselves
about
Idempotency Key concept
Convention for making POST creation idempotent.
talk How To Ensure Systems Do What We Want And Take Care Of Themselves
about
Fallback when a third-party API lacks idempotency-key support.
talk How To Ensure Systems Do What We Want And Take Care Of Themselves
about
Sharding concept
Raised in Q&A as the CAP-theorem answer to race-load scaling.
asked_at
How To Ensure Systems Do What We Want And Take Care Of Themselves talk
Second audience question in the Q&A.
asked_at
How To Ensure Systems Do What We Want And Take Care Of Themselves talk
Third audience question in the Q&A.
asked_at
How To Ensure Systems Do What We Want And Take Care Of Themselves talk
First audience question in the Q&A.
authored
How To Ensure Systems Do What We Want And Take Care Of Themselves talk
Single-speaker presentation delivered at wroclove.rb 2022.
from_talk
How To Ensure Systems Do What We Want And Take Care Of Themselves talk
Single 'remember-this-one-thing' takeaway of the presentation.
from_talk
How To Ensure Systems Do What We Want And Take Care Of Themselves talk
Advice given in the retry section.
from_talk
How To Ensure Systems Do What We Want And Take Care Of Themselves talk
Advice given in the error-reporting section.
from_talk
How To Ensure Systems Do What We Want And Take Care Of Themselves talk
Guidance on when to stop retrying based on human vs machine senders.
from_talk
How To Ensure Systems Do What We Want And Take Care Of Themselves talk
Advice for integrating with third parties lacking idempotency support.
talk How To Ensure Systems Do What We Want And Take Care Of Themselves
presented_at
Talk delivered at wroclove.rb 2022 on 2022-03-11.

Provenance

Created
2026-04-17 16:17 seed
Read by
14 extractions