Lesson from the Lemonade rewrite: keep the two phases asynchronous. Recording old-flow samples (quote snapshots + HTTP interactions) runs continuously in production under a feature flag; verification runs separately, diffs the new flow against the old, and can be re-run after each fix to the new flow. Gradually polish the rough edges instead of chasing a big-bang correctness proof.