Per-core clock speeds have stalled against physical limits, so modern CPUs gain performance by executing many instructions in parallel, which means speculating past branch instructions before their outcomes are known. Every conditional branch (an if or a conditional goto) adds an outcome the branch predictor has to guess: when the prediction misses, all the speculated work must be thrown away and execution restarts on the correct path. As a result, branch-free code that does more arithmetic is often faster than branchy code that does less work, particularly when the branches are hard to predict. Wójtowicz demonstrates this with a leap-year function where the obvious early-return implementation is 3× slower than an equivalent branch-free version.
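To make the contrast concrete, here is a minimal sketch in C of the two styles of leap-year test. It is an illustration under assumptions, not Wójtowicz's actual code: the function names (is_leap_branchy, is_leap_branchless, check_leap_variants) are hypothetical, and the branch-free variant simply replaces short-circuit logic with bitwise operators so that all three remainder tests are always evaluated.

```c
#include <assert.h>
#include <stdbool.h>

/* Early-return version: up to three conditional branches, and which
 * ones are taken depends on the input year. */
bool is_leap_branchy(unsigned year) {
    if (year % 4 != 0)   return false;  /* most years exit here */
    if (year % 100 != 0) return true;   /* divisible by 4 but not by 100 */
    return year % 400 == 0;             /* century years need 400 */
}

/* Branch-free sketch (hypothetical, not the author's code): every
 * comparison yields 0 or 1, and bitwise & and | combine them without
 * the short-circuit jumps that && and || would imply in the source. */
bool is_leap_branchless(unsigned year) {
    return ((year % 4 == 0) & (year % 100 != 0)) | (year % 400 == 0);
}

/* Self-check: both variants must agree on every year in a sample range. */
void check_leap_variants(void) {
    for (unsigned year = 0; year < 10000; ++year) {
        assert(is_leap_branchy(year) == is_leap_branchless(year));
    }
}
```

The branch-free version always performs all three remainder tests, so it does strictly more arithmetic, but it gives the predictor nothing to guess. Whether the compiled code is actually branch-free depends on the compiler and optimization level: an optimizer may already convert the early-return version into branchless instructions, or the reverse, so any speed difference should be measured on the target machine rather than assumed.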