Discipline Doesn't Scale

talk 27 connections

Markus Schirp's wroclove.rb 2026 conference talk, delivered with an assistant (Katie) physically throwing darts at a slide to illustrate the mental model. Not a Mutant talk: it refines a scribble-on-napkins argument Schirp typically uses in consulting conversations. Core thesis: discipline doesn't scale. Any organization, whether it has 3 developers or 5,000, will screw up, and only automation helps.

Introduces three thresholds every software system has: (1) ecosystem threshold: what the base language gives you batteries-included (Haskell's type system very high; C somewhat lower; assembly lowest; Ruby essentially 'it parsed and eventually booted'); (2) automation threshold: tests, linters, TDD, DDD, CI and all quality gates layered on top; (3) contribution threshold: whether merging the change actually helps the company.

Each contribution is a dart. Darts landing below the ecosystem threshold are automatically rejected (syntax errors, failed boot). Darts between the ecosystem and automation thresholds are caught by the quality gates. Darts above the contribution threshold land green (helpful merges). Darts in the red zone between automation and contribution are the most dangerous: CI passes, the merge button is clicked, and things break in production (wrong DB schema, forgotten mobile membership schema update, etc.). The gap between automation and contribution is where discipline lives, and discipline is a statistically losing bet at scale. Developers (Schirp includes himself) are overconfident about the shape of their personal dart distribution; integrated over time, everyone regresses to the same Gaussian-ish curve. LLMs simply throw many more darts, so even the same distribution produces far more red hits. The fix is raising the automation threshold closer to the contribution threshold so that more darts are auto-rejected.

In Ruby this matters especially: the ecosystem threshold is very low, the Ruby ecosystem has spearheaded TDD and similar tooling precisely because the language gives so little, and runtime hazards (monkey patches leaking from third-party gems, redefined division operators, method_missing, eval) keep dragging it down. Schirp patches out eval and method_missing and freezes core classes in his production systems, but says a big VM-level 'harden' method should exist. Recommends raising the automation threshold with Mutant (mutation testing), Sorbet/RBS type systems (cites Shopify as proof it helps at scale; has seen them retrofitted with good effects), and property-based testing (an open field in Ruby: he would love someone to write and popularize a property-testing framework; invariants like 'reverse doesn't change length' or 'sum of line items >= any single item' produce phenomenal business-logic tests; god-tier in Haskell/finance).

Closes on the LLM point: LLMs are an amplifier of existing patterns, not specifically bad; since we can never eliminate the red area (that would be AGI, and everyone's out of a job), we must reduce its ratio.

Q&A covers: what counts as 'shit in production' (long-lasting damage like a wrong DB schema discovered five years later, not a trivial bug); how scale changes the problem (same problem, three-person teams just don't see the distribution); product-quality 10× amplification (features nobody asked for); why the base language matters (bridging a strong type system to a good contribution threshold is much less work than bridging from 'it parsed'); why defaults and the standard library matter (systems regress to defaults); why Schirp reaches for Rust as an '80% Haskell, easy sell' when he can't pick Haskell for economic reasons; when to invest in culture vs automation (culture is a second-order encoding of discipline: fix the deterministic automation first; culture is celebrated but doesn't move the needle).
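The dartboard model in the summary can be made concrete with a small simulation. This is an illustrative sketch, not something shown in the talk: the threshold values, the standard-normal dart distribution, the dart counts, and the names `Thresholds`, `gaussian`, `zone`, and `throw_darts` are all assumptions. It treats darts at or above the contribution threshold as green and darts between the automation and contribution thresholds as red.

```ruby
# Hypothetical sketch of the dartboard model; every number here is made up.
Thresholds = Struct.new(:ecosystem, :automation, :contribution)

# Draw one approximately Gaussian score via the Box-Muller transform.
def gaussian(rng, mean, stddev)
  u1 = 1.0 - rng.rand # in (0, 1], keeps Math.log finite
  u2 = rng.rand
  mean + stddev * Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math::PI * u2)
end

# Classify a single dart by the zone it lands in.
def zone(score, thresholds)
  return :rejected_by_ecosystem if score < thresholds.ecosystem
  return :rejected_by_automation if score < thresholds.automation
  return :red if score < thresholds.contribution

  :green
end

# Throw `count` darts from the same distribution and tally the zones.
def throw_darts(count, thresholds, seed: 42)
  rng = Random.new(seed)
  tally = Hash.new(0)
  count.times { tally[zone(gaussian(rng, 0.0, 1.0), thresholds)] += 1 }
  tally
end

low_gate  = Thresholds.new(-2.0, -1.0, 0.5) # wide red zone
high_gate = Thresholds.new(-2.0, 0.3, 0.5)  # automation raised toward contribution

few_darts  = throw_darts(1_000, low_gate)
many_darts = throw_darts(20_000, low_gate)
gated      = throw_darts(20_000, high_gate)

puts "red hits,  1,000 darts, low gate:  #{few_darts[:red]}"
puts "red hits, 20,000 darts, low gate:  #{many_darts[:red]}"
puts "red hits, 20,000 darts, high gate: #{gated[:red]}"
```

Under these assumptions, the distribution is identical in all three runs. More darts produce more red hits, and only moving the automation threshold shrinks the red zone, while the green count stays the same.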

date

2026-04-17

type

talk

talk Discipline Doesn't Scale

about

Contribution Threshold Model concept

Talk is structured around walking through the dartboard/threshold model.

talk Discipline Doesn't Scale

about

Ecosystem Threshold concept

One of the three thresholds introduced in the model.

talk Discipline Doesn't Scale

about

Automation Threshold concept

Central lever the talk advocates raising.

talk Discipline Doesn't Scale

about

Contribution Threshold concept

Goal line above which merges help the company.

talk Discipline Doesn't Scale

about

Ruby tool

Argues Ruby's low ecosystem threshold makes the automation gap especially costly.
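A minimal sketch of the kind of hardening the summary mentions (freezing classes so later monkey patches fail loudly). It is not Schirp's production setup: the `Price` class is a hypothetical stand-in for a core class and `attempt_monkey_patch` is a made-up helper simulating a third-party gem. It relies only on standard Ruby behavior, where defining a method on a frozen class raises `FrozenError`.

```ruby
# Stand-in for a core class. Freezing real core classes is a process-wide
# decision, so this sketch uses a hypothetical Price class instead.
class Price
  attr_reader :cents

  def initialize(cents)
    @cents = cents
  end

  # Integer division of the amount, returning a new Price.
  def /(other)
    Price.new(cents / other)
  end
end

# Once frozen, the class can no longer be reopened and patched.
Price.freeze

# Simulates a third-party gem trying to redefine division after boot.
def attempt_monkey_patch
  Price.class_eval do
    def /(_other)
      Price.new(0)
    end
  end
  :patched
rescue FrozenError
  :rejected
end

puts attempt_monkey_patch
puts((Price.new(1_000) / 4).cents)
```

Instances stay usable after the freeze; only changes to the class itself are rejected, which turns a silent runtime hazard into an immediate error.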

talk Discipline Doesn't Scale

about

Large Language Models concept

LLMs multiply the number of contributions and stress the automation–contribution gap.

talk Discipline Doesn't Scale

about

Mutant tool

Listed as one of the tools that can raise Ruby's automation threshold.

talk Discipline Doesn't Scale

about

Sorbet tool

Cited as a way to raise the automation threshold in Ruby; Shopify's usage is offered as evidence that it helps at scale.

talk Discipline Doesn't Scale

about

RBS tool

Mentioned alongside Sorbet as Ruby typing tooling that raises the automation threshold.

talk Discipline Doesn't Scale

about

Property-Based Testing concept

Open opportunity in Ruby; Schirp calls for a good library and popularization.
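The two invariants quoted in the summary can be checked with a hand-rolled property test in plain Ruby. This is a sketch under assumptions, not a framework: `counterexample`, `INT_ARRAYS`, and `LINE_ITEMS` are hypothetical names, line item amounts are assumed to be non-negative integers in cents, and a real property-testing library would add shrinking and better reporting.

```ruby
# Returns the first generated input that violates the property,
# or nil if the property held for every generated input.
def counterexample(generator, runs: 500, seed: 2026)
  rng = Random.new(seed)
  runs.times do
    input = generator.call(rng)
    return input unless yield(input)
  end
  nil
end

# Generator: arrays of 0..20 small integers.
INT_ARRAYS = ->(rng) { Array.new(rng.rand(0..20)) { rng.rand(-1_000..1_000) } }

# Generator: non-empty lists of non-negative line item amounts in cents
# (assumption: no negative line items such as refunds).
LINE_ITEMS = ->(rng) { Array.new(rng.rand(1..10)) { rng.rand(0..50_000) } }

# Invariant: reversing a list does not change its length.
reverse_failure = counterexample(INT_ARRAYS) { |xs| xs.reverse.length == xs.length }

# Invariant: the order total is at least as large as any single line item.
total_failure = counterexample(LINE_ITEMS) { |items| items.sum >= items.max }

# A deliberately false property, to show a counterexample being found.
sort_failure = counterexample(INT_ARRAYS) { |xs| xs.sort == xs }

puts "reverse keeps length: #{reverse_failure.nil? ? 'held' : reverse_failure.inspect}"
puts "total covers items:   #{total_failure.nil? ? 'held' : total_failure.inspect}"
puts "sort is identity:     #{sort_failure.nil? ? 'held' : 'violated'}"
```

The properties are stated once and exercised against many generated inputs, which is what lets them cover business-logic cases nobody thought to write by hand.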

talk Discipline Doesn't Scale

about

Mutation Testing concept

Named as a way to reduce wiggle room and raise the automation threshold.
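The idea behind mutation testing can be sketched by hand: change the code slightly and see whether the tests notice. This is not Mutant's API (tools like Mutant generate and run such mutations automatically); the free-shipping rule, the two suites, and the `killed?` helper are hypothetical names chosen for illustration.

```ruby
# Code under test: free shipping from 50.00 upward (amounts in cents).
ORIGINAL = ->(total_cents) { total_cents >= 5_000 }

# A hand-written mutant: the boundary operator is changed from >= to >.
MUTANT = ->(total_cents) { total_cents > 5_000 }

# Each "suite" returns true when all of its assertions pass.
# The weak suite never checks the boundary value.
WEAK_SUITE = lambda do |free_shipping|
  free_shipping.call(10_000) == true && free_shipping.call(100) == false
end

# The strong suite adds the boundary case.
STRONG_SUITE = lambda do |free_shipping|
  WEAK_SUITE.call(free_shipping) && free_shipping.call(5_000) == true
end

# A mutant is killed when a suite that passes on the original fails on it.
def killed?(suite, original, mutant)
  raise ArgumentError, "suite must pass on the original" unless suite.call(original)

  !suite.call(mutant)
end

puts "weak suite kills mutant:   #{killed?(WEAK_SUITE, ORIGINAL, MUTANT)}"
puts "strong suite kills mutant: #{killed?(STRONG_SUITE, ORIGINAL, MUTANT)}"
```

A surviving mutant marks wiggle room: behavior the code could change without any test failing, which is exactly the space where the automation threshold sits below the contribution threshold.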

talk Discipline Doesn't Scale

about

Haskell tool

Used as the canonical example of a very high ecosystem threshold.

talk Discipline Doesn't Scale

about

Rust tool

Schirp's real-world greenfield choice cited during Q&A.

question What counts as 'shit in production'?

asked_at

Discipline Doesn't Scale talk