Fantastic Databases and Where to Find Them

talk 26 connections

Krzysztof 'Chris' Hasiński's wroclove.rb 2023 talk framed around the 'here be dragons' metaphor (and the Lennox Globe's Latin 'HC SVNT DRACONES'): Rails developers treat databases as exotic storage boxes but most real work happens there. Observes that the canonical test pyramid doesn't apply to Rails — most Rails tests are integration tests because they hit the DB. Tours: (1) SQLite — public domain, extremely fast (the sqlite.org site itself serves ~20M SQL queries/day on a 7-year-old VM shared with 23 apps), embeddable, trivial backups (copy the file to run 5 parallel test DBs); downsides: process-level locking pushes you toward processes-over-threads, single writer forces retry logic, no full-text search. (2) PostgreSQL — the 'elephant in the room'; cites Stephan Schmidt's 'Use One Big Server / Postgres for Everything' article showing Postgres can replace GraphQL, cache, queue, etc. Highlights unlogged tables (supported in Rails 6 migrations, very fast but wiped on restart) — adding `create_unlogged_tables` to RSpec shaved ~3.5 minutes off a bad suite and ~20s off a good one. Window functions (OVER/PARTITION BY, ROW_NUMBER, cumulative SUM with ROWS BETWEEN) are 'parallel calculations' usable from Rails and much faster than doing it in Ruby; pattern-matches well with Ruby. EXPLAIN (with ANALYZE, BUFFERS) explained in detail: strategies as a tree (seq scan, hash, hash join), cost-to-first-row vs total cost, row-count estimates, width; use transaction+rollback for EXPLAIN ANALYZE on inserts/updates/deletes; BUFFERS will be default in Postgres 16/17. Speaker wrote a gem exposing `estimate_count` via EXPLAIN without requiring migrations. Indexes: one index per query (for filter or sort or both), PgHero (by Andrew Kane) for missing-index detection, vacuum analyze to monitor usage; try all column orders for geodata; partial indexes whose WHERE mirrors the query; covering indexes via INCLUDE for index-only scans. Extensions: PostGIS for geo, TimescaleDB for time series, and especially Citus (bundles columnar storage, distributed tables, parallel querying). Columnar storage = append-only, 10× compression, dramatic speedup on analytical reads; mix-and-match with row tables (cold partitions columnar, newest row-based). Distributed tables via `create_distributed_table` on a distribution column (e.g. customer_id) plus reference/collocated tables. (3) MySQL — superfast, supports in-memory tables (Postgres unlogged still writes sometimes); uses update-in-place + rollback segment for MVCC, unlike Postgres's write-new-row + stale-marking that requires VACUUM — faster because most transactions commit quickly. Now has JSON operators, window functions, full-text search — a serious alternative; Basecamp/HEY use it. (4) Exotic: CockroachDB — Citus-like auto-sharding + geo-aware distribution with hotspot detection, survives datacenter outages. ClickHouse — Russian, Yandex-origin, columnar, pre-computed aggregates, multiple table engines; Cloudflare uses it for ~6M inserts/second. Closes that NoSQL/document/graph is a 'part two for next year'. Images generated with Midjourney ('database as an animal'). Q&A: EXPLAIN cost values are 'story points' — unitless defaults you can tune in postgres.conf for your hardware; unlogged tables are wiped on DB restart/crash (fine for CI, confusing on local dev after Docker restarts).

type

talk

talk Fantastic Databases and Where to Find Them

about

SQLite tool

Covers SQLite as a surprisingly capable small database.

talk Fantastic Databases and Where to Find Them

about

PostgreSQL tool

Bulk of the talk explores Postgres features.

talk Fantastic Databases and Where to Find Them

about

MySQL tool

Framed as a serious modern alternative with faster MVCC.

talk Fantastic Databases and Where to Find Them

about

Citus tool

Highlighted Postgres extension for columnar storage, distributed tables and parallel querying.

talk Fantastic Databases and Where to Find Them

about

CockroachDB tool

Presented as a distributed, geo-aware 'auto-Citus' exotic database.

talk Fantastic Databases and Where to Find Them

about

ClickHouse tool

Presented as an exotic columnar analytical database from Yandex.

talk Fantastic Databases and Where to Find Them

about

Window Functions concept

Dedicated section demonstrating window functions as a 'freakishly powerful' Postgres feature.

talk Fantastic Databases and Where to Find Them

about

Unlogged Tables concept

Unlogged tables used as a fast cache and test-speedup trick.

talk Fantastic Databases and Where to Find Them

about

EXPLAIN / EXPLAIN ANALYZE concept

Long walkthrough of how to read and use EXPLAIN/ANALYZE/BUFFERS.

talk Fantastic Databases and Where to Find Them

about

Database Indexes concept

Covers index rules, PgHero, partial and covering indexes.

talk Fantastic Databases and Where to Find Them

about

Partial Index concept

Recommends mirroring the query's WHERE in partial indexes.

talk Fantastic Databases and Where to Find Them

about

Covering Index concept

Mentions INCLUDE columns for index-only scans as under-used.

talk Fantastic Databases and Where to Find Them

about

Columnar Storage concept

Explains columnar storage via Citus for analytical workloads.

talk Fantastic Databases and Where to Find Them

about

Test Pyramid concept

Opens with the test pyramid vs the inverted shape to motivate why Rails tests are integration tests.

talk Fantastic Databases and Where to Find Them

about

MVCC Trade-offs in Postgres vs MySQL concept

Explains Postgres stale-row + VACUUM vs MySQL update-in-place + rollback segment.

talk Fantastic Databases and Where to Find Them

about

Ruby on Rails tool

Framed around Rails developers' relationship with databases.

talk Fantastic Databases and Where to Find Them

about

Postgres for Everything resource

Cites Stephan Schmidt's article listing Postgres replacing parts of a typical stack.

talk Fantastic Databases and Where to Find Them

about

Lennox Globe resource

Opening metaphor: 'HC SVNT DRACONES' on the Lennox globe equals databases on a Rails dev's mental map.

question EXPLAIN cost units meaning

asked_at

Fantastic Databases and Where to Find Them talk

Asked during the Q&A.

question Unlogged table durability on connection reset

asked_at

Fantastic Databases and Where to Find Them talk

Asked during the Q&A.

person Krzysztof Hasiński

authored

Fantastic Databases and Where to Find Them talk

Chris (Krzysztof) Hasiński delivered this talk.

takeaway Use Unlogged Tables to Speed Up Rails Test Suites

from_talk

Fantastic Databases and Where to Find Them talk

Takeaway from the unlogged-tables section.

takeaway Rails Tests Are Mostly Integration Tests

from_talk

Fantastic Databases and Where to Find Them talk

Opening framing argument.

takeaway Prefer SQL Window Functions over Ruby Aggregation

from_talk

Fantastic Databases and Where to Find Them talk

Core lesson of the window functions section.

takeaway Learn to Read EXPLAIN

from_talk

Fantastic Databases and Where to Find Them talk

Core lesson of the EXPLAIN section.

talk Fantastic Databases and Where to Find Them

presented_at

wroclove.rb 2023 event

Talk given at wroclove.rb 2023.

Provenance

Created: 2026-04-17 16:17 seed
Last updated in: Fantastic Databases and Where to Find Them — wroclove.rb ... 2026-04-17 22:11
Read by: 34 extractions