
How to handle a 20+ filter index endpoint with big datasets?


Final audience question.

Maciej: "Silver bullets only" (kidding); it depends on data size. Often Postgres or SQLite with proper indexes is enough. Load-test locally, on staging, and in production. If you outgrow that, a secondary index such as Elasticsearch is his choice, but joining Elasticsearch and Postgres results in the application can become the new bottleneck.

Stephen: "Big data" is usually smaller than you think: 30 GB is tiny, hundreds of GB is small. Users don't actually pick from 20 filters at random; monitor production, find the Pareto-hot filter clusters, and index the 3–6 combinations that cover 80% of queries. Minimize joins if you can.

Caio: Well-designed databases with proper indexes handle most cases without Elasticsearch. Also control the UX: one product limited how many filters could be combined after metrics showed no one used many at once. Rate-limiting internal APIs is often overlooked but saves headaches.

Maciej pushes back on minimizing joins: Postgres handles 20+ joins fine, so don't be afraid of them.
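Stephen's advice to find the hot filter combinations can be sketched roughly as follows. This is a minimal illustration in Python, not something shown at the panel: it assumes you already log which filters each request used, and the function name `hot_filter_combinations`, the filter names, and the sample log are all hypothetical.

```python
from collections import Counter


def hot_filter_combinations(requests, coverage=0.8):
    """Return the most-used filter combinations that together cover
    at least `coverage` of the logged requests, plus the share covered.

    `requests` is an iterable of filter-name lists, one per request
    (a hypothetical log format, assumed for this sketch).
    """
    # The order of filters within a request does not matter,
    # so normalize each request to a frozenset before counting.
    counts = Counter(frozenset(filters) for filters in requests)
    total = sum(counts.values())

    selected, covered = [], 0
    # Walk combinations from most to least frequent until the target
    # share of requests is covered.
    for combo, n in counts.most_common():
        if covered / total >= coverage:
            break
        selected.append((sorted(combo), n))
        covered += n

    return selected, (covered / total if total else 0.0)


# Made-up sample: 100 logged requests and the filters each one used.
sample_log = (
    [["status", "created_at"]] * 50
    + [["status"]] * 20
    + [["category", "status"]] * 15
    + [["price", "category", "brand"]] * 10
    + [["brand"]] * 5
)

combos, share = hot_filter_combinations(sample_log)
for combo, n in combos:
    print(combo, n)
print(f"covered: {share:.0%}")
```

On this made-up sample, three combinations account for 85% of requests, so those are the candidates for composite indexes. The sketch counts exact combinations only. In practice a B-tree index on `(status, created_at)` can also serve queries that filter on `status` alone, so the real number of indexes needed may be lower than the number of hot combinations.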

answer_summary
Start with Postgres or SQLite and proper indexes, load-test, find the Pareto-hot filter combinations, and add Elasticsearch only if you outgrow the relational database. Don't be afraid of joins or of limiting filter combinations in the UI.
question How to handle a 20+ filter index endpoint with big datasets?
asked_at
Final audience question.

Provenance

Created in
Performance Panel at wroclove.rb 2024 2026-04-17 23:20