Relations / Sql Indexing Cardinality

Least You Need to Know: Indexes, Selectivity, and Cardinality Intuition

Index and query-plan interview questions are mostly about search-space pruning. Good indexes cut down candidate rows quickly; bad selectivity, tiny tables, or the wrong leading columns can erase that advantage.

The least you need to know

Indexes help most when the query condition is selective and avoids scanning most rows.
Low-selectivity columns such as a two-value status flag often make poor standalone indexes.
For very small tables, a sequential scan can be cheaper than using an index.
Cardinality estimates guide join order because they approximate how many rows each step will produce.
For composite indexes, leading-column order matters because many engines use left-prefix matching behavior.

Key notation

selectivity fraction of rows a predicate keeps

cardinality estimated number of rows in an intermediate result

(a, b) composite index whose leading column is a

Tiny worked example

If a million-row users table has an index on `email`, an equality lookup by one exact email is highly selective.
If the same table has an index on `is_active` with values only `true` or `false`, each lookup may still touch a huge fraction of rows.
In a composite index on `(country, created_at)`, queries filtering by `country` can often use the index directly, while filtering only by `created_at` may not use the leading prefix efficiently.
Query planners compare such estimates when choosing scan and join strategies.

Common mistakes

Students often assume every index is automatically helpful.
Students often ignore row-count estimates when reasoning about join order.
Students often forget that composite-index order changes which predicates the index supports well.

How to recognize this kind of problem

If the predicate narrows to one or a few rows, an index is usually promising.
If a condition matches a huge chunk of the table, the index may not win.
When asked why the planner joins one table first, compare intermediate result sizes.