Algorithms / Coordinate Compression
Least You Need to Know: Coordinate Compression, Order Preservation, and Dense Indexing
Coordinate compression replaces sparse or huge values with a dense rank order while preserving **relative order comparisons**. It is a preprocessing bridge between messy input values and array-based data structures.
The least you need to know
- Compression maps distinct values to ranks `0..m-1` or `1..m`.
- The goal is to preserve ordering, not numeric gaps.
- Fenwick trees and segment trees often use compressed coordinates.
- Equal original values map to the same compressed value if duplicates are allowed.
- Compression is useful when coordinates are large but only few distinct ones appear.
Key notation
rank(x)
compressed index assigned to value x
sorted distinct values
basis for the compression mapping
dense index
small contiguous integer index set
Tiny worked example
- Suppose relevant x-values are `10, 1_000_000, 17, 10`.
- The sorted distinct values are `10, 17, 1_000_000`.
- A compression map might send them to `1, 2, 3`.
- Order comparisons still work, and array-based structures become practical.
Common mistakes
- Compression does not preserve original arithmetic distances.
- Students sometimes give equal values different ranks, which breaks duplicates handling.
- You must compress every value that might appear in queries or updates, not just those in the initial array.
How to recognize this kind of problem
- Input coordinates are huge or sparse but only relative order matters.
- An array-based data structure wants small contiguous indices.
- The phrase `compress values` or `rank transform` appears.