Algorithms / Indicator Variables
Least You Need to Know: Indicator Variables and Linearity of Expectation
Indicator variables turn counting questions into algebra. Define one indicator for each event of interest, sum them to count how many events occur, and then use linearity of expectation to compute the expected count. The key advantage is that independence is often unnecessary.
The least you need to know
- An indicator variable takes value 1 when an event happens and 0 otherwise.
- A total count can often be written as a sum of indicators.
- The expectation of an indicator is the probability of its event.
- Linearity of expectation says `E[X+Y]=E[X]+E[Y]` even without independence.
- This method is standard for expected counts such as collisions, inversions, and matches.
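The two facts that carry the whole method, `E[I_E] = P(E)` and linearity without independence, can be checked exhaustively. A minimal sketch (the die example is an illustration not taken from the text): two indicators on one fair die are clearly dependent, yet the expectation of their sum is still the sum of their expectations.

```python
from fractions import Fraction

# Exhaustive check over one fair die: X and Y are dependent indicators
# (both depend on the same roll), yet E[X + Y] = E[X] + E[Y].
outcomes = range(1, 7)
p = Fraction(1, 6)  # each face is equally likely

# X = 1 if the roll is even; Y = 1 if the roll is at least 4.
E_X = sum(p for d in outcomes if d % 2 == 0)                     # = P(even) = 1/2
E_Y = sum(p for d in outcomes if d >= 4)                         # = P(>= 4) = 1/2
E_sum = sum(p * ((d % 2 == 0) + (d >= 4)) for d in outcomes)     # E[X + Y]

# Dependence: P(X=1 and Y=1) = 2/6, not (1/2)*(1/2).
P_both = sum(p for d in outcomes if d % 2 == 0 and d >= 4)

print(E_X, E_Y, E_sum, P_both)  # 1/2 1/2 1 1/3
```

Note that `E_X` and `E_Y` are computed exactly as probabilities of their events, which is the `E[I_E] = P(E)` identity in action.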
Key notation
- `I_E`: indicator variable of event `E`, equal to 1 if `E` happens and 0 otherwise
- `E[I_E]`: the probability that event `E` happens, i.e. `E[I_E] = P(E)`
- linearity of expectation: the expectation of a sum equals the sum of the expectations
Tiny worked example
- Let `X` be the number of colliding pairs of keys in a hash table.
- Define `I_{i,j}=1` when keys `i` and `j` hash to the same bucket, else 0.
- Then `X = sum_{i<j} I_{i,j}`, the sum over all pairs of keys.
- Taking expectations gives `E[X] = sum_{i<j} E[I_{i,j}] = sum_{i<j} P(i and j collide)`.
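The worked example above can be sketched in code. Assuming uniform independent hashing into `m` buckets (an assumption of this sketch, not stated in the text), each pair collides with probability `1/m`, so summing the pair indicators gives `E[X] = n(n-1)/(2m)`:

```python
from fractions import Fraction
from itertools import combinations

def expected_collisions(n_keys, n_buckets):
    """E[# colliding pairs] via indicators: under uniform independent
    hashing, each pair of keys collides with probability 1/n_buckets,
    so the expected count is just that probability summed over pairs."""
    p_collide = Fraction(1, n_buckets)
    return sum(p_collide for _ in combinations(range(n_keys), 2))

# Agrees with the closed form n(n-1)/(2m):
print(expected_collisions(10, 100))  # 9/20
```

No joint distribution of the collision count is ever computed; only the per-pair probability matters.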
Common mistakes
- Computing the whole distribution of the count when only the expectation is asked for.
- Assuming linearity of expectation requires the indicators to be independent; it does not.
- Skipping the decomposition: the main step is expressing the count as a sum of indicators.
How to recognize this kind of problem
- The prompt asks for an expected number of bad pairs, matches, collisions, or inversions.
- There is a natural yes/no event for each object or pair of objects.
- A sum of 0-1 variables counts the total of interest exactly.
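A classic instance of the "matches" pattern above is the hat-check problem (an assumed example, not from the text): the expected number of fixed points of a uniform random permutation. The per-position indicators are dependent, but each has probability `1/n`, so the expected count is `n * (1/n) = 1` for every `n`. A brute-force check:

```python
from fractions import Fraction
from itertools import permutations

def expected_fixed_points(n):
    """Average number of fixed points over all n! permutations.
    Indicator argument: sum over positions i of P(pi(i) = i) = n * 1/n = 1."""
    perms = list(permutations(range(n)))
    total = sum(sum(1 for i, v in enumerate(p) if i == v) for p in perms)
    return Fraction(total, len(perms))

print([expected_fixed_points(n) for n in (1, 2, 3, 4)])  # all equal 1
```

The answer is independent of `n`, a fact that is awkward to see from the full distribution but immediate from the indicator sum.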