4.10 Binomial Distribution $B(n,p)$
1. Description of the Problem
The binomial distribution gives the probabilities of a discrete random variable $X$ in a specific type of scenario. We deal with a game (or experiment) that has exactly two possible outcomes:
- SUCCESS (with a constant probability $p$)
- FAILURE (with probability $1-p$)
We play the game $n$ times independently. Our defining parameters are:
$n$ = number of trials
$p$ = probability of success (in a single trial)
The variable $X$ counts the number of successes. We say that $X$ follows a binomial distribution and write $X \sim B(n,p)$. Since $n$ is the number of trials, $X$ can take on the discrete values:
$0, 1, 2, \dots, n$
2. Using the GDC
The probabilities $P(X=0), P(X=1), \dots$ are most efficiently obtained using a Graphic Display Calculator (GDC). For Casio calculators, select MENU $\to$ Statistics $\to$ DIST $\to$ BINOMIAL.
| Function | Purpose |
|---|---|
| Bpd(x) | Finds the probability of exactly $x$ successes. |
| Bcd($x_1$ to $x_2$) | Finds the cumulative probability from $x_1$ up to $x_2$ successes. |
The input fields are:
- Data: always set to Variable
- Numtrial: the total number of trials ($n$)
- p: the probability of success in a single trial ($p$)
EXAMPLE 1
We toss a die 5 times. The "success" is defined as getting a six. Here, $n=5$ and $p=\dfrac{1}{6}$. We may have 0, 1, 2, 3, 4, or 5 successes.
| $X$ | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| GDC | Bpd(0) | Bpd(1) | Bpd(2) | Bpd(3) | Bpd(4) | Bpd(5) |
| $P(X=x)$ | 0.4019 | 0.4019 | 0.1608 | 0.0322 | 0.0032 | 0.0001 |
We can efficiently answer various inequality questions using the GDC's cumulative function (Bcd):
| Find the probability of: | Notation | GDC Entry | Result |
|---|---|---|---|
| exactly 3 sixes | $P(X=3)$ | Bpd(3) | 0.0322 |
| at most 3 sixes | $P(X \le 3)$ | Bcd(0 to 3) | 0.9967 |
| less than 3 sixes | $P(X < 3)$ | Bcd(0 to 2) | 0.9645 |
| more than 3 sixes | $P(X > 3)$ | Bcd(4 to 5) | 0.0033 |
| at least 3 sixes | $P(X \ge 3)$ | Bcd(3 to 5) | 0.0355 |
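The two tables above can also be reproduced without a calculator. The following is a minimal Python sketch, assuming the standard binomial probability formula; the function names `bpd` and `bcd` are hypothetical and only mirror the GDC labels, they are not a calculator API.

```python
from math import comb


def bpd(x, n, p):
    """P(X = x) for X ~ B(n, p): probability of exactly x successes."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def bcd(lower, upper, n, p):
    """P(lower <= X <= upper) for X ~ B(n, p), both ends included."""
    return sum(bpd(x, n, p) for x in range(lower, upper + 1))


# Example 1: a die rolled 5 times, success = getting a six
n, p = 5, 1 / 6

# Individual probabilities P(X = 0), ..., P(X = 5)
for x in range(n + 1):
    print(x, round(bpd(x, n, p), 4))

# Inequality questions, written as inclusive ranges
print("at most 3  :", round(bcd(0, 3, n, p), 4))
print("less than 3:", round(bcd(0, 2, n, p), 4))
print("more than 3:", round(bcd(4, 5, n, p), 4))
print("at least 3 :", round(bcd(3, 5, n, p), 4))
```

As on the GDC, each inequality is first translated into an inclusive range of values of $X$, which is where most mistakes occur (for example, "less than 3" means 0 to 2, not 0 to 3).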
3. The Binomial Formula
Remark: This mathematical formula is not strictly in the syllabus (as results are obtained via GDC), but it is invaluable for deep conceptual understanding.
- The probability of obtaining 5 sixes in a row is: $\left(\dfrac{1}{6}\right)^5$
- The probability of obtaining no sixes at all is: $\left(\dfrac{5}{6}\right)^5$
- The probability of obtaining exactly 2 sixes and 3 non-sixes is: $^5C_2 \left(\dfrac{1}{6}\right)^2 \left(\dfrac{5}{6}\right)^3$
Here, $^5C_2$ counts the number of ways to place exactly 2 sixes among the 5 trials. In general, if we play $n$ times a game with a success probability of $p$, the probability of exactly $x$ successes is:
$P(X=x) = {^nC_x}\, p^x (1-p)^{n-x}$, for $x = 0, 1, 2, \dots, n$
4. Expected Value and Variance
For any random variable $X$ following a binomial distribution $B(n, p)$, the expected value (mean) and variance are given directly by the parameters:
$E(X) = np$
$Var(X) = np(1-p)$
For the die in Example 1 ($n=5$, $p=\dfrac{1}{6}$):
$E(X) = 5\left(\dfrac{1}{6}\right) = \mathbf{\dfrac{5}{6}}$
$Var(X) = 5\left(\dfrac{1}{6}\right)\left(\dfrac{5}{6}\right) = \mathbf{\dfrac{25}{36}}$
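The values $\dfrac{5}{6}$ and $\dfrac{25}{36}$ can be cross-checked against the general definitions of mean and variance for a discrete random variable, $E(X) = \sum x\,P(X=x)$ and $Var(X) = \sum (x - E(X))^2\,P(X=x)$. The Python sketch below is only an illustration of that check; the helper names `binomial_pmf` and `mean_and_variance` are hypothetical.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def mean_and_variance(n, p):
    """Mean and variance of B(n, p), summed directly from the definitions."""
    probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
    mean = sum(x * q for x, q in enumerate(probs))
    variance = sum((x - mean) ** 2 * q for x, q in enumerate(probs))
    return mean, variance


# The die rolled 5 times, success = getting a six
mean, variance = mean_and_variance(5, 1 / 6)
print(mean, 5 * (1 / 6))                 # summed mean next to n*p
print(variance, 5 * (1 / 6) * (5 / 6))   # summed variance next to n*p*(1-p)
```

Each printed pair should agree up to floating-point rounding.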
EXAMPLE 2
A box contains 5 balls: 1 BLACK and 4 WHITE. We win if we select a BLACK ball. We play this game 10 times (with replacement), so $n=10$ and $p=\dfrac{1}{5}=0.2$. Find the probability of winning exactly 4 times, at most 4 times, and at least once, together with the expected value and variance of the number of wins.
Exactly 4 wins: $P(X=4) = Bpd(4) = \mathbf{0.088}$.
[Indeed: $P(X=4) = {^{10}C_4}(0.2)^4(0.8)^6 = 0.088$]
At most 4 wins: $P(X \le 4) = Bcd(0 \text{ to } 4) = \mathbf{0.967}$.
[In fact: $P(X \le 4) = P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4)$]
At least 1 win: $P(X \ge 1) = Bcd(1 \text{ to } 10) = \mathbf{0.893}$.
[Alternatively: $P(X \ge 1) = 1 - P(X=0) = 1 - 0.107 = 0.893$]
$E(X) = np = 10 \times 0.2 = \mathbf{2}$
$Var(X) = np(1-p) = 10 \times 0.2 \times 0.8 = \mathbf{1.6}$
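The results of Example 2 can be verified with a short Python sketch that applies the binomial formula directly. The variable names are chosen for illustration only, and the "at least one" case uses the complement $1 - P(X=0)$ rather than a sum from 1 to 10.

```python
from math import comb

n, p = 10, 0.2  # 10 draws with replacement, P(BLACK) = 1/5


def pmf(x):
    """P(X = x) for X ~ B(10, 0.2)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


exactly_4 = pmf(4)
at_most_4 = sum(pmf(x) for x in range(0, 5))  # x = 0, 1, 2, 3, 4
at_least_1 = 1 - pmf(0)                       # complement of "no wins"

print(round(exactly_4, 3))
print(round(at_most_4, 3))
print(round(at_least_1, 3))
print(n * p, n * p * (1 - p))  # expected value and variance
```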
EXAMPLE 3
Let $p=0.2$ and let $n$ be unknown. It is given that $P(X=1) = 0.268$. Find $n$.
By trial and error on the Numtrial input of the GDC, keeping $x=1$ and $p=0.2$ fixed, we test successive values of $n$ and observe that $Bpd(1)$ gives $0.268$ when Numtrial is $10$.
Hence, $\mathbf{n=10}$.
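The trial-and-error search can be automated. The sketch below is one possible approach, not a prescribed method: it tests every $n$ up to an assumed upper limit (`max_n`, chosen arbitrarily) and keeps those whose $P(X=1)$ matches the target within an assumed tolerance. The function name `find_n` is hypothetical.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def find_n(x, p, target, max_n=100, tol=5e-4):
    """Return every n from x to max_n with |P(X = x) - target| < tol.

    tol = 5e-4 corresponds to a target quoted to 3 decimal places.
    """
    return [
        n
        for n in range(x, max_n + 1)
        if abs(binomial_pmf(x, n, p) - target) < tol
    ]


print(find_n(x=1, p=0.2, target=0.268))
```

Returning a list rather than a single value is deliberate: $P(X=x)$ rises and then falls as $n$ increases, so in general a target probability may be matched by more than one $n$. Here the list contains only $n=10$.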
5. Mode of a Binomial Distribution (Mainly for HL)
To locate the mode (the outcome with the highest probability), we first evaluate the expected value $E(X) = np$. The mode always lies within one unit of the mean, so it suffices to compare the probabilities of the integers around $np$.
Say $n=20$ and $p=\dfrac{1}{6}$, meaning $E(X) = \dfrac{20}{6} \approx 3.33$.
We check the nearest integer values (3 and 4) using the GDC:
$P(X=3) = 0.238$
$P(X=4) = 0.202$
Because $0.238 > 0.202$, the highest probability corresponds to $X=3$. Hence, the mode is 3.
Say $n=60$ and $p=\dfrac{1}{6}$, meaning $E(X) = \dfrac{60}{6} = 10$.
We check the mean and its neighboring integers (9, 10, 11):
$P(X=9) = 0.134$
$P(X=10) = 0.137$
$P(X=11) = 0.125$
The peak probability falls exactly on the mean. Hence, the mode is 10.
For $n=5$ and $p=\dfrac{1}{6}$, we evaluate $E(X) = \dfrac{5}{6} \approx 0.833$. We check the adjacent integers 0 and 1:
$P(X=0) = 0.4019$
$P(X=1) = 0.4019$
The two probabilities are exactly equal (both are $\dfrac{3125}{7776}$). Hence, there are two modes: 0 and 1.
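The three cases above can be confirmed by brute force: compute every probability and keep the value (or values) of $X$ where the maximum occurs. This Python sketch is an illustration under that assumption; the function name `modes` is hypothetical, and `isclose` is used so that two equal probabilities are not separated by floating-point rounding.

```python
from math import comb, isclose


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def modes(n, p):
    """Return all values of X whose probability equals the maximum."""
    probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
    peak = max(probs)
    # Tolerant comparison: ties can differ in the last binary digit
    return [x for x, q in enumerate(probs) if isclose(q, peak, rel_tol=1e-9)]


print(modes(20, 1 / 6))  # single mode below the mean 3.33
print(modes(60, 1 / 6))  # single mode on the mean 10
print(modes(5, 1 / 6))   # two modes
```

Scanning all $n+1$ values is more work than checking only the neighbors of $np$, but it requires no prior knowledge of where the peak lies.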