4.10 Binomial Distribution $B(n,p)$

1. Description of the Problem

The binomial distribution describes the probabilities of a discrete random variable $X$ in a specific type of scenario. We deal with a game (or experiment) that has exactly two possible outcomes:

  • SUCCESS (with a constant probability $p$)
  • FAILURE (with probability $1-p$)

We play the game $n$ times independently. Our defining parameters are:

$n$ = total number of trials
$p$ = probability of success (in a single trial)

The variable $X$ counts the number of successes. We say that $X$ follows a binomial distribution and write $X \sim B(n,p)$. Since $n$ is the number of trials, $X$ can take on the discrete values:

$0, 1, 2, 3, 4, \dots, n$

2. Using the GDC

The probabilities $P(X=0), P(X=1), \dots$ are most efficiently obtained using a Graphic Display Calculator (GDC). For Casio calculators, select MENU $\to$ Statistics $\to$ DIST $\to$ BINOMIAL.

| Function | Purpose |
|---|---|
| Bpd($x$) | Finds the probability of exactly $x$ successes. |
| Bcd($x_1$ to $x_2$) | Finds the cumulative probability from $x_1$ up to $x_2$ successes. |

The menu for both functions requires:
  • Data: always set to Variable
  • Numtrial: the total number of trials ($n$)
  • p: the probability of success in a single trial ($p$)

EXAMPLE 1

We roll a fair die 5 times. A "success" is defined as getting a six. Here, $n=5$ and $p=\dfrac{1}{6}$. We may have 0, 1, 2, 3, 4, or 5 successes.

| $X$ | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| GDC | Bpd(0) | Bpd(1) | Bpd(2) | Bpd(3) | Bpd(4) | Bpd(5) |
| $P(X=x)$ | 0.4019 | 0.4019 | 0.1608 | 0.0322 | 0.0032 | 0.0001 |

We can efficiently answer various inequality questions using the GDC's cumulative function (Bcd):

| Find the probability of: | Notation | GDC Entry | Result |
|---|---|---|---|
| exactly 3 sixes | $P(X=3)$ | Bpd(3) | 0.0322 |
| at most 3 sixes | $P(X \le 3)$ | Bcd(0 to 3) | 0.9967 |
| less than 3 sixes | $P(X < 3)$ | Bcd(0 to 2) | 0.9645 |
| more than 3 sixes | $P(X > 3)$ | Bcd(4 to 5) | 0.0033 |
| at least 3 sixes | $P(X \ge 3)$ | Bcd(3 to 5) | 0.0355 |
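For readers without a GDC at hand, the values in Example 1 can be reproduced with a short Python sketch. The helper names `bpd` and `bcd` are hypothetical: they are chosen here only to mirror the calculator's menu entries and are not part of any calculator or library API.

```python
from math import comb


def bpd(x, n, p):
    """P(X = x) for X ~ B(n, p) (hypothetical helper mirroring the GDC's Bpd)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def bcd(lower, upper, n, p):
    """P(lower <= X <= upper) for X ~ B(n, p) (hypothetical helper mirroring Bcd)."""
    return sum(bpd(x, n, p) for x in range(lower, upper + 1))


n, p = 5, 1 / 6  # five rolls of a die, success = rolling a six

# Probability of each possible number of sixes
for x in range(n + 1):
    print(x, round(bpd(x, n, p), 4))

# Each inequality is translated into an inclusive range of integers
print("at most 3  :", round(bcd(0, 3, n, p), 4))
print("less than 3:", round(bcd(0, 2, n, p), 4))
print("more than 3:", round(bcd(4, 5, n, p), 4))
print("at least 3 :", round(bcd(3, 5, n, p), 4))
```

The step that matters is the translation of each inequality into an inclusive range: "less than 3" becomes 0 to 2, while "at least 3" becomes 3 to $n$.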

3. The Binomial Formula

Remark: This mathematical formula is not strictly in the syllabus (as results are obtained via GDC), but it is invaluable for deep conceptual understanding.

  • The probability of obtaining 5 sixes in a row is: $\left(\dfrac{1}{6}\right)^5$
  • The probability of obtaining no sixes at all is: $\left(\dfrac{5}{6}\right)^5$
  • The probability of obtaining exactly 2 sixes and 3 non-sixes is: $^5C_2 \left(\dfrac{1}{6}\right)^2 \left(\dfrac{5}{6}\right)^3$

Here, $^5C_2$ counts the number of ways to place exactly 2 sixes among the 5 independent trials. In general, if we play $n$ times a game with success probability $p$, the probability $P(X=x)$ is:

$P(X=x) = {^n}C_x p^x (1-p)^{n-x}$
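As a quick check of the formula against the GDC values of Example 1, the case of exactly 2 sixes in 5 rolls ($n=5$, $p=\dfrac{1}{6}$, $x=2$) can be worked out by hand:

$$ \begin{aligned} P(X=2) &= {^5}C_2 \left(\dfrac{1}{6}\right)^2 \left(\dfrac{5}{6}\right)^3 \\ &= 10 \times \dfrac{1}{36} \times \dfrac{125}{216} \\ &= \dfrac{1250}{7776} \approx 0.1608 \end{aligned} $$

This agrees with the value of Bpd(2) in the table of Example 1.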

4. Expected Value and Variance

For any random variable $X$ following a binomial distribution $B(n, p)$, the expected value (mean) and the variance are given directly by the parameters:

$E(X) = np$
$Var(X) = np(1-p)$
For the dice example ($n=5$, $p=\dfrac{1}{6}$):
$E(X) = 5\left(\dfrac{1}{6}\right) = \mathbf{\dfrac{5}{6}}$
$Var(X) = 5\left(\dfrac{1}{6}\right)\left(\dfrac{5}{6}\right) = \mathbf{\dfrac{25}{36}}$
Notice (Only for HL): Since you know $E(X)$ and $Var(X)$, you can also find $E(X^2)$. Indeed:
$$ \begin{aligned} Var(X) &= E(X^2) - (E(X))^2 \\ E(X^2) &= Var(X) + (E(X))^2 \\ E(X^2) &= np(1-p) + (np)^2 \end{aligned} $$
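The formulas for $E(X)$, $Var(X)$ and $E(X^2)$ can be checked against the definitions by summing over all outcomes. The following is a minimal sketch for the dice example, assuming exact arithmetic with Python's `fractions.Fraction`; the variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

n, p = 5, Fraction(1, 6)  # the dice example, kept as exact fractions

# Exact probabilities P(X = x) for x = 0, 1, ..., n
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# Moments computed from their definitions as sums over all outcomes
mean = sum(x * px for x, px in enumerate(pmf))              # E(X)
second_moment = sum(x**2 * px for x, px in enumerate(pmf))  # E(X^2)
variance = second_moment - mean**2                          # Var(X)

print(mean, variance, second_moment)  # prints: 5/6 25/36 25/18

# The sums agree with the closed-form expressions
assert mean == n * p
assert variance == n * p * (1 - p)
assert second_moment == n * p * (1 - p) + (n * p) ** 2
```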

EXAMPLE 2

A box contains 5 balls: 1 BLACK and 4 WHITE. We win if we select a BLACK ball. We play this game 10 times, replacing the ball after each selection. Find the following probabilities, the expected value and the variance.

The variable $X = \text{"number of winning games"}$ follows a binomial distribution with $n=10$ and $p=\dfrac{1}{5}=0.2$. We write $X \sim B(10, 0.2)$.
(a) The probability to win exactly 4 times:
Using the GDC: $\text{Bpd}(4) = \mathbf{0.088}$.
[Indeed, by the formula: $P(X=4) = {^{10}C_4}(0.2)^4(0.8)^6 = 0.088$]
(b) The probability to win at most 4 times:
Using the GDC: $\text{Bcd}(0 \text{ to } 4) = \mathbf{0.967}$.
[In fact: $P(X \le 4) = P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4)$]
(c) The probability to win at least once:
Using the GDC: $\text{Bcd}(1 \text{ to } 10) = \mathbf{0.893}$.
[Alternatively, using the complement: $P(X \ge 1) = 1 - P(X=0) = 1 - 0.107 = 0.893$]
(d) The expected number of winning games:
$E(X) = np = 10 \times 0.2 = \mathbf{2}$
(e) The variance of the number of winning games:
$Var(X) = np(1-p) = 10 \times 0.2 \times 0.8 = \mathbf{1.6}$
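The five answers of Example 2 can be cross-checked with a few lines of Python. This is a sketch only; `binom_pmf` is a hypothetical helper implementing the binomial formula, not a library function.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p), straight from the binomial formula."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.2  # 10 games, probability 1/5 of drawing the black ball

p_exactly_4 = binom_pmf(4, n, p)                             # (a)
p_at_most_4 = sum(binom_pmf(x, n, p) for x in range(0, 5))   # (b) x = 0..4
p_at_least_1 = 1 - binom_pmf(0, n, p)                        # (c) complement of "no wins"
expected = n * p                                             # (d)
variance = n * p * (1 - p)                                   # (e)

print(f"(a) {p_exactly_4:.3f}")
print(f"(b) {p_at_most_4:.3f}")
print(f"(c) {p_at_least_1:.3f}")
print(f"(d) {expected:.1f}")
print(f"(e) {variance:.1f}")
```

Part (c) uses the complement rather than a sum over 1 to 10, which is usually the quicker route for "at least once" questions.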

EXAMPLE 3

Let $p=0.2$ and let $n$ be unknown. Given that $P(X=1) = 0.268$, find $n$.

Solution: We know that $n$ must be a positive integer.
Using trial and error on the Numtrial input of the GDC, we keep $x=1$ fixed, test successive values of $n$, and observe that for Numtrial $=10$ the function $\text{Bpd}(1)$ gives $0.268$.
Hence, $\mathbf{n=10}$.
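The trial-and-error search of Example 3 can be automated. The sketch below assumes a search range of $n = 1$ to $50$ and accepts any $n$ whose $P(X=1)$ agrees with $0.268$ to three decimal places; both choices, and the helper name `binom_pmf`, are assumptions made for illustration.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p), straight from the binomial formula."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


p, target = 0.2, 0.268

# Keep x = 1 fixed and vary n, as on the GDC's Numtrial input.
# The upper limit of 50 is an assumed search range, not part of the problem.
matches = [n for n in range(1, 51) if abs(binom_pmf(1, n, p) - target) < 0.0005]

print(matches)  # prints: [10]
```

Collecting every match, rather than stopping at the first one, also confirms that no other $n$ in the searched range gives the same rounded probability.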

5. Mode of a Binomial Distribution (Mainly for HL)

To locate the mode (the outcome with the highest probability), we first evaluate the expected number $E(X) = np$. The mode lies close to the mean, so we compare the probabilities of the integers nearest to it.

If the expected number is a decimal:
Say $n=20$ and $p=\dfrac{1}{6}$, so that $E(X) = \dfrac{20}{6} \approx 3.33$.
We check the nearest integer values (3 and 4) using the GDC:
$P(X=3) = 0.238$
$P(X=4) = 0.202$
Because $0.238 > 0.202$, the highest probability corresponds to $X=3$. Hence, the mode is 3.

If the expected number is a whole number:
Say $n=60$ and $p=\dfrac{1}{6}$, so that $E(X) = \dfrac{60}{6} = 10$.
We check the mean and its two neighbouring integers (9, 10, 11):
$P(X=9) = 0.134$
$P(X=10) = 0.137$
$P(X=11) = 0.125$
The peak probability falls exactly on the mean. Hence, the mode is 10.

Notice (Two Modes Case): For certain values of $n$ and $p$, two adjacent outcomes share the highest probability, so the distribution has two modes.
For $n=5$ and $p=\dfrac{1}{6}$, we evaluate $E(X) = \dfrac{5}{6} \approx 0.833$. We check the adjacent integers 0 and 1:
$P(X=0) = 0.4019$
$P(X=1) = 0.4019$
The two probabilities are equal. Hence, there are two modes: 0 and 1.
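The three cases above can be confirmed by scanning all outcomes instead of only the neighbours of the mean. The following is a minimal sketch; `binomial_modes` is a hypothetical helper, and exact fractions are assumed so that a tie between two outcomes is detected reliably rather than being hidden by rounding.

```python
from fractions import Fraction
from math import comb


def binomial_modes(n, p):
    """Return every x in 0..n whose probability P(X = x) is the largest."""
    pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
    peak = max(pmf)
    return [x for x, px in enumerate(pmf) if px == peak]


p = Fraction(1, 6)  # exact value, so equal probabilities compare as equal

print(binomial_modes(20, p))  # prints: [3]
print(binomial_modes(60, p))  # prints: [10]
print(binomial_modes(5, p))   # prints: [0, 1]
```

With floating-point values of $p$ the two tied probabilities could differ in their last digits, which is why the sketch keeps $p$ as a fraction.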