Session 9 — Hypothesis Testing: Comparison Tests
Decision Making Statistics — S04
This session extends hypothesis testing to the comparison of two populations whose parameters (means or proportions) are both unknown.
1 Session plan
- Comparison test between two unknown means
- Comparison test between two unknown proportions
2 Comparison test between two unknown means
2.1 Introductory example
The Ministry of Education studies the average first-hire salary of students who graduated in 2021 and wants to compare engineering school graduates with business school graduates. There is no reference value here: both means are unknown.
2.2 Modeling framework
- Population: graduates in 2021
- Subpopulation 1: engineering school graduates \(\rightarrow X_1\), unknown mean \(\mu_1\)
- Subpopulation 2: business school graduates \(\rightarrow X_2\), unknown mean \(\mu_2\)
- Parameters of interest: \(\mu_1\) and \(\mu_2\)
2.3 Hypotheses
Depending on the question, we test:
- \(H_0: \mu_1 = \mu_2\)
- \(H_1: \mu_1 \neq \mu_2\)
- or \(H_1: \mu_1 < \mu_2\)
- or \(H_1: \mu_1 > \mu_2\)
2.4 Test statistic
For two independent large samples (\(n_1,n_2\geq 30\)):
\[ U_{obs} = \frac{\bar{x}_1-\bar{x}_2}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}} \]
Under \(H_0\), the test statistic approximately follows the standard normal distribution:
\[ U_{obs} \sim \mathcal{N}(0,1). \]
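As a minimal sketch, the statistic above can be computed from summary data alone. The function name `two_mean_z` and the numbers in the call are hypothetical, chosen only to illustrate the formula; they do not come from the session.

```python
from math import sqrt


def two_mean_z(mean1, s1, n1, mean2, s2, n2):
    """Large-sample statistic for H0: mu1 = mu2 (two independent samples)."""
    # Estimated standard error of the difference between the two sample means
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / standard_error


# Hypothetical summary data, for illustration only
u_obs = two_mean_z(mean1=52.0, s1=6.0, n1=36, mean2=50.0, s2=8.0, n2=64)
print(round(u_obs, 3))
```

The value returned is then compared with the standard normal critical value that matches the chosen alternative.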
3 Comparison test between two unknown proportions
3.1 Introductory example
The Ministry of Education compares hiring rates between engineering and business school graduates using two independent studies.
3.2 Modeling framework
- Subpopulation 1: engineering graduates \(\rightarrow X_1 \sim B(p_1)\)
- Subpopulation 2: business graduates \(\rightarrow X_2 \sim B(p_2)\)
- Parameters of interest: \(p_1\) and \(p_2\)
3.3 Hypotheses
- \(H_0: p_1 = p_2\)
- \(H_1: p_1 \neq p_2\)
- or \(H_1: p_1 < p_2\)
- or \(H_1: p_1 > p_2\)
3.4 Test statistic
For two independent large samples:
\[ U_{obs} = \frac{\hat{p}_1-\hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} \]
with the pooled proportion
\[ \hat{p} = \frac{n_1\hat{p}_1+n_2\hat{p}_2}{n_1+n_2}. \]
The denominator uses the pooled proportion because, under \(H_0\), both samples share a common proportion \(p\), which \(\hat{p}\) estimates from the combined data.
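The same computation can be sketched in code. The helper `two_prop_z` is a hypothetical name, and it takes success counts rather than proportions: since \(n_i\hat{p}_i\) is the number of successes in sample \(i\), the pooled proportion is simply the total number of successes divided by the total sample size. The counts in the call are made up for illustration.

```python
from math import sqrt


def two_prop_z(successes1, n1, successes2, n2):
    """Large-sample statistic for H0: p1 = p2 (two independent samples)."""
    p1_hat = successes1 / n1
    p2_hat = successes2 / n2
    # Pooled proportion: total successes over total sample size
    pooled = (successes1 + successes2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / standard_error


# Hypothetical counts, for illustration only
u_obs = two_prop_z(60, 100, 50, 100)
print(round(u_obs, 3))
```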
Before calculating, always identify:
- the two subpopulations,
- the parameter being compared (mean or proportion),
- whether the alternative is two-tailed, left-tailed, or right-tailed.
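Once the tail of the alternative is identified, the decision rule follows mechanically. The sketch below assumes the statistic is approximately standard normal under \(H_0\); the function name `reject_h0` and the labels `"two-sided"`, `"less"` and `"greater"` are illustrative choices, not notation from the session.

```python
from statistics import NormalDist


def reject_h0(u_obs, alpha, alternative):
    """Return True when H0 is rejected at risk level alpha."""
    z = NormalDist()  # standard normal distribution N(0, 1)
    if alternative == "two-sided":   # H1: parameter 1 != parameter 2
        return abs(u_obs) > z.inv_cdf(1 - alpha / 2)
    if alternative == "greater":     # H1: parameter 1 > parameter 2 (right-tailed)
        return u_obs > z.inv_cdf(1 - alpha)
    if alternative == "less":        # H1: parameter 1 < parameter 2 (left-tailed)
        return u_obs < z.inv_cdf(alpha)
    raise ValueError(f"unknown alternative: {alternative!r}")


# Critical values used throughout the exercises
for alpha in (0.10, 0.05, 0.01):
    one_tailed = NormalDist().inv_cdf(1 - alpha)
    two_tailed = NormalDist().inv_cdf(1 - alpha / 2)
    print(alpha, round(one_tailed, 3), round(two_tailed, 3))
```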
4 Exercises
4.1 Exercise 1 — Comparison of two means
Data were collected on the price of product PP in April (sample 1) and May (sample 2):
- April: \(n_1 = 60\) observations, \(\bar{x}_1 = 18.90\)€, \(s_1 = 2.50\)€
- May: \(n_2 = 55\) observations, \(\bar{x}_2 = 18.10\)€, \(s_2 = 2.20\)€
Can we confirm, with \(5\%\) risk (then \(10\%\) risk), that the average price has decreased between April and May?
Build the full modeling framework: population, variable, subpopulations, hypotheses, test statistic, and conclusion.
Since the question is whether the average price has decreased from April to May, a convenient formulation is:
- subpopulation 1: April prices (\(\mu_1\) unknown)
- subpopulation 2: May prices (\(\mu_2\) unknown)
Hypotheses: a decrease means the April mean exceeds the May mean, so
\[ H_0: \mu_1 = \mu_2, \qquad H_1: \mu_1 > \mu_2. \]
This is a right-tailed test on \(\mu_1 - \mu_2\).
The test statistic is
\[ U_{obs} = \frac{\bar{x}_1-\bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}} = \frac{18.90-18.10}{\sqrt{\dfrac{6.25}{60}+\dfrac{4.84}{55}}} \approx \frac{0.80}{\sqrt{0.1042+0.0880}} = \frac{0.80}{\sqrt{0.1922}} \approx \frac{0.80}{0.4384} \approx 1.825. \]
Critical values for a right-tailed test:
- at \(5\%\) risk: \(1.645\)
- at \(10\%\) risk: \(1.282\)
Since \(1.825 > 1.645\) and \(1.825 > 1.282\), we reject \(H_0\) at both risk levels.
Conclusion: the data confirm that the average price has significantly decreased between April and May, at both \(5\%\) and \(10\%\) risk.
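The arithmetic of this exercise can be checked numerically. This is only a verification sketch; the variable names are arbitrary, and the critical values are taken from the standard normal quantile function instead of a table.

```python
from math import sqrt
from statistics import NormalDist

# Summary data from Exercise 1 (April = sample 1, May = sample 2)
n1, mean1, s1 = 60, 18.90, 2.50
n2, mean2, s2 = 55, 18.10, 2.20

u_obs = (mean1 - mean2) / sqrt(s1**2 / n1 + s2**2 / n2)

# Right-tailed test: reject H0 when u_obs exceeds the (1 - alpha) quantile
decisions = {alpha: u_obs > NormalDist().inv_cdf(1 - alpha) for alpha in (0.05, 0.10)}
print(round(u_obs, 3), decisions)
```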
4.2 Exercise 2 — Comparison of two proportions
A firm tests advertising effectiveness in two areas A and B:
- Area A: 1,100 randomly selected people who saw the ad, of whom 658 remembered the slogan
- Area B: 1,000 randomly selected people who saw the ad, of whom 526 remembered the slogan
Can we confirm, with 5% risk (then 1%), that advertising is more effective in area A?
Let \(p_1\) be the remembrance rate in area A and \(p_2\) the remembrance rate in area B.
We test:
\[ H_0: p_1 = p_2, \qquad H_1: p_1 > p_2. \]
Observed proportions:
\[ \hat{p}_1 = \frac{658}{1100} \approx 0.5982, \qquad \hat{p}_2 = \frac{526}{1000} = 0.526. \]
Pooled proportion:
\[ \hat{p} = \frac{658+526}{1100+1000} = \frac{1184}{2100} \approx 0.5638. \]
Test statistic:
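\[ U_{obs} = \frac{\dfrac{658}{1100}-\dfrac{526}{1000}}{\sqrt{0.5638\times 0.4362\times\left(\dfrac{1}{1100}+\dfrac{1}{1000}\right)}} \approx \frac{0.07218}{0.02167} \approx 3.331. \]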
Critical values for a right-tailed test:
- at 5% risk: \(1.645\)
- at 1% risk: \(2.326\)
Since
\[ 3.331 > 1.645 \quad \text{and} \quad 3.331 > 2.326, \]
we reject \(H_0\) in both cases.
Conclusion: yes, we can confirm that advertising is more effective in area A.
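As a check on the statistic, the computation can be reproduced from the raw counts. The sketch below uses arbitrary variable names and assumes nothing beyond the figures given in the exercise.

```python
from math import sqrt
from statistics import NormalDist

# Counts from Exercise 2 (area A = sample 1, area B = sample 2)
x1, n1 = 658, 1100
x2, n2 = 526, 1000

p1_hat, p2_hat = x1 / n1, x2 / n2
pooled = (x1 + x2) / (n1 + n2)
u_obs = (p1_hat - p2_hat) / sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

# Right-tailed test at 5% and then 1% risk
decisions = {alpha: u_obs > NormalDist().inv_cdf(1 - alpha) for alpha in (0.05, 0.01)}
print(round(u_obs, 3), decisions)
```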
4.3 Application — Comparison of two average salaries
The Ministry of Education conducted two studies:
- Study 1: 100 engineering school graduates, average salary \(\bar{x}_1\) = €3,350, standard deviation \(s_1\) = €400
- Study 2: 120 business school graduates, average salary \(\bar{x}_2\) = €3,550, standard deviation \(s_2\) = €650
Can we say, with 1% risk, that there is a difference in average salary between engineering and business school graduates?
Let \(\mu_1\) be the mean salary of engineering graduates and \(\mu_2\) the mean salary of business school graduates.
We test:
\[ H_0: \mu_1 = \mu_2, \qquad H_1: \mu_1 \neq \mu_2. \]
The observed statistic is
\[ U_{obs} = \frac{3350-3550}{\sqrt{\dfrac{400^2}{100}+\dfrac{650^2}{120}}} \approx \frac{-200}{\sqrt{1600+3520.83}} \approx \frac{-200}{71.56} \approx -2.795. \]
For a two-tailed test at 1% risk, the critical value is \(z_{0.005}=2.576\).
Since
\[ |{-2.795}| > 2.576, \]
we reject \(H_0\).
Conclusion: with 1% risk, there is a significant difference in average salary. The sample indicates a higher mean salary for business school graduates.
4.4 Second application — Comparison of two hiring rates
The Ministry of Education conducted two studies:
- Study 1: 100 engineering school graduates — 70 were hired after graduation
- Study 2: 120 business school graduates — 75 were hired after graduation
Can we confirm, with 5% risk, that the hiring rate is higher among engineering school graduates?
Let \(p_1\) be the hiring rate among engineering graduates and \(p_2\) the hiring rate among business graduates.
We test:
\[ H_0: p_1 = p_2, \qquad H_1: p_1 > p_2. \]
Observed proportions:
\[ \hat{p}_1 = 0.70, \qquad \hat{p}_2 = \frac{75}{120} = 0.625. \]
Pooled proportion:
\[ \hat{p} = \frac{70+75}{220} \approx 0.6591. \]
The statistic is
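\[ U_{obs} = \frac{0.70-0.625}{\sqrt{0.6591\times 0.3409\times\left(\dfrac{1}{100}+\dfrac{1}{120}\right)}} \approx \frac{0.075}{0.06418} \approx 1.169. \]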
For a right-tailed test at 5% risk, the critical value is \(1.645\).
Since
\[ 1.169 < 1.645, \]
we do not reject \(H_0\).
Conclusion: with 5% risk, we do not have enough evidence to confirm that the hiring rate is higher among engineering school graduates.
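A short numerical check of this application, written as a sketch with arbitrary variable names, shows the statistic falling short of the 5% critical value.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the two studies (engineering = sample 1, business = sample 2)
hired1, n1 = 70, 100
hired2, n2 = 75, 120

pooled = (hired1 + hired2) / (n1 + n2)
u_obs = (hired1 / n1 - hired2 / n2) / sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

critical_value = NormalDist().inv_cdf(0.95)  # right-tailed test at 5% risk
reject = u_obs > critical_value
print(round(u_obs, 3), round(critical_value, 3), reject)
```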
4.5 Exercise 3 — Productivity comparison between two production lines
A factory manager wants to know whether production line A is more productive than line B.
- Line A: \(n_1 = 40\) working days, \(\bar{x}_1 = 95\) parts/hour, \(s_1 = 8\) parts/hour
- Line B: \(n_2 = 35\) working days, \(\bar{x}_2 = 91\) parts/hour, \(s_2 = 10\) parts/hour
Can we confirm, with \(5\%\) risk (then \(1\%\)), that line A is more productive than line B?
Let \(\mu_1\) and \(\mu_2\) be the mean hourly output of lines A and B respectively.
We test:
\[ H_0: \mu_1 = \mu_2, \qquad H_1: \mu_1 > \mu_2. \]
The test statistic is
\[ U_{obs} = \frac{95-91}{\sqrt{\dfrac{64}{40}+\dfrac{100}{35}}} \approx \frac{4}{\sqrt{1.600+2.857}} = \frac{4}{\sqrt{4.457}} \approx \frac{4}{2.111} \approx 1.895. \]
Critical values for a right-tailed test:
- at \(5\%\) risk: \(1.645\)
- at \(1\%\) risk: \(2.326\)
At 5% risk: \(1.895 > 1.645\) → reject \(H_0\). Line A is significantly more productive.
At 1% risk: \(1.895 < 2.326\) → do not reject \(H_0\). The evidence is insufficient at the stricter level.
Comment: the advantage of line A is confirmed at \(5\%\) but not at \(1\%\).
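The change of conclusion between the two risk levels can be reproduced with a few lines. This is a verification sketch under the same large-sample normal approximation; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Summary data from Exercise 3 (line A = sample 1, line B = sample 2)
n1, mean1, s1 = 40, 95.0, 8.0
n2, mean2, s2 = 35, 91.0, 10.0

u_obs = (mean1 - mean2) / sqrt(s1**2 / n1 + s2**2 / n2)

# Right-tailed test: the same statistic is compared with two different thresholds
for alpha in (0.05, 0.01):
    critical_value = NormalDist().inv_cdf(1 - alpha)
    print(f"alpha={alpha}: critical value {critical_value:.3f}, reject H0: {u_obs > critical_value}")
```

Lowering the risk level raises the critical value, which is why the same observed statistic can lead to rejection at \(5\%\) but not at \(1\%\).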
4.6 Exercise 4 — Contract renewal by sales region
A subscription-based firm compares its contract renewal rates across two sales regions:
- North region: \(n_1 = 300\) customers, \(198\) renewed their contract
- South region: \(n_2 = 250\) customers, \(137\) renewed their contract
Can we say, with \(5\%\) risk (then \(1\%\)), that the renewal rates differ between the two regions?
Let \(p_1\) and \(p_2\) be the renewal rates in the North and South regions respectively.
We test:
\[ H_0: p_1 = p_2, \qquad H_1: p_1 \neq p_2. \]
Observed proportions:
\[ \hat{p}_1 = \frac{198}{300} = 0.660, \qquad \hat{p}_2 = \frac{137}{250} = 0.548. \]
Pooled proportion:
\[ \hat{p} = \frac{198+137}{300+250} = \frac{335}{550} \approx 0.6091. \]
Test statistic:
\[ U_{obs} = \frac{0.660-0.548}{\sqrt{0.6091\times 0.3909\times\left(\dfrac{1}{300}+\dfrac{1}{250}\right)}} \approx \frac{0.112}{\sqrt{0.2381\times 0.007333}} \approx \frac{0.112}{\sqrt{0.001746}} \approx \frac{0.112}{0.04179} \approx 2.680. \]
Critical values for a two-tailed test:
- at \(5\%\) risk: \(z_{0.025}=1.96\)
- at \(1\%\) risk: \(z_{0.005}=2.576\)
Since \(|2.680| > 1.96\) and \(|2.680| > 2.576\), we reject \(H_0\) at both risk levels.
Conclusion: with both \(5\%\) and \(1\%\) risk, the renewal rates in the North and South regions are significantly different. The North region has a higher renewal rate (\(66\%\) vs. \(54.8\%\)).
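For a two-tailed test, the absolute value of the statistic is compared with the \(1-\alpha/2\) quantile. The sketch below checks this exercise from the raw counts; the variable names are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

# Counts from Exercise 4 (North = sample 1, South = sample 2)
renewed1, n1 = 198, 300
renewed2, n2 = 137, 250

pooled = (renewed1 + renewed2) / (n1 + n2)
u_obs = (renewed1 / n1 - renewed2 / n2) / sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

# Two-tailed test: split the risk alpha equally between both tails
decisions = {alpha: abs(u_obs) > NormalDist().inv_cdf(1 - alpha / 2) for alpha in (0.05, 0.01)}
print(round(u_obs, 3), decisions)
```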