Hypothesis testing for single population mean when the variance \(\sigma^{2}\) is known
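Assume that we are interested in making an inference about the mean of a single normally distributed population with known variance, and that we have a random sample of size \(n\) drawn from this population: $\( \begin{aligned} X_{1}, X_{2}, \ldots, X_{n} \sim N(\mu, \sigma^{2}) \end{aligned} \)$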
When the sample comes from a normally distributed population with known variance \(\sigma^{2}\), the test concerning the population mean \(\mu\) is based on the following test statistic, which follows a standard normal distribution under \(H_{0}\): $\( \begin{aligned} z^{*} =\frac{\bar{X}-\mu_{0}}{\sigma/\sqrt{n}}\sim N(0,1) \end{aligned} \)$
where \(\bar{X}\) is the sample mean, \(\mu_{0}\) is the hypothesized population mean value, \(\sigma\) is the population standard deviation, and \(n\) is the sample size.
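For a two-sided test: $\( \begin{aligned} H_{0} &: \mu = \mu_{0} \nonumber \\ H_{1} &: \mu \neq \mu_{0} \nonumber \end{aligned} \)$
Reject \(H_{0}\) if \(z^{*} \leq -z(1-\alpha/2)\) OR if \(z^{*} \geq z(1-\alpha/2)\), otherwise fail to reject \(H_{0}\).
Here, \(z(1-\alpha/2)\) refers to the upper \(\alpha/2\) percentage point of the standard normal distribution, that is, its \(1-\alpha/2\) quantile.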
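For a one-sided upper-tail test: $\( \begin{aligned} H_{0} &: \mu \leq \mu_{0} \nonumber \\ H_{1} &: \mu > \mu_{0} \nonumber \end{aligned} \)$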
Reject \(H_{0}\) if \(z^{*} \geq z(1-\alpha)\), otherwise fail to reject \(H_{0}\).
Here, \(z(1-\alpha)\) refers to the upper \(\alpha\) percentage point of the standard normal distribution.
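For a one-sided lower-tail test: $\( \begin{aligned} H_{0} &: \mu \geq \mu_{0} \nonumber \\ H_{1} &: \mu < \mu_{0} \nonumber \end{aligned} \)$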
Reject \(H_{0}\) if \(z^{*} \leq z(\alpha)=-z(1-\alpha)\), otherwise fail to reject \(H_{0}\).
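The three rejection rules above can be sketched in code by reading the critical values off the standard normal quantile function. This is a minimal illustration, not part of the original notebook: the helper names `z_rejection_region` and `in_rejection_region`, and the direction label `"one_sided_lower_tail"`, are assumptions introduced here.

```python
import scipy.stats as stats

def z_rejection_region(alpha=0.05, direction="two_sided"):
    """Return (lower, upper) critical values; None means that tail is unused."""
    if direction == "two_sided":
        z_crit = stats.norm.ppf(1 - alpha / 2)      # z(1 - alpha/2)
        return -z_crit, z_crit
    elif direction == "one_sided_upper_tail":
        return None, stats.norm.ppf(1 - alpha)      # z(1 - alpha)
    elif direction == "one_sided_lower_tail":
        return stats.norm.ppf(alpha), None          # z(alpha) = -z(1 - alpha)
    raise ValueError(f"unknown direction: {direction!r}")

def in_rejection_region(z_star, alpha=0.05, direction="two_sided"):
    """True if the observed statistic z_star leads to rejecting H0."""
    lower, upper = z_rejection_region(alpha, direction)
    reject_low = lower is not None and z_star <= lower
    reject_high = upper is not None and z_star >= upper
    return bool(reject_low or reject_high)

# Critical values at the 5% level for each alternative
for d in ("two_sided", "one_sided_upper_tail", "one_sided_lower_tail"):
    print(d, z_rejection_region(0.05, d))
```

At \(\alpha = 0.05\) the two-sided critical values are approximately \(\pm 1.96\) and the one-sided ones approximately \(\pm 1.645\).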
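The P-value is the probability, computed assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one actually observed.
If \(z^{*}\) is the computed value of the test statistic, the P-value is: $\( \begin{aligned} P\text{-value} = \begin{cases} 2\left[1-\Phi(|z^{*}|)\right] & \text{for a two-sided test} \\ 1-\Phi(z^{*}) & \text{for a one-sided upper-tail test} \\ \Phi(z^{*}) & \text{for a one-sided lower-tail test} \end{cases} \end{aligned} \)$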
where \(\Phi (z^{*}) = Pr(Z \leq z^{*})\) is the standard normal cumulative distribution function.
Decision rule: If \(P\text{-value} \leq \alpha\), then reject \(H_{0}\), otherwise fail to reject \(H_{0}\).
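As a rough sketch, the P-value for each alternative can be computed directly from the standard normal distribution, and the P-value decision rule can be checked against the critical-value rule. The function name `z_p_value`, the label `"one_sided_lower_tail"`, and the sample values of \(z^{*}\) are illustrative assumptions. The sketch uses `stats.norm.sf` (the survival function, \(1-\Phi\)), which is numerically more accurate than `1 - cdf` far out in the tail.

```python
import scipy.stats as stats

def z_p_value(z_star, direction="two_sided"):
    """P-value of a z-test for an observed statistic z_star."""
    if direction == "two_sided":
        return 2 * stats.norm.sf(abs(z_star))   # 2 * [1 - Phi(|z*|)]
    elif direction == "one_sided_upper_tail":
        return stats.norm.sf(z_star)            # 1 - Phi(z*)
    elif direction == "one_sided_lower_tail":
        return stats.norm.cdf(z_star)           # Phi(z*)
    raise ValueError(f"unknown direction: {direction!r}")

alpha = 0.05
z_crit = stats.norm.ppf(1 - alpha / 2)
for z_star in (-2.5, -1.0, 1.0, 2.2):           # illustrative values only
    p = z_p_value(z_star)
    # The last two columns agree: both rules give the same decision
    print(z_star, round(p, 4), p <= alpha, abs(z_star) >= z_crit)
```

For a two-sided test, the P-value is at most \(\alpha\) exactly when \(|z^{*}|\) reaches the critical value, so the two decision rules always give the same answer.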
Example
At a certain production facility that assembles computer keyboards, the assembly time is known from past experience to follow a normal distribution with a mean (\(\mu\)) of 130 seconds and a standard deviation (\(\sigma\)) of 15 seconds.
The new production supervisor suspects that the average time to assemble the keyboards differs from the specified value.
To examine this, he measures the times for 100 assemblies and finds that the sample mean assembly time (\(\bar{X}\)) is \(126.8\) seconds.
Can the supervisor conclude at the \(5\%\) level of significance that the mean assembly time of 130 seconds is incorrect?
# Define the hypothesis testing function for a single normal population
# mean when the population variance is known (one-sample z-test)
import numpy as np
import scipy.stats as stats

def OneSampZ(Xbar, sigma1, n1, mu0 = 0, alpha = 0.05, direction = "two_sided"):
    # Test statistic: z* = (Xbar - mu0) / (sigma / sqrt(n))
    z_star = (Xbar-mu0)/np.sqrt(sigma1**2/n1)
    # P-value for the requested alternative
    if (direction == "two_sided"):
        p_val = 2*(1 - stats.norm.cdf(abs(z_star)))
    elif (direction == "one_sided_upper_tail"):
        p_val = (1 - stats.norm.cdf(z_star))
    else:
        # any other value of direction is treated as a one-sided lower-tail test
        p_val = stats.norm.cdf(z_star)
    # Decision rule: reject H0 when P-value <= alpha
    if (p_val <= alpha):
        Hypothesis_Status = 'Reject Null Hypothesis : Significant'
    else:
        Hypothesis_Status = 'Do not reject Null Hypothesis : Not Significant'
    print (Hypothesis_Status)
    z_star = np.round(z_star,4)
    p_val = np.round(p_val,4)
    return z_star, p_val
OneSampZ(Xbar = 126.8, sigma1 = 15, n1 = 100, mu0 = 130, alpha = 0.05, direction = "two_sided")
Reject Null Hypothesis : Significant
(-2.1333, 0.0329)
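The output above can be checked by hand. The following worked calculation uses only the numbers given in the example, together with the standard value \(z(0.975) \approx 1.96\):

```latex
\begin{aligned}
z^{*} &= \frac{\bar{X}-\mu_{0}}{\sigma/\sqrt{n}}
       = \frac{126.8-130}{15/\sqrt{100}}
       = \frac{-3.2}{1.5} \approx -2.1333 \\
z(1-\alpha/2) &= z(0.975) \approx 1.96 \\
P\text{-value} &= 2\left[1-\Phi(|-2.1333|)\right] \approx 0.0329
\end{aligned}
```

Since \(z^{*} \approx -2.1333 \leq -1.96\), or equivalently since \(0.0329 \leq 0.05\), \(H_{0}\) is rejected at the \(5\%\) level: the data suggest that the mean assembly time differs from 130 seconds.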
Hypothesis testing for single population mean when the variance \(\sigma^{2}\) is unknown
When the sample comes from a normally distributed population with unknown variance \(\sigma^{2}\), the test concerning the population mean \(\mu\) is based on the following test statistic, which follows a t distribution with \(n-1\) degrees of freedom under \(H_{0}\): $\( \begin{aligned} t^{*} =\frac{\bar{X}-\mu_{0}}{s/\sqrt{n}}\sim t_{n-1} \end{aligned} \)$
where \(\bar{X}\) is the sample mean, \(\mu_{0}\) is the hypothesized population mean value, \(s\) is the sample standard deviation, and \(n\) is the sample size.
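For a two-sided test: $\( \begin{aligned} H_{0} &: \mu = \mu_{0} \nonumber \\ H_{1} &: \mu \neq \mu_{0} \nonumber \end{aligned} \)$
Reject \(H_{0}\) if \(t^{*} \leq -t_{(1-\alpha/2,n-1)}\) OR if \(t^{*} \geq t_{(1-\alpha/2,n-1)}\), otherwise fail to reject \(H_{0}\).
Here, \(t_{(1-\alpha/2,n-1)}\) refers to the upper \(\alpha/2\) percentage point of the t distribution with \(n-1\) degrees of freedom, that is, its \(1-\alpha/2\) quantile.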
For a one-sided upper-tail test: $\( \begin{aligned} H_{0} &: \mu \leq \mu_{0} \nonumber \\ H_{1} &: \mu > \mu_{0} \nonumber \end{aligned} \)$
Reject \(H_{0}\) if \(t^{*} \geq t_{(1-\alpha,n-1)}\), otherwise fail to reject \(H_{0}\).
Here, \(t_{(1-\alpha,n-1)}\) refers to the upper \(\alpha\) percentage point of the t distribution with \(n-1\) degrees of freedom.
For a one-sided lower-tail test: $\( \begin{aligned} H_{0} &: \mu \geq \mu_{0} \nonumber \\ H_{1} &: \mu < \mu_{0} \nonumber \end{aligned} \)$
Reject \(H_{0}\) if \(t^{*} \leq t_{(\alpha,n-1)}=-t_{(1-\alpha,n-1)}\), otherwise fail to reject \(H_{0}\).
Here, \(t_{(1-\alpha,n-1)}\) refers to the upper \(\alpha\) percentage point of the t distribution with \(n-1\) degrees of freedom.
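The critical values for the t-test can be obtained in the same way as for the z-test, with the t quantile function replacing the normal one. The sketch below is illustrative only: the helper name `t_rejection_region` and the label `"one_sided_lower_tail"` are assumptions, not part of the original notebook.

```python
import scipy.stats as stats

def t_rejection_region(n, alpha=0.05, direction="two_sided"):
    """Return (lower, upper) critical values of the t-test with n - 1 degrees
    of freedom; None means that tail is unused."""
    df = n - 1
    if direction == "two_sided":
        t_crit = stats.t.ppf(1 - alpha / 2, df=df)      # t_(1 - alpha/2, n-1)
        return -t_crit, t_crit
    elif direction == "one_sided_upper_tail":
        return None, stats.t.ppf(1 - alpha, df=df)      # t_(1 - alpha, n-1)
    elif direction == "one_sided_lower_tail":
        return stats.t.ppf(alpha, df=df), None          # t_(alpha, n-1)
    raise ValueError(f"unknown direction: {direction!r}")

# The t critical value exceeds the normal one and approaches it as n grows
for n in (5, 25, 100, 1000):
    print(n, round(t_rejection_region(n)[1], 4), round(stats.norm.ppf(0.975), 4))
```

Because \(s\) is only an estimate of \(\sigma\), the t critical values are larger than the corresponding normal ones for small \(n\), which makes rejection harder. The two nearly coincide for large samples.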
Example
In past summers in a large library system, the mean number of books borrowed per cardholder was 8.50.
The library administration would like to test whether the mean number of books borrowed per cardholder this summer under modified loan arrangements differs from the level of past summers (\(H_{1}\)) or not (\(H_{0}\)).
A random sample of 25 cardholders showed the following results for borrowing this summer: \(\bar{X}=9.34\) books and \(s=3.31\) books.
Conduct a hypothesis test of this claim, controlling the significance level at \(0.05\).
# Define the hypothesis testing function for a single normal population
# mean when the population variance is unknown (one-sample t-test)
import numpy as np
import scipy.stats as stats

def OneSampt(Xbar, sd1, n1, mu0 = 0, alpha = 0.05, direction = "two_sided"):
    # Test statistic: t* = (Xbar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom
    t_star = (Xbar-mu0)/np.sqrt(sd1**2/n1)
    # P-value for the requested alternative
    if (direction == "two_sided"):
        p_val = 2*(1 - stats.t.cdf(abs(t_star), df = n1-1))
    elif (direction == "one_sided_upper_tail"):
        p_val = (1 - stats.t.cdf(t_star, df = n1-1))
    else:
        # any other value of direction is treated as a one-sided lower-tail test
        p_val = stats.t.cdf(t_star, df = n1-1)
    # Decision rule: reject H0 when P-value <= alpha
    if (p_val <= alpha):
        Hypothesis_Status = 'Reject Null Hypothesis : Significant'
    else:
        Hypothesis_Status = 'Do not reject Null Hypothesis : Not Significant'
    print (Hypothesis_Status)
    t_star = np.round(t_star,4)
    p_val = np.round(p_val,4)
    return t_star, p_val
OneSampt(Xbar = 9.34, sd1 = 3.31, n1 = 25, mu0 = 8.5, alpha = 0.05, direction = "two_sided")
Do not reject Null Hypothesis : Not Significant
(1.2689, 0.2167)
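The same conclusion can be reached with the critical-value rule instead of the P-value. The short check below reuses only the numbers given in the example; the variable names are chosen here for illustration.

```python
import numpy as np
import scipy.stats as stats

# Summary statistics from the library example
xbar, s, n, mu0, alpha = 9.34, 3.31, 25, 8.5, 0.05

t_star = (xbar - mu0) / (s / np.sqrt(n))        # observed test statistic
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)   # t_(0.975, 24)
reject = abs(t_star) >= t_crit                  # two-sided rejection rule

print(round(t_star, 4), round(t_crit, 4), reject)
```

The observed \(|t^{*}| \approx 1.27\) is well below the critical value of roughly \(2.06\), so \(H_{0}\) is not rejected. This agrees with the P-value of \(0.2167\) exceeding \(0.05\).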
Example
The developer of a decision-support software package wishes to test whether users consider a color graphics enhancement to be beneficial, on balance, given its list price of $800.
A random sample of \(10\) users of the package is invited to try out the enhancement and rate it on a scale ranging from \(-5\) (completely useless) to \(5\) (very beneficial).
The results are as follows: the sample mean \(\bar{X}=0.535\) and the sample standard deviation \(s=2.3\).
Test \(H_{0}:\mu \leq 0\) against \(H_{1}:\mu > 0\) at the significance level of \(0.01\), where \(\mu\) denotes the mean rating of users.
OneSampt(Xbar = 0.535, sd1 = 2.3, n1 = 10, mu0 = 0, alpha = 0.01, direction = "one_sided_upper_tail")
Do not reject Null Hypothesis : Not Significant
(0.7356, 0.2404)
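As a hand check of this result, the test statistic and the upper-tail critical value can be worked out from the numbers in the example, using the tabulated value \(t_{(0.99,9)} \approx 2.821\):

```latex
\begin{aligned}
t^{*} &= \frac{\bar{X}-\mu_{0}}{s/\sqrt{n}}
       = \frac{0.535-0}{2.3/\sqrt{10}}
       \approx \frac{0.535}{0.7273} \approx 0.7356 \\
t_{(1-\alpha,\,n-1)} &= t_{(0.99,\,9)} \approx 2.821
\end{aligned}
```

Since \(t^{*} \approx 0.7356 < 2.821\), \(H_{0}\) is not rejected at the \(1\%\) level, in agreement with the P-value of \(0.2404\) exceeding \(0.01\).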
Session Info
import session_info
session_info.show()
-----
numpy           1.22.4
scipy           1.8.1
session_info    1.0.0
-----
Modules imported as dependencies:
asttokens NA backcall 0.2.0 beta_ufunc NA binom_ufunc NA colorama 0.4.4 cython_runtime NA dateutil 2.8.2 debugpy 1.6.0 decorator 5.1.1 entrypoints 0.4 executing 0.8.3 hypergeom_ufunc NA ipykernel 6.13.0 ipython_genutils 0.2.0 jedi 0.18.1 mpl_toolkits NA nbinom_ufunc NA packaging 21.3 parso 0.8.3 pexpect 4.8.0 pickleshare 0.7.5 pkg_resources NA prompt_toolkit 3.0.29 psutil 5.9.1 ptyprocess 0.7.0 pure_eval 0.2.2 pydev_ipython NA pydevconsole NA pydevd 2.8.0 pydevd_file_utils NA pydevd_plugins NA pydevd_tracing NA pygments 2.12.0 six 1.16.0 sphinxcontrib NA stack_data 0.2.0 tornado 6.1 traitlets 5.2.1.post0 wcwidth 0.2.5 zmq 23.0.0
-----
IPython             8.4.0
jupyter_client      7.3.1
jupyter_core        4.10.0
notebook            6.4.11
-----
Python 3.8.12 (default, May 4 2022, 08:13:04) [GCC 9.4.0]
Linux-5.13.0-1023-azure-x86_64-with-glibc2.2.5
-----
Session information updated at 2022-05-28 16:28