Hypothesis testing for single population mean when the variance \(\sigma^{2}\) is known

  • Assume that we want to make an inference about the mean of a single normally distributed population with known variance, and that we have a random sample of size \(n\) drawn from this population:

\[ X_{1}, X_{2}, \dots, X_{n} \sim N(\mu, \sigma^{2}). \]
  • When the sample comes from a normally distributed population with known variance \(\sigma^{2}\), the test concerning the population mean \(\mu\) is based on the following test statistic, which follows a standard normal distribution under \(H_{0}\):

\[ \begin{aligned} Z^{*} &=\frac{\bar{X}-\mu_{0}}{\sigma/\sqrt{n}}\sim N(0,1) \end{aligned} \]
  • where \(\bar{X}\) is the sample mean, \(\mu_{0}\) is the hypothesized population mean value, \(\sigma\) is the population standard deviation, and \(n\) is the sample size.

  • For a two-sided test:

\[\begin{split} \begin{aligned} H_{0} &: \mu = \mu_{0} \nonumber \\ H_{1} &: \mu \neq \mu_{0} \nonumber \end{aligned} \end{split}\]
  • Reject \(H_{0}\) if \(z^{*} \leq -z(1-\alpha/2)\) OR if \(z^{*} \geq z(1-\alpha/2)\), otherwise fail to reject \(H_{0}\).

  • Here, \(z(1-\alpha/2)\) refers to the upper \(\alpha/2\) percentage point of the standard normal distribution, that is, its \((1-\alpha/2)\) quantile.

  • For a one-sided upper-tail test:

\[\begin{split} \begin{aligned} H_{0} &: \mu \leq \mu_{0} \nonumber \\ H_{1} &: \mu > \mu_{0} \nonumber \end{aligned} \end{split}\]
  • Reject \(H_{0}\) if \(z^{*} \geq z(1-\alpha)\), otherwise fail to reject \(H_{0}\).

  • Here, \(z(1-\alpha)\) refers to the upper \(\alpha\) percentage point of the standard normal distribution.

  • For a one-sided lower-tail test:

\[\begin{split} \begin{aligned} H_{0} &: \mu \geq \mu_{0} \nonumber \\ H_{1} &: \mu < \mu_{0} \nonumber \end{aligned} \end{split}\]
  • Reject \(H_{0}\) if \(z^{*} \leq z(\alpha)=-z(1-\alpha)\), otherwise fail to reject \(H_{0}\).

  • The P-value is the probability, computed assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one actually observed.

  • If \(z^{*}\) is the computed value of the test statistic, the P-value is:

\[\begin{split} \text{P-value} = \begin{cases} 2[1-\Phi(|z^{*}|)] & \quad \text{for} \quad H_{0} : \mu = \mu_{0} \quad \text{vs} \quad H_{1} : \mu \neq \mu_{0} \nonumber \\ 1-\Phi(z^{*}) & \quad \text{for} \quad H_{0} : \mu \leq \mu_{0} \quad \text{vs} \quad H_{1} : \mu > \mu_{0} \nonumber \\ \Phi(z^{*}) & \quad \text{for} \quad H_{0} : \mu \geq \mu_{0} \quad \text{vs} \quad H_{1} : \mu < \mu_{0} \nonumber \end{cases} \end{split}\]
  • where \(\Phi (z^{*}) = Pr(Z \leq z^{*})\) is the standard normal cumulative distribution function.

  • Decision rule: If \(\text{P-value} \leq \alpha\), then reject \(H_{0}\), otherwise fail to reject \(H_{0}\).
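
  • The critical-value rejection rules above can be sketched in code. This is a minimal illustration, not a definitive implementation: the helper name `z_rejects` and the `direction` labels are assumptions made here, and `scipy.stats.norm.ppf` is used as the quantile function that returns \(z(p)\).

```python
from scipy import stats

def z_rejects(z_star, alpha=0.05, direction="two_sided"):
    """Return True if z_star falls in the rejection region (hypothetical helper)."""
    if direction == "two_sided":
        z_crit = stats.norm.ppf(1 - alpha / 2)   # z(1 - alpha/2)
        return bool(abs(z_star) >= z_crit)
    if direction == "one_sided_upper_tail":
        z_crit = stats.norm.ppf(1 - alpha)       # z(1 - alpha)
        return bool(z_star >= z_crit)
    if direction == "one_sided_lower_tail":
        z_crit = stats.norm.ppf(alpha)           # z(alpha) = -z(1 - alpha)
        return bool(z_star <= z_crit)
    raise ValueError("unknown direction: " + str(direction))

# Illustrative value only (not taken from any data set)
print(z_rejects(2.5))   # True, since 2.5 >= z(0.975), which is about 1.96
```

  • Because the standard normal distribution is continuous, this critical-value rule and the P-value decision rule always lead to the same conclusion.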

Example

  • At a certain production facility that assembles computer keyboards, from past experience the assembly time is known to follow a normal distribution with mean (\(\mu\)) of 130 seconds and standard deviation (\(\sigma\)) of 15 seconds.

  • The new production supervisor suspects that the average time to assemble the keyboards differs from the specified value.

  • To examine this problem, he measures the times for 100 assemblies and finds that the sample average assembly time (\(\bar{X}\)) is \(126.8\) seconds.

  • Can the supervisor conclude at the \(5\%\) level of significance that the mean assembly time of 130 seconds is incorrect?

# define the hypothesis testing function for testing single normal population
# mean when pop. variance is known

import numpy as np
import scipy.stats as stats
    
def OneSampZ(Xbar, sigma1, n1, mu0 = 0, alpha = 0.05, direction = "two_sided"):
    
    z_star = (Xbar-mu0)/np.sqrt(sigma1**2/n1)

    if (direction == "two_sided"): 
        p_val = 2*(1 - stats.norm.cdf(abs(z_star)))
    elif (direction == "one_sided_upper_tail"): 
        p_val = (1 - stats.norm.cdf(z_star))
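    elif (direction == "one_sided_lower_tail"):
        p_val = stats.norm.cdf(z_star)
    else:
        raise ValueError("direction must be 'two_sided', 'one_sided_upper_tail' or 'one_sided_lower_tail'")
    
    # decision rule: reject H0 when the P-value is at most alpha
    if (p_val <= alpha):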
        Hypothesis_Status = 'Reject Null Hypothesis : Significant' 
    else:
        Hypothesis_Status = 'Do not reject Null Hypothesis : Not Significant'
 
    print (Hypothesis_Status)

    z_star = np.round(z_star,4)
    p_val = np.round(p_val,4)
    
    return z_star, p_val
OneSampZ(Xbar = 126.8, sigma1 = 15, n1 = 100, mu0 = 130, alpha = 0.05, direction = "two_sided")
Reject Null Hypothesis : Significant
(-2.1333, 0.0329)
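
  • As a hand check of the reported statistic, substituting the example's values into the test statistic gives:

\[ \begin{aligned} z^{*} &=\frac{126.8-130}{15/\sqrt{100}}=\frac{-3.2}{1.5}\approx -2.1333 \end{aligned} \]
  • Since \(|z^{*}| \approx 2.1333\) exceeds the critical value \(z(0.975) \approx 1.96\), the critical-value rule also rejects \(H_{0}\) at the \(5\%\) level, in agreement with the P-value of \(0.0329 \leq 0.05\).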

Hypothesis testing for single population mean when the variance \(\sigma^{2}\) is unknown

  • When the sample comes from a normally distributed population with unknown variance \(\sigma^{2}\), the test concerning the population mean \(\mu\) is based on the following test statistic, which follows a t distribution with \(n-1\) degrees of freedom under \(H_{0}\):

\[ \begin{aligned} t^{*} &=\frac{\bar{X}-\mu_{0}}{s/\sqrt{n}}\sim t_{n-1} \end{aligned} \]

  • where \(\bar{X}\) is the sample mean, \(\mu_{0}\) is the hypothesized population mean value, \(s\) is the sample standard deviation, and \(n\) is the sample size.

  • For a two-sided test:

\[\begin{split} \begin{aligned} H_{0} &: \mu = \mu_{0} \nonumber \\ H_{1} &: \mu \neq \mu_{0} \nonumber \end{aligned} \end{split}\]
  • Reject \(H_{0}\) if \(t^{*} \leq -t_{(1-\alpha/2,n-1)}\) OR if \(t^{*} \geq t_{(1-\alpha/2,n-1)}\), otherwise fail to reject \(H_{0}\).

  • Here, \(t_{(1-\alpha/2,n-1)}\) refers to the upper \(\alpha/2\) percentage point of the t distribution with \(n-1\) degrees of freedom, that is, its \((1-\alpha/2)\) quantile.

  • For a one-sided upper-tail test:

\[\begin{split} \begin{aligned} H_{0} &: \mu \leq \mu_{0} \nonumber \\ H_{1} &: \mu > \mu_{0} \nonumber \end{aligned} \end{split}\]

  • Reject \(H_{0}\) if \(t^{*} \geq t_{(1-\alpha,n-1)}\), otherwise fail to reject \(H_{0}\).

  • Here, \(t_{(1-\alpha,n-1)}\) refers to the upper \(\alpha\) percentage point of the t distribution with \(n-1\) degrees of freedom.

  • For a one-sided lower-tail test:

\[\begin{split} \begin{aligned} H_{0} &: \mu \geq \mu_{0} \nonumber \\ H_{1} &: \mu < \mu_{0} \nonumber \end{aligned} \end{split}\]
  • Reject \(H_{0}\) if \(t^{*} \leq t_{(\alpha,n-1)}=-t_{(1-\alpha,n-1)}\), otherwise fail to reject \(H_{0}\).

  • Here, \(t_{(\alpha,n-1)}\) is the lower \(\alpha\) percentage point of the t distribution with \(n-1\) degrees of freedom; by symmetry it equals the negative of the upper \(\alpha\) percentage point \(t_{(1-\alpha,n-1)}\).
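
  • The critical values in the three rejection rules above can be obtained from the t quantile function. The following is a minimal sketch under stated assumptions: the helper name `t_rejection_bounds` and the `direction` labels are illustrative, and `scipy.stats.t.ppf` is used to evaluate \(t_{(p,n-1)}\).

```python
import math
from scipy import stats

def t_rejection_bounds(n, alpha=0.05, direction="two_sided"):
    """Return (lower, upper); reject H0 when t* <= lower or t* >= upper.

    Hypothetical helper: n is the sample size, so df = n - 1.
    """
    df = n - 1
    if direction == "two_sided":
        t_crit = stats.t.ppf(1 - alpha / 2, df)         # t_(1 - alpha/2, n - 1)
        return -t_crit, t_crit
    if direction == "one_sided_upper_tail":
        return -math.inf, stats.t.ppf(1 - alpha, df)    # t_(1 - alpha, n - 1)
    if direction == "one_sided_lower_tail":
        return stats.t.ppf(alpha, df), math.inf         # t_(alpha, n - 1)
    raise ValueError("unknown direction: " + str(direction))

# Illustrative call: two-sided test with n = 25 at the 5% level
lower, upper = t_rejection_bounds(25, alpha=0.05)
print(lower, upper)
```

  • Returning an infinite bound on the unused side lets the same comparison, \(t^{*} \leq\) lower or \(t^{*} \geq\) upper, serve all three cases.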

Example

  • In past summers in a large library system, the mean number of books borrowed per cardholder was 8.50.

  • The library administration would like to test whether the mean number of books borrowed per cardholder this summer under modified loan arrangements differs from the level of past summers (\(H_{1}\)) or not (\(H_{0}\)).

  • A random sample of 25 cardholders showed the following results for borrowing this summer: \(\bar{X}=9.34\) books and \(s=3.31\) books.

  • Conduct a hypothesis test of this claim, controlling the significance level at \(0.05\).

# define the hypothesis testing function for testing single normal population
# mean when pop. variance is unknown

import numpy as np
import scipy.stats as stats
    
def OneSampt(Xbar, sd1, n1, mu0 = 0, alpha = 0.05, direction = "two_sided"):
    
    t_star = (Xbar-mu0)/np.sqrt(sd1**2/n1)

    if (direction == "two_sided"): 
        p_val = 2*(1 - stats.t.cdf(abs(t_star), df = n1-1))
    elif (direction == "one_sided_upper_tail"): 
        p_val = (1 - stats.t.cdf(t_star, df = n1-1))
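    elif (direction == "one_sided_lower_tail"):
        p_val = stats.t.cdf(t_star, df = n1-1)
    else:
        raise ValueError("direction must be 'two_sided', 'one_sided_upper_tail' or 'one_sided_lower_tail'")
    
    # decision rule: reject H0 when the P-value is at most alpha
    if (p_val <= alpha):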
        Hypothesis_Status = 'Reject Null Hypothesis : Significant' 
    else:
        Hypothesis_Status = 'Do not reject Null Hypothesis : Not Significant'
 
    print (Hypothesis_Status)

    t_star = np.round(t_star,4)
    p_val = np.round(p_val,4)
    
    return t_star, p_val
OneSampt(Xbar = 9.34, sd1 = 3.31, n1 = 25, mu0 = 8.5, alpha = 0.05, direction = "two_sided")
Do not reject Null Hypothesis : Not Significant
(1.2689, 0.2167)
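
  • As a hand check of the reported statistic, substituting the example's values into the test statistic gives:

\[ \begin{aligned} t^{*} &=\frac{9.34-8.5}{3.31/\sqrt{25}}=\frac{0.84}{0.662}\approx 1.2689 \end{aligned} \]
  • Since \(|t^{*}| \approx 1.2689\) is smaller than the critical value \(t_{(0.975,24)} \approx 2.064\), the critical-value rule also fails to reject \(H_{0}\) at the \(5\%\) level, in agreement with the P-value of \(0.2167 > 0.05\).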

Example

  • The developer of a decision-support software package wishes to test whether users consider a color graphics enhancement to be beneficial, on balance, given its list price of $800.

  • A random sample of \(10\) users of the package is invited to try out the enhancement and rate it on a scale ranging from \(-5\) (completely useless) to \(5\) (very beneficial).

  • The results are as follows: the sample mean \(\bar{X}=0.535\) and the sample standard deviation \(s=2.3\).

  • Test the hypothesis \(H_{0}:\mu \leq 0\) against \(H_{1}:\mu > 0\) at the significance level of \(0.01\), where \(\mu\) denotes the mean rating of users.

OneSampt(Xbar = 0.535, sd1 = 2.3, n1 = 10, mu0 = 0, alpha = 0.01, direction = "one_sided_upper_tail")
Do not reject Null Hypothesis : Not Significant
(0.7356, 0.2404)
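
  • The same upper-tail decision can be cross-checked against the critical-value rule. This is a minimal sketch that uses only the summary statistics from the example; the variable names are illustrative, and `stats.t.sf` (the survival function, \(1-\) CDF) is swapped in here for `1 - stats.t.cdf` as an equivalent way of getting the upper-tail probability.

```python
import numpy as np
from scipy import stats

# Summary statistics from the example above
xbar, s, n, mu0, alpha = 0.535, 2.3, 10, 0.0, 0.01

t_star = (xbar - mu0) / (s / np.sqrt(n))     # observed test statistic
t_crit = stats.t.ppf(1 - alpha, df=n - 1)    # t_(1 - alpha, n - 1)
p_val = stats.t.sf(t_star, df=n - 1)         # upper-tail P-value, P(T >= t*)

reject_by_critical_value = bool(t_star >= t_crit)
reject_by_p_value = bool(p_val <= alpha)

print(round(float(t_star), 4), round(float(t_crit), 4), round(float(p_val), 4))
print(reject_by_critical_value, reject_by_p_value)
```

  • Both flags come out `False`: the observed statistic is well below the critical value, so neither rule rejects \(H_{0}\) at the \(1\%\) level.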

Session Info

import session_info
session_info.show()
-----
numpy               1.22.4
scipy               1.8.1
session_info        1.0.0
-----
IPython             8.4.0
jupyter_client      7.3.1
jupyter_core        4.10.0
notebook            6.4.11
-----
Python 3.8.12 (default, May  4 2022, 08:13:04) [GCC 9.4.0]
Linux-5.13.0-1023-azure-x86_64-with-glibc2.2.5
-----
Session information updated at 2022-05-28 16:28