Mathematics 215: Introduction to Statistics

Study Guide

Unit 4: Self-Test Answer Key

Show all your work and keep your calculations to four decimal places, unless otherwise stated.

Circle True (T) or False (F) for each of the following:
1. Answer: False
  
  The standard deviation of the sampling distribution of the sample mean is equal to the population standard deviation.
2. Answer: False
  
  If the population distribution is positively skewed, then the sampling distribution of the sample mean is also positively skewed.
3. Answer: False
  
  When the population standard deviation is unknown and the sample size exceeds 30, the $z$ distribution is used to compute a confidence interval for the population mean.
4. Answer: True
  
  When the population standard deviation is known and the sample size exceeds 30, the $z$ distribution is used to compute a confidence interval for the population mean.
5. Answer: True
  
  A larger sample size will tend to reduce the width of a confidence interval.
6. Answer: False
  
  In conducting a test of hypothesis, if the $p$-value exceeds the level of significance, we reject the null hypothesis.
7. Answer: True
  
  In conducting a test of hypothesis, if the alternative hypothesis consists of a “ $<$ ” expression, the critical value will be a negative number.
8. Answer: False
  
  In conducting a test of hypothesis, if the $p$-value is less than 0.001, the evidence against the null hypothesis is considered to be very weak.
Past census surveys in a large Canadian province indicate that 40% of provincial voters favor the implementation of a carbon tax to combat global warming.

Consider a sampling (random) experiment where 100 voters are selected at random and the sample proportion of voters who favor a carbon tax is to be observed.
1. What would be the shape of the sampling distribution of the sample proportion in favor of the carbon tax, and why?
  
  Answer: The sampling distribution would be a normal distribution according to the central limit theorem, as both ( $n p = 40$ ) and ( $n q = 60$ ) exceed five.
2. Determine the mean of the sampling distribution of the sample proportion in favor of the carbon tax.
  
  Solution: $μ_{\hat{p}} = p = 0.40$
3. Determine the standard deviation of the sampling distribution of the sample proportion in favor of the carbon tax.
  
  Solution:
  
  $σ_{\hat{p}} = \sqrt{\frac{p q}{n}} = \sqrt{\frac{(0.4) (0.6)}{100}} = \sqrt{0.0024} = 0.0490$
4. Find the probability (to 4 decimal places) that, in the random sample of 100 voters, the sample proportion who favor a carbon tax is:
  1. less than 0.30
    
    Solution:
    
    $\begin{array}{rl} P (\hat{p} < 0.30) & = P \left(z < \frac{0.30 - 0.40}{0.0490}\right) \\ & = P (z < - 2.0408) \\ & = 0.0207 \end{array}$
  2. between 0.30 and 0.40
    
    Solution:
    
    $\begin{array}{rl} P (0.30 < \hat{p} < 0.40) & = P (- 2.04 < z < 0) \\ & = 0.50 - 0.0207 \\ & = 0.4793 \end{array}$
5. If, among the 100 voters selected at random, 44 are in favor of a carbon tax, compute the sampling error. Assume that there are no non-sampling errors.
  
  Solution:
  
  Sampling error $= \hat{p} - p = 0.44 - 0.40 = 0.04$
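
The sampling-distribution calculations above can be checked with a short script. This is a minimal sketch, assuming Python 3.8+ for `statistics.NormalDist`; the variable names are illustrative and not part of the course material. Because the script does not round $z$ to two decimals before looking up the area, its probabilities can differ from the table-based answers in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.40, 100                 # population proportion and sample size from the question
q = 1 - p

mean_p_hat = p                   # mean of the sampling distribution of the sample proportion
sd_p_hat = sqrt(p * q / n)       # standard deviation of the sample proportion

# Normal approximation, justified because np and nq both exceed five
sampling_dist = NormalDist(mu=mean_p_hat, sigma=sd_p_hat)

prob_below_030 = sampling_dist.cdf(0.30)
prob_030_to_040 = sampling_dist.cdf(0.40) - sampling_dist.cdf(0.30)

# Sampling error for an observed sample proportion of 44/100
sampling_error = 44 / 100 - p

print(f"sd of p-hat:          {sd_p_hat:.4f}")
print(f"P(p-hat < 0.30):      {prob_below_030:.4f}")
print(f"P(0.30 < p-hat < 0.40): {prob_030_to_040:.4f}")
print(f"sampling error:       {sampling_error:.2f}")
```
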
Recent studies involving all students in a community college found that these students spend an average of 20 hours a week on homework outside of the classroom, with a standard deviation of 4 hours per week. Assume that the data collected follows a normal distribution.

Note that the shaded areas of the diagrams below are exaggerated so that the areas are clearly marked.

If a random sample of 25 students from this community college is selected, find the probability (to 4 decimals) that the sample mean weekly homework hours will be
1. at least 22 hours.
  
  Solution:
  
  $\begin{array}{rl} P (\bar{x} \geq 22) & = P \left(z \geq \frac{22 - 20}{4 / \sqrt{25}}\right) \\ & = P (z \geq 2.5) \\ & = 1 - 0.9938 \\ & = 0.0062 \end{array}$
  
  Note that $\frac{4}{\sqrt{25}} = 0.80$ .
2. between 18 and 22 hours.
  
  Solution:
  
  $\begin{array}{rl} P (18 < \bar{x} < 22) & = P \left(\frac{18 - 20}{0.8} < z < \frac{22 - 20}{0.8}\right) \\ & = P (- 2.5 < z < 2.5) \\ & = 0.9938 - 0.0062 \\ & = 0.9876 \end{array}$
3. less than 10 hours per week.
  
  Solution:
  
  $\begin{array}{rl} P (\bar{x} < 10) & = P \left(z < \frac{10 - 20}{0.8}\right) \\ & = P (z < - 12.5) \\ & \approx 0 \end{array}$
  
  Looks like this was a random sample of really serious, hard-working students. A larger sample may have produced a different result.
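
The three probabilities for the sample mean can be reproduced as follows. This is a minimal sketch, assuming Python 3.8+ for `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 20, 4, 25             # population mean, population sd, sample size
sd_x_bar = sigma / sqrt(n)           # standard deviation of the sample mean (0.80)

# The population is normal, so the sample mean is normally distributed
x_bar_dist = NormalDist(mu=mu, sigma=sd_x_bar)

p_at_least_22 = 1 - x_bar_dist.cdf(22)
p_18_to_22 = x_bar_dist.cdf(22) - x_bar_dist.cdf(18)
p_below_10 = x_bar_dist.cdf(10)

print(f"P(x-bar >= 22):     {p_at_least_22:.4f}")
print(f"P(18 < x-bar < 22): {p_18_to_22:.4f}")
print(f"P(x-bar < 10):      {p_below_10:.4f}")
```
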
What is the minimum sample size needed for a 99% confidence interval estimate for the population proportion to have a maximum margin of error of 0.06
1. if there is a preliminary estimate of 0.80?
  
  Solution:
  
  Given $\hat{p} = 0.80$ , $\hat{q} = 0.20$ , $E = 0.06$ ,
  
  find $z$ such that 0.995 of the distribution is to the left of $z$ .
  
  $z = 2.58$
  
  $\begin{array}{rl} n & = \frac{z^{2} \hat{p} \hat{q}}{E^{2}} \\ & = \frac{{2.58}^{2} (0.80) (0.20)}{{0.06}^{2}} \\ & = 295.84 \end{array}$
  
  Rounding up, $n = 296$ is the sample size with the given preliminary estimate.
  
  Always round up to a whole number for a sample size.
  
  Note: If you take $z = 2.575$ (instead of the rounded value $2.58$), then $n = 294.69$, which rounds up to a sample size of $295$. This is also a correct answer.
2. if there is no preliminary estimate, so the most conservative estimate must be used?
  
  Solution:
  
  $\begin{array}{rl} n & = \frac{z^{2} (0.50) (0.50)}{{0.06}^{2}} \\ & = \frac{{2.58}^{2} (0.25)}{{0.06}^{2}} \\ & = 462.25 \end{array}$
  
  Rounding up, $n = 463$ is the most conservative sample size in this case.
  
  Note: If you take $z = 2.575$ (rather than $2.58$), then $n = 460.46$, which rounds up to $461$ as the most conservative sample size. This is also a correct answer.
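
The sample-size formula and the "always round up" rule can be sketched in a few lines. The helper name `sample_size_for_proportion` is hypothetical, and the $z$ values are typed in from the normal table rather than computed.

```python
from math import ceil

def sample_size_for_proportion(z, p_hat, margin):
    """Minimum n so that z * sqrt(p_hat * q_hat / n) does not exceed the margin of error."""
    q_hat = 1 - p_hat
    return ceil(z**2 * p_hat * q_hat / margin**2)   # always round up

# 99% confidence, margin of error 0.06
n_with_estimate = sample_size_for_proportion(2.58, 0.80, 0.06)   # preliminary estimate 0.80
n_conservative = sample_size_for_proportion(2.58, 0.50, 0.06)    # most conservative estimate 0.50

print(n_with_estimate, n_conservative)
```

Calling the same helper with `z=2.575` gives the alternative answers mentioned in the notes above.
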
In a recent municipal survey, 2,000 randomly selected taxpayers were sampled and 1,200 adults stated that they are in favor of constructing a new hockey arena.

Construct a 90% confidence interval (calculated to 4 decimal places) to estimate the percentage of all municipal taxpayers that are in favor of constructing the hockey arena.

Solution:

Find $\hat{p} = \frac{1200}{2000} = 0.60$ .

Find $z$ from Table IV (Appendix B in the eText), such that the area to the left of $z$ is 0.95:

$z = 1.645$

The confidence interval is:
$\begin{array}{rl} \hat{p} \pm z \sqrt{\frac{\hat{p} \hat{q}}{n}} & = 0.60 \pm 1.645 \sqrt{\frac{(0.60) (0.40)}{2000}} \\ & = 0.60 \pm 1.645 (0.0110) \\ & = 0.60 \pm 0.0181 \\ & = 0.5819 \text{ to } 0.6181 \end{array}$
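
As a check on the arithmetic, the interval can be computed directly. This is a minimal sketch with illustrative variable names; $z = 1.645$ is taken from the table as in the solution. The script keeps full precision for the standard error, so its endpoints can differ from the hand calculation in the fourth decimal place.

```python
from math import sqrt

favorable, n = 1200, 2000        # taxpayers in favor and sample size
z = 1.645                        # table value for a 90% confidence level

p_hat = favorable / n
q_hat = 1 - p_hat
std_error = sqrt(p_hat * q_hat / n)   # estimated standard deviation of p-hat
margin = z * std_error

lower, upper = p_hat - margin, p_hat + margin
print(f"90% CI for p: {lower:.4f} to {upper:.4f}")
```
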

Past market research indicates that the ages of all the regular customers of a large fitness club are normally distributed. A recent sample of 6 randomly selected regular customers resulted in the following stem-and-leaf display of the ages of the selected customers:

1	8	9
2	2	4	6
3	3

Construct a 95% confidence interval estimate for the population mean age of all the club’s regular customers.

Solution:

Formula to use: confidence interval: $\bar{x} \pm t \frac{s}{\sqrt{n}}$

To find the sample mean and standard deviation:

Note that the six ages in the sample are 18, 19, 22, 24, 26 and 33.

$x$	$x^{2}$
$18$	$324$
$19$	$361$
$22$	$484$
$24$	$576$
$26$	$676$
$33$	$1089$
$142 = \sum x$	$3510 = \sum x^{2}$

Sample mean $= \frac{\sum x}{n} = \frac{142}{6} = 23.6667$

Sample standard deviation:

$\begin{array}{rl} s & = \sqrt{\frac{\sum x^{2} - \frac{(\sum x)^{2}}{n}}{n - 1}} \\ & = \sqrt{\frac{3510 - \frac{(142)^{2}}{6}}{6 - 1}} \\ & = \sqrt{\frac{3510 - 3360.6667}{5}} \\ & = 5.4650 \end{array}$

To find $t$ : degrees of freedom $= n - 1 = 6 - 1 = 5$ .

Find $t$ such that the area to the right of $t$ equals 0.025: $t = 2.571$ .

Confidence interval:
$\begin{array}{rl} \bar{x} \pm t \frac{s}{\sqrt{n}} & = 23.6667 \pm (2.571) \frac{5.4650}{\sqrt{6}} \\ & = 23.6667 \pm (2.571) (2.2311) \\ & = 23.6667 \pm 5.7362 \end{array}$

Answer: $17.9305$ to $29.4029$
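
The sample statistics and the interval can be verified with Python's standard library. This is a minimal sketch: `statistics.stdev` uses the $n - 1$ divisor, matching the formula above, and the critical value $t = 2.571$ is typed in from the $t$ table because the standard library has no $t$ distribution.

```python
from math import sqrt
from statistics import mean, stdev

ages = [18, 19, 22, 24, 26, 33]   # read from the stem-and-leaf display
t_crit = 2.571                    # table value: df = 5, right-tail area 0.025

n = len(ages)
x_bar = mean(ages)
s = stdev(ages)                   # sample standard deviation (n - 1 divisor)
margin = t_crit * s / sqrt(n)

lower, upper = x_bar - margin, x_bar + margin
print(f"mean = {x_bar:.4f}, s = {s:.4f}")
print(f"95% CI for the mean: {lower:.4f} to {upper:.4f}")
```
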

A medical researcher wishes to estimate, within 2 points, the average systolic blood pressure of university students located in a Canadian province. If the researcher wishes to be 96% confident, how large a sample should she select if the population standard deviation systolic blood pressure for all the provincial university students is 6.0?

Solution:

Formula to use: $n = \frac{z^{2} σ^{2}}{E^{2}}$

To find $z$ :

$z$ is such that 98% of the distribution falls to the left of $z$ . So, from the normal table, $z = 2.05$ .

$n = \frac{z^{2} σ^{2}}{E^{2}} = \frac{({2.05}^{2}) (6^{2})}{2^{2}} = 37.8225$ , so $38$ is the desired sample size.
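
The same calculation as a short script, with illustrative variable names and $z = 2.05$ taken from the normal table as in the solution:

```python
from math import ceil

z = 2.05        # table value with 98% of the distribution to its left (96% confidence)
sigma = 6.0     # population standard deviation of systolic blood pressure
margin = 2      # desired maximum error, in points

n_exact = z**2 * sigma**2 / margin**2
n_required = ceil(n_exact)          # always round a sample size up

print(f"n = {n_exact:.4f}, so sample {n_required} students")
```
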

A census survey indicates that the national average family size was 3.25 persons per family in 2015. A 2018 sample of families randomly selected across the country results in the following family sizes:

4, 2, 3, 2, 1, 3, 4, 2, 5, 4

Assuming that the population of family sizes is normally distributed, conduct a test of hypothesis at the 5% level to determine if the average family size has decreased between 2015 and 2018.

Show all key steps using the $p$-value approach.

Solution:

Let $μ$ = mean family size.

Step 1: State the null hypothesis ( $H_{0}$ ) and the alternative hypothesis ( $H_{1}$ ).

$H_{0}$ : $μ = 3.25$ (Mean family size in 2015)

$H_{1}$ : $μ < 3.25$ (Mean family size less than 3.25)

Step 2: Select the distribution to use.

Select the $t$ distribution, as the population is normally distributed and the population standard deviation is unknown.

Step 3: Calculate the $p$-value.

To find the $p$-value, we must first find the sample test statistic $t$ as follows:

$t = \frac{\bar{x} - μ}{(\frac{s}{\sqrt{n}})}$ , where $μ = 3.25$ and $n = 10$ .

To find $\bar{x}$ and $s$ :

$x$	$x^{2}$
$4$	$16$
$2$	$4$
$3$	$9$
$2$	$4$
$1$	$1$
$3$	$9$
$4$	$16$
$2$	$4$
$5$	$25$
$4$	$16$
$30 = \sum x$	$104 = \sum x^{2}$

$\begin{array}{rl} \bar{x} & = \frac{\sum x}{n} = \frac{30}{10} = 3 \\ s & = \sqrt{\frac{\sum x^{2} - \frac{(\sum x)^{2}}{n}}{n - 1}} \\ & = \sqrt{\frac{104 - \frac{(30)^{2}}{10}}{9}} \\ & = \sqrt{\frac{104 - 90}{9}} \\ & = 1.2472 \\ t & = \frac{\bar{x} - μ}{(\frac{s}{\sqrt{n}})} \\ & = \frac{3 - 3.25}{(\frac{1.2472}{\sqrt{10}})} \\ & = \frac{- 0.25}{0.3944} \\ & = - 0.6339 \text{ (sample test statistic)} \end{array}$

The $p$-value is the area to the left of the sample test statistic -0.6339 under the $t$-distribution curve as shown below.

Steps in estimating the $p$-value: Read down the $t$-table in Table V in Appendix B of the eText until you find the appropriate degrees of freedom, in this case $df = n - 1 = 9$. Ignoring the sign of the calculated test statistic, locate $0.6339$ in the row with 9 degrees of freedom ($df$). It falls to the left of 1.383, the smallest entry in that row, which corresponds to a right-tail area of 0.10. A smaller $t$ value has a larger tail area, so we can conclude that the $p$-value exceeds 0.10.

$d f$	Area in the Right Tail Under the $t$ Distribution Curve
$d f$	.10	.05	.025	.01	.005	.001
1	3.078	6.314	12.706	31.821	63.657	318.309
2	1.886	2.920	4.303	6.965	9.925	22.327
3	1.638	2.353	3.182	4.541	5.841	10.215
4	1.533	2.132	2.776	3.747	4.604	7.173
5	1.476	2.015	2.571	3.365	4.032	5.893
6	1.440	1.943	2.447	3.143	3.707	5.208
7	1.415	1.895	2.365	2.998	3.499	4.785
8	1.397	1.860	2.306	2.896	3.355	4.501
9	1.383	1.833	2.262	2.821	3.250	4.297
10	1.372	1.812	2.228	2.764	3.169	4.144

Source: Adapted from Prem S. Mann, Introductory Statistics, 9th ed., B21.

Step 4: Make a decision.

Since the $p$-value, which exceeds 0.10, exceeds the level of significance of 0.05, we do not reject $H_{0}$. We cannot conclude that the average family size has decreased below the 2015 value of 3.25 persons per family.

Show all key steps using the critical value approach.

Solution:

Let $μ$ = mean family size

Step 1: State the null hypothesis ( $H_{0}$ ) and the alternative hypothesis ( $H_{1}$ ).

$H_{0}$ : $μ = 3.25$ (mean family size in 2015)

$H_{1}$ : $μ < 3.25$ (mean family size less than 3.25)

Step 2: Select the distribution to use.

Select the $t$ distribution, as the population is normally distributed and the population standard deviation is unknown.

Step 3: Determine the rejection and non-rejection regions.

Given that $α = 0.05$ and that $μ < 3.25$ for $H_{1}$, the area in the left tail is 0.05. The degrees of freedom are $n - 1 = 10 - 1 = 9$. The critical value is -1.833, so the rejection region consists of all $t$ values less than -1.833.

Step 4: Calculate the value of the test statistic.

$\begin{array}{rl} t & = \frac{\bar{x} - μ}{(\frac{s}{\sqrt{n}})} \\ & = \frac{3 - 3.25}{(\frac{1.2472}{\sqrt{10}})} \\ & = \frac{- 0.25}{0.3944} \\ & = - 0.6339 \end{array}$

Step 5: Make a decision.

Since the sample test statistic -0.6339 is NOT in the rejection region, we do NOT reject the null hypothesis. We cannot conclude that the average family size has decreased below the 2015 value of 3.25 persons per family.
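
The test statistic and the critical value decision for the family-size test can be reproduced as follows. This is a minimal sketch with illustrative variable names; the critical value -1.833 is typed in from the $t$ table (9 degrees of freedom, left-tail area 0.05) because the standard library has no $t$ distribution.

```python
from math import sqrt
from statistics import mean, stdev

family_sizes = [4, 2, 3, 2, 1, 3, 4, 2, 5, 4]   # 2018 sample
mu_0 = 3.25                                     # 2015 census mean under H0
t_critical = -1.833                             # table value: df = 9, left-tail area 0.05

n = len(family_sizes)
x_bar = mean(family_sizes)
s = stdev(family_sizes)                         # sample standard deviation (n - 1 divisor)
t_stat = (x_bar - mu_0) / (s / sqrt(n))

# Left-tailed test: reject H0 only if the statistic falls below the critical value
reject_h0 = t_stat < t_critical

print(f"x-bar = {x_bar}, s = {s:.4f}, t = {t_stat:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```
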

A large online retail company claims that more than 80% of all its orders are delivered to customers’ homes within 72 hours. A researcher working for the Department of Consumer and Corporate Affairs, suspicious of this claim, took a random sample of 400 orders and found that 330 of them were delivered to homes within a 72-hour period. Conduct a test of hypothesis at the 1% level to determine if the random sample supports the retailer’s claim.
1. Show all key steps using the $p$-value approach.
  
  Solution:
  
  Note: In this solution, there are two meanings for “$p$.” The $p$-value is a probability. The $p$ in the hypothesis statements means the proportion of orders delivered within the stated time period.
  
  Let $p$ = proportion of orders delivered within 72 hours.
  
  Step 1: State the null hypothesis ( $H_{0}$ ) and alternative hypothesis ( $H_{1}$ ).
  
  $H_{0}$ : $p \leq 0.80$
  
  $H_{1}$ : $p > 0.80$ (proportion of delivered orders exceeds 80%)
  
  Step 2: Select the distribution to use.
  
  We use the $z$ distribution, as both $n p = 400 \times 0.80 = 320$ and $n q = 400 \times 0.20 = 80$ exceed five.
  
  Step 3: Calculate the $p$-value.
  
  To find the $p$-value, we must first find the sample test statistic $z$ as follows:
  
  $\hat{p} = \frac{330}{400} = 0.825$
  
  $σ_{\hat{p}} = \sqrt{\frac{(0.80) (0.20)}{400}} = 0.02$
  
  $z = \frac{0.825 - 0.80}{0.02} = 1.25$
  
  The $p$-value is the area to the right of the sample test statistic 1.25 under the $z$-distribution curve, which equals $1 - 0.8944 = 0.1056$ as shown below.
  
  Step 4: Make a decision.
  
  Since the $p$-value of 0.1056 exceeds the level of significance of 0.01, we do not reject the null hypothesis. We cannot conclude that more than 80% of all the company’s orders are delivered to customers’ homes within 72 hours.
2. Show all key steps using the critical value approach.
  
  Solution:
  
  Let $p$ = proportion of orders delivered within 72 hours.
  
  Step 1: State the null hypothesis ( $H_{0}$ ) and alternative hypothesis ( $H_{1}$ ).
  
  $H_{0}$ : $p \leq 0.80$
  
  $H_{1}$ : $p > 0.80$ (proportion of delivered orders exceeds 80%)
  
  Step 2: Select the distribution to use.
  
  We use the $z$ distribution, as both $n p = 400 \times 0.80 = 320$ and $n q = 400 \times 0.20 = 80$ exceed five.
  
  Step 3: Determine the rejection and non-rejection regions.
  
  Given that $α = 0.01$ and that $p > 0.80$ is in $H_{1}$, the area in the right tail is 0.01 and the critical value is 2.33. The rejection region is $z > 2.33$ and the non-rejection region is $z \leq 2.33$.
  
  Step 4: Calculate the value of the test statistic.
  
  $\hat{p} = \frac{330}{400} = 0.825$
  
  $σ_{\hat{p}} = \sqrt{\frac{(0.80) (0.20)}{400}} = 0.02$
  
  $z = \frac{0.825 - 0.80}{0.02} = 1.25$
  
  Step 5: Make a decision.
  
  Since the sample test statistic 1.25 is not beyond the critical value 2.33, it is in the non-rejection region, so we do not reject the null hypothesis. We cannot conclude that more than 80% of all the company’s orders are delivered to customers’ homes within 72 hours.
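
Both approaches to the delivery-time test can be checked with one script. This is a minimal sketch, assuming Python 3.8+ for `statistics.NormalDist`; the variable names are illustrative and the critical value 2.33 is the table value used in the solution.

```python
from math import sqrt
from statistics import NormalDist

delivered, n = 330, 400     # orders delivered within 72 hours, sample size
p_0 = 0.80                  # boundary value of p under H0
alpha = 0.01
z_critical = 2.33           # table value: right-tail area 0.01

p_hat = delivered / n
sd_p_hat = sqrt(p_0 * (1 - p_0) / n)    # standard deviation of p-hat assuming H0 is true
z_stat = (p_hat - p_0) / sd_p_hat

# Right-tailed test: the p-value is the area to the right of the test statistic
p_value = 1 - NormalDist().cdf(z_stat)

reject_by_p_value = p_value < alpha
reject_by_critical_value = z_stat > z_critical

print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_by_p_value else "Do not reject H0")
```

The two decision rules agree, as they always should for the same significance level.
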
In 2014 the average cost of all weddings in the country was $23,000. A recent sample of 64 couples who got married this year produced a mean wedding cost of $24,500 with a standard deviation of $4,400. Conduct a test of hypothesis at the 5% level to determine if the average cost of weddings has changed.
1. Show all key steps using the $p$-value approach.
  
  Solution:
  
  Let $μ$ = average wedding cost
  
  Step 1: State the null hypothesis ( $H_{0}$ ) and alternative hypothesis ( $H_{1}$ ).
  
  $H_{0}$ : $μ = 23000$
  
  $H_{1}$ : $μ \neq 23000$ (mean wedding cost has changed)
  
  Step 2: Select the distribution to use.
  
  Select the $t$ distribution as the sample size is large and the population standard deviation is unknown.
  
  Step 3: Calculate the $p$-value.
  
  To find the $p$-value, we must first find the sample test statistic $t$ as follows:
  
  $\begin{array}{rl} t & = \pm \frac{\bar{x} - μ}{(\frac{s}{\sqrt{n}})} \\ & = \pm \frac{24500 - 23000}{(\frac{4400}{\sqrt{64}})} \\ & = \pm \frac{1500}{550} \\ & = \pm 2.7273 \end{array}$
  
  The $p$-value is the combined area beyond the two $t$ values $\pm 2.7273$ as shown below.
  
  Steps in estimating the $p$-value: Read down the $t$-table in Table V in Appendix B of the eText until you find the appropriate degrees of freedom, which in this case is $df = 64 - 1 = 63$. Ignoring the sign of the calculated test statistic, locate it (2.7273) in the row with 63 degrees of freedom. It falls between 2.656 and 3.225. Read to the top of the table to locate the right-tail areas related to 2.656 and 3.225, which are 0.005 and 0.001. Since this is a two-tailed test, the estimated $p$-value is between $2 \times 0.001 = 0.002$ and $2 \times 0.005 = 0.01$.
  
  Step 4: Make a decision.
  
  Since the estimated $p$-value is between 0.002 and 0.01, which is less than the 0.05 level of significance, we reject $H_{0}$. Therefore, we can conclude that the mean wedding cost has changed from its 2014 amount.
2. How strong is the evidence against the null hypothesis ( $H_{0}$ )? Explain your reasoning. (See Additional Topic 4C: The $p$-Value and Critical Value Approaches in Unit 4 of the Study Guide, Section 4-7.)
  
  Answer: A $p$-value ranging between 0.002 and 0.01 is classed as “strong evidence” against the null hypothesis.
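
The bracketing of the two-tailed $p$-value can be sketched in code. This is a minimal sketch with illustrative variable names: the standard library has no $t$ distribution, so the script brackets the $p$-value between the two table entries quoted in the solution (row with 63 degrees of freedom) rather than computing it exactly.

```python
from math import sqrt

x_bar, mu_0 = 24500, 23000      # sample mean and 2014 mean under H0
s, n = 4400, 64                 # sample standard deviation and sample size
alpha = 0.05

t_stat = (x_bar - mu_0) / (s / sqrt(n))

# Table entries for df = 63, as quoted in the solution: right-tail area -> t value
t_for_area_005 = 2.656          # right-tail area 0.005
t_for_area_001 = 3.225          # right-tail area 0.001

# The statistic lies between the two entries, so each tail area lies between
# 0.001 and 0.005; a two-tailed test doubles both bounds.
assert t_for_area_005 < abs(t_stat) < t_for_area_001
p_value_low, p_value_high = 2 * 0.001, 2 * 0.005

reject_h0 = p_value_high < alpha
print(f"t = {t_stat:.4f}, {p_value_low} < p-value < {p_value_high}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```
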