Mathematics 215: Introduction to Statistics

Study Guide

Unit 4: Self-Test Answer Key

Show all your work and keep your calculations to four decimal places, unless otherwise stated.

  1. Circle True (T) or False (F) for each of the following:
    1. T  F  (correct answer: False)
      The standard deviation of the sampling distribution of the sample mean is equal to the population standard deviation.
    2. T  F  (correct answer: False)
      If the population distribution is positively skewed, then the sampling distribution of the sample mean is also positively skewed.
    3. T  F  (correct answer: False)
      When the population standard deviation is unknown and the sample size exceeds 30, the z distribution is used to compute a confidence interval for the population mean.
    4. T  F  (correct answer: True)
      When the population standard deviation is known and the sample size exceeds 30, the z distribution is used to compute a confidence interval for the population mean.
    5. T  F  (correct answer: True)
      A larger sample size will tend to reduce the width of a confidence interval.
    6. T  F  (correct answer: False)
      In conducting a test of hypothesis, if the p-value exceeds the level of significance, we reject the null hypothesis.
    7. T  F  (correct answer: True)
      In conducting a test of hypothesis, if the alternative hypothesis consists of a “ < ” expression, the critical value will be a negative number.
    8. T  F  (correct answer: False)
      In conducting a test of hypothesis, if the p-value is less than 0.001, the evidence against the null hypothesis is considered to be very weak.
  2. Past census surveys in a large Canadian province indicate that 40% of provincial voters favor the implementation of a carbon tax to combat global warming.

    Consider a sampling (random) experiment where 100 voters are selected at random and the sample proportion of voters who favor a carbon tax is to be observed.

    1. What would be the shape of the sampling distribution of the sample proportion in favor of the carbon tax, and why?

      Answer: The sampling distribution would be approximately normal according to the central limit theorem, as both \( np = 40 \) and \( nq = 60 \) exceed five.

    2. Determine the mean of the sampling distribution of the sample proportion in favor of the carbon tax.

      Solution: \( \mu_{\hat{p}} = p = 0.40 \)

    3. Determine the standard deviation of the sampling distribution of the sample proportion in favor of the carbon tax.

      Solution:

      \( \sigma_{\hat{p}} = \sqrt{\dfrac{pq}{n}} = \sqrt{\dfrac{(0.4)(0.6)}{100}} = \sqrt{0.0024} = 0.0490 \)

    4. Find the probability (to 4 decimal places) that, in the random sample of 100 voters, the sample proportion who favor a carbon tax is:
      1. less than 0.30

        Solution:

        \( P(\hat{p} < 0.30) = P\left(z < \dfrac{0.30 - 0.40}{0.0490}\right) = P(z < -2.0408) = 0.0207 \)

      2. between 0.30 and 0.40

        Solution:

        \( P(\hat{p} \text{ between } 0.30 \text{ and } 0.40) = P(z \text{ between } -2.04 \text{ and } 0) = 0.50 - 0.0207 = 0.4793 \)

    5. If, among the 100 voters selected at random, 44 are in favor of a carbon tax, compute the sampling error. Assume that there are no non-sampling errors.

      Solution:

      Sampling error \( = \hat{p} - p = 0.44 - 0.40 = 0.04 \)
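
The calculations in question 2 can be checked numerically. What follows is a minimal Python sketch, assuming only the standard library's `statistics.NormalDist`; the variable names are illustrative and are not part of the Study Guide. Because the code does not round z to two decimals before the lookup, its probabilities may differ from the table-based answers in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

p = 0.40          # population proportion from the census
n = 100           # sample size
q = 1 - p

# Mean and standard deviation of the sampling distribution of p-hat
mean_p_hat = p
sd_p_hat = sqrt(p * q / n)

# Standardize 0.30 and use the standard normal distribution
std_normal = NormalDist()
z = (0.30 - mean_p_hat) / sd_p_hat
prob_less_than_030 = std_normal.cdf(z)
prob_between_030_and_040 = std_normal.cdf(0) - prob_less_than_030

# Sampling error when 44 of the 100 sampled voters are in favor
sampling_error = 44 / n - p

print(round(sd_p_hat, 4), round(z, 4))
print(round(prob_less_than_030, 4), round(prob_between_030_and_040, 4))
print(round(sampling_error, 4))
```

The two probabilities add up to 0.5 by construction, which is a quick sanity check on parts 4.1 and 4.2.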

  3. Recent studies involving all students in a community college found that these students spend an average of 20 hours a week on homework outside of the classroom, with a standard deviation of 4 hours per week. Assume that the data collected follows a normal distribution.

    Note that the shaded areas of the diagrams below are exaggerated so that the areas are clearly marked.

    If a random sample of 25 students from this community college is selected, find the probability (to 4 decimals) that the sample mean weekly homework hours will be

    1. at least 22 hours.

      Solution:

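      \( P(\bar{x} \ge 22) = P\left(z \ge \dfrac{22 - 20}{4/\sqrt{25}}\right) = P(z \ge 2.5) = 1 - 0.9938 = 0.0062 \)

      [Figure: normal curve with the area to the right of \( z = 2.5 \) shaded]

      Note that \( \dfrac{4}{\sqrt{25}} = 0.80 \).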

    2. between 18 and 22 hours.

      Solution:

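      \( P(\bar{x} \text{ between } 18 \text{ and } 22) = P\left(z \text{ between } \dfrac{18 - 20}{0.8} \text{ and } \dfrac{22 - 20}{0.8}\right) = P(z \text{ between } -2.5 \text{ and } 2.5) = 0.9938 - 0.0062 = 0.9876 \)

      [Figure: normal curve with the area between \( z = -2.5 \) and \( z = 2.5 \) shaded]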

    3. less than 10 hours per week.

      Solution:

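      \( P(\bar{x} < 10) = P\left(z < \dfrac{10 - 20}{0.8}\right) = P(z < -12.5) \approx 0 \)

      [Figure: normal curve with the area to the left of \( z = -12.5 \) shaded]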

      It looks like this was a random sample of really serious, hard-working students. A larger sample may have produced a different result.
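
The three probabilities in question 3 can be reproduced with a short Python sketch. It assumes the standard library's `statistics.NormalDist` and models the sampling distribution of the sample mean directly as a normal distribution with mean 20 and standard deviation \( 4/\sqrt{25} \); the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu = 20       # population mean (hours per week)
sigma = 4     # population standard deviation
n = 25        # sample size

# Standard deviation of the sample mean (standard error)
se = sigma / sqrt(n)

# Sampling distribution of the sample mean
sampling_dist = NormalDist(mu, se)

p_at_least_22 = 1 - sampling_dist.cdf(22)
p_between_18_and_22 = sampling_dist.cdf(22) - sampling_dist.cdf(18)
p_less_than_10 = sampling_dist.cdf(10)

print(round(se, 4))
print(round(p_at_least_22, 4), round(p_between_18_and_22, 4))
print(p_less_than_10)  # effectively zero
```

Working with `NormalDist(mu, se)` avoids a separate standardization step; standardizing first and using `NormalDist()` would give the same probabilities.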

  4. What is the minimum sample size needed for a 99% confidence interval estimate for the population proportion to have a maximum margin of error of 0.06
    1. if there is a preliminary estimate of 0.80?

      Solution:

      Given \( \hat{p} = 0.80 \), \( \hat{q} = 0.20 \), \( E = 0.06 \),

      find \( z \) such that 0.995 of the distribution is to the left of \( z \).

      \( z = 2.58 \)

      \( n = \dfrac{z^2 \hat{p}\hat{q}}{E^2} = \dfrac{(2.58)^2 (0.80)(0.20)}{(0.06)^2} = 295.84 \), so 296 is the sample size with the given preliminary estimate.

      Always round up to a whole number for a sample size.

      Note: If you take \( z = 2.575 \) (as an approximation of 2.58), then \( n = 294.69 \), so 295 is the sample size. This is also a correct answer.

    2. if there is no preliminary estimate, so the most conservative estimate must be used?

      Solution:

      \( n = \dfrac{z^2 (0.50)(0.50)}{(0.06)^2} = \dfrac{(2.58)^2 (0.25)}{(0.06)^2} = 462.25 \), so 463 is the most conservative estimate of the sample size in this case.

      Note: If you take \( z = 2.575 \) (rather than 2.58), then \( n = 460.46 \), so 461 is the most conservative sample size. This is also a correct answer.
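
The sample-size formula used in question 4 is easy to wrap in a small helper. The sketch below is illustrative only: `sample_size_for_proportion` is a hypothetical function name, and the exact z value is obtained from the standard library's `statistics.NormalDist` rather than from the printed table.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(z, p_hat, margin):
    """Minimum n so that z * sqrt(p_hat * q_hat / n) <= margin, rounded up."""
    q_hat = 1 - p_hat
    return ceil(z ** 2 * p_hat * q_hat / margin ** 2)


# Table value used in the answer key
n_with_estimate = sample_size_for_proportion(2.58, 0.80, 0.06)
n_conservative = sample_size_for_proportion(2.58, 0.50, 0.06)

# Alternative table value mentioned in the notes
n_with_estimate_alt = sample_size_for_proportion(2.575, 0.80, 0.06)
n_conservative_alt = sample_size_for_proportion(2.575, 0.50, 0.06)

# z computed directly: 0.995 of the distribution lies to its left
z_exact = NormalDist().inv_cdf(0.995)
n_exact = sample_size_for_proportion(z_exact, 0.80, 0.06)

print(n_with_estimate, n_conservative)
print(n_with_estimate_alt, n_conservative_alt)
print(round(z_exact, 4), n_exact)
```

Using `ceil` implements the "always round up" rule, so a result such as 295.84 becomes 296 rather than being rounded to the nearest integer.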

  5. In a recent municipal survey, 2,000 randomly selected taxpayers were sampled, and 1,200 of them stated that they are in favor of constructing a new hockey arena.

    Construct a 90% confidence interval (calculated to 4 decimal places) to estimate the percentage of all municipal taxpayers that are in favor of constructing the hockey arena.

    Solution:

    Find \( \hat{p} = \dfrac{1200}{2000} = 0.60 \).

    Find \( z \) from Table IV (Appendix B in the eText), such that the area to the left of \( z \) is 0.95:

    \( z = 1.645 \)

    The confidence interval is:
    \( \hat{p} \pm z\sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.60 \pm 1.645\sqrt{\dfrac{(0.60)(0.40)}{2000}} = 0.60 \pm 1.645(0.0110) = 0.60 \pm 0.0181 \), that is, 0.5819 to 0.6181.
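
As a numerical check on question 5, the interval can be computed in a few lines of Python. This is a sketch using only the standard library; the variable names are illustrative. The code keeps full precision for the standard error instead of rounding it to 0.0110, so the endpoints may differ from the answer key in the fourth decimal place.

```python
from math import sqrt

n = 2000            # taxpayers sampled
in_favor = 1200     # number in favor of the arena
z = 1.645           # table value for a 90% confidence level

p_hat = in_favor / n
q_hat = 1 - p_hat

# Estimated standard deviation of p-hat and the margin of error
se = sqrt(p_hat * q_hat / n)
margin = z * se

lower = p_hat - margin
upper = p_hat + margin

print(round(se, 4), round(margin, 4))
print(round(lower, 4), round(upper, 4))
```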

  6. Past market research indicates that the ages of all the regular customers of a large fitness club are normally distributed. A recent sample of 6 randomly selected regular customers resulted in the following stem-and-leaf display of the ages of the selected customers:

    1 | 8 9
    2 | 2 4 6
    3 | 3

    Construct a 95% confidence interval estimate for the population mean age of all the club’s regular customers.

    Solution:

    Formula to use: confidence interval: \( \bar{x} \pm t\,\dfrac{s}{\sqrt{n}} \)

    To find the sample mean and standard deviation:

    Note that the six ages in the sample are 18, 19, 22, 24, 26 and 33.

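     x     x^2
    18     324
    19     361
    22     484
    24     576
    26     676
    33    1089
    \( \sum x = 142 \)    \( \sum x^2 = 3510 \)

    Sample mean \( = \dfrac{\sum x}{n} = \dfrac{142}{6} = 23.6667 \)

    Sample standard deviation \( = \sqrt{\dfrac{\sum x^2 - \dfrac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\dfrac{3510 - \dfrac{(142)^2}{6}}{6 - 1}} = \sqrt{\dfrac{3510 - 3360.6667}{5}} = 5.4650 \)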

    To find \( t \): degrees of freedom \( = n - 1 = 6 - 1 = 5 \).

    Find \( t \) such that the area to the right of \( t \) equals 0.025: \( t = 2.571 \).

    Confidence interval:
    \( \bar{x} \pm t\,\dfrac{s}{\sqrt{n}} = 23.6667 \pm (2.571)\dfrac{5.4650}{\sqrt{6}} = 23.6667 \pm (2.571)(2.2311) = 23.6667 \pm 5.7362 \)

    Answer: 17.9305 to 29.4029
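
The steps in question 6 can be sketched in Python as follows. The sketch assumes the standard library's `statistics` module for the sample mean and sample standard deviation. The standard library has no t distribution, so the critical value 2.571 is simply copied from the t-table (5 degrees of freedom, 0.025 in the right tail) rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

ages = [18, 19, 22, 24, 26, 33]   # read from the stem-and-leaf display
n = len(ages)

x_bar = mean(ages)
s = stdev(ages)        # sample standard deviation (divides by n - 1)

t_critical = 2.571     # t-table value for df = 5, right-tail area 0.025

margin = t_critical * s / sqrt(n)
lower = x_bar - margin
upper = x_bar + margin

print(round(x_bar, 4), round(s, 4))
print(round(margin, 4), round(lower, 4), round(upper, 4))
```

Note that `statistics.stdev` uses the \( n - 1 \) divisor, matching the formula in the solution; `statistics.pstdev` would divide by \( n \) and give a different result.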

  7. A medical researcher wishes to estimate, within 2 points, the average systolic blood pressure of university students located in a Canadian province. If the researcher wishes to be 96% confident, how large a sample should she select if the population standard deviation of systolic blood pressure for all the provincial university students is 6.0?

    Solution:

    Formula to use: \( n = \dfrac{z^2 \sigma^2}{E^2} \)

    To find \( z \):

    \( z \) is such that 98% of the distribution falls to the left of \( z \). So, from the normal table, \( z = 2.05 \).

    \( n = \dfrac{z^2 \sigma^2}{E^2} = \dfrac{(2.05)^2 (6)^2}{(2)^2} = 37.8225 \), so 38 is the desired sample size.
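
A short Python sketch of the question 7 calculation follows. `sample_size_for_mean` is a hypothetical helper name, and the second call computes z from the standard library's `statistics.NormalDist` instead of reading it from the normal table.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(z, sigma, margin):
    """Minimum n so that z * sigma / sqrt(n) <= margin, rounded up."""
    return ceil((z * sigma / margin) ** 2)


# Table value from the solution: 98% of the distribution lies left of z
n_table = sample_size_for_mean(2.05, 6.0, 2)

# Same calculation with z computed rather than looked up
confidence = 0.96
z_exact = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
n_exact = sample_size_for_mean(z_exact, 6.0, 2)

print(n_table)
print(round(z_exact, 4), n_exact)
```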

  8. A census survey indicates that the national average family size was 3.25 persons per family in 2015. A 2018 sample of families randomly selected across the country results in the following family sizes:

    4, 2, 3, 2, 1, 3, 4, 2, 5, 4

    Assuming that the population of family sizes is normally distributed, conduct a test of hypothesis at the 5% level to determine if the average family size has decreased between 2015 and 2018.

    1. Show all key steps using the p-value approach.

      Solution:

      Let µ = mean family size.

      Step 1: State the null hypothesis (\( H_0 \)) and the alternative hypothesis (\( H_1 \)).

      \( H_0: \mu = 3.25 \) (Mean family size in 2015)

      \( H_1: \mu < 3.25 \) (Mean family size less than 3.25)

      Step 2: Select the distribution to use.

      Select the t distribution, as the population is normally distributed and the population standard deviation is unknown.

      Step 3: Calculate the p-value.

      To find the p-value, we must first find the sample test statistic \( t \) as follows:

      \( t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} \), where \( \mu = 3.25 \) and \( n = 10 \).

      To find \( \bar{x} \) and \( s \):

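      x     x^2
      4      16
      2       4
      3       9
      2       4
      1       1
      3       9
      4      16
      2       4
      5      25
      4      16
      \( \sum x = 30 \)    \( \sum x^2 = 104 \)

      \( \bar{x} = \dfrac{\sum x}{n} = \dfrac{30}{10} = 3 \)

      \( s = \sqrt{\dfrac{\sum x^2 - \dfrac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\dfrac{104 - \dfrac{(30)^2}{10}}{9}} = \sqrt{\dfrac{104 - 90}{9}} = 1.2472 \)

      \( t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{3 - 3.25}{1.2472/\sqrt{10}} = \dfrac{-0.25}{0.3944} = -0.6339 \) = sample test statistic

      The p-value is the area to the left of the sample test statistic \( -0.6339 \) under the t-distribution curve, as shown below.

      [Figure: t-distribution curve with the area to the left of \( t = -0.6339 \) shaded]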

      Steps in estimating the p-value: Read down the t-table in Table V in Appendix B of the eText until you find the appropriate degrees of freedom, in this case \( df = n - 1 = 9 \). Ignoring the sign of the calculated test statistic, locate 0.6339 in the row with 9 degrees of freedom (\( df \)). It falls in front of (to the left of) 1.383. This value corresponds to an area that exceeds 0.10. So, we can conclude that the p-value exceeds 0.10.

      Area in the Right Tail Under the t Distribution Curve

      df     .10     .05    .025     .01    .005     .001
       1   3.078   6.314  12.706  31.821  63.657  318.309
       2   1.886   2.920   4.303   6.965   9.925   22.327
       3   1.638   2.353   3.182   4.541   5.841   10.215
       4   1.533   2.132   2.776   3.747   4.604    7.173
       5   1.476   2.015   2.571   3.365   4.032    5.893
       6   1.440   1.943   2.447   3.143   3.707    5.208
       7   1.415   1.895   2.365   2.998   3.499    4.785
       8   1.397   1.860   2.306   2.896   3.355    4.501
       9   1.383   1.833   2.262   2.821   3.250    4.297
      10   1.372   1.812   2.228   2.764   3.169    4.144

      Source: Adapted from Prem S. Mann, Introductory Statistics, 9th ed., B21.

      Step 4: Make a decision.

      Since the p-value exceeds 0.10, and therefore exceeds the level of significance of 0.05, we do not reject \( H_0 \). We cannot conclude that the average family size has decreased below the 2015 value of 3.25 persons per family.

    2. Show all key steps using the critical value approach.

      Solution:

      Let μ = mean family size

      Step 1: State the null hypothesis (\( H_0 \)) and the alternative hypothesis (\( H_1 \)).

      \( H_0: \mu = 3.25 \) (mean family size in 2015)

      \( H_1: \mu < 3.25 \) (mean family size less than 3.25)

      Step 2: Select the distribution to use.

      Select the t distribution, as the population is normally distributed and the population standard deviation is unknown.

      Step 3: Determine the rejection and non-rejection regions.

      Given that \( \alpha = 0.05 \) and that \( \mu < 3.25 \) for \( H_1 \), the area in the left tail is 0.05. The degrees of freedom are \( n - 1 = 10 - 1 = 9 \). The critical value is \( -1.833 \).

      Step 4: Calculate the value of the test statistic.

      \( t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{3 - 3.25}{1.2472/\sqrt{10}} = \dfrac{-0.25}{0.3944} = -0.6339 \)

      Step 5: Make a decision.

      Since the sample test statistic -0.6339 is NOT in the rejection region, we do NOT reject the null hypothesis. We cannot conclude that the average family size has decreased below the 2015 value of 3.25 persons per family.
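
The test statistic and the critical value decision in question 8 can be sketched in Python. The sketch assumes only the standard library; because it has no t distribution, the critical value \( -1.833 \) is copied from the t-table row for 9 degrees of freedom rather than computed. Variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

family_sizes = [4, 2, 3, 2, 1, 3, 4, 2, 5, 4]   # 2018 sample
mu_0 = 3.25                                     # 2015 census mean under H0
n = len(family_sizes)

x_bar = mean(family_sizes)
s = stdev(family_sizes)                          # sample standard deviation

# Sample test statistic
t_stat = (x_bar - mu_0) / (s / sqrt(n))

# Left-tailed test at the 5% level, df = 9 (value taken from the t-table)
t_critical = -1.833
reject_h0 = t_stat < t_critical

print(round(s, 4), round(t_stat, 4))
print("Reject H0" if reject_h0 else "Do not reject H0")
```

For a left-tailed test the rejection region lies below the critical value, which is why the comparison is `t_stat < t_critical`.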

  9. A large online retail company claims that more than 80% of all its orders are delivered to customers’ homes within 72 hours. A researcher working for the Department of Consumer and Corporate Affairs, suspicious of this claim, took a random sample of 400 orders and found that 330 of them were delivered to homes within a 72-hour period. Conduct a test of hypothesis at the 1% level to determine if the random sample supports the retailer’s claim.
    1. Show all key steps using the p-value approach.

      Solution:

      Note: In this solution, there are two meanings for “p.” The p-value is a probability. The \( p \) in the hypothesis statements means the proportion of orders delivered within the stated time period.

      Let p = proportion of orders delivered within 72 hours.

      Step 1: State the null hypothesis (\( H_0 \)) and alternative hypothesis (\( H_1 \)).

      \( H_0: p \le 0.80 \)

      \( H_1: p > 0.80 \) (proportion of delivered orders exceeds 80%)

      Step 2: Select the distribution to use.

      We use the z distribution, as both \( np = 400 \times 0.80 = 320 \) and \( nq = 400 \times 0.20 = 80 \) exceed five.

      Step 3: Calculate the p-value.

      To find the p-value, we must first find the sample test statistic \( z \) as follows:

      \( \hat{p} = \dfrac{330}{400} = 0.825 \)

      \( \sigma_{\hat{p}} = \sqrt{\dfrac{(0.80)(0.20)}{400}} = 0.02 \)

      \( z = \dfrac{0.825 - 0.80}{0.02} = 1.25 \)

      The p-value is the area to the right of the sample test statistic 1.25 under the z-distribution curve, which equals \( 1 - 0.8944 = 0.1056 \), as shown below.

      [Figure: standard normal curve with the area to the right of \( z = 1.25 \) shaded]

      Step 4: Make a decision.

      Since the p-value of 0.1056 exceeds the level of significance of 0.01, we do not reject the null hypothesis. We cannot conclude that more than 80% of all the company’s orders are delivered to customers’ homes within 72 hours.

    2. Show all key steps using the critical value approach.

      Solution:

      Let p = proportion of orders delivered within 72 hours.

      Step 1: State the null hypothesis (\( H_0 \)) and alternative hypothesis (\( H_1 \)).

      \( H_0: p \le 0.80 \)

      \( H_1: p > 0.80 \) (proportion of delivered orders exceeds 80%)

      Step 2: Select the distribution to use.

      We use the z distribution, as both \( np = 400 \times 0.80 = 320 \) and \( nq = 400 \times 0.20 = 80 \) exceed five.

      Step 3: Determine the rejection and non-rejection regions.

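      Given that \( \alpha = 0.01 \) and that \( p > 0.80 \) is in \( H_1 \), the critical value is 2.33 and the rejection and non-rejection regions are:

      [Figure: standard normal curve with the rejection region to the right of the critical value \( z = 2.33 \)]

      Step 4: Calculate the value of the test statistic.

      \( \hat{p} = \dfrac{330}{400} = 0.825 \)

      \( \sigma_{\hat{p}} = \sqrt{\dfrac{(0.80)(0.20)}{400}} = 0.02 \)

      \( z = \dfrac{0.825 - 0.80}{0.02} = 1.25 \)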

      Step 5: Make a decision.

      Since the sample test statistic 1.25 is not beyond the critical value 2.33, and is therefore in the non-rejection region, we do not reject the null hypothesis. We cannot conclude that more than 80% of all the company’s orders are delivered to customers’ homes within 72 hours.
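
Both approaches in question 9 can be checked with a minimal Python sketch. It assumes the standard library's `statistics.NormalDist`; the variable names are illustrative. The critical value is computed rather than read from the table, so it comes out slightly below the rounded table value of 2.33.

```python
from math import sqrt
from statistics import NormalDist

p_0 = 0.80          # proportion under the null hypothesis
n = 400             # orders sampled
delivered = 330     # orders delivered within 72 hours
alpha = 0.01        # level of significance

p_hat = delivered / n

# The standard deviation of p-hat uses the hypothesized proportion p_0
sd_p_hat = sqrt(p_0 * (1 - p_0) / n)
z_stat = (p_hat - p_0) / sd_p_hat

std_normal = NormalDist()

# p-value approach: right-tailed test, so use the area to the right of z
p_value = 1 - std_normal.cdf(z_stat)
reject_by_p_value = p_value < alpha

# Critical value approach: area alpha in the right tail
z_critical = std_normal.inv_cdf(1 - alpha)
reject_by_critical_value = z_stat > z_critical

print(round(z_stat, 4), round(p_value, 4), round(z_critical, 4))
print(reject_by_p_value, reject_by_critical_value)
```

The two approaches always agree on the decision, since `p_value < alpha` holds exactly when `z_stat` lies beyond `z_critical` in a right-tailed test.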

  10. In 2014 the average cost of all weddings in the country was $23,000. A recent sample of 64 couples who got married this year produced a mean wedding cost of $24,500 with a standard deviation of $4,400. Conduct a test of hypothesis at the 5% level to determine if the average cost of weddings has changed.
    1. Show all key steps using the p-value approach.

      Solution:

      Let μ = average wedding cost

      Step 1: State the null hypothesis (\( H_0 \)) and alternative hypothesis (\( H_1 \)).

      \( H_0: \mu = 23000 \)

      \( H_1: \mu \ne 23000 \) (mean wedding cost has changed)

      Step 2: Select the distribution to use.

      Select the t distribution as the sample size is large and the population standard deviation is unknown.

      Step 3: Calculate the p-value.

      To find the p-value, we must first find the sample test statistic \( t \) as follows:

      \( t = \pm\dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \pm\dfrac{24500 - 23000}{4400/\sqrt{64}} = \pm\dfrac{1500}{550} = \pm 2.7273 \)

      The p-value is the combined area beyond the two \( t \) values \( \pm 2.7273 \), as shown below.

      [Figure: t-distribution curve with the two tail areas beyond \( t = \pm 2.7273 \) shaded]

      Steps in estimating the p-value: Read down the t-table (Table V in Appendix B of the eText) until you find the appropriate degrees of freedom, which in this case is \( df = 64 - 1 = 63 \). Ignoring the sign of the calculated test statistic, locate it (2.7273) in the row with 63 degrees of freedom. It falls between 2.656 and 3.225. Read to the top of the table to locate the areas related to 2.656 and 3.225, which are 0.005 and 0.001. Since this is a two-tailed test, the estimated p-value is between \( 2 \times 0.005 = 0.01 \) and \( 2 \times 0.001 = 0.002 \).

      Step 4: Make a decision.

      Since the estimated p-value is between 0.002 and 0.01, which is less than the 0.05 level of significance, we reject \( H_0 \). Therefore, we can conclude that the mean wedding cost has changed from its 2014 amount.

    2. How strong is the evidence against the null hypothesis (\( H_0 \))? Explain your reasoning. (See Additional Topic 4C: The p-Value and Critical Value Approaches in Unit 4 of the Study Guide, Section 4-7.)

      Answer: A p-value ranging between 0.01 and 0.002 is classed as “strong evidence” against the null hypothesis.
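
The p-value bracketing used in question 10 can be sketched in Python. The sketch assumes only the standard library and uses just the two t-table entries quoted in the solution for 63 degrees of freedom; the dictionary `row_df_63` and the other names are illustrative and do not represent the full table row.

```python
from math import sqrt

x_bar = 24500    # sample mean wedding cost
mu_0 = 23000     # 2014 mean under H0
s = 4400         # sample standard deviation
n = 64           # couples sampled

t_stat = (x_bar - mu_0) / (s / sqrt(n))

# Right-tail area -> t value, for df = 63 (only the entries quoted above)
row_df_63 = {0.005: 2.656, 0.001: 3.225}

# Smallest tail area whose t value the statistic reaches or exceeds
one_tail_upper = min(a for a, t in row_df_63.items() if abs(t_stat) >= t)
# Largest tail area whose t value the statistic does not reach
one_tail_lower = max(a for a, t in row_df_63.items() if abs(t_stat) < t)

# Two-tailed test: double both bounds
p_value_low = 2 * one_tail_lower
p_value_high = 2 * one_tail_upper

alpha = 0.05
reject_h0 = p_value_high < alpha

print(round(t_stat, 4))
print(p_value_low, p_value_high, reject_h0)
```

Because the upper bound on the p-value is already below the 0.05 level of significance, the decision to reject \( H_0 \) does not require the exact p-value.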