Mathematics 215: Introduction to Statistics
Study Guide
Unit 6: Self-Test Answer Key
Show all your work and keep your calculations to four decimal places, unless otherwise stated.
- Circle True (T) or False (F) for each of the following:
-
T  F  (Correct answer: False)  In regression analysis, the variable that is being explained is called the independent variable.
-
T  F  (Correct answer: True)  In the estimated regression equation $\hat{y} = a + bx$, $b$ is the slope of the regression line.
-
T  F  (Correct answer: False)  The number of degrees of freedom for a simple linear regression model is ().
-
T  F  (Correct answer: True)  The coefficient of determination measures the proportion of the variation in $y$ explained by the variation in $x$.
-
T  F  (Correct answer: True)  When conducting a test of hypothesis regarding $B$ in the simple linear regression model, the null hypothesis of “$B = 0$” states that $x$ will not be a significant predictor of $y$.
-
T  F  (Correct answer: True)  In correlation analysis, if there is a perfect positive correlation between $x$ and $y$, then the value of $r$, the correlation coefficient, equals 1.
-
-
A fast food chain would like to predict the number of breakfast burgers that an individual will consume per month, based on the price charged per burger. A random sampling of the breakfast burger consumption of six individuals results in the data recorded in the following table.
Price per burger (dollars)               1.50   3.00   2.50   4.00   2.00   1.00
Number of burgers consumed per month       10      6      8      3      9     12
-
Which variable is the independent variable, $x$, and which is the dependent variable, $y$?
Solution:
The variable that is to be predicted, number of breakfast burgers consumed per month, is the dependent variable ($y$). The other variable, price (the basis used to predict burgers consumed), is the independent variable ($x$).
-
Find the equation of the least squares regression line. Keep all your work to 4 decimal places.
Solution:
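To find $a$ and $b$ in the estimated regression equation $\hat{y} = a + bx$, as well as the answers to subsequent questions, we suggest forming the following table to work out $\Sigma x$, $\Sigma y$, $\Sigma xy$, $\Sigma x^2$, and $\Sigma y^2$.
Price per burger ($x$)   Number of burgers consumed per month ($y$)   $xy$     $x^2$    $y^2$
1.50                     10                                           15.00    2.25     100
3.00                      6                                           18.00    9.00      36
2.50                      8                                           20.00    6.25      64
4.00                      3                                           12.00   16.00       9
2.00                      9                                           18.00    4.00      81
1.00                     12                                           12.00    1.00     144
$\Sigma x = 14.00$       $\Sigma y = 48$                              $\Sigma xy = 95.00$   $\Sigma x^2 = 38.50$   $\Sigma y^2 = 434$
$n$ = number of individuals involved in the study = sample size = 6
$\bar{x} = \frac{\Sigma x}{n} = \frac{14}{6} \approx 2.3333$ and $\bar{y} = \frac{\Sigma y}{n} = \frac{48}{6} = 8$
$SS_{xy} = \Sigma xy - \frac{(\Sigma x)(\Sigma y)}{n} = 95 - \frac{(14)(48)}{6} = -17$, $SS_{xx} = \Sigma x^2 - \frac{(\Sigma x)^2}{n} = 38.5 - \frac{(14)^2}{6} = 5.8333$, and $SS_{yy} = \Sigma y^2 - \frac{(\Sigma y)^2}{n} = 434 - \frac{(48)^2}{6} = 50$
$b = \frac{SS_{xy}}{SS_{xx}} = \frac{-17}{5.8333} = -2.9143$
$a = \bar{y} - b\bar{x} = 8 - (-2.9143)\left(\frac{14}{6}\right) \approx 14.8000$
Regression equation (line): $\hat{y} = 14.8000 + (-2.9143)x$ or $\hat{y} = 14.8000 - 2.9143x$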
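As a cross-check on the hand calculation, the following Python sketch recomputes $SS_{xy}$, $SS_{xx}$, $a$, and $b$ directly from the six data pairs in the table. The helper name `fit_line` and the variable names are illustrative choices, not part of the course material.

```python
# Least-squares fit for the burger data (values from the table in this question).
price = [1.50, 3.00, 2.50, 4.00, 2.00, 1.00]   # x: price per burger
burgers = [10, 6, 8, 3, 9, 12]                 # y: burgers consumed per month


def fit_line(x, y):
    """Return (a, b, ss_xy, ss_xx) for the line y-hat = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    ss_xx = sum(xi ** 2 for xi in x) - sum_x ** 2 / n
    b = ss_xy / ss_xx                 # slope
    a = sum_y / n - b * sum_x / n     # intercept: y-bar minus b times x-bar
    return a, b, ss_xy, ss_xx


a, b, ss_xy, ss_xx = fit_line(price, burgers)
print(f"SSxy = {ss_xy:.4f}, SSxx = {ss_xx:.4f}")
print(f"y-hat = {a:.4f} + ({b:.4f})x")
```

Because the code carries full floating-point precision, its results can differ in the last decimal place from a hand calculation that rounds each intermediate value to four decimals.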
-
Construct a scatter diagram for the sample data for the six randomly selected individuals. The price is to be measured on the horizontal axis, and the number of burgers consumed is to be indicated on the vertical axis. Plot the estimated regression line on the same set of axes. In plotting the regression line, compute for and then for .
Solution:
When ,
When ,
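A straight line is fixed by any two of its points, so plotting the regression line only requires evaluating it at two prices. The sketch below refits the line from the data and evaluates it at the lowest and highest observed prices; those two prices are an assumption made for illustration, and any two distinct prices would serve equally well.

```python
# Two points on the fitted line, for drawing it over the scatter diagram.
price = [1.50, 3.00, 2.50, 4.00, 2.00, 1.00]   # x
burgers = [10, 6, 8, 3, 9, 12]                 # y

n = len(price)
ss_xy = sum(x * y for x, y in zip(price, burgers)) - sum(price) * sum(burgers) / n
ss_xx = sum(x ** 2 for x in price) - sum(price) ** 2 / n
b = ss_xy / ss_xx
a = sum(burgers) / n - b * sum(price) / n


def y_hat(x):
    """Predicted burgers per month at price x."""
    return a + b * x


# Assumed plotting points: the smallest and largest prices in the sample.
endpoints = [(x, y_hat(x)) for x in (min(price), max(price))]
for x, y in endpoints:
    print(f"When x = {x:.2f}, y-hat = {y:.4f}")
```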
-
Based on the results from part c. above, would you hypothesize a positive or negative linear relation between the price and the number of burgers consumed?
Solution:
Since the regression line is downward-sloping, I hypothesize a negative linear relation between the price and the number of burgers consumed.
-
Interpret the meaning of the slope of the estimated regression line.
Solution:
The slope, $b = -2.9143$, suggests that for every one-dollar increase in price, the number of burgers consumed per month is expected to decrease by about 2.9143 burgers, on average.
-
Compute the coefficient of determination and interpret your answer. Keep your work to 4 decimal places.
Solution:
The coefficient of determination $r^2 = \frac{b\,SS_{xy}}{SS_{yy}} = \frac{(-2.9143)(-17)}{50} = 0.9909$.
The coefficient of determination of 0.9909 means that 99.09% of the total variation in monthly burger consumption is explained by the variation in the burger price.
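The value of $r^2$ can be verified from the raw data with a few lines of Python. This is a minimal sketch using the formula $r^2 = b\,SS_{xy} / SS_{yy}$; the variable names are illustrative.

```python
# Coefficient of determination for the burger data.
price = [1.50, 3.00, 2.50, 4.00, 2.00, 1.00]   # x
burgers = [10, 6, 8, 3, 9, 12]                 # y

n = len(price)
sum_x, sum_y = sum(price), sum(burgers)
ss_xy = sum(x * y for x, y in zip(price, burgers)) - sum_x * sum_y / n
ss_xx = sum(x ** 2 for x in price) - sum_x ** 2 / n
ss_yy = sum(y ** 2 for y in burgers) - sum_y ** 2 / n

b = ss_xy / ss_xx
r_squared = b * ss_xy / ss_yy   # share of the variation in y explained by x
print(f"SSyy = {ss_yy:.4f}, r^2 = {r_squared:.4f}")
```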
-
Test at the 1% significance level whether $B$ (the population regression line slope) is negative. Keep your work to 4 decimal places.
Solution:
Step 1: State the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$).
$H_0$: $B = 0$
$H_1$: $B < 0$ (the population regression line slope is negative)
Step 2: Select the distribution to use.
Use the $t$ distribution.
Step 3: Determine the rejection and non-rejection regions.
Since $H_1$ contains “$<$”, this is a one-tailed test with the left tail having an area of $\alpha = 0.01$.
The degrees of freedom are $df = n - 2 = 6 - 2 = 4$, so the critical value is $t = -3.747$, and we reject $H_0$ if the test statistic is less than $-3.747$.
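Step 4: Calculate the value of the test statistic. Keep all your work to 4 decimal places.
$t = \frac{b - B}{s_b}$, where $s_b = \frac{s_e}{\sqrt{SS_{xx}}}$ and $B = 0$, assuming $H_0$ is true.
$s_e = \sqrt{\frac{SS_{yy} - b\,SS_{xy}}{n - 2}}$, where $SS_{yy} = \Sigma y^2 - \frac{(\Sigma y)^2}{n} = 434 - \frac{(48)^2}{6} = 50$
$s_e = \sqrt{\frac{50 - (-2.9143)(-17)}{6 - 2}} = \sqrt{\frac{0.4569}{4}} = 0.3380$, $s_b = \frac{0.3380}{\sqrt{5.8333}} = 0.1399$, and so $t = \frac{-2.9143 - 0}{0.1399} = -20.8313$
Step 5: Make a decision.
Since $t = -20.8313$ falls in the rejection region (it is less than the critical value of $-3.747$ for $df = 4$), we reject $H_0$.
We conclude that there is significant evidence from the study to indicate that the slope of the population regression line is negative.
-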
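Construct a 99% confidence interval for $B$. Keep your work to 4 decimal places.
Solution:
The formula for the confidence interval for $B$ is $b \pm t\,s_b$, where $s_b = \frac{s_e}{\sqrt{SS_{xx}}}$.
$s_b = 0.1399$ from part g. above.
To find $t$, note that $df = n - 2 = 6 - 2 = 4$, and the right tail area is equal to $\frac{\alpha}{2} = \frac{1 - 0.99}{2} = 0.005$, which gives $t = 4.604$.
The 99% confidence interval is $-2.9143 \pm 4.604(0.1399) = -2.9143 \pm 0.6441$, that is, $-3.5584$ to $-2.2702$.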
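The standard error of the slope, the test statistic for the slope, and the confidence interval can be cross-checked together. The sketch below uses only the standard library and hard-codes the two critical values as read from a $t$ table for $df = 4$ (3.747 for a one-tail area of 0.01 and 4.604 for a one-tail area of 0.005); the constant and variable names are illustrative.

```python
import math

# Inference about the slope B for the burger data.
price = [1.50, 3.00, 2.50, 4.00, 2.00, 1.00]   # x
burgers = [10, 6, 8, 3, 9, 12]                 # y

n = len(price)
sum_x, sum_y = sum(price), sum(burgers)
ss_xy = sum(x * y for x, y in zip(price, burgers)) - sum_x * sum_y / n
ss_xx = sum(x ** 2 for x in price) - sum_x ** 2 / n
ss_yy = sum(y ** 2 for y in burgers) - sum_y ** 2 / n
b = ss_xy / ss_xx

s_e = math.sqrt((ss_yy - b * ss_xy) / (n - 2))   # std. deviation of errors
s_b = s_e / math.sqrt(ss_xx)                     # std. error of the slope
t_stat = (b - 0) / s_b                           # B = 0 under H0

T_CRIT_TEST = 3.747   # t table: df = 4, one-tail area 0.01
T_CRIT_CI = 4.604     # t table: df = 4, one-tail area 0.005

reject_h0 = t_stat < -T_CRIT_TEST                # left-tailed test
ci = (b - T_CRIT_CI * s_b, b + T_CRIT_CI * s_b)  # 99% interval for B

print(f"s_e = {s_e:.4f}, s_b = {s_b:.4f}, t = {t_stat:.4f}")
print(f"reject H0: {reject_h0}")
print(f"99% CI for B: {ci[0]:.4f} to {ci[1]:.4f}")
```

Since no intermediate value is rounded here, the printed figures can differ from a four-decimal hand calculation in the third or fourth decimal place; the decision and the sign of the interval are unaffected.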
-
Predict (using a point estimate) the number of burgers consumed per month if the price per burger is set to $3.50. Keep all your work to 4 decimal places.
Solution:
The estimated regression equation is $\hat{y} = 14.8000 - 2.9143x$.
Point estimate: $\hat{y} = 14.8000 - 2.9143(3.50) \approx 4.6$ burgers per month
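A point estimate is simply the fitted line evaluated at the chosen price. The sketch below wraps that in a hypothetical helper, `predict`, which also refuses prices outside the observed range of 1.00 to 4.00 dollars; that guard is a design choice added here for illustration, since the fitted line is only supported by data within that range.

```python
# Point estimate of monthly burger consumption at a given price.
price = [1.50, 3.00, 2.50, 4.00, 2.00, 1.00]   # x
burgers = [10, 6, 8, 3, 9, 12]                 # y

n = len(price)
ss_xy = sum(x * y for x, y in zip(price, burgers)) - sum(price) * sum(burgers) / n
ss_xx = sum(x ** 2 for x in price) - sum(price) ** 2 / n
b = ss_xy / ss_xx
a = sum(burgers) / n - b * sum(price) / n


def predict(x0):
    """Return y-hat at price x0, rejecting prices outside the sample range."""
    if not min(price) <= x0 <= max(price):
        raise ValueError("price is outside the observed range of the sample")
    return a + b * x0


estimate = predict(3.50)
print(f"Predicted burgers per month at $3.50: {estimate:.4f}")
```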
-
-
A sleep therapist is examining the relationship between a person’s age and their hours of sleep on a typical weeknight. The therapist selects eight people at random. These people’s ages are paired with their hours of sleep in the following table.
Age ($x$)               70   24   36   42   65   58   29   50
Hours of sleep ($y$)    11    5    7    8   11   11    6    8
In answering the following, you may use the following sums of squares and cross products. Keep all your work to 4 decimal places.
$\Sigma x = 374$, $\Sigma y = 67$, $\Sigma xy = 3405$, $\Sigma x^2 = 19466$, $\Sigma y^2 = 601$
$n$ = sample size = number of individuals participating in the study = 8
-
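Find the least squares regression line $\hat{y} = a + bx$.
Solution:
Note that the mean of the $x$ values in the data set is $\bar{x} = \frac{374}{8} = 46.75$, and the mean of the $y$ values in the data set is $\bar{y} = \frac{67}{8} = 8.375$.
$SS_{xy} = \Sigma xy - \frac{(\Sigma x)(\Sigma y)}{n} = 3405 - \frac{(374)(67)}{8} = 272.75$
$SS_{xx} = \Sigma x^2 - \frac{(\Sigma x)^2}{n} = 19466 - \frac{(374)^2}{8} = 1981.5$
$b = \frac{SS_{xy}}{SS_{xx}} = \frac{272.75}{1981.5} = 0.1376$
$a = \bar{y} - b\bar{x} = 8.375 - (0.1376)(46.75) = 1.9422$
Regression equation: $\hat{y} = 1.9422 + 0.1376x$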
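The fit for the sleep data can be checked the same way from the eight age and sleep pairs. This is a minimal sketch with illustrative variable names. Note that carrying the slope at full precision gives an intercept that can differ in the third decimal place from a hand calculation that first rounds $b$ to four decimals.

```python
# Least-squares fit for the age / hours-of-sleep data.
age = [70, 24, 36, 42, 65, 58, 29, 50]     # x
sleep = [11, 5, 7, 8, 11, 11, 6, 8]        # y

n = len(age)
sum_x, sum_y = sum(age), sum(sleep)
ss_xy = sum(x * y for x, y in zip(age, sleep)) - sum_x * sum_y / n
ss_xx = sum(x ** 2 for x in age) - sum_x ** 2 / n

b = ss_xy / ss_xx                 # slope
a = sum_y / n - b * sum_x / n     # intercept

print(f"x-bar = {sum_x / n}, y-bar = {sum_y / n}")
print(f"SSxy = {ss_xy}, SSxx = {ss_xx}")
print(f"y-hat = {a:.4f} + {b:.4f}x")
```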
-
Compute the simple linear correlation coefficient, $r$. Interpret your answer.
Solution:
$SS_{yy} = \Sigma y^2 - \frac{(\Sigma y)^2}{n} = 601 - \frac{(67)^2}{8} = 39.875$
$r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} = \frac{272.75}{\sqrt{(1981.5)(39.875)}} = \frac{272.75}{281.0913} = 0.9703$
Since the linear correlation coefficient is close to +1, there is a strong positive linear correlation between a person’s age and their hours of sleep on a typical weeknight. In other words, in this sample, older people tend to sleep more hours on a typical weeknight.
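The correlation coefficient can be recomputed from the raw data using $r = SS_{xy} / \sqrt{SS_{xx}\,SS_{yy}}$. The following is a short sketch with illustrative variable names.

```python
import math

# Linear correlation coefficient for the age / hours-of-sleep data.
age = [70, 24, 36, 42, 65, 58, 29, 50]     # x
sleep = [11, 5, 7, 8, 11, 11, 6, 8]        # y

n = len(age)
sum_x, sum_y = sum(age), sum(sleep)
ss_xy = sum(x * y for x, y in zip(age, sleep)) - sum_x * sum_y / n
ss_xx = sum(x ** 2 for x in age) - sum_x ** 2 / n
ss_yy = sum(y ** 2 for y in sleep) - sum_y ** 2 / n

r = ss_xy / math.sqrt(ss_xx * ss_yy)
print(f"SSyy = {ss_yy}, r = {r:.4f}")
```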
-
Using the 1% significance level, test whether the population linear correlation coefficient $\rho$ is different from zero.
Solution:
Step 1: State the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$).
$H_0$: $\rho = 0$
$H_1$: $\rho \neq 0$ (population linear correlation coefficient differs from zero)
Step 2: Select the distribution to use. The $t$ distribution is to be used.
Step 3: Determine the rejection and non-rejection regions.
Since $H_1$ contains “$\neq$”, this is a two-tailed test with each tail having an area of $\frac{\alpha}{2} = \frac{0.01}{2} = 0.005$.
The degrees of freedom are $df = n - 2 = 8 - 2 = 6$, so the critical values are $t = \pm 3.707$.
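Step 4: Calculate the value of the test statistic.
$t = r\sqrt{\frac{n - 2}{1 - r^2}} = 0.9703\sqrt{\frac{8 - 2}{1 - (0.9703)^2}} = 0.9703\sqrt{\frac{6}{0.0585}} = 9.8266$
Step 5: Make a decision.
Since the test statistic $t = 9.8266$ falls in the rejection region (it is greater than the critical value of 3.707 for $df = 6$), we reject $H_0$.
Conclusion: At the 1% significance level, there is evidence to indicate that the population linear correlation coefficient is different from zero.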
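The test statistic for the correlation test, $t = r\sqrt{(n - 2)/(1 - r^2)}$, can be verified numerically. In this sketch the critical value 3.707 is hard-coded from a $t$ table ($df = 6$, area 0.005 in each tail), and the names are illustrative. The result differs slightly from a hand calculation that rounds $r$ and $1 - r^2$ to four decimals, but the decision is the same.

```python
import math

# Two-tailed t test of H0: rho = 0 for the age / hours-of-sleep data.
age = [70, 24, 36, 42, 65, 58, 29, 50]     # x
sleep = [11, 5, 7, 8, 11, 11, 6, 8]        # y

n = len(age)
sum_x, sum_y = sum(age), sum(sleep)
ss_xy = sum(x * y for x, y in zip(age, sleep)) - sum_x * sum_y / n
ss_xx = sum(x ** 2 for x in age) - sum_x ** 2 / n
ss_yy = sum(y ** 2 for y in sleep) - sum_y ** 2 / n
r = ss_xy / math.sqrt(ss_xx * ss_yy)

t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))

T_CRIT = 3.707                       # t table: df = 6, area 0.005 per tail
reject_h0 = abs(t_stat) > T_CRIT     # two-tailed rejection rule

print(f"t = {t_stat:.4f}, reject H0: {reject_h0}")
```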
-
What percentage of the total variation in weeknight hours of sleep occurs because of the variation in age?
Solution:
This question is asking you to find the coefficient of determination, which is $r^2$. Since we know from part b. above that $r = 0.9703$, the coefficient of determination equals $r^2 = (0.9703)^2 = 0.9415$.
This means that 94.15% of the variation in the nightly hours of sleep can be explained by the variation in age.
-
Construct a 90% prediction interval for the weeknight hours of sleep for a 40-year-old individual.
Solution:
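The formula for the prediction interval is $\hat{y} \pm t\,s_{\hat{y}_p}$,
where $s_{\hat{y}_p} = s_e\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}}$ and $x_0 = 40$.
To find $t$: $df = n - 2 = 8 - 2 = 6$, and the right tail area is $\frac{\alpha}{2} = \frac{1 - 0.90}{2} = 0.05$. So, $t = 1.943$.
To find $s_{\hat{y}_p}$: $s_e = \sqrt{\frac{SS_{yy} - b\,SS_{xy}}{n - 2}} = \sqrt{\frac{39.875 - (0.1376)(272.75)}{6}} = \sqrt{\frac{2.3446}{6}} = 0.6251$, so
$s_{\hat{y}_p} = 0.6251\sqrt{1 + \frac{1}{8} + \frac{(40 - 46.75)^2}{1981.5}} = 0.6251\sqrt{1.1480} = 0.6698$
With $\hat{y} = 1.9422 + 0.1376(40) = 7.4462$, the prediction interval, $\hat{y} \pm t\,s_{\hat{y}_p}$, is $7.4462 \pm 1.943(0.6698) = 7.4462 \pm 1.3014$, that is, 6.1448 to 8.7476 hours.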
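The prediction interval combines the point estimate at age 40 with the standard deviation of the prediction error. The sketch below follows the formula $\hat{y} \pm t\,s_e\sqrt{1 + 1/n + (x_0 - \bar{x})^2 / SS_{xx}}$, with the critical value 1.943 hard-coded from a $t$ table ($df = 6$, right-tail area 0.05). Variable names are illustrative, and because nothing is rounded along the way, the endpoints can differ from a four-decimal hand calculation in the third decimal place.

```python
import math

# 90% prediction interval for hours of sleep at age x0 = 40.
age = [70, 24, 36, 42, 65, 58, 29, 50]     # x
sleep = [11, 5, 7, 8, 11, 11, 6, 8]        # y
x0 = 40

n = len(age)
sum_x, sum_y = sum(age), sum(sleep)
x_bar = sum_x / n
ss_xy = sum(x * y for x, y in zip(age, sleep)) - sum_x * sum_y / n
ss_xx = sum(x ** 2 for x in age) - sum_x ** 2 / n
ss_yy = sum(y ** 2 for y in sleep) - sum_y ** 2 / n

b = ss_xy / ss_xx
a = sum_y / n - b * x_bar
y_hat = a + b * x0                                   # point estimate

s_e = math.sqrt((ss_yy - b * ss_xy) / (n - 2))       # std. deviation of errors
s_pred = s_e * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_xx)

T_CRIT = 1.943                                       # t table: df = 6, area 0.05
lower, upper = y_hat - T_CRIT * s_pred, y_hat + T_CRIT * s_pred

print(f"y-hat = {y_hat:.4f}, s_e = {s_e:.4f}, s_pred = {s_pred:.4f}")
print(f"90% prediction interval: {lower:.4f} to {upper:.4f} hours")
```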
-