MATH 160 Cuyamaca College Regression Line and Absolute Prediction Error Questions

 

Instructions

Progress Check

Use this activity to assess whether you can:

  • Use StatCrunch to graph a scatterplot with its least squares regression line and to simultaneously produce the equation of the regression line along with its correlation coefficient, r.
  • Identify the x with the largest absolute prediction error.
  • Explain why a given data point is an outlier.

In this activity you will use StatCrunch and embed your results in an essay question. The essay questions are not automatically graded; your instructor will enter the points for these questions later. WARNING: you will need to enter your response to each essay question with every attempt. Your instructor will only grade the essay for your attempt with the highest total score for the automatically graded questions.

Discussion Board

Use the Module 27 discussion board to ask questions or provide feedback about the problems in any Module 27 activity, including this lab.

Attempt History

Attempt | Time | Score
Attempt 1 (latest) | 25 minutes | 4 out of 10 *

* Some questions not yet graded

Correct answers are hidden.

Score for this attempt: 4 out of 10 *

Submitted May 2 at 10:36am

This attempt took 25 minutes.

Learn by Doing

Some features of this activity may not work well on a cell phone or tablet. We highly recommend that you complete this activity on a computer.

A list of StatCrunch directions is provided at the bottom of this text-box.

Context

The modern Olympic Games have changed dramatically since their inception in 1896. Are athletes getting better? We will use regression to investigate the change in winning times for one event—the men’s 1,500 meter race.

Variables

Year: the year of the Olympic Games, from 1896 to 2000.
Time: the winning time for the 1,500 meter race, in seconds.

Since the winning time depends on the year, the year since 1896 (Year) is the explanatory variable, and the winning time (Time) is the response variable.

Data

Download the olympics datafile for the men’s 1,500 meter race. Then upload the datafile in StatCrunch. If you need a reminder about how to do this, review the list of StatCrunch directions below.

Prompt

In the first two questions below, you will use StatCrunch to produce and examine the scatterplot for the olympics datafile. You will also use StatCrunch to find the regression equation and correlation coefficient.

List of StatCrunch Directions

As you work through numbers 1) and 2) below, refer back to these StatCrunch directions when you need a quick reminder.

Question 1 (unanswered; not yet graded / 3 pts)

These directions assume that you have uploaded the olympics datafile in StatCrunch, and the StatCrunch worksheet with the data is open. If not, please see the Data section above.

  1. Using the year since 1896 as the explanatory variable and the winning time as the response variable, graph the scatterplot with the regression line and produce the regression equation with the correlation coefficient, all at the same time.
  2. Toggle to the output page with the scatterplot and regression line. Notice that the data has a strong linear association, so it makes sense to use linear regression. (Always check the form of the scatterplot before using linear regression.)
  3. Download the StatCrunch output page with your scatterplot and regression line graphed together.
  4. Save the .png file (the graph of your scatterplot and regression line) to your Stats-Class folder.
  5. Embed the .png file for your scatterplot and regression line in the text-box below.
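To check what StatCrunch reports in step 1, the slope, intercept, and r can also be computed from the standard least squares formulas. The sketch below uses Python rather than StatCrunch. The function name `fit_line` is hypothetical, and the four data points are made up for illustration; they are not the olympics data.

```python
import math

def fit_line(xs, ys):
    """Return (intercept, slope, r) for the least squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r

# Made-up illustrative points (NOT the olympics data)
toy_x = [0, 1, 2, 3]
toy_y = [10, 8, 6, 4]
intercept, slope, r = fit_line(toy_x, toy_y)
print(intercept, slope, r)
```

With the real datafile, the Year and Time columns would take the place of `toy_x` and `toy_y`.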

[Image: scatterplot with regression line]

Question 2 (unanswered; not yet graded / 3 pts)

These directions assume you have produced the Simple linear regression results in a multipage StatCrunch output window. If not, please see the previous question.

  • Toggle to the StatCrunch output page with the regression equation, correlation coefficient, and other statistics.
  • Under the heading Simple linear regression results, copy and paste the first five lines (dependent variable, independent variable, linear equation, sample size, and R) into the text-box below.

Simple linear regression results:

Dependent Variable: Time
Independent Variable: Year
Time = 994.19341 – 0.39304496 Year
Sample size: 24
R (correlation coefficient) = -0.89075356
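The linear equation in this output can be used to predict a winning time for any year. A minimal Python sketch, assuming Year is entered as the calendar year, is given here. The function name `predicted_time` is hypothetical; the two coefficients are copied from the StatCrunch output above.

```python
# Coefficients copied from the StatCrunch output: Time = 994.19341 - 0.39304496 Year
INTERCEPT = 994.19341
SLOPE = -0.39304496

def predicted_time(year):
    """Predicted winning time in seconds for a given calendar year."""
    return INTERCEPT + SLOPE * year

for year in (1900, 1940, 2000):
    print(year, round(predicted_time(year), 2))
```

The negative slope means the predicted winning time drops by about 0.39 seconds per year, which agrees with the negative sign of r.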

Question 3 (2 / 2 pts)

For which of the years 1900, 1940, or 2000 is the absolute prediction error the largest?


Correct. The prediction error is the vertical distance from a data point to the regression line. The data point for 2000 is farther from the regression line than the other options, so its absolute prediction error is the largest.

Question 4 (2 / 2 pts)

For the year 1896, the winning time for the men’s 1,500 meter race is an outlier. In what ways is this data point an outlier? Check all that apply.


Correct. The winning time in 1896 is much larger than the other winning times.


Correct. The data point (1896, 273) deviates from the pattern in the rest of the data. It does not follow the strong, negative association or the linear pattern.
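One way to see how far the 1896 point sits from the line is to compute its absolute prediction error, |observed − predicted|. The Python sketch below uses the regression equation reported in Question 2 and the data point (1896, 273) from the feedback above. The function name `absolute_prediction_error` is hypothetical, and Year is assumed to be the calendar year.

```python
# Coefficients copied from the StatCrunch output: Time = 994.19341 - 0.39304496 Year
INTERCEPT = 994.19341
SLOPE = -0.39304496

def absolute_prediction_error(year, observed_time):
    """Vertical distance, in seconds, between a data point and the regression line."""
    predicted = INTERCEPT + SLOPE * year
    return abs(observed_time - predicted)

# Data point (1896, 273) taken from the Question 4 feedback
error_1896 = absolute_prediction_error(1896, 273)
print(round(error_1896, 2))
```

Under these assumptions the line predicts roughly 249 seconds for 1896, so the observed 273 seconds lies about 24 seconds above the line.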
