Okay...but how do I know if my r value is close enough?

The minimum r value for an acceptable correlation in a regression depends mainly on the number of data points being tested and the chosen significance level (the alpha level). There are equations for determining it, but they lead to a lengthy derivation process, so instead we've provided a handy little tool for you to use. All you need to know are the degrees of freedom and the alpha level. Note, however, that the basic regression output of a TI-83 calculator does not list the degrees of freedom; for a simple linear regression they equal the number of data points minus 2.
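If you would rather see the arithmetic than trust a tool, here is a minimal Python sketch of the conversion between r and the t statistic. It assumes a simple linear regression (one predictor, so degrees of freedom = n - 2) and a critical t value that you look up yourself in a standard t table; the function names and the sample size of 12 are our own choices for illustration and are not part of the tool.

```python
import math

def t_statistic(r, df):
    """t statistic for the test 'slope = 0', given r and df (requires |r| < 1)."""
    return r * math.sqrt(df / (1.0 - r * r))

def r_from_t(t, df):
    """Invert the formula above: the r value that corresponds to a given t."""
    return t / math.sqrt(t * t + df)

# Assumed example: 12 data points, so df = 12 - 2 = 10.
n = 12
df = n - 2

# Two-tailed critical t for df = 10 and alpha = 0.05, taken from a t table.
t_crit = 2.228

r_min = r_from_t(t_crit, df)
print(round(r_min, 3))  # 0.576
```

With 12 points and alpha = 0.05, an r whose absolute value is below about 0.576 would not count as a significant correlation under these assumptions.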



Statistical Hypothesis

Null: Slope of Regression line = 0
Alternate: Slope of Regression Line ≠ 0

This is where the P-value comes into play. If the P-value is less than alpha, the Null Hypothesis is rejected. If the P-value is greater than alpha, the Null Hypothesis is retained. In order for a regression line to have any predicting value, it must have a nonzero slope.
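The decision rule above can be sketched in a few lines of Python. This is only an illustration, not the output of any particular package: it assumes a simple linear regression (df = n - 2), a two-tailed test, and |r| < 1, and it approximates the P-value by integrating the t distribution numerically with Simpson's rule. The function names and the example r values are ours.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed P-value: 1 minus twice the area under the density on [0, |t|].

    The area is approximated with Simpson's rule (steps must be even).
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)

def slope_test(r, n, alpha=0.05):
    """Return the P-value and the decision for the test 'slope = 0'."""
    df = n - 2  # assumption: one predictor
    t = r * math.sqrt(df / (1 - r * r))
    p = two_tailed_p(t, df)
    decision = "reject null" if p < alpha else "retain null"
    return p, decision

# Two made-up r values, each from a hypothetical data set of 12 points.
for r in (0.7, 0.3):
    p, decision = slope_test(r, n=12)
    print(f"r = {r}: P-value = {p:.4f}, {decision}")
```

With 12 points, r = 0.7 gives a P-value below 0.05 (the Null is rejected), while r = 0.3 gives a P-value well above 0.05 (the Null is retained).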

But where did that r value come from?

When you perform a regression, you select an alpha value, usually 0.05. One of the outputs yielded by the regression test is a P-value. In order for there to be a significant correlation between your data and the regression line, the P-value must be smaller than your selected alpha level. When you input your alpha level and degrees of freedom into our handy little tool, it returns the r value that will yield the largest P-value that is still less than alpha; it is the minimum r value for a good correlation.
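To make the idea of "the smallest r whose P-value is still below alpha" concrete, here is a rough Python sketch that searches for that r by bisection. We do not know how the tool itself is implemented, so treat this as one possible approach under stated assumptions: a simple linear regression, a two-tailed test, and a P-value approximated by Simpson's rule. All function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def p_value_from_r(r, df, steps=2000):
    """Two-tailed P-value for a correlation r (Simpson's rule, steps even)."""
    if abs(r) >= 1:
        return 0.0
    upper = abs(r) * math.sqrt(df / (1 - r * r))  # |t| statistic
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)

def minimum_r(alpha, df, tol=1e-6):
    """Smallest |r| whose P-value falls below alpha, found by bisection.

    The P-value shrinks as |r| grows, so the answer is bracketed by
    r = 0 (P-value 1) and r = 1 (P-value 0).
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if p_value_from_r(mid, df) < alpha:
            hi = mid  # mid is already significant; try a smaller r
        else:
            lo = mid  # mid is not significant; need a larger r
    return hi

print(round(minimum_r(0.05, 10), 3))  # 0.576
```

For alpha = 0.05 and 10 degrees of freedom the search lands on about 0.576, which matches the value printed in standard tables of critical r values.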

NOTE: It is important to pay attention to both the P-value and the r value. The P-value will tell you if the line has any predicting power, and the r value will tell you if the line's predicting power is good enough. Even if the points fall perfectly on a line, if that line has a slope of zero, it has no predicting power and r is considered to be 0.

The reader should also note that when using Microsoft Excel to perform a regression, it yields three r values: multiple r, r square, and adjusted r square. The r value discussed here, the multiple r, is the SQUARE ROOT of the "r square" value given.
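As a quick check on how these three values relate, here is a small Python sketch. The r square of 0.81 and the sample size of 12 are made-up numbers, and the adjusted r square line uses the standard textbook formula with k predictors (k = 1 for a simple linear regression), which we assume is what Excel reports.

```python
import math

def multiple_r(r_square):
    """Multiple r is the non-negative square root of r square."""
    return math.sqrt(r_square)

def adjusted_r_square(r_square, n, k=1):
    """Textbook adjusted r square for n data points and k predictors."""
    return 1 - (1 - r_square) * (n - 1) / (n - k - 1)

# Hypothetical regression output: r square = 0.81 from 12 data points.
r_square = 0.81
print(round(multiple_r(r_square), 3))             # 0.9
print(round(adjusted_r_square(r_square, 12), 3))  # 0.791
```

Keep in mind that multiple r is never negative, so if your regression line slopes downward, the r for your data carries a minus sign that the square root does not show.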

Exploring Regression

In the previous menu, the Diamond, Tokamak, and Nuclear activities use linear regression to analyze data sets. The activities assume that you have access to Excel, a TI-83 calculator or another software package capable of performing inferential tests.


Original work on this document was done by Central Virginia Governor's School students Ashley Farmer, Josh Nelson and Sara Throckmorton (Class of '98). Revisions were made by Ryan Malec, John Lewis and Terri Kendrick (Class of '05).


Copyright © 2004 Central Virginia Governor's School for Science and Technology Lynchburg, VA