Correlation of the Lifespan Data

In this activity you will perform a correlation analysis of lifespan data to determine whether there is any statistical basis for a relationship between the number of people per doctor and the average lifespan. The data were collected from the forty most populous countries in the world. Each record consists of the name of the country, the number of citizens per doctor, and the average female lifespan (measured in years). The name of the country is not used in the correlation; it is included for reference.

The correlation analysis will tell you how "strong" the linear relationship between the two data sets is. The analysis results in a correlation coefficient (r-value) between -1 and 1. An r-value of zero means that there is no linear correlation between the data sets, while an r-value of ±1 means that the correlation is perfect. You can assume that an r-value close to zero indicates a poor correlation and that an r-value close to ±1 indicates a good correlation.

Using Excel (the TI-83 is excluded from this activity because of its inability to deal with such a large data set), perform a correlation of the average female lifespan and the number of citizens per doctor. Then answer the questions at the bottom of the page. You may want to create a scatter plot as an aid in analyzing the data.
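For readers who want to check a spreadsheet result by hand, the following is a minimal Python sketch of how a Pearson correlation coefficient can be computed. It is offered as an illustration only, not as the method this activity requires. The function name pearson_r and the five sample values are assumptions made up for the example; they are not taken from the activity's data files. In Excel, the built-in CORREL function computes the same quantity.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative values, NOT taken from the activity's data files
people_per_doctor = [400, 600, 1200, 2500, 6000]
female_lifespan = [79, 77, 72, 66, 58]

r = pearson_r(people_per_doctor, female_lifespan)
print(round(r, 3))
```

To use the sketch with the real data, replace the two sample lists with the citizens-per-doctor and female-lifespan columns from the data file, keeping the values in the same country order.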

The data are available in two files: one in Excel format and one in plain-text format.

What is the correlation coefficient of the data? What does this coefficient tell you about the relationship between these two variables? What is the difference between a positive and a negative coefficient? Does the sign of the coefficient change anything about these results?


Original work on this document was done by Central Virginia Governor's School students Christian Neeley (Class of '98) and Patrick Burke (Class of '99).


Copyright © 1997 Central Virginia Governor's School for Science and Technology Lynchburg, VA