Correlation

Correlation analysis measures how two variables are related. The correlation coefficient (r) is a statistic that tells you the strength and direction of that relationship. It is expressed as a positive or negative number between -1 and 1. The magnitude of the number indicates the strength of the relationship: values near 0 indicate little or no linear relationship, and values near -1 or 1 indicate a strong one.

The sign of the correlation coefficient indicates whether the direction of the relationship is positive (direct) or negative (inverse).

Variables that have a direct relationship (a positive correlation) increase together and decrease together.

In an inverse relationship (a negative correlation), one variable increases while the other decreases.

While the sign indicates how one variable changes with respect to another variable, the magnitude of the number indicates the strength of the relationship.
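To make the roles of sign and magnitude concrete, here is a minimal Python sketch that computes r for two small data sets. The function name `pearson_r` and the numbers in `x`, `y_direct`, and `y_inverse` are made up for illustration and do not come from any study; the formula used is the usual Pearson one, which is an assumption about which coefficient is meant here.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical illustrative data, not taken from any real study.
x = [1, 2, 3, 4, 5]
y_direct = [2, 4, 5, 4, 5]    # tends to rise as x rises
y_inverse = [5, 4, 4, 2, 1]   # tends to fall as x rises

r_direct = pearson_r(x, y_direct)
r_inverse = pearson_r(x, y_inverse)

print("direct: ", round(r_direct, 3))    # sign is positive
print("inverse:", round(r_inverse, 3))   # sign is negative
```

In both cases the sign gives the direction, and the distance of r from 0 gives the strength.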

It is important to remember that while correlation coefficients can be used for prediction (i.e., if we know the value of one variable and the correlation, we can predict the value of the second variable), they may NOT be used to infer causation (i.e., we cannot say that one variable causes another).
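As a rough sketch of what prediction from a correlation can look like, the standard least-squares line predicts y from x using r together with each variable's mean and standard deviation. Note that this needs those summary statistics in addition to r itself. The function name `predict_y` and all numbers below are hypothetical, chosen only to illustrate the arithmetic.

```python
def predict_y(x, mean_x, sd_x, mean_y, sd_y, r):
    """Predict y from x with the least-squares line implied by r.

    The slope of that line is r * (sd_y / sd_x), and the line passes
    through the point (mean_x, mean_y).
    """
    if sd_x <= 0:
        raise ValueError("sd_x must be positive")
    slope = r * sd_y / sd_x
    return mean_y + slope * (x - mean_x)

# Hypothetical summary statistics, for illustration only.
predicted = predict_y(x=85, mean_x=75, sd_x=10, mean_y=70, sd_y=8, r=0.75)
print(predicted)
```

A prediction of this kind says nothing about why the two variables move together, which is why it does not license a causal claim.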

Example

Suppose you are reading a study of Regents exams. The investigator wanted to know if performance in grade school was related to scores on the Regents exams. He did a correlation analysis on grade school performance and Regents exam score, and found that r = .75 in his study. This tells you two things:

  1. r is positive, so grade school performance and Regents exam score tend to increase and decrease together.
  2. r is fairly close to 1, so the direct relationship is fairly strong.
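The two-step reading above (sign first, then magnitude) can be sketched as a small Python helper. The function name `describe_r`, the strength labels, and the cutoffs 0.3 and 0.7 are assumptions made for illustration; there is no single agreed set of thresholds, and different fields use different ones.

```python
def describe_r(r, weak_below=0.3, strong_from=0.7):
    """Return (direction, strength) for a correlation coefficient r.

    The cutoffs are illustrative defaults only, not a standard.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")

    # Step 1: the sign gives the direction of the relationship.
    if r > 0:
        direction = "direct (positive)"
    elif r < 0:
        direction = "inverse (negative)"
    else:
        direction = "none"

    # Step 2: the magnitude gives the strength of the relationship.
    magnitude = abs(r)
    if magnitude < weak_below:
        strength = "weak"
    elif magnitude < strong_from:
        strength = "moderate"
    else:
        strength = "fairly strong"

    return direction, strength

# The value reported in the example study.
summary = describe_r(0.75)
print(summary)
```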

If a correlation exists between two variables, this does NOT imply that one variable causes another. Causation and correlation are two very different things.

The two correlation coefficients that appear most often in the literature are the Pearson product-moment coefficient and the Spearman rank correlation coefficient.
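The difference between the two can be sketched in Python: the Spearman coefficient is the Pearson coefficient computed on the ranks of the data rather than on the raw values. The helper names (`pearson_r`, `average_ranks`, `spearman_r`) and the data are hypothetical and chosen only to show a case where the two coefficients disagree.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_r(xs, ys):
    """Spearman rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y always increases with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

pearson_value = pearson_r(x, y)
spearman_value = spearman_r(x, y)
print("Pearson: ", round(pearson_value, 3))
print("Spearman:", round(spearman_value, 3))
```

Because y rises whenever x rises, the ranks agree perfectly and the Spearman value is 1, while the Pearson value is below 1 because the relationship is not a straight line.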