Standard Error of Measurement

An individual's true score would equal the average of his or her scores (observed scores) on every possible version of a particular test, since averaging over all versions would cancel out the measurement error associated with any one test design. Because administering every possible version is impossible, standardized tests usually have an associated standard error of measurement (SEM), an index of the expected variation in observed scores due to measurement error. The SEM is expressed in standard deviation units and can be related to the normal curve.

Relating the SEM to the normal curve, using the observed score as the mean, allows educators to determine the range of scores within which the true score may fall. For example, if a student received an observed score of 25 on an achievement test with an SEM of 2, the student can be about 95% (or ±2 SEMs) confident that his true score falls between 21 and 29 (25 ± (2 + 2), or 25 ± 4). He can be about 99% (or ±3 SEMs) certain that his true score falls between 19 and 31 (25 ± (2 + 2 + 2), or 25 ± 6).
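The arithmetic in this example can be sketched in a few lines of Python. The function name score_band and its parameters are illustrative only; they are not part of any testing library.

```python
def score_band(observed, sem, n_sems):
    """Return the (low, high) range observed ± n_sems * sem.

    observed: the observed test score
    sem:      the test's standard error of measurement
    n_sems:   how many SEMs to extend on each side (2 for about 95%,
              3 for about 99%, following the normal curve)
    """
    margin = n_sems * sem
    return observed - margin, observed + margin


# The example from the text: observed score 25, SEM 2.
print(score_band(25, 2, 2))  # (21, 29)
print(score_band(25, 2, 3))  # (19, 31)
```

The band widens as either the SEM or the desired level of confidence grows.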

Viewed another way, the student can determine that if he took a different edition of the exam in the future, assuming his knowledge remains constant, he can be about 95% (±2 SEMs) confident that his score will fall between 21 and 29, and he can be about 99% (±3 SEMs) confident that his score will fall between 19 and 31. Based on this information, he can decide whether it is worth retesting to improve his score.

The SEM is related to reliability. As the reliability increases, the SEM decreases. The greater the SEM, or the lower the reliability, the more variance in observed scores can be attributed to poor test design rather than to a test-taker's ability.
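The text does not give a formula for this relationship. In classical test theory it is commonly written as SEM = SD × sqrt(1 − reliability), where SD is the standard deviation of observed scores on the test. The sketch below assumes that formula; the function name and the SD of 10 are hypothetical values chosen only for illustration.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM under the classical test theory formula (an assumption here):
    SEM = SD * sqrt(1 - reliability).

    sd:          standard deviation of observed scores on the test
    reliability: reliability coefficient, between 0 and 1 inclusive
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical test with an observed-score SD of 10:
# the SEM shrinks as the reliability coefficient rises.
for r in (0.70, 0.90, 0.99):
    print(r, round(standard_error_of_measurement(10, r), 2))
```

A reliability of exactly 1.00 would give an SEM of 0, which is why neither occurs in practice.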

Think about the following situation. You are taking the NTEs or another important test that is going to determine whether or not you receive a license or get into a school. You want to be confident that your score is reliable, i.e., that the test is measuring what is intended, and that you would get approximately the same score if you took a different version. (Most standardized tests have high reliability coefficients, between 0.9 and 1.0, and small errors of measurement.)

Because no test has a reliability coefficient of 1.00, or an error of measurement of 0, observed scores should be thought of as a representation of a range of scores, and small differences in observed scores should be attributed to errors of measurement.
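One informal way to judge whether a difference between two observed scores is "small" is to check whether their score ranges overlap. The sketch below assumes that heuristic, which the text does not prescribe, and uses ±2 SEMs by default; the function name bands_overlap is hypothetical.

```python
def bands_overlap(score_a, score_b, sem, n_sems=2):
    """Return True if the ranges score ± n_sems * sem overlap.

    This is an illustrative rule of thumb, not a formal significance test.
    Two ranges of equal half-width overlap exactly when the scores differ
    by no more than twice that half-width.
    """
    margin = n_sems * sem
    return abs(score_a - score_b) <= 2 * margin


# With an SEM of 2, scores of 25 and 27 have overlapping ranges
# (21 to 29 and 23 to 31), so the difference may be measurement error.
print(bands_overlap(25, 27, sem=2))  # True

# Scores of 25 and 35 have ranges that do not overlap (21 to 29 and 31 to 39).
print(bands_overlap(25, 35, sem=2))  # False
```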
