Whether in a classroom, as a requirement for college admission, or to obtain something as simple as a driver’s license, few individuals in today’s society can avoid taking tests. The scores received on these tests say something about what we know. Our knowledge level, as measured by the test, can be compared either with the performance of other individuals (a relative standard) or with an absolute, predetermined standard. When an individual’s performance on an examination is compared with a predetermined or absolute standard, the examination is called a criterion-referenced test.
Although the notion of taking a test has not changed, the way we develop and interpret tests has changed drastically over the past 40 to 50 years. One of those changes began taking shape in the early 1960s, when Robert Glaser coined the phrase criterion-referenced measurement and wrote about the distinction between norm-referenced and criterion-referenced measurement. Until that time, norm-referenced tests, which compare examinees with a relative standard, were the customary model. Since then, the procedures associated with the development and use of criterion-referenced tests have been refined into well-accepted practice.
The purpose of a criterion-referenced test is to measure an individual’s level of skill or mastery over the specific body of knowledge represented by the test. As a result, certain design characteristics must be considered when developing criterion-referenced tests. First, we must consider what material the examination should cover. Because we want to make inferences from test performance about mastery, the subject matter covered in a criterion-referenced test needs to be dictated by specific goals, instructional objectives, or outcomes that accurately and narrowly define the domain. Second, the items written for the examination, in both format and number, must be a representative sample of the content area over which we are determining mastery. The final piece in the development of a criterion-referenced test involves setting the performance standard, or cutoff score. Many descriptive phrases are used to categorize examinees on the basis of criterion-referenced test results. Examples include pass/fail, mastery/nonmastery, certified/not certified, licensed/not licensed, and proficient/not proficient. The performance standard, or cutoff score, tells us exactly at what point that decision should be made for the individual test taker.
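The role of the cutoff score can be illustrated with a minimal Python sketch. The names, scores, and the cutoff of 70 are hypothetical values chosen only for illustration, and the percentile rank shown here (the percentage of the group scoring strictly below a given score) is just one simple way to express a norm-referenced comparison. The point is the contrast: the criterion-referenced decision depends only on the fixed standard, whereas the percentile rank depends on who else took the test.

```python
from bisect import bisect_left


def criterion_decision(score, cutoff):
    """Criterion-referenced: compare the score with a fixed cutoff."""
    return "pass" if score >= cutoff else "fail"


def percentile_rank(score, group_scores):
    """Norm-referenced: percentage of the group scoring strictly below."""
    ordered = sorted(group_scores)
    below = bisect_left(ordered, score)  # count of scores less than `score`
    return 100.0 * below / len(ordered)


# Hypothetical cutoff and examinee scores (assumptions, not from the text).
CUTOFF = 70
scores = {"Ana": 68, "Ben": 70, "Cy": 91}
group = list(scores.values())

for name, score in scores.items():
    decision = criterion_decision(score, CUTOFF)
    rank = percentile_rank(score, group)
    print(f"{name}: score={score}, decision={decision}, percentile rank={rank:.1f}")
```

If the comparison group changed, each percentile rank could change, but every pass/fail decision would stay the same as long as the cutoff stayed at 70.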
Criterion-referenced tests are useful when we want to make inferences from test performance about what a person can do. Success on a criterion-referenced test does not imply perfect knowledge; rather, it implies that an individual has met the established performance standard. The examinee, at this point, has demonstrated an acceptable level of the skills and abilities required to be considered a master, proficient, or certified.
Everyday examples of criterion-referenced tests abound. During the elementary school years, there are tests to determine something as simple as whether students can tell time or whether they know foundational mathematical concepts such as multiplication tables. A few years later, most people take a criterion-referenced test to obtain a driver’s license, demonstrating that they have acceptable skills to operate a vehicle safely on the roadway. Upon entering the workforce, many members of our society are required to pass a criterion-referenced test in order to enter their chosen profession; physicians, for example, must prove that they are capable of caring for and treating patients appropriately.
Few stages of life are exempt from criterion-referenced testing in one form or another. Due to the increased demand for testing in general, and the immense practicality of criterion-referenced testing, its place in the measurement arena is guaranteed.