Norm- and Criterion-Referenced Assessment


Norm-referenced tests (or NRTs) compare an examinee’s performance to that of other examinees. Standardized examinations such as the SAT are norm-referenced tests. The goal is to rank the set of examinees so that decisions about their opportunity for success (e.g. college entrance) can be made.

Criterion-referenced tests (or CRTs) differ in that each examinee’s performance is compared to a pre-defined set of criteria or a standard. The goal of these tests is to determine whether the candidate has demonstrated mastery of a certain skill or set of skills. Results are usually reported as “pass” or “fail” and are used in making decisions about job entry, certification, or licensure. A national board medical exam is an example of a CRT: either the examinee has the skills to practice the profession, in which case he or she is licensed, or the examinee does not.

Under norm-referencing, students are awarded their grades on the basis of their ranking within a particular cohort. Norm-referencing involves fitting a ranked list of students’ ‘raw scores’ to a pre-determined distribution for awarding grades. Usually, grades are spread to fit a ‘bell curve’ (a ‘normal distribution’ in statistical terminology), either by informal, qualitative rough reckoning or by statistical techniques of varying complexity. For large student cohorts (such as in senior secondary education), statistical moderation processes are used to adjust or standardise student scores to fit a normal distribution. This adjustment is necessary when scores must be comparable across different subjects (such as when subject scores are added to create an aggregate ENTER score for making university selection decisions).
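The idea of fitting a ranked list of raw scores to a pre-determined grade distribution can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `grade_on_curve`, the grade labels, and the 10/20/40/20/10 percent quotas are hypothetical and do not come from any particular grading scheme, and real statistical moderation is considerably more involved.

```python
# Hypothetical grade quotas (percent of the cohort per grade, best grade first).
# These shares are illustrative assumptions, not values from the text.
GRADE_QUOTAS = [("A", 10), ("B", 20), ("C", 40), ("D", 20), ("E", 10)]


def grade_on_curve(raw_scores, quotas=GRADE_QUOTAS):
    """Assign grades from each student's rank in the cohort.

    raw_scores maps a student name to a raw score. Tied scores are ranked
    in input order here; a real scheme would need an explicit tie rule.
    """
    # Rank students from the highest raw score to the lowest.
    ranked = sorted(raw_scores, key=raw_scores.get, reverse=True)
    n = len(ranked)
    grades = {}
    start = 0
    cumulative = 0
    for grade, share in quotas:
        cumulative += share
        end = round(cumulative * n / 100)  # last rank position for this grade
        for student in ranked[start:end]:
            grades[student] = grade
        start = end
    return grades


cohort = {
    f"student_{i}": score
    for i, score in enumerate([91, 85, 78, 74, 70, 66, 61, 57, 49, 38], 1)
}
grades = grade_on_curve(cohort)

# Adding 20 marks to every raw score leaves the ranking unchanged,
# so every student keeps the same grade.
shifted = {name: score + 20 for name, score in cohort.items()}
assert grade_on_curve(shifted) == grades
```

The final check shows the defining property of norm-referencing: a grade depends only on where a student stands in the cohort, not on the absolute level of the raw score.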

Comparison of criterion-referenced and norm-referenced tests, by dimension

Dimension: Purpose

Criterion-referenced tests: To determine whether each student has achieved specific skills or concepts. To find out how much students know before instruction begins and after it has finished.

Norm-referenced tests: To rank each student with respect to the achievement of others in broad areas of knowledge. To discriminate between high and low achievers.

Dimension: Content

Criterion-referenced tests: Measures specific skills which make up a designated curriculum. These skills are identified by teachers and curriculum experts. Each skill is expressed as an instructional objective.

Norm-referenced tests: Measures broad skill areas sampled from a variety of textbooks, syllabi, and the judgments of curriculum experts.

Dimension: Item characteristics

Criterion-referenced tests: Each skill is tested by at least four items in order to obtain an adequate sample of student performance and to minimize the effect of guessing. The items which test any given skill are parallel in difficulty.

Norm-referenced tests: Each skill is usually tested by fewer than four items. Items vary in difficulty. Items are selected that discriminate between high and low achievers.

Dimension: Score interpretation

Criterion-referenced tests: Each individual is compared with a preset standard for acceptable achievement. The performance of other examinees is irrelevant. A student’s score is usually expressed as a percentage. Student achievement is reported for individual skills.

Norm-referenced tests: Each individual is compared with other examinees and assigned a score, usually expressed as a percentile, a grade-equivalent score, or a stanine. Student achievement is reported for broad skill areas, although some norm-referenced tests do report student achievement for individual skills.


