PRINCIPLES OF NEUROPSYCHOMETRIC ASSESSMENT

Published on 10/04/2015 by admin

Filed under Neurology

Last modified 10/04/2015

CHAPTER 2 PRINCIPLES OF NEUROPSYCHOMETRIC ASSESSMENT

It might legitimately be asked why this chapter was written by both a neurologist and a neuropsychologist. The answer, in part, is that a neurologist who has worked closely with neuropsychologists is perhaps in the best position to interpret the discipline to his or her colleagues; neuropsychology is often a “black box” to neurologists, to a greater extent than neuropsychologists themselves may realize. This can lead to uncritical acceptance of neuropsychologists’ conclusions without the productive interaction that characterizes, for example, neuroradiological review sessions. At the other extreme, the real added value of expert neuropsychological assessment may be discounted by those unconvinced of its validity. In any event, the value of neuropsychological assessment is considerably increased when the neurologist requesting it understands its strengths, limitations and pitfalls, and the sort of data on which its conclusions are based.

COGNITIVE DOMAINS AND NEUROPSYCHOLOGICAL TESTS

Cognitive Domains

Cognitive domains are constructs (intellectual conceptualizations to explain observed phenomena, such as gravity) invoked to provide a coherent framework for analysis and testing of cognitive functions. The various cognitive processes in each domain are more or less related and are more or less independent of processes in other domains. Although these domains do not have strict, entirely separable neuroanatomical substrates, they do each depend on particular (but potentially overlapping) neural networks.1 In view of the way in which cognitive domains are delineated, it is not surprising that there is some variation in their stated number and properties, but commonly recognized ones with their potential neural substrates are listed in Table 2-1.

TABLE 2-1 Commonly Assessed Cognitive Domains and Their Potential Neural Substrate

Domain | Main Neural Substrate
Attention | Ascending reticular activating system, superior colliculus, thalamus, parietal lobe, anterior cingulate cortex, and the frontal lobe
Language | Classical speech zones, typically in the left dominant hemisphere, including Wernicke’s and Broca’s areas, and the angular gyrus
Memory | Hippocampal-entorhinal cortex complex; frontal regions; left parietal cortex
Object recognition (visual) | Ventral visual system: occipital regions to anterior pole of temporal lobe
Spatial processing | Posterior parietal cortex, frontal eye fields, dorsal visual system; inferotemporal/midtemporal and polar temporal cortex
Executive functioning | Frontal-subcortical circuits, including dorsolateral prefrontal, orbital frontal, and anterior cingulate circuits

Prerequisites for Meaningful Testing

Adequate testing within some domains requires that some others are sufficiently intact. For example, a patient whose sustained, focused attention (concentration) is severely compromised by a delirium is unable to register a word list adequately. Consequently, delayed recall is impaired, even in the absence of a true amnesia or its usual structural correlates. A patient with sufficiently impaired comprehension may perform poorly on the Wisconsin Card Sorting Test because the instructions were not understood, rather than because hypothesis generation was compromised. These considerations give rise to the concept of a pyramid of cognitive domains, with valid testing at each level dependent on the adequacy of lower level performance2 (Fig. 2-1).

In addition to intact attention and comprehension, patient performance may be compromised by poor motivation—for example, as a result of depression or in the setting of potential secondary gain—or by anxiety. Neurological impairments (e.g., poor vision, ataxia), psychiatric comorbid conditions, preexisting cognitive impairments (e.g., mental retardation), specific learning difficulties or lack of education (e.g., resulting in illiteracy), and lack of mastery of the testing language can all interfere with valid testing and must be carefully considered by the neuropsychologist in interpreting test results.3

BASIC PRINCIPLES OF PSYCHOMETRICS

Test Reliability

For a neuropsychological test (or any other test) to be clinically useful, it must be both reliable and valid. A reliable test is one for which differences in scores reflect true differences in what is being measured, rather than random variation (“noise”) or systematic bias (e.g., consistent differences between test scores at different centers). The reliability coefficient of a test is the proportion of the total variability in test scores that is attributable to true differences in what is being measured. It may also be conceptualized as the variability that would remain if a test were administered many times and the random variations canceled each other out, with no systematic bias assumed. (An analogy familiar to neurologists would be electronic averaging in evoked potentials.) Reliability coefficients of standard neuropsychological tests typically vary from about 0.70 (acceptable) to 0.95 (high).
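The definition above can be written compactly in the notation of classical test theory. The symbols are assumptions introduced here for illustration and do not appear in the chapter: an observed score X is taken to be the sum of a true score T and a random error E that is uncorrelated with T, and r_XX denotes the reliability coefficient.

```latex
X = T + E, \qquad
r_{XX} = \frac{\sigma_T^2}{\sigma_X^2}
       = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}

% Averaging n independent administrations leaves the true-score variance
% unchanged and shrinks the error variance to \sigma_E^2 / n:
r_{XX}^{(n)} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2 / n}
```

Under these assumptions, a reliability of 0.70 means that 70% of the variance in observed scores reflects true differences between examinees. The second expression is the averaging idea in the evoked-potential analogy: as n grows, the noise term shrinks and the coefficient approaches 1. Systematic bias is not represented in this simple model.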

Reliability may be assessed in a number of ways. Test-retest reliability captures both random variability resulting from the test itself and systematic bias resulting from practice effects, although it does not allow the clinician to distinguish easily between the two. It presupposes a stable test population, which may be an unattainable ideal over longer periods of time, inasmuch as the effects of acute pathological conditions such as strokes and traumatic injuries tend to improve and degenerative conditions tend to worsen. The internal consistency of a multi-item test can be gauged by split-half reliability, whereby scores from half the test items are compared with scores from the other half (although this leaves open the question of how the division should be made), or by calculating the mean reliability coefficient obtained from all possible split-half comparisons. The latter strategy generates a statistic called Cronbach’s α. Sometimes, alternative (parallel) versions of tests are constructed, often to facilitate serial testing while avoiding practice effects. The reliabilities of the different versions can then be compared, in a process very similar to split-half reliability testing. The difficulty, of course, is in knowing whether the two versions really are equivalent, so that variance between the two represents unreliability rather than differences in difficulty or in the variable or variables actually being measured.
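As a minimal sketch of the two internal-consistency measures described above, the following Python computes an odd/even split-half reliability and Cronbach’s α for a small, invented matrix of item scores. The function names and the data are hypothetical. The Spearman-Brown step-up applied to the half-test correlation and the variance-based formula for α are standard textbook forms, not taken from the chapter.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def split_half_reliability(scores):
    """Odd/even split-half reliability with the Spearman-Brown correction.

    scores: one row per examinee, one column per test item.
    The odd/even split is only one of many possible divisions.
    """
    first_half = [sum(row[0::2]) for row in scores]
    second_half = [sum(row[1::2]) for row in scores]
    r_half = pearson_r(first_half, second_half)
    # Step the half-test correlation up to the full test length.
    return 2 * r_half / (1 + r_half)


def cronbach_alpha(scores):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 examinees, 4 pass/fail items (1 = correct).
example_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

split_half = split_half_reliability(example_scores)
alpha = cronbach_alpha(example_scores)
print(f"split-half: {split_half:.2f}, Cronbach's alpha: {alpha:.2f}")
```

The split-half figure depends on which items fall in each half, which is the ambiguity noted in the text; α does not require that choice.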

Interrater reliability accounts for the variation in test scores resulting from administration by different testers. This is particularly important in multicenter studies and is an essential property of semiquantitative clinical rating scales.
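The chapter does not name a statistic for interrater reliability. One common choice for categorical ratings is Cohen’s kappa, which corrects the raw agreement between two raters for the agreement expected by chance. The sketch below is an illustration under that assumption; the function name and the ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same cases.

    Undefined (division by zero) when chance agreement equals 1,
    i.e. when both raters use a single category for every case.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Agreement expected if the raters assigned categories independently.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of four patients by two examiners.
examiner_1 = ["impaired", "impaired", "normal", "normal"]
examiner_2 = ["impaired", "normal", "normal", "normal"]

kappa = cohen_kappa(examiner_1, examiner_2)
print(f"kappa: {kappa:.2f}")
```

In this example the examiners agree on three of four patients, but half of that agreement would be expected by chance alone, so kappa is considerably lower than the raw agreement rate.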

Because reliability matters, tests must be administered with standardized materials, in a standardized manner, in a conducive environment, and by appropriately trained personnel (e.g., not by an intern in a noisy ward).

Test Validity

A valid test measures what it is purported to measure. Whereas an unreliable test cannot be valid (as score variations reflecting true differences in the intended measured variable are concealed by noise or systematic bias), reliability itself is no guarantee of validity. Consideration of the following test of semantic knowledge illustrates this point:

All readers presumably score 75% on this test, which is therefore absolutely reliable but quite invalid as a test of semantic knowledge.

Validity can be gauged in a number of ways. Criterion validity reflects the utility of the test in decision making. Perhaps the ideal form of criterion validity is predictive validity, in which test results are used to make a decision or prediction, such as which patients with amnestic mild cognitive impairment will convert to Alzheimer’s disease, and the validity of the decision is subsequently established on follow-up. Such studies tend to be long and expensive, however, and so other methods of assessing validity are often required.
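One way a predictive validity study of this kind might be summarized is to compare the baseline test-based prediction with the outcome established at follow-up. The sketch below assumes a simple yes/no prediction for each patient; the function name, the choice of summary measures, and the data are all hypothetical and are not drawn from the chapter.

```python
def predictive_summary(predicted, converted):
    """Compare baseline predictions with outcomes observed at follow-up.

    predicted: True if the baseline test predicted conversion.
    converted: True if the patient had converted at follow-up.
    """
    pairs = list(zip(predicted, converted))
    tp = sum(p and c for p, c in pairs)          # predicted, converted
    fp = sum(p and not c for p, c in pairs)      # predicted, did not convert
    fn = sum(not p and c for p, c in pairs)      # missed converters
    tn = sum(not p and not c for p, c in pairs)  # correctly cleared
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictive_value": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
    }


# Hypothetical cohort of eight patients.
baseline_prediction = [True, True, True, False, False, False, False, True]
followup_outcome = [True, True, False, False, False, True, False, True]

summary = predictive_summary(baseline_prediction, followup_outcome)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The predictive values, unlike sensitivity and specificity, depend on how common conversion is in the cohort studied, so they do not transfer automatically to other populations.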

Concurrent validity, another form of criterion validity often used instead, involves comparing test results with a nontest parameter of relevance, such as comparing scores on a test of sustained, directed attention in children with their class disciplinary records. Ecological validity, a related concept, reflects the predictive value of the test for performance in real-world situations. For example, neuropsychological tests of visual attention and executive function, but not of other domains, have been found to have reasonable ecological validity for predicting driving safety, in comparison with the “gold standard” of on-road testing.4

Construct validity assesses whether, for example, a test purportedly of a particular cognitive domain is correlated with other established tests of that particular domain and functions as tests of that domain are expected to function.

Content validity concerns checking the test items against the boundaries and content of the domain (or portion of the domain) to be assessed. Face validity exists when, to a layperson (such as the subject undergoing testing), a test seems to measure what it is purported to measure. Thus, a driving simulator has good face validity as a test of on-road safety, whereas an overlapping figures test of figure/ground discrimination may not, even though it may actually be relevant to perceptual tasks during driving. More detailed discussions of reliability and validity are given by Mitrushina and associates (2005), Halligan and colleagues (2003), and Murphy and Davidshofer (2004).