6: Diagnostic Rating Scales and Psychiatric Instruments

Published on 24/05/2015 by admin

Last modified 24/05/2015


CHAPTER 6 Diagnostic Rating Scales and Psychiatric Instruments


Unlike other medical specialties, psychiatry relies almost exclusively on patient interviews and on observation for diagnosis and treatment monitoring. In the absence of specific physical or biomarker findings in psychiatry, the mental status examination (MSE) represents our primary diagnostic instrument. The MSE provides a framework to collect the affective, behavioral, and cognitive symptoms of psychiatric disorders. Often, the MSE provides enough detail for psychiatrists to categorize symptom clusters into recognized clinical syndromes, and to initiate appropriate treatment.

However, in some settings, the MSE alone is insufficient to collect a complete inventory of patient symptoms or to yield a unifying diagnosis. For example, if a psychotic patient has symptoms of avolition, flat affect, and social withdrawal, it might be difficult to determine (from the standard diagnostic interview alone) whether this pattern reflects negative symptoms, co-morbid depression, or medication-induced akinesia.1 At other times, performing an MSE may not be an efficient use of time or resources to achieve the desired clinical goal: imagine how many fewer patients might be identified during depression screening days if the lengthy, full MSE were the screening instrument of choice.2 Finally, the subjective nature of the MSE often makes it unsuitable for research studies, in which multiple clinicians may be assessing subjects; without the use of an objective, reliable diagnostic tool, subjects may be inadequately or incorrectly categorized, generating results that are difficult to interpret and from which it is difficult to generalize.3

By using diagnostic rating scales, clinicians can obtain objective, and sometimes quantifiable, information about a patient’s symptoms in settings where the traditional MSE is either inadequate or inappropriate. Rating scales may serve as an adjunct to the diagnostic interview, or as stand-alone measures (as in research or screening milieus). These instruments are as versatile as they are varied, and can be used to aid in symptom assessment, diagnosis, treatment planning, or treatment monitoring. In this chapter, an overview of many of the psychiatric diagnostic rating scales used in clinical care and research is provided (Table 6-1). Information on how to acquire copies of the rating scales discussed in this chapter is available in the Appendix.

Table 6-1 Diagnostic Rating Scales Described in Chapter 6

General Ratings
SCID-I and SCID-CV Structured Clinical Interview for DSM-IV Diagnosis
MINI Mini-International Neuropsychiatric Interview
SCAN Schedules for Clinical Assessment in Neuropsychiatry
GAF Global Assessment of Functioning Scale
CGI Clinical Global Impressions Scale
Mood Disorders
HAM-D Hamilton Depression Rating Scale
BDI Beck Depression Inventory
IDS Inventory of Depressive Symptomatology
Zung SDS Zung Self-Rating Depression Scale
HANDS Harvard Department of Psychiatry National Depression Screening Day Scale
MSRS Manic State Rating Scale
Y-MRS Young Mania Rating Scale
Psychotic Disorders
PANSS Positive and Negative Syndrome Scale
BPRS Brief Psychiatric Rating Scale
SAPS Scale for the Assessment of Positive Symptoms
SANS Scale for the Assessment of Negative Symptoms
SDS Schedule of the Deficit Syndrome
AIMS Abnormal Involuntary Movement Scale
BARS Barnes Akathisia Rating Scale
EPS Simpson-Angus Extrapyramidal Side Effects Scale
Anxiety Disorders
HAM-A Hamilton Anxiety Rating Scale
BAI Beck Anxiety Inventory
Y-BOCS Yale-Brown Obsessive Compulsive Scale
BSPS Brief Social Phobia Scale
CAPS Clinician Administered PTSD Scale
Substance Abuse Disorders
CAGE CAGE questionnaire
MAST Michigan Alcoholism Screening Test
DAST Drug Abuse Screening Test
FTND Fagerstrom Test for Nicotine Dependence
Cognitive Disorders
MMSE Mini-Mental State Examination
CDT Clock Drawing Test
DRS Dementia Rating Scale


How “good” is a given diagnostic rating scale? Will it measure what the clinician wants it to measure, and will it do so consistently? How much time and expense will it require to administer? These questions are important to consider regardless of which diagnostic rating scale is used, and in what setting. Before describing the various rating scales in detail, several factors important to evaluating rating scale design and implementation will be considered (Table 6-2).

Table 6-2 Factors Used to Evaluate Diagnostic Rating Scales

Reliability: For a given subject, are the results consistent across different evaluators, test conditions, and test times?
Validity: Does the instrument truly measure what it is intended to measure? How well does it compare to the gold standard?
Sensitivity: If the disorder is present, how likely is it that the test is positive?
Specificity: If the disorder is absent, how likely is it that the test is negative?
Positive predictive value: If the test is positive, how likely is it that the disorder is present?
Negative predictive value: If the test is negative, how likely is it that the disorder is absent?
Cost- and time-effectiveness: Does the instrument provide accurate results in a timely and inexpensive way?
Administration: Are ratings determined by the patient or the evaluator? What are the advantages and disadvantages of this approach?
Training requirements: What degree of expertise is required for valid and reliable measurements to occur?

A first consideration concerns the psychometric measures of reliability and validity. For the psychotic patient mentioned earlier, what would happen if several different physicians watched a videotape of a diagnostic interview, and then independently scored her negative symptoms with a rating scale? The scale would be considered reliable if each of the physicians arrived at a similar rating of her negative symptoms. Reliability refers to the extent that an instrument produces consistent measurements across different raters and testing milieus. In this case, the negative symptom rating scale specifically demonstrates good inter-rater reliability, which occurs when several different observers reach similar conclusions based on the same information.
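One way to put a number on inter-rater agreement is sketched below. It uses Cohen's kappa, a standard chance-corrected agreement statistic for two raters; the chapter does not name a specific statistic, so this choice is an assumption made for illustration. The function name and the ratings are hypothetical, and the example imagines two physicians each scoring the same eight videotaped interviews as "high" or "low" on negative symptoms.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters on the same subjects."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must score the same non-empty set of subjects")
    n = len(ratings_a)
    # Observed agreement: fraction of subjects given the same rating by both raters.
    observed = sum(x == y for x, y in zip(ratings_a, ratings_b)) / n
    # Expected agreement if each rater assigned categories independently
    # at their own observed base rates.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings, invented purely for illustration.
rater_1 = ["high", "high", "low", "low", "high", "low", "low", "high"]
rater_2 = ["high", "low", "low", "low", "high", "low", "high", "high"]
kappa = cohens_kappa(rater_1, rater_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

A kappa of 1 indicates perfect agreement and a kappa of 0 indicates agreement no better than chance. With more than two raters, as in the videotape scenario above, a multi-rater statistic would be needed instead.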

Recall that for this patient, though, negative symptoms constituted only one possible etiology for her current presentation. If the underlying problem truly reflected a co-morbid depression, and not negative symptoms, a valid negative symptom rating scale would indicate a low score, and a valid depression rating scale would yield a high score. The validity of a rating scale concerns whether it correctly detects the true underlying condition. In this case, the negative symptom scale produced a true negative result, and the depression scale produced a true positive result (Table 6-3). However, if the negative symptom scale had indicated a high score, a type 1 error would have occurred, and the patient might have been incorrectly diagnosed with negative symptoms. Conversely, if the depression scale had produced a low score, a type 2 error would have led the clinician to miss the correct diagnosis of depression. The related measures of sensitivity, specificity, positive predictive value, and negative predictive value (defined in Table 6-2 and illustrated in Table 6-3) can provide estimates of a diagnostic rating scale’s validity, especially in comparison to “gold standard” tests.

Table 6-3 Validity Calculations

                 Disorder Present      Disorder Not Present
Test positive    A (true positive)     B (type 1 error)
Test negative    C (type 2 error)      D (true negative)

Sensitivity = A/(A + C)

Specificity = D/(B + D)

Positive predictive value = A/(A + B)

Negative predictive value = D/(C + D)

False-positive rate = B/(B + D) = 1 minus specificity

False-negative rate = C/(A + C) = 1 minus sensitivity
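The four core calculations in Table 6-3 can be sketched in a few lines of Python. The class name and the counts below are hypothetical, chosen only to illustrate the arithmetic; the fields a, b, c, and d correspond to cells A through D of the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a 2x2 table laid out as in Table 6-3."""
    a: int  # test positive, disorder present (true positive)
    b: int  # test positive, disorder not present (type 1 error)
    c: int  # test negative, disorder present (type 2 error)
    d: int  # test negative, disorder not present (true negative)

    def sensitivity(self) -> float:
        return self.a / (self.a + self.c)

    def specificity(self) -> float:
        return self.d / (self.b + self.d)

    def positive_predictive_value(self) -> float:
        return self.a / (self.a + self.b)

    def negative_predictive_value(self) -> float:
        return self.d / (self.c + self.d)


# Hypothetical screening results for 200 subjects, invented for illustration.
table = TwoByTwo(a=40, b=10, c=20, d=130)
print(f"Sensitivity: {table.sensitivity():.2f}")                              # 0.67
print(f"Specificity: {table.specificity():.2f}")                              # 0.93
print(f"Positive predictive value: {table.positive_predictive_value():.2f}")  # 0.80
print(f"Negative predictive value: {table.negative_predictive_value():.2f}")  # 0.87
```

Note that sensitivity and specificity are computed down the columns of the table (among those with or without the disorder), whereas the predictive values are computed across the rows (among those who test positive or negative), so the predictive values shift with how common the disorder is in the group tested.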

Several important logistical factors also come into play when evaluating the usefulness of a diagnostic test. Certain rating scales are freely available, whereas others may be obtained only from the author or publisher at a cost. Briefer instruments require less time to administer, which can be essential if large numbers of patients must be screened, but they may be less sensitive or specific than longer instruments and lead to more diagnostic errors. Some rating scales may be self-administered by the patient, reducing the possibility of observer bias; however, such ratings can be compromised in patients with significant behavioral or cognitive impairments. Alternatively, clinician-administered rating scales tend to be more valid and reliable than self-rated scales, but they also tend to require more time and, in some cases, specialized training for the rater. A final consideration is the cultural context of the patient (and the rater): culture-specific conceptions of psychiatric illness can profoundly influence the report and interpretation of specific symptoms and the assignment of a diagnosis. The relative importance of these factors depends on the specific clinical or research milieu, and each factor must be carefully weighed to guide the selection of an optimal rating instrument.4,5


Suppose that a research study will evaluate brain differences between individuals with an anxiety disorder and healthy subjects. Anxiety is highly co-morbid with a number of psychiatric conditions, which if present among subjects in the anxiety group might confound the study results. At the same time, assurance that the “healthy” subjects are indeed free of anxiety (or of other psychiatric illnesses) would also be critical to the design of such an experiment. The use of general psychiatric diagnostic instruments, described in this section, can provide a standardized measure of psychopathology across diagnostic categories. These instruments are frequently used in research studies to assess baseline mental health and to ensure the clinical homogeneity of both patient and healthy control subjects.

One of the most frequently used general instruments is the Structured Clinical Interview for DSM-IV Axis I Diagnosis (SCID-I). The SCID-I is a lengthy, semistructured survey of psychiatric illness across multiple domains (Table 6-4). An introductory segment uses open-ended questions to assess demographics, as well as medical, psychiatric, and medication use histories. The subsequent modules ask specific questions about diagnostic criteria, taken from the DSM-IV, in nine different realms of psychopathology. Within these modules, responses are generally rated as “present,” “absent (or subthreshold),” or “inadequate information”; scores are tallied to determine likely diagnoses. The SCID-I can take several hours to administer, although in some instances, raters use only portions of the SCID that relate to clinical or research areas of interest. An abbreviated version, the SCID-CV (Clinical Version), includes simplified modules and assesses the most common clinical diagnoses.

Table 6-4 Domains of the Structured Clinical Interview for DSM-IV Axis I Diagnosis (SCID)

I. Overview section
II. Mood episodes
III. Psychotic symptoms
IV. Psychotic disorders differential
V. Mood disorders differential
VI. Substance use
VII. Anxiety disorders
VIII. Somatoform disorders
IX. Eating disorders
X. Adjustment disorders

While the SCID-I is generally considered user-friendly, its length precludes its routine clinical use. An alternative general rating instrument is the Mini-International Neuropsychiatric Interview (MINI), another semistructured interview based on DSM-IV criteria. Questions tend to be more limited with this test than with the SCID-I, and are answered in “yes/no” format; however, unlike the SCID-I, the MINI includes a module on antisocial personality disorder and has questions that focus on suicidality. Because the overall content of the MINI is more limited than that of the SCID-I, the MINI requires much less time to administer (15 to 30 minutes).

A third general interview, the Schedules for Clinical Assessment in Neuropsychiatry (SCAN), focuses less directly on DSM-IV categories and provides a broader assessment of psychosocial function (Table 6-5). The SCAN evolved from the older Present State Examination, which covers several categories of psychopathology, but it also includes sections for collateral history, developmental issues, personality disorders, and social impairment. However, like the SCID-I, the SCAN can be time consuming, and administration requires familiarity with its format. While the SCAN provides a more complete history in certain respects, it does not lend itself to making a DSM-IV diagnosis in as linear a fashion as do the SCID-I and the MINI.

Table 6-5 Components of the Schedules for Clinical Assessment in Neuropsychiatry (SCAN)

I. Present State Examination
Part I: Demographic information; medical history; somatoform, dissociative, anxiety, mood, eating, alcohol and substance abuse disorders
Part II: Psychotic and cognitive disorders, insight, functional impairment
II. Item Group Checklist
Signs and symptoms derived from case records, other providers, other collateral sources
III. Clinical History Schedule
Education, personality disorders, social impairment

Adapted from Skodol AE, Bender DS: Diagnostic interviews in adults. In Rush AJ, Pincus HA, First MB, et al, editors: Handbook of psychiatric measures, Washington, DC, 2000, American Psychiatric Association.

Two additional general diagnostic scales may be used to track changes in global function over time and in response to treatment. Both are clinician-rated and require only a few moments to complete. The Global Assessment of Functioning Scale (GAF) consists of a 100-point single-item rating scale that is included in Axis V of the DSM-IV diagnosis (Table 6-6). Higher scores indicate better overall psychosocial function. Ratings can be made for current function and for highest function in the past year. The Clinical Global Impressions Scale (CGI) consists of two scores, one for severity of illness (CGI-S), and the other for degree of improvement following treatment (CGI-I). For the CGI-S, scores range from 1 (normal) to 7 (severe illness); for the CGI-I, they range from 1 (very much improved) to 7 (very much worse). A related score, the CGI Efficacy Index, reflects a composite index of both the therapeutic and adverse effects of treatment. Here, scores range from 0 (marked improvement and no side effects) to 4 (unchanged or worse and side effects outweigh therapeutic effects).

Table 6-6 Global Assessment of Functioning Scale (GAF) Scoring

Score Interpretation
91-100 Superior function in a wide range of activities; no symptoms
81-90 Good function in all areas; absent or minimal symptoms
71-80 Symptoms are transient and cause no more than slight impairment in functioning
61-70 Mild symptoms or some difficulty in functioning, but generally functions well
51-60 Moderate symptoms or moderate difficulty in functioning
41-50 Serious symptoms or serious difficulty in functioning
31-40 Impaired reality testing or communication, or seriously impaired functioning
21-30 Behavior considerably influenced by psychotic symptoms or inability to function in almost all areas
11-20 Some danger of hurting self or others, or occasionally fails to maintain hygiene
1-10 Persistent danger of hurting self or others, serious suicidal act, or persistent inability to maintain hygiene
0 Inadequate information

Adapted from Diagnostic and statistical manual of mental disorders DSM-IV-TR fourth edition (text revision), Washington, DC, 2000, American Psychiatric Association.
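Because the GAF bands in Table 6-6 are ten points wide, mapping a score to its band is a simple lookup. The sketch below is illustrative only: the function name is hypothetical and the band descriptions are abbreviated from the table.

```python
# Band descriptions abbreviated from Table 6-6, ordered from 1-10 up to 91-100.
GAF_BANDS = [
    "Persistent danger of hurting self or others, or persistent inability to maintain hygiene",
    "Some danger of hurting self or others, or occasionally fails to maintain hygiene",
    "Behavior considerably influenced by psychotic symptoms, or inability to function in almost all areas",
    "Impaired reality testing or communication, or seriously impaired functioning",
    "Serious symptoms or serious difficulty in functioning",
    "Moderate symptoms or moderate difficulty in functioning",
    "Mild symptoms or some difficulty in functioning, but generally functions well",
    "Transient symptoms causing no more than slight impairment",
    "Good function in all areas; absent or minimal symptoms",
    "Superior function in a wide range of activities; no symptoms",
]


def interpret_gaf(score: int) -> str:
    """Return the Table 6-6 interpretation for a GAF score from 0 to 100."""
    if not 0 <= score <= 100:
        raise ValueError("GAF scores range from 0 to 100")
    if score == 0:
        return "Inadequate information"
    # Scores 1-10 map to index 0, 11-20 to index 1, ..., 91-100 to index 9.
    return GAF_BANDS[(score - 1) // 10]


print(interpret_gaf(55))  # Moderate symptoms or moderate difficulty in functioning
```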

General diagnostic instruments provide a broad overview of psychopathology across many domains. They can be useful as screening tools for both patients and research subjects, and in some cases can determine whether individuals meet DSM-IV criteria for major psychiatric disorders. However, they do not provide the opportunity for detailed investigations of affective, behavioral, or cognitive symptoms, and often do not provide diagnostic clarification for individuals with atypical or complex presentations. Diagnostic rating scales that focus on specific domains (such as mood, psychotic, or anxiety symptoms) can be of greater value in these situations, as well as in research efforts that focus on symptom-specific areas. In the following sections, rating scales that are tailored to explore specific clusters of psychiatric illness and related medication side effects will be discussed.


The diagnosis and treatment of mood disorders present unique challenges to the psychiatrist. The cardinal features of major depressive disorder can mimic any number of other distinct neuropsychiatric illnesses, including (but not limited to) dysthymia, anxiety, bipolar-spectrum disorders, substance abuse, personality disorders, dementia, and movement disorders. In some cases, overt depressive symptoms precede the telltale presentation (e.g., mania) for these other disorders, and more subtle signs and symptoms of the root disorder are easily missed in standard diagnostic interviews. Moreover, because most antidepressant medications and psychotherapies take effect gradually, daily or even weekly progress can be difficult to gauge subjectively. Diagnostic rating scales can be invaluable in the clarification of the diagnosis of mood disorders and the objective measurement of incremental progress during treatment.

The Hamilton Rating Scale for Depression (HAM-D) is a clinician-administered instrument that is widely used in both clinical and research settings. Its questions focus on the severity of symptoms in the preceding week; as such, the HAM-D is a useful tool for tracking patient progress after the initiation of treatment. The scale exists in several versions, ranging from 6 to 31 items; longer versions include questions about atypical depression symptoms, psychotic symptoms, somatic symptoms, and symptoms associated with obsessive-compulsive disorder (OCD). Patient answers are scored by the rater from 0 to 2 or 0 to 4 and are tallied to obtain an overall score. Scoring for the 17-item HAM-D-17, frequently used in research studies, is summarized in Table 6-7. A decrease of 50% or more in the HAM-D score suggests a positive response to treatment. While the HAM-D is considered reliable and valid, important caveats include the necessity of training raters and the lack of inclusion of certain post-DSM-III criteria (such as anhedonia).

Table 6-7 Scoring the 17-Item Hamilton Rating Scale for Depression (HAM-D-17)

Score Interpretation
0-7 Not depressed
8-13 Mildly depressed
14-18 Moderately depressed
19-22 Severely depressed
≥23 Very severely depressed
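The HAM-D-17 severity bands in Table 6-7, together with the 50% reduction criterion for treatment response, can be expressed as a short sketch. The function names and the example scores are hypothetical; the cutoffs are taken directly from the table.

```python
# Upper bound of each severity band from Table 6-7; scores of 23 or more
# fall through to the final category.
HAMD17_BANDS = [
    (7, "Not depressed"),
    (13, "Mildly depressed"),
    (18, "Moderately depressed"),
    (22, "Severely depressed"),
]


def interpret_hamd17(score: int) -> str:
    """Return the Table 6-7 severity category for a HAM-D-17 total score."""
    if score < 0:
        raise ValueError("HAM-D-17 total scores cannot be negative")
    for upper_bound, label in HAMD17_BANDS:
        if score <= upper_bound:
            return label
    return "Very severely depressed"


def is_treatment_response(baseline: int, current: int) -> bool:
    """True if the score has fallen by 50% or more from baseline."""
    if baseline <= 0:
        raise ValueError("Baseline score must be positive to compute a percentage change")
    return (baseline - current) / baseline >= 0.5


# Hypothetical patient, invented for illustration.
baseline_score, followup_score = 24, 11
print(interpret_hamd17(baseline_score))                       # Very severely depressed
print(interpret_hamd17(followup_score))                       # Mildly depressed
print(is_treatment_response(baseline_score, followup_score))  # True
```

In this example the score falls by 13 points from a baseline of 24, a reduction of about 54%, which meets the response criterion even though the follow-up score still lies in the mildly depressed range.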