Published on 24/05/2015 by admin

Last modified 22/04/2025

CHAPTER 61 Psychiatric Epidemiology

EPIDEMIOLOGICAL MEASURES OF DISEASE FREQUENCY

The frequency of a disease or other outcome within a population is described using two complementary concepts: the rate at which new cases arise, and the proportion of a given population that exhibits the outcome of interest at a given time.

Incidence refers to the number of new events that develop in a population over a specified period of time (t0 to t1). If this incidence rate is described as the number of events (outcomes) in proportion to the population at risk for the event, it is called the cumulative incidence (CI), and is calculated by the following equation:

CI = (number of new cases during the period t0 to t1) / (total population at risk at t0)

The denominator equals the total number of persons at risk for the event at the start of the time period (t0) without adjustment for any subsequent reduction in the cohort size for any reason, for example, loss to follow-up, death, or reclassification to “case” status. Therefore, CI is best used to describe stable populations where there is little reduction in cohort size during the time period of interest. An example would be a study of the incidence of major depressive disorder (MDD) in a residential program. If, at the beginning of the study, 8 of the 100 residents have MDD, and of the 92 remaining patients, 8 develop MDD over the next 12 months, the CI for MDD would be (8/92 × 100) = 8.7% for this period (i.e., 1 year). Note that the denominator does not include those in the population with the condition at t0, since they are not at risk for newly experiencing the outcome.
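The arithmetic of this example can be sketched in a few lines of Python (the function and variable names are illustrative, not from the text):

```python
def cumulative_incidence(new_cases, at_risk_at_t0):
    """Cumulative incidence (CI) as a percentage: new cases during
    the period divided by the population at risk at the start (t0)."""
    return 100.0 * new_cases / at_risk_at_t0

# MDD example: 100 residents, 8 already have MDD at t0, so 92 are
# at risk; 8 of them develop MDD over the next 12 months.
ci = cumulative_incidence(new_cases=8, at_risk_at_t0=100 - 8)
print(round(ci, 1))  # 8.7 (% over 1 year)
```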

When patients are followed for varying lengths of time (e.g., due to loss to follow-up, death, or reclassification to “case” status) and the denominator value representing the population at risk changes significantly, incidence density provides a more precise measure of the rate at which new events occur. Incidence density (ID) is defined as the number of events occurring per unit population per unit time:

ID = (number of new cases during the period) / (total person-time of observation at risk)

The denominator is the population that is actively at risk for the event, and is adjusted as people no longer belong in that pool. In a study of psychosis, for instance, if a person develops hallucinations and delusions, he or she becomes “a case” and no longer contributes to the denominator. Similarly, a person lost to follow-up would also contribute to the denominator only so long as he or she is being tracked by the study. To illustrate, suppose in a 100-person study of human immunodeficiency virus (HIV) infection, 6 people are lost to follow-up at the end of 6 months, and 4 develop HIV at the end of the third month, the person-years of observation would be calculated as follows: (90 × 1 year) + (6 × 0.5 year) + (4 × 0.25 year) = 94 person-years, and incidence density = (4 cases)/(94 person-years) = 4.26 cases/100 person-years of observation.
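The person-time bookkeeping in this example can be sketched as follows (a minimal illustration; the names are invented, and the figures are those of the HIV example above):

```python
def incidence_density(cases, person_years):
    """Incidence density (ID): new cases per person-year at risk."""
    return cases / person_years

# HIV example: of 100 subjects followed for 1 year, 90 complete the
# full year, 6 are lost to follow-up at 6 months, and 4 become cases
# at 3 months (after which they stop contributing person-time).
person_years = 90 * 1.0 + 6 * 0.5 + 4 * 0.25  # = 94.0 person-years
id_rate = incidence_density(cases=4, person_years=person_years)
print(round(100 * id_rate, 2))  # 4.26 cases per 100 person-years
```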

Prevalence is the proportion of individuals who have a particular disease or outcome at a point or period in time. In most psychiatric studies, “prevalence” refers to the proportion of the population that has the outcome at a particular point in time, and is called the point prevalence:

Point prevalence = (number of existing cases at a point in time) / (total population at that point in time)

In stable populations, prevalence (P) can be related to incidence density (ID) by the equation P = ID × D, where D is the average duration of the disease before termination (by death or remission, for example). At times, the numerator is expanded to include the number of all cases, existing and new, in a specified time period; this is known as a period prevalence. When the period of interest is a lifetime, it is a type of period prevalence called lifetime prevalence, which is the proportion of people who have ever had the specified disease or attribute in their lifetime.
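The relation P = ID × D can be checked with a small numeric sketch (the figures below are invented for illustration):

```python
# Hypothetical stable population: 2 new cases per 100 person-years,
# with an average illness duration of 3 years before remission or death.
incidence_density = 2 / 100   # cases per person-year (ID)
mean_duration = 3.0           # average duration in years (D)
prevalence = incidence_density * mean_duration  # P = ID x D
print(round(prevalence, 2))  # 0.06, i.e., a point prevalence of 6%
```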

Lifetime prevalence is often used to convey the overall risk of ever developing an illness, particularly for psychiatric illnesses that have episodic courses or require a certain duration of symptoms to qualify for a diagnosis (e.g., depression, anxiety, or posttraumatic stress disorder). In practice, however, an accurate lifetime prevalence rate is difficult to determine, since it often relies on subject recall and on sampling populations of different ages (not necessarily at the end of their respective “lifetimes”). It is also an overall rate that does not account for changes in incidence rates over time, nor for possible differences in mortality rates between those with and without the condition.

CRITERIA FOR ASSESSMENT INSTRUMENTS

There are a number of concepts that are helpful in the evaluation of assessment instruments. These involve the consistency of the results that the instrument provides, and its fidelity to the concept being measured.

Reliability is the degree to which an assessment instrument produces consistent or reproducible results when used by different examiners at different times. Lack of reliability may be the result of divergence between observers, imprecision in the measurement tool, or instability in the attribute being measured. Interrater reliability (Table 61-1) is the extent to which different examiners obtain equivalent results in the same subject when using the same instrument; test-retest reliability is the extent to which the same instrument obtains equivalent results in the same subject on different occasions.

Table 61-1 Interrater Reliability

  Rater 2: Positive Rater 2: Negative
Rater 1: Positive a b
Rater 1: Negative c d

Reliability is not sufficient for a measurement instrument—it could, for example, consistently and reliably give results that are neither meaningful nor accurate. However, it is a necessary attribute, since inconsistency would impair the accuracy of any tool. The demonstration of the reliability of an assessment tool is thus required before its use in epidemiological studies. The use of explicit diagnostic criteria, trained examiners to interpret data uniformly, and a structured assessment that obtains the same types of information from all subjects can enhance the reliability of assessment instruments.

There are several commonly used measures of the degree of consistency between sets of data, which in psychiatry are often used to quantify the degree of agreement between raters. The kappa statistic (κ) is used for categorical or binary data, and the intraclass correlation coefficient (ICC, usually represented as r) for continuous data. Both measures have the same range of values (−1 to +1), from perfect negative correlation (−1), to no correlation (0), to perfect positive correlation (+1). For acceptable reliability, a kappa value of 0.7 or greater is generally required; for the ICC, a value of 0.8 or greater is generally required.

Calculation of the kappa statistic (κ) requires only arithmetic computation, and accounts for the degree of consistency between raters with an adjustment for the probability of agreement due to chance. When the frequency of the disorder is very low, however, the kappa statistic will be low despite having a high degree of consistency between raters; it is not appropriate for the measurement of reliability of infrequent disorders.

κ = (Po − Pc) / (1 − Pc)

where Po is the observed agreement and Pc is the agreement expected by chance. Po = (a + d)/n and Pc = [(a + c)(a + b) + (b + d)(c + d)]/n². Calculation of the ICC is more involved and is beyond the scope of this text.
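Under these definitions, the kappa computation can be sketched as a short Python function operating on the four cells of a 2 × 2 agreement table (the cell labels a–d follow the text; the worked counts are invented for illustration):

```python
def kappa(a, b, c, d):
    """Kappa statistic from a 2 x 2 agreement table:
    a = both raters positive, b = only rater 1 positive,
    c = only rater 2 positive, d = both raters negative."""
    n = a + b + c + d
    p_o = (a + d) / n                                     # observed agreement
    p_c = ((a + c) * (a + b) + (b + d) * (c + d)) / n**2  # chance agreement
    return (p_o - p_c) / (1 - p_c)

# Illustrative counts: raters agree on 80 of 100 subjects.
print(round(kappa(a=40, b=10, c=10, d=40), 3))  # 0.6, below the 0.7 threshold
```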

Validity is a term that expresses the degree to which a measurement instrument actually measures what it purports to measure. When translating a theoretical concept into an operational instrument that purports to assess or measure it, several aspects of validity need to be accounted for.

For any abstract concept, there are an infinite number of criteria that one might use to assess it. For example, if one wants to develop a questionnaire to diagnose bipolar disorder, one should ask about mood, thought process, and energy level, but probably not whether the subject owns a bicycle. Content validity is the extent to which the instrument adequately incorporates the domain of items that would accurately measure the concept of interest.

Criterion validity is the extent to which the instrument’s measurements predict or agree with criteria external to the instrument itself. Two types of criterion validity are generally distinguished: predictive validity and concurrent validity. Predictive validity is the extent to which the instrument’s measurements can predict an external criterion. For instance, if we devise an instrument to measure math ability, we might postulate that math ability should correlate with better grades in college math courses. A high correlation between the measure’s assessment of math ability and college math course grades would indicate that the instrument can correctly predict as it theoretically should, and has predictive validity. Concurrent validity refers to the extent to which the measurement correlates with another criterion at the same point in time. For example, if we devise a measure relying on visual inspection of a wound to determine infection, we can correlate it with a bacteriological examination of a specimen taken at the same time. A high correlation would indicate concurrent validity, and suggest that our new measure gives valid results for determining infection.

Construct validity refers to the extent to which the measure assesses the underlying theoretical construct that it intends to measure. This concept is the most complex, and both content and criterion validity point to it. An example of a measure lacking construct validity would be a test for assessing algebra skills using word problems that inadvertently assesses reading skills rather than factual knowledge of algebra. Construct validity also refers to the extent that the construct exists as theorized and can be quantified by the instrument. In psychiatry, this is especially difficult since there are no “gold standard” laboratory (e.g., chemical, anatomical, physiological) tests, and the criteria if not the existence of many diagnoses are disputed. To establish the validity for any diagnosis, certain requirements have been proposed, and include an adequate clinical description of the disorder that distinguishes it from other similar disorders and the ability to correlate the diagnosis to external criteria such as laboratory tests, familial transmission patterns, and consistent outcomes, including response to treatment.

Because there are no “gold standard” diagnostic tests in psychiatry, efforts to validate diagnoses have focused on increasing the reliability of diagnostic instruments (by defining explicit, observationally based diagnostic criteria, as in DSM-III and subsequent versions, or by employing structured interviews such as the Diagnostic Interview Schedule [DIS]) and on conducting genetic and outcome studies for diagnostic categories. The selection of a “gold standard” criterion instrument in psychiatry, however, remains problematic.

Assessment of New Instruments

If we assume that a reliable criterion instrument that provides valid results exists, the assessment of a new measurement instrument involves comparing the results of the new instrument to those of the criterion instrument. The criterion instrument’s results are considered “true,” and a judgment of the validity of the new instrument’s results is based on how well they match the criterion instrument’s (Table 61-2).

Table 61-2 Validity of a New Instrument

  Criterion Instrument: Case Criterion Instrument: Non-case
New Instrument: Case true positive (a) false positive (b)
New Instrument: Non-case false negative (c) true negative (d)

Sensitivity is the proportion of true cases, as identified by the criterion instrument, who are identified as cases by the new instrument (also known as the true positive rate).

Specificity is the proportion of non-cases, as identified by the criterion instrument, who are identified as non-cases by the new instrument (also known as the true negative rate).

For any given instrument, there is a tradeoff between sensitivity and specificity, depending on where the threshold is set to distinguish “case” from “non-case.” For example, with the Hamilton Depression Rating Scale (HAM-D), the cutoff value for the diagnosis of MDD (often set at 15) determines whether an individual is identified as a “case” or a “non-case.” If the cutoff were instead set at 5, a score most clinicians would consider “normal” or not depressed, the HAM-D would be an unusually sensitive instrument (using, e.g., a structured clinical interview as the criterion instrument), since almost anyone with even a modicum of depressive thinking would be classified as a “case,” as would virtually everyone with major depression. The test would not be especially specific, however, since it would be poor at identifying those without depression. Conversely, if the cutoff value were set at 25, sensitivity would be low but specificity high.

In practice, the threshold values in any given evaluation instrument, whether creatine kinase (CK) levels for determining myocardial infarction, the number of colonies on a petri dish to determine infection, or criteria to determine attention-deficit/hyperactivity disorder (ADHD) (e.g., 6 of 9 from group one, 6 of 9 from group two), are chosen to balance the need for both sensitivity and specificity. To improve both these measures without a tradeoff, either the instrument itself or its administration must be improved, or efforts made to ensure maximum stability of the attribute being measured (e.g., administering them concurrently, or in similar circumstances, such as at the same time of day, or a similar clinical setting).

Two other useful measures are the positive predictive value (PPV) and the negative predictive value (NPV). The PPV is the proportion of those with a positive test who are true cases as determined by the criterion instrument; the NPV is the proportion of those with a negative test who are true non-cases as determined by the criterion instrument.
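All four measures derive from the same 2 × 2 comparison against the criterion instrument, and can be computed together; a minimal sketch (the cell counts are invented for illustration):

```python
def validity_measures(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from a 2 x 2 table
    comparing a new instrument against a criterion instrument."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Illustrative counts: 100 true cases and 100 true non-cases.
m = validity_measures(tp=80, fp=10, fn=20, tn=90)
print(m["sensitivity"], m["specificity"])  # 0.8 0.9
```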

Study Designs

There are six basic study types, presented here in the order of their respective ability to infer causality.

Cross-sectional Studies

Cross-sectional studies examine individuals and determine their case status and risk-factor exposures at the same time. Outcome rates in those with a given exposure can then be compared with rates in those without it. Data are collected by surveys, laboratory tests, physical measurements, or other procedures, and there is no follow-up or other longitudinal component. Cross-sectional studies are also called prevalence studies (more precisely, point prevalence studies) and, as with ecological studies, are relatively inexpensive and useful for informing future research. They also aid in public health planning (e.g., determining the number of hospital beds needed) and in generating more specific hypotheses about disease etiology by examining specific risk factors.

As with the previously discussed study types, linking outcome and exposure is problematic. Although the data are collected for individuals, a person’s exposure status at the time of the study may differ from his or her status when the disease actually began. To illustrate, if smokers tend to quit smoking and start exercising once diagnosed with lung cancer, a cross-sectional study of these factors would systematically underestimate the link between smoking and lung cancer, and would suggest a spurious link between exercise and lung cancer. Another problem with cross-sectional studies is that point prevalence rates are affected both by the rate at which the outcome develops and by its chronicity. For instance, if a given disease had a longer course in men than in women but identical incidence rates in both, the point prevalence rate in men would be higher than in women.

DEVELOPMENT OF ASSESSMENT TOOLS

Case Definition

In 1972, Cooper and colleagues4 published a United States/United Kingdom study that showed high variability in the diagnosis of psychotic disorders. It highlighted the need for explicit operational criteria for case identification. The development of such diagnostic criteria, with the publication of the Diagnostic and Statistical Manual of Mental Disorders, Third Edition (DSM-III) in 1980, represented a notable step toward increasing the reliability and validity of psychiatric diagnoses.

Standardized Instruments for Case Assessment

The clinical interview is generally used to diagnose psychiatric illness. However, differences in personal styles and theoretical frameworks, among other factors, can affect the process and conclusions of a psychiatric interview. To increase interrater reliability, a number of standardized interview instruments have been developed. The first was the Present State Examination (PSE), initially used in the International Pilot Study of Schizophrenia sponsored by the World Health Organization (WHO). The PSE was designed for use by psychiatrists or experienced clinicians, however, so its use in larger epidemiological studies was impractical. In 1978, epidemiologists at the National Institute of Mental Health (NIMH) began developing a comprehensive diagnostic instrument for large-scale epidemiological studies that could be administered by either laypeople or clinicians. The result was the Diagnostic Interview Schedule (DIS), which used the then newly published DSM-III (1980) and elements of other research instruments, including the PSE, the Renard Diagnostic Interview (RDI), the St. Louis criteria, and the Schedule for Affective Disorders and Schizophrenia (SADS). The DIS has been used extensively in the United States and many other countries for surveys of psychiatric illness. Over time, the DIS has undergone revisions, first to incorporate DSM-III-R and then DSM-IV diagnoses. The WHO and the NIMH have also jointly developed the Composite International Diagnostic Interview (CIDI), which is structurally similar to the DIS and provides both ICD-10 and DSM-IV diagnoses.

CONTEMPORARY STUDIES IN PSYCHIATRIC EPIDEMIOLOGY

The Baseline National Comorbidity Survey (NCS)

The NCS, conducted between 1990 and 1992, was the first national survey of mental disorders in the United States. Face-to-face structured diagnostic interviews were administered by nonclinicians to a representative sample of all people living in households within the continental United States. The 8,098 NCS respondents were selected from over 1,000 neighborhoods in over 170 counties distributed over 34 states, and assessed with a modified CIDI.

The most important CIDI modifications involved the use of diagnostic stem questions, which were a small number of initial questions to assess core features of psychiatric disorders. Follow-up questions would only be asked when the subject responded positively. Another innovation of the NCS was the use of a two-phase clinical interview design for patients with evidence of schizophrenia or other nonaffective psychoses. Because prior studies had shown that these types of patients could not provide reliable self-reports, they were reinterviewed and diagnosed by experienced clinicians using a structured clinical interview.

In order to collect information on nonrespondents to the study, the NCS also systematically evaluated about one-third of nonrespondents using telephone interviews. Using the results of the nonresponse survey, the NCS study was able to adjust for the bias due to the lower rates of survey participation, especially among patients with anxiety disorders.

The NCS General Findings

DSM-III-R disorders were more prevalent than had been expected. About 48% of the sample reported at least one lifetime disorder, and 30% of respondents reported at least one disorder in the 12 months preceding the interview. The most common disorders were major depression and alcohol dependence, followed by social and simple phobias. As a group, substance use and anxiety disorders were more prevalent than affective disorders, with approximately one in four respondents meeting criteria for a substance use disorder in their lifetime, one in four for an anxiety disorder, and one in five respondents for an affective disorder (Table 61-4).

Table 61-4 Lifetime and 12-Month Prevalence Estimates for Psychiatric Disorders, NCS Results

  Lifetime Prevalence Estimate (%) 12-Month Prevalence Estimate (%)
Major depression 17.1 10.3
Mania 1.6 1.3
Dysthymia 6.4 2.5
Generalized anxiety disorder 5.1 3.1
Panic disorder 3.5 2.3
Social phobia 13.3 7.9
Simple phobia 11.3 8.8
Agoraphobia without panic 5.3 2.8
Alcohol abuse 9.4 2.5
Alcohol dependence 14.1 7.2
Drug abuse 4.4 0.8
Drug dependence 7.5 2.8
Antisocial personality disorder 2.8 —
Nonaffective psychosis* 0.5 0.3

* Nonaffective psychosis: schizophrenia, schizophreniform disorder, schizoaffective disorder, delusional disorder, and atypical psychosis.

Source: Adapted from Tsuang and Tohen (2002).5

There were no differences by gender in the overall prevalence of psychiatric disorders. For individual disorders, men were more likely than women to have substance use disorders and antisocial personality disorder, whereas women were more likely than men to have anxiety and affective disorders (with the exception of mania).

The European Study of the Epidemiology of Mental Disorders (ESEMeD) Project

The ESEMeD is a cross-sectional epidemiological study, conducted between January 2001 and August 2003, of the noninstitutionalized adult populations (approximately 212 million people) of Belgium, France, Germany, Italy, the Netherlands, and Spain.6 A stratified, multistage, clustered-area probability sample design was used to select respondents, and data from 21,425 respondents were collected. Individuals were assessed in person at their homes using computer-assisted personal interview (CAPI) instruments.

EPIDEMIOLOGY OF MAJOR PSYCHIATRIC DISORDERS

Schizophrenia

Risk Factors

Genetic loading is a robust risk factor for schizophrenia (Table 61-5). The prevalence of schizophrenia in a monozygotic twin of a schizophrenia patient is 50%, and 15% in a dizygotic twin. The prevalence for a child with two schizophrenic parents is 46.3%, and 12.8% for a child with one schizophrenic parent.

Table 61-5 Prevalence of Schizophrenia in Specific Populations

Population Prevalence (%)
General population 0.3
Parents of schizophrenic patients 5.6
Children with one schizophrenic parent 12.8
Dizygotic twins of a schizophrenic patient 15.0
Children of two schizophrenic parents 46.3
Monozygotic twins of a schizophrenic patient 50.0

Other risk factors for schizophrenia include membership in a lower social class, being unmarried, birth complications, and birth during the winter months. The inverse relationship between social class and schizophrenia may be a result of social impairment and downward social drift caused by the illness rather than a cause of it. Studies have also shown that stressful life events, high levels of “expressed emotion” (critical and overprotective behavior and verbalizations directed toward the family member with schizophrenia), and substance use can precipitate psychotic episodes.

Bipolar I Disorder