Geographic Variation in Health Care and the Affluence-Poverty Nexus
This article examines the validity of the 30% thesis. It is divided into 6 parts.
Evolution of 30%
Health care is caught in the contradictory realities that spending growth is implicit, but excessive growth cannot be countenanced. Yet the pressures to spend are unrelenting. One source is the availability of new treatments and a second is the large reservoir of unmet need [2]. However, economic growth is the principal driver [3]. Indeed, health care is a major engine of economic growth and a major source of jobs.
Most economists predict a continuing growth of health care spending [4] and, faced with caps on the growth of graduate medical education [5], most workforce experts believe that the nation is headed toward deepening shortages of physicians [6]; but not all agree. Based on their studies of geographic variation, researchers associated with the Dartmouth Atlas have concluded that as much as 30% of health care spending is unnecessary [7–9] and that the nation already has enough physicians [10,11].
The fact that health care use varies so widely among regions is one of the great enigmas of health care. Yet it is not a new phenomenon. Geographic differences have been observed since adequate measuring tools were developed in the 1930s and even before. However, over the past 15 years John Wennberg and his associates at Dartmouth have documented this phenomenon more fully, and have proposed explanations for its causes [7–11]. These investigators have concluded that most geographic variation in health care cannot be explained by patients’ needs or preferences, nor by their illness levels or demographic characteristics. Knowing that practices vary among physicians, they have attributed much of this variation to the overuse of “supply-sensitive” specialty services, a consequence of the perverse incentives of the fee-for-service system. Remedies that have been proposed include fewer specialists and more primary care physicians, less fee-for-service and more managed care, less physician autonomy and more regulation, and more direct involvement of patients in shared medical decisions. This approach carries the promise that, if all areas of the nation could spend at the rate of those that spend the least, 30% could be saved, enough to pay for health care reform. It is a powerful message with appeal to powerful constituencies.
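The arithmetic behind a claim of this kind can be sketched as follows. The regions, enrollee counts, and per-enrollee rates are invented for illustration and are not Atlas data, and `savings_fraction` is a hypothetical helper. The sketch caps every region at the lowest observed rate and reports the share of total spending that disappears, which implicitly assumes that none of the difference between regions reflects differences in need.

```python
# Hypothetical regions: (name, enrollees, spending per enrollee in dollars).
# All figures are invented for illustration; they are not Atlas data.
REGIONS = [
    ("A", 1_000_000, 6_000.0),
    ("B", 2_000_000, 8_000.0),
    ("C", 1_000_000, 10_000.0),
]


def savings_fraction(regions, benchmark=None):
    """Share of total spending removed if no region spent above the benchmark.

    If no benchmark is given, the lowest per-enrollee rate among the
    regions is used, mirroring the "spend like the lowest" argument.
    """
    if benchmark is None:
        benchmark = min(rate for _, _, rate in regions)
    actual = sum(n * rate for _, n, rate in regions)
    capped = sum(n * min(rate, benchmark) for _, n, rate in regions)
    return (actual - capped) / actual


if __name__ == "__main__":
    print(f"{savings_fraction(REGIONS):.1%}")
```

With these invented figures the projected saving is 25%. The size of the number depends entirely on which rate is chosen as the benchmark and on the assumption that the lowest-spending region's rate would be adequate everywhere.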
Birth of the atlas
The notion that “more is less” emerged during the 1970s and 1980s, when Wennberg [12] observed that the frequency of certain surgical procedures differed in Boston as compared with New Haven and also among small towns in Maine and New Hampshire, a phenomenon that Glover had noted in the use of tonsillectomies in Britain 40 years earlier. However, the Dartmouth team examined this phenomenon systematically, concluding that it was not due to differences among the patients but to differences in the efficiency of physicians’ practices.
As the Clinton Health Plan was evolving in the early 1990s, the Dartmouth team proposed that rather than simply comparing towns, they could create a national atlas of health care. With support from the Robert Wood Johnson Foundation and data from Medicare, they divided the nation into 306 hospital referral regions (HRRs), each a closed system wherein most of the patients received most of their care most of the time [13]. But the towns they had studied, such as Lebanon, New Hampshire (population 12,000) and Portland, Maine (population 64,000), were much smaller and more homogeneous than their HRRs, which ranged in population from 200,000 to more than 5 million. Some were confined to a major urban center and some spanned the breadth of a rural state. Wide differences existed in income, race, and ethnicity, both within and between them. The patch-quilt map that resulted proved to have no epidemiologic integrity [14].
Methodological pitfalls
Unexplained variation
The most accurate measures were of input prices and special payments, which can be obtained from Medicare data and other government reports. However, Medicare data do not provide similar precision in measuring illness levels. Almost 20 years ago Fisher, the Dartmouth group’s current leader, showed how inaccuracies in Medicare’s condition-specific coding and its application by hospitals can lead to substantial error in assessing risk [15]. Although efforts have been made to improve its value in risk adjustment, Medicare administrative data remain a limited source.
Adjusting for race and poverty
Demographic adjustments employed in the Dartmouth Atlas present even greater challenges. Whereas age and gender adjustments are generally valid, the race adjustment, which simply distinguishes black versus nonblack, is not. Based on their statistical approach, Dartmouth researchers have concluded that “race has virtually no impact on use” [16,17], despite a vast literature on race and health care that shows the opposite [18]. Indeed, the author and his colleagues have found that rates of health care use among poor African Americans in major urban centers are more than double those of affluent whites.
The problem in adjusting for income is even greater. As others have done, Dartmouth researchers based the income of enrollees on the average income of the ZIP codes in which they resided. This measure proves to be reliable for working-age adults, whose incomes tend to reflect their current wealth and both their past and present socioeconomic circumstances, and whose place of residence reflects their income. However, none of this is true for seniors. Income among retirees is not a valid proxy for either wealth or past economic circumstances. Moreover, housing for low-income seniors is often located outside low-income neighborhoods. As a result, income among seniors captures less than half of the poverty effect that is revealed by ZIP-code income among working-age adults, making it almost impossible to risk-adjust Medicare data. Possibly because of this, Dartmouth researchers have also concluded that “poverty and income explain almost none of the variation” [16,17], which is surprising since the effect of income on health is pervasive [19]. In Dartmouth’s formulation, the resulting gap simply adds to the “unexplained” residual, which is attributed to practice variation. While there are many methodological pitfalls in the Dartmouth Atlas, the failure to properly adjust for income is the most profound.
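The mechanism described here, in which a noisy income proxy understates the poverty effect and pushes the remainder into the “unexplained” residual, can be illustrated with a small simulation. This is a sketch under stated assumptions and not a reconstruction of the Atlas method: the linear model, the effect size, the noise levels, and the helper names `ols_slope_and_residual_var` and `simulate` are all hypothetical choices made for illustration.

```python
import random


def ols_slope_and_residual_var(x, y):
    """Simple least-squares fit of y on x; returns (slope, residual variance)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid_var = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y)) / n
    return slope, resid_var


def simulate(n=20000, effect=1.0, proxy_noise_sd=1.0, seed=0):
    """Compare adjustment with true poverty versus a noisy proxy for it.

    Assumptions (all hypothetical): true poverty is standard normal,
    use rises linearly with poverty, and the proxy equals true poverty
    plus independent noise, standing in for ZIP-code income among seniors.
    """
    rng = random.Random(seed)
    poverty = [rng.gauss(0.0, 1.0) for _ in range(n)]
    use = [effect * p + rng.gauss(0.0, 0.5) for p in poverty]
    proxy = [p + rng.gauss(0.0, proxy_noise_sd) for p in poverty]
    return (
        ols_slope_and_residual_var(poverty, use),
        ols_slope_and_residual_var(proxy, use),
    )


if __name__ == "__main__":
    (true_slope, true_resid), (proxy_slope, proxy_resid) = simulate()
    print(f"slope with true poverty:  {true_slope:.2f}, residual var {true_resid:.2f}")
    print(f"slope with noisy proxy:   {proxy_slope:.2f}, residual var {proxy_resid:.2f}")
```

Under these assumptions the proxy recovers only about half of the true slope, and the variance left over after adjustment is larger. Whatever the proxy misses is not removed from the data; it remains in the residual, where it can be mistaken for practice variation.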
Medicare as the source of data
A second deficiency of the Dartmouth Atlas is its dependence on Medicare data as the measure of use and expenditures. Because the Atlas is called the Dartmouth Atlas of Health Care, most assume that it applies to health care generally. But for that to be true, Medicare data would have to explain not only the care of Medicare patients but also that of non-Medicare patients. Indeed, Dartmouth researchers insist that they do [20], but the data show the contrary [21,22]. Medicare expenditures per enrollee bear no resemblance to overall health care spending per capita, nor should they (Figs. 1 and 2). The resources that are brought to bear on the care of any individual patient do not flow simply from that patient’s payment sources but from the aggregate of all revenues. Regions differ not only in Medicare revenues but in the revenues they receive from employer-sponsored insurance, Medicaid, and on behalf of the uninsured. It is the aggregate revenues from all sources that determine the personnel and other resources available for care [21]. As Fig. 1A displays, the numbers of health care workers per capita in the various states correlate closely with the total health care expenditures per capita, but they bear no relationship to the levels of Medicare expenditures per enrollee, and quality follows total spending [21].
Medicare doing better versus Medicare catching up
One reason for the seemingly anomalous geographic distribution of Medicare spending is that there are several different categories of Medicare beneficiaries. For many beneficiaries, Medicare coverage continues the access to care that they previously enjoyed as employees, and for them the relationship between Medicare spending and quality is simple and unambiguous: more is more. More benefits and more access to benefits yield better outcomes, not in every circumstance, but on average. The second group of beneficiaries consists of younger adults who are disabled. These individuals also attain better proximate outcomes from more Medicare spending, but as a group they are on a path toward diminishing health and display poorer long-term outcomes. This leads to the third group: individuals who, unlike the first, were previously uninsured, but like the second, have poorer overall health status, use more resources on entering Medicare and remain sicker, despite their high use, because it is not easy to repair a lifetime of poor health [23,24]. The latter 2 groups are located disproportionately in poor areas in the south, where overall health care spending is low, and in pockets of poverty in the urban north, where overall spending is high. The metric of Medicare sees these dissimilar areas as having the same high Medicare spending and the Dartmouth Atlas displays them that way (Fig. 2A). But they are a peculiar mélange that obfuscates the real dynamics of health care.
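One way to see how pooling dissimilar beneficiary groups can obscure a “more is more” relationship is a toy calculation. The data points, the outcome score, and the group labels below are entirely hypothetical and are offered only as an illustration of the aggregation problem, not as an estimate of any real effect. Within each invented group, higher spending goes with a better outcome; when the groups are pooled, the sicker, higher-spending group pulls the overall slope in the opposite direction.

```python
def ols_slope(points):
    """Least-squares slope of outcome on spending for (spending, outcome) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Hypothetical (spending in $1,000s, outcome score) pairs; invented values.
CONTINUOUSLY_INSURED = [(5.0, 80.0), (6.0, 82.0), (7.0, 84.0)]
DISABLED_OR_PREVIOUSLY_UNINSURED = [(9.0, 60.0), (10.0, 62.0), (11.0, 64.0)]

if __name__ == "__main__":
    pooled = CONTINUOUSLY_INSURED + DISABLED_OR_PREVIOUSLY_UNINSURED
    print("within group 1:", ols_slope(CONTINUOUSLY_INSURED))
    print("within group 2:", ols_slope(DISABLED_OR_PREVIOUSLY_UNINSURED))
    print("pooled:        ", ols_slope(pooled))
```

In this toy data the slope is positive within each group and negative once the groups are combined. A region-level comparison that cannot separate the groups would therefore read higher spending as associated with worse outcomes, even though the reverse holds for every beneficiary category in the example.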
Employing death as the outcome
The final methodological pitfall is the need to relate use and expenditures to outcomes. While Medicare administrative data provide information on certain process-of-care outcomes, they do not provide measures of clinical outcomes. To circumvent this deficiency, Dartmouth researchers declared that death was the outcome. Their Web site explains, “we focused only on patients who died so we could be sure that all patients were similarly ill. By definition, the prognosis was identical—all were dead. Therefore, variations cannot be explained by differences in the severity of patient’s illnesses” [13]. Accordingly, they limited their studies to patients during the last 6 or 24 months of life. Of course, similarly dead is not similarly ill, a point made more forcefully by others [25–27]. Vast differences exist in the complexities of illness and in its costs among patients who eventually die. Moreover, little of that care is administered with the expectation that death will be the outcome. Nonetheless, it is the notions of “dying” as the risk factor and “being dead” as the outcome that led the Dartmouth group to conclude that wasteful regions [7] and inefficient hospitals [28] cause 30% more spending than is necessary.
Alternative estimates = “more is more”
Several investigators have developed alternative models to explore the relationship between resource use and outcomes. Bach [27] refined the Dartmouth approach in a comparison of hospitals nationally by adjusting for risk, using the All Patient Refined Diagnostic-Related Groups (APR-DRG) system, which is based on 30 comorbidities that can be extracted from Medicare administrative data. He then compared the intensity of care in hospitals, as listed in the Dartmouth Atlas, with each hospital’s APR-DRG-adjusted severity of illness among decedents, and found that hospitals where patients consumed more resources (by Dartmouth’s measure) had sicker patients (measured by the APR-DRG scale). Although this did not answer questions about outcomes, since the outcome observed still was death, it did explain why some hospitals used more resources for patients during their last 2 years of life. Their patients had more comorbidities and were sicker.
Silber and colleagues [29] took another approach. Instead of comparing resource use among patients during their last 6 months or 2 years of life, they looked prospectively at surgical outcomes. The outcomes they examined were mortality at 30 days and mortality following a postoperative complication. Like Bach, they applied detailed risk adjustments and gauged risk-adjusted mortality against the intensity of care among hospitals nationally, as listed in the Dartmouth Atlas. They found that “more aggressive” hospitals had decreased postoperative mortality and that their mortality following postoperative complications was decreased even more. In other words, when the outcomes of care were examined in temporal proximity to the care, the hospitals with a higher intensity of care (by Dartmouth’s measure) had lower postoperative mortality and lower mortality after postsurgical complications.
A study conducted by Barnato and colleagues [30] was also prospective. This group examined mortality among patients who had a high predicted probability of dying (PPD) in Pennsylvania hospitals. They used the PPD scale to risk-adjust their patients and, rather than using hospital intensity measures from the Dartmouth Atlas, they constructed their own, based on days in the intensive care unit and various critical care interventions. Their key observation was that 1-year risk-adjusted mortality in patients with a high PPD was about 12% lower in hospitals with higher treatment intensities.
Ong and colleagues [26] drew a similar conclusion from a prospective study of patients with heart failure at 6 California teaching hospitals. These investigators used a risk-adjustment methodology similar to that used by Bach and Silber, but added 2 sociodemographic measures: race and Medicaid insurance status. Like Barnato, they used their own measure of intensity, based on the number of hospital days and associated costs during the first 180 days after the index admission, but costs were related specifically to the patients studied and not to some broader measure of intensity within the hospitals. Ong and colleagues also found a very close correlation between greater resource use and lower mortality.
Schreyögg and Stargardt [31] enlarged on this approach in a study of acute myocardial infarction (AMI) among more than 35,000 patients at 115 Veterans Administration hospitals. As in Ong’s study [26], rather than measuring the hospitals’ “intensity,” they measured the costs of care for the cohort of patients with AMI treated at each hospital, adjusted for a broad set of risks, and they compared the average costs in each hospital with the cohort’s average outcomes. They found that for every 10% increase in spending above the mean, mortality at 1 year was decreased by 5% and the hazard ratio for readmission was decreased by 10%.
An identical conclusion was reached by Doyle [32]