Chapter 212 Art and Science of Guideline Formation
Clinical practice guidelines have become an integral part of the practice of medicine. They are meant to be used by physicians as resources to consider when making treatment decisions for individual patients. They are also frequently used by various organizations for policy and payment decisions. As of February 2009, 2408 sets of clinical guidelines were listed in the National Guideline Clearinghouse (NGC; http://www.guideline.gov), with 479 additional guideline sets registered as “in progress.” One hundred and seventy-four guideline sets in this one database focus on disorders of the spine. Only 25 of these were produced by organized spine surgery, sponsored by either the AANS/CNS Section on Disorders of the Spine or the North American Spine Society. These counts do not include the myriad “technology assessments” commissioned by third-party payers, nor do they include the multitude of guidelines, evidence-based reviews, evidence-informed consensus statements, and other similarly titled systematic literature reviews published and disseminated outside of the NGC system. Clinical practice guidelines are here to stay and have proven to be important for the assessment of current best practices, guidance for future research, and defense of unpopular yet effective treatment strategies. The purpose of this chapter is to describe how guidelines are created both in the ideal situation and in the real world.
Author Group
One of the most useful tools for learning about evidence-based medicine, guidelines, and the application of guidelines to the real world is a small text by David Sackett and the McMaster University group called Evidence Based Medicine.1 We refer to this text several times in this chapter when discussing how to rate evidence and how to apply evidence to clinical situations. In the chapter devoted to a discussion of the creation of clinical practice guidelines, Dr. Sackett offers the reader the following advice:
We hope, …, that you see how doubly dumb it is for one or a small group of local clinicians to try and create the evidence component of a guideline all by themselves. Not only are we ill equipped and inadequately resourced for the task, but by taking it on we steal energy away from …our real expertise… This chapter closes with the admonition to frontline clinicians: when it comes to lending a hand with guideline development, work as a “B-keeper*” not a meta-analyst.1
Despite this warning, it is absolutely critical that physicians with clinical expertise participate in the formation of clinical practice guidelines. Although epidemiologic support is necessary for the analysis of study design, clinical data cannot be accurately interpreted, and the translation of data to recommendation cannot be made, without an understanding of the clinical significance of the data. This understanding does not come from textbooks. A more reasonable interpretation of Sackett’s statement is that it is neither efficient nor desirable to have individual groups spend the resources to develop practice guidelines at a local level. It makes more sense to have guidelines produced at a national level and to leave the interpretation of those guidelines to the local experts. A series of review articles published in the Journal of the American Medical Association by the same author group offers detailed explanations of many of the concepts discussed in this chapter. That level of detail is beyond the scope of this review, but the reader is encouraged to use these articles as references for further inquiry.2–20
Conflict of interest is an important issue in the formation of a guidelines author group. Disclosure is the first step in managing such conflicts, and the organizing body, be it a medical society, university work group, or insurance carrier, must decide how to manage or resolve each conflict. In some situations, compromises are necessary in order to garner sufficient topical expertise. In most situations, however, author groups can be constructed and organized to mitigate the possibility of industry-related conflicts. It is our opinion that industry-sponsored “study groups” are an inappropriate source for clinical practice guidelines because the membership and strategic direction of these panels may be easily influenced by the sponsoring body. Similarly, technology assessments produced by centers that are funded largely by third-party payers cannot be considered practice guidelines because they are paid for by entities primarily desiring to limit economic exposure as opposed to evaluating clinical efficacy. Furthermore, these panels notoriously lack relevant physician input and tend to place a higher value on study design and author interpretation of data than on common sense and clinical fact. (For example, go to www.ecri.org and review their assessment of “decompressive procedures for lumbosacral pain.” You will note that the author group contained only one physician, an Emergency Care Research Institute [ECRI] employee who practices internal medicine. No spine surgeon, physical therapist, rehabilitation physician, or other specialist input was solicited, and the topic is clearly ridiculous to anyone who regularly cares for these patients: decompression is not performed as a treatment for low back pain; it is performed for radiculopathy or stenosis.)
Those in the field of organized spine surgery, including the American Association of Neurological Surgeons and Congress of Neurological Surgeons Joint Section on Disorders of the Spine (Spine Section) and the North American Spine Society (NASS), have been active in guidelines development. The first significant product developed using modern evidence-based review techniques was the set of clinical practice guidelines dealing with cervical spine and spinal cord injury.21 The author group was recruited by Mark Hadley and consisted exclusively of neurosurgeons, both because of the funding agency (the Spine Section) and because of relative inexperience in guidelines formation. The group included general neurosurgeons, pediatric neurosurgeons, and neurosurgical spine specialists. Beverly Walters, a neurosurgeon who had trained in clinical epidemiology at McMaster University, served as the epidemiologist. Each of the authors was employed at an academic center and had the support of local expertise in library science and statistics if necessary. The authors were tutored in evidence-based medicine techniques during 4-week-long sessions in order to solidify their ability to interpret the medical literature.
These guidelines were unique in the spine world and were qualitatively different from the various consensus-based guidelines that had been published previously (e.g., the NASS Low Back Pain Treatment Guidelines published in 1999). Because they applied to a relatively small patient population and because they were originally published as a supplement to Neurosurgery, a journal with virtually no penetration into emergency medicine or orthopaedics, they did not receive immediate widespread attention. With the exception of the chapters dealing with the administration of steroids and the safety of traction reduction without MRI, very few recommendations were considered controversial.21
Question Formation
Once an author group is formed, a set of questions is developed. The questions asked are a very important determinant of the utility of the ultimate guideline document. Questions need to be both relevant and answerable. A question such as, “What is the best treatment for low back pain?” is unanswerable. Patients with low back pain are a heterogeneous population. Back pain may be caused by muscular strain, traumatic injury, degeneration of the intervertebral disc, or spinal tumors. It may be a symptom of renal calculi, dissecting aortic aneurysm, or a somatization disorder. There is, therefore, no one best treatment for back pain, and attempting to answer such a question is a frustrating and fruitless endeavor. A better question would be “In a patient with recalcitrant low back pain and neurogenic claudication due to spondylolisthesis and stenosis, does surgical intervention improve outcomes compared with the natural history of the disease?” Here, the patient population is well defined and the treatment modalities are well described, allowing a meaningful review of the medical literature. During the literature search, it may become apparent that multiple surgical interventions are employed, resulting in the parsing of the question into subcomponents related to individual surgical techniques.
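For readers who find a structured template helpful, the contrast between the two example questions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-part breakdown (population, intervention, comparator, outcome) and the names ClinicalQuestion, missing_parts, and is_answerable are hypothetical and are not drawn from any guideline group's actual tooling. The check flags only components that were left unstated; deciding whether a stated population is sufficiently homogeneous still requires clinical judgment.

```python
from dataclasses import dataclass


@dataclass
class ClinicalQuestion:
    """Hypothetical template for a guideline question (illustrative only)."""
    population: str
    intervention: str
    comparator: str
    outcome: str

    def missing_parts(self):
        # Names of the components that were left blank.
        return [name for name, value in vars(self).items() if not value.strip()]

    def is_answerable(self):
        # Assumption: a question is searchable only if every component is stated.
        return not self.missing_parts()


# "What is the best treatment for low back pain?"
vague = ClinicalQuestion(
    population="patients with low back pain",
    intervention="",
    comparator="",
    outcome="",
)

# The better-formed question from the text.
focused = ClinicalQuestion(
    population=("recalcitrant low back pain and neurogenic claudication "
                "due to spondylolisthesis and stenosis"),
    intervention="surgical intervention",
    comparator="natural history of the disease",
    outcome="clinical outcomes",
)

print(vague.missing_parts())
print(focused.is_answerable())
```

If the literature search later reveals several distinct surgical techniques, the focused question could be copied once per technique with only the intervention field changed, which mirrors the parsing into subcomponents described above.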
Literature Search
The availability of computerized search engines has greatly simplified the identification of potentially useful references. Most guidelines groups use two different search engines and databases to ensure a thorough search. Familiarity with MeSH (Medical Subject Headings) terms or consultation with a librarian is very useful in creating an effective search that will not be overly inclusive. Unfortunately, the era of electronic publishing has greatly increased the number of potentially useful references (when just the title and abstract are available for initial screening), and it is not uncommon to obtain several hundred or even several thousand references that require individual review. Several strategies can be used to speed this process. First, if sufficient high-quality evidence exists, such as several concordant randomized trials, lower-quality evidence may be ignored except as background information. For example, about 7 billion papers deal with the use of microdiscectomy for lumbar radiculopathy (OK, an exaggeration). Of these papers, 99.9% are case reports, small case series, technical notes, or historic anecdotes. There are a few large cohort studies with admittedly fatal flaws. Fortunately, several attempted randomized studies have been published within the last few years22,23 that provide higher-quality evidence than all of the other papers. Instead of spending months describing each case series, we can focus our review on a detailed analysis of the higher-quality papers and simply summarize the findings of the various case series. If the primary references are flawed, however, then we must incorporate the lower-quality evidence into the analysis.
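The triage strategy just described can be written out as a short Python sketch. It is a simplification under several assumptions: the Reference fields, the design labels, the triage function, and the min_trials threshold of two are all hypothetical, and the concordance of the trials is not modeled. In practice, judging whether a trial is flawed is the product of detailed critical appraisal, not a flag set in advance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Hypothetical record for one screened reference."""
    title: str
    design: str           # e.g. "randomized_trial", "cohort", "case_series"
    flawed: bool = False  # set by the reviewers after appraisal


def is_sound_trial(ref):
    return ref.design == "randomized_trial" and not ref.flawed


def triage(references, min_trials=2):
    """Split references into (primary, background) lists.

    If at least `min_trials` sound randomized trials exist, they form the
    primary evidence and everything else is summarized as background.
    Otherwise all references must be analyzed in detail.
    """
    trials = [r for r in references if is_sound_trial(r)]
    if len(trials) >= min_trials:
        others = [r for r in references if not is_sound_trial(r)]
        return trials, others
    return list(references), []


refs = [
    Reference("Trial A", "randomized_trial"),
    Reference("Trial B", "randomized_trial"),
    Reference("Series C", "case_series"),
    Reference("Cohort D", "cohort", flawed=True),
]
primary, background = triage(refs)
print([r.title for r in primary])
print([r.title for r in background])
```

When the randomized trials are themselves marked as flawed, the function returns every reference as primary evidence, which corresponds to the requirement that lower-quality evidence be incorporated into the analysis.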