Construct validity is perhaps the most critical of the subtypes of validity. It establishes whether the test actually achieves what it is supposed to achieve: it measures the extent to which a test correctly distinguishes both the presence and the absence of the condition that the test is supposed to detect. In colloquial terms, construct validity measures whether the test actually works, and how well it works.
In the past, it was the habit of medicine to believe that all test results were correct and true; that if a test result was positive, the condition truly was present; and that if the test result was negative, the condition was definitely not present. This tradition has been refuted and supplanted.
There is a science of testing tests. It involves comparing, in the same sample of patients, the results of a test of unknown validity with the results of some other test whose validity is beyond question. That latter test is known as the criterion standard, formerly known as the ‘gold’ standard.
In reality, no test is perfect; and no criterion standard is absolute. The working definition of a criterion standard is that it is a test about whose results there is substantially less dispute than the test undergoing scrutiny. Examples of a criterion standard might include imaging findings, operative findings, or what a pathologist finds at postmortem. In practice, the criterion standard is usually a test that allows a more direct detection of the condition in question than the test under scrutiny, and which is less subject to errors of observation.
When a test is compared with a criterion standard, the results can be expressed as a contingency table (Table 16.1). Such a table shows the number of patients who have the condition according to the criterion standard, and how many do not; and in how many of each category the test in question was positive or negative. Four cells emerge. The ‘a’ cell is the number of patients in whom the condition is present and in whom the results of the test are positive. These are patients with true-positive responses. The ‘b’ cell contains those patients who do not have the condition but in whom the test was nevertheless positive. These responses are false-positive. The ‘c’ cell represents those patients who have the condition but the test is negative. These responses are false-negative, for the test failed to detect the condition when it should have done so. The ‘d’ cell represents those patients who do not have the condition and in whom the test is negative. The test correctly identified these patients as not having the condition, and their responses are true-negative.
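As a sketch of the bookkeeping involved, the four cells can be tallied from paired test and criterion-standard results. The data and variable names below are hypothetical, chosen only to illustrate the classification into the four cells:

```python
# Tally the four cells of a 2x2 contingency table from paired results.
# Each pair is (test_positive, condition_present_by_criterion_standard).
# The pairs below are hypothetical, for illustration only.
pairs = [
    (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False),
]

a = sum(1 for test, cond in pairs if test and cond)          # true-positive
b = sum(1 for test, cond in pairs if test and not cond)      # false-positive
c = sum(1 for test, cond in pairs if not test and cond)      # false-negative
d = sum(1 for test, cond in pairs if not test and not cond)  # true-negative

print(a, b, c, d)  # 2 1 1 2
```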
From such a table, several descriptive statistics can be derived, which can be used to quantify the virtues of a diagnostic test, or the lack thereof. Paramount amongst these are the sensitivity and the specificity of the test.
Sensitivity is the extent to which the test correctly detects the condition that the test is supposed to detect. Conceptually, this is read down the first column of the table. Numerically, sensitivity is the ratio of ‘a’ to ‘a+c’, for ‘a’ is the number of patients known to have the condition in whom the test was positive, while ‘a+c’ is the total number of patients who had the condition. Sensitivity is also known as the true-positive rate, for it describes the proportion of cases who should have been positive that the test actually did find, correctly, as positive.
Specificity is the extent to which the test correctly detects the absence of the condition. Conceptually, it is read up the second column of the table. Numerically, specificity is the ratio of ‘d’ to ‘b+d’, for ‘d’ is the number of patients known not to have the condition in whom the test was negative, while ‘b+d’ is the total number of patients who did not have the condition. Specificity is also known as the true-negative rate, for it describes the proportion of cases who should have been negative that the test actually did find, correctly, as negative.
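A minimal sketch of these two ratios, using hypothetical cell counts:

```python
# Sensitivity and specificity from the four cells of a contingency table.
# Cell counts are hypothetical, chosen only to illustrate the arithmetic.
a, b, c, d = 90, 20, 10, 180  # TP, FP, FN, TN

sensitivity = a / (a + c)  # true-positive rate: 90 / 100 = 0.9
specificity = d / (b + d)  # true-negative rate: 180 / 200 = 0.9

print(sensitivity, specificity)
```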
A companion statistic is the false-positive rate. This is the proportion of cases who did not have the condition but in whom the test was, incorrectly, positive. Numerically, it is the ratio of ‘b’ to ‘b+d’. It is also the complement of the specificity, i.e. false-positive rate = 1 − specificity.
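Continuing with hypothetical counts, the complementary relationship between the false-positive rate and the specificity can be checked directly:

```python
# False-positive rate as the complement of specificity.
# Cell counts are hypothetical, for illustration only.
b, d = 20, 180  # FP, TN

false_positive_rate = b / (b + d)  # 20 / 200 = 0.1
specificity = d / (b + d)          # 180 / 200 = 0.9

# The two rates partition the non-diseased column, so they sum to 1.
print(false_positive_rate, 1 - specificity)
```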
Failure to recognize both the occurrence and the prevalence of false-positive responses has been one of the major transgressions of medicine in the past. It is both false and illusory to assume that every test result that is positive is correctly positive. A test can be positive for reasons other than the sought-for condition being present. Unless the prevalence of false-positive results is known, the validity of the test remains in question, for an investigator cannot otherwise tell if a positive result is true-positive or false-positive.
The significance of false-positive responses can be realized by analyzing the contingency table. The total number of positive responses to the test is the number of true-positive cases (‘a’) and the number of false-positive cases (‘b’). The confidence that an investigator can have, that a given positive response is true-positive, is determined by the ratio of ‘a’ to ‘b’, or the ratio of ‘a’ to ‘a+b’. The greater the value of ‘b’, the less confidence an investigator can have that a given positive response is true-positive. In other words, false-positive responses compromise diagnostic confidence.
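The diagnostic confidence described here, a/(a+b), is what is usually called the positive predictive value. A sketch of how a growing ‘b’ erodes it, using hypothetical counts:

```python
# How false-positives compromise diagnostic confidence.
# a is held fixed while b grows; all counts are hypothetical.
a = 90  # true-positives

for b in (10, 90, 810):  # increasing numbers of false-positives
    confidence = a / (a + b)  # probability a given positive is true-positive
    print(f"b = {b:3d}  ->  confidence = {confidence:.2f}")
# Confidence falls from 0.90 to 0.50 to 0.10 as b increases.
```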