11. Acupuncture
Adrian White and Claudia M. Witt
Chapter Contents
The acupuncture intervention
Control interventions
Outcomes
Comment on study design (explanatory and pragmatic studies)
Conclusion
This chapter should not be read in isolation, but in the context of both Chapter 1, which discusses the place for, interpretation of, and problems in complementary and alternative medicine (CAM) research, and Chapter 5, which discusses the fundamental principles of research, particularly the randomized controlled trial (RCT). Both these chapters illustrate several points with examples from acupuncture research, which will not be repeated here. This chapter will discuss the particular applications of these general principles to acupuncture research.
Introduction
At the time of writing, the recent history of acupuncture research has been dominated by the Modellvorhaben Akupunktur, a programme of trials funded by German health insurance companies using various designs to investigate the effectiveness, efficacy, safety and cost-effectiveness of acupuncture treatment for common conditions (headache, migraine, neck pain, back pain, osteoarthritis of hip and knee, allergic rhinitis and dysmenorrhoea) (Linde et al., 2006 and Witt et al., 2006a). Anyone planning an RCT of acupuncture would be well advised to study the published protocols and reports of these research projects in detail, as they were exemplary in many ways.
The results of the programme with regard to the effectiveness of acupuncture can be briefly summarized as follows: for musculoskeletal conditions, acupuncture was much more effective than usual care, or standardized care; so was sham acupuncture (shallow needling of wrong points, sometimes called ‘placebo’, as discussed below); there was a small trend for acupuncture to be superior to sham acupuncture, but the difference was significant in only one study (Witt et al. 2005). Similarly for tension headache, acupuncture was better than waiting list (Melchart et al. 2005) and not superior to sham (Endres et al. 2007). In the case of migraine, acupuncture’s effect was no different from the effect of standard prophylactic medication (beta-blockers, flunarizine or valproic acid) (Diener et al. 2006).
The programme was designed specifically to provide evidence for the decision on whether to reimburse treatment with acupuncture, and on the basis of the results it was decided to refund acupuncture treatment for:
• knee pain, where acupuncture was clearly effective and in one of two trials superior to sham acupuncture
• back pain, where acupuncture was clearly more effective than existing guideline-based care, though not superior to sham acupuncture
However, only doctors who have advanced diplomas in psychological medicine and in pain qualify for reimbursement.
The results of these trials have not been as conclusive as acupuncturists had hoped – particularly the lack of clear, consistent superiority of acupuncture over placebo. Various aspects of the studies have generated debate, such as what is ‘adequate’ acupuncture, how important is choice of point location and what is a satisfactory ‘placebo’ for acupuncture. For some commentators, acupuncture remains ‘on trial’ but others dismiss it as no more than a placebo. The history of acupuncture research is full of studies with negative results which could, in retrospect, have been anticipated from features of the design – such as inappropriate conditions or patients, suboptimal treatment regimes, poorly chosen control groups or insensitive measures. In order to end up with a representative and truthful evaluation of acupuncture, study design is crucial: studies must have the best chance of showing an effect as well as high methodological quality. This chapter will discuss important aspects of study design.
Approach
Acupuncture treatment can be seen as having two components, each of them complex: (1) the insertion and stimulation of the needle – which is largely mechanical and reproducible; and (2) the other aspects of the therapeutic interaction, such as the belief and expectations of the practitioner and the patient, the demeanour of the practitioner, the formulation of a diagnosis – which are essentially subjective and difficult to reproduce and measure. This chapter mainly deals with the mechanical aspects of needling, not because this is necessarily more important but because it is what acupuncturists spend much time and effort in learning. The process of needling is also what is understood by the world at large as the principal component of acupuncture. And it seems important to demonstrate somehow that the correct form of acupuncture needling is superior to placebo (even though, in practice, the effect of acupuncture is considerably enhanced by the other aspects of treatment), since health regulators have come to regard this as an essential precondition for integration and reimbursement.
This chapter assumes that needles act principally by stimulating the nervous system. It will adopt the PICO (participants, intervention, control, outcomes) sequence, then will offer some comments on study design and economic evaluation.
Participants
It is still not known why some patients respond better than others (i.e. what are the predictor variables of an acupuncture response), apart from the general statement that less severe cases are more likely to respond than severe ones. It would be an ideal arrangement if controlled trials of acupuncture could include only known responders, as happens in many studies of non-steroidal anti-inflammatories: patients are recruited only if they are already taking the drug, and they are asked to stop (Bjordal et al. 2004). Then they are randomized only if they show a significant worsening of symptoms, i.e. they are responders. However, patients who have already received acupuncture are likely to recognize placebo/sham acupuncture if they are subsequently given it. In addition, selecting only the responders would decrease the external validity of the results.
It is also important to choose participants with a clinical condition that is known to respond. Acupuncture has gained a reputation as a panacea for all kinds of condition, but it is still most commonly used for musculoskeletal conditions, and this is where it seems to hold most promise according to systematic reviews. The effects of acupuncture on other forms of chronic pain, such as cancer pain or fibromyalgia, are less clear. The balance of evidence shows an effect in nausea and vomiting, though some well-run studies have also had negative results, and other promising areas include postoperative pain and allergic rhinitis.
It is probably most difficult to measure an effect of acupuncture when this is likely to be rather small in relation to the ‘placebo response’ of patients. Some conditions, such as menopausal hot flushes and irritable bowel syndrome, are known to have relatively large psychological responses to any treatment, so studies in these conditions are likely to have to use large sample sizes.
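The link between a small expected effect and a large sample size can be illustrated with the standard normal-approximation formula for comparing two group means. This is only a sketch: the function name `n_per_group`, the default significance level and power, and the example effect sizes are assumptions chosen for illustration and are not taken from any of the trials discussed in this chapter.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate number of participants per arm for a two-arm trial.

    effect_size is the standardized difference between group means
    (difference divided by the common standard deviation). The formula
    is the usual normal approximation for a two-sided test and ignores
    dropouts, clustering and unequal allocation.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    # Hypothetical effect sizes: a moderate and a small difference
    for d in (0.5, 0.2):
        print(d, n_per_group(d))
```

Because the effect size is squared in the denominator, halving the expected difference between acupuncture and control roughly quadruples the number of participants needed in each group.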
The acupuncture intervention
Acupuncture can be standardized (formula acupuncture); semi-standardized, for example allowing variation in prespecified ways in response to certain symptoms; or individualized, given as in daily practice. There is no evidence published so far that one type of acupuncture is more effective than any other, though individualized acupuncture is certainly more satisfying for the practitioner.
Adequate acupuncture
This section reproduces in part an article on dose of acupuncture to which many authors contributed in a consensus process (White et al. 2008).
Acupuncture’s development over 2000 years has taken place in different centres in China, Japan, Korea and other parts of the world, and understandably many different styles of practice now exist. Currently, there is considerable disagreement among acupuncturists, particularly those trained in different schools, about what constitutes the best treatment for different conditions and for different patients. A treatment protocol (i.e. a precise description of the procedures and the schedule for a course of treatment) that is one practitioner’s favourite may be dismissed by another.
It is inconceivable that any pharmaceutical company would spend resources on clinical trials of a new drug until they know the characteristics of the dosage and the patients likely to respond. Yet, because acupuncture research has, for the most part, skipped some of the necessary earlier phases of research in which the dose–response relationship is carefully examined (Campbell et al. 2000), there is a dearth of data upon which to base decisions about optimal acupuncture protocols.
A definition of the ‘dose’ of acupuncture has been suggested (Box 11.1). Clearly, the effect of needling will vary at different sites in the body, but for simplicity we shall not consider the location of needling in these general comments. The dose required to treat different health conditions will vary depending on the intended mechanism of the effect, e.g. whether local, segmental, extrasegmental or central. Some conditions, e.g. migraine or fibromyalgia, probably require several mechanisms to be activated if treatment is to be effective (Filshie & White 1998). And some conditions, again including fibromyalgia, may require wide variations in dose according to the degree to which the nervous system is sensitized in a particular patient (Lundeberg & Lund 2007).
BOX 11.1
The physical procedures applied in each session, using one or more needles, and the patient’s resulting perception, sensory as well as affective and cognitive
Different doses will be required for different conditions and for different states of the nervous system
Please note that the patient’s response is also part of the dose. This looks odd at first, because a response is usually what is elicited by the dose. However, the dose of acupuncture consists of more than just the mechanical stimulation: patients come to acupuncture with beliefs and expectations already formed, and these may subsequently be altered by the experience of the treatment. These cognitive factors are known to influence the effects of acupuncture, so they should be regarded as part of the total dose, and measured if we are to know fully what treatment the patient has received. Unfortunately, measurement of these factors is not easy or reliable.
The components that make up the purely mechanical aspects of acupuncture are set out in the Standards for Reporting Interventions in Controlled Trials of Acupuncture (STRICTA) guidelines (MacPherson et al. 2002), which are discussed below.
Deciding on an adequate dose of acupuncture
There are several approaches for deciding what is ‘adequate’ for any condition, and the original references should be consulted for full descriptions:
1. Clinical opinion. What clinicians think is the best treatment can be determined in several ways. Firstly, by observing different practitioners (Napadow et al. 2004). Secondly, by establishing some kind of clinical consensus, either using methods such as the Delphi method (Webster-Harrison et al. 2002), consensus surveys or conference-type processes (Foster et al., 1999, Witt et al., 2005, Molsberger et al., 2006a, Molsberger et al., 2006b and MacPherson and Schroer, 2007) or by using a stepwise procedure to develop protocols in a ‘treatment manualization’ process (Schnyer & Allen 2002). Thirdly, by examining the traditional acupuncture texts (Birch 1997), though the value of this approach may be limited by difficulties of translation.
2. Clinical trials. The most reliable method of determining an adequate acupuncture protocol would be by directly comparing different protocols in patients in a tightly controlled, explanatory trial, ideally in conditions in which one form of acupuncture has already been shown to be effective. The protocols would need to be established first, using various combinations of clinical consensus and basic research. The choice of protocol for testing would vary between different ‘schools’. One group of researchers used this approach for fibromyalgia, and found that relief of symptoms depended neither on the point location nor the manner of stimulation of the needles – though it did depend on how frequently the treatment was given (Harris et al. 2005). Studies in other conditions may have very different results. The major problem with direct comparisons is that studies need to be large to demonstrate the small differences that are likely to exist between protocols.
3. Basic research. Laboratory studies have mostly explored the effect of different treatment parameters on an outcome such as pain threshold in healthy human volunteers (Marcus, 1994, Barlas et al., 2000 and Zaslawski et al., 2003). Results suggest that stimulation intensity is the single most important determinant of analgesia, and that there is interindividual variation as to how people respond to this type of stimulation. However, it may not be appropriate to apply the findings of experiments in healthy volunteers to patients with clinical conditions since our knowledge of the mechanisms of symptoms such as chronic pain, joint stiffness, depression, hot flushes and so on is incomplete.
4. Reviews. Systematic reviews of RCTs offer the opportunity to compare the effects of different treatment regimens. In one well-known example, Ezzo and colleagues (2000) demonstrated that trials using six or more sessions of acupuncture for osteoarthritis of the knee were more likely to be positive than those using fewer than six.
5. Individual patient data. The different treatment effects in individual patients could be revealed by the use of individual patient data, as in the Acupuncture Trialists’ Collaboration. Individual patient data from many trials will be combined into a single database and analysed to determine whether characteristics of acupuncture such as the number of treatment sessions, treatment style or practitioner qualifications affect outcome.
A combination of clinical opinion and some published evidence was used in one systematic review to set a threshold for ‘adequate’ acupuncture for treating knee osteoarthritis: ‘at least six treatments, at least one per week, with at least four points needled for each painful knee for at least 20 minutes, and either needle sensation (deqi) achieved in manual acupuncture, or electrical stimulation of sufficient intensity to produce more than minimal sensation’ (White et al. 2007).
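For a reviewer screening trial reports, the threshold quoted above can be written as a simple checklist. The sketch below is one possible encoding, not the review's own instrument: the class `KneeProtocol`, its field names and the function `is_adequate` are hypothetical, and judging whether electrical stimulation produced 'more than minimal sensation' is left to the person extracting the data.

```python
from dataclasses import dataclass


@dataclass
class KneeProtocol:
    """Hypothetical summary of a treatment protocol extracted from a trial report."""
    sessions: int                      # total number of treatments
    sessions_per_week: float           # treatment frequency
    points_per_painful_knee: int       # points needled for each painful knee
    needle_minutes: float              # needle retention time per session
    deqi_achieved: bool = False        # needle sensation in manual acupuncture
    electrical_above_minimal: bool = False  # electrical stimulation felt as more than minimal


def is_adequate(p: KneeProtocol) -> bool:
    """Apply the quoted threshold for 'adequate' acupuncture in knee osteoarthritis."""
    return (
        p.sessions >= 6
        and p.sessions_per_week >= 1
        and p.points_per_painful_knee >= 4
        and p.needle_minutes >= 20
        and (p.deqi_achieved or p.electrical_above_minimal)
    )


if __name__ == "__main__":
    example = KneeProtocol(sessions=8, sessions_per_week=1,
                           points_per_painful_knee=5, needle_minutes=20,
                           deqi_achieved=True)
    print(is_adequate(example))
```

Writing the rule out in this way makes clear that every criterion must be met, and that the stimulation criterion can be satisfied by either deqi or sufficiently intense electrical stimulation.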
Reporting acupuncture treatment – STRICTA guidelines
Acupuncture is a procedure that has many variable components. In order to encourage researchers to report it in a way that can be reproduced and interpreted, a group of researchers reached a consensus on which criteria need to be reported. These are known as the STRICTA criteria – standing for Standards for Reporting Interventions in Controlled Trials of Acupuncture (MacPherson et al. 2002). They cover:
• rationale behind the particular use of acupuncture in the study
• details of needling
• regimen of treatment over time
• other components of treatment
• practitioner background
• control or comparison groups, where relevant.
Descriptions of the criteria can be presented either in the text or in a table. The criteria should be applied flexibly according to the context, as it will not be necessary to provide all details in all circumstances – for example, not all studies use co-interventions or control groups.
These guidelines are currently (2010) under revision and readers should search for the latest version. STRICTA should be used in addition to the general Consolidated Standards of Reporting Trials (CONSORT) guidelines for reporting trials and may come to be regarded as an extension to CONSORT.
Co-interventions
Most studies test acupuncture as a sole intervention, for the obvious reason that this is the only way to be sure that any changes are due to the acupuncture itself. At most, the only other intervention available to patients is rescue analgesia.
However, in practice acupuncture is often used together with massage, manipulation, exercise and so on, and a few studies have investigated such combinations. In one of the German studies of acupuncture for knee pain, all patients received physiotherapy involving strengthening and aerobic exercises; the acupuncture group received additional acupuncture, whereas the control group received additional sham acupuncture (Scharf et al. 2006). Interestingly, a review found that the effect of acupuncture in this study was much smaller than in studies in which acupuncture patients received rescue medication only (White et al. 2007).
Further, in a UK study also in patients with knee osteoarthritis, acupuncture (or sham acupuncture) was given in addition to individualized strengthening and aerobic exercises. This study also found that acupuncture had no additional effect (Foster et al. 2007). It seems possible that appropriate exercise can have a ‘ceiling’ effect in osteoarthritis, so that acupuncture is unable to show any additional effect. The size of the effect of exercise in this study was rather similar to the average size of the effect of acupuncture in the review (White et al. 2007). If this ceiling effect of exercise is supported by other evidence, the implication for practice is that the choice between exercise and acupuncture could depend on their cost-effectiveness, or on patient preference.
Control interventions
Two of the fundamental research questions for acupuncture – does it have a useful effect for patients? does it have a biological effect? – require different control groups: standard care and ‘placebo’ control.
Standard care
There is now good evidence that acupuncture is a good alternative to conventional treatment for a number of conditions. Two of the German studies used standard care based on guidelines as the control. For back pain, for example, acupuncture was compared with standardized, multimodal care according to German, evidence-based guidelines (Haake et al. 2007). Participants received 10 consultations with the physician, exercise therapy and analgesics (paracetamol or non-steroidal anti-inflammatory drugs) during painful periods. The response rate in the acupuncture group was 47.6%, compared with 27.4% in the standardized care group (P<0.001). Interestingly, the response rate in the sham acupuncture group was 44.2%, also statistically superior to standardized care (P<0.001), and not significantly different from acupuncture.
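The size of these contrasts is easier to see when the response rates are set side by side. The short sketch below simply subtracts the percentages quoted above; the dictionary and function names are illustrative, and no significance test is attempted because the group sizes are not given here.

```python
# Response rates (%) as quoted in the text for the back pain trial
RESPONSE_RATES = {
    "acupuncture": 47.6,
    "sham acupuncture": 44.2,
    "standardized care": 27.4,
}


def difference_in_points(rates, group_a, group_b):
    """Absolute difference in response rate (percentage points), group_a minus group_b."""
    return round(rates[group_a] - rates[group_b], 1)


if __name__ == "__main__":
    pairs = [
        ("acupuncture", "standardized care"),
        ("sham acupuncture", "standardized care"),
        ("acupuncture", "sham acupuncture"),
    ]
    for a, b in pairs:
        print(f"{a} vs {b}: {difference_in_points(RESPONSE_RATES, a, b)} points")
```

Both needling groups exceeded standardized care by well over 15 percentage points (20.2 and 16.8), whereas acupuncture and sham acupuncture differed from each other by only 3.4 points.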
Although acupuncture appears to have had an impressive effect in this study, we should bear in mind that these patients had experienced pain for at least 6 months, were recruited for a trial of acupuncture through media advertisements, and might expect or at least hope to receive acupuncture. Those who were randomized to continue with the same treatment would be disappointed, possibly leading to some negative bias in scoring the effect of treatment, though it might be unreasonable to argue that bias could account for the whole of the difference between groups. Therefore the results are open to interpretation: detractors of acupuncture will dismiss the results as due entirely to the placebo effect.
Sham (‘placebo’) acupuncture
When it comes to demonstrating the biological activity of acupuncture, the questions resolve into several issues, the main ones being: What are the active components of needle stimulation? Does needle stimulation have an effect at acupuncture points, according to classical theory? Does needle stimulation have an effect anywhere that it stimulates a nerve (i.e. including outside acupuncture points), according to neurophysiological theory? The existence of a true acupuncture placebo would greatly help in resolving these issues.
Terminology for sham acupuncture
Any realistic control intervention for acupuncture must press or at least touch the patient’s skin, and so stimulate nerve endings. Therefore it will not be completely inactive and cannot be called a ‘placebo’ control. Other terms have been suggested, though are not used consistently. ‘Sham’ acupuncture has been used to mean needling wrong points (Lewith et al., 1983, Hammerschlag, 1998 and Lewith and Vincent, 1998); ‘minimal’ acupuncture has been used for superficial needling. Some authors have argued that the term ‘sham’ should be used for all methods as it places emphasis on the psychological impact on the subject (Park et al. 1999), so the procedure should always be described in full.
Here, we shall use the term ‘sham’ for any procedure that pretends to be acupuncture. The sham may be either ‘penetrating’ or ‘non-penetrating’. This concept is somewhat similar to the recommended use of the terms ‘invasive needle control’ and ‘dummy needling control’ (White et al. 2006).
The essential features of the ideal acupuncture sham are: (1) that it should match what the subject (or at least, the acupuncture-naïve subject) expects to see and experience with needling; but (2) that it should not produce the specific needle sensation, deqi. Margolin et al. (1998) suggest an additional test, that it should have the same likelihood of adverse events leading to dropout as the real intervention, but this applies only to its appearance and acceptability, and not to its physiological effects.
Sham procedures
Several sham acupuncture procedures have been devised, including both penetrating and non-penetrating types:
1. Standard needles inserted into inappropriate sites and/or superficially. This is by far the most common form of sham acupuncture.
2. Standard needles used in an abnormal way, either pressing with the handle (Hesse et al. 1994) or just pricking the surface of the skin and immediately being removed (Moore & Berk 1976).
3. Other devices used to touch or press the skin, such as the fingernail (Junnila 1982), an empty guide tube (Lao et al. 1994) or a cocktail stick (White et al. 1996). Ingeniously, Lao et al. (1995) attached leads from inactive electrostimulation apparatus to both groups in order to reduce the perceived differences between the procedures.
4. Sham forms of other treatments, such as inactivated transcutaneous electrical nerve stimulation or laser apparatus (Macdonald et al., 1983 and Dowson et al., 1985). These are less than ideal, because the placebo effects of inactivated electrical devices are likely to be different from those of needles. Therefore, any differences in the outcomes of the two groups cannot be attributed solely to the specific effect of acupuncture.
5. Non-penetrating, blunted needles. A major advance was the development by Streitberger & Kleinhenz (1998) of a sham needle that is blunt and in which the shaft of the needle is free to move inside the handle. When pressed on the skin, the needle appears to penetrate but the handle simply telescopes over the shaft. The needle is supported vertically on the skin by an adhesive dressing applied over an O-ring around the point. In a validation study, 22% of volunteers could feel ‘a dull sensation’ with the sham needle, compared with 57% with a real needle. This ‘dull sensation’ was called deqi by the authors, though it may not accurately reflect true needle sensation. This sham needle needs to be tested in a variety of locations, with different methods of manual stimulation and with variation in direction of needle insertion (Kaptchuk 1998). An RCT using the sham needle showed a significant difference in treatment effect between real needle and sham needle in treatment of rotator cuff lesions in sportsmen (Kleinhenz et al. 1999). The credibility of the intervention was not different between the groups.
A different method of supporting the needle was developed by Park et al. (1999), consisting of an oversize guide tube with a silicon flange which adheres to the skin by means of double-sided tape. The standard guide tube makes a sliding fit within the Park tube.
Another sham needling set comprising genuine and sham needles has recently been developed, using a guide tube with adhesive base like the Park tube (Takakura & Yajima 2007). The needle handle has a stopper to determine depth of insertion. While the real needle penetrates skin and muscle, the sham needle penetrates a stopper at the base of the guide tube, designed to reproduce the feel of natural tissue. Evidence from the group that developed the new placebo suggests that it is successful at blinding both patient and practitioner, but more experience is required from other centres before it can be adopted with confidence.
Selection of sham points
Although the non-penetrating sham needle may prove a satisfactory control for use at genuine points, it may still have physiological activity. Until this is established, it is better to use it on sham points. Choice of sham points involves several considerations. A sufficient number and variety of sham points should be defined before the study to give the user some choice (Zaslawski et al. 1997). The practical process of locating them (for example, measuring from landmarks) should be comparable to using real points, and practitioners must become as familiar with using them as with real points. It is not known how close sham points should be to the site of the complaint in order to be credible. Some studies have placed needles in the knee (Jobst et al., 1986, Williamson et al., 1996 and Waite and Clough, 1998), but the credibility was not tested. The points should not be in the affected anatomical segment: a meta-analysis of 90 sham-controlled studies found that real acupuncture was much more likely to be superior when controls were not in the relevant segment than when they were close to it (Araujo 1998). Finally, sham points should be properly validated, as exemplified by Margolin and colleagues (1998) in preparation for a definitive study of auricular acupuncture for cocaine dependence.
Activity of sham acupuncture
Evidence is accumulating that whatever procedure is undertaken as a sham for acupuncture, it will have a physiological effect if given in a therapeutic setting. The clearest evidence for this showed that a blunt needle, described as an active treatment, produced responses in the limbic system (part of the brain related to pain control), whereas a shallow penetration described as a placebo did not. In the same study, deep penetration with deqi also stimulated the insula, which is particularly important for analgesia (Pariente et al. 2005).
The arguments that support the activity of sham acupuncture have been set out by Lundeberg and colleagues (2008) and the original article should be consulted for detailed references. The physiological findings include the following:
• Light touch on the skin, such as is obtained during sham acupuncture, results in activity in the insular region of the brain.
• Mechanical, non-penetrating, sham acupuncture as well as low-frequency electro-acupuncture evokes brain responses localized to the contralateral primary somatosensory cortex in healthy subjects.
• Superficial and deep acupuncture needle stimulation elicits similar blood oxygen level-dependent (BOLD) responses in the central nervous system of healthy subjects.
• Sham acupuncture and traditional Chinese acupuncture reduced both clinical and experimental pain in patients suffering from fibromyalgia. Both modalities resulted in neural activity in the brain, as assessed with functional magnetic resonance imaging, though traditional acupuncture generally had a more pronounced effect.
• Reduction of serum cortisol concentration and anxiety level were observed following both verum (real) and sham (placebo) acupuncture, although there were no significant differences in the changes between the two groups. These changes could not be attributed to rest.
• Acupuncture and sham acupuncture may activate the reward system and modulate the functional connectivity, including the default mode.
Evidence that sham acupuncture is not inert is also provided by some clinical trials:
• Superficial needling as sham acupuncture is superior to a placebo pill, demonstrating that superficial and sham acupuncture is not inert.
• Superficial needling has been advocated as a treatment technique in its own right.
• Sham acupuncture has been proven to be as effective as verum acupuncture and dedicated analgesics in headache and migraine.
• Sham acupuncture and verum acupuncture have been shown to be more effective than conventional therapeutic interventions in low-back pain.
• Sham (placebo) acupuncture was associated with a significantly higher overall pregnancy rate when compared with real acupuncture in in vitro fertilization treatment.
Testing the success of blinding
The success of blinding should be tested, either indirectly by comparing the credibility of real and sham interventions, to show that they have the same psychological impact, or directly by asking subjects which intervention they believe they received. Neither method is entirely straightforward.
The common indirect approach (Vincent & Lewith 1995) to credibility testing involves four questions:
1. How confident do you feel that this treatment can alleviate your complaint?
2. How confident would you be in recommending this treatment to a friend who suffered from similar complaints?
3. How logical does this treatment seem to you?
4. How successful do you think this treatment would be in alleviating other complaints?
The responses can be on a six-point scale or visual analogue scale (Petrie & Hazleman 1985). The original questionnaire was developed as an exercise to rate the credibility of novel therapies and control procedures that had just been described to (though not experienced by) a class of healthy psychology students (Borkovec & Nau 1972). Although the original context was very different from a clinical trial, the questionnaire was shown to have test–retest validity and internal consistency in the context of a clinical trial (Vincent 1990) and has been used in several subsequent studies (White et al., 1996, Wood and Lewith, 1998, Kleinhenz et al., 1999 and Linde et al., 2005). However, the subjects’ response could vary according to information they received on recruitment, for example whether they were told ‘You will receive one of two forms of acupuncture’ or ‘You will receive either acupuncture or a placebo’. The subjects must be judging the intervention they have actually experienced, and not just giving answers about acupuncture in general. For example, Kleinhenz et al. report that, even after treatment had failed, ‘acupuncture was continued to be judged as effective’. The questionnaire wording must be both precise and fully understood.
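As an illustration of how responses to the four credibility questions might be handled, the sketch below averages each participant's four answers and then compares the group means. The scoring convention is an assumption made for this example (answers coded 1 to 6, with the mean of the four items used as the participant's score); the original publications should be consulted for the scoring actually used, and the data shown are invented.

```python
from statistics import mean


def credibility_score(answers, low=1, high=6):
    """Mean of one participant's answers to the four credibility questions.

    Assumes each answer is coded on a six-point scale from `low` to `high`.
    """
    if len(answers) != 4:
        raise ValueError("expected answers to exactly four questions")
    if any(not low <= a <= high for a in answers):
        raise ValueError("answer outside the assumed scale")
    return mean(answers)


def group_means(answers_by_group):
    """Mean credibility score for each group of participants."""
    return {
        group: mean(credibility_score(a) for a in participants)
        for group, participants in answers_by_group.items()
    }


if __name__ == "__main__":
    # Invented answers for two participants per group
    data = {
        "real": [[5, 5, 4, 4], [4, 4, 4, 4]],
        "sham": [[5, 4, 4, 3], [4, 5, 4, 4]],
    }
    print(group_means(data))
```

Similar mean scores in the real and sham groups would be consistent with the two interventions having the same credibility, though a formal comparison would still be needed in a real trial.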
Other investigators have used a direct question, such as: ‘You were told you would receive either acupuncture or another treatment very similar to it. Which do you believe you have received?’ (Moore and Berk, 1976, White et al., 1996, Zaslawski et al., 1997 and Berman et al., 2004). This method may be less than ideal: subjects may try to give the answer which they think the researcher wants (interviewer bias), and the question focuses their attention on the details of therapy, which may cause doubt and confound the outcome. Also, we observed that the question is quite stressful for subjects who do not want to ‘lose face’ by failing to recognize that they had received a sham treatment. Zaslawski et al. (1997) analysed the responses in some detail and found that the subjects’ decision depended on four factors: (1) layout of the needles; (2) needle sensation; (3) general responses such as drowsiness; and (4) alteration of symptoms.
In order to provide some method of evaluating the success of blinding, two forms of a Blinding Index have been developed which test whether patients guess according to chance (Park et al. 2008). In view of the unknown factors (such as timing of the question, or how to deal with ‘don’t know’ responses), it is probably premature to draw conclusions from these index calculations, but with further development they may become meaningful.
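To show the kind of calculation involved, the sketch below computes one commonly described per-arm index, usually attributed to Bang and colleagues, in which correct guesses minus incorrect guesses are divided by all respondents in that arm, including those answering 'don't know'. Whether this matches exactly the forms assessed in the paper cited above is an assumption; the function name and the counts are invented for illustration.

```python
def bang_index(correct, incorrect, dont_know=0):
    """Per-arm blinding index: (correct - incorrect) / all respondents.

    Values near 0 are consistent with random guessing, values near 1
    suggest that participants identified their allocation, and values
    near -1 suggest that they systematically guessed the opposite arm.
    'Don't know' answers count only in the denominator, which is one of
    several possible ways of handling them.
    """
    total = correct + incorrect + dont_know
    if total == 0:
        raise ValueError("no responses")
    return (correct - incorrect) / total


if __name__ == "__main__":
    # Invented counts for a sham arm: guessed 'sham', guessed 'real', 'don't know'
    print(bang_index(correct=30, incorrect=30, dont_know=40))
```

The index is calculated separately for each arm, since participants in the real and sham groups may guess quite differently.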
It seems, then, that there is no entirely satisfactory method of testing subject blinding, and it is provisionally recommended to use the credibility questions as they are less stressful for subjects, taking great care in setting the correct context.
Practitioner blinding
The influence of the therapist can be powerful, possibly more powerful than many interventions (Balint 1957), and is best minimized by masking (blinding) the practitioner. However, it is difficult or impossible to mask an acupuncturist and still offer technically optimal treatment. In one attempt, a doctor with no special knowledge of acupuncture was trained specifically for the study, learning two sets of points without knowing which were the correct ones (Lagrue et al. 1977). In another attempt, the practitioner was given either the true or false diagnosis made by another physician (Godfrey & Morgan 1978). This method has been adapted for rigorous, truly ‘double-blind’ studies of individualized acupuncture (Allen et al. 1998): the diagnosing practitioner (or, preferably, team) would write down both appropriate and inappropriate selections of points, place them in different envelopes marked A and B, and leave the room. An independent practitioner would then enter, select one envelope according to a code and treat the subject. This way, all acupuncture practitioners would remain blinded as well as the subject, so it only remains to arrange a blinded assessor. This method is, however, still not acceptable to practitioners who use immediate feedback from the patient, e.g. by feeling the pulse, as a guide for further treatment.
One substitute for practitioner blinding is standardized, minimal interaction (Hansen & Hansen 1985), in which the acupuncturist must avoid discussing the therapy or the response with the patient. A modification involves answering questions about acupuncture using prepared responses (Kleinhenz et al. 1999). Social discussion should also be forbidden, even though this creates a rather stilted atmosphere. Care must be taken that the actual performance of the intervention is identical in both groups. If subjects might see different practitioners on subsequent attendances, the approaches must be standardized. In a study in the UK, it was found that interventions were more credible when given by a male practitioner than by a female practitioner, and when the practitioner worked in a holistic manner rather than symptomatically (Choi & Tweed 1996).
The recent development of a placebo needle that mimics insertion (described above) appears to offer the chance of double-blinding, though more experience with its use is required before it is generally adopted.
Outcomes
For measuring the main outcome, there is no reason why acupuncture studies should choose a different measure from that used in conventional trials of the same condition. Triallists can add secondary measures for particular outcomes that are of interest, often quality of life. For cost-effectiveness analysis based on quality-adjusted life years (QALYs) it is essential to use generic quality-of-life measurement instruments such as the SF-36 or the EuroQol (EQ-5D).
Expectation
As discussed in detail elsewhere (Chapter 5), the patient’s expectation of the treatment outcome can have a relevant effect on the outcome. This applies to acupuncture as well. In a pooled analysis of the data of four German studies (Linde et al. 2007), the authors found that a higher expectation could predict a better treatment outcome. It appears wise to account for patient expectations by measuring them at baseline and taking them into account when analysing the main outcome data.
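One common way of taking a baseline measurement into account is to include it as a covariate in a regression of the outcome on treatment group. The sketch below is only an illustration of that idea under stated assumptions: the linear model, the function names and the tiny noise-free data set are all hypothetical, and a real trial would use a statistics package and a prespecified analysis plan.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def adjusted_treatment_effect(outcome, treated, expectation):
    """Least-squares fit of outcome = b0 + b1*treated + b2*expectation.

    Returns b1, the treatment effect adjusted for baseline expectation.
    """
    rows = [[1.0, float(t), float(e)] for t, e in zip(treated, expectation)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(3)]
    return solve_linear_system(xtx, xty)[1]


# Hypothetical data built as outcome = 10 + 5*treated + 2*expectation,
# with the treated group deliberately given higher baseline expectations.
treated = [1, 1, 1, 1, 0, 0, 0, 0]
expectation = [4, 5, 3, 5, 2, 3, 1, 3]
outcome = [10 + 5 * t + 2 * e for t, e in zip(treated, expectation)]

crude = (sum(y for y, t in zip(outcome, treated) if t) / 4
         - sum(y for y, t in zip(outcome, treated) if not t) / 4)
adjusted = adjusted_treatment_effect(outcome, treated, expectation)
print(f"crude difference: {crude:.1f}, adjusted effect: {adjusted:.1f}")
```

In this constructed example the crude difference between groups overstates the treatment effect because expectation differs between the groups; the adjusted estimate recovers the value built into the data. In a properly randomized trial expectation should be balanced on average, so adjustment mainly improves precision and guards against chance imbalance.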
Economic analysis in acupuncture studies
Patients want to know how effective a treatment is. Purchasers want to know, in addition, how much it will cost. Therefore national institutes such as the National Institute for Health and Clinical Excellence (NICE) in the UK introduced arbitrary cost limits for the reimbursement of new treatments. Yet for most established conventional treatments such information about cost-effectiveness is still missing. In this situation, several cost-effectiveness analyses for acupuncture have been published in recent years. The first, from the UK (Wonderling et al. 2004), found acupuncture to be more cost-effective than conventional routine care; most of the later analyses are part of a large German research initiative.
Unfortunately, the results of cost-effectiveness analyses are largely valid only for the country where the study was performed. They are based on the respective national health system, the local costs of the treatments and the assumptions on which the analysis is based (e.g. societal or third-party payer perspective), and thus vary between countries. This makes the field of economic analysis more complicated than other areas of medical research. All cost-effectiveness studies have shown that acupuncture is associated with a better outcome than usual care, but at additional cost. The above German studies, taking a societal approach, calculated total and diagnosis-specific costs. The total costs per QALY varied according to the diagnosis between €3000 for dysmenorrhoea (Witt et al. 2008a) and €18 000 for osteoarthritis of the knee or the hip (Reinhold et al. 2008). The costs per QALY for low-back pain (Witt et al. 2006b), neck pain (Willich et al. 2006) and headache (Witt et al. 2008b) fell somewhere in between. It would be interesting to see whether acupuncture might help to save costs in the long term. The German cost-effectiveness studies lack such an analysis, whereas the study on low-back pain from the UK (Ratcliffe et al. 2006) stretched over a period of 2 years. Here, the costs of the additional acupuncture treatment were not offset by other savings after 2 years. Based on the available data we can currently assume that additional acupuncture treatment is associated with a higher benefit for the patients but also with higher costs for the purchasers. Because economic aspects are important, future studies should include them whenever possible.
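A cost per QALY figure of the kind quoted above is usually an incremental ratio: the extra cost of the new treatment divided by the extra QALYs it yields, both relative to the comparator. The following sketch shows that arithmetic only. The function name and all input values are hypothetical and are not taken from any of the cited studies, and the sketch ignores discounting, uncertainty and the choice of perspective.

```python
def cost_per_qaly(cost_new, cost_usual, qaly_new, qaly_usual):
    """Incremental cost per QALY gained for a new treatment versus usual care.

    All arguments are mean values per patient over the same time horizon.
    """
    delta_cost = cost_new - cost_usual
    delta_qaly = qaly_new - qaly_usual
    if delta_qaly <= 0:
        # With no QALY gain the ratio has no useful interpretation here.
        raise ValueError("new treatment shows no QALY gain over the comparator")
    return delta_cost / delta_qaly


# Hypothetical per-patient means in euros and QALYs, for illustration only.
ratio = cost_per_qaly(cost_new=1200.0, cost_usual=900.0,
                      qaly_new=0.700, qaly_usual=0.675)
print(f"about {ratio:,.0f} euros per QALY gained")
```

Because the QALY difference in the denominator is typically small, modest changes in either the cost inputs or the quality-of-life measurement can shift the ratio considerably, which is one reason results transfer poorly between health systems.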
Comment on study design (explanatory and pragmatic studies)
The single-blinded studies in the Modellvorhaben Akupunktur series in Germany were three-arm trials, comparing acupuncture, sham acupuncture (shallow needling to non-classical sites) and no acupuncture – this last group was on a waiting list for acupuncture, or received usual care or standardized care. Superficially this is an efficient design, since it answers two questions (Does acupuncture have a biological effect? Is acupuncture clinically useful?) in one study.
However, when the differences between explanatory and pragmatic approaches (Chapter 5) are considered, it soon becomes clear that design features that suit the first question may not be ideal for the second. The ‘biological effect’ question has the best chance of a truthful answer if it recruits patients who are likely to respond (they have no other medical conditions, and are not receiving any other treatments); in contrast, the ‘useful’ question should include a sample representing potential patients who might receive this treatment in usual care, with other conditions and existing treatments. Studies of the ‘biological effect’ question should apply acupuncture and sham acupuncture to a high technical standard, which is best achieved by using experienced practitioners who receive extra training for the study purpose; in contrast, the ‘useful’ question requires practitioners trained to the standard commonly available in the health service. Finally, the ‘biological effect’ question can be answered at any appropriate time point, and usually the end of treatment is chosen as that is when the effect will be the greatest; in contrast, the ‘useful’ question should have a long follow-up of at least 6 months, since this is the information that is relevant to patients.
In one study, 340 practitioners recruited an average of three patients each: the effect was measured at 6 months (Haake et al. 2007). This study design is good for addressing the pragmatic question of the effectiveness of acupuncture in normal practice, but may not provide the most accurate information on the biological effect of acupuncture: there may be doubts about the technical standards of the acupuncture and sham procedures, and any effect at the end of treatment might have dissipated by 6 months. These limitations could reduce the internal validity of the study in addressing the explanatory (biological effect) question.
At the present time, a blunt, non-penetrating needle control applied away from acupuncture points seems the best approximation to placebo (inert) acupuncture, but it does require considerable expertise. Placebo-controlled studies require careful control of all variables. In multicentre studies this needs considerable resources, so such studies may be best conducted in single centres.
The German studies had large sample sizes, which were thought to be sufficient to detect even a relatively small biological effect of acupuncture. However, with the exception of one study in knee pain (Witt et al. 2005), they found no significant differences between acupuncture and sham acupuncture, and they provoked an international discussion on the benefits and pitfalls of sham acupuncture procedures in clinical studies. Thus they have not produced the hoped-for ‘definitive answers’ to whether acupuncture has biological effects.
The main lessons from this experience are that explanatory (sham-controlled) studies that address the placebo question should:
• have a small number of well-trained centres
• preferably use non-penetrating needles as the control
Conclusion
The design of acupuncture studies is far from straightforward, but there is now considerable international expertise, and collaboration with experts is essential in designing studies. It is important to consider in detail the types of patient to be included, the type and dose of acupuncture to be used, the control group and the measurement of outcome. Different study designs are required to address different research questions. Pragmatic randomized studies evaluating acupuncture as an add-on or comparing it to standard treatment are helpful for decision-making. Studies that address the placebo question should carefully consider the limitations of the sham controls for acupuncture that are currently available, some of which may have activity of their own, as well as methods for testing the success of blinding. Cognitive factors in patients, particularly their expectations of acupuncture, should always be assessed as these influence the outcome in individuals.
References
Allen, J.J.B.; Schnyer, R.N.; Hitt, S.K., The efficacy of acupuncture in the treatment of major depression in women, Psychol. Sci. 9 (1998) 397–401.
Araujo, M.S., Does the choice of placebo determine the results of clinical studies on acupuncture? Forsch. Komplementärmed. 5 (Suppl) (1998) 8–11.
Balint, M., The doctor, the patient and the illness. (1957) Pitman Medical.
Barlas, P.; Lowe, A.S.; Walsh, D.M.; et al., Effect of acupuncture upon experimentally induced ischemic pain: a sham-controlled single-blind study, Clin. J. Pain 16 (3) (2000) 255–264.
Berman, B.M.; Lao, L.; Langenberg, P.; et al., Effectiveness of acupuncture as adjunctive therapy in osteoarthritis of the knee: a randomized, controlled trial, Ann. Intern. Med. 141 (12) (2004) 901–910.
Birch, S., Issues to consider in determining an adequate treatment in a clinical trial of acupuncture, Complement. Ther. Med. 5 (1997) 8–12.
Bjordal, J.M.; Ljunggren, A.E.; Klovning, A.; et al., Non-steroidal anti-inflammatory drugs, including cyclo-oxygenase-2 inhibitors, in osteoarthritic knee pain: meta-analysis of randomised placebo controlled trials, BMJ 329 (7478) (2004) 1317.
Borkovec, T.D.; Nau, S.D., Credibility of analogue therapy rationales, Journal of Behavior Therapy and Experimental Psychiatry 3 (1972) 257–260.
Campbell, M.; Fitzpatrick, R.; Haines, A.; et al., Framework for design and evaluation of complex interventions to improve health, BMJ 321 (7262) (2000) 694–696.
Choi, P.Y.L.; Tweed, A., The holistic approach in acupuncture treatment: implications for clinical trials, J. Psychosom. Res. 41 (1996) 349–356.
Diener, H.C.; Kronfeld, K.; Boewing, G.; et al., Efficacy of acupuncture for the prophylaxis of migraine: a multicentre randomised controlled clinical trial, Lancet Neurol. 5 (4) (2006) 310–316.
Dowson, D.I.; Lewith, G.; Machin, D., The effects of acupuncture versus placebo in the treatment of headache, Pain 21 (1985) 35.
Endres, H.G.; Bowing, G.; Diener, H.C.; et al., Acupuncture for tension-type headache: a multicentre, sham-controlled, patient- and observer-blinded, randomised trial, J. Headache Pain 8 (2007) 306–314.
Ezzo, J.; Berman, B.; Hadhazy, V.; et al., Is acupuncture effective for the treatment of chronic pain? A systematic review, Pain 86 (3) (2000) 217–225.
Filshie, J.; White, A.R., Medical Acupuncture: a Western scientific approach. (1998) Churchill Livingstone, Edinburgh.
Foster, N.; Barlas, P.; Daniels, J.; et al., Use of acupuncture by physiotherapists in the treatment of osteoarthritis of the knee: current trends inform a clinical trial, In: Proceedings of the Chartered Society of Physiotherapists Congress, Birmingham, UK. (1999), p. 27.
Foster, N.E.; Thomas, E.; Barlas, P.; et al., Acupuncture as an adjunct to exercise based physiotherapy for osteoarthritis of the knee: randomised controlled trial, BMJ 335 (7617) (2007) 436.
Godfrey, C.M.; Morgan, P., A controlled trial of the theory of acupuncture in musculoskeletal pain, J. Rheumatol. 5 (2) (1978) 121–124.
Haake, M.; Muller, H.H.; Schade-Brittinger, C.; et al., German Acupuncture Trials (GERAC) for chronic low back pain: randomized, multicenter, blinded, parallel-group trial with 3 groups, Arch. Intern. Med. 167 (17) (2007) 1892–1898.
Hammerschlag, R., Methodological and ethical issues in clinical trials of acupuncture, J. Altern. Complement. Med. 4 (1998) 159–171.
Hansen, P.E.; Hansen, J.H., Acupuncture treatment of chronic tension headache – a controlled cross-over trial, Cephalalgia 5 (1985) 137–142.
Harris, R.E.; Tian, X.; Williams, D.A.; et al., Treatment of fibromyalgia with formula acupuncture: investigation of needle placement, needle stimulation, and treatment frequency, J. Altern. Complement. Med. 11 (4) (2005) 663–671.
Hesse, J.; Mogelvang, B.; Simonsen, H., Acupuncture versus metoprolol in migraine prophylaxis: a randomised trial of trigger point inactivation, J. Intern. Med. 235 (1994) 451–456.
Jobst, K.; Chen, J.H.; McPherson, K.; et al., Controlled trial of acupuncture for disabling breathlessness, Lancet 328 (1986) 1416–1418.
Junnila, S.Y.T., Acupuncture therapy for chronic pain, Am. J. Acupunct. 10 (1982) 259–262.
Kaptchuk, T., Placebo needle for acupuncture, Lancet 352 (1998) 992.
Kleinhenz, J.; Streitberger, K.; Windeler, J.; et al., Randomised clinical trial comparing the effects of acupuncture and a newly designed placebo needle in rotator cuff tendinitis, Pain 83 (2) (1999) 235–241.
Lagrue, G.; Poupy, J.L.; Grillot, A.; et al., Acupuncture anti-tabagique, La Nouvelle Presse Médicale 9 (1977) 966.
Lao, L.; Bergman, S.; Anderson, R.; et al., The effect of acupuncture on post-operative pain, Acupunct. Med. 12 (1994) 13–17.
Lao, L.; Bergman, S.; Langenberg, P.; et al., Efficacy of Chinese acupuncture on postoperative oral surgery pain, Oral Surg. Oral Med. Oral Pathol. Oral Radiol. Endod. 79 (1995) 423–428.
Lewith, G.T.; Vincent, C.A., The clinical evaluation of acupuncture, In: (Editors: Filshie, J.; White, A.) Medical Acupuncture: a Western Scientific Approach (1998) Churchill Livingstone, Edinburgh, pp. 205–224.
Lewith, G.; Field, J.; Machin, D., Acupuncture compared with placebo in post-herpetic pain, Pain 17 (1983) 361–368.
Linde, K.; Streng, A.; Jurgens, S.; et al., Acupuncture for patients with migraine: a randomized controlled trial, JAMA 293 (17) (2005) 2118–2125.
Linde, K.; Streng, A.; Hoppe, A.; et al., The programme for the evaluation of patient care with acupuncture (PEP-Ac) – a project sponsored by ten German social health insurance funds, Acupunct. Med. 24 (Suppl) (2006) 25–32.
Linde, K.; Witt, C.M.; Streng, A.; et al., The impact of patient expectations on outcomes in four randomized controlled trials of acupuncture in patients with chronic pain, Pain 128 (3) (2007) 264–271.
Lundeberg, T.; Lund, I., Did ‘The Princess on the Pea’ suffer from fibromyalgia syndrome? The influence on sleep and the effects of acupuncture, Acupunct. Med. 25 (4) (2007) 184–197.
Lundeberg, T.; Lund, I.; Naslund, J.; et al., The Emperors sham – wrong assumption that sham needling is sham, Acupunct. Med. 26 (4) (2008) 239–242.
Macdonald, A.J.R.; Macrae, K.D.; Master, B.R.; et al., Superficial acupuncture in the relief of chronic low back pain, Ann. R. Coll. Surg. Engl. 65 (1983) 44–46.
MacPherson, H.; Schroer, S., Acupuncture as a complex intervention for depression: A consensus method to develop a standardised treatment protocol for a randomised controlled trial, Complement. Ther. Med. 15 (2) (2007) 92–100.
MacPherson, H.; White, A.; Cummings, M.; et al., Standards for reporting interventions in controlled trials of acupuncture: The STRICTA recommendations. STandards for Reporting Interventions in Controlled Trials of Acupuncture, Acupunct. Med. 20 (1) (2002) 22–25.
Marcus, P., Towards a dose of acupuncture, Acupunct. Med. 12 (2) (1994) 78–82.
Margolin, A.; Avants, K.; Kleber, H., Investigating alternative medicine therapies in randomized controlled trials, JAMA 280 (1998) 1626–1627.
Melchart, D.; Streng, A.; Hoppe, A.; et al., Acupuncture in patients with tension-type headache: randomised controlled trial, BMJ 331 (7513) (2005) 376–382.
Molsberger, A.F.; Boewing, G.; Diener, H.C.; et al., Designing an acupuncture study: the nationwide, randomized, controlled, German acupuncture trials on migraine and tension-type headache, J. Altern. Complement. Med. 12 (3) (2006) 237–245.
Molsberger, A.F.; Streitberger, K.; Kraemer, J.; et al., Designing an acupuncture study: II. The nationwide, randomized, controlled German acupuncture trials on low-back pain and gonarthrosis, J. Altern. Complement. Med. 12 (8) (2006) 733–742.
Moore, M.E.; Berk, S.N., Acupuncture for chronic shoulder pain, Ann. Intern. Med. 84 (4) (1976) 381–384.
Napadow, V.; Liu, J.; Kaptchuk, T.J., A systematic study of acupuncture practice: acupoint usage in an outpatient setting in Beijing, China, Complement. Ther. Med. 12 (4) (2004) 209–216.
Pariente, J.; White, P.; Frackowiak, R.S.; et al., Expectancy and belief modulate the neuronal substrates of pain treated by acupuncture, Neuroimage 25 (4) (2005) 1161–1167.
Park, J.; Bang, H.; Canette, I., Blinding in clinical trials, time to do it better, Complement. Ther. Med. 16 (3) (2008) 121–123.
Park, J.; White, A.R.; Lee, H.; et al., Development of a new sham needle, Acupunct. Med. 17 (1999) 110–112.
Petrie, J.P.; Hazleman, B.L., Credibility of placebo transcutaneous nerve stimulation and acupuncture, Clin. Exp. Rheumatol. 3 (1985) 151–153.
Ratcliffe, J.; Thomas, K.J.; MacPherson, H.; et al., A randomised controlled trial of acupuncture care for persistent low back pain: cost effectiveness analysis, BMJ 333 (7569) (2006) 626–628A.
Reinhold, T.; Witt, C.M.; Jena, S.; et al., Quality of life and cost-effectiveness of acupuncture treatment in patients with osteoarthritis pain, Eur. J. Health Econ. 9 (3) (2008) 209–219.
Scharf, H.P.; Mansmann, U.; Streitberger, K.; et al., Acupuncture and knee osteoarthritis – a three-armed randomized trial, Ann. Intern. Med. 145 (1) (2006) 12–20.
Schnyer, R.N.; Allen, J.J., Bridging the gap in complementary and alternative medicine research: manualization as a means of promoting standardization and flexibility of treatment in clinical trials of acupuncture, J. Altern. Complement. Med. 8 (5) (2002) 623–634.
Streitberger, K.; Kleinhenz, J., Introducing a placebo needle into acupuncture research, Lancet 352 (1998) 364–365.
Takakura, N.; Yajima, H., A double-blind placebo needle for acupuncture research, BMC Complement. Altern. Med. 7 (2007) 31.
Vincent, C., Credibility assessments in trials of acupuncture, Complementary Medical Research 4 (1990) 8–11.
Vincent, C.; Lewith, G., Placebo controls for acupuncture studies, J. R. Soc. Med. 88 (1995) 199–202.
Waite, N.R.; Clough, J.B., A single-blind, placebo-controlled trial of a simple acupuncture treatment in the cessation of smoking, Br. J. Gen. Pract. 48 (433) (1998) 1487–1490.
Webster-Harrison, P.; White, A.; Rae, J., Acupuncture for tennis elbow: an E-mail consensus study to define a standardised treatment in a GPs’ surgery, Acupunct. Med. 20 (4) (2002) 181–185.
White, A.R.; Eddleston, C.; Hardie, R.; et al., A pilot study of acupuncture for tension headache, using a novel placebo, Acupunct. Med. 14 (1) (1996) 11–15.
White, P.; Golianu, B.; Zaslawski, C.; et al., Standardization of nomenclature in acupuncture research (SONAR), eCAM 4 (2) (2006) 267–270.
White, A.; Foster, N.E.; Cummings, M.; et al., Acupuncture treatment for chronic knee pain: a systematic review, Rheumatology (Oxford) 46 (3) (2007) 384–390.
White, A.; Cummings, M.; Barlas, P.; et al., Defining an adequate dose of acupuncture using a neurophysiological approach – a narrative review of the literature, Acupunct. Med. 26 (2) (2008) 111–120.
Williamson, L.; Yudkin, P.; Livingstone, R.; et al., Hay fever treatment in general practice: a randomised controlled trial comparing standardised Western acupuncture with sham acupuncture, Acupunct. Med. 14 (1) (1996) 6–10.
Willich, S.N.; Reinhold, T.; Selim, D.; et al., Cost-effectiveness of acupuncture treatment in patients with chronic neck pain, Pain 125 (1–2) (2006) 107–113.
Witt, C.; Brinkhaus, B.; Jena, S.; et al., Acupuncture in patients with osteoarthritis of the knee: a randomised trial, Lancet 366 (9480) (2005) 136–143.
Witt, C.M.; Brinkhaus, B.; Reinhold, T.; et al., Efficacy, effectiveness, safety and costs of acupuncture for chronic pain – results of a large research initiative, Acupunct. Med. 24 (Suppl) (2006) S33–S39.
Witt, C.M.; Jena, S.; Selim, D.; et al., Pragmatic randomized trial evaluating the clinical and economic effectiveness of acupuncture for chronic low back pain, Am. J. Epidemiol. 164 (5) (2006) 487–496.
Witt, C.M.; Reinhold, T.; Brinkhaus, B.; et al., Acupuncture in patients with dysmenorrhea: a randomized study on clinical effectiveness and cost-effectiveness in usual care, Am. J. Obstet. Gynecol. 198 (2) (2008) 166–168.
Witt, C.M.; Reinhold, T.; Jena, S.; et al., Cost-effectiveness of acupuncture treatment in patients with headache, Cephalalgia 28 (4) (2008) 334–345.
Wonderling, D.; Vickers, A.J.; Grieve, R.; et al., Cost effectiveness analysis of a randomised trial of acupuncture for chronic headache in primary care, BMJ 328 (7442) (2004) 747–749.
Wood, R.; Lewith, G., The credibility of placebo controls in acupuncture studies, Complement. Ther. Med. 6 (1998) 79.
Zaslawski, C.; Rogers, C.; Garvey, M.; et al., Strategies to maintain the credibility of sham acupuncture used as a control treatment in clinical trials, J. Altern. Complement. Med. 3 (1997) 257–266.
Zaslawski, C.J.; Cobbin, D.; Lidums, E.; et al., The impact of site specificity and needle manipulation on changes to pain pressure threshold following manual acupuncture: a controlled study, Complement. Ther. Med. 11 (1) (2003) 11–21.
Further reading
Altman, D.G., Practical Statistics for Medical Research. (1991) Chapman & Hall, London.
MacPherson
White, A.; Cummings, M.; Filshie, J., An Introduction to Western Medical Acupuncture. (2008) Churchill Livingstone Elsevier, Edinburgh.