Evaluation of drugs in humans

Published on 02/03/2015 by admin

Filed under Basic Science

Last modified 22/04/2025


Chapter 4 Evaluation of drugs in humans

Synopsis

This chapter is about evidence-based drug therapy.

New drugs are progressively introduced by clinical pharmacological studies in rising numbers of healthy and/or patient volunteers until sufficient information has been gained to justify formal therapeutic studies. Each of these is usually a randomised controlled trial, in which a precisely framed question is posed and answered by treating equivalent groups of patients in different ways.

The keys to the ethics of such studies are informed consent from patients, efficient scientific design and review by an independent research ethics committee. The key interpretative factors in the analysis of trial results are calculations of confidence intervals and statistical significance. Potential clinical significance develops within the confines of controlled clinical trials. This is best expressed by stating not only the percentage differences, but also the absolute difference or its reciprocal, the number of patients who have to be treated to obtain one desired outcome. The outcome might include both efficacy and safety.
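As a minimal sketch of the arithmetic just described, the following Python fragment derives the percentage (relative) difference, the absolute difference and its reciprocal, the number needed to treat, from event counts. The function name and the trial figures are hypothetical and are used only for illustration.

```python
import math
from fractions import Fraction


def treatment_effect(control_events, control_total, treated_events, treated_total):
    """Return (relative risk reduction, absolute risk reduction, number needed to treat)."""
    control_rate = Fraction(control_events, control_total)
    treated_rate = Fraction(treated_events, treated_total)
    absolute_reduction = control_rate - treated_rate
    if absolute_reduction <= 0:
        raise ValueError("no benefit: the number needed to treat is undefined")
    relative_reduction = absolute_reduction / control_rate
    # Round up: a fraction of a patient cannot be treated.
    number_needed_to_treat = math.ceil(1 / absolute_reduction)
    return relative_reduction, absolute_reduction, number_needed_to_treat


if __name__ == "__main__":
    # Hypothetical trial: the event occurs in 100 of 1000 control patients
    # and in 80 of 1000 treated patients.
    rrr, arr, nnt = treatment_effect(100, 1000, 80, 1000)
    print(f"relative reduction {float(rrr):.0%}, "
          f"absolute reduction {float(arr):.1%}, NNT {nnt}")
```

In this invented example a 20% relative reduction corresponds to an absolute reduction of only 2 percentage points, so 50 patients have to be treated to prevent one event.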

Surveillance studies and the reporting of spontaneous adverse reactions respectively determine the clinical profile of the drug and detect rare adverse events. Further trials to compare new medicines with existing medicines are also required. These form the basis of cost–effectiveness comparisons.

Topics include:

Experimental therapeutics

After preclinical evidence of efficacy and safety has been obtained in animals, potential medicines are tested in healthy volunteers and volunteer patients. Studies in healthy normal volunteers can help to determine the safety, tolerability, pharmacokinetics and, for some drugs (e.g. anticoagulants and anaesthetic agents), their dynamic effect and likely dose range. For many drugs, the dynamic effect and hence therapeutic potential can be investigated only in patients, e.g. drugs for parkinsonism and antimicrobials will have no measurable efficacy in subjects without movement disorder or infection, respectively.

Modern medicine is sometimes accused of callous application of science to human problems and of subordinating the interest of the individual to those of the group (society).1 Official regulatory bodies rightly require scientific evaluation of drugs. Drug developers need to satisfy the official regulators and they also seek to persuade the medical profession to prescribe their products. Patients, too, are more aware of the comparative advantages and limitations of their medicines than they used to be. To some extent, this helps encourage patients to participate in trials so that future patients can benefit, as they do now, from the knowledge gained from such trials. An ethical framework is required to ensure that the interests of the individual participant take precedence over those of society (and, more obviously, those of an individual or corporate investigator).

Ethics of research in humans4

Some dislike the word ‘experiment’ in relation to humans, thinking that its mere use implies a degree of impropriety in what is done. It is better that all should recognise from the true meaning of the word, ‘to ascertain or establish by trial’,5 that the benefits of modern medicine derive almost wholly from experimentation and that some risk is inseparable from much medical advance.

The issue of (adequately informed) consent is a principal concern for Research Ethics Committees (also called Institutional Review Boards). People have the right to choose for themselves whether or not they will participate in research, i.e. they have the right to self-determination (the ethical principle of autonomy). They should be given whatever information is necessary for making an adequately informed choice (consent) with the right to withdraw at any stage. Consent procedures, especially information on risks, loom larger in research than they do in medical practice. This is appropriate given that in research, patients may be submitting themselves to extra risks, or simply to extra inconvenience (e.g. more or longer visits). It is a moot point whether more consent in routine practice might not go amiss. It is also likely that patients participating in well-conducted trials receive more, and sometimes better, care and attention than might otherwise be available. Sometimes the unintended consequences of ethical procedures include causing unnecessary apprehension to patients with long, legalistic documents, and creating a false impression of clinical researchers as people from whom patients need protection.

The moral obligation of all doctors lies in ensuring that in their desire to help patients (the ethical principle of beneficence) they should never allow themselves to put the individual who has sought their aid at any disadvantage (the ethical principle of non-maleficence) for ‘the scientist or physician has no right to choose martyrs for society’.6

In principle, it may be thought proper to perform a therapeutic trial only when doctors (and patients) have genuine uncertainty as to which treatment is best.7 Not all trials are comparisons of different treatments. Some, especially early phase trials of new drugs, are comparisons of different doses. Comparisons of new with old should usually offer patients the chance of receiving either current best treatment or one which might be better. Since this is often rather more than is offered in resource-constrained routine care, the obligatory patient information sheet mantra that ‘the decision whether to take part has no bearing on your usual care’ may be economical with the truth. But it is also simplistic to view the main purpose of all trials with medicines as comparative.

The past decade has seen the pharmaceutical industry struggle to match the pace of new understanding about disease pathogenesis, and models of research are being adapted to the complexity of common disease that is now apparent. In diseases where many good medicines already exist, the industry spent much time developing minor modifications which were broadly equivalent to current therapy with possible advantages for some patients. With many of the standard blockbusters now off patent, new drugs for such diseases are unattractive, and the industry is concentrating more on harder therapeutic targets where no satisfactory treatment yet exists. Just as in basic science, non-hypothesis-led ‘fishing expedition’ research – genome scans, microarrays – is no longer frowned upon, so the imaginative clinical investigator must throw his stone – a new medicine – into the pond, and be able to make sense of the ripples. One such approach is to move away from trial design in which it is the average response of the group which is of interest towards the design in which the investigator attempts to match differences in response to differences – ethnic, gender, genetic – among the patients. Matches at a molecular level give clues both to how the drug may best be used, and who will benefit most.

The ethics of the randomised and placebo-controlled trial

Providing that ethical surveillance is rooted in the ethical principles of justice,8 there should be no difficulty in clinical research adapting to current needs. And even if the nature of early phase research is changing, the randomised controlled trial will remain the cornerstone of how cause and effect is proven in clinical practice, and how drugs demonstrate the required degree of efficacy and safety to obtain a licence for their prescription.

The use of a placebo (or dummy) raises both ethical and scientific issues (see placebo medicines and the placebo effect, Ch. 2). There are clear-cut cases when placebo use would be ethically unacceptable and scientifically unnecessary, e.g. drug trials in epilepsy and tuberculosis, when the control groups comprise patients receiving the best available therapy.

The pharmacologically inert (placebo) treatment arm of a trial is useful:

To distinguish the pharmacodynamic effects of a drug from the psychological effects of the act of medication and the circumstances surrounding it, e.g. increased interest by the doctor, more frequent visits, for these latter may have their placebo effect. Placebo responses have been reported in 30–50% of patients with depression and in 30–80% with chronic stable angina pectoris.

To distinguish drug effects from natural fluctuations in disease that occur with time, e.g. with asthma or hay fever, and other external factors, provided active treatment, if any, can be ethically withheld. This is also called the ‘assay sensitivity’ of the trial.

To avoid false conclusions. The use of placebos is valuable in Phase 1 healthy volunteer studies of novel drugs to help determine whether minor but frequently reported adverse events are drug related or not. Although a placebo treatment can pose ethical problems, it is often preferable to the continued use of treatments of unproven efficacy or safety. The ethical dilemma of subjects suffering as a result of receiving a placebo (or ineffective drug) can be overcome by designing clinical trials that provide mechanisms to allow them to be withdrawn (‘escape’) when defined criteria are reached, e.g. blood pressure above levels that represent treatment failure. Similarly, placebo (or new drug) can be added against a background of established therapy; this is called the ‘add on’ design.

To provide a result using fewer research subjects. The difference in response when a test drug is compared with a placebo is likely to be greater than that when a test drug is compared with the best current, i.e. active, therapy (see later).

Investigators who propose to use a placebo, or otherwise withhold effective treatment, should justify their intention. The variables to consider are:

The severity of the disease.

The effectiveness of standard therapy.

Whether the novel drug under test aims to give only symptomatic relief, or has the potential to prevent or slow up an irreversible event, e.g. stroke or myocardial infarction.

The length of treatment.

The objective of the trial (equivalence, superiority or non-inferiority; see p. 45). Thus it may be quite ethical to compare a novel analgesic against placebo for 2 weeks in the treatment of osteoarthritis of the hip (with escape analgesics available). It would not be ethical to use a placebo alone as comparator in a 6-month trial of a novel drug in active rheumatoid arthritis, even with escape analgesia.

The precise use of the placebo will depend on the study design, e.g. whether crossover, when all patients receive placebo at some point in the trial, or parallel group, when only one cohort receives placebo. Generally, patients easily understand the concept of distinguishing between the imagined effects of treatment and those due to a direct action on the body. Provided research subjects are properly informed and give consent freely, they are not the subject of deception in any ethical sense; but a patient given a placebo in the absence of consent is deceived and research ethics committees will, rightly, decline to agree to this. (See also: Lewis et al (2002) in Guide to further reading, at the end of this chapter.)

Rational introduction of a new drug to humans

When studies in animals predict that a new molecule may be a useful medicine, i.e. effective and safe in relation to its benefits, then the time has come to put it to the test in humans. Most doctors will be involved in clinical trials at some stage of their career and need to understand the principles of drug development. When a new chemical entity offers a possibility of doing something that has not been done before or of doing something familiar in a different or better way, it can be seen to be worth testing. But where it is a new member of a familiar class of drug, potential advantage may be harder to detect. Yet these ‘me too’ drugs are often worth testing. Prediction from animal studies of modest but useful clinical advantage is particularly uncertain and, therefore, if the new drug seems reasonably effective and safe in animals it is rational to test it in humans. From the commercial standpoint, the investment in the development of a new drug can be over £500 million, but will be substantially less for a ‘me too’ drug entering an already developed and profitable market.

Phases of clinical development

Human experiments progress in a commonsense manner that is conventionally divided into four phases (Fig. 4.1). These phases are divisions of convenience in what is a continuous expanding process. It begins with a small number of subjects (healthy subjects and volunteer patients) closely observed in laboratory settings, and proceeds through hundreds of patients, to thousands before the drug is agreed to be a medicine by a national or international regulatory authority. It is then licensed for general prescribing (though this is by no means the end of the evaluation). The process may be abandoned at any stage for a variety of reasons, including poor tolerability or safety, inadequate efficacy and commercial pressures. The phases are:


Fig. 4.1 The phases of drug discovery and development.

(With permission of Pharmaceutical Research and Manufacturers of America.)

Official regulatory guidelines and requirements12

For studies in humans (see also Ch. 6) these ordinarily include:

Studies of pharmacokinetics and bioavailability and, in the case of generics, bioequivalence (equal bioavailability) with respect to the reference product.

Therapeutic trials (reported in detail) that substantiate the safety and efficacy of the drug under likely conditions of use, e.g. a drug for long-term use in a common condition will require a total of at least 1000 patients (preferably more), depending on the therapeutic class, of which (for chronic diseases) at least 100 have been treated continuously for about 1 year.

Special groups. If the drug will be used in, for example, the elderly or children, then these populations should be studied. New drugs are not normally studied in pregnant women. Studies in patients having disease that affects drug metabolism and elimination may be needed, such as patients with impaired liver or kidney function.

Fixed-dose combination products will require explicit justification for each component.

Interaction studies with other drugs likely to be taken simultaneously. Plainly, all possible combinations cannot be evaluated; a rational choice, based on knowledge of pharmacodynamics and pharmacokinetics, is made.

The application for a licence for general use (marketing application) should include a draft Summary of Product Characteristics for prescribers. A Patient Information Leaflet must be submitted. These should include information on the form of the product (e.g. tablet, capsule, sustained-release, liquid), its uses, dosage (adults, children, elderly where appropriate), contraindications, warnings and precautions (less strong), side-effects/adverse reactions, overdose and how to treat it.

The emerging discipline of pharmacogenomics seeks to identify patients who will respond beneficially or adversely to a new drug by defining certain genotypic profiles. Individualised dosing regimens may be evolved as a result (see p. 101). This tailoring of drugs to individuals is consuming huge resources from drug developers but has yet to establish a place in routine drug development.

Therapeutic investigations

With few exceptions, none of these is easy to answer definitively within the confines of a pre-registration clinical trials programme. Effectiveness and safety have to be balanced against each other. What may be regarded as ‘safe’ for a new oncology drug in advanced lung cancer would not be so regarded in the treatment of childhood eczema. The use of the term ‘dose’, without explanation, is irrational as it implies a single dose for all patients. Pharmaceutical companies cannot be expected to produce a large array of different doses for each medicine, but the maxim to use the smallest effective dose that results in the desired effect holds true. Some drugs require titration, others have a wide safety margin so that one ‘high’ dose may achieve optimal effectiveness with acceptable safety. There are two classes of endpoint or outcome of a therapeutic investigation:

Use of surrogate effects presupposes that the disease process is fully understood. They are best justified in diseases for which the true therapeutic effect can be measured only by studying large numbers of patients over many years. Such long-term outcome studies are indeed always preferable but may be impracticable on organisational, financial and sometimes ethical grounds prior to releasing new drugs for general prescription. It is in areas such as these that the techniques of large-scale surveillance for efficacy, as well as for safety, under conditions of ordinary use (below), would be needed to supplement the necessarily smaller and shorter formal therapeutic trials employing surrogate effects. Surrogate endpoints are of particular value in early drug development to select candidate drugs from a range of agents.

Therapeutic evaluation

The aims of therapeutic evaluation are three-fold:

The process of therapeutic evaluation may be divided into pre- and post-registration phases (Table 4.1), the purposes of which are set out below.

When a new drug is being developed, the first therapeutic trials are devised to find out the best that the drug can do under conditions ideal for showing efficacy, e.g. uncomplicated disease of mild to moderate severity in patients taking no other drugs, with carefully supervised administration by specialist doctors. Interest lies particularly in patients who complete a full course of treatment. If the drug is ineffective in these circumstances there is no point in proceeding with an expensive development programme. Such studies are sometimes called explanatory trials as they attempt to ‘explain’ why a drug works (or fails to work) in ideal conditions.

If the drug is found useful in these trials, it becomes desirable next to find out how closely the ideal may be approached in the rough and tumble of routine medical practice: in patients of all ages, at all stages of disease, with complications, taking other drugs and relatively unsupervised. Interest continues in all patients from the moment they are entered into the trial and it is maintained if they fail to complete, or even to start, the treatment; the need is to know the outcome in all patients deemed suitable for therapy, not only in those who successfully complete therapy.14

The reason some drop out may be related to aspects of the treatment and it is usual to analyse these according to the clinicians’ initial intention (intention-to-treat analysis), i.e. investigators are not allowed to risk introducing bias by exercising their own judgement as to who should or should not be excluded from the analysis. In these real-life, or ‘naturalistic’, conditions the drug may not perform so well, e.g. minor adverse effects may now cause patient non-compliance, which had been avoided by supervision and enthusiasm in the early trials. These naturalistic studies are sometimes called ‘pragmatic’ trials.
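The difference between an intention-to-treat analysis and one restricted to patients who complete treatment can be sketched as follows. The patient records are invented, and counting every drop-out as a treatment failure is an assumption made here for simplicity, not a rule taken from the text.

```python
from fractions import Fraction

# Hypothetical records: (assigned_arm, completed_treatment, improved).
# Drop-outs are recorded as not improved (an illustrative assumption).
PATIENTS = (
    [("drug", True, True)] * 5
    + [("drug", True, False)] * 1
    + [("drug", False, False)] * 4
    + [("placebo", True, True)] * 3
    + [("placebo", True, False)] * 6
    + [("placebo", False, False)] * 1
)


def response_rate(patients, arm, completers_only=False):
    """Proportion improved among patients assigned to `arm`.

    With completers_only=False every randomised patient is counted
    (intention to treat); with True only those who finished treatment are.
    """
    group = [p for p in patients if p[0] == arm and (p[1] or not completers_only)]
    return Fraction(sum(1 for p in group if p[2]), len(group))


if __name__ == "__main__":
    for arm in ("drug", "placebo"):
        itt = response_rate(PATIENTS, arm)
        completers = response_rate(PATIENTS, arm, completers_only=True)
        print(f"{arm}: intention-to-treat {float(itt):.0%}, completers only {float(completers):.0%}")
```

In these made-up data the drug looks considerably better when only completers are counted, because the patients who dropped out of the drug arm, perhaps through adverse effects, have been quietly excluded.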

The methods used to test the therapeutic value depend on the stage of development, who is conducting the study (a pharmaceutical company, or an academic body or health service at the behest of a regulatory authority), and the primary endpoint or outcome of the trial. The methods include:

Formal therapeutic trials are conducted during Phase 2 and Phase 3 of pre-registration development, and in the post-registration phase to test the drug in new indications. Equivalence trials aim to show the therapeutic equivalence of two treatments, usually the new drug under development and an existing drug used as a standard active comparator. Equivalence trials may be conducted before or after registration for the first therapeutic indication of the new drug (see p. 46 below for further discussion). Safety surveillance methods use the principles of pharmacoepidemiology (see p. 51) and are concerned mainly with evaluating adverse events and especially rare events, which formal therapeutic trials are unlikely to detect.

Need for statistics

In order truly to know whether patients treated in one way are benefited more than those treated in another, it is essential to use numbers. Statistics has been defined as ‘a body of methods for making wise decisions in the face of uncertainty’.15 Used properly, they are tools of great value for promoting efficient therapy. More than 100 years ago Francis Galton saw this clearly:

Concepts and terms

Hypothesis of no difference

When it is suspected that treatment A may be superior to treatment B, and the truth is sought, it is convenient to start with the proposition that the treatments are equally effective – the ‘no difference’ hypothesis (null hypothesis). After two groups of patients have been treated and it has been found that improvement has occurred more often with one treatment than with the other, it is necessary to decide how likely it is that this difference is due to a real superiority of one treatment over the other.

To make this decision we need to understand two major concepts, statistical significance and confidence intervals.
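The first of these concepts can be illustrated with a short sketch: a two-sided P value for the difference between two proportions, calculated under the null hypothesis with a normal approximation. The choice of test, the function name and the figures are assumptions made for illustration; other tests (e.g. chi-squared or Fisher's exact test) might equally be used.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(improved_a, total_a, improved_b, total_b):
    """Two-proportion z-test: probability of a difference at least this large
    arising by chance if the treatments were truly equally effective."""
    rate_a = improved_a / total_a
    rate_b = improved_b / total_b
    # Under the null hypothesis both groups share one underlying rate.
    pooled = (improved_a + improved_b) / (total_a + total_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if standard_error == 0:
        return 1.0  # all or none improved in both groups: no evidence of a difference
    z = (rate_a - rate_b) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Hypothetical results: 85% v. 75% improved, in small and in large groups.
    print(f"100 per group:  P = {two_sided_p_value(85, 100, 75, 100):.3f}")
    print(f"1000 per group: P = {two_sided_p_value(850, 1000, 750, 1000):.2g}")
```

With 100 patients per group the invented 10 percentage point difference gives a P value of about 0.08, which misses the conventional 0.05 level; the same proportions in groups of 1000 are highly significant.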

Confidence intervals

The problem with the P value is that it conveys no information on the amount of the differences observed or on the range of possible differences between treatments. A result that a drug produces a uniform 2% reduction in heart rate may well be statistically significant but it is clinically meaningless. What doctors are interested to know is the size of the difference, and what degree of assurance (confidence) they may have in the precision (reproducibility) of this estimate. To obtain this it is necessary to calculate a confidence interval (see Figs 4.2 and 4.3).18

A confidence interval expresses a range of values that contains the true value with 95% (or other chosen percentage) certainty. The range may be broad, indicating uncertainty, or narrow, indicating (relative) certainty. A wide confidence interval occurs when numbers are small or differences observed are variable and points to a lack of information, whether the difference is statistically significant or not; it is a warning against placing much weight on (or confidence in) the results of small or variable studies. Confidence intervals are extremely helpful in interpretation, particularly of small studies, as they show the degree of uncertainty related to a result. Their use in conjunction with non-significant results may be especially enlightening.19

A finding of ‘not statistically significant’ can be interpreted as meaning there is no clinically useful difference only if the confidence intervals for the results are also stated in the report and are narrow. If the confidence intervals are wide, a real difference may be missed in a trial with a small number of subjects, i.e. the absence of evidence that there is a difference is not the same as showing that there is no difference. Small numbers of patients inevitably give low precision and low power to detect differences.
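A minimal sketch of such a calculation is given below for the difference between two proportions, using the simple normal (Wald) approximation. The function name and the trial figures are hypothetical, and more refined interval methods exist for small samples.

```python
from math import sqrt
from statistics import NormalDist


def difference_confidence_interval(improved_a, total_a, improved_b, total_b, level=0.95):
    """Confidence interval for (rate in group A) minus (rate in group B)."""
    rate_a = improved_a / total_a
    rate_b = improved_b / total_b
    standard_error = sqrt(rate_a * (1 - rate_a) / total_a + rate_b * (1 - rate_b) / total_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    difference = rate_a - rate_b
    return difference - z * standard_error, difference + z * standard_error


if __name__ == "__main__":
    # Hypothetical results: 85% v. 75% improved.
    for per_group in (100, 1000):
        low, high = difference_confidence_interval(
            85 * per_group // 100, per_group, 75 * per_group // 100, per_group
        )
        print(f"{per_group} per group: difference 10%, 95% CI {low:+.1%} to {high:+.1%}")
```

With 100 patients per group the interval runs from about −1% to +21%: it includes zero, so the result is not statistically significant, yet it also includes differences large enough to matter clinically. This is the situation in which absence of evidence of a difference must not be read as evidence of no difference. With 1000 per group the interval narrows and excludes zero.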

Types of error

The above discussion provides us with information on the likelihood of falling into one of the two principal kinds of error in therapeutic experiments, for the hypothesis that there is no difference between treatments may either be accepted incorrectly or rejected incorrectly.

Types of therapeutic trial

A therapeutic trial is:

This is the classical randomised controlled trial (RCT), the most secure method for drawing a causal inference about the effects of treatments. Randomisation attempts to control biases of various kinds when assessing the effects of treatments. RCTs are employed at all phases of drug development and in the various types and designs of trials discussed below. Fundamental to any trial are:

Other factors to consider when designing or critically appraising a trial are:

The aims of a therapeutic trial, not all of which can be attempted at any one occasion, are to decide:

Design of trials

Techniques to avoid bias

The two most important techniques are:

Blinding

The fact that both doctors and patients are subject to bias due to their beliefs and feelings has led to the invention of the double-blind technique, which is a control device to prevent bias from influencing results. On the one hand, it rules out the effects of hopes and anxieties of the patient by giving both the drug under investigation and a placebo (dummy) of identical appearance in such a way that the subject (the first ‘blind’ person) does not know which he or she is receiving. On the other hand, it also rules out the influence of preconceived hopes of, and unconscious communication by, the investigator or observer by keeping him or her (the second ‘blind’ person) ignorant of whether he or she is prescribing a placebo or an active drug. At the same time, the technique provides another control, a means of comparison with the magnitude of placebo effects. The device is both philosophically and practically sound.24

A non-blind trial is called an open trial.

The double-blind technique should be used wherever possible, and especially for occasions when it might at first sight seem that criteria of clinical improvement are objective when in fact they are not. For example, the range of voluntary joint movement in rheumatoid arthritis has been shown to be influenced greatly by psychological factors, and a moment’s thought shows why, for the amount of pain patients will put up with is influenced by their mental state.

Blinding should go beyond the observer and the observed. None of the investigators should be aware of treatment allocation, including those who evaluate endpoints, assess compliance with the protocol and monitor adverse events. Breaking the blind (for a single subject) should be considered only when the subject’s physician deems knowledge of the treatment assignment essential in the subject’s best interests.

Sometimes the double-blind technique is not possible, because, for example, side-effects of an active drug reveal which patients are taking it or tablets look or taste different; but it never carries a disadvantage (‘only protection against biased data’). It is not, of course, used with new chemical entities fresh from the animal laboratory, whose dose and effects in humans are unknown, although the subject may legitimately be kept in ignorance (single blind) of the time of administration. Single-blind techniques have a place in therapeutics research, but only when the double-blind procedure is impracticable or unethical.

Ophthalmologists are understandably disinclined to refer to the ‘double-blind’ technique; they call it double-masked.

Some common design configurations

Size of trials

Before the start of any controlled trial it is necessary to decide the number of patients that will be needed to deliver an answer, for ethical as well as practical reasons. This is determined by four factors:

It will be intuitively obvious that a small difference in the effect that can be detected between two treatment groups, or a large variability in the measurement of the primary endpoint, or a high significance level (low P value) or a large power requirement, all act to increase the required sample size. Figure 4.3 gives a graphical representation of how the power of a clinical trial relates to values of clinically relevant standardised difference for varying numbers of trial subjects (shown by the individual curves). It is clear that the larger the number of subjects in a trial, the smaller is the difference that can be detected for any given power value.
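The relation between power, standardised difference and number of subjects can be sketched with a normal approximation for a two-sided comparison of two equal-sized groups. The formula and function name below are illustrative assumptions and are not taken from the figure itself.

```python
from math import sqrt
from statistics import NormalDist


def power(standardised_difference, subjects_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    standardised_difference is the true difference divided by the standard
    deviation of the measurement (assumed known and equal in both groups).
    """
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    shift = standardised_difference * sqrt(subjects_per_group / 2)
    # Probability of exceeding the critical value in either direction.
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    for n in (16, 64, 256):
        row = ", ".join(f"d={d}: {power(d, n):.2f}" for d in (0.25, 0.5, 1.0))
        print(f"n={n} per group -> {row}")
```

Each printed row corresponds to one curve of such a graph: for a given power, increasing the number of subjects makes a smaller standardised difference detectable.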

The aim of any clinical trial is to have small Type I and II errors, and consequently sufficient power to detect a difference between treatments, if it exists. Of the four factors that determine sample size, the power and significance level are chosen to suit the level of risk felt to be appropriate. The magnitude of the effect can be estimated from previous experience with drugs of the same or similar action; the variability of the measurements is often known from published experiments on the primary endpoint, with or without drug. These data will not be available for novel substances in a new class, and frequently the sample size in the early phase of development is chosen on a more arbitrary basis. Numbers required to detect the difference in frequency of a categorical outcome, e.g. fractures in a trial of osteoporosis or remissions in a cancer trial, are generally larger than numbers required to detect differences in a continuous quantitative variable. As an example, a trial that would detect, at the 5% level of statistical significance, a treatment that raised a cure rate from 75% to 85% would require 500 patients for 80% power.
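The closing example can be checked with a standard normal-approximation formula for comparing two proportions. This is a sketch under stated assumptions: a two-sided test, equal group sizes and unpooled variances. The function name is hypothetical, and other formulae give slightly different answers.

```python
from math import ceil
from statistics import NormalDist


def patients_per_group(rate_control, rate_treated, alpha=0.05, power=0.80):
    """Approximate number of patients needed in each of two equal groups."""
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = normal.inv_cdf(power)
    variance_sum = rate_control * (1 - rate_control) + rate_treated * (1 - rate_treated)
    difference = rate_treated - rate_control
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / difference ** 2)


if __name__ == "__main__":
    n = patients_per_group(0.75, 0.85)
    print(f"{n} per group, {2 * n} in total")
```

For a cure rate raised from 75% to 85% this gives 248 patients per group, 496 in total, in line with the figure of 500 quoted above. Halving the difference to be detected roughly quadruples the number required.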

Fixed sample size and sequential designs

Defining when a clinical trial should end is not as simple as it first appears. In the standard clinical trial the end is defined by the passage of all of the recruited subjects through the complete design. However, it is results and decisions based on the results that matter, not the number of subjects. The result of the trial may be that one treatment is superior to another or that there is no difference. These trials are of fixed sample size. In fact, patients are recruited sequentially, but the results are analysed at a fixed time-point.

The results of this type of trial may be disappointing if they miss the agreed and accepted level of significance.

It is not legitimate, having just failed to reach the agreed level (say, P = 0.05), to take in a few more patients in the hope that they will bring the P value down to 0.05 or less, for this is deliberately not allowing chance and the treatment to be the sole factors involved in the outcome, as they should be.

An alternative (or addition) to repeating the fixed sample size trial is to use a sequential design in which the trial is run until a useful result is reached.27 These adaptive designs, in which decisions are taken on the basis of results to date, can assess results on a continuous basis as data for each subject become available or, more commonly, on groups of subjects (group sequential design). The essential feature of these designs is that the trial is terminated when a predetermined result is attained and not when the investigator looking at the results thinks it appropriate. Reviewing results on a continuous or interim basis requires formal interim analysis and there are specific statistical methods for handling the data, which need to be agreed in advance. Group sequential designs are especially successful in large long-term trials of mortality or major non-fatal endpoints when safety must be monitored closely.

Such sequential designs recognise the reality of medical practice and provide a reasonable balance between statistical, medical and ethical needs. Interim analyses, however, reduce the power of statistical significance tests each time that they are performed, and carry a risk of a false-positive result if chance differences between groups are encountered before the scheduled end of a trial.
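The risk of a false-positive result from repeated, unadjusted looks at accumulating data can be demonstrated by simulation. The sketch below assumes two treatments that are truly identical, a normally distributed measurement with known standard deviation, and five equally spaced looks, each tested at the nominal 5% level; all of these settings, and the function name, are illustrative choices.

```python
import random
from statistics import NormalDist


def simulate_false_positive_rates(n_trials=2000, looks=5, per_look=20, alpha=0.05, seed=1):
    """Return (rate when testing once at the end, rate when testing at every look)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    significant_at_end = 0
    significant_at_any_look = 0
    for _ in range(n_trials):
        sum_a = sum_b = 0.0
        n = 0
        crossed = False
        z = 0.0
        for _ in range(looks):
            # Recruit another batch of patients to each group; there is no true difference.
            for _ in range(per_look):
                sum_a += rng.gauss(0.0, 1.0)
                sum_b += rng.gauss(0.0, 1.0)
            n += per_look
            # z statistic for the difference in means (standard deviation 1 in each group).
            z = (sum_a / n - sum_b / n) / (2.0 / n) ** 0.5
            if abs(z) > z_crit:
                crossed = True
        significant_at_end += abs(z) > z_crit
        significant_at_any_look += crossed
    return significant_at_end / n_trials, significant_at_any_look / n_trials


if __name__ == "__main__":
    at_end, at_any_look = simulate_false_positive_rates()
    print(f"single final analysis: {at_end:.3f}")
    print(f"stop at first 'significant' look: {at_any_look:.3f}")
```

A single final analysis gives a false-positive rate close to the nominal 5%, whereas declaring success at the first look that reaches P < 0.05 gives a rate approaching three times that. This is why the stopping rules for interim analyses have to be agreed in advance and use stricter thresholds.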

Meta-analysis

The two main outcomes for therapeutic trials are to influence clinical practice and, where appropriate, to make a successful claim for a drug with the regulatory authorities. Investigators are eternally optimistic and frequently plan their trials to look for large effects. Reality is different. The results of a planned (or unplanned) series of clinical trials may vary considerably for several reasons, but most significantly because the studies are too small to detect a treatment effect. In common but serious diseases such as cancer or heart disease, however, even small treatment effects can be important in terms of their total impact on public health. It may be unreasonable to expect dramatic advances in these diseases; we should be looking for small effects. Drug developers, too, should be interested not only in whether a treatment works, but also how well, and for whom.

The collecting together of a number of trials with the same objective in a systematic review28 and analysing the accumulated results using appropriate statistical methods is termed meta-analysis. The principles of a meta-analysis are that:

There are strong advocates and critics of the concept, its execution and interpretation. Arguments that have been advanced against meta-analysis are:

In practice, the analysis involves calculating an odds ratio for each trial included in the meta-analysis. This is the ratio of the number of patients experiencing a particular endpoint, e.g. death, to the number who do not, compared with the equivalent ratio for the control group. The number of deaths observed in the treatment group is then compared with the number to be expected if it is assumed that the treatment is ineffective, to give the observed minus expected statistic. The treatment effect across all trials in the analysis is then obtained by summing the ‘observed minus expected’ values of the individual trials to obtain the overall odds ratio. An odds ratio of 1.0 indicates that the treatment has no effect, an odds ratio of 0.5 indicates a halving and an odds ratio of 2.0 indicates a doubling of the risk that patients will experience the chosen endpoint.
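The ‘observed minus expected’ calculation can be sketched in a few lines. The chapter does not name the method; the sketch below assumes it is the one-step (Peto) fixed-effect approach, in which the pooled log odds ratio is the sum of the observed minus expected values divided by the sum of their hypergeometric variances. The function names and the trial counts are invented for illustration and are not the data of Figure 4.4.

```python
import math


def odds_ratio(events_t, n_t, events_c, n_c):
    """Odds ratio for one trial: odds of the endpoint on treatment / odds on control.

    Assumes no cell is empty (otherwise a division by zero occurs).
    """
    odds_t = events_t / (n_t - events_t)
    odds_c = events_c / (n_c - events_c)
    return odds_t / odds_c


def peto_pooled_odds_ratio(trials):
    """Pool trials given as (events_treated, n_treated, events_control, n_control).

    Returns (pooled_or, ci_low, ci_high) using the one-step Peto method
    (an assumption; the source describes only the O - E summation).
    """
    sum_o_minus_e = 0.0
    sum_v = 0.0
    for events_t, n_t, events_c, n_c in trials:
        n = n_t + n_c
        events = events_t + events_c
        # Events expected in the treated group if treatment were ineffective
        expected = n_t * events / n
        # Hypergeometric variance of the observed count under that assumption
        variance = n_t * n_c * events * (n - events) / (n ** 2 * (n - 1))
        sum_o_minus_e += events_t - expected
        sum_v += variance
    if sum_v == 0:
        raise ValueError("no informative trials (all variances are zero)")
    log_or = sum_o_minus_e / sum_v
    se = 1 / math.sqrt(sum_v)
    return (math.exp(log_or),
            math.exp(log_or - 1.96 * se),
            math.exp(log_or + 1.96 * se))


# Hypothetical counts: (events on treatment, n treated, events on control, n control)
example_trials = [(15, 100, 25, 100), (30, 250, 40, 250), (8, 60, 12, 60)]
pooled, low, high = peto_pooled_odds_ratio(example_trials)
print(f"pooled odds ratio {pooled:.2f} (95% CI {low:.2f} to {high:.2f})")
```

A pooled value below 1.0 with a confidence interval that excludes 1.0 would correspond to a diamond lying wholly to one side of the line of no effect in a plot such as Figure 4.4.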

From the position of drug development, the general requirement that scientific results have to be repeatable has been interpreted in the past by the Food and Drug Administration (the regulatory agency in the USA) to mean that two well-controlled studies are required to support a claim. But this requirement is itself controversial and its relation to a meta-analysis in the context of drug development is unclear.

In clinical practice, and in the era of cost-effectiveness, the use of meta-analysis as a tool to aid medical decision-making and to underpin ‘evidence-based medicine’ is here to stay.

Figure 4.4 shows detailed results from 11 trials in which antiplatelet therapy after myocardial infarction was compared with a control group. The number of vascular events per treatment group is shown in the second and third columns. In the fourth column, the point estimate of the odds ratio for each trial (the value most likely to have resulted from the study) is represented by a black square, together with its 95% confidence interval (CI).

The size of the square is proportional to the number of events. The diamond gives the point estimate and CI for overall effect.

Results: implementation

The way in which data from therapeutic trials are presented can influence doctors’ perceptions of the advisability of adopting a treatment in their routine practice.

Relative and absolute risk

The results of therapeutic trials are commonly expressed as the percentage reduction of an unfavourable (or percentage increase in a favourable) outcome, i.e. as the relative risk, and this can be very impressive indeed until the figures are presented as the number of individuals actually affected per 100 people treated, i.e. as the absolute risk.

Where a baseline risk is low, a statement of relative risk alone is particularly misleading as it implies large benefit where the actual benefit is small. Thus a reduction of risk from 2% to 1% is a 50% relative risk reduction, but it saves only one patient for every 100 patients treated. But where the baseline is high, say 40%, a 50% reduction in relative risk saves 20 patients for every 100 treated.

To make clinical decisions, readers of therapeutic studies need to know how many patients must be treated30 (and for how long) to obtain one desired result (the number needed to treat). This is the reciprocal of the absolute risk reduction.
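The arithmetic behind these measures is simple enough to sketch. The function below is illustrative only (its name and argument names are not from the chapter); it takes the risk of the unfavourable outcome in the control and treated groups and returns the relative risk reduction, the absolute risk reduction and the number needed to treat, using the two baseline risks quoted above.

```python
def risk_summary(control_risk, treated_risk):
    """Return (relative risk reduction, absolute risk reduction, number needed to treat).

    Risks are proportions between 0 and 1; the treated risk must be the lower one.
    """
    if not (0 <= treated_risk < control_risk <= 1):
        raise ValueError("expects 0 <= treated_risk < control_risk <= 1")
    arr = control_risk - treated_risk   # absolute risk reduction
    rrr = arr / control_risk            # relative risk reduction
    nnt = 1 / arr                       # number needed to treat (reciprocal of ARR)
    return rrr, arr, nnt


# The two baselines from the text: 2% reduced to 1%, and 40% reduced to 20%
for control, treated in [(0.02, 0.01), (0.40, 0.20)]:
    rrr, arr, nnt = risk_summary(control, treated)
    print(f"baseline {control:.0%}: RRR {rrr:.0%}, ARR {arr:.0%}, NNT about {nnt:.0f}")
```

Both cases have the same 50% relative risk reduction, yet the number needed to treat is about 100 at the low baseline and about 5 at the high one.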

Relative risk reductions can remain high (and thus make treatments seem attractive) even when susceptibility to the events being prevented is low (and the corresponding numbers needed to be treated are large). As a result, restricting the reporting of efficacy to just relative risk reductions can lead to great – and at times excessive – zeal in decisions about treatment for patients with low susceptibilities.31

A real-life example follows:

Whether a low incidence of adverse drug effects is acceptable becomes a serious issue in the context of absolute risk. Non-specialist doctors, particularly those in primary care, need and deserve clear and informative presentation of therapeutic trial results that measure the overall impact of a treatment on the patient’s life, i.e. on clinically important outcomes such as morbidity, mortality, quality of life, working capacity and days spent in hospital. Without it, they cannot adequately advise patients, who may themselves be misled by inappropriate use of statistical data in advertisements or on internet sites.

Pharmacoepidemiology

Pharmacoepidemiology is the study of the use and effects of drugs in large numbers of people. Some of the principles of pharmacoepidemiology are used to gain further insight into the efficacy, and especially the safety, of new drugs once they have passed from limited exposure in controlled therapeutic pre-registration trials to the looser conditions of their use in the community. Trials in this setting are described as observational because the groups to be compared are assembled from subjects who are, or who are not (the controls), taking the treatment in the ordinary way of medical care. These (Phase 4) trials are subject to greater risk of selection bias33 and confounding34 than experimental studies (randomised controlled trials) where entry and allocation of treatment are strictly controlled (increasing internal validity). Observational studies, nevertheless, come into their own when sufficiently large randomised trials are logistically and financially impracticable. The following approaches are used.

Case–control studies

This reverses the direction of scientific logic from a forward-looking, ‘what happens next’ (prospective) to a backward-looking, ‘what has happened in the past’ (retrospective)37 investigation. The case–control study requires a definite hypothesis or suspicion of causality, such as an adverse reaction to a drug. The investigator assembles a group of patients who have the condition. A control group of people who have not had the reaction is then assembled (matched, e.g. for sex, age, smoking habits) from hospital admissions for other reasons, primary care records or electoral rolls. A complete drug history is taken from each group, i.e. the two groups are ‘followed up’ backwards to determine the proportion in each group that has taken the suspect agent. Case–control studies do not prove causation.38 They reveal associations and it is up to investigators and critical readers to decide the most plausible explanation.

A case–control study has the advantage that it requires a much smaller number of cases (hundreds) of disease and can thus be done quickly and cheaply. It has the disadvantage that it follows up subjects backwards and there is always suspicion of the intrusion of unknown and so unavoidable biases in the selection of both patients and controls. Here again, independent repetition of the studies, if the results are the same, greatly enhances confidence in the outcome.
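The comparison of drug exposure between cases and controls is commonly summarised as an exposure odds ratio. The chapter does not prescribe a particular calculation, so the sketch below is an assumption: it uses the standard cross-product ratio from a 2 × 2 table with an approximate 95% confidence interval from the usual logarithmic (Woolf) method. The function name and the counts are hypothetical.

```python
import math


def exposure_odds_ratio(cases_exposed, cases_unexposed,
                        controls_exposed, controls_unexposed):
    """Odds ratio of exposure to the suspect drug in cases versus controls.

    Returns (odds_ratio, ci_low, ci_high). The interval uses the standard
    error of the log odds ratio and requires every cell to be non-zero.
    """
    a, b = cases_exposed, cases_unexposed
    c, d = controls_exposed, controls_unexposed
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    ratio = (a * d) / (b * c)                       # cross-product ratio
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(ratio) - 1.96 * se_log)
    high = math.exp(math.log(ratio) + 1.96 * se_log)
    return ratio, low, high


# Hypothetical drug histories: 40 of 100 cases and 20 of 100 controls took the drug
ratio, low, high = exposure_odds_ratio(40, 60, 20, 80)
print(f"exposure odds ratio {ratio:.2f} (95% CI {low:.2f} to {high:.2f})")
```

An odds ratio above 1.0 indicates only that exposure was more common among cases; as noted above, it reveals an association and does not by itself prove causation.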

Surveillance systems: pharmacovigilance

When a drug reaches the market, a good deal is known about its therapeutic activity but rather less about its safety when used in large numbers of patients with a variety of diseases, for which they are taking other drugs. The term pharmacovigilance refers to the process of identifying and responding to issues of drug safety through the detection in the community of drug effects, usually adverse. Over a number of years increasingly sophisticated systems have been developed to provide surveillance of drugs in the post-marketing phase. For understandable reasons, they are strongly supported by governments. The position has been put thus:

Four kinds of logic can be applied to drug safety monitoring:

Drug safety surveillance relies heavily on the techniques of pharmacoepidemiology, which include the following:

Voluntary reporting

Doctors, nurses, pharmacists and patients may report suspected adverse reactions to drugs. In the UK, this is called the ‘Yellow Card’ system and the Commission for Human Medicines advises the Medicines and Healthcare products Regulatory Agency of the government on trends and signals. It is recommended that for:

Inevitably the system depends on the intuitions and willingness of those called on to respond. Surveys suggest that no more than 10% of serious reactions are reported. Voluntary reporting is effective for identifying reactions that develop shortly after starting therapy, i.e. at providing early warnings of drug toxicity, particularly rare adverse reactions. Thus, it is the first line in post-marketing surveillance. Reporting is particularly low, however, for reactions with long latency, such as tardive dyskinesia from chronic neuroleptic use. As the system has no limit of quantitative sensitivity, it may detect the rarest events, e.g. those with an incidence of 1:5000 to 1:10 000. Voluntary systems are, however, unreliable for estimating the incidence of adverse reactions as this requires both a high rate of reporting (the numerator) and a knowledge of the rate of drug usage (the denominator).

Prescription event monitoring

This is a form of observational cohort study. Prescriptions for a drug (say, 20 000) are collected (in the UK this is made practicable by the existence of a National Health Service in which prescriptions are sent to a single central authority for pricing and payment of the pharmacist). The prescriber is sent a questionnaire and asked to report all events that have occurred (not only suspected adverse reactions) with no judgement regarding causality. Thus ‘a broken leg is an event. If more fractures were associated with this drug they could have been due to hypotension, CNS effects or metabolic disease’.40 By linking general practice and hospital records and death certificates, both prospective and retrospective studies can be done and unsuspected effects detected. Prescription event monitoring can be used routinely on newly licensed drugs, especially those likely to be widely prescribed in general practice, and it can also be implemented quickly in response to a suspicion raised, e.g. by spontaneous reports.


Fig. 4.5 Oscillations in the development of a drug.

(By courtesy of Dr Robert H. Williams and the editor of the Journal of the American Medical Association.)

Guide to further reading

Biomarkers Definitions Working Group. Biomarkers and surrogate endpoints: preferred definitions and conceptual framework. Clin. Pharmacol. Ther. 2001;69(3):89–95.

Bland J.M., Altman D.G. Statistical notes: the odds ratio. Br. Med. J. 2000;320:1468.

Bracken M.B. Why animal studies are often poor indicators of human reactions to exposure. J. R. Soc. Med. 2008;101:120–122.

Chatellier G., Zapletal E., Lemaitre D., et al. The number needed to treat: a clinically useful nomogram in its proper context. Br. Med. J. 1996;312:426–429.

Doll R. Controlled trials: the 1948 watershed. Br. Med. J. 1998;317:1217–1220. (and following articles)

Egger M., Smith G.D., Phillips A.N. Meta-analysis: principles and procedures. Br. Med. J. 1997;315:1533–1537. (see also other articles in the series entitled ‘Meta-analysis’)

Emanuel E.J., Miller F.G. The ethics of placebo-controlled trials – a middle ground. N. Engl. J. Med. 2001;345:915–919.

Garattini S., Chalmers I. Patients and the public deserve big changes in the evaluation of drugs. Br. Med. J. 2009;338:804–806.

GRADE Working Group. GRADE: what is ‘quality of evidence’ and why is it important to clinicians? Br. Med. J. 2008;336:924–929. (and the other papers of this series)

Greenhalgh T. Papers that report drug trials. Br. Med. J. 1997;315:480–483. (see also other articles in the series entitled ‘How to read a paper’)

Kaptchuk T.J. Powerful placebo: the dark side of the randomised controlled trial. Lancet. 1998;351:1722–1725.

Khan K.S., Kunz R., Kleijnen J., Antes G. Five steps to conducting a systematic review. J. R. Soc. Med. 2003;96:118–121.

Lewis J.A., Jonsson B., Kreutz G., et al. Placebo-controlled trials and the Declaration of Helsinki. Lancet. 2002;359:1337–1340.

Miller F.G., Rosenstein D.L. The therapeutic orientation to clinical trials. N. Engl. J. Med. 2003;348:1383–1386.

Rochon P.A., Gurwitz J.H., Sykora K., et al. Reader’s guide to critical appraisal of cohort studies: 1. Role and design. Br. Med. J. 2005;330:895–897.

Rothwell P.M. External validity of randomised controlled trials: ‘to whom do the results of this trial apply?’ Lancet. 2005;365:82–93.

Rothwell P.M. Treating individuals 2. Subgroup analysis in randomised controlled trials: importance, indications, and interpretation. Lancet. 2005;365:176–186.

Sackett D., Rosenberg W., Gray J., et al. Evidence based medicine: what it is and what it isn’t [editorial]. Can. Med. Assoc. J. 2009;312:1–8.

Silverman W.A., Altman D.G. Patients’ preferences and randomised trials. Lancet. 1996;347:171–174.

Vlahakes G.J. Editorial. The value of phase 4 clinical testing. N. Engl. J. Med. 2006;354(4):413–415.

Waller P.C., Jackson P.R., Tucker G.T., Ramsay L.E. Clinical pharmacology with confidence [intervals]. Br. J. Clin. Pharmacol. 1994;37(4):309.

Williams R.L., Chen M.L., Hauck W.W. Equivalence approaches. Clin. Pharmacol. Ther. 2002;72:229–237.

Zwarenstein M., Treweek S., Gagnier J.J., et al. CONSORT group: Pragmatic Trials in Healthcare (Practihc) group. Improving the reporting of pragmatic trials: an extension of the CONSORT statement. Br. Med. J. 2008;337:a2390.

1 Guidance to researchers in this matter is clear. The World Medical Association Declaration of Helsinki (Edinburgh revision 2000) states that ‘considerations related to the well-being of the human subject should take precedence over the interests of science and society’. The General Assembly of the United Nations adopted in 1966 the International Covenant on Civil and Political Rights, of which Article 7 states, ‘In particular, no one shall be subjected without his free consent to medical or scientific experimentation’. This means that subjects are entitled to know that they are being entered into research even though the research be thought ‘harmless’. But there are people who cannot give (informed) consent, e.g. the demented. The need for special procedures for such is now recognised, for there is a consensus that, without research, they and the diseases from which they suffer will become therapeutic ‘orphans’.

2 Report: Royal College of Physicians of London 1996 Guidelines on the Practice of Ethics Committees in Medical Research Involving Human Subjects. Royal College of Physicians, London.

3 American Defence Secretary Donald Rumsfeld, on 12 February 2002, at a press briefing where he addressed the absence of evidence linking the government of Iraq with the supply of weapons of mass destruction to terrorist groups.

4 For extensive practical detail, see Council for International Organisations of Medical Sciences (CIOMS) in collaboration with the World Health Organization (WHO) 2002 International Ethical Guidelines for Biomedical Research Involving Human Subjects. CIOMS, Geneva. (WHO publications are available in all UN member countries.) See also: Guideline for Good Clinical Practice, International Conference on Harmonisation Tripartite Guideline. EU Committee on Proprietary Medicinal Products (CPMP/ICH/135/95). Smith T 1999 Ethics in Medical Research: A Handbook of Good Practice. Cambridge University Press, Cambridge.

5 Oxford English Dictionary. See also: Edwards M 2004 Historical keywords: Trial. Lancet 364:1659.

6 Kety S. Quoted by Beecher H K 1959 Journal of the American Medical Association 169:461.

7 This is the uncertainty principle: the concept that patients entering a randomised therapeutic trial will have equal potential for benefit and risk is referred to as equipoise.

8 The ‘four principles’ approach (above) is widely utilised in biomedical ethics. A full description and an analysis of the contribution of this and other ethical theories to decision-making in clinical, including research, practice can be found in: Beauchamp T L, Childress J F 2001 Principles of Biomedical Ethics, 5th edn. Oxford University Press, Oxford.

9 Injury to participants in clinical trials is uncommon and serious injury is rare. In March 2006, eight healthy young men entered a trial of a humanised monoclonal antibody designed to be an agonist of a particular receptor on T lymphocytes that stimulates their production and activation. This was the first administration to humans; preclinical testing in rabbits and monkeys at doses up to 500 times those received by the volunteers apparently showed no ill effect. Six of the volunteers quickly became seriously ill and required admission to an intensive care facility with multi-organ failure due to a ‘cytokine release syndrome’, in effect a massive attack on the body’s own tissues. All the volunteers recovered but some with disability. This toxicity in humans, despite apparent safety in animals, may be due to the specifically humanised nature of the monoclonal antibody. Testing of perceived high-risk new medicines is likely to be subject to particularly stringent regulation in future. See Wood A J J, Darbyshire J 2006 Injury to research volunteers – the clinical research nightmare. New England Journal of Medicine 354:1869–1871.

10 Freedman B 1987 Equipoise and the ethics of clinical research. New England Journal of Medicine 317:141–145.

11 Moderate to severe adverse events have occurred in about 0.5% of healthy subjects. See Orme M, Harry J, Routledge P, Hobson S 1989 British Journal of Clinical Pharmacology 27:125; Sibille M et al 1992 European Journal of Clinical Pharmacology 42:393.

12 Guidelines for the conduct and analysis of a range of clinical trials in different therapeutic categories are released from time to time by the Committee on Medicinal Products for Human Use (CHMP) of the European Commission. These guidelines apply to drug development in the European Union. Other regulatory authorities issue guidance, e.g. the Food and Drug Administration in the USA, the Ministry of Health, Labour and Welfare in Japan. There has been considerable success in aligning different guidelines across the world through the International Conferences on Harmonisation (ICH). The source for CHMP guidelines is info@mhra.gsi.gov.uk

13 A drug for which the original patent has expired, so that any pharmaceutical company may market it in competition with the inventor. The term ‘generic’ has come to be synonymous with the non-proprietary or approved name (see Ch. 7).

14 Information on both categories (method effectiveness and use effectiveness) is valuable (Sheiner L B, Rubin D B 1995 Intention-to-treat analysis and the goals of clinical trials. Clinical Pharmacology and Therapeutics 57(1):6–15).

15 Wallis W A, Roberts H V 1957 Statistics, A New Approach. Methuen, London.

16 Galton F 1879 Generic images. Proceedings of the Royal Institution.

17 Altman D G, Gore S M, Gardner M J, Pocock S J 1983 Statistical guidelines for contributors to medical journals. British Medical Journal 286:1489–1493.

18 Gardner M J, Altman D G 1986 Confidence intervals rather than P values: estimation rather than hypothesis testing. British Medical Journal 292:746–750.

19 Altman D G, Gore S M, Gardner M J, Pocock S J 1983 Statistical guidelines for contributors to medical journals. British Medical Journal 286:1489–1493.

20 The target difference. Differences in trial outcomes fall into three grades: (1) that the doctor will ignore, (2) that will make the doctor wonder what to do (more research needed), and (3) that will make the doctor act, i.e. change prescribing practice.

21 Bradford Hill A 1977 Principles of Medical Statistics. Hodder and Stoughton, London. If there is a ‘father’ of the modern scientific therapeutic trial, it is he.

22 Particularly in large-scale outcome trials, an independent data monitoring committee is given access to the results as these are accumulated; the committee is empowered to discontinue a trial if the results show significant advantage or disadvantage to one or other treatment.

23 Note also patient preference trials. Conventionally, patients are invited to participate in a clinical trial, give consent and are then randomised to a particular treatment group. In special circumstances, randomisation takes place first, the patients are informed of the treatment to be offered and are allowed to opt for this or another treatment. This is called pre-consent randomisation or ‘pre-randomisation’. In a trial of simple mastectomy versus lumpectomy with or without radiotherapy for early breast cancer, recruitment was slow because of the disfiguring nature of the mastectomy option. A policy of pre-randomisation was then adopted, letting women know the group to which they would be allocated should they consent. Recruitment increased sixfold and the trial was completed, providing sound evidence that survival was as long with the less disfiguring option (Fisher B, Bauer M, Margolese R et al 1985 Five-year results of a randomised clinical trial comparing total mastectomy and segmental mastectomy with and without radiotherapy in the treatment of breast cancer. New England Journal of Medicine 312:665–673). However, the benefit of enhanced recruitment may be limited by potential for introducing bias.

24 Modell W, Houde R W 1958 Factors influencing clinical evaluation of drugs; with special reference to the double-blind technique. Journal of the American Medical Association 167:2190–2199.

25 Senn S 1997 N-of-1 Trials: Statistical Issues in Drug Development. John Wiley, Chichester, pp. 249–255.

26 Jull A, Bennet D 2005 Do N-of-1 trials really tailor treatment? Lancet 365:1992–1994.

27 Whitehead J 1992 The Design and Analysis of Sequential Clinical Trials, 2nd edn. Ellis Horwood, Chester.

28 A review that strives comprehensively to identify and synthesise all the literature on a given subject (sometimes called an overview). The unit of analysis is the primary study, and the same scientific principles and rigour apply as for any study. If a review does not state clearly whether and how all relevant studies were identified and synthesised, it is not a systematic review (Cochrane Library 1998).

29 Reports of therapeutic trials should contain an analysis of all patients entered, regardless of whether they dropped out, failed to complete, or even failed to start the treatment for any reason. Omission of these subjects can lead to serious bias (Laurence D R, Carpenter J 1998 A Dictionary of Pharmacological and Allied Topics. Elsevier, Amsterdam).

30 See Cooke R J, Sackett D L 1995 The number needed to treat: a clinically useful treatment effect. British Medical Journal 310:452.

31 Sackett D L, Cooke R J 1994 Understanding clinical trials: what measures of efficacy should journal articles provide busy clinicians? British Medical Journal 309:755.

32 For example, drug therapy for high blood pressure carries risks, but the risks of the disease vary enormously according to severity of disease: ‘Depending on the initial absolute risk, the benefits of lowering blood pressure range from preventing one cardiovascular event a year for about every 20 people treated, to preventing one event for about every 5000–10 000 people treated. The level of risk at which treatment should be started is debatable’ (Jackson R, Barham P, Bills J et al 1993 Management of raised blood pressure in New Zealand: a discussion document. British Medical Journal 307:107–110).

33 A systematic error in the selection or randomisation of patients on admission to a trial such that they differ in prognosis, i.e. the outcome is weighted one way or another by the selection, not by the trial.

34 When the interpretation of an observed association between two variables may be affected by a strong influence from a third variable (which may be hidden or unknown). Examples of confounders would be concomitant drug therapy or differences in known risk factors, e.g. smoking, age, sex.

35 Used here for a group of people having a common attribute, e.g. they have all taken the same drug.

36 The Royal College of General Practitioners (UK) recruited 23 000 women takers of the pill and 23 000 controls in 1968 and issued a report in 1973. It found an approximately doubled incidence of venous thrombosis in combined-pill takers (the dose of oestrogen was reduced because of this study).

37 For this reason such studies have been named trohoc (cohort spelled backwards) studies (Feinstein A 1981 Journal of Chronic Diseases 34:375).

38 Experimental cohort studies (i.e. randomised controlled trials) are on firmer ground with regard to causation as there should be only one systematic difference between the groups (i.e. the treatment being studied). In case–control studies the groups may differ systematically in several ways.

39 Edwards I R 1998 A perspective on drug safety. In: Edwards I R (ed) Drug Safety. Adis International, Auckland, p. xii.

40 Inman W H W, Rawson N S B, Wilton L V 1986 Prescription-event monitoring. In: Inman W H W (ed) Monitoring for Drug Safety, 2nd edn. MTP, Lancaster, p. 217.

41 Guyatt G H, Sackett D L, Sinclair J C et al 1995 Users’ guides to the medical literature. IX. A method for grading health care recommendations. Evidence-Based Medicine Working Group. Journal of the American Medical Association 274:1800–1804.

42 The reporting of randomised controlled trials has been systemised so that only high-quality studies will be considered. See Moher D, Schulz K F, Altman D G 2001 CONSORT Group. The CONSORT statement: revised recommendations for improving the quality of reports of parallel group randomised trials. Lancet 357:1191–1194.

43 ‘Quick, let us prescribe this new drug while it remains effective’. Richard Asher.
