Chapter 11. Managing risks to patient safety in clinical units
Alan Merry
Introduction
Unintended harm to patients from failures in healthcare is now recognised as a public health problem of considerable importance (see Sorensen & Iedema, Chapter 1). There is more at stake than the possibility of direct harm to patients. There are a number of dimensions of quality in healthcare and risk that may manifest at different organisational layers within the system (see Figure 11.1).
Figure 11.1 Dimensions of quality in healthcare and the organisational layers at which risk may manifest (Source: reproduced with permission from Runciman et al 2007)
A balance must be found between investing in safety and the need to provide as much healthcare to as many patients as possible. An undue emphasis on safety by individual practitioners may lead to delays in treatment, increase the burden of illness experienced by patients on waiting lists, and in some cases reduce the effectiveness of the delayed therapy (because the benefit of some treatments depends on the timeliness with which they are provided). Redundancy (in the form of extra checks, extra staff, reserve power supplies, reserve equipment and so on) is an important way of promoting safety, but may also be seen as reducing efficiency. It may be difficult to work out the optimal balance between these conflicting considerations.
There may also be conflicts between risks for staff and the organisation and risks for individual patients – sometimes staff may feel that their own safety is best served by inactivity (declining to take on a difficult surgical case for example) or excessive activity (over-investigation of a patient before anaesthesia for example) in situations where some pragmatic middle course would be in the patient’s best interests.
Managing risks to patient safety in clinical units depends on interdisciplinary teamwork and on a cultural commitment to quality (which includes safety). In this chapter I will discuss critical aspects of effective risk management that include:
▪ understanding risk and the ways in which it can be measured
▪ understanding errors and violations and how these may occur in complex systems
▪ proactively identifying risks within one’s own unit
▪ actively reducing risk to acceptable levels without unduly impeding service delivery
▪ responding effectively when patients are harmed by healthcare.
Understanding risk
Many individuals find it difficult to evaluate the risk of infrequent events. People do not tend to use Bayesian logic when making decisions. Instead, they tend to be influenced by the nature of the hazard, their own personal experiences, social norms and many other factors whose relevance may be more personal than logical. Marked fear of snakes or spiders is quite common, and generally out of proportion to any real hazard created by these animals. Furthermore, people who fear one do not necessarily worry about the other. Complacency about the substantial risks of smoking is common, yet it goes hand in hand with public expectations that governments should spend extraordinary amounts of money to further reduce the already infinitesimal risk of viral transmission through donated blood.
Expressing the risks of a particular procedure
From a clinical perspective, interest is typically in the risks of specific procedures. Published data tend to come from high-volume units, often from academic institutions, and often represent the best attainable results rather than the average. The actual risk of a procedure in the hands of one practitioner or one team may be quite difficult to establish, because of the large numbers required to estimate the incidence of a given adverse event with reasonably narrow confidence limits. It is not often appreciated that (as a rule of thumb, the so-called 'rule of three') a zero incidence of an event in x procedures could be associated with an upper 95% confidence limit of one event in x ÷ 3 procedures. Thus no adverse events in 10 procedures could be compatible with a true rate of around 30%, and simply reflect a run of good luck. Conversely, one or two adverse events in a short series of procedures might be associated with a surprisingly low true incidence. For example, two adverse events in 20 cases would give a point estimate of risk of 10%, but with 95% confidence limits of 1% and 32%. Twenty events in 200 cases would give 10% (6%, 15%), and 200 in 2000 would give 10% (9%, 11%) – with all estimates rounded to the nearest integer.
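By way of illustration (this sketch is not part of the original text), the exact Clopper–Pearson binomial interval behind figures such as these can be computed in a few lines of Python, assuming scipy is available:

```python
# A minimal sketch (not from the original text) of the exact
# Clopper-Pearson binomial confidence interval behind the figures above.
from scipy.stats import beta

def binomial_ci(events, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a rate of
    `events` adverse events observed in `n` procedures."""
    lower = 0.0 if events == 0 else beta.ppf(alpha / 2, events, n - events + 1)
    upper = 1.0 if events == n else beta.ppf(1 - alpha / 2, events + 1, n - events)
    return lower, upper

for k, n in [(0, 10), (2, 20), (20, 200), (200, 2000)]:
    lo, hi = binomial_ci(k, n)
    print(f"{k}/{n}: point estimate {k/n:.1%}, 95% CI ({lo:.1%}, {hi:.1%})")
```

Note that the interval narrows only roughly in proportion to the square root of the number of cases, which is why very large series are needed before the rates of uncommon complications can be estimated with any confidence.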
Some ways of expressing the difference between two risks
Risk can be expressed in a number of ways. The most common expressions are 'relative risk' and 'absolute risk', the 'odds ratio', 'numbers needed to treat' (NNT) and 'numbers needed to harm' (NNH), as set out in Table 11.1.
Table 11.1 Some ways of expressing risk

| Expression of risk | Discussion and example |
|---|---|
| The risk of an event | The rate at which the event occurs, e.g. 10 events per 100 exposures would be a risk of 10%. |
| The odds | The number of patients having the event compared with those not having it, i.e. using the above example, the odds are 10:90, or 1:9. |
| Relative risk and odds ratio | The extent to which a treatment reduces the likelihood that an event will occur. For example, if a treatment reduces the occurrence of a particular event from 0.5% to 0.25%, the relative risk is 50%. The odds of the event without the intervention are 1:199 and with it 1:399. The odds ratio is therefore 1/399 ÷ 1/199. This is very nearly the same as the relative risk. When the risks are higher, however, the difference between these two measures becomes greater. For example, reducing a risk of 50% to 25% also gives a relative risk of 50%, but the odds are 1:1 and 1:3, so the odds ratio in this case is 1/3 ÷ 1/1, that is 33%. |
| Absolute risk reduction | The difference in risk between having a treatment and not having it. In our first example, the reduction in absolute risk is 0.25%, but in the second it is 25%. This expression perhaps describes the true benefit of each intervention or treatment more realistically than does relative risk. |
| Number needed to treat (NNT) | The number of patients who need to be treated for a beneficial outcome to occur in one person; it is the reciprocal of the absolute risk reduction. In the first example above, the NNT would be 400 (i.e. 1 ÷ 0.25%). |
| Number needed to harm (NNH) | The number of patients who need to be treated before an episode of harm occurs in one person; it is calculated in the same way as the NNT, using the absolute risk increase. |
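To make these definitions concrete, the following sketch (illustrative only, not part of the original chapter) computes each expression from a baseline and a treated event rate, reproducing the examples in Table 11.1:

```python
# Illustrative calculation of the risk expressions in Table 11.1.

def risk_expressions(baseline, treated):
    """Compare two event rates given as proportions (e.g. 0.005 = 0.5%)."""
    odds_baseline = baseline / (1 - baseline)   # e.g. 1:199 for 0.5%
    odds_treated = treated / (1 - treated)      # e.g. 1:399 for 0.25%
    arr = baseline - treated                    # absolute risk reduction
    return {
        "relative risk": treated / baseline,
        "odds ratio": odds_treated / odds_baseline,
        "absolute risk reduction": arr,
        "number needed to treat": 1 / arr,
    }

# First example: 0.5% reduced to 0.25% -> relative risk 0.5, NNT 400.
print(risk_expressions(0.005, 0.0025))
# Second example: 50% reduced to 25% -> relative risk 0.5, odds ratio ~0.33.
print(risk_expressions(0.5, 0.25))
```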
The importance of exposure
An attempt is often made to compare the risks of a medical procedure with those of a common everyday activity. For example, anaesthetists often say that the risk of an anaesthetic is comparable with that of driving a car in traffic. This sounds plausible because the lifetime probability of dying from one or the other event may be similar. However, for most people exposure to traffic is much greater than exposure to anaesthesia. It is important, therefore, to relate risk to exposure. The risk of undergoing anaesthesia in a developed country has been estimated as 1000 deaths per 100 million hours of exposure, while that of being in traffic (in any capacity) is only 50 deaths per 100 million hours of exposure. Interestingly, in relation to time of exposure, flying in a commercial aircraft is more dangerous than being in traffic, with a risk of 100 deaths per 100 million hours of exposure. Note that the way in which exposure is defined can make a difference here: if distance travelled were used as the denominator, then flying would be safer than driving a car (Runciman et al 2007).
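The arithmetic is simple, but the choice of denominator matters. In the sketch below (illustrative only; the deaths-per-hour figures are those quoted above, while the average speeds are invented purely to show the effect), the same data rank flying as more dangerous than driving per hour of exposure, but safer per kilometre travelled:

```python
# Illustrative comparison of risk per hour versus risk per kilometre.
# Deaths per 100 million hours are from the text; speeds are invented.

HOURS = 100_000_000

activities = {
    # name: (deaths per 100 million hours of exposure, assumed speed km/h)
    "road traffic": (50, 60),
    "commercial flight": (100, 800),
}

for name, (deaths, speed) in activities.items():
    per_hour = deaths / HOURS
    per_km = per_hour / speed
    print(f"{name}: {per_hour:.1e} deaths/hour, {per_km:.1e} deaths/km")
# Per hour, flying (1.0e-06) exceeds road traffic (5.0e-07); per km,
# flying (1.3e-09) is well below road traffic (8.3e-09).
```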
The importance of underlying risk
The value of an initiative to reduce risk is dependent on the underlying rate of the adverse event in question. For example, a treatment that could reduce the risk of coronary artery disease by 10% would have no value if given to children, because they hardly ever get coronary artery disease. It is possible to stratify adults according to their risks of coronary disease, and it makes sense to invest in those groups at highest risk. A 10% reduction in this risk would give an absolute risk reduction of 0.1% if the underlying rate of the disease was 1%, and of 2% if the underlying rate was 20%.
Refining the concept of harm
One point often missed about studies like the Harvard Medical Practice Study (Brennan et al 1991) is that many of the deaths identified occurred in very sick patients who did not have long to live anyway (Hayward & Hofer 2001); although such deaths are not acceptable, they are not quite comparable with most deaths caused by road traffic accidents.
In addition to mortality rates, it might help to define the burden of iatrogenic harm in terms of the quality-adjusted life years (QALYs) lost in this way (Murray & Lopez 1996), although there are limitations to the value of these indicators as well (La Puma & Lawlor 1990).
The importance of balancing loss against gain
Another limitation is the uni-dimensional nature of the results of these studies. Very little attempt has been made to balance the burden of adverse events from admissions to acute care institutions with the reduction in the burden of disease associated with the same admissions. When quantifying risk, some measure of accomplishment is useful to place into context the data on harm.
Process in healthcare – in pursuit of six sigma quality
Many medical activities are relatively straightforward, and it should be possible to achieve reliability levels equivalent to those now expected as normal in certain industries. The concept of ‘six sigma quality’ involves expressing the reliability of a given process in terms of standard deviations from the mean of a normal distribution. In effect, six sigma quality implies 3.4 failures per million events (procedures undertaken, or products produced). This level of reliability is expected in many industrial and manufacturing processes today, but if all adverse events are considered (not just deaths), it is not often achieved in healthcare, if ever. For routine processes this is no longer acceptable. It is time to clearly identify aspects of healthcare that lend themselves to process management, and adopt an approach for these that is process oriented, standardised, and strictly compliant with clearly defined and monitored guidelines.
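A detail worth noting (easily verified, though not spelled out above): the 3.4-per-million figure reflects the industry convention of allowing the process mean to drift by 1.5 standard deviations, so it corresponds to the normal tail beyond 4.5 sigma; a literal six-sigma tail would imply roughly one failure per billion. A short check in Python, assuming scipy is available:

```python
# Verifying the 'six sigma' failure rates quoted above.
from scipy.stats import norm

print(norm.sf(6.0) * 1e6)  # literal 6-sigma tail: ~0.001 failures per million
print(norm.sf(4.5) * 1e6)  # with the conventional 1.5-sigma shift: ~3.4 per million
```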
On the other hand, it must also be recognised that not all aspects of healthcare lend themselves to the methods of industry. In general, diagnosis tends to be more demanding and depends more on experience and judgment than procedural work. Patients do not usually present with diagnoses emblazoned on their foreheads. A systematic and standardised approach will no doubt improve the likelihood of a correct diagnosis, but not all patients present in accordance with the textbooks. Even after a correct diagnosis has been made, there are many conditions for which the question of what to do next remains controversial. It is not easy to allow for uncertainty of this type when measuring the quality of healthcare.
Many procedural activities also involve considerable variation from case to case, and extend skilled practitioners to the limits of their ability. Performing a triple heart valve replacement in a 78-year-old patient is a very different proposition from replacing a single valve in a 40-year-old, for example. As another example, in major trauma each patient presents a unique combination of problems and requires sustained and creative effort from a large number of people working in unison. An activity of this sort is more like trying to win the World Cup in Rugby than like managing an industrial process.
Casemix and risk scores
Units that regularly undertake more complex and acute cases are likely to have more adverse events than those that concentrate on the routine and the straightforward. Some measure of differences in casemix is needed to take this variation into account. Various scoring systems have been adopted for this purpose – the EuroSCORE for cardiac surgery being one example (Nashef et al 2002) and the Goldman Index of cardiac risk for patients undergoing non-cardiac surgical procedures being another (Goldman et al 1977). Scores of this type are generated using regression analysis of information collected in large databases, and then validated by testing in other populations. They tend to become dated quite quickly as methods of managing medical conditions advance, and may not be reliable if applied to patient populations different from the ones in which they were developed.
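Scores of this kind are typically additive on the logistic scale. The sketch below shows the general form only; the risk factors and coefficients are invented for illustration and are not those of the EuroSCORE or the Goldman Index:

```python
# A hypothetical logistic risk score: a weighted sum of risk factors is
# converted into a predicted probability of an adverse outcome.
import math

INTERCEPT = -4.0  # invented: baseline log-odds with no risk factors
WEIGHTS = {       # invented per-factor log-odds increments
    "age_over_75": 1.1,
    "emergency_operation": 0.9,
    "poor_lv_function": 1.3,
}

def predicted_risk(factors):
    """Risk = 1 / (1 + exp(-(intercept + sum of factor weights)))."""
    logit = INTERCEPT + sum(WEIGHTS[f] for f in factors)
    return 1 / (1 + math.exp(-logit))

print(f"{predicted_risk([]):.1%}")                                      # ~1.8%
print(f"{predicted_risk(['age_over_75', 'emergency_operation']):.1%}")  # ~11.9%
```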
Measuring performance over time
It is important to know not only whether a team or individual's rate of adverse events is comparable with benchmark statistics, but also whether it is stable. Most time series of biological data exhibit variation, which can be of two types. Common cause variation may be thought of as 'noise'. Thus a unit might have an infection rate that fluctuates by ± 2% around a mean of 5%. Fluctuation of this type usually reflects random variation from day to day, and the underlying rate may be stable from one year to the next. On the other hand, something may change. Some failure in process or alteration in the resistance of microbes might lead to a jump to (say) 7% ± 2%, or some initiative to improve sterile procedures or the timely administration of antibiotics might lead to a fall to 3% ± 2%. The challenge is to detect special cause variation of this type in a reliable way. This is particularly important in relation to monitoring the effect of interventions to promote safety. Not only may such interventions fail to achieve their aim, they may even make matters worse (through so-called 'revenge effects' (Tenner 1997)).
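The control charts mentioned in the next paragraph are one reliable way to make this distinction: points falling outside limits set at three standard deviations from the long-run mean suggest special cause variation. A minimal sketch (the monthly infection counts below are invented for illustration):

```python
# A minimal p-chart sketch: monthly infection rates with 3-sigma limits.
import math

infections = [5, 7, 4, 6, 5, 3, 6, 13, 5, 4]  # infections per month (invented)
n = 100                                        # patients treated per month

p_bar = sum(infections) / (n * len(infections))   # centre line (mean rate)
sigma = math.sqrt(p_bar * (1 - p_bar) / n)        # within-month standard deviation
upper, lower = p_bar + 3 * sigma, max(0.0, p_bar - 3 * sigma)

for month, count in enumerate(infections, start=1):
    rate = count / n
    flag = "  <- special cause?" if not lower <= rate <= upper else ""
    print(f"month {month}: {rate:.0%} (limits {lower:.1%}-{upper:.1%}){flag}")
```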
Simply comparing rates at two time periods may be misleading in the absence of information about the long-term stability of the process in question. There are a number of better ways to distinguish special cause from common cause variation over time. Cumulative sum (cusum) charts (Bolsin & Colson 2000) and control charts (Carey 2002a, 2002b) are two examples. The data for these charts may or may not be adjusted for risk. One method of using risk-adjusted data to monitor the ongoing performance of an individual or a unit is the variable life-adjusted display (VLAD) chart. It is sometimes used in cardiac surgery, where it is possible to calculate an expected survival rate using scoring systems and each patient's known risk factors. Outcomes (death or survival) can be plotted against the number of cases operated on. If a particular patient has an expected survival probability of 0.9 (or 90%) and survives, the plot is moved one unit along the X-axis and 0.1 units along the Y-axis in a positive direction. If such a patient dies, the plot is moved one unit along the X-axis and 0.9 units along the Y-axis in a negative direction. At the end of 10 cases, if as expected one dies and nine survive, the plot will have moved 10 units along the X-axis and will be at zero on the Y-axis. Statistical limits can be set to identify points at which it becomes likely that one is dealing with special cause variation rather than common cause. VLAD charts have much to offer in relation to procedural work where high levels of standardisation are possible and where clearly definable adverse outcomes (such as death) are relatively common. A Microsoft Excel spreadsheet for creating VLAD charts on the basis of the logistic EuroSCORE can be downloaded from the Clinical Operational Research Unit at University College London.
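The VLAD arithmetic described above is easy to reproduce. The sketch below (a toy illustration, not the downloadable spreadsheet) implements the running total: each survivor adds his or her predicted probability of death, and each death subtracts the predicted probability of survival:

```python
# A toy VLAD calculation following the worked example in the text.

# (predicted probability of death, died?) for ten invented cases, each
# with an expected survival probability of 0.9
cases = [(0.1, False)] * 9 + [(0.1, True)]

vlad, total = [], 0.0
for predicted_death, died in cases:
    total += -(1 - predicted_death) if died else predicted_death
    vlad.append(total)

print(vlad[-1])  # ~0.0 (to floating-point precision): one death in ten
                 # cases, exactly as the risk score predicted
```

A run of deaths in patients predicted to survive drags the plot steeply downwards, which is the signal the statistical limits are designed to detect.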
Defining acceptable standards
A point often missed in relation to any of these methods is that results are inevitably defined as acceptable in relation to the norms for a particular group. It is important to understand certain implications of this approach. In any one clinical unit it is a statistical inevitability that one practitioner will get the best results and another will get the worst. If one had 10 surgeons, evaluated their results and fired the worst performer in the name of quality improvement, one would simply have reduced the group to nine, with a different surgeon in the bottom place. In the same way, demands that all practitioners should be better than average are statistical nonsense. Some adequately performing clinicians must, by definition, be operating below the average, and there is no possible way of changing this fact.
Pause for reflection
Some healthcare professionals are 'insiders', in the privileged position of knowing the results of their colleagues. Would such a person elect to undergo cardiac surgery performed by the member of the unit with the worst outcome data? Would the decision to choose the surgeon with the best results be (a) rational and (b) ethical?
This discussion is not trite, because no patient, if asked, would say he or she wanted to be operated on by a surgeon whose results were ‘below average’, let alone the worst surgeon in a group. However, in a clinical unit, all surgeons have to contribute to the work.
Yet another difficulty is that the clinical situations that lend themselves to an analysis of this type are very much in the minority. Is it equitable that many cardiac surgeons today are required to have their results publicly scrutinised while their colleagues in general medicine, psychiatry and geriatrics are not? Indeed, even their anaesthetic colleagues tend to be left out of this process, notwithstanding evidence that the anaesthetist may contribute to outcome after cardiac surgery (Merry et al 1992). It could be argued that there is merit in applying sound methods of monitoring performance where possible, and in working towards improving the situation where it is not. Alternatively, it could be argued that the inequitable application of harsh standards to isolated groups is likely to do little more than promote gaming. For example, risk scores are known to be inaccurate at the extremes, so in practice patient selection will probably influence the results of a VLAD chart even though this should not happen in theory.
Two more points are relevant to this debate. Statistics apply to groups rather than individuals. A risk of 10% means that 10 of the next 100 patients are likely to die, but for each patient the result will be unpredictable, and will be either life or death (not 10% death). Morbidity is also important – in the example of cardiac surgery the risk of stroke is significant, may differ from the risk of death, and may be more feared by patients, but strokes are seldom incorporated into risk scores or monitored with VLAD charts. Also the management of patients goes beyond procedural results, and includes decision making in the first place, and then the whole amalgam of interpersonal skills, compassion and professionalism that might be very important to individual patients. Thus it can be seen again that the use of a uni-dimensional measurement such as a mortality rate is inadequate as a means of measuring performance in an endeavour as complex as healthcare.
Where databases are large, and many individuals contribute to the data, a normative approach based on reasonable outcome measures is fairly satisfactory, but its limitations should not be forgotten. In most cases the difference in outcome between the best and worst clinician in a unit should be small. Any difference will become statistically significant if enough data are available for analysis, but ideally the difference between individuals within one unit should not be clinically important. In fact, the point is not whether one is the best or worst performing individual available to a patient, but whether one's performance is good enough. The same can be said of units (see Box 11.1).
Box 11.1
At the paediatric cardiac unit of the Bristol Royal Infirmary 29 children died and four were left brain damaged following open heart surgery between 1984 and 1995. This was twice the expected mortality rate. A major inquiry was undertaken, and 198 recommendations were made, which led to many reforms in medical practice in the UK, including the establishment of more rigorous requirements for monitoring the results of surgery.
It can be seen that there may be clinical units in which the important question is not which surgeon is the best on the team, but rather whether the team as a whole is functioning to a standard that is acceptable at all.
The importance of the team
Many of the difficulties associated with monitoring performance become less marked if the emphasis is placed on the team rather than on the individuals within it. This approach still depends on making sure that all individuals are performing adequately, but it is more likely to be associated with standardisation, the universal adoption of evidence-based approaches, teamwork and supportiveness, and better outcomes for all patients.
Given the large numbers of units across the world, there is no reason why one’s own unit should not aspire to being better than average. If all units worked hard to achieve that goal, the average standard overall would improve (even if the number below average remained obdurately at 50%!).
Legal implications of risk in healthcare
Increasingly, risk management is being driven by legislative requirements. Some of the legal pressures are proactive – explicit demands for accreditation for example. Others are reactive – litigation in response to patient harm for example. In most countries today legal considerations weigh heavily when decisions have to be made about how much of a limited resource should be invested into risk management.
Accreditation
Accreditation is a formal process for demonstrating that an organisation complies with certain standards. The degree to which accreditation of healthcare is compulsory varies from country to country, and so does the process by which it is achieved. In the US the Joint Commission on Accreditation of Healthcare Organizations (JCAHO) provides the accreditation required to obtain reimbursement from Medicare and Medicaid (healthcare funding for the elderly and the poor respectively). In Australia and New Zealand there is a trend towards greater emphasis on accreditation. For example, the Australian Commission on Safety and Quality in Healthcare developed a national standard for credentialling and defining the scope of clinical practice in 2004, and this has formed the basis for initiatives by state governments, such as the introduction of a policy for the credentialling of senior doctors appointed to public health services in Victoria.
Compliance with standards can be assessed by considering structure, process or outcome (Donabedian 2003). In theory, if it were possible to measure relevant outcomes adequately, there would be little need to measure anything else, but in practice this may be much more difficult than establishing that the required buildings, equipment and personnel are in place (i.e. that the structure of the organisation is adequate) and that the right protocols, clinical pathways and other process tools have been established and are in use.
Tort
In many countries there has been a substantial increase in the cost of litigation related to healthcare during the second half of the last century, and in the UK, US and Australia the costs of the tort system now amount to 1% of expenditure on healthcare (Runciman et al 2007). In New Zealand a no-fault system of accident compensation was established following the 1967 report of Sir Owen Woodhouse, and it is essentially unknown for doctors to be sued for negligence (Merry & McCall Smith 2001). Today the Accident Compensation Corporation is responsible for promoting safety within the healthcare system proactively and for compensating patients after they have been harmed. The tort system tends to be slow, inefficient and somewhat capricious in the degree to which it achieves any worthwhile outcomes for patients, healthcare professionals or the system as a whole. From first principles the New Zealand approach appears to be more effective, and it is popular with New Zealanders. Individual practitioners are still held to account through the investigations of the Health and Disability Commissioner and through disciplinary actions by the Medical Council, but on the whole, organisations are less at risk from the repercussions of harming patients than they would be if lawsuits were still possible. It is hard to obtain data on the relative effectiveness of these different systems in promoting safety, but it is at least possible that the loss of the tort system from New Zealand has removed one incentive for hospital administrators to invest in safety initiatives.
Criminal law
It ought to go without saying that the role of criminal law in healthcare is to deal with seriously culpable behaviour (Merry & McCall Smith 2001). Unfortunately, in New Zealand in the 1990s, and more recently in the UK, there has been a tendency to respond to tragic accidents that have resulted in the deaths of patients by charging healthcare professionals with manslaughter. In many cases the level of negligence involved has been minimal (Ferner & McDowell 2006) and only a minority of the charges have been successfully prosecuted, but the impact on all concerned has been very substantial. This topic has been discussed in greater depth elsewhere (Merry 2007).
Whose risk should we manage?
One of the difficulties with legal responses that focus on punishing individual practitioners, and even with those in which the primary aim is to provide compensation for patients, is that they do little to promote safety at an organisational level. In fact, the motivation for many so-called safety initiatives is the protection of administrators, clinical leaders and individual practitioners. This is particularly true of some policies instituted in the name of safety or quality (see Box 11.2).