Excellent |
|
Good |
|
Fair |
|
Poor |
|
Causation
At the core of the diagnosis of child maltreatment is the concept of causation. A primary concern is how certain we are that a finding, or constellation of findings, has been caused by child maltreatment. As a matter of epistemology, causation remains debated. There exist no specific criteria by which one can “prove” that an outcome has a specific cause; one can only demonstrate a relationship between outcome and (proposed) cause. For clinical research the Bradford Hill Criteria (20, 21) serve as an agreed-upon guide for demonstrating causation. These criteria are listed in Table 13.2. Of these criteria, only one (temporality) is required. The other criteria provide support for a causal relationship. These criteria help us understand the relationship between a finding on imaging and the explanation provided for that finding.
Criterion | Description |
---|---|
1. Strength | There is a strong association between the cause and the effect |
2. Consistency | There is a consistent association between cause and effect within comparable studies |
3. Specificity | There are a limited number of competing potential causes for the effect |
4. Temporal sequence* | The effect should always follow the cause |
5. Dose response | The cause and effect have a dose-dependent relationship |
6. Biologic plausibility | The relationship between the cause and the effect should be biologically reasonable; should not break the laws of physics |
7. Coherence | The association between the cause and effect should be consistent with other known biologic causes; not an outlier |
8. Experimental evidence | The association between the cause and effect should be supported by experimental evidence |
9. Analogy | The mechanisms or processes in which the cause results in the effect should have other examples in nature |
* The only required criterion.
Evidence-based radiology
The application of EBM principles to the diagnostic imaging of child maltreatment requires some important distinctions. Evidence-based radiology (EBR) utilizes the same framework as EBM (the critical appraisal exercise) but has some particular emphases which make it distinct. Like EBM, EBR involves medical decision-making that integrates clinical experience with the strongest evidence in the medical literature (22–25). As with EBM, EBR begins with a foundation of crafting the appropriate clinical question and searching for the highest-quality evidence to answer that question (22, 24). A fundamental distinction of EBR is that the primary question is one of diagnosis as opposed to therapy (with the exception of interventional radiologic and several diagnostic radiologic procedures). The diagnosis question can be separated into two broad sub-questions: (1) the best way to identify a finding (Practice); and (2) the implications of the presence or absence of a finding (Meaning).
The first is a question of diagnostic efficacy (best way to image). These study designs have a specific hierarchy of variables which is particular to EBR (Table 13.3) (25, 26). These represent the important considerations when deciding the strongest way to identify a finding. The evidence utilized is intended to identify and support the best modality, technique, timing, and population required to identify findings, which, in our case, are concerning for being due to child abuse (25).
Level | Type of efficacy and typical measures |
---|---|
1 | Technical efficacy |
2 | Diagnostic accuracy efficacy |
3 | Diagnostic thinking efficacy |
4 | Therapeutic efficacy |
5 | Patient outcome efficacy |
6 | Societal efficacy |
The second question is one of implication or meaning. In essence it is a question of causation (what caused a finding). When assessing the evidence for causation, the medical literature, viewed in the context of the Bradford Hill Criteria, serves as a guide. Within the critical appraisal exercise of EBR, evaluating the level and quality of evidence retrieved is critical. The ideal study for a question of diagnosis would include a full spectrum of patients (various manifestations), randomly or consecutively selected, in which the studies are read by an independent reviewer who is masked to the group or condition (22, 27). As RCTs would not be a meaningful study design for diagnostic radiology, there is a separate hierarchy of strength of evidence which mirrors the hierarchy utilized for EBM (Table 13.4) (28). This structure removes RCTs and provides a guide to distinguish high-quality from low-quality evidence in studies within diagnostic radiology. This tool is beneficial in that studies of low quality do not require any analysis and can be discarded readily.
Grade of recommendation/level of evidence | Prognosis | Diagnosis | Differential diagnosis/symptom prevalence study |
---|---|---|---|
A/1a | SR (with homogeneity) of inception cohort studies; CDR validated in different populations | SR (with homogeneity) of Level 1 diagnostic studies; CDR with 1b studies from different clinical centers | SR (with homogeneity) of prospective cohort studies |
A/1b | Individual inception cohort study with >80% follow-up; CDR validated in a single population | Validating cohort study with good reference standards; or CDR tested within one clinical center | Prospective cohort study with good follow-up |
A/1c | All or none case series | Absolute SpPins and SnNouts* | All or none case series |
B/2a | SR (with homogeneity) of either retrospective cohort studies or untreated control groups in RCTs | SR (with homogeneity) of Level >2 diagnostic studies | SR (with homogeneity) of 2b and better studies |
B/2b | Retrospective cohort study or follow-up of untreated control patients in an RCT; “Derivation of CDR” or validated on split-sample only | Exploratory cohort study with good reference standards; CDR after derivation, or validated only on split-sample or databases | Retrospective cohort study, or poor follow-up |
B/2c | “Outcomes” research | – | Ecologic studies |
B/3a | – | SR (with homogeneity) of 3b and better studies | SR (with homogeneity) of 3b and better studies |
B/3b | – | Nonconsecutive study; or without consistently applied reference standards | Nonconsecutive cohort study, or very limited population |
C/4 | Case series (and poor quality prognostic cohort studies) | Case-control study, poor, or nonindependent reference standard | Case series or superseded reference standards |
D/5 | Expert opinion without explicit critical appraisal, or based on physiology, bench research, or “first principles” | Expert opinion without explicit critical appraisal, or based on physiology, bench research, or “first principles” | Expert opinion without explicit critical appraisal, or based on physiology, bench research, or “first principles” |
CDR, clinical decision rule; RCT, randomized controlled trial; SR, systematic review.
* An “Absolute SpPin” is a diagnostic finding whose specificity is so high that a positive result rules-in the diagnosis. An “Absolute SnNout” is a diagnostic finding whose sensitivity is so high that a negative result rules-out the diagnosis.
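The SpPin and SnNout definitions in the footnote above reduce to simple arithmetic on a 2×2 table of test results against a reference standard. The following is a minimal sketch in Python; the function name `diagnostic_metrics` and all counts are hypothetical, chosen only for illustration, and are not drawn from any of the cited studies.

```python
import math


def diagnostic_metrics(tp, fp, fn, tn):
    """Summarize a 2x2 table (finding vs. reference standard).

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives, and true negatives. Both groups are assumed non-empty.
    """
    sensitivity = tp / (tp + fn)  # share of affected subjects who show the finding
    specificity = tn / (tn + fp)  # share of unaffected subjects who lack the finding
    # Positive likelihood ratio: unbounded when there are no false positives.
    lr_pos = math.inf if fp == 0 else sensitivity / (1 - specificity)
    # Negative likelihood ratio: unbounded when there are no true negatives.
    lr_neg = math.inf if tn == 0 else (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical finding: seen in 40 of 100 affected children
# and in none of 200 unaffected children.
spin = diagnostic_metrics(tp=40, fp=0, fn=60, tn=200)
print(spin)
```

In this invented example the finding behaves as an “Absolute SpPin”: with no false positives, a positive result rules in the diagnosis, yet with a sensitivity of only 0.4 a negative result does nothing to rule it out.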
Lastly, it is important to highlight that EBR does not imply “cold reading” of imaging without clinical context. In fact, a systematic review of the literature has demonstrated that including clinical information in the reading of diagnostic imaging increases reading accuracy compared with blinded reading (29). Again, at its core, EBR combines the practice of radiology with an expanded appreciation of the literature and clinical context.
Common pitfalls of EBM and radiology
Bias
One of the primary strengths of the critical appraisal exercise of EBM/EBR is its ability to overcome bias. Bias, in clinical research, is systematic error that can artificially distort the data (30, 31). Bias often is a function of poor study design or implementation. An example of bias would be identifying a cohort of patients for a study in an inconsistent or arbitrary manner (selection bias). This cohort may have characteristics which are artificially present and not representative of the real population. Table 13.5 lists a number of other biases which can influence studies. Common methods to reduce bias in research are subject randomization, consecutive recruitment of subjects, prospective study design, and investigator blinding (31). In addition to research bias, there are a number of cognitive biases of which the practitioners of EBM/EBR should be aware (32, 33), a small sample of which are found in Table 13.6 (33). Cognitive biases are internal choice preferences which may influence judgment decisions.
Bias | Description |
---|---|
Selection bias | The subjects, interventions, or procedures are chosen in a nonrandom fashion which may affect the results |
Sample bias | The group chosen to study does not represent the population of interest |
Loss-to-follow-up bias | Subjects in a study are followed up in an unequal manner |
Disease spectrum bias | A limited form of the condition is studied; not representative of the true spectrum of the disease |
Referral bias | Local practices or conditions affect patient referrals or procedures |
Self-selection bias | The subjects who self-select are different from those who do not |
Recall bias | Subject historical recall can be incomplete for “unremarkable” but important items |
Interviewer bias | An interviewer may “frame” or “coach” the information from the subject |
Verification bias | Testing is performed on a subset of subjects who “screen positive” and not on the full population |
Response bias | Data are missing in a nonrandom manner; negative results are not evaluated as thoroughly |
Reviewer bias | The person collecting the data is not blinded to disease or condition |
Test review bias | Occurs in retrospective studies when the diagnosis is known at the beginning of the study |
Imperfect-standard bias | The diagnostic standard is subjective |
Confounding | The role other “unknown” variables play in the data; within the subject, disease, or study design |
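The effect of verification bias in the table above can be shown with simple counts. The sketch below is illustrative only: the population, the test characteristics, and the assumption that only one in ten test-negative subjects receives the reference standard are all hypothetical.

```python
def accuracy(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from 2x2 counts."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical full population of 1,000: 100 with the condition, 900 without.
# The index test has a true sensitivity of 0.60 and a true specificity of 0.90.
full = {"tp": 60, "fp": 90, "fn": 40, "tn": 810}

# Assumed verification scheme: every test-positive subject receives the
# reference standard, but only 1 in 10 test-negative subjects does.
verified = {
    "tp": full["tp"],
    "fp": full["fp"],
    "fn": full["fn"] // 10,  # only 4 of the 40 missed cases are ever confirmed
    "tn": full["tn"] // 10,  # only 81 of the 810 true negatives are ever confirmed
}

true_se, true_sp = accuracy(**full)
seen_se, seen_sp = accuracy(**verified)
print(f"true:     sensitivity={true_se:.2f} specificity={true_sp:.2f}")
print(f"apparent: sensitivity={seen_se:.2f} specificity={seen_sp:.2f}")
```

Under these assumptions the study would report a sensitivity well above the true value and a specificity well below it, solely because of who was selected for confirmatory testing.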
Cognitive bias | Description |
---|---|
Anchoring | The inclination to lock onto a diagnosis early on and to fail to reconsider after receiving contradictory information |
Availability | The tendency to judge things as being more likely to occur due to recent exposure to similar situations |
Base-rate neglect | The tendency to ignore the true prevalence of disease, thereby either inflating or reducing its base rate; often practiced in the strategy of “ruling out the worst-case scenario” |
Diagnosis momentum | Patients receive their diagnostic “label” and their diagnosis is carried on from person to person without being challenged |
Gender bias | The predisposition to believe that gender plays a role in the likelihood of diagnosis when no such trend exists |
Heuristics | Mental shortcuts used in cognitive reasoning to solve problems with minimal effort, i.e., “rules of thumb.” Based on previous knowledge and experiences but may lead to mistakes |
Outcomes bias | The predisposition to make diagnostic decisions that will lead to good outcomes; physicians lean toward making decisions targeted toward what they hope might happen rather than what they really believe might happen |
Overconfidence bias | The tendency to believe we know more than we do, or that we are correct more frequently than we really are |
Premature closure | The tendency to accept a diagnosis before it has been fully confirmed |
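Base-rate neglect, listed in the table above, can be made concrete with Bayes' theorem: the same finding carries very different weight depending on how common the condition is in the population being examined. The sketch below is a minimal illustration; the sensitivity, specificity, and prevalence values are hypothetical and not taken from any cited study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability the condition is present given a positive finding (Bayes' theorem)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics, held fixed while the base rate changes.
SENSITIVITY = 0.90
SPECIFICITY = 0.90

for prevalence in (0.01, 0.10, 0.30):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f} -> PPV={ppv:.2f}")
```

With the test characteristics held constant, the probability that a positive finding reflects true disease rises steeply as prevalence rises, which is why ignoring the base rate in either direction distorts the interpretation of a finding.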
Case reports
A frequent pitfall is placing too much emphasis on case reports as a foundation for clinical decisions. Case reports are the weakest form of evidence, and are by their very nature outliers. It is important to recognize the complete context of the clinical scenario and make clinical decisions from a perspective of “most likely” and not “most rare.” This is the difference between “it is possible” and “it is likely.” A single case report would indicate “it is possible.” The complete clinical picture would indicate whether “it is likely.” An example of how case reports can distort the understanding of a clinical scenario would be a case report of a one-year-old who was in a high-speed motor vehicle collision (34). The young girl had a profound cervical spine injury. Within the report the authors make no mention of intracranial injury (ICI). This report has been interpreted by others, despite the absence of data on the intracranial findings of the child, to support the belief that shaking an infant is not dangerous (35). Generalizing from a single case report would be similar to proposing that, because there exists a case report of the only person documented to have survived a rabies infection (36), rabies, an almost universally fatal infection, is now safe; or that, because survival of a plane crash has been reported (37), plane crashes are now safe.
Limiting evidence to only RCTs
Particularly in the context of diagnostic radiology, the RCT may be an inappropriate benchmark for evidence. As noted above, the appropriate study design for a question of diagnosis would be either a cross-sectional or cohort study (depending upon the precise question). Some authors have indicated that the absence of RCTs within the field of abusive head trauma (AHT) is a sign of weak published evidence (17, 38–40). This conclusion throws into stark relief a limited understanding of foundational EBM and clinical research (41).
Beyond the scope of diagnostic radiology it is important to note that not all meaningful clinical questions are, or can be, answered by an RCT (1). As reported in the BMJ in 2003, there is an absence of RCTs demonstrating the safety of parachutes (42), yet our clinical experience would endorse their safety over the alternative. Within the field of child maltreatment, take the example of the dangers of shaking an infant. The absence of an RCT would not support the contention that we cannot generate any meaningful evidence on the dangers of shaking an infant. We have a large amount of epidemiologic data on the dangers of trauma to an infant and the safety of short falls for an infant. This gives face validity to the contention that shaking is indeed quite likely dangerous for an infant. An RCT of shaking infants is clearly unethical. Thus, despite the absence of an RCT of shaking or dropping infants, the contention that shaking an infant is dangerous is nearly universally accepted. Thankfully, even the few hardcore skeptics about the dangers of shaking have not shaken an infant in an attempt to demonstrate shaking’s safety, but have limited their studies to using biofidelic dummies (43). This is an example of how context (clinical experience) can provide persuasive evidence in the absence of an RCT.
An unsophisticated view of the scientific process would limit knowledge to the RCT. The vast majority of knowledge, not just in medicine but in every field, has been obtained without the benefit of an RCT. Within the “hard” sciences – physics, chemistry, and biology – the RCT is not a part of gaining new understanding. From Newton’s Laws of Motion to Einstein’s Theory of Relativity, from the Krebs Cycle to photosynthesis, no RCTs have been done; yet we accept these as truths (or, at least, most of us do). The foundation of scientific knowledge is observation. As physicist Richard Feynman said in his 1964 Messenger Lecture Series at Cornell University, when trying to identify a new theory, “First we guess it. Then we compute the consequences of the guess to see what would be implied if this law that we guessed is right. Then we compare the result of the computation to nature, with experiment or experience; compare it directly with observation, to see if it works” (44, p. 165). Thus, to a Nobel Prize-winning physicist, experience and observation are valid and valuable ways to obtain new knowledge in science.
Ignoring the quality of a study
Literature appraisal is a skill which many busy clinicians may not have the ability to develop. The Materials and Methods section of a paper is often not read, or at least not scrutinized closely enough for the reader to be able to assess the underlying quality of the research design. Along with the exploding body of scientific and medical literature has been a steady decrease in the quality of that literature (45–47). It is now incumbent upon the reader not only to appreciate the appropriate study design for their clinical question, but also to be able to critically appraise the quality of the research itself. Contributing to the challenge has been the growth of the for-profit medical publishing sector (“Pay-to-Publish”). There are growing data that the increase of for-profit publishing has eroded the quality of published scientific and medical literature (48).
Criticism of EBM
Criticism of EBM has been present since its introduction (49, 50). The primary early criticisms of EBM were that publication bias produced misleading results in the published literature (49); that it de-emphasized clinical basics (history and physical examination) (49); that it was arrogant in its claims to the truth (50); and that it was designed simply as a cost-cutting maneuver (50). The concern was that the advent of EBM was an attempt to reduce patient-centered clinical experience to a simple algorithm. Over the intervening 15 years EBM supporters have tried to continue to balance clinical experience (focused patient-centered questions) with meaningful evaluations of the quality of evidence supporting a clinical judgment (literature appraisal).
One criticism leveled within the Child Maltreatment field is that EBM introduces bias by incorporating clinical experiences into the algorithm (Critical Appraisal Exercise). The concern is that by allowing any clinical judgment into the process, cognitive biases will outweigh the evidence from the literature. According to Sackett, clinical experiences are not the particular individual foibles of the practitioner; rather, “by individual clinical expertise we mean the proficiency and judgment that individual clinicians acquire through clinical experience and clinical practice” (1). In essence, EBM requires context. Interpretation of a particular clinical situation without context runs the great risk of being misled by case reports and outliers.
An additional criticism, particularly regarding AHT, is that the evidence base is weak. Some authors have used the EBM hierarchy of strength of evidence to grade the evidence as it pertains particularly to AHT (17, 51). This is not so much a criticism of EBM as a misapplication of it. The few skeptics of AHT will commonly cite their perceived lack of evidence in one breath and then propose an unsupported theory in the next. In a 2011 review, Barnes asserts that “much of the traditional literature on child abuse consists of anecdotal case series, case reports, reviews, opinions, and position papers” (51). Yet in the same piece, the author proposed pertussis, rickets, seizures, hypoxia, dysphagic choking, and “certain therapies” as mimics of findings seen in AHT without any meaningful supporting evidence (51). It cannot be both ways. One cannot overcome a perceived poor evidence base by adding anecdote or conjecture. Simply invoking the veil of EBM does not ordain the entire position.
“Temporary brittle bone disease”
An infant with multiple unexplained fractures raises the clinical concern for these fractures being the result of inflicted trauma or physical abuse. One particular concern for the clinical evaluation of infants with unexplained fractures is the assessment of a skeletal predisposition for fragility. As described in previous chapters, there exist numerous genetic, nutritional, anatomic, and functional conditions that predispose an infant to fractures and require consideration by the clinicians caring for the infant. Over the past 20 years the presence of a condition labeled “temporary brittle bone disease” (TBBD) has been proposed and embraced by a handful of professionals. The following is a deconstruction of the concept and evidence of TBBD, being mindful of the context of EBR and the fundamentals of clinical research. There are two main streams of TBBD promoted by its supporters. While they were both developed independently of each other, there is much overlap. Each will be discussed separately.
Paterson’s “TBBD”
In the early 1990s, Paterson and colleagues reported on 39 infants with fractures during the first year of life. The authors suggested that these infants suffered from a “self-limiting variant of osteogenesis imperfecta” due to a fundamental transient defect in collagen formation, and coined the term “temporary brittle bone disease” (52, 53). These reports, in addition to advocacy of the concept of TBBD in court cases of alleged child abuse, have stirred intense controversy in the United Kingdom and North America. The proposal of TBBD by Paterson and colleagues was initially hotly debated in the literature. Their methods, results, and conclusions have been scrutinized in letters to the editor, invited commentaries, and author responses (53–65).
As described, most of the clinical and radiologic features ascribed to TBBD are those classically noted in cases of abuse. These include vomiting and apnea, and fractures during early infancy (often involving the metaphyses and ribs), commonly in the absence of external features, found incidentally and without other conditions that would account for the fractures (52, 53, 66, 67). In essence, the diagnosis of TBBD retains the clinical and radiographic features of physical abuse of an infant; the only difference is that the clinician “believes the parent” did not cause the injury and attributes the injury to an unidentified (and undefined) bone fragility condition.
Paterson and colleagues also described a variety of clinical and laboratory findings that are nonspecific features or risk factors also associated with infant abuse (53, 57). These authors viewed their cases as distinctive because the fractures were unassociated with evidence of trauma and because they were found incidentally when radiographs were carried out for other reasons. The belief that fractures often require external signs of injury is not supported in the medical literature and is refuted by both Mathew and colleagues (68) and Peters and colleagues (69). Both of these studies support the infrequency of bruising with fractures, independent of whether caused by abuse or accident. The rationale for performing skeletal surveys (SSs) and bone scintigraphy in children with suspected abuse is predicated on the concept that fractures, which are strong indicators of abuse, are usually unapparent on physical examination (see Chapter 14) (57, 70–72). Additionally, the value of repeating the SS, in most cases, to identify fractures that may initially be radiographically occult has repeatedly been demonstrated (73–77).
It is interesting that 54% of Paterson and colleagues’ cases were preterm infants. Because 21% of their patients had gestational ages of less than 33 weeks, it is possible that some of these immature infants suffered from the well-recognized metabolic bone disease of prematurity. Fractures involving the metaphyses and ribs are familiar radiologic features in very premature infants and can be explained on the basis of decreased calcium stores – osteopenia of prematurity (see Chapter 8) (78–81).
The fracture types within the TBBD spectrum outlined by Paterson and colleagues included diaphyseal fractures in 57%, metaphyseal “abnormalities” in 76%, and rib fractures in 72% of cases (53). The published literature in the field has identified these radiologic observations as characteristic of abusive fractures (54, 57, 71, 72). Most metaphyseal fractures in osteogenesis imperfecta (OI) are nonspecific and do not conform to the classic metaphyseal lesion (CML) patterns originally described by Caffey and further elucidated by radiologic–histopathologic studies (see Chapter 2) (82, 83).
Fractures near the costovertebral articulations, as described by Paterson and others, entail certain specific mechanical factors in their pathogenesis (see Chapter 5). Studies point to severe anteroposterior (AP) thoracic compression, with leverage of the ribs over the transverse processes, to produce the characteristic radiologic and pathologic findings (84). Even when there is significant demineralization associated with metabolic bone disease in infants, posterior rib fractures are usually more laterally situated (see Fig. 8.3). Additional radiologic findings described by Paterson and colleagues in TBBD include periosteal “reactions” (49%), expanded costochondral junctions (CCJs) (34%), and overt osteopenia (31%) (53). As no radiologists were authors of this publication and the methods section does not address it, the certainty of these findings is unclear. Subperiosteal new bone formation is a well-recognized normal finding in young infants (57, 85) and prominent CCJs are often apparent on chest radiographs in infancy (57). It is unclear how these normal findings in infants are determined to be features consistent with TBBD.
Paterson and colleagues indicated that some of the fractures occurred while the infants were hospitalized (53). In a follow-up publication in 2009, Paterson reported on five children who he contended suffered fractures, while in medical care, as a result of TBBD (86). Two of the five children with fractures had clear alternative causes for their fractures (osteopenia of prematurity, birth-related trauma), and two of the five were likely abused (sent home with caretakers and returned to the hospital because of facial bruising). The fifth child spent their entire 15 months in the hospital and was noted to have 17 total rib fractures. TBBD was diagnosed by virtue of the child being in the hospital (86). It is likely that at least some of these five children are part of the cohort originally described by Paterson (86), as the author indicated a follow-up period for these children of 6–18 years. As noted in preceding chapters, rib fractures and many other osseous injuries may only become radiographically apparent 1–2 weeks after the injury (70, 75–77) and, therefore, fractures sustained prior to hospitalization may become evident only on follow-up radiographs acquired during or following hospitalization (57, 71, 72).
Of particular interest is the suggestion by Paterson and colleagues that these cases reflect a “temporary deficiency of an enzyme, perhaps a metalloenzyme, involved in the post-translational processing of collagen” (52, 53, 87). Unfortunately, since there is no “methods” section in either publication on this subject, it is impossible to assess the authors’ assertion of a metabolic defect. There has been no scientific support for the view that a temporary defect exists in either the structure or rate of production of type I collagen as seen with OI (see Chapter 9).
The authors describe the serum copper levels in three cases: absent in one and normal in two others. Given the absence of supporting data for copper deficiency as the pathophysiology of TBBD, Paterson has expanded the potential causes to include vitamin D deficiency (88, 89). To keep the hypothesis of TBBD active, Paterson broadens the concept to include those conditions that are otherwise real mimics of physical abuse. In publications subsequent to the initial cohort described by Paterson and colleagues, the author expanded the potential causes of TBBD to include vitamin D deficiency (see Chapter 8), vitamin C deficiency, heritable conditions, and other collagen defects (such as Ehlers–Danlos) (86, 90, 91). It is notable that Paterson indicates that after over 30 years of promoting TBBD, “We still do not know its cause or causes. We still have no specific diagnostic tests. We cannot exclude other causes of fractures in every one of our published cases” (92). Despite the absence of data on pathophysiology, Paterson and Monk endorse a prenatal “cause” for TBBD, stating “(t)he time scale of the fractures suggests that intrauterine factors are significant” (93).
The inability to distinguish between TBBD and physical abuse is most readily apparent in the report by Paterson and Monk in 2013 (91). The authors report on 20 infants diagnosed by Paterson over the prior 15 years who presented with fractures between 1 and 6 months of life, who were, in addition, noted to have intracranial bleeding. Seventeen of the 20 were referred by the parents’ attorneys and 5 were diagnosed based solely upon review of the medical records. All of the infants were noted on cranial computed tomography (CT) to have subdural hemorrhage (SDH), and eight were found to have retinal hemorrhaging as well. Nine of the infants had follow-up information available, five of whom had persistent neurologic sequelae.
This cohort of patients would reasonably be considered likely victims of physical abuse. Despite features that would point strongly to abuse, the authors contend that their patients did not suffer maltreatment because the CT scans in 15 cases had hypodensity consistent with the prior history of vomiting and apnea, 3 had accidental injuries reported by caretakers, and others likely developed the SDH in utero or at birth. They contend that the infants most likely suffered from an inadequately investigated and undefined metabolic bone disease. Most perplexingly, the authors state “Had our patients been the victims of abuse it would have been severe and repeated; apart from the subdural bleeding they had an average of 8.2 fractures, usually of different ages. The lack of subsequent injury is a significant pointer to the likelihood that the original abnormalities were not the result of inflicted trauma. Similar conclusions have been reached in relation to cases of TBBD without subdural bleeding” (6, 91). The authors cite themselves as support for this conclusion (94); for which the authors were required to pay $1,695.00 to publish (95). It is puzzling why this report on 20 infants with multiple fractures and SDH appeared in a journal whose aim is to publish “clinical investigations in pediatric endocrinology and basic research with relevance to clinical pediatric endocrinology and metabolism” (96). Endocrinology was not discussed in the manuscript by the authors in either this paper or its follow-up (93).
In summary, Paterson and colleagues have reported a heterogeneous group of infants, many of whom likely suffered from child abuse (52, 53, 93). The current primary arguments for the existence of TBBD, as offered by Paterson, are: (1) the patients he describes are similar to each other; (2) the infants have more fractures than would be expected by examination; (3) it looks similar to the cases he described as having occurred in the hospital (86); and (4) when children are returned to home, fractures do not recur (92). Circularity of their argument notwithstanding, the lack of rigorous scientific methodology in these publications, along with the absence of any evidence beyond small low-quality case series, makes it difficult to draw any meaningful conclusions from their work. Paterson and Monk describe TBBD as a “mimic” of physical abuse but do not provide a way to distinguish the two (93). To Paterson, the crucial feature is that all of the children whom he has diagnosed with TBBD over the past three decades have “consistent clinical and radiologic features” (93). He does not consider that this consistency is due to a systematically skewed internal lens through which these patients are seen. Or in Latin, “Populus Vult Decipi; Ergo Decipiatur” (The people wish to be deceived; therefore, let them be deceived).
Miller’s “TBBD”
After the publication of Paterson and colleagues’ case series (52, 53, 97), Miller began evaluating infants with multiple unexplained fractures who were referred for evaluation by parents or their attorneys (98). Miller and Hangartner reported on a series of 33 infants, 26 of whom they diagnosed with TBBD. The diagnostic criteria used were (98):
Miller and Hangartner reported a fracture pattern similar to the TBBD cohort described by Paterson (53). For Miller and Hangartner, the most common fractures identified were rib fractures (81%), followed by diaphyseal fractures (77%) and metaphyseal fractures (50%). Of the infants with rib fractures, 73% had posterior arc involvement and 77% had 4 or more rib fractures of the same age (98).
Of particular interest to the authors were the intrauterine conditions of the infants. They stated that 95% of the mothers of singleton pregnancies reported decreased fetal movement and that 92% of the pregnancies had some form of intrauterine confinement, including multiple gestation, oligohydramnios, and fetopelvic disproportion (98).
Miller and Hangartner reported that they performed radiographic absorptiometry and CT densitometry measurements on 9 of the infants (studies at 5–24 months of age) with TBBD, which were compared with 7 controls (studies at 10–27 months of age). It is not described in the methods how the controls were obtained, or how they were similar to the subjects. Despite the bone density studies being performed months, and perhaps years, after the fractures potentially caused by TBBD, the authors reported that the control infants had normal bone density and that the infants they had diagnosed with TBBD fell “2 or more SDs (Z-score) below the mean” (98).
The authors report that TBBD is not a variant of OI (17/26 had normal collagen tests and the fractures were different from those seen in OI) (98). They distinguish TBBD from inflicted trauma for these four reasons: (1) there is an absence of bruising associated with the fractures along with the absence of other abusive features (retinal hemorrhage, intracranial hemorrhage); (2) infants with rib fractures did not have intrathoracic injury (thus, the trauma was not severe); (3) the infants suffered their fractures in a narrow time period (3–18 weeks); and (4) the infants demonstrated low bone density (months or years later). The importance of bone densitometry in the support of the theory of TBBD is curious, as Miller himself subsequently indicates that the technology is limited due, in part, to the lack of “appropriate age and size-matched controls” (99). Additionally, it is perplexing how the fracture pattern in TBBD being different from the pattern seen in OI makes OI less likely, while its being similar to the pattern seen in physical abuse is inconsequential.
Based upon their data, Miller and Hangartner propose that the mechanism of TBBD is not copper deficiency, as proposed by Paterson and colleagues, but is a result of decreased fetal movement resulting in decreased load on fetal bones. This decreased load on fetal bones results in decreased bone formation. They propose that this is the mechanostat/mechanical loading model of bone formation (Utah Paradigm) as proposed by Frost (100, 101). Miller cites Rodriguez and colleagues (102, 103) as having demonstrated in “both the human and the rat, the fetal immobilization resulted in decreased bone diameter” (99), and that their studies “indicate that fetal immobilization causes reduced periosteal bone formation” (99). These conclusions are quite different from those that Rodriguez and colleagues reach. From the paper Miller cites, “Previous clinical studies have suggested that muscular strength is more important than movement in the regulation of fetal long bone development. This suggestion is based in the observation that long bone hypoplasia is a usual finding in newborns with congenital neuromuscular diseases but not in newborns with oligohydramnios sequence who also had intrauterine limitation of motion but normal muscular activity” (102). Rodriguez and colleagues conclude that intrauterine restriction does not result in long bone hypoplasia, contrary to what Miller had reported.
Miller contends that there are a number of reasons that the infants described by Paterson and himself do indeed have TBBD and are not victims of abuse (99, 104). He contends that although there are similar features between TBBD and abuse (parental denial of injury, types of fractures, normal metabolic testing, and normal appearance of bones on plain film) there are features which distinguish the two. Miller proposes four main features of TBBD which can distinguish it from physical abuse (104). These four are similar to the points promoted by Paterson (92) and are: (1) the absence of bruising associated with fractures; (2) rib fractures without intrathoracic injury; (3) younger infants than abused infants; and (4) absence of recurrence of fractures.
As these four points form the basis of much of the support for the existence of both “types” of TBBD, it is important to examine how reasonable they are. First, as noted above, the absence of bruising with fractures in infants has been well documented (68, 69) and is an accepted premise in pediatric medicine. Mathew and colleagues specifically note “the absence of bruising cannot be taken to imply either underlying bone disease or an increased possibility of nonaccidental injury” (68). Second, Miller cites Garcia and colleagues (105) as support for the belief that intrathoracic injuries need to be present in children with traumatic rib fractures. Of the children in Garcia and colleagues’ paper, 70% sustained rib fractures from motor vehicle collisions. The presence of internal thoracic injury would not be unexpected in these children (106). Third, Miller contends that abusive fractures typically occur anytime between birth and 36 months of life and that infants with TBBD fracture between birth and 6 months (99). It is unclear how this feature of TBBD distinguishes it from physical abuse, as an infant under six months of age could sustain fractures from either abuse or TBBD. Last, the absence of recurrence of fractures is strongly promoted by both Miller and Paterson, and both refer only to anecdotal experience and a report by Paterson and Monk (94). This is a report of the follow-up of 61 of 85 subjects diagnosed by Paterson with TBBD between 1985 and 2000. The authors followed up by a phone call to most parents over an average of 6.9 years. Three reported subsequent fractures at two, six, and seven years of age. The authors report that the parents indicated that none of the index children (diagnosed with TBBD) had “sustained further injuries that were thought to represent nonaccidental injury” (94). In essence, the parents reported that none of their children had subsequently been abused.
An additional notable feature promoted by Miller as distinguishing TBBD from physical abuse is that parents and caretakers of children who are abused often have a “high-risk profile,” and that parents of infants with TBBD have a risk profile “similar to general population” (99). This claim is made absent any data. Neither Miller nor Paterson has reported any data on the “parental profile.” Including this factor would indicate that Miller views the parents of infants whom he diagnoses with TBBD as “good” people and that he does not view them as capable of abusing their infants.
Conclusion
Differentiation of osseous lesions due to child abuse from those associated with naturally occurring illness is generally readily accomplished when a systematic clinical, laboratory, and radiologic evaluation is performed. Experienced clinicians and radiologists with an interest in this field on rare occasions encounter difficult cases in which the diagnosis remains obscure despite thorough and diligent investigations. Fortunately, a period of observation in a safe environment usually resolves these unusual and challenging cases.
When placed in the context of EBR, the hypothesis of TBBD has some very notable shortcomings. First, the only primary data proposed to support TBBD are nonsequential, retrospective, unblinded case series of a very peculiar cohort of infants (53, 86, 91, 94, 98). Within the evidence hierarchy noted in Table 13.1 (12) and Table 13.4 (27), we see that this is a poor level of evidence. The proper study design (for a question of diagnosis or causation) would be a prospective cohort or case-control study. In the 20 years since TBBD was proposed, a study of this kind has not been reported.
Another flaw within the research presented for TBBD is the profound amount of bias the studies contain. The primary reports of data (53, 86, 91, 94, 98) contain elements of each of the biases reported in Table 13.5 (31). The infants studied are haphazardly selected (Selection Bias) and are not representative of the population of interest (Sample Bias). The infants are referred primarily by attorneys (Referral Bias) or parents accused of abusing them (Self-Selection Bias). They are either not followed up or are followed up in a haphazard fashion (Loss-To-Follow-Up Bias). Much of the information acquired by the investigators is from parental recall (Recall Bias). The investigators are not blinded to the condition or outcome (Reviewer Bias) and have an explicit motivation for a particular finding (Interviewer Bias). Last, the reports have two fatal flaws: the “outcome” or standard is subjective (Imperfect-Standard Bias), and the outcome is known by the investigator in advance (Test Review Bias).
Within the field of the philosophy of science, the distinction between science and pseudoscience is the Demarcation problem (107–109). A centerpiece of demarcation is the falsifiability test. Falsifiability is the premise that a true scientific theory could, in principle, be proved false (44, 107). This is also called “disprovability.” This does not mean that a theory has to be proved true to be scientific. It means that a theory has to have the ability to be proved false to be scientific. Einstein’s Theory of Relativity, when proposed, was not yet proved true; but it could be proved false. The predictions and implications of the theory were very specific and unambiguous. The special theory of relativity was proposed in 1905 and the general theory in 1915, and it took the observations during a solar eclipse in 1919 to show that predictions made by the general theory were correct. The absence of evidence does not make a theory unscientific; it is the inability to disprove it which does. A theory which cannot be proven to be false is deemed to be pseudoscientific. TBBD is a nondisprovable hypothesis. Under both frameworks proposed by Paterson and Miller, TBBD can never be demonstrated to be false. Again, quoting Nobel Prize-winning physicist Richard Feynman, “You cannot prove a vague theory wrong” (44, p. 158). For the supporters of TBBD, there can be no research study design which could prove that TBBD does not exist. To the contrary, when presented with evidence that in the presence of normal bones, copper deficiency was not a reasonable explanation for fractures (110), Paterson indicated that despite the evidence it was true in his experience (67). Likewise, when presented with the evidence that infants with congenital neuromuscular diseases have osteopenia, but that infants with intrauterine restriction of movement have normal bone density (102, 103, 111), Miller continues to promote fetal restriction as a cause of TBBD, but amends the theory to include vitamin D deficiency (104).
Miller has repeatedly posited TBBD as a potential explanation for case reports of infants with fractures reported in the medical literature. These include metaphyseal fractures from physical therapy (112), cervical spine injury (113), rib fracture from physical therapy (99), and metaphyseal fracture from cesarean section (114). The authors of the case reports find the proposition of TBBD unreasonable (115–117). Miller sees TBBD in many infants with fractures despite other more reasonable explanations, and finds no circumstance in which TBBD could be shown not to be present. Paterson and Monk blur the lines between real medical conditions and TBBD as well. When framing their three-decade experience diagnosing infants with TBBD, they state “It is likely that TBBD is a syndrome which includes osteopathy of prematurity” (93). This blurring of the lines between fact and fantasy makes disproving TBBD impossible. Another example of TBBD being nondisprovable is when Paterson and Monk refer to a large anterior fontanel: “Our results indicate the potential value of this measurement [anterior fontanel size] in the diagnosis of TBBD. Normal values do not exclude the syndrome, but high values point to a bone abnormality” (93). The same double standard is applied to bruising. Paterson and Monk report that most of the infants with TBBD do not have bruising as part of their findings (93). They then dismiss any bruising that may be found in infants as not from abuse, but that “bruising may itself be a feature of the underlying disorder” (93). If an infant with fractures has bruising it is from TBBD, and if they do not have bruising, it is because they have TBBD.
The foundational process of EBR (the critical appraisal exercise), in which the practitioner blends a clear clinical question with a comprehensive and critical analysis of the relevant published literature, can be a powerful tool for evaluating children who are suspected victims of maltreatment. The objective appraisal of the quality of evidence allows experts in the clinical and judicial arenas to separate the wheat from the chaff. Part of the advance of science is the proposal of novel theories and the ability to collect evidence to sustain, refute, or amend the theory. What has occurred with the theory of TBBD is that it has taken on a Frankenstein-like existence in which, despite no data of reasonable quality supporting its existence, and compelling data undermining its proposed pathophysiologic mechanisms, it regenerates in an ever more tortured form. The primary explanation for the continued interest in the theory is the interface of the judicial system with child abuse pediatrics. Without judicial proceedings, TBBD would not have survived this long. There is no medical or scientific debate regarding the existence of TBBD; the debate exists for the sake of the courtroom. The adversarial environment of the judicial process has fomented a culture of adversarial allegiance (118). This well-described concern in the court system is the bias of an expert towards a “side,” which is felt to influence the interpretation of the scientific or medical evidence offered (118, 119). By applying the tenets of EBR, the clinician and the courtroom can both resolve the conjecture of TBBD in an expedient fashion.
I have chosen to provide a lengthy discussion of TBBD not because it is a credible concept, but rather to illustrate the method by which the critical reader should approach novel hypotheses that relate to the differential diagnosis of child abuse. Unlike most disease concepts that have their origins in empirical evidence, TBBD arose as a means to undermine the foundation of cases of alleged child abuse. The “beauty” of the concept is that it is amorphous and broad enough that as each explanation for TBBD (variant OI/copper deficiency/fetal immobilization) is embraced and then discarded, the stage remains set to place the blame for common abusive injuries not on those responsible, but on rare medical conditions (e.g., congenital rickets). Sadly, evidence-driven scholarship that might otherwise be poured into the relevant research is increasingly diverted to studies designed to debunk ideology-driven, highly speculative hypotheses, entirely devoid of any scientific foundation (see the CML discussion in the “Rickets versus abuse” section of Chapter 8) (120–122). It is certain that solid clinical experience and evidence-based medical research will continue to uncover and elucidate valid disorders that mimic child maltreatment. It is also an unfortunate certainty that TBBD and other poorly supported hypotheses will continue to haunt our courtrooms as undead specters, daunting those who enter, creating confusion, but ultimately made of vapor and mythology.