Evidence-based radiology and child abuse

Published on 11/04/2017 by admin

Filed under Radiology

Last modified 11/04/2017

  • Systematic review
  • Multicenter studies
  • RCTs
  • Observational studies
  • Uncontrolled trials with dramatic results
  • Before and after studies
  • Non-RCTs
  • Descriptive studies
  • Case studies
  • Expert opinion
  • Studies of poor methodologic quality

Adapted from Evans D. Hierarchy of evidence: a framework for ranking evidence evaluating healthcare interventions. J Clin Nurs. 2003;12(1):77–84.

At the core of the diagnosis of child maltreatment is the concept of causation. A primary concern is how certain we are that a finding, or constellation of findings, has been caused by child maltreatment. As a matter of epistemology, causation remains debated. There exist no specific criteria by which one can “prove” that an outcome has a specific cause; one can only demonstrate a relationship between outcome and (proposed) cause. For clinical research, the Bradford Hill Criteria (20, 21) serve as an agreed-upon guide for demonstrating causation. These criteria are listed in Table 13.2. Of these criteria, only one (temporality) is required; the others provide support for a causal relationship. These criteria help us understand the relationship between a finding on imaging and the explanation provided for that finding.

Table 13.2 The Bradford Hill criteria for causality

1. Strength: There is a strong association between the cause and the effect
2. Consistency: There is a consistent association between cause and effect within comparable studies
3. Specificity: There are a limited number of competing potential causes for the effect
4. Temporal sequence*: The effect should always follow the cause
5. Dose response: The cause and effect have a dose-dependent relationship
6. Biologic plausibility: The relationship between the cause and the effect should be biologically reasonable; should not break the laws of physics
7. Coherence: The association between the cause and effect should be consistent with other known biologic causes; not an outlier
8. Experimental evidence: The association between the cause and effect should be supported by experimental evidence
9. Analogy: The mechanisms or processes in which the cause results in the effect should have other examples in nature

* The only required criterion.

Adapted from Hill AB. The environment and disease: association or causation? Proc R Soc Med. 1965;58:295–300.

Evidence-based radiology

The application of EBM principles to the diagnostic imaging of child maltreatment requires some important distinctions. Evidence-based radiology (EBR) utilizes the same framework as EBM (the critical appraisal exercise) but has some particular emphases which make it distinct. Similar to EBM, EBR involves medical decision-making that integrates clinical experience with the strongest evidence in the medical literature (22–25). As with EBM, EBR begins with a foundation of crafting the appropriate clinical question and searching for the highest-quality evidence to answer that question (22, 24). A fundamental distinction of EBR is that the primary question is one of diagnosis as opposed to therapy (with the exception of interventional radiologic and several diagnostic radiologic procedures). The diagnosis question can be separated into two broad fundamental sub-questions: (1) the best way to identify a finding (Practice); and (2) the implications of the presence or absence of a finding (Meaning).

The first is a question of diagnostic efficacy (best way to image). These study designs have a specific hierarchy of variables which is particular to EBR (Table 13.3) (25, 26). These represent the important considerations when deciding the strongest way to identify a finding. The evidence utilized is intended to identify and support the best modality, technique, timing, and population required to identify findings, which, in our case, are concerning for being due to child abuse (25).

Table 13.3 Hierarchical model of efficacy: typical measures of analysis

Level Type of efficacy and typical measures
1 Technical efficacy:
  • resolution of line pairs
  • modulation transfer function change
  • grayscale range, amounts of mottle
  • sharpness
  • computerized imaging parameters
2 Diagnostic accuracy efficacy:
  • yield of abnormal or normal diagnoses in a case series
  • diagnostic accuracy (percentage of correct diagnoses in case series)
  • sensitivity, specificity, positive, and negative predictive values in a defined clinical problem setting
  • measures of area under the receiver operating characteristic (ROC) curve
3 Diagnostic thinking efficacy:
  • number (percentage) of cases in a series in which image was judged “helpful” for rendering the diagnosis
  • entropy change in differential diagnosis probability distribution
  • difference in clinicians’ subjectively estimated diagnosis probabilities before and after test information
  • empirical subjective log-likelihood ratio for test positive and negative in a case series
4 Therapeutic efficacy:
  • number (percentage) of times image was judged “helpful” in planning patient care in a case series
  • percentage of times medical or surgical procedure avoided due to image information
  • number or percentage of times planned therapy pretest changed after the image information was obtained (retrospectively inferred from clinical records)
  • number or percentage of times clinicians’ prospectively stated therapeutic choices changed after test information
  • ?patient utility assessment (see text)
5 Patient outcome efficacy:
  • percentage of patients improved with test versus without test
  • morbidity (or procedures) avoided after having image information
  • change in quality-adjusted life expectancy
  • expected value of test information in quality-adjusted life years (QALYs)
  • cost per QALY saved with image information
  • patient utility assessment (eg, Markov modeling, time trade-off)
6 Societal efficacy:
  • benefit-cost analysis from societal viewpoint
  • cost-effectiveness analysis from societal viewpoint

Reprinted with permission from Thornbury JR. Intermediate outcomes: diagnostic and therapeutic impact. Acad Radiol. 1999;6(Suppl. 1):S58–65; discussion S66–8.
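The Level 2 measures in Table 13.3 (sensitivity, specificity, and the predictive values) all derive from a 2 × 2 comparison of the imaging result against a reference standard. The following is a minimal sketch of that arithmetic, not a method taken from the cited sources; the class name `TwoByTwo` and the counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing an imaging test against a reference standard."""
    tp: int  # test positive, condition present
    fp: int  # test positive, condition absent
    fn: int  # test negative, condition present
    tn: int  # test negative, condition absent

    def sensitivity(self) -> float:
        # Proportion of those with the condition who test positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of those without the condition who test negative.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Proportion of positive tests that are true positives.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Proportion of negative tests that are true negatives.
        return self.tn / (self.tn + self.fn)

    def accuracy(self) -> float:
        # Proportion of all results that agree with the reference standard.
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total


# Hypothetical counts, chosen only for illustration (not data from any study).
example = TwoByTwo(tp=45, fp=10, fn=5, tn=140)

for name in ("sensitivity", "specificity", "ppv", "npv", "accuracy"):
    print(f"{name}: {getattr(example, name)():.3f}")
```

Sensitivity and specificity are calculated within the condition-present and condition-absent groups, whereas the predictive values depend on how common the condition is in the sample studied. This is one reason the spectrum and selection of patients matter when appraising a diagnostic study.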

The second question is a question of implication or meaning. In essence it is a question of causation (what caused a finding). When assessing the evidence for causation, the medical literature, in the context of the Bradford Hill Criteria, serves as a guide. Within the critical appraisal exercise of EBR, evaluating the level and quality of evidence retrieved is critical. The ideal study for a question of diagnosis would include a full spectrum of patients (various manifestations), randomly or consecutively selected, in which the studies are read by an independent reviewer who is masked to the group or condition (22, 27). As RCTs would not be a meaningful study design for diagnostic radiology, there is a separate hierarchy of strength of evidence which mirrors the hierarchy utilized for EBM (Table 13.4) (28). This structure removes RCTs and provides a guide to distinguish high-quality from low-quality evidence in studies within diagnostic radiology. This tool is beneficial in that studies of low quality do not require any analysis and can be discarded readily.

Table 13.4 Levels of evidence for diagnostic studies

Grade of recommendation/level of evidence | Prognosis | Diagnosis | Differential diagnosis/symptom prevalence study
A/1a | SR (with homogeneity) of inception cohort studies; CDR validated in different populations | SR (with homogeneity) of Level 1 diagnostic studies; CDR with 1b studies from different clinical centers | SR (with homogeneity) of prospective cohort studies
A/1b | Individual inception cohort study with >80% follow-up; CDR validated in a single population | Validating cohort study with good reference standards; or CDR tested within one clinical center | Prospective cohort study with good follow-up
A/1c | All or none case series | Absolute SpPins and SnNouts* | All or none case series
B/2a | SR (with homogeneity) of either retrospective cohort studies or untreated control groups in RCTs | SR (with homogeneity) of Level >2 diagnostic studies | SR (with homogeneity) of 2b and better studies
B/2b | Retrospective cohort study or follow-up of untreated control patients in an RCT; “Derivation of CDR” or validated on split-sample only | Exploratory cohort study with good reference standards; CDR after derivation, or validated only on split-sample or databases | Retrospective cohort study, or poor follow-up
B/2c | “Outcomes” research | - | Ecologic studies
B/3a | - | SR (with homogeneity) of 3b and better studies | SR (with homogeneity) of 3b and better studies
B/3b | - | Nonconsecutive study; or without consistently applied reference standards | Nonconsecutive cohort study, or very limited population
C/4 | Case series (and poor quality prognostic cohort studies) | Case-control study, poor, or nonindependent reference standard | Case series or superseded reference standards
D/5 | Expert opinion without explicit critical appraisal, or based on physiology, bench research, or “first principles” | Expert opinion without explicit critical appraisal, or based on physiology, bench research, or “first principles” | Expert opinion without explicit critical appraisal, or based on physiology, bench research, or “first principles”

CDR, clinical decision rule; RCT, randomized controlled trial; SR, systematic review.

* An “Absolute SpPin” is a diagnostic finding whose specificity is so high that a positive result rules in the diagnosis. An “Absolute SnNout” is a diagnostic finding whose sensitivity is so high that a negative result rules out the diagnosis.

Adapted from Oxford Centre for Evidence-Based Medicine Working Group. Levels of Evidence I (March 2009). Available from http://www.cebm.net/?o=1025 (accessed April 30, 2009).
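The SpPin idea in the footnote to Table 13.4 can be made concrete with likelihood ratios: a highly specific finding has a large positive likelihood ratio, so a positive result shifts the probability of the diagnosis sharply upward. The following is a minimal sketch under stated assumptions; the function names are introduced here for illustration, and the sensitivity, specificity, and pre-test probabilities are hypothetical values, not figures from the literature.

```python
import math


def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity); infinite when specificity is 1."""
    if specificity == 1.0:
        return math.inf
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR- = (1 - sensitivity) / specificity; assumes specificity > 0."""
    return (1.0 - sensitivity) / specificity


def post_test_probability(pre_test: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability (strictly between 0 and 1) into a
    post-test probability by way of odds."""
    if math.isinf(likelihood_ratio):
        return 1.0  # a perfectly specific finding rules the diagnosis in
    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical finding: insensitive but highly specific.
lr_pos = positive_likelihood_ratio(sensitivity=0.30, specificity=0.99)

# The same positive result means different things at different pre-test
# probabilities, which is why clinical context cannot be ignored.
for pre_test in (0.01, 0.10, 0.50):
    post = post_test_probability(pre_test, lr_pos)
    print(f"pre-test {pre_test:.2f} -> post-test {post:.2f}")
```

The design choice here is to work in odds rather than probabilities, because a likelihood ratio multiplies odds directly. The loop also shows that even a strongly specific finding is interpreted against a pre-test probability.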

Lastly, it is important to highlight that EBR does not imply “cold reading” of imaging without clinical context. In fact, including clinical information in the reading of diagnostic imaging has been demonstrated by a systematic review of the literature to increase reading accuracy compared with blinded reading (29). Again, at its core, EBR is combining the practice of radiology with an expanded appreciation of the literature and clinical context.

Common pitfalls of EBM and radiology


One of the primary strengths of the critical appraisal exercise of EBM/EBR is that it helps to overcome bias. Bias, in clinical research, is systematic error that can artificially distort the data (30, 31). Bias often is a function of poor study design or implementation. An example of bias would be identifying a cohort of patients for a study in an inconsistent or arbitrary manner (selection bias). This cohort may have characteristics which are artificially present and not representative of the real population. Table 13.5 lists a number of other biases which can influence studies. Common methods to reduce bias in research are subject randomization, consecutive recruitment of subjects, prospective study design, and investigator blinding (31). In addition to research bias, there are a number of cognitive biases that the practitioners of EBM/EBR should be aware of (32, 33), a small sample of which are found in Table 13.6 (33). Cognitive biases are internal choice preferences which may influence judgment decisions.

Table 13.5 Potential biases in research and clinical practice

Bias | Description
Selection bias | The subjects, interventions, or procedures are chosen in a nonrandom fashion which may affect the results
Sample bias | The group chosen to study does not represent the population of interest
Loss-to-follow-up bias | Subjects in a study are followed up in an unequal manner
Disease spectrum bias | A limited form of the condition is studied; not representative of the true spectrum of the disease
Referral bias | Local practices or conditions affect patient referrals or procedures
Self-selection bias | The subjects who self-select are different from those who do not
Recall bias | Subject historical recall can be incomplete for “unremarkable” but important items
Interviewer bias | An interviewer may “frame” or “coach” the information from the subject
Verification bias | Testing occurs to a subset of subjects who “screen positive” and not to a full population
Response bias | Data are missing in a nonrandom manner; negative results are not evaluated as thoroughly
Reviewer bias | The person collecting the data is not blinded to disease or condition
Test review bias | In retrospective studies; when the diagnosis is known at the beginning of the study
Imperfect-standard bias | The diagnostic standard is subjective
Confounding | The role other “unknown” variables play in the data; within the subject, disease, or study design

From Sica GT. Bias in research studies. Radiology. 2006;238(3):780–9.

Table 13.6 Common cognitive biases

Cognitive bias | Description
Anchoring | The inclination to lock onto a diagnosis early on and fail to reconsider after receiving contradictory information
Availability | The tendency to judge things as being more likely to occur due to recent exposure to similar situations
Base-rate neglect | The tendency to ignore the true prevalence of disease and therefore either inflate or reduce its base rate; often practiced in the strategy of “ruling out the worst-case scenario”
Diagnosis momentum | Patients receive their diagnostic “label” and their diagnosis is carried on from person to person without being challenged
Gender bias | The predisposition to believe that gender plays a role in the likelihood of diagnosis when no such trend exists
Heuristics | Mental shortcuts used in cognitive reasoning to solve problems with minimal effort, i.e., “rules of thumb.” Based on previous knowledge and experiences but may lead to mistakes
Outcomes bias | The predisposition to make diagnostic decisions that will lead to good outcomes; physicians lean toward making decisions targeted toward what they hope might happen rather than what they really believe might happen
Overconfidence bias | The tendency to believe we know more than we do, or that we are correct more frequently than we really are
Premature closure | The tendency to accept a diagnosis before it has been fully confirmed

From Vick A, Estrada CA, Rodriguez JM. Clinical reasoning for the infectious disease specialist: a primer to recognize cognitive biases. Clin Infect Dis. 2013;57(4):573–8.

Case reports

A frequent pitfall is placing too much emphasis on case reports as a foundation for clinical decisions. Case reports are the weakest form of evidence, and are by their very nature outliers. It is important to recognize the complete context of the clinical scenario and make clinical decisions from a perspective of “most likely” and not “most rare.” This is the difference between “it is possible” and “it is likely.” A single case report would indicate “it is possible.” The complete clinical picture would indicate whether “it is likely.” An example of how case reports can distort the understanding of a clinical scenario would be a case report of a one-year-old who was in a high-speed motor vehicle collision (34). The young girl had a profound cervical spine injury. Within the report the authors make no mention of intracranial injury (ICI). This report has been interpreted by others, despite the absence of data on the intracranial findings of the child, to support the belief that shaking an infant is not dangerous (35). Generalizing from a single case report would be similar to proposing that because there exists a case report of the only person who has been documented to have survived a rabies infection (36), rabies, an almost universally fatal infection, is now safe; or that, given survival of a plane crash has been reported (37), plane crashes are now safe.

Limiting evidence to only RCTs

Particularly in the context of diagnostic radiology, the RCT may be an inappropriate benchmark for evidence. As noted above, the appropriate study design for a question of diagnosis would be either a cross-sectional or cohort study (depending upon the precise question). Some authors have indicated that the absence of RCTs within the field of abusive head trauma (AHT) is a sign of weak published evidence (17, 38–40). This conclusion throws into stark relief a limited understanding of foundational EBM and clinical research (41).

Beyond the scope of diagnostic radiology it is important to note that not all meaningful clinical questions are, or can be, answered by an RCT (1). As reported in the BMJ in 2003, there is an absence of RCTs demonstrating the safety of parachutes (42), yet our clinical experience would endorse their safety over the alternative. Within the field of child maltreatment, take the example of the dangers of shaking an infant. The absence of an RCT would not support the contention that we cannot generate any meaningful evidence on the dangers of shaking an infant. We have a large amount of epidemiologic data on the dangers of trauma to an infant and the safety of short falls for an infant. This gives face validity to the contention that shaking is indeed quite likely dangerous for an infant. An RCT of shaking infants is clearly unethical. Thus, despite the absence of an RCT of shaking or dropping infants, the contention that shaking an infant is dangerous is nearly universal. Thankfully, even the few hardcore skeptics about the dangers of shaking have not shaken an infant in an attempt to demonstrate shaking’s safety, but limited their studies to using biofidelic dummies (43). This is an example of how context (clinical experience) can provide persuasive evidence in the absence of an RCT.

An unsophisticated view of the scientific process would limit knowledge to the RCT. The vast majority of knowledge, not just in medicine but in all fields, has been obtained without the benefit of an RCT. Within the “hard” sciences – physics, chemistry, and biology – the RCT is not a part of gaining new understanding. From Newton’s Laws of Motion to Einstein’s Theory of Relativity, from the Krebs Cycle to photosynthesis, no RCTs have been done; yet we accept these as truths (or, at least, most of us do). The foundation of scientific knowledge is observation. As physicist Richard Feynman said in his 1964 Messenger Lecture Series at Cornell University, when trying to identify a new theory, “First we guess it. Then we compute the consequences of the guess to see what would be implied if this law that we guessed is right. Then we compare the result of the computation to nature, with experiment or experience; compare it directly with observation, to see if it works” (44, p. 165). Thus, to a Nobel Prize-winning physicist, experience and observation are valid and valuable ways to obtain new knowledge in science.

Ignoring the quality of a study

Literature appraisal is a skill which many busy clinicians may not have the ability to develop. The Materials and Methods section of a paper is often not read, or at least not scrutinized closely enough for the reader to be able to assess the underlying quality of the research design. Along with the exploding body of scientific and medical literature has been a steady decrease in the quality of that literature (45–47). It is now incumbent upon the reader not only to appreciate the appropriate study design for their clinical question, but also to be able to critically appraise the quality of the research itself. Contributing to the challenge has been the growth of the for-profit medical publishing sector (“Pay-to-Publish”). There are growing data that the increase of for-profit publishing has eroded the quality of published scientific and medical literature (48).

Criticism of EBM

Criticism of EBM has been present since its introduction (49, 50). The primary early criticisms of EBM included publication bias, resulting in misleading results in the published literature (49); that it de-emphasized clinical basics (history and physical examination) (49); that it was arrogant in its claims to the truth (50); and that it was designed to be simply a cost-cutting maneuver (50). The concern was that the advent of EBM was an attempt to reduce patient-centered clinical experience to a simple algorithm. Over the intervening 15 years EBM supporters have tried to continue to balance clinical experience (focused patient-centered questions) with meaningful evaluations of the quality of evidence supporting a clinical judgment (literature appraisal).

One criticism leveled within the Child Maltreatment field is that EBM introduces bias by incorporating clinical experiences into the algorithm (Critical Appraisal Exercise). The concern is that by allowing any clinical judgment into the process, cognitive biases will outweigh the evidence from the literature. According to Sackett, clinical experiences are not the particular individual foibles of the practitioner, but by “individual clinical expertise we mean the proficiency and judgment that individual clinicians acquire through clinical experience and clinical practice” (1). In essence, EBM requires context. Interpretation of a particular clinical situation without context runs the great risk of being misled by case reports and outliers.

An additional criticism, particularly regarding AHT, is that the evidence base is weak. Some authors have used the EBM hierarchy of strength of evidence to grade the evidence as it pertains particularly to AHT (17, 51). This is not so much a criticism of EBM as a misapplication of it. The few skeptics of AHT will commonly cite their perceived lack of evidence in one breath and then propose an unsupported theory in the next. In a 2011 review, Barnes states that “much of the traditional literature on child abuse consists of anecdotal case series, case reports, reviews, opinions, and position papers” (51). Yet in the same piece, the author proposed pertussis, rickets, seizures, hypoxia, dysphagic choking, and “certain therapies” as mimics of findings seen in AHT without any meaningful supporting evidence (51). It cannot be both ways. One cannot overcome a perceived poor evidence base by adding anecdote or conjecture. Simply invoking the veil of EBM does not ordain the entire position.

“Temporary brittle bone disease”

An infant with multiple unexplained fractures raises the clinical concern for these fractures being the result of inflicted trauma or physical abuse. One particular concern for the clinical evaluation of infants with unexplained fractures is the assessment of a skeletal predisposition for fragility. As described in previous chapters, there exist numerous genetic, nutritional, anatomic, and functional conditions that predispose an infant to fractures and require consideration by the clinicians caring for the infant. Over the past 20 years the presence of a condition labeled “temporary brittle bone disease” (TBBD) has been proposed and embraced by a handful of professionals. The following is a deconstruction of the concept and evidence of TBBD, being mindful of the context of EBR and the fundamentals of clinical research. There are two main streams of TBBD promoted by its supporters. While they were both developed independently of each other, there is much overlap. Each will be discussed separately.

Paterson’s “TBBD”

In the early 1990s, Paterson and colleagues reported on 39 infants with fractures during the first year of life. The authors suggested that these infants suffered from a “self-limiting variant of osteogenesis imperfecta” due to a fundamental transient defect in collagen formation, and coined the term “temporary brittle bone disease” (52, 53). These reports, in addition to advocacy of the concept of temporary brittle bone disease in court cases of alleged child abuse, have stirred intense controversy in the United Kingdom and North America. The proposal of TBBD by Paterson and colleagues was initially hotly debated in the literature. Their methods, results, and conclusions have been scrutinized in letters to the editor, invited commentaries, and author responses (53–65).

As described, most of the clinical and radiologic features ascribed to TBBD are those classically noted in cases of abuse. These include: vomiting and apnea; fractures during early infancy (often involving the metaphyses and ribs) – commonly in the absence of external features – found “by accident” without other conditions that would account for the fractures (52, 53, 66, 67). In essence, the diagnosis of TBBD retains the clinical and radiographic features of physical abuse of an infant; only the clinician “believes the parent” did not cause the injury and the injury is due to an unidentified (and undefined) bone fragility condition.

Paterson and colleagues also described a variety of clinical and laboratory findings that are nonspecific features or risk factors also associated with infant abuse (53, 57). These authors viewed their cases as distinctive because the fractures were unassociated with evidence of trauma and because they were found incidentally when radiographs were carried out for other reasons. The belief that fractures often require external signs of injury is not supported in the medical literature and is refuted by both Mathew and colleagues (68) and Peters and colleagues (69). Both of these studies support the infrequency of bruising with fractures, independent of whether caused by abuse or accident. The rationale for performing skeletal surveys (SSs) and bone scintigraphy in children with suspected abuse is predicated on the concept that fractures, which are strong indicators of abuse, are usually unapparent on physical examination (see Chapter 14) (57, 70–72). Additionally, the value of repeating the SS, in most cases, to identify fractures that may initially be radiographically occult has repeatedly been demonstrated (73–77).

It is interesting that 54% of Paterson and colleagues’ cases were preterm infants. Because 21% of their patients had gestational ages of less than 33 weeks, it is possible that some of these immature infants suffered from the well-recognized metabolic bone disease of prematurity. Fractures involving the metaphyses and ribs are familiar radiologic features in very premature infants and can be explained on the basis of decreased calcium stores – osteopenia of prematurity (see Chapter 8) (78–81).

The fracture types within the TBBD spectrum outlined by Paterson and colleagues included diaphyseal fractures in 57%, metaphyseal “abnormalities” in 76%, and rib fractures in 72% of cases (53). The published literature in the field has identified these radiologic observations as characteristic of abusive fractures (54, 57, 71, 72). Most metaphyseal fractures in osteogenesis imperfecta (OI) are nonspecific and do not conform to the classic metaphyseal lesion (CML) patterns originally described by Caffey and further elucidated by radiologic–histopathologic studies (see Chapter 2) (82, 83).

Fractures near the costovertebral articulations, as described by Paterson and others, entail certain specific mechanical factors in their pathogenesis (see Chapter 5). Studies point to severe anteroposterior (AP) thoracic compression, with leverage of the ribs over the transverse processes, to produce the characteristic radiologic and pathologic findings (84). Even when there is significant demineralization associated with metabolic bone disease in infants, posterior rib fractures are usually more laterally situated (see Fig. 8.3). Additional radiologic findings described by Paterson and colleagues in TBBD include periosteal “reactions” (49%), expanded costochondral junctions (CCJs) (34%), and overt osteopenia (31%) (53). As no radiologists were authors of this publication and the methods section does not address it, the certainty of these findings is unclear. Subperiosteal new bone formation is a well-recognized normal finding in young infants (57, 85) and prominent CCJs are often apparent on chest radiographs in infancy (57). It is unclear how these normal findings in infants are determined to be features consistent with TBBD.

Paterson and colleagues indicated that some of the fractures occurred while the infants were hospitalized (53). In a follow-up publication in 2009, Paterson reported on five children who he contended suffered fractures, while in medical care, as a result of TBBD (86). Two of the five children with fractures had clear alternative causes for their fractures (osteopenia of prematurity, birth-related trauma), and two of the five were likely abused (sent home with caretakers and returned to the hospital because of facial bruising). The fifth child spent their entire 15 months in the hospital and was noted to have 17 total rib fractures. TBBD was diagnosed by virtue of the child being in the hospital (86). It is likely that at least some of these five children are part of the cohort originally described by Paterson (86), as the author indicated a follow-up period for these children of 6–18 years. As noted in preceding chapters, rib fractures and many other osseous injuries may only become radiographically apparent 1–2 weeks after the injury (70, 75–77) and, therefore, fractures sustained prior to hospitalization may become evident only on follow-up radiographs acquired during or following hospitalization (57, 71, 72).

Of particular interest is the suggestion by Paterson and colleagues that these cases reflect a “temporary deficiency of an enzyme, perhaps a metalloenzyme, involved in the post-translational processing of collagen” (52, 53, 87). Unfortunately, since there is no “methods” section in either publication on this subject, it is impossible to assess the authors’ assertion of a metabolic defect. There has been no scientific support for the view that a temporary defect exists in either the structure or rate of production of type I collagen as seen with OI (see Chapter 9).

The authors describe the serum copper levels in three cases: absent in one and normal in two others. Given the absence of supporting data for copper deficiency as the pathophysiology of TBBD, Paterson has expanded the potential causes to include vitamin D deficiency (88, 89). To keep the hypothesis of TBBD active, Paterson broadens the concept to include those conditions that are otherwise real mimics of physical abuse. In publications subsequent to the initial cohort described by Paterson and colleagues, the author expanded the potential causes of TBBD to include vitamin D deficiency (see Chapter 8), vitamin C deficiency, heritable conditions, and other collagen defects (such as Ehlers–Danlos) (86, 90, 91). It is notable that Paterson indicates that after over 30 years of promoting TBBD, “We still do not know its cause or causes. We still have no specific diagnostic tests. We cannot exclude other causes of fractures in every one of our published cases” (92). Despite the absence of data on pathophysiology, Paterson and Monk endorse a prenatal “cause” for TBBD, stating “(t)he time scale of the fractures suggests that intrauterine factors are significant” (93).

The inability to distinguish between TBBD and physical abuse is most readily apparent in the report by Paterson and Monk in 2013 (91). The authors report on 20 infants diagnosed by Paterson over the prior 15 years who presented with fractures between 1 and 6 months of life, who were, in addition, noted to have intracranial bleeding. Seventeen of the 20 were referred by the parents’ attorneys and 5 were diagnosed based solely upon review of the medical records. All of the infants were noted on cranial computed tomography (CT) to have subdural hemorrhage (SDH), and eight were found to have retinal hemorrhaging as well. Nine of the infants had follow-up information available, five of whom had persistent neurologic sequelae.

This cohort of patients would reasonably be considered likely victims of physical abuse. Despite features that would point strongly to abuse, the authors contend that their patients did not suffer maltreatment because the CT scans in 15 cases had hypodensity consistent with the prior history of vomiting and apnea, 3 had accidental injuries reported by caretakers, and others likely developed the SDH in utero or at birth. They contend that the infants most likely suffered from an inadequately investigated and undefined metabolic bone disease. Most perplexingly, the authors state “Had our patients been the victims of abuse it would have been severe and repeated; apart from the subdural bleeding they had an average of 8.2 fractures, usually of different ages. The lack of subsequent injury is a significant pointer to the likelihood that the original abnormalities were not the result of inflicted trauma. Similar conclusions have been reached in relation to cases of TBBD without subdural bleeding” (6, 91). The authors cite themselves as support for this conclusion (94); for which the authors were required to pay $1,695.00 to publish (95). It is puzzling why this report on 20 infants with multiple fractures and SDH appeared in a journal whose aim is to publish “clinical investigations in pediatric endocrinology and basic research with relevance to clinical pediatric endocrinology and metabolism” (96). Endocrinology was not discussed in the manuscript by the authors in either this paper or its follow-up (93).

In summary, Paterson and colleagues have reported a heterogeneous group of infants, many of whom likely suffered from child abuse (52, 53, 93). The current primary arguments for the existence of TBBD, as offered by Paterson, are: (1) the patients he describes are similar to each other; (2) the infants have more fractures than would be expected by examination; (3) it looks similar to the cases he described as having occurred in the hospital (86); and (4) when children are returned to home, fractures do not recur (92). Circularity of their argument notwithstanding, the lack of rigorous scientific methodology in these publications, along with the absence of any evidence beyond small low-quality case series, makes it difficult to draw any meaningful conclusions from their work. Paterson and Monk describe TBBD as a “mimic” of physical abuse but do not provide a way to distinguish the two (93). To Paterson, the crucial feature is that all of the children whom he has diagnosed with TBBD over the past three decades have “consistent clinical and radiologic features” (93). He does not consider that this consistency is due to a systematically skewed internal lens through which these patients are seen. Or in Latin “Populus Vult Decipi; Ergo Decipiatur” (The people wish to be deceived; therefore let them be deceived).

Miller’s “TBBD”

After the publication of Paterson and colleagues’ case series (52, 53, 97), Miller began evaluating infants with multiple unexplained fractures who were referred for evaluation by parents or their attorneys (98). Miller and Hangartner reported on a series of 33 infants, 26 of whom they diagnosed with TBBD. The diagnostic criteria used were (98):