Assessing Outcomes After Hip Surgery

Published on 11/04/2015 by admin

Last modified 22/04/2025

CHAPTER 8 Assessing Outcomes After Hip Surgery

Nick G. Mohtadi, M. Elizabeth Pedersen, Denise Chan

Introduction

The assessment of outcomes after any type of surgery can be categorized in a variety of different ways. Simply put, the outcome of a procedure can be anything that is measured or observed. It can range from something as simple as measuring the range of motion to a complex, multifaceted, disease-specific, health-related quality-of-life outcome questionnaire.

Outcomes can be considered objective, which means that they are undistorted by emotion or personal bias and based on observable phenomena. Outcomes can also be described as subjective, which means that the effect takes place within the mind and is modified by individual bias. The irony of this categorization when it comes to measuring outcomes in medicine or surgery is that we consider something like an x-ray to demonstrate objective outcomes but visual analog pain assessments to show subjective outcomes. In fact, the interpretation of the x-ray is open to observer bias and therefore has a component of subjectivity. By contrast, a patient’s response to a visual analog pain scale can be reproduced and assessed for error, and it therefore has the essential properties of an objective measurement. Whether the outcome is objective or subjective is not as important as whether the thing being measured represents the truth with respect to the outcome of a particular procedure.

With regard to hip outcome measures, several authors have addressed this area in the past. In 1972, Andersson compared 77 patients with the use of nine different methods and converted the final outcomes into a categorical scale of “good,” “fair,” and “bad.” The results were very disparate, with good outcomes ranging from 97.5% to as low as 30%, depending on the outcome used. The author’s final conclusion emphasized the importance of achieving agreement about what outcome should be used. In 1990, Callaghan and colleagues came to similar conclusions. In 1993, Bryant and colleagues used a statistical approach to analyze two separate groups of patients. They identified three core factors that were statistically independent: walking distance, hip flexion, and pain. These factors represented independent variables with respect to the outcome of hip arthroplasty. The authors’ conclusion was that combining these variables into a composite score is “arbitrary and without scientific foundation.” More recently, authors have recognized the need to assess health-related quality of life as a measure of health status of patients. Ethgen and colleagues performed a systematic review of outcomes related to hip and knee arthroplasty. They identified several important outcome measures, but their focus was on whether arthroplasty surgery improves quality of life. They stated the following: “If clinicians are interested in going beyond the pathophysiology…, if they seek to perceive the broader implications of diseases and strategies implemented to counter these diseases, it is necessary to consider outcomes that encompass several dimensions of health, as a health-related quality-of-life instrument does.” With this compelling statement in mind, this chapter will focus on the quality and methodology of outcome measures that have been created or used by orthopedic surgeons to assess the management of traumatic and degenerative conditions of the hip. The essential information is based on a systematic review performed by the authors.

Forty-one clinical rating systems for the outcome measurement of orthopedic patients with hip disease were identified. We will start with a general statement about outcomes and present a historic perspective. We will then classify the tools according to whether they were clinician or patient based, their method of administration (i.e., clinician or self-administered), and their purpose (i.e., evaluative, discriminative, or predictive). In the next part, we will critically appraise each tool for its quality by evaluating its creation methodology, looking at its population of interest, and reviewing the psychometrics (i.e., reliability, validity, and responsiveness) of the outcome measures. The final part of the chapter will focus on the development of a new health-related quality-of-life instrument that focuses on young, active patients with hip problems.

Outcome assessment: general

The purpose of outcome measures can be classified as either disease specific, such as those tools created to assess osteoarthritis, or joint specific, such as those created to assess the outcome of any pathology of the hip. These measures can also be classified according to the person who completes the assessment. Traditionally, outcomes have been assessed by clinicians and include objective measures such as radiographic assessments. The clinician also asks the patient about pain and other subjective measures. These “clinician-based” or “clinician-administered” tools may not capture the patient’s perceived outcomes. Therefore, more recently, “patient-based” or “patient-administered” tools have been created.

The objective of the tool must also be considered. If the goal is to follow patients over time and to assess changes, an evaluative index is necessary, because it can measure the magnitude of longitudinal change in an individual or a group of individuals. If the objective is to differentiate among patients to determine treatment, a discriminative index should be used, because it distinguishes among individuals or groups. Finally, to prognosticate, a predictive index can be used to classify individuals into a set of predefined measurement categories.

The second factor to consider when choosing an outcome measure is the quality of the tool, which is determined by the creation methodology. The creation of a questionnaire should follow a structured methodology that includes generating and reducing items. After it has been created, the tool must be tested for psychometric properties (i.e., reliability, validity, and responsiveness) within the target population.

Reliability refers to the ability of the tool to yield consistent and reproducible results. The questionnaire must be reproducible so that results will be the same for the same patient with the same amount of pathology on two separate occasions when measured by either the same or different raters. In addition, the items within the tool itself must be consistent so that all questions pertain to the concept that is being assessed.
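Internal consistency is commonly summarized with the Cronbach coefficient alpha, which is the statistic named later in this chapter for several tools. The following is a minimal sketch of that calculation, assuming a complete response matrix with one row per patient and one column per questionnaire item. The function name and the sample responses are hypothetical and are not taken from any of the tools reviewed here.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes every respondent answered all k items (no missing data) and k > 1.
    """
    k = len(item_scores[0])
    # Variance of each individual item across respondents
    item_variances = [
        pvariance([row[i] for row in item_scores]) for i in range(k)
    ]
    # Variance of each respondent's total (summed) score
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 patients x 4 items, each scored 0 to 4
responses = [
    [0, 1, 1, 0],
    [1, 1, 2, 1],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [4, 4, 4, 3],
]
alpha = cronbach_alpha(responses)
```

Values closer to 1 indicate that the items move together, which is what is meant above by all questions pertaining to the same concept.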

Validity refers to how well an instrument fulfills the function for which it is being used. The different types of validity are face, content, construct, and criterion validity. Face validity ensures that the questionnaire “looks good” or appears to measure the intended content or trait. Without face validity, the tool will not be accepted. Content validity refers to the comprehensiveness of the instrument and how well the items represent all relevant concerns. Construct validity is the extent to which a particular tool can be shown to measure a hypothetical construct. Many of the factors that affect the ultimate outcome of a treatment (e.g., patient satisfaction) are intangible and therefore difficult to test. Construct validity tests these intangible qualities by proposing and testing logical relationships among different tools to measure similar or different outcomes. Construct validity can be tested with the use of convergent validity (when a high positive correlation is desired) and divergent validity (when a high negative correlation is desired). Finally, criterion validity is the validity of the questionnaire as compared with a gold standard. However, there is often no gold standard against which a particular questionnaire or tool can be compared.
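Convergent and divergent validity, as described above, are usually examined with a correlation coefficient between scores from two instruments completed by the same patients. The sketch below computes a Pearson correlation by hand. The scores are invented for illustration, and the variable names (new_tool, similar_tool, pain_vas) are assumptions rather than data from any study cited in this chapter.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical scores for six patients (illustrative only)
new_tool = [35, 48, 52, 61, 70, 84]      # higher = better function
similar_tool = [30, 45, 55, 60, 72, 80]  # established tool, same construct
pain_vas = [8, 7, 6, 5, 3, 2]            # higher = more pain

convergent = pearson_r(new_tool, similar_tool)  # expected to be strongly positive
divergent = pearson_r(new_tool, pain_vas)       # expected to be strongly negative
```

The same function can be applied to two administrations of one questionnaire to estimate test/retest reproducibility.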

Responsiveness refers to the ability of an instrument to measure change. However, this may be limited by ceiling or floor effects. Ceiling effects occur when the ability to record improvement is limited by the maximum obtainable value, and floor effects occur when the ability to record deterioration is limited by the minimum obtainable value.
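One simple way to look for these effects is to count how many patients score exactly at the minimum or maximum obtainable value. The sketch below assumes a scale with known bounds. The function name and the sample scores are hypothetical.

```python
def floor_ceiling_rates(scores, min_score, max_score):
    """Return the fractions of patients at the minimum and maximum scores."""
    n = len(scores)
    floor_rate = sum(1 for s in scores if s == min_score) / n
    ceiling_rate = sum(1 for s in scores if s == max_score) / n
    return floor_rate, ceiling_rate


# Hypothetical postoperative scores on a 0-to-100 scale
scores = [100, 92, 100, 85, 100, 77, 100, 64, 0, 100]
floor_rate, ceiling_rate = floor_ceiling_rates(scores, 0, 100)
```

A large fraction of patients at the maximum suggests that further improvement in those patients could not be recorded by the tool.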

Historic summary

There are many outcome measures that have been created for use in the population of orthopedic patients with hip disease. Most outcomes have been created for older patients who either require hip replacement or have a fracture.

The first published outcome tool for the hip was created by Ferguson and Howorth to assess the operative management of children with slipped capital femoral epiphysis. The authors measured the range of motion of the hip in all planes and multiplied each measurement by different modifiers to give the motions different weightings. This index of mobility was then modified by Gade, who changed the weighting of the actions for use in a total hip arthroplasty population. In 1954, Shepherd modified this index to use it as the mobility assessment for his tool, which also includes assessment of pain and function as well as the patient’s own assessment. Shepherd’s modification was further refined by Harris in 1969.

The first functional assessment appears to have come from France; it was published by Judet and Judet in 1952 and then in a different form by Merle D’Aubigné. The Merle d’Aubigné-Postel hip score was created in 1954 to grade the functional value of the hip in 405 patients who were treated with arthroplasty for the management of fractures of the femoral head or neck, osteoarthritis, or congenital dislocations. This score has subsequently been modified into other hip outcome measures: Charnley used the tool to assess low-friction total hip arthroplasties, Dutton and colleagues modified the tool for the assessment of patients with hip resurfacing, and Matta and colleagues adapted the score for patients with acetabular fractures. The Merle d’Aubigné-Postel score continues to be widely used for the assessment of hip arthroplasty in Europe. It has also been used by Letournel and Judet to assess acetabular fracture treatment, and it has since become the primary outcome measure for assessing the patient population with acetabular fractures.

In North America, the Harris Hip Score is more commonly used for the assessment of total hip arthroplasty. This score was created in 1969 “in an effort to encompass all the important variables into a single reliable figure, which is both reproducible and reasonably objective. The system was also designed to be equally applicable to different hip problems and different methods of treatment.” This disease-specific measure was created to evaluate patients having a total hip arthroplasty after a hip dislocation or an acetabular fracture. In 1973, Ilstrup and colleagues modified the Harris Hip Score to create a computerized method of following the results of total hip arthroplasty. Over a 40-year period, many other scores were developed as clinician assessments of arthritis.

In 1968 Goodwin developed what was considered to be a predictive tool for patients with hip fracture treatment. In 1982 Keene and Anderson developed the Hip Fracture Functional Rating Scale as a predictive measure to help with patient discharge planning and placement. Other outcome measures were developed to assess traumatic hip dislocations and for patients with slipped capital femoral epiphysis.

Before 1985, all measures were clinician based, and none of these tools made use of a standardized methodologic process for its creation; rather, these tools were created by one or more clinicians on the basis of what was felt to be clinically relevant. Since 1987, the methodology for creating a tool has been described, and the newer tools have followed a structured creation format. Most tools have still been directed toward the patient population with arthritis, which generally includes patients with an average age of more than 70 years.

Classification (Table 8-1)

All tools created before 1986 were clinician based, whereas all but two of the tools created after 1986 are patient based. Of the patient-based tools, 10 are self-administered, 3 are meant to be administered by a clinician, and 1 can be either self- or clinician-administered. There are 3 predictive tools; all of the other tools are evaluative, and there are no discriminative tools. There are two main populations for which these scores have been created: middle-aged to elderly patients undergoing a hip replacement and elderly patients with fractures of the hip. Several other tools were created to be inclusive of all ages. One tool was created to capture higher activity levels among an older arthritic population, and two tools (the Non-Arthritic Hip Score and the Hip Outcome Score) were created specifically to capture the concerns of young, active patients with pre-arthritic hips.

Table 8-1 Hip Outcome Measures

Creation Methodology

All patient-based tools were created in accordance with formal methodology. By comparison, none of the clinician-based tools made use of a formal process, other than reviewing older outcome measures. Of the tools that followed a formal creation process, four did not include patient input. The process of item generation was well described, but only some of the questionnaires included patient input as a critical step; these include the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), the Musculoskeletal Function Assessment (MFA), and the Osteoarthritis Knee and Hip Health Quality of Life (OAKHQOL) assessment. In some other cases, outcome items were added to existing questionnaires from the literature. The remaining outcomes made use of clinician input, the existing literature, or both. Item reduction varied among the existing hip outcomes. Some made use of formal testing by calculating the frequency and importance of the items generated, others made use of factor analysis, and others involved a group consensus to determine which items should be retained. Three outcomes did not make use of a formal process of reducing items. Only four tools pretested new measures to ensure that both the wording and the format were appropriate.

Psychometrics (Table 8-2)

Table 8-2 shows the evaluative tools that have been tested for reliability, validity, and responsiveness. Of the clinician-based tools, only the Harris Hip Score and Lequesne Index have been tested for reliability and shown to have internal consistency. Ten of the patient-based measures have demonstrated internal consistency. Two others—the Total Hip Arthroplasty Outcome Evaluation and the Hip Rating Questionnaire—have also been tested for internal consistency, but the results were poor. The majority of the questionnaires have demonstrated adequate reproducibility or test/retest reliability. All tools in Table 8-2 were presumed to have both face and content validity because patients and clinicians were involved in their creation or because they have been used by other surgeons in clinical practice or research. All have been tested for construct validity against other questionnaires. Only three tools have been tested for criterion validity: the Hip Rating Questionnaire was compared with the 6-minute walk test; the MFA was compared with stair climbing and walking speed; and the Lower-Extremity Measure was compared with the timed up-and-go test.

Table 8-2 Tools That Have Been Tested for Reliability, Validity, and Responsiveness

Recommendations

General Musculoskeletal Complaints

There are two outcome measures that have been well designed for general musculoskeletal complaints of the lower extremity: the MFA and the American Academy of Orthopaedic Surgeons Outcomes Questionnaires (AAOS Outcomes Questionnaires). The MFA was developed in 1996 to detect differences in function among patients with musculoskeletal disorders of the extremities. It is an evaluative, self-administered, patient-based tool that is appropriate for adult patients (i.e., an average age of 40 years) with various disorders, including upper-extremity injuries (45%), lower-extremity injuries (45%), repetitive-motion disorders (6%), osteoarthritis (3%), and rheumatoid arthritis (2%). It was created by a group of clinicians that included academic and community orthopedic surgeons as well as rehabilitation medicine specialists, physical therapists, and occupational therapists. It was developed with a formal methodology that included item generation. Items were identified by a review of existing scores and by interviews with patients and clinicians. Item reduction followed a formal process of determining the items that were prevalent, important, representative, and measurable. The MFA is consistent and reproducible, and face and content validity were ensured during its creation. Good construct validity was shown by comparing MFA scores with physicians’ ratings of patient functioning. When tested for criterion validity against stair climbing and self-selected walking speed, the tool showed poor agreement. This lack of correlation likely reflects the fact that the MFA was designed for a broad range of musculoskeletal disorders, including upper-extremity injuries.

The AAOS Outcomes Questionnaires were “designed for the efficient collection of outcomes data from patients of all ages with musculoskeletal conditions.” The outcome tools were separated into a Lower Limb Core Scale, a Hip and Knee Core Scale, a Sports/Knee Module, and a Foot and Ankle Module. The Lower Limb Core and the Hip and Knee Core Scales are essentially identical instruments with seven questions. The only substantive difference between the two questionnaires is that the words “lower limb” are replaced with “hip/knee.” This self-administered, patient-based, evaluative tool was created in 2004 with the use of a modified group technique that involved surgeons and health-services researchers. The group identified items after a review of the literature and then reduced these items by consensus. It is appropriate for adult patients around 48 years old with hip or knee complaints. It is reliable, with good internal consistency and reproducibility, as demonstrated by a test followed 24 hours later by a retest. Although patients were not involved in the generation or reduction of items, patients were asked if the questionnaires addressed their concerns to ensure face and content validity. The AAOS tool has shown good to excellent construct validity against the WOMAC (Pearson value, 0.89), the Medical Outcomes Study 36-Item Short-Form Health Survey (SF-36; Pearson value, 0.7), physician assessments of pain (Pearson value, 0.69), and physician assessments of function (Pearson value, 0.73). The AAOS tool is responsive, but it may show slight ceiling effects.

Osteoarthritis of the Hip

The best tool for general osteoarthritis of the hip is the Hip Disability and Osteoarthritis Outcome Score (HOOS). The HOOS is a patient-based, self-administered, evaluative tool created from the WOMAC. The WOMAC is a patient-based self-assessment tool that was initially developed in 1988 for patients with symptomatic osteoarthritis of the hip or knee. The items on the WOMAC were generated from interviews with 100 patients with osteoarthritis. It was tested with a group of patients with an average age of 71 years who were undergoing total hip arthroplasty. The WOMAC has been found to be both consistent and reproducible, and face and content validity were ensured during its development. Construct validity was determined by testing it against the SF-36. This tool is also responsive; however, the WOMAC does not capture the concerns of more active patients.

The HOOS was developed in part to capture the higher activity levels of patients with hip osteoarthritis. The questions for the HOOS were generated by interviewing more than 100 patients with hip disability and with or without hip osteoarthritis. Items were reduced by factor analysis to 40 items and include all of the items from the WOMAC in unchanged form. This tool has high reliability for all components of the questionnaire and high internal consistency. Content validity was ensured by having a subgroup of 26 patients rate the relevance and importance of each item on a Likert scale, with “1” indicating that the item was irrelevant and unimportant and “3” indicating that the item was very relevant and very important. Construct validity was evaluated by comparing HOOS scores with those of the SF-36 general health status questionnaire. Responsiveness to clinical change was evaluated by calculating standardized response means and comparing the results with those of the WOMAC. This tool is good for patients with an average age of 65 to 70 years with primary hip osteoarthritis who are having total hip replacements.
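The standardized response mean mentioned above is the mean change in score divided by the standard deviation of that change. The following sketch shows the calculation under the assumption of paired preoperative and postoperative scores for the same patients. The function name and the scores are hypothetical and do not come from the HOOS validation data.

```python
from statistics import mean, stdev


def standardized_response_mean(before, after):
    """Mean of the paired score changes divided by their standard deviation."""
    changes = [post - pre for pre, post in zip(before, after)]
    return mean(changes) / stdev(changes)


# Hypothetical paired scores for four patients (higher = better)
preoperative = [40, 50, 60, 55]
postoperative = [60, 65, 85, 70]
srm = standardized_response_mean(preoperative, postoperative)
```

When two instruments are completed by the same patients at the same time points, the instrument with the larger standardized response mean is regarded as the more responsive of the two in that sample.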

Hip Arthroplasty

The best clinician-based tool for the evaluation of hip arthroplasty is the Harris Hip Score. This tool was created in 1969 to evaluate pain, activity, and function after total hip arthroplasty. The average age of patients involved in the original study was 47 years (range, 22 to 71 years). The development of this outcome tool did not follow a defined methodology. The small number of patients (n = 30) involved in the creation of this tool and the absence of group consensus among orthopedic experts suggest that this tool lacks face validity. At the time of development, the Harris Hip Score was compared with two existing tools, the Shepherd system and the Larson system; however, comparability with these scales is limited. No detailed statistical analyses were performed to determine construct validity.

The Harris Hip Score has been tested against the SF-36 and the WOMAC, and it has been shown to have high validity, reliability, and responsiveness. It has also been compared with the original and modified Merle D’Aubigné-Postel Scores, and it has demonstrated high overall correlation among acetabular fracture patients. The Harris Hip Score has been shown to be effective for evaluating changes in hip function; however, when it was tested among patients with acetabular fractures, the outcome demonstrated ceiling effects, which suggests a limitation in the clinical use of this outcome.

With only 12 questions, the Oxford Hip Score is the simplest outcome assessment tool. This self-administered, evaluative, patient-based outcome measure was created in 1996 to assess the perception of pain and function among patients undergoing total hip arthroplasty. It is good for patients who are between 35 and 90 years old with primary or secondary osteoarthritis. The methodology for developing the Oxford Hip Score involved item generation and reduction on the basis of a review of the WOMAC, the Patient-Specific Index, the Charnley Hip Score, and the Harris Hip Score as well as on interviews with patients. The Oxford Hip Score has high internal consistency and satisfactory reproducibility. Face and content validity were ensured by involving patients and reviewing the literature during item generation. Construct validity was established by testing this outcome measure against the Charnley Hip Score, the SF-36, and the Arthritis Impact Measurement Scale. It has also been shown to be responsive and sensitive to change as compared with the EuroQoL.

For young patients with arthritis of the hip and knee, a better tool is the OAKHQOL outcome measure. This is a self-administered, patient-based, evaluative tool that was created to fulfill the need for a “disease-specific instrument with good content, construct validity, and responsiveness in assessing the [quality of life] of patients with lower limb [osteoarthritis]” (Rat, 2005). The development of the OAKHQOL followed a structured methodology to ensure content validity. Items were generated and reduced with the use of focus groups that involved patients, rheumatologists, orthopedic surgeons, physiotherapists, and occupational therapists. The OAKHQOL was found to have high internal consistency with factor analysis and dimensional analysis; in addition, it is reproducible, as demonstrated by testing followed by retesting after 10 to 21 days. This outcome measure has demonstrated adequate face and construct validity as compared with the SF-36 and a pain visual analog scale. Responsiveness was not tested, and there is no information regarding the scoring or the interpretation of the scores.

Nonarthritic, Young, Active Patients With Hip Pathology

A population that has recently been identified as a group that requires orthopedic services is the young population with hip pain but no evidence of osteoarthritis. Unfortunately, a well-developed outcome for this population is lacking. Although the Non-Arthritic Hip Score has been administered to assess this specific population, it is not a robust tool. The Non-Arthritic Hip Score was created in 2003 to assess pain and function among young, active patients with activity-limiting hip pain, both preoperatively and postoperatively. This tool is a patient-based, self-administered questionnaire that was developed as a modification of the WOMAC. The Non-Arthritic Hip Score is intended for patients between 20 and 40 years old who are experiencing hip pain without an obvious radiographic diagnosis. The items were generated through pilot test interviews with patients of varying educational levels as well as with health professionals. The tool has been shown to be reproducible, but retesting occurred at any time between 1 and 16 days after the original testing. The tool has internal consistency as assessed with the use of the Cronbach coefficient alpha. Construct validity was determined by comparing the Non-Arthritic Hip Score to the Harris Hip Score and the Short Form-12 for 48 patients.

Although this tool attempts to capture a younger population that has not been previously represented by other hip outcome assessments, the methodology is not ideal because the total number of questions (n = 20) was arbitrarily determined, which may result in a misrepresentation of items that are relevant to a young, active patient with nonarthritic hip problems. In addition, the items were taken directly from the WOMAC index, which was generated for an older, more sedentary population; therefore, the tool may be predisposed to ceiling effects, thus limiting its use for a younger, more active population. Finally, the sections that address pain, mechanical symptoms, and physical function ask the patient to consider problems that have occurred during the previous 48 hours, which may be too short a time frame to be truly representative of the problems that these patients are experiencing.

Most recently, the Hip Outcome Score (HOS) has been developed for younger, more active patients between the ages of 13 and 66 years. This is a self-administered, patient-based tool that is designed specifically to evaluate patients with labral tears who are functioning throughout a wide range of ability. Therefore, the HOS includes only two subscales: activities of daily living and sports. Items were generated by physicians and physical therapists and reduced by factor analysis; no patients were involved with item generation. The tool does show internal consistency as determined by Cronbach coefficients, but it has not been tested for reproducibility. The HOS demonstrated good construct validity as measured by convergent and divergent validity with the SF-36 questionnaire with the use of Pearson correlation coefficients. The items in the HOS were evaluated for their potential to be responsive with the use of an item-response theory analysis; however, testing in patients for sensitivity to change over time was not conducted. Although this tool followed a formal creation methodology and includes an analysis of the questionnaire’s content (which was not done for other outcome tools), it is limited by the specific nature of the population of interest. It may also underrepresent other areas of concern for these patients (e.g., symptoms, work-related issues), because the questionnaire only focuses on two subscales. More recently, the tool has been tested for validity in a group of hip arthroscopy patients with a minimum of 2 years of follow up. It was compared with the SF-36, and it was found to correlate well with measures of physical function. In addition, patients known to have better results scored higher on the tool, thus indicating that it is also likely to be responsive.

Creation of a Health-Related Quality-of-Life Outcome Measure for Young and Active Patients With Hip Disease

For the past 3 years, the authors, in combination with the Multicentre Arthroscopy of the Hip Outcomes Research Network, the Canadian Orthopaedic Trauma Society, and local hip arthroplasty surgeons, have been developing a new outcome measure. There was a perceived need for a more appropriate way to assess patients who are young and active that would differ from existing outcome assessment tools, which were created for the older population with arthritis and fracture. This outcome measure has proceeded through the item generation, item reduction, and pretesting phases of development, and it has been formulated using modern methodological principles. This international collaboration has used input from more than 400 active patients between the ages of 18 and 60 years. All patients were screened as active on the basis of a modified Tegner Activity Scale score of 4 or higher. In its current form, the questionnaire has 33 questions that are divided into four domains: symptoms and functional limitations (16 questions), job-related concerns (4 questions), sports and recreational physical activities (6 questions), and social, emotional, and lifestyle concerns (7 questions). This distribution of questions into the domains was determined through an assessment of redundancy, standard and item total correlations, factor analysis, and test/retest reliability. The questionnaire has no demonstrable floor or ceiling effects, and it has an overall Pearson correlation of 0.96 on test/retest analysis, with an error of 5%. The questionnaire has been shown to be qualitatively responsive over time. Face validity has been established by incorporating patient input at the item generation and reduction phases. Content validity has been confirmed by ensuring that all domains are represented and that experts have been included in the development of the questionnaire. Construct validity has been shown by comparing this questionnaire with the Non-Arthritic Hip Score; the two questionnaires were highly correlated at 0.81. The intention of this outcome measure is to evaluate patients who are young and active with traumatic and other hip diseases who are being treated with newer techniques such as arthroscopy and hip resurfacing.

Conclusion

As the orthopedic care of patients with hip pathology continues to evolve, it is important that new therapies be assessed with the use of an appropriate outcome measure. Although many tools have been developed, few of these have been created with the use of sound methodology or tested for reliability, validity, and responsiveness. Therefore, care must be taken when choosing an outcome measure for use in standard clinical practice or research studies. Existing tools provide the clinician or researcher with a select number of options for assessing hip osteoarthritis and other hip-related conditions. There is promise that the newly developed outcome measure (HipQOL) for young, active patients will fulfill this role. Further research and development are under way to address this concern.

Annotated references and suggested readings

Amstutz H.C., Thomas B.J., Jinnah R., Kim W., Grogan T., Yale C. Treatment of primary osteoarthritis of the hip. A comparison of total joint and surface replacement arthroplasty. J Bone Joint Surg Am. 1984;66(2):228-241.

Andersson G. Hip assessment: a comparison of nine different methods. J Bone Joint Surg Br. 1972;54-B:621-625.

This publication was the first to review the previously published hip outcome measures. Nine were available. Andersson compared the results of 77 patients who had been treated by Moore arthroplasty. Each outcome measure was used to classify the patients as good, fair, or bad. It was very evident from the results that the distribution of patients across the three categories differed greatly depending on which outcome measure was used. The conclusion was that it would be important to achieve agreement on how patients were to be assessed.

Bach C.M., Feizelmeier H., Kaufmann G., Sununu T., Gobel G., Krismer M. Categorization diminishes the reliability of hip scores. Clin Orthop Relat Res. 2003;411:166-173.

Bellamy N., Buchanan W.W. A preliminary evaluation of the dimensionality and clinical importance of pain and disability in osteoarthritis of the hip and knee. Clin Rheumatol. 1986;5(2):231-241.

Bellamy N., Buchanan W.W., Goldsmith C.H., Campbell J., Stitt L.W. Validation study of WOMAC: a health status instrument for measuring clinically important patient relevant outcomes to antirheumatic drug therapy in patients with osteoarthritis of the hip or knee. J Rheumatol. 1988;15(12):1833-1840.

This publication established the final dimensions (Pain; Stiffness; Physical Function) of the WOMAC. The authors established face, content and construct validity, reliability, responsiveness, and relative efficiency of the instrument. The WOMAC is a disease-specific outcome designed to be used in clinical trials.

Boardman D.L., Dorey F., Thomas B.J., Lieberman J.R. The accuracy of assessing total hip arthroplasty outcomes: a retrospective correlation study of walking ability and 2 validated measurement devices. J Arthroplasty. 2000;15(2):200-204.

Bryant M.J., Kernohan W.G., Nixon J.R., Mollan R.A.B. A statistical analysis of hip scores. J Bone Joint Surg Br. 1993;75-B(5):705-709.

These authors used a factor analytical approach to evaluate variables which were part of 13 methods of hip scoring. They identified three essential variables to assess patients who had a hip arthroplasty. These three variables were walking distance, hip flexion, and pain. The authors concluded that a three-factor hip score should be used to assess the results of hip arthroplasty. However, they suggested that each variable be recorded separately, because combining them into a composite score would be an arbitrary process without scientific foundation.

Callaghan J.J., Dysart S.H., Savori C.G., Hopkinson W.J. Assessing the results of hip replacement: a comparison of five different rating systems. J Bone Joint Surg Br. 1990;72-B:1008-1009.

The authors compared the results of measuring outcomes in 100 patients who had received an uncemented total hip arthroplasty. They compared the five most frequently used rating systems (Hospital for Special Surgery; Mayo Clinical Hip Score; Iowa/Larson Rating scale for hip disabilities; Harris Hip Score; Merle d’Aubigné-Postel) to the patient’s impression of their hip in categories of excellent, good, fair, and poor. They stated that the Hospital for Special Surgery rating produced the most optimistic and the Merle d’Aubigné-Postel rating the most pessimistic results. They also compared the ratings with Charnley’s functional classes, with which no meaningful relationship was demonstrated. The authors concluded that functional class should be included in all rating systems and that descriptive words such as limp or pain should be used in precisely the same way, but provided no good evidence for this statement.

Charnley J. The long-term results of low-friction arthroplasty of the hip performed as a primary intervention. J Bone Joint Surg Br. 1972;54(1):61-76.

Christensen C.P., Althausen P.L., Mittleman M.A., Lee J.A., McCarthy J.C. The nonarthritic hip score: reliable and validated. Clin Orthop Relat Res. 2003;406:75-83.

This outcome, which is based on the WOMAC, was specifically designed to address a younger and more active population of patients. This 20-item questionnaire incorporated 10 questions directly from the WOMAC measuring pain and physical function. The remaining 10 questions include 4 questions pertaining to mechanical symptoms and 6 questions related to levels of activity. The psychometric properties of the Nonarthritic Hip Score (NAHS) questionnaire were evaluated in a group of 65 patients with an average age of 32.5 years. The NAHS demonstrated excellent reliability and compared favorably to the Harris Hip Score and the SF-12. However, ceiling effects are likely present since the NAHS overall score was very similar to the Harris Hip Score, which was developed for older patients undergoing arthroplasty surgery.

Danielsson L.G. Incidence and prognosis of coxarthrosis. Acta Orthop Scand. 1964;66(Suppl):1-114.

D’Aubigné R.M., Postel M. Functional results of hip arthroplasty with acrylic prosthesis. J Bone Joint Surg Am. 1954;36-A(3):451-475.

Dawson J., Fitzpatrick R., Carr A., Murray D. Questionnaire on the perceptions of patients about total hip replacement. J Bone Joint Surg Br. 1996;78(2):185-190.

The Oxford Hip Score demonstrated in this article represents a very simple and validated way of assessing patients with hip arthritis. It primarily addresses pain and functional complaints with a 5-point descriptive scale for each question.

Dawson J., Fitzpatrick R., Frost S., Gundle R., McLardy-Smith P., Murray D. Evidence for the validity of a patient-based instrument for assessment of outcome after revision hip replacement. J Bone Joint Surg Br. 2001;83(8):1125-1129.

Dutton R.O., Amstutz H.C., Thomas B.J., Hedley A.K. Tharies surface replacement for osteonecrosis of the femoral head. J Bone Joint Surg Am. 1982;64(8):1225-1237.

Engelberg R., Martin D.P., Agel J., Swiontkowski M.F. Musculoskeletal function assessment: reference values for patient and non-patient samples. J Orthop Res. 1999;17(1):101-109.

Ethgen O., Bruyere O., Richy F., Dardennes C., Reginster J.Y. Health-related quality of life in total hip and total knee arthroplasty. A qualitative and systematic review of the literature. J Bone Joint Surg Am. 2004;86-A(5):963-974.

These authors focused on summarizing the literature on the use of health-related quality-of-life instruments to evaluate patients treated with hip or knee arthroplasty surgery. They emphasized the importance of measuring a broader concept of health rather than the more traditional approach of the pathophysiology of hip and knee problems. They identified seven generic and eleven specific instruments that focused on the dimensions of health-related quality of life that were specific to arthritic diseases or total hip and total knee arthroplasty. The Short Form-36 (generic) and the Western Ontario and McMaster Universities Osteoarthritis Index were the most frequently used.

Ferguson A.B., Howorth A.B. Slipping of the upper femoral epiphysis. JAMA. 1931;97(25):1867-1872.

Gade H.G. A contribution to the surgical treatment of osteoarthritis of the hip joint. A clinical study: comments on the follow up examinations and the evaluation of the therapeutic results. Acta Chir Scandinavica. 1947;120:37-45. Supplementum

Goodwin R.A. The Austin Moore prosthesis in fresh femoral neck fractures (A review of 611 post operative cases). Am J Orthop Surg. 1968;10(2):40-43.

Guyatt G.H., Bombardier C., Tugwell P.X. Measuring disease-specific quality of life in clinical trials. CMAJ. 1986;134(8):889-895.

Guyatt G.H., Feeny D.H., Patrick D.L. Measuring health-related quality of life. Ann Intern Med. 1993;118(8):622-629.

Harris W.H. Traumatic arthritis of the hip after dislocation and acetabular fractures: treatment by mold arthroplasty. An end-result study using a new method of result evaluation. J Bone Joint Surg Am. 1969;51(4):737-755.

This article represents the origin of the Harris Hip Score that is in use today, based on the results of 30 patients treated for traumatic arthritis. Harris proposed a new clinician-based evaluation tool that was based on 100 points. This scale is made up of 44 points attributed to pain, 47 points to function, 5 points for range of motion, and 4 points for the absence of deformity. The amount of pain is graded from disabled (0 points; i.e., bedridden) to none (44 points). Function is divided into “daily activities,” which is weighted with 14 points, and “gait,” which has 33 points assigned to it. The range of motion score is determined by a composite measurement of flexion, abduction, external rotation in extension, internal rotation in extension, adduction, and extension. Each measurement is multiplied by an index factor from a table to give the maximal possible score for each motion. The products are added together and multiplied by 0.05 to get the final point value out of the 5 available points. The range of motion part was originally described by Ferguson and Howorth in 1931 and modified by Gade in 1947. The final 4 points are given for the absence of deformity, with deformity defined as a permanent flexion contracture of >30 degrees, fixed adduction of >10 degrees, fixed internal rotation of >10 degrees, or a limb-length discrepancy of >3.2 cm. At the time of this publication, there was no information on the measurement properties of the scale.
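
The point allocation described in this annotation can be sketched in code. The following is a minimal illustration only, assuming the component point values have already been assigned by the clinician. The function names are hypothetical, the index factors in the example are placeholders (the published index-factor table is not reproduced here), and capping the range of motion result at 5 points is a defensive assumption rather than something stated in the annotation.

```python
# Maximum points per component, as listed in the annotation above.
MAX_PAIN = 44
MAX_DAILY_ACTIVITIES = 14
MAX_GAIT = 33
MAX_RANGE_OF_MOTION = 5
DEFORMITY_POINTS = 4


def range_of_motion_points(weighted_motions):
    """Sum (degrees * index factor) over all motions, then multiply by 0.05.

    `weighted_motions` is a list of (degrees, index_factor) pairs. The real
    index factors come from a published table that is NOT reproduced here.
    """
    total = sum(degrees * factor for degrees, factor in weighted_motions)
    # Assumption: never award more than the 5 available points.
    return min(total * 0.05, MAX_RANGE_OF_MOTION)


def harris_hip_score(pain, daily_activities, gait, rom_points, deformity_absent):
    """Add the four components into a total out of 100 points."""
    components = [
        ("pain", pain, MAX_PAIN),
        ("daily activities", daily_activities, MAX_DAILY_ACTIVITIES),
        ("gait", gait, MAX_GAIT),
        ("range of motion", rom_points, MAX_RANGE_OF_MOTION),
    ]
    for name, value, maximum in components:
        if not 0 <= value <= maximum:
            raise ValueError(f"{name} must be between 0 and {maximum}")
    # Deformity is all-or-nothing: 4 points only if no deformity is present.
    deformity = DEFORMITY_POINTS if deformity_absent else 0
    return pain + daily_activities + gait + rom_points + deformity


# Placeholder motions: (degrees, hypothetical index factor).
example_rom = range_of_motion_points([(90, 1.0), (10, 0.5)])
print(f"range of motion points: {example_rom:.2f}")

# A patient with the maximum in every component reaches the full 100 points.
print(harris_hip_score(44, 14, 33, 5, deformity_absent=True))  # prints 100
```

Keeping the components as separate arguments makes the weighting visible: pain and function together account for 91 of the 100 points.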

Hoeksma H.L., Van Den Ende C.H., Ronday H.K., Heering A., Breedveld F.C. Comparison of the responsiveness of the Harris Hip Score with generic measures for hip function in osteoarthritis of the hip. Ann Rheum Dis. 2003;62(10):935-938.

Hopkins K.D. Educational and psychological measurement and evaluation, 8th ed. Allyn and Bacon: Toronto, 1998.

Hunsaker F.G., Cioffi D.A., Amadio P.C., Wright J.G., Caughlin B. The American Academy of Orthopaedic Surgeons outcomes instruments: normative values from the general population. J Bone Joint Surg Am. 2002;84-A(2):208-215.

Ilstrup D.M., Nolan D.R., Beckenbaugh R.D., Coventry M.B. Factors influencing the results in 2,012 total hip arthroplasties. Clin Orthop Relat Res. 1973;95:250-262.

Jaglal S., Lakhani Z., Schatzker J. Reliability, validity, and responsiveness of the lower extremity measure for patients with a hip fracture. J Bone Joint Surg Am. 2000;82-A(7):955-962.

Johanson N.A., Charlson M.E., Szatrowski T.P., Ranawat C.S. A self-administered hip-rating questionnaire for the assessment of outcome after total hip replacement. J Bone Joint Surg Am. 1992;74(4):587-597.

Johanson N.A., Liang M.H., Daltroy L., Rudicel S., Richmond J. American Academy of Orthopaedic Surgeons lower limb outcomes assessment instruments. Reliability, validity, and sensitivity to change. J Bone Joint Surg Am. 2004;86-A(5):902-909.

Judet R., Judet J. Technique and results with the acrylic femoral head prosthesis. J Bone Joint Surg Br. 1952;34-B(2):173-180.

Keene J.S., Anderson C.A. Hip fractures in the elderly. Discharge predictions with a functional rating scale. JAMA. 1982;248(5):564-567.

Kirkley A., Griffin S. Development of disease-specific quality of life measurement tools. Arthroscopy. 2003;19(10):1121-1128.

Kirshner B., Guyatt G. A methodological framework for assessing health indices. J Chronic Dis. 1985;38(1):27-36.

This publication outlines the different types of health status measures, including discriminative, predictive, and evaluative indices. The authors emphasize that the requirements for maximizing one of the functions of discrimination, prediction, or evaluation may impede the others. They describe the process of developing a measure of quality of life and the importance of each of the steps. The process and major issues with respect to the construction and validation of a measurement tool are discussed and provide the framework for assessing outcome measures.

Klassbo M., Larsson E., Mannevik E. Hip disability and osteoarthritis outcome score. An extension of the Western Ontario and McMaster Universities Osteoarthritis Index. Scand J Rheumatol. 2003;32(1):46-51.

The Hip disability and Osteoarthritis Outcome Score (HOOS) was derived from the WOMAC and followed the same process as the equivalent Knee Osteoarthritis Outcome Score (KOOS). The authors added questions that were taken directly from the KOOS, which was derived from the WOMAC and the Anterior Cruciate Ligament Quality of Life outcome questionnaire. The HOOS is a self-rated evaluative instrument for patients with hip problems. It should be pointed out that there were no new items or questions generated to create the HOOS.

Larson C.B. Rating scale for hip disabilities. Clin Orthop Relat Res. 1963;31:85-93.

Lazansky M.G. A method for grading hips. J Bone Joint Surg Br. 1967;49(4):644-651.

Letournel E., Judet R. Fractures of the acetabulum. New York, London, Berlin, Heidelberg: Springer-Verlag, 1993.

Lequesne M.G., Samson M. Indices of severity in osteoarthritis for weight bearing joints. J Rheumatol Suppl. 1991;27:16-18.

Liang M.H., Katz J.N., Phillips C., Sledge C., Cats-Baril W. The total hip arthroplasty outcome evaluation form of the American Academy of Orthopaedic Surgeons. Results of a nominal group process. The American Academy of Orthopaedic Surgeons Task Force on Outcome Studies. J Bone Joint Surg Am. 1991;73(5):639-646.

Martin R.L. Hip Arthroscopy and Outcome Assessment. Operative Techniques in Orthopaedics. 2005;15:290-296.

Martin D.P., Engelberg R., Agel J., Snapp D., Swiontkowski M.F. Development of a musculoskeletal extremity health status instrument: the musculoskeletal function assessment instrument. J Orthop Res. 1996;14(2):173-181.

This generic musculoskeletal instrument was developed to be patient based and self reported. The authors created an outcome that avoided the problems of other generic measures used to assess patients with musculoskeletal disorders. They created a 100-item questionnaire that was reliable and internally consistent with content validity.

Martin D.P., Engelberg R., Agel J., Swiontkowski M.F. Comparison of the Musculoskeletal Function Assessment questionnaire with the Short Form-36, the Western Ontario and McMaster Universities Osteoarthritis Index, and the Sickness Impact Profile health-status measures. J Bone Joint Surg Am. 1997;79(9):1323-1335.

Martin R.L., Kelly B.T., Philippon M.J. Evidence of validity for the hip outcome score. Arthroscopy. 2006;22(12):1304-1311.

Martin R.L., Philippon M.J. Evidence of reliability and responsiveness for the Hip Outcome Score. Operative Techniques in Orthopaedics. 2005;15:290-296.

This publication introduces the Hip Outcome Score (HOS), which was developed to address the deficiency in preexisting outcomes for young patients undergoing arthroscopy. The HOS has two subscales: Activities of Daily Living (ADL) and Sports. The psychometrics of this instrument have been determined subsequent to this publication. The main problem with this outcome is that no patients were directly involved in the determination of the items included in the two subscales.

Martin R.L., Philippon M.J. Evidence of validity for the hip outcome score in hip arthroscopy. Arthroscopy. 2007;23(8):822-826.

Martin R.L., Philippon M.J. Evidence of reliability and responsiveness for the hip outcome score. Arthroscopy. 2008;24(6):676-682.

Marx R.G., Jones E.C., Atwan N.C., Closkey R.F., Salvati E.A., Sculco T.P. Measuring improvement following total hip and knee arthroplasty using patient-based measures of outcome. J Bone Joint Surg Am. 2005;87(9):1999-2005.

Matta J.M., Mehne D.K., Roffi R. Fractures of the acetabulum. Early results of a prospective study. Clin Orthop Relat Res. 1986;205:241-250.

McDowell I., Newell C. Measuring health: a guide to rating scales and questionnaires. New York: Oxford University Press, 1987.

Mohtadi N. Development and validation of the quality of life outcome measure (questionnaire) for chronic anterior cruciate ligament deficiency. Am J Sports Med. 1998;26(3):350-359.

Mohtadi N.G., Pedersen M.E., Chan D. The creation of a hip outcome measure for young patients with hip disease. In: World Congress of Sports Trauma. Hong Kong; 2008.

Mohtadi N., Pedersen M.E., Mahorn D., Chan D., Fredine J. Validation of the Hip Quality of Life Questionnaire. New York: International Hip Arthroscopy Association, 2009.

Nilsdotter A.K., Lohmander L.S., Klassbo M., Roos E.M. Hip disability and osteoarthritis outcome score (HOOS)–validity and responsiveness in total hip replacement. BMC Musculoskelet Disord. 2003;4:10.

Ohman U., Bjorkegren N.A., Fahlstrom G. Fracture of the femoral neck. A five-year follow up. Acta Chir Scand. 1969;135(1):27-42.

Ovre S., Sandvik L., Madsen J.E., Roise O. Comparison of distribution, agreement and correlation between the original and modified Merle d’Aubigne-Postel Score and the Harris Hip Score after acetabular fracture treatment: moderate agreement, high ceiling effect and excellent correlation in 450 patients. Acta Orthop. 2005;76(6):796-802.

Parker M.J., Palmer C.R. A new mobility score for predicting mortality after hip fracture. J Bone Joint Surg Br. 1993;75(5):797-798.

Pedersen M.E., Chan D., Mohtadi N.G. Hip outcome measures: A systematic review of the literature. In: Alberta Orthopaedics Residents Day. Red Deer, Alberta; 2006.

Pellicci P.M., Wilson P.D.Jr., Sledge C.B., Salvati E.A., Ranawat C.S., Poss R., et al. Long-term results of revision total hip replacement. A follow up report. J Bone Joint Surg Am. 1985;67(4):513-516.

Rat A.C., Coste J., Pouchot J., Baumann M., Spitz E., Retel-Rude N., et al. OAKHQOL: a new instrument to measure quality of life in knee and hip osteoarthritis. J Clin Epidemiol. 2005;58(1):47-55.

The Osteoarthritis of Knee and Hip Quality of Life questionnaire is a step beyond the WOMAC in that it addresses the full dimensions of quality of life. This well-developed questionnaire has 40 questions divided into 5 dimensions (Pain; Physical activities; Mental health; Social support; Social functioning). In addition, there are three questions that relate to relationships, sexual activity, and professional life. This questionnaire was developed in France and, although published in English, has not been validated from a cross-cultural perspective.

Rat A.C., Pouchot J., Coste J., Baumann C., Spitz E., Retel-Rude N., et al. Development and testing of a specific quality-of-life questionnaire for knee and hip osteoarthritis: OAKHQOL (OsteoArthritis of Knee Hip Quality Of Life). Joint Bone Spine. 2006;73(6):697-704.

Schmalzried T.P., Silva M., de la Rosa M.A., Choi E.S., Fowble V.A. Optimizing patient selection and outcomes with total hip resurfacing. Clin Orthop Relat Res. 2005;441:200-204.

Shepherd M.M. Assessment of function after arthroplasty of the hip. J Bone Joint Surg Br. 1954;36-B(3):354-363.

Shields R.K., Enloe L.J., Evans R.E., Smith K.B., Steckel S.D. Reliability, validity, and responsiveness of functional tests in patients with total joint replacement. Phys Ther. 1995;75(3):169-176; discussion 176-179.

Shimmin A.J., Bare J., Back D.L. Complications associated with hip resurfacing arthroplasty. Orthop Clin North Am. 2005;36(2):187-193, ix.

Soderman P., Malchau H. Is the Harris hip score system useful to study the outcome of total hip replacement? Clin Orthop Relat Res. 2001;384:189-197.

Stucki G., Sangha O., Stucki S., Michel B.A., Tyndall A., Dick W., et al. Comparison of the WOMAC (Western Ontario and McMaster Universities) osteoarthritis index and a self-report format of the self-administered Lequesne-Algofunctional index in patients with knee and hip osteoarthritis. Osteoarthritis Cartilage. 1998;6(2):79-86.

Sullivan M., Karlsson J. The Swedish SF-36 Health Survey III. Evaluation of criterion-based validity: results from normative population. J Clin Epidemiol. 1998;51(11):1105-1113.

Tegner Y., Lysholm J. Rating systems in the evaluation of knee ligament injuries. Clin Orthop Relat Res. 1985;198:43-49.

Theiler R., Sangha O., Schaeren S., Michel B.A., Tyndall A., Dick W., et al. Superior responsiveness of the pain and function sections of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) as compared to the Lequesne-Algofunctional Index in patients with osteoarthritis of the lower extremities. Osteoarthritis Cartilage. 1999;7(6):515-519.

Thompson V.P., Epstein H.C. Traumatic dislocation of the hip: a survey of two hundred and four cases covering a period of twenty-one years. J Bone Joint Surg Am. 1951;33(3):746-778.

Treacy R.B. To resurface or replace the hip in the under 65-year-old: the case of resurfacing. Ann R Coll Surg Engl. 2006;88(4):349-353; discussion 349-353.

Tugwell P., Bombardier C., Buchanan W.W., Goldsmith C.H., Grace E., Hanna B. The MACTAR Patient Preference Disability Questionnaire—an individualized functional priority approach for assessing improvement in physical disability in clinical trials in rheumatoid arthritis. J Rheumatol. 1987;14(3):446-451.

Verhoeven A.C., Boers M., van der Linden S. Validity of the MACTAR questionnaire as a functional index in a rheumatoid arthritis clinical trial. The McMaster Toronto Arthritis. J Rheumatol. 2000;27(12):2801-2809.

Ware J.Jr., Kosinski M., Keller S.D. A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Med Care. 1996;34(3):220-233.

Wilson P.D.Jr., Amstutz H.C., Czerniecki A., Salvati E.A., Mendes D.G. Total hip replacement with fixation by acrylic cement. A preliminary study of 100 consecutive McKee-Farrar prosthetic replacements. J Bone Joint Surg Am. 1972;54(2):207-236.

Wright J.G., Rudicel S., Feinstein A.R. Ask patients what they want. Evaluation of individual complaints before total hip replacement. J Bone Joint Surg Br. 1994;76(2):229-234.

Wright J.G., Young N.L. A comparison of different indices of responsiveness. J Clin Epidemiol. 1997;50(3):239-246.

Yano H., Sano S., Nagata Y., Tabuchi K., Okinaga S., Seki H., et al. Modified rotational acetabular osteotomy (RAO) for advanced osteoarthritis of the hip joint in the middle-aged person. First report. Arch Orthop Trauma Surg. 1990;109(3):121-125.

Zuckerman J.D., Koval K.J., Aharonoff G.B., Hiebert R., Skovron M.L. A functional recovery score for elderly hip fracture patients: I. Development. J Orthop Trauma. 2000;14(1):20-25.

Techniques in Hip Arthroscopy and Joint Preservation Expert Cons