The Cincinnati Knee Rating System

Published on 16/03/2015 by admin

Filed under Orthopaedics

Last modified 22/04/2025


Chapter 44 The Cincinnati Knee Rating System

Sue D. Barber-Westin, BS, Frank R. Noyes, MD

INTRODUCTION

The assessment of outcome following the treatment of knee injuries and disorders has received tremendous attention in the sports medicine literature. Over 40 knee rating instruments have been published since the mid-1980s; however, only a few have undergone rigorous testing of the psychometric properties of reliability, validity, and responsiveness.^26,^39,⁵⁶ Historically, early scoring scales and systems designed to rate the outcome of anterior cruciate ligament (ACL) reconstruction were introduced into the medical community without undergoing an assessment of these properties.^15,^19,^32,^34,^40,^63,⁶⁶ With no consensus regarding which variables to include in the measurement of patient outcome, it is not surprising that studies that compared the results of ACL reconstruction using different rating systems showed distinct differences in results and conclusions.^1,^6,^7,^22,^55,⁵⁷

In order to provide a comprehensive analysis of the knee condition and its impact on activity and function after ACL reconstruction (and other surgical procedures), clinical investigators have suggested that rating systems measure a variety of symptoms, sports and daily activity functions, patient satisfaction, and objective physical findings.^1,^57,⁶⁵ Only two such systems are currently available that have established reliability, validity, and responsiveness: the Cincinnati Knee Rating System (CKRS)^4,⁴³ and the International Knee Documentation Committee (IKDC) system.^23,²⁴ Each of these rating systems measures pain, swelling, giving-way, functions of sports and daily activities, sports activity levels, patient perception of the knee condition, range of knee motion, joint effusion, tibiofemoral and patellofemoral crepitus, knee ligament subluxations, compartment narrowing on radiographs, and lower limb symmetry during single-leg hop tests.

Critical Points INTRODUCTION

• Over 40 scales and knee rating instruments have been published; few have proven reliability, validity, and responsiveness.

• Knee rating systems must measure a variety of symptoms, sports and daily activity functions, and objective physical findings.

• Whereas the subjective assessment of symptoms and functional limitations is important, the final outcome of a specific treatment must also take into account objective measures such as physical findings, radiographs, and arthrometer measurements.

• The Cincinnati Knee Rating System is one of the most commonly used instruments to measure the results of anterior cruciate ligament reconstruction and has been considered a “gold standard” in the development and validity analyses of other knee rating scales.

The authors agree with Zarins⁶⁵ that although the subjective assessment of symptoms and functional limitations is important, the final outcome of a specific treatment must also take into account objective measures that are appropriate for the diagnosis or injury under study. The determination of patient outcome according to subjective questionnaire-based data only⁵⁹ does not provide a complete understanding of the ability of the treatment protocol to restore normal knee function. Indeed, knee rating scales exist in which a patient may be rated as “excellent” even though an ACL reconstruction failed to restore normal or nearly normal knee stability, which is one of the major goals of the procedure.⁶³ It is well known for this injury that in the short term, patients may function well, but over time, the knee joint deteriorates and functional limitations increase and frequently affect daily activities.^2,^5,^9,⁵² In addition, investigators should use instruments sensitive to the condition under study. Generic health or quality of life questionnaires have limited usefulness in studies comprising patients with a specific diagnosis and, therefore, should be used in addition to disease-specific rating systems. This chapter describes the rationale and methodology for the major components of the CKRS. The IKDC system is discussed in detail in Chapter 45, The International Knee Documentation Committee (IKDC) Rating System.

The CKRS was first published over 20 years ago in concert with the largest ACL natural history study conducted during that time period.⁵² In the early 1980s, the dilemma of the appropriate treatment for complete ACL ruptures stemmed in part from limited knowledge of the functional limitations caused by the injury and the lack of a rigorous rating system that graded symptoms and limitations according to the specific type of activity during which they occurred. Over the ensuing decade, additional scales and modifications were developed in order for the CKRS to provide a comprehensive assessment of the knee condition.^4,^43,^44,⁵¹ An overall rating scheme was devised to provide a final rating, which is available in either a numerical or a categorical manner, as is discussed later. The major components of the CKRS are shown in Table 44-1. This system is one of the most commonly used instruments in the orthopaedic literature to measure the results of ACL reconstruction and has been considered a “gold standard” in the development and content and criterion validity analyses of other knee rating scales.^21,^23,^25,³⁶ The CKRS was initially designed and validated in athletically active populations; however, it is also useful for patients who have undergone other operative procedures such as articular cartilage restorative procedures, meniscus repairs or transplants, osteotomies, or patellofemoral procedures.

TABLE 44-1 Components of the Cincinnati Knee Rating System

REVIEW OF ANALYSES USED TO MEASURE PSYCHOMETRIC PROPERTIES OF OUTCOME INSTRUMENTS

The use of outcome instruments with proven psychometric properties is essential. These properties—namely, reliability, validity, internal consistency, and responsiveness—are determined through a variety of methods.

Reliability is the extent to which scores on an instrument are reproducible and is measured either between subjects (test-retest) or between observers (interobserver). For test-retest assessment, patients complete questionnaires at separate time periods; Deyo and coworkers¹² recommended that a minimum of 1 week elapse between questionnaire administrations. Reliability is measured with product-moment correlations and intraclass correlation coefficients (ICCs). The ICC is the most commonly used statistic in modern studies and is calculated by

$$\mathrm{ICC} = \frac{A^{2} + B^{2} - C^{2}}{A^{2} + B^{2} + D^{2} - C^{2}/N}$$

where A is the standard deviation (SD) from all values in trial number 1, B is the SD from all values in trial number 2, C is the SD of the difference in all values between trials number 1 and number 2, D is the mean of the difference between trials number 1 and number 2, and N is the total number of patients.

Correlations between test-retest data should be greater than 0.70, which is considered the standard for adequate reliability for questionnaires.⁵⁴ The use of ICC rather than the more common Pearson correlation coefficient was suggested by Deyo and coworkers¹² and Lin³¹ to provide a more sensitive assessment of variability within data. The problem that can occur with the Pearson correlation coefficient is that duplicate measurements may be systematically different yet correlate highly and, as a result, be falsely interpreted. For example, if every patient scored exactly 5 points lower on a scale on the second administration, the test-retest correlation would be a perfect 1.0, despite the fact that every patient had a lower score. The ICC handles this problem because it not only assesses the strength of the correlation but also determines whether the slope and intercept in the regression line of test-retest data vary from those expected with duplicate results. In our example, the ICC would correspondingly be reduced to demonstrate the systematic difference between the test-retest data.
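The contrast between the two statistics can be illustrated with a short sketch. It assumes the two-trial form of the ICC, (A² + B² − C²) / (A² + B² + D² − C²/N), which is consistent with the symbols A, B, C, D, and N defined above; the scores are hypothetical, and every patient scores exactly 5 points lower on the second administration.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def icc_two_trials(trial1, trial2):
    """ICC for two trials, using the assumed two-trial formula.

    A^2, B^2: sample variances of trial 1 and trial 2
    C^2:      sample variance of the paired differences
    D:        mean of the paired differences
    N:        number of patients
    """
    n = len(trial1)
    diffs = [a - b for a, b in zip(trial1, trial2)]
    a2 = variance(trial1)
    b2 = variance(trial2)
    c2 = variance(diffs)
    d2 = mean(diffs) ** 2
    return (a2 + b2 - c2) / (a2 + b2 + d2 - c2 / n)


# Hypothetical test-retest scores: a uniform 5-point drop on retest.
trial1 = [60, 70, 80, 90, 100]
trial2 = [score - 5 for score in trial1]

r = pearson_r(trial1, trial2)
icc = icc_two_trials(trial1, trial2)
print(f"Pearson r = {r:.3f}, ICC = {icc:.3f}")
```

With these hypothetical data the Pearson coefficient is 1.0 (to rounding), whereas the ICC falls to about 0.95 because the mean difference D enters the denominator.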

Several measures have been described to determine the validity of an instrument, including content, construct, item-discriminant, convergent, and criterion. In general, validity is the psychometric criterion in which an instrument is tested to determine its ability to actually measure what it claims to measure.²⁷ Content validity is the extent to which a question or instrument represents the area of interest and has been described in various manners. Face validity, one example of content validity, is determined by consulting both patients and experienced medical professionals regarding the development of a scale’s questions and their relevance to the diagnosis under study.¹⁶ For instance, for ACL reconstruction investigations, a questionnaire with good face validity would be believed by patients, surgeons, and therapists to measure the common problems caused by this injury, such as pain and instability. This represents a subjective analysis that is not statistically analyzed. Another method to determine content validity is to calculate floor (worst result) and ceiling (best result) effects.^37,⁶² Scales in which the majority of patients score either the highest level or the lowest level do not allow for an assessment of deterioration or improvement over time. Floor and ceiling effects are present when greater than 30% of the population marks either the best possible or the worst possible scores on a scale.²⁸
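The 30% criterion for floor and ceiling effects can be checked with a few lines of code. The sketch below is illustrative only: the function name and the sample scores are hypothetical, and it assumes the threshold is applied separately to the worst and the best possible score.

```python
def floor_ceiling_effects(scores, worst, best, threshold=0.30):
    """Report the fraction of patients at the worst and best possible score.

    An effect is flagged when the fraction exceeds the threshold
    (30% by default, applied to floor and ceiling separately; that
    separate application is an assumption of this sketch).
    """
    n = len(scores)
    floor_fraction = sum(s == worst for s in scores) / n
    ceiling_fraction = sum(s == best for s in scores) / n
    return {
        "floor_fraction": floor_fraction,
        "ceiling_fraction": ceiling_fraction,
        "floor_effect": floor_fraction > threshold,
        "ceiling_effect": ceiling_fraction > threshold,
    }


# Hypothetical scores on a 0-10 scale.
scores = [10, 10, 10, 10, 8, 6, 6, 4, 2, 0]
result = floor_ceiling_effects(scores, worst=0, best=10)
print(result)
```

In this hypothetical sample, 4 of 10 patients mark the best possible score, so a ceiling effect is flagged and the scale could not register further improvement in those patients.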

Construct validity is the extent to which a measure corresponds to expected theoretical concepts or hypotheses regarding the diagnosis. A valid instrument will accurately differentiate patients whose outcome is expected to vary with regard to certain characteristics known to the disease process.^14,⁶¹ Researchers develop hypotheses based on prior research and clinical experience in which the questionnaire scores are expected to be significantly different between selected patient groups. The hypotheses are tested using F-tests and t-tests at the level of P < .01. In addition, construct validity is determined by calculating Pearson product-moment correlation coefficients between scale items and either previously validated instruments or physician and patient assessments. A moderately strong correlation is indicated by R > 0.60.

Item-discriminant (or –divergent) validity is present when variables hypothesized to be dissimilar (such as patient age and anteroposterior [AP] knee displacements) are indeed found to be statistically unrelated.^14,³⁸ Pearson correlations are performed to detect statistical dissimilarity, which is indicated by R of 0.28 or less. Conversely, convergent validity is present when variables believed to be similar within the questionnaire are indeed found to be statistically similar.

Criterion (or concurrent) validity is assessed by correlating scores on the instrument under study with other criteria known or believed to adequately measure the function or symptom. This determines how a new instrument compares with an accepted gold standard instrument. The Pearson product-moment correlation coefficient is used to determine this property, with moderately strong findings indicated by R > 0.60.

Internal consistency is determined by a coefficient α greater than 0.60.²⁸ The underlying concept of this measure is that the consistency with which a patient answers from one question to the next can be used to provide an estimate of reliability for the total test score.⁴¹ A high coefficient α indicates that the items in a questionnaire are consistently measured or are homogeneous with regard to the measurement of the underlying diagnosis or attribute.³⁰
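A minimal sketch of this calculation follows. It assumes that “coefficient α” refers to Cronbach's alpha, the usual internal-consistency statistic; the text does not name the formula, and the item responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha (assumed here to be the 'coefficient alpha').

    items: one list of scores per questionnaire item, with patients
           in the same order in every list.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(items)
    totals = [sum(patient_scores) for patient_scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses of five patients to three items.
items = [
    [2, 4, 6, 8, 10],
    [2, 4, 6, 6, 10],
    [4, 4, 6, 8, 8],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.2f}, adequate = {alpha > 0.60}")
```

The comparison against 0.60 applies the threshold stated above; items that rise and fall together across patients, as in this example, yield a high coefficient.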

Responsiveness, or the ability of an instrument to detect clinically important change, is determined by calculating standardized response means (SRM) and effect sizes (ES) of the selected instrument categories. The magnitude of the SRM (mean change in score from preoperative to follow-up divided by the SD of the change in score)¹⁸ and the ES (mean change in score from preoperative to follow-up divided by the SD of the preoperative score)²⁹ are interpreted using the Cohen standard of greater than 0.20 for small effects, greater than 0.50 for moderate effects, and greater than 0.80 for large effects.¹⁰ This analysis provides a more precise indication of the change in results over time than that obtained by the standard Student t-test. An instrument’s sensitivity simply denotes its ability to measure any change, which by definition does not necessarily indicate one that is clinically meaningful.²⁷
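The two responsiveness statistics differ only in their denominators, which a short sketch makes explicit. The function names and scores are hypothetical, and the label “negligible” for values of 0.20 or less is an assumption added for completeness.

```python
from statistics import mean, stdev


def standardized_response_mean(pre, post):
    """SRM: mean change divided by the SD of the change scores."""
    changes = [after - before for before, after in zip(pre, post)]
    return mean(changes) / stdev(changes)


def effect_size(pre, post):
    """ES: mean change divided by the SD of the preoperative scores."""
    changes = [after - before for before, after in zip(pre, post)]
    return mean(changes) / stdev(pre)


def cohen_label(value):
    """Interpret a magnitude using the Cohen thresholds quoted in the text."""
    magnitude = abs(value)
    if magnitude > 0.80:
        return "large"
    if magnitude > 0.50:
        return "moderate"
    if magnitude > 0.20:
        return "small"
    return "negligible"  # label assumed; the text defines only the three bands


# Hypothetical preoperative and follow-up scores for six patients.
pre = [2, 4, 2, 0, 4, 2]
post = [6, 8, 4, 6, 8, 6]

srm = standardized_response_mean(pre, post)
es = effect_size(pre, post)
print(f"SRM = {srm:.2f} ({cohen_label(srm)}), ES = {es:.2f} ({cohen_label(es)})")
```

Because the SRM uses the variability of the change itself, a uniform improvement across patients produces a large SRM even when the preoperative scores are widely spread.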

COMPONENTS OF THE CKRS

Rating of Symptoms

Pain, swelling, partial giving-way, and full giving-way are the major knee symptoms assessed in ACL investigations. Authors have proposed a variety of methods for the measurement of symptoms, from a binary system (“yes” or “no”^19,³⁴), to visual analog scales,^17,^21,³⁹ to a severity rating (such as mild, moderate, severe) which can be done either alone³² or in combination with activities (such as “slight after strenuous sports”⁶⁶).^15,^32,^56,⁶⁶

In 1983, Noyes and associates⁵² proposed that the assessment of knee symptoms should be scored according to the activity during which they occurred: strenuous sports, recreational sports, or walking. This rationale provided an understanding of the impact of a chronically ACL-deficient knee, because 30% of 103 patients reported pain with walking alone, 47% with recreational sports, and 69% with strenuous sports in the authors’ natural history study.

The assessment of symptoms was later refined and the scale increased to a six-level gradient shown in Figure 44-1.^43,^44,⁵⁰ Points are awarded for the highest activity level in which the patient is able to participate without incurring the symptom (Appendix A). If the symptom is present with activities of daily living, it is rated as either moderate (frequent, limiting) or severe (constant, not relieved). Definitions are provided for terms that might otherwise be ambiguous to patients, such as “moderate” sports (running, turning, twisting) and “strenuous” sports (jumping, hard pivoting).

FIGURE 44-1 Symptom Rating Scale.

(From Barber-Westin, S. D.; Noyes, F. R.; McCloskey, J. W.: Rigorous statistical reliability, validity, and responsiveness testing of the Cincinnati Knee Rating System in 350 subjects with uninjured, injured, or anterior cruciate ligament-reconstructed knees. Am J Sports Med 27:402–416, 1999.)

When reporting symptoms before and after surgery, a distribution of the percentage of patients in each of the six levels should be shown (along with a mean and SD) for both time periods. An example is shown in Figure 44-2 for a group of patients who received a meniscus transplant.⁴⁸ The data were also expressed in the body of the text as “the mean preoperative Cincinnati Knee Rating Scale pain score of 2.5 points (range, 0–6 points) improved to a mean of 5.8 points (range, 0–10 points) at follow-up (P < .0001). Before the meniscus allograft, thirty patients (79%) had moderate to severe pain with daily activities but at follow-up, only four patients (11%) had pain with daily activities.”⁴⁸
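The recommended report (distribution across the six levels plus mean and SD) can be produced as sketched below. The scores are hypothetical and are not taken from the cited study; the level values 0, 2, 4, 6, 8, and 10 follow the scale described in this chapter.

```python
from collections import Counter
from statistics import mean, stdev

SYMPTOM_LEVELS = (0, 2, 4, 6, 8, 10)  # six-level gradient


def summarize_symptom_scores(scores):
    """Return mean, SD, and percentage of patients at each level."""
    counts = Counter(scores)
    n = len(scores)
    percent_by_level = {
        level: 100 * counts.get(level, 0) / n for level in SYMPTOM_LEVELS
    }
    return {
        "mean": mean(scores),
        "sd": stdev(scores),
        "percent_by_level": percent_by_level,
    }


# Hypothetical preoperative pain scores for ten patients.
preoperative = [0, 2, 2, 2, 4, 4, 6, 2, 0, 4]
summary = summarize_symptom_scores(preoperative)
for level, percent in summary["percent_by_level"].items():
    print(f"level {level:>2}: {percent:.0f}%")
print(f"mean = {summary['mean']:.1f}, SD = {summary['sd']:.1f}")
```

Running the same function on the follow-up scores gives the second distribution, so the two time periods can be shown side by side.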

FIGURE 44-2 The pain scale shows the highest level of activity possible without the patient experiencing knee pain. This example was taken from a clinical study on meniscus transplantation. The difference between the preoperative and the follow-up visit was statistically significant (P < .0001). Mod, moderate; Sev, severe.

(From Noyes, F. R.; Barber-Westin, S. D.; Rankin, M.: Meniscal transplantation in symptomatic patients less than fifty years old. J Bone Joint Surg Am 86:1392–1404, 2004.)

One problem may occur in the rating of symptoms when patients have not attempted to return to strenuous sports activities. In these situations, a potential bias may occur if the patient or clinician attempts to project the correct symptom level based on a hypothetical answer. For instance, if a patient returned to bicycling or swimming and had no pain with those activities, then the pain score awarded would be a level 6 (see Fig. 44-1). However, if the patient is asked whether she or he believes pain would occur with level 8 activities (running, twisting, turning) and she or he responds that it probably would not occur, a bias would occur if this score were assigned without further verification that this was indeed true. This is often the case with the symptom of giving-way, because patients frequently return asymptomatically to level 6 or 8 activities postoperatively, but state that they participated a few times at level 10 activities (jumping, hard pivoting) without problems. Points are awarded only when there is a reasonable basis for the assessment, not on the basis of the patient’s speculation about the level that may be possible.

This potential bias is a particular problem with populations of chronic knee injuries with compounding problems of advanced articular cartilage deterioration, multiple ligament reconstructive procedures, or varus osseous malalignment that is corrected with a high tibial osteotomy (HTO) in addition to ACL reconstruction. This problem has two solutions. First, the clinician may ask the patient to test the knee at higher levels of activities, if both patient and physician believe this is a reasonable request. Then, the patient can be contacted later for a symptom rating after he or she has participated in strenuous activities several times.

Second, the clinician may use the modified symptom rating scale that was first introduced in an investigation of patients with varus osseous malalignment and ACL deficiency who were treated with multiple operative procedures.⁴⁵ This modified scale consists of a four-level gradient. The levels of 0, 2, and 4 are the same as the original scale shown in Figure 44-1. The fourth and highest level (level 6) indicates that some type of sports participation is possible without the symptom (Fig. 44-3). This modified scale is intended for studies in which the majority of patients do not return to moderate or strenuous athletics indicated in levels 8 and 10. The reliability of this modified scale was previously shown to be adequate for patients and normal subjects.⁴ However, clinicians should be aware that the modified scales might have reduced sensitivity, especially if the results of a study that used the modified scale are compared with another study that used the original scale. This pertains only to the individual symptom results. The effect of the modified scale on the overall rating score, described later in this chapter, is small and has only a negligible impact when comparing the data of the overall scores between different populations.
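When results recorded on the original six-level scale need to be expressed on the modified four-level scale (for instance, to compare two studies), the relationship described above implies a simple collapse of levels 8 and 10 into level 6. The helper below is a hypothetical sketch of that conversion, not part of the published system.

```python
ORIGINAL_LEVELS = (0, 2, 4, 6, 8, 10)
MODIFIED_LEVELS = (0, 2, 4, 6)


def to_modified_scale(original_score):
    """Collapse an original symptom score onto the modified scale.

    Levels 0, 2, and 4 are unchanged. Levels 6, 8, and 10 all indicate
    that some type of sports participation is possible without the
    symptom, so they map to the modified scale's highest level (6).
    """
    if original_score not in ORIGINAL_LEVELS:
        raise ValueError(f"not a valid symptom level: {original_score}")
    return min(original_score, 6)


# Hypothetical follow-up scores recorded on the original scale.
original_scores = [2, 4, 6, 8, 10, 10]
modified_scores = [to_modified_scale(s) for s in original_scores]
print(modified_scores)
```

The conversion runs in one direction only: a score of 6 on the modified scale cannot be expanded back to 6, 8, or 10, which is the loss of sensitivity noted above.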

FIGURE 44-3 Modified Symptom Rating Scale.

Even though there is always the potential for a bias to exist regarding the scoring of subjective symptoms, the CKRS format allows for an accurate assessment of the activity levels the patients returned to on a routine basis. The scale was designed to not award points if a patient participates at a high activity level but has symptoms, thereby fulfilling the criteria of the “knee abuser.”⁵²

Rating of Patient Perception of the Knee Condition

Modern knee rating systems incorporate some form of patient satisfaction, or rating of the patient’s perception of the knee condition, into the assessment of clinical outcome.^4,²³ In the CKRS, patients are asked to rate the overall condition of the knee by circling a number on a scale from 1 to 10 (Fig. 44-4). Four descriptors are provided to assist the patient in understanding the meaning of the numerical scale. Under the number 2 is the term “poor,” defined as “I have significant limitations that affect activities of daily living,” whereas under the number 10 are the terms “normal/excellent,” defined as “I am able to do whatever I wish (any sport) with no problems” (see Appendix A). For data reporting purposes, a distribution of responses is shown in a five-level gradient. Responses under numbers 1 and 2 are termed “poor”; those under numbers 3 and 4, “fair”; those under numbers 5 and 6, “good”; those under numbers 7 and 8, “very good”; and those under numbers 9 and 10, “normal.”
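The grouping of the 10 numerical responses into the five reporting categories can be written as a small lookup. The function name is hypothetical; the category boundaries are those listed above.

```python
PERCEPTION_CATEGORIES = ("poor", "fair", "good", "very good", "normal")


def perception_category(score):
    """Map a 1-10 patient perception score to its reporting category.

    1-2 poor, 3-4 fair, 5-6 good, 7-8 very good, 9-10 normal.
    """
    if not isinstance(score, int) or not 1 <= score <= 10:
        raise ValueError("score must be an integer from 1 to 10")
    return PERCEPTION_CATEGORIES[(score - 1) // 2]


# Hypothetical responses from five patients.
responses = [2, 3, 6, 7, 10]
print([perception_category(r) for r in responses])
```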

FIGURE 44-4 Patient Perception of the Knee Condition.

An example is shown in Figure 44-5 for a group of patients who received a meniscus transplant.⁴⁸ The data were also expressed in the body of the text as “the mean preoperative patient perception score of 3.2 points (range, 1–6 points) improved to a mean of 6.2 points (range, 1–9 points) at follow-up (P = .0001). Two patients rated the knee condition as the same, and two as worse.”

FIGURE 44-5 An example of the distribution of patient perception of the overall knee condition, from a clinical study on meniscus transplantation. The difference between preoperative and follow-up was statistically significant (P < .0001).

(From Noyes, F. R.; Barber-Westin, S. D.; Rankin, M.: Meniscal transplantation in symptomatic patients less than fifty years old. J Bone Joint Surg Am 86:1392–1404, 2004.)

Clinicians and researchers should realize that inconsistencies might arise from the responses to the patient perception scale and those in other scales of the CKRS. For instance, an 18-year-old patient who successfully returned to competitive soccer without problems or symptoms rated the overall knee condition as a 7 because he felt “slower than before the injury.” Conversely, a 45-year-old patient with a triple varus knee who required an HTO, ACL reconstruction, and posterolateral reconstruction and was able to return only to low-impact activities, also rated her knee condition as a 7. She indicated she was exceptionally pleased that her constant pain with daily activities had resolved and that she was able to swim and bicycle without problems. Because this portion of the CKRS is perhaps the most subjective of all of the assessment factors, it is not correlated with other outcome measures. This underscores the inconsistencies in reporting of patient outcome when only a patient perception rating is used to determine the results of an operation.

Rating of Sports and Daily Function and Activities

In the sports medicine literature, the accurate assessment of sports activity levels and participation is crucial in the determination of the result of an operation or treatment of any knee problem. This is true for many situations. For instance, if a competitive athlete chooses an ACL reconstruction after an acute rupture in order to be able to return to a high activity level, the outcome instrument must be able to precisely define the level of sports participation before and after surgery. Or, if a sedentary individual with moderate arthrosis and giving-way symptoms chooses ACL reconstruction to reduce limitations with daily activities and adopt a more active lifestyle, the instrument must be able to detect a more subtle change (increase) in activity. An instrument must also be able to detect the knee abuser or the patient who returns to athletics and experiences symptoms that may be harmful to the knee joint over the long term.⁵² These individuals should be sorted from others who returned to athletics without problems. Thus, the accurate assessment of sports participation must span many levels of ability, intensity, and frequency of participation; be able to sort populations according to changes in athletic levels or lifestyle; and identify patients who experience symptoms during athletic activities.

Critical Points RATING OF SPORTS AND DAILY FUNCTION AND ACTIVITIES

• Accurate assessment of sports participation must span many levels of ability, intensity, and frequency of participation; be able to sort populations according to changes in athletic levels or lifestyle; and identify patients who experience symptoms during athletic activities.

The CKRS Sports Activity Scale was first introduced in 1989 after an analysis of existing scales at that time period detected multiple biases and potential sources of error in reporting outcome of ACL reconstruction.⁴⁴ The problems identified included (1) the failure to precisely define sports activity levels according to specific sport and intensity of participation, (2) the failure to sort populations according to overall intensity of athletic participation before and after treatment, (3) the failure to detect and sort from the population patients who return to sports and experience significant symptoms, (4) the combination of work and sports activities into the same scale, and (5) the failure to detect alterations in athletics due to changes in lifestyle and non–knee-related factors.

The goal in the development of the Sports Activity Scale was to distinguish among categories of athletic activities in a manner that allowed investigators to apply the rating in a uniform manner to any type of athletic activity. Two criteria were selected to determine this rating. First, the frequency of participation was determined using a four-level gradient that assigned patients to a subgroup depending upon the number of days in a week (or month) of sports participation (Fig. 44-6; see also Appendix A). Second, the knee functions that occurred during various sports activities were determined and sorted into three subgroups. The first subgroup consisted of sports that involved the most difficult knee motions: jumping, hard pivoting, and cutting. The second subgroup included sports that involved running, twisting, and turning. The third subgroup consisted of sports that did not involve running, twisting, or jumping (e.g., swimming and cycling).

FIGURE 44-6 Sports Activity Scale.

This sorting of sports activities according to frequency and intensity eliminates the ambiguous classification of athletes into categories such as recreational or competitive. Note that collegiate, professional, and adolescent athletes can be placed in the same category on this scale. Although some examples of sports are listed under the various subgroups, any athletic activity may be placed into the scale according to the knee functions that occur during that particular activity. The investigator may analyze the sports activity scale data according to either the frequency subgrouping or the intensity subgrouping categories. In addition, the sports scale incorporates a three-level gradient for patients who do not participate in athletics. This allows a determination of the overall severity of symptoms with activities of daily living.

The reporting of the patient responses to the Sports Activity Scale may be shown as a distribution according to either frequency or intensity of activities. An example is shown in Table 44-2 from a study on posterior cruciate ligament (PCL) reconstruction.⁴⁶ An average score should not be calculated from this scale, because the data are categorical in nature.

TABLE 44-2 Example of Sports Activity Levels before and after Posterior Cruciate Ligament Reconstruction

Type of Sport	Preoperative*	Follow-up*
Jumping, hard pivoting, cutting	0	2
Running, twisting, turning	1	2
Swimming, bicycling	7	11
None	11	4
Change in Sports Activities		4
Increased level, no symptoms		8
Same level, no symptoms		4
Decreased level, no symptoms		1
Playing with symptoms		2
No sports, knee-related reasons		3
No sports, non–knee-related reasons		1

* Number of patients.

From Noyes, F. R.; Barber-Westin, S.: Posterior cruciate ligament replacement with a two-strand quadriceps tendon–patellar bone autograft and a tibial inlay technique. J Bone Joint Surg Am 87:1241–1252, 2005.

A second component of the assessment of sports activities is the change that occurs in activity levels between treatment periods, the most common being that which occurs between the preoperative and the most recent follow-up evaluations (Fig. 44-7). The format is designed to determine changes in sports activities due to either knee-related or non–knee-related reasons and to detect knee abusers. The data from this analysis are usually displayed in publications in the same table as the Sports Activity Level data (see Table 44-2).

FIGURE 44-7 Change in sports activities.

The third component of the assessment of sports activities is the rating of six individual functions that place varying loads on the knee joint (Fig. 44-8). Functions of sports are analyzed separately from those of daily activities to assess limitations in all patients, not just those participating in athletics. Even in cases in which the knee joint is markedly affected, patients are able to indicate the problems encountered with these types of activities. Each function is determined using a four-level gradient whose terminology was selected to decrease the subjective component inherent in this type of analysis. The data obtained for each of the six functions are reported individually using means, SDs, and the distribution of responses in each of the four levels.

FIGURE 44-8 Activities of Daily Living and Sports Function Scales.

Rating of Occupational Activities

Little attention has been given in the sports medicine literature to the effect of ACL reconstruction on work activities. The results of knee operations and treatment programs should be determined by assessing both occupational and sports activities, especially if patients are not involved in athletics because of lack of interest, time, ability, or other factors. In these cases, the ability of an operation or treatment program to successfully return patients to their occupation is a primary determinant of the program’s effectiveness. This determination can only be made with a rating scale for work activities, which is separate from that of sports activities. Scales that combine work and sports into one activity rating do not allow for this analysis to be performed.⁶³

A study was conducted on published occupational rating systems in which several biases and potential sources of error were detected when reporting outcomes of treatment programs.⁵¹ One problem was the lack of a scale that sufficiently evaluated the amount of stress placed on the knee at work or whether the knee condition was limiting work activities. Ambiguous terminology, such as light, moderate, and heavy, was frequently encountered to describe occupations. No published scale incorporated a method to detect a change in work status due to factors not related to the knee condition. Based on this study, an Occupational Rating Scale (Fig. 44-9) was created that was composed of seven factors that place varying amounts of load on the lower extremity (see Appendix A). Each factor was graded according to intensity, frequency, and duration of the task required on a daily basis. The factors chosen were commonly used to assess the condition of the knee joint in workers’ compensation evaluations. This scale was found to be more effective and valid in assessing demands of occupations on the lower extremities than scales that used job titles or arbitrary numerical scales.⁵¹

FIGURE 44-9 Occupational Rating Scale.

The factor “standing/walking” was selected because of the harmful effect prolonged weight-bearing has on patients with certain knee disorders, including patellofemoral and tibiofemoral arthrosis. “Climbing” and “squatting” assess the ability of the knee to function under repetitive loading conditions. The factor “walking on uneven ground” is important for assessing joint instability in knees after a ligament injury or reconstruction. The factors “lifting/carrying” and “pounds carried” are helpful in assessing the general ability of the knee to tolerate specific loading conditions. The number of factors assessed was limited for the sake of brevity and simplicity, even though many others could have been included. These factors were not chosen to relate to one diagnosis or condition. Rather, they provide a general intensity rating of the work conditions on the lower extremity.

Each factor is graded according to one of two numerical scales: five factors are assigned a possible total of 10 points and two factors are assigned a possible total of 5 points. In terms of rating loads placed on the knee joint, the numbers assigned to rate the work intensity of one factor do not necessarily equal the same numbers assigned to rate the work intensity of other factors. The actual loads placed on the knee joint are a function of multiple intrinsic and extrinsic factors that would require actual measurement. The scores for each factor are totaled, providing a numerical score for data reporting purposes. Occupations may then be categorized based on the total number of points as either disabled (0 points), very light (1–20 points), light (21–40 points), moderate (41–60 points), heavy (61–80 points), or very heavy (>80 points). The data may be expressed as means and SDs for the population under study or through the distribution of patients in the six occupational categories before and after treatment.
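The categorization by total points can be sketched as a simple threshold lookup. The function name is hypothetical, and the sketch assumes the light band ends at 40 points so that it does not overlap the moderate band, which begins at 41.

```python
def occupational_category(total_points):
    """Map a total Occupational Rating Scale score to its category.

    Bands: 0 disabled; 1-20 very light; 21-40 light (upper bound of 40
    assumed so the bands do not overlap); 41-60 moderate; 61-80 heavy;
    above 80 very heavy.
    """
    if total_points < 0:
        raise ValueError("total points cannot be negative")
    if total_points == 0:
        return "disabled"
    if total_points <= 20:
        return "very light"
    if total_points <= 40:
        return "light"
    if total_points <= 60:
        return "moderate"
    if total_points <= 80:
        return "heavy"
    return "very heavy"


# Hypothetical total scores before and after treatment.
before_and_after = [(0, 18), (25, 44), (62, 62)]
for before, after in before_and_after:
    print(occupational_category(before), "->", occupational_category(after))
```

Tabulating the returned categories for all patients before and after treatment gives the distribution described above; the totals themselves remain available for means and SDs.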

The second major component in the assessment of work activities is the change that occurs in these activities between treatment periods, most commonly between the preoperative and most recent follow-up evaluations. The format and terminology are the same as those used in the assessment of the change in sports activities between treatment periods, previously described in Figure 44-7. Importantly, a determination is made as to whether changes in work activities are due to knee-related or non–knee-related reasons and whether the patient is experiencing symptoms in her or his occupation.

The reliability of the Occupational Rating Scale was demonstrated to be adequate in patients with ACL ruptures, chronic patellofemoral disorders, degenerative meniscal tears, and degenerative knee joint arthritis.⁴ This scale is useful in the clinic to rate and categorize the occupational status of workers’ compensation patients and in the development of disability ratings. In addition, the Occupational Rating Scale has been used to assess the effect of insurance plans (private vs. workers’ compensation) on the outcome of ACL reconstruction.^47,⁶⁴

Patient History

For research purposes, a specific data collection format was developed to ensure that accurate and consistent information is obtained for the patient history (Fig. 44-10). For instance, the coding of major reinjuries and prior surgical procedures to the involved knee is necessary to determine whether the patient meets the established criteria for a particular study. For example, an acute prospective study would include only those patients who have sustained one major injury to the knee and no prior surgical intervention. In such an investigation, patients who had additional injuries or previous surgical procedures to the involved knee would be excluded. In addition, the condition of the noninvolved knee must be taken into consideration. Patients who have problems with the noninvolved knee are sorted into a separate subgroup to determine whether this factor affects the treatment outcome.

FIGURE 44-10 Patient history.

Only information related to orthopaedic conditions is included in the history portion of the CKRS. Patients are screened for additional medical problems, but these findings are not usually included unless they alter the treatment outcome. General demographic data are documented for identification and description purposes. The history of prior knee surgeries is coded by major operative categories to provide the basic information necessary for reporting prior treatment.

Knee Examination

The knee examination portion of the physical evaluation is designed to document all parameters of the knee condition (Fig. 44-11). The contents are divided into five sections. The general examination section includes joint effusion and range of motion. Joint effusion is rated as normal, mild (<25 ml of fluid), moderate (26–60 ml of fluid), or severe (>60 ml of fluid). Knee flexion and extension are measured with a goniometer; the heel-height method is also used to determine differences in extension between knees.

FIGURE 44-11 Knee examination.

The tibiofemoral assessment includes joint line pain, crepitus, and compression pain. Pain is rated as none, mild, moderate, or severe. Crepitus is graded as none, mild, moderate, or severe. Moderate crepitus indicates that a cartilage abnormality (definite fibrillation) exists within 25° to 50° of knee motion. Severe crepitus indicates that the abnormality is present in an arc greater than 50° of knee motion. The patellofemoral examination includes crepitus, compression pain, lateral subluxation at 20°, medial subluxation at 20°, soft tissue pain, and soft tissue swelling. Crepitus and pain are rated as described for the tibiofemoral assessment. Lateral patellar subluxation at 20° of flexion is rated as normal if the patella can be moved within 25% of its width; mild, 26% to 50%; moderate, 51% to 75%; and severe, greater than 75%. Medial patellar subluxation at 20° of knee flexion is rated as normal if the patella can be moved 5 mm or less; mild, 6 to 10 mm; moderate, 11 to 15 mm; and severe, greater than 15 mm.
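The percentage-based grading of lateral patellar subluxation can be expressed as a small Python sketch. The function name and its millimeter inputs are hypothetical, chosen only for illustration; the sketch also assumes that a value falling between two printed ranges (for example, 25.5%) belongs to the higher grade.

```python
def lateral_patellar_subluxation_grade(displacement_mm: float,
                                       patellar_width_mm: float) -> str:
    """Grade lateral patellar displacement as a percentage of patellar width."""
    if patellar_width_mm <= 0:
        raise ValueError("patellar width must be positive")
    if displacement_mm < 0:
        raise ValueError("displacement cannot be negative")
    percent = 100.0 * displacement_mm / patellar_width_mm
    if percent <= 25:
        return "normal"
    if percent <= 50:  # assumption: values between 25% and 26% count as mild
        return "mild"
    if percent <= 75:
        return "moderate"
    return "severe"
```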

The radiographic section assesses the three knee compartments as either normal (no joint space narrowing), mild, moderate (narrowing < 50% of the total joint space) or severe (narrowing > 50% of the total joint space) using standing 45° posteroanterior and 30° axial patellofemoral radiographs. Studies in which lower limb malalignment exists also require standing radiographs of the hip-knee-ankle to document the weight-bearing line and mechanical axis.¹³

The ligament subluxation section documents the results of the pivot shift test (scale, 0–3) and other knee ligament subluxation tests (see Fig. 44-11). The pivot shift test is recorded on a scale of 0 to 3, with a grade of 0 indicating no pivot shift; grade 1, a slip or glide; grade 2, a jerk with gross subluxation or clunk; and grade 3, gross subluxation with impingement of the posterior aspect of the lateral side of the tibial plateau against the femoral condyle. The posterior drawer test (palpation of the medial tibiofemoral step-off) is done at 90° of flexion, and the difference in posterior tibial translation between the reconstructed knee and the contralateral knee is recorded. The clinical tests for lateral and medial joint opening are based on the varus and valgus stress tests performed at 0° and 30° of flexion. The examiner estimates the amount (in millimeters) of joint opening from the initial closed contact position of each tibiofemoral compartment to the maximal opened position, comparing the injured with the opposite normal knee. The tibiofemoral rotation test is used to detect posterior tibial subluxations in a qualitative manner at 30° and 90° of knee flexion.⁴⁹

Objective Testing

Objective testing of the lower extremity is performed to determine ligamentous, muscular, and limb symmetry deficits (Fig. 44-12). The KT-2000 knee arthrometer test is performed at 20° of flexion at 134 N for ACL investigations. Total AP displacement is measured because this is more accurate than attempting to set a neutral point between anterior and posterior tibial displacement. The difference in the measurements between the reconstructed and the contralateral knee is recorded. Patients with an ACL or PCL rupture in the contralateral knee are excluded from KT-2000 testing. The manual maximum arthrometer examination is not used because the magnitude of the force applied is highly variable and is neither defined nor measured. This problem produces inconsistent results between examiners and studies.

FIGURE 44-12 Objective testing.

Posterior stress radiographs are done with an 89-N force applied to the proximal tibia in knees with PCL ruptures.²⁰ A lateral radiograph is taken of each knee at 90° of flexion. The limb is placed in neutral rotation with the tibia unconstrained and the quadriceps relaxed. The difference in posterior tibial displacement between the reconstructed knee and the contralateral knee is recorded. More than 8 mm of increase in posterior tibial translation on stress testing indicates a complete PCL rupture. The knee arthrometer is the device most frequently used by investigators to measure posterior tibial translation after PCL replacement. However, this device underestimates the true amount of residual posterior translation in PCL-injured and reconstructed knees and, in the authors’ opinion, is unreliable for determining the results of PCL surgical procedures at high flexion angles.^20,^33,⁶⁰ Lateral stress radiographs are performed in knees with deficiency of the posterolateral structures. Both knees are radiographed at 20° flexion, neutral tibial rotation, and 67 N varus force and a comparison is made of the millimeters of lateral tibiofemoral compartment opening.

Muscular weakness or imbalances are assessed with isokinetic testing at either slow (60°/sec) or fast (300°/sec) speeds or in an isometric mode for patients with patellofemoral pain or symptomatic crepitus. Single-legged function hop tests are performed as previously described.^3,⁴² Patients perform any two of the hop tests shown in Figure 44-12 to determine functional limitations after ACL reconstruction and to continue the functional progression for return to sports activities.

Operative Procedures and Articular Cartilage Rating

Documentation of the operative procedures performed and the condition of all ligamentous structures, menisci, and articular cartilage in the patellofemoral and tibiofemoral compartments is essential (Fig. 44-13). The operative procedures or findings may be used as final exclusionary or inclusionary criteria as determined by the study protocol.

FIGURE 44-13 Operative procedures, articular cartilage condition.

A classification system for the rating of articular cartilage has been described previously.⁵³ This system separates the description of the surface appearance of the cartilage from the extent or depth of the defect. Three surface grades are rated: (1) closed chondromalacia (surface intact), (2) open lesions of fibrillation and fragmentation (surface disrupted), and (3) bone exposed. Subtypes (a) and (b) then make a distinction regarding the depth of the lesion as described. The location and millimeters of surface involvement are recorded. Chapter 47, Articular Cartilage Rating Systems, further details this and other articular cartilage rating systems. Treatment of articular surfaces, such as débridement, drilling, abrasion arthroplasty, and shaving, is also documented along with the condition of the articular cartilage.

Documentation of Postoperative Complications

Complications from surgical procedures or rehabilitation programs are documented on a prospective basis (Fig. 44-14). In addition, senior researchers or surgeons who did not participate in the care of the patient conduct a chart review to ensure complete and accurate reporting of complications and results of treatment.

FIGURE 44-14 Complications.

Overall Rating Assessment

An overall rating assessment is performed based on the ratings from 20 factors including symptoms, functional limitations, knee examination, stability, radiographs, and function testing (Appendix B). In patients in whom the ACL reconstruction is done for an acute injury (within 12 wk of the original injury), the analysis of the overall rating is performed in a manner different from that for patients with chronic injuries, who often have compounding problems such as articular cartilage damage, prior meniscal loss, loss of secondary restraints, and associated ligamentous instabilities.

Patients with acute knee injuries are expected to have more favorable results, especially in the symptom and functional limitations analyses, and therefore, a categorical analysis is used (along with a point scale) to determine the outcome of these patients. With this method, patients receive an overall rating grade of “excellent,” “good,” “fair,” or “poor” from the 20 factors shown in Appendix B. To receive a rating of “excellent,” all but 1 score must be in that category, with the remaining score in the “good” category. A “good” result is given when all scores are in the “excellent” or “good” categories and none are in the “fair” or “poor” categories. A “fair” result is assigned when any 1 score is “fair,” and a “poor” result when any 1 score is “poor.” In short, a knee can be only as good as the worst rating in any 1 of the 20 categories. This rationale was developed to solve the problem incurred with other systems that only use point scores to determine results and then arbitrarily assign categories for a certain number of points (such as “excellent” for 91–100 points, “good” for 81–90 points, and so on). In these systems, an ACL reconstruction that failed according to knee arthrometer testing could still receive an “excellent” or “good” rating if all other parameters score high numbers. Using this rationale, the CKRS is one of the strictest in terms of reporting results for acute knee injury studies.
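The categorical rule for acute injuries can be sketched in Python as follows. The function name `acute_overall_rating` is hypothetical, and the sketch makes one assumption that the text does not state explicitly: a knee whose factors are all "excellent" (with no "good" score at all) is also rated "excellent."

```python
from collections import Counter

GRADES = ("excellent", "good", "fair", "poor")


def acute_overall_rating(factor_grades):
    """Return the overall categorical grade from the individual factor grades."""
    grades = [grade.lower() for grade in factor_grades]
    if not grades:
        raise ValueError("at least one factor grade is required")
    unknown = set(grades) - set(GRADES)
    if unknown:
        raise ValueError(f"unrecognized grades: {sorted(unknown)}")
    counts = Counter(grades)
    if counts["poor"] > 0:   # any single "poor" score gives a poor result
        return "poor"
    if counts["fair"] > 0:   # any single "fair" score gives a fair result
        return "fair"
    if counts["good"] <= 1:  # assumption: zero or one "good" score is excellent
        return "excellent"
    return "good"
```

The function does not check that exactly 20 factors were supplied, so it can also be applied to a subset of factors when exploring how one low score drives the overall grade.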

The overall rating of knees with chronic injuries is not determined in the same categorical manner as that used for acute injuries because many of these individuals have preexisting problems that cannot be corrected with ACL reconstruction. In these instances, patients may be negatively biased into the “fair” or “poor” categories. For instance, a knee with preoperative moderate joint space narrowing on radiographs would not be expected to show improvement in this factor after an ACL reconstruction. This knee would, therefore, not have a chance of improving from the preoperative overall rating of “fair” to one of “good” or “excellent.” Because of these problems, the results for chronic knees are determined in the following manner. First, special emphasis is placed on the results of the individual components that make up the overall rating categories such as symptoms, knee range of motion, arthrometer testing, and functional testing. These data are presented in separate tables and graphs to allow comparisons of these important parameters between investigations. Second, the overall point total is calculated before and after treatment, including the change in points between evaluations. A distribution of the change in points is provided to allow assessment of slight change (<10 points), moderate change (between 10 and 20 points), and significant change (>20 points) between evaluations.
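The grouping of the change in overall points for chronic knees can be illustrated with a minimal Python sketch. The function name is hypothetical, and the sketch assumes that the thresholds apply to the magnitude of the change, so that a decline in points is classified in the same way as an improvement of the same size.

```python
def classify_point_change(pre_points: float, post_points: float):
    """Return the signed change in points and its descriptive category."""
    change = post_points - pre_points
    magnitude = abs(change)  # assumption: direction is reported separately
    if magnitude < 10:
        label = "slight"
    elif magnitude <= 20:
        label = "moderate"
    else:
        label = "significant"
    return change, label
```

For example, a knee that improves from 50 to 75 points would be reported as a change of 25 points in the "significant" group.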

RELIABILITY, VALIDITY, AND RESPONSIVENESS TESTING: AUTHORS’ STUDY

Reliability testing of the scales and overall rating score of the CKRS was conducted on two groups of individuals.⁴ The first group comprised 50 patients who had a variety of chronic knee injuries and disorders, including meniscal tears, knee ligament tears, patellofemoral complaints, and degenerative joint disease. The second group consisted of 50 volunteers who had no current knee pathology and no history of prior knee operations. The questionnaires were given to these 100 individuals by an interviewer who was available to explain the instructions if required, but a formal interview was not conducted. The questionnaires were completed a second time an average of 7 days (range, 4–13 days) after the baseline evaluation.

Validity (construct, content, and item-discriminant) and responsiveness testing of the CKRS were conducted on a group of 250 patients who were prospectively followed after ACL bone–patellar tendon–bone autogenous reconstruction a mean of 27 months (range, 23–74 mo) postoperatively. The patients completed the questionnaires preoperatively and at the most recent follow-up evaluation.

All volunteers and patients completed 13 scales: (1) the four Symptom Rating Scales for pain, swelling, partial giving-way, and full giving-way, (2) the Patient Perception Scale, (3) the Activities of Daily Living Scales for walking, stair-climbing, and squatting, (4) the Sports Activities Function Scales for running, jumping, and hard twisting/cutting/pivoting, (5) the Sports Activity Scale, and (6) the Occupational Rating Scale. The overall rating score was calculated before surgery and at the most recent follow-up evaluation for the ACL-reconstructed patients.

Critical Points RELIABILITY, VALIDITY, AND RESPONSIVENESS TESTING OF THE CINCINNATI KNEE RATING SYSTEM: AUTHORS’ STUDY

• Reliability testing of the scales and overall rating score:

50 patients with chronic knee injuries and disorders, including meniscal tears, knee ligament tears, patellofemoral complaints, and degenerative joint disease.

50 volunteers with no current knee pathology or history of prior knee operations.

• Questionnaires completed an average of 7 days (range, 4–13 days) after baseline evaluation.

• Validity (construct, content, and item-discriminant) and responsiveness testing in 250 patients prospectively followed after anterior cruciate ligament bone–patellar tendon–bone autogenous reconstruction a mean of 27 mo (range, 23–74 mo) postoperatively.

• 13 scales demonstrated adequate test-retest reliability.

• Content validity: overall rating score had no floor or ceiling effects preoperatively. At follow-up, no floor scores were found, and limited ceiling effects were calculated in 9% of patients.

• Construct validity: 8/9 clinical hypotheses were confirmed.

• Item-discriminant validity: found for 94% of comparisons.

• Cincinnati Knee Rating System is highly responsive in detecting change between evaluations.

The 13 scales all demonstrated adequate test-retest reliability (Table 44-3; intraclass correlation coefficient [ICC] > 0.70). In the normal population, the ICCs ranged from 0.71 (jumping) to 1.0 (walking, full giving-way). In the patient population, the ICCs ranged from 0.75 (squatting/kneeling) to 0.98 (Sports Activity Scale).

TABLE 44-3 Reliability of Cincinnati Knee Rating System Variables Using Intraclass Correlation Coefficients

Variable	Uninjured Subjects (N = 50)	Patients (N = 50)
Pain	0.83	0.84
Swelling	0.83	0.83
Partial giving-way	0.88	0.87
Full giving-way	1.0	0.87
Walking	1.0	0.88
Stair-climbing	0.78	0.68
Squatting/kneeling	0.87	0.75
Running	0.88	0.86
Jumping	0.71	0.89
Twisting/turning	0.88	0.85
Patient perception of knee condition	0.91	0.88
Sports Activity Scale	0.98	0.98
Occupational Rating Scale	0.87	0.97

Intraclass correlation coefficient > 0.70 required for adequate reliability.

From Barber-Westin, S. D.; Noyes, F. R.; McCloskey, J. W.: Rigorous statistical reliability, validity, and responsiveness testing of the Cincinnati Knee Rating System in 350 subjects with uninjured, injured, or anterior cruciate ligament-reconstructed knees. Am J Sports Med 27:402–416, 1999.

In the analysis of content validity, the CKRS overall rating score had no floor or ceiling effects preoperatively in the 250 patients (Table 44-4). At follow-up, no floor scores were found, and limited ceiling effects were calculated (22 patients, 9%). Good construct validity was demonstrated, as eight of the nine clinical hypotheses were confirmed using the overall rating score (Table 44-5). Item-discriminant validity was found for 59 of the 63 (94%) comparisons between the CKRS categories and the other variables shown in Table 44-6.

TABLE 44-4 Content Validity of Cincinnati Knee Rating System Variables

TABLE 44-5 Construct Validity: Clinical Hypotheses and Cincinnati Knee Rating System Overall Scores

TABLE 44-6 Item-Discriminant Validity: Correlations between Cincinnati Knee Rating System Categories and Dissimilar Items*

The CKRS was highly responsive in detecting changes from the preoperative to the follow-up evaluations. Seven of the eight categories showed large effects ranging from 1.07 to 2.48, and the remaining category (Activities of Daily Living average score) showed a moderate effect of 0.72 (Table 44-7).

TABLE 44-7 Outcome Scores before and 2 Years after Anterior Cruciate Ligament Reconstruction and Responsiveness to Change
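Responsiveness statistics of this kind are commonly computed from paired preoperative and follow-up scores. The Python sketch below shows two standard formulations, the effect size (mean change divided by the SD of the baseline scores) and the standardized response mean (mean change divided by the SD of the change scores). The text does not state which formula or which cutoffs were used for Table 44-7, so the function names and the conventional cutoffs of 0.2, 0.5, and 0.8 are assumptions for illustration only.

```python
from statistics import mean, stdev


def _paired_changes(pre, post):
    """Follow-up minus baseline score for each patient."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain one score per patient")
    return [after - before for before, after in zip(pre, post)]


def effect_size(pre, post):
    """Mean change divided by the SD of the baseline scores."""
    return mean(_paired_changes(pre, post)) / stdev(pre)


def standardized_response_mean(pre, post):
    """Mean change divided by the SD of the change scores."""
    changes = _paired_changes(pre, post)
    return mean(changes) / stdev(changes)


def describe_effect(value):
    """Label a value using conventional cutoffs (an assumption, see above)."""
    magnitude = abs(value)
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "moderate"
    if magnitude >= 0.2:
        return "small"
    return "negligible"
```

Under these assumed cutoffs, `describe_effect(0.72)` returns "moderate" and `describe_effect(1.07)` returns "large," which matches the wording used in the text for those values.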

ASSESSMENT OF THE CKRS BY INDEPENDENT INSTITUTIONS

Marx and colleagues³⁵ conducted reliability, validity, and responsiveness testing of the CKRS Symptom Rating Scale and the Sports and Daily Activity functional scales in 42 patients with a wide variety of knee disorders. These investigators found that these components had adequate reliability (ICC, >0.80), face and content validity, and responsiveness (standardized response mean, 0.8). They also found that the Lysholm scale, the American Academy of Orthopaedic Surgeons sports knee-rating scale, and the Activities of Daily Living Scale of the Knee Outcome Survey had adequate reliability, validity, and responsiveness.

Risberg and coworkers⁵⁵ assessed the responsiveness of the Symptom Rating and the Sports and Daily Activity functional scales in 109 patients after ACL reconstruction. The patients completed the questionnaires 3, 6, 12, and 24 months postoperatively. The authors reported that the CKRS was highly sensitive in detecting significant changes over time, in contrast to the Lysholm, IKDC, and visual analog scales, which were found not to be effective in detecting such changes.

Demirdjian and associates¹¹ evaluated scores for the CKRS and the Lysholm questionnaire from 418 normal subjects aged 13 to 25 years. For this study, the CKRS components analyzed were the scores for pain, swelling, partial giving-way, full giving-way, walking, stair-climbing, running, jumping/twisting, and the overall activity score, which summed to a possible 100 points. The mean and SD for the overall score of the CKRS were 99.10 ± 3.77 for males and 97.82 ± 4.97 for females. The mean overall score for the Lysholm scale was 99.10 ± 2.73 for males and 97.16 ± 5.26 for females. The 95% confidence intervals for the CKRS were 98.63 to 99.57 for males and 97.05 to 98.58 for females.

Critical Points ASSESSMENT OF CINCINNATI KNEE RATING SYSTEM

Assessment of the Cincinnati Knee Rating System (CKRS) by Independent Institutions

• CKRS has adequate reliability, face and content validity, and responsiveness.

• Symptoms and sports rating scales are highly sensitive in detecting significant changes over time, in contrast to the Lysholm, International Knee Documentation Committee, and visual analog scales.

• CKRS score provides the most effective estimate of disability compared with Lysholm.

• CKRS defines most precisely the outcome of anterior cruciate ligament reconstruction in athletically active patients.

Modified CKRS Scales

• Modified CKRS scales should not be used because they do not represent the instruments developed that have undergone rigorous statistical testing and may not provide the data that were intended by the authors of the CKRS.

Borsa and colleagues⁷ evaluated 29 patients with an ACL-deficient knee to determine whether performance-based measures or patient-reported measures of function were more effective in estimating disability. Patients completed the CKRS Sports Activity Scale, Symptom Rating Scale, Patient Perception of the Knee Condition questions, and Sports Activity Function questions. Patients also completed the Lysholm questionnaire and underwent proprioception, balance, single-leg hop, and isokinetic strength testing. Stepwise regression analysis demonstrated that the CKRS score was the most effective estimate of disability (R² = 0.56). The addition of the Lysholm score to the model increased R² only to 0.58, and the addition of the hop test index score only slightly improved the adjusted R² to 0.60. The authors concluded that the inclusion of the Lysholm score and hop index did not significantly improve the ability to estimate disability induced by ACL deficiency.

Sgaglione and coworkers⁵⁷ used four separate knee rating outcome instruments to determine the outcome of 65 patients who underwent ACL reconstruction. The patients were evaluated a mean of 35 months (range, 24–58 mo) postoperatively. The CKRS in its entirety was used in this investigation. The results demonstrated that the CKRS individual scaled scores were lower than the Hospital for Special Surgery (HSS) and Lysholm questionnaire scores. The investigators concluded that the CKRS defined most precisely the outcome of ACL reconstruction in athletically active patients. The conclusion was related to the fact that the CKRS incorporated physical findings and avoided combining raw scores into categorical score data (e.g., excellent = 90–100 points). According to the investigators, the Lysholm and HSS scores tended to inflate the outcome of the operation.

Anderson and associates¹ evaluated 70 patients a minimum of 5 years after ACL reconstruction with the HSS, Feagin and Blake, CKRS, Lysholm, and Zarins rating scales. The authors reported that the CKRS had a large number of measurements that reflected high precision in the analysis of subjective symptoms, subjective function, and objective examination. Substantial differences were noted among the results of the five scales evaluated. The conclusion was reached that future rating scales should provide a balance between subjective symptoms, subjective function, and objective findings.

MODIFIED CKRS SCALES

The authors express concern and caution regarding the use of so-called modified CKRS scales. Whereas some authors who implement these scales in their investigations^8,⁵⁸ provide the actual portion of the CKRS used along with the modifications, the reasons for changing the original scales are usually not provided. The resulting data are difficult to interpret and cannot be compared with data from studies that use unedited CKRS scales. In the authors’ opinion, modified CKRS scales should not be used because they do not represent the instruments developed, which have undergone rigorous statistical testing, and may not provide the data that were intended by the authors of the CKRS.