Evidence-Based Physical Exam: Determining Diagnostic Accuracy

Diagnostic Accuracy of Physical Findings

Key concepts on diagnostic accuracy from Steven McGee’s book, “Evidence Based Physical Diagnosis”: 

  • “If a physical sign characteristic of a suspected diagnosis is present (i.e., positive finding), that diagnosis becomes more likely; if the characteristic finding is absent (i.e., negative finding), the suspected diagnosis becomes less likely.”
  • “Pretest probability is the probability of disease (i.e., prevalence) before application of the results of a physical finding. Pretest probability is the starting point for all clinical decisions. For example, the clinician may know that a certain physical finding shifts the probability of disease upward 40%, but this information alone is unhelpful unless the clinician also knows the starting point: if the pretest probability for the particular diagnosis was 50%, the finding is diagnostic (i.e., post-test probability 50% + 40% = 90%); if the pretest probability was only 10%, the finding is less helpful, because the probability of disease is still the flip of a coin (i.e., post-test probability 10% + 40% = 50%).”
  • “Sensitivity and specificity describe the discriminatory power of physical signs. Sensitivity is the proportion of patients with the diagnosis who have the physical sign (i.e., have the positive result). Specificity is the proportion of patients without the diagnosis who lack the physical sign (i.e., have the negative result).”
  • “Likelihood ratios, like sensitivity and specificity, describe the discriminatory power of physical signs. Although they have many advantages, the most important is how simply and quickly they can be used to estimate post-test probability… The likelihood ratio (LR) of a physical sign is the proportion of patients with disease who have a particular finding divided by the proportion of patients without disease who also have the same finding.”

 

    • “A positive LR, therefore, is the proportion of patients with disease who have a physical sign divided by the proportion of patients without disease who also have the same sign. The numerator of this equation—proportion of patients with disease who have the physical sign—is the sign’s sensitivity. The denominator—proportion of patients without disease who have the sign—is the complement of specificity, or (1 − specificity).”

    • “Similarly, the negative LR is the proportion of patients with disease lacking a physical sign divided by the proportion of patients without disease also lacking the sign. The numerator of this equation—proportion of patients with disease lacking the finding—is the complement of sensitivity, or (1 − sensitivity). The denominator of the equation—proportion of patients without disease lacking the finding—is the specificity.”

    • “Although these formulae are difficult to recall, the interpretation of LRs is straightforward. Findings with LRs greater than 1 increase the probability of disease; the greater the LR, the more compelling the argument for disease. Findings whose LRs lie between zero and 1 decrease the probability of disease; the closer the LR is to zero, the more convincing the finding argues against disease. Findings whose LRs equal 1 lack diagnostic value because they do not change probability at all. “Positive LR” describes how probability changes when the finding is present. “Negative LR” describes how probability changes when the finding is absent.”

  • “The clinician can use the LR of a physical finding to estimate probability of disease in three ways: (1) using graphs or other easy-to-use nomograms; (2) using bedside approximations, or (3) using formulas.” One such graphical tool is the Fagan nomogram, in which a straight line drawn from the pretest probability through the LR gives the post-test probability.
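
The third approach, using formulas, can be sketched in a few lines of Python. This is only an illustration: the function names and the sensitivity, specificity, and pretest values are made up for the example and do not come from McGee's book. The LRs follow the definitions quoted above, and the post-test probability uses the standard odds form of Bayes' theorem (convert probability to odds, multiply by the LR, convert back).

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) for a physical sign.

    Both inputs are proportions strictly between 0 and 1, which keeps
    the two divisions below well defined.
    """
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must be between 0 and 1")
    positive_lr = sensitivity / (1 - specificity)
    negative_lr = (1 - sensitivity) / specificity
    return positive_lr, negative_lr


def post_test_probability(pretest_probability, likelihood_ratio):
    """Apply an LR to a pretest probability via odds."""
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest probability must be between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("a likelihood ratio cannot be negative")
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical sign: sensitivity 90%, specificity 80% (illustrative only).
lr_pos, lr_neg = likelihood_ratios(0.90, 0.80)  # roughly 4.5 and 0.125

# Hypothetical patient with a 10% pretest probability.
print(f"sign present: {post_test_probability(0.10, lr_pos):.1%}")
print(f"sign absent:  {post_test_probability(0.10, lr_neg):.1%}")
```

With these made-up numbers, a positive finding raises a 10% pretest probability to about one in three, while a negative finding lowers it to between 1% and 2%. An LR of exactly 1 returns the pretest probability unchanged, which matches the interpretation quoted above.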


Reliability & Accuracy

Reliability refers to how often multiple clinicians, examining the same patients, agree that a particular physical sign is present or absent. Reliability is also known as inter-rater reliability, or sometimes inter-observer agreement. 

  • Most clinical studies express reliability using the kappa (κ) statistic, which usually has values between 0 and 1.
    • A κ-value of 0 indicates that observed agreement is the same as that expected by chance, and a κ-value of 1 indicates perfect agreement. According to convention, a κ-value of 0 to 0.2 indicates slight agreement; 0.2 to 0.4, fair agreement; 0.4 to 0.6, moderate agreement; 0.6 to 0.8, substantial agreement; and 0.8 to 1.0, almost perfect agreement.
    • Of note, even with laboratory tests, which present the clinician with a single, indisputable number, interobserver disagreement is still possible and even common, simply because the clinician has to interpret the laboratory test’s significance.
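
To make the κ statistic concrete, here is a minimal Python sketch. It assumes the simplest case, two clinicians examining the same patients (Cohen's kappa), and the ratings are invented for illustration. κ is the observed agreement minus the agreement expected by chance, divided by one minus the agreement expected by chance.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each rated the same patients."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of patients")
    n = len(rater_a)

    # Observed agreement: fraction of patients given the same rating.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, the product of how often each
    # rater uses it, summed over all categories.
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 100%")
    return (observed - expected) / (1 - expected)


# Invented example: two clinicians, ten patients, 1 = sign present, 0 = absent.
clinician_a = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
clinician_b = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(f"kappa = {cohen_kappa(clinician_a, clinician_b):.2f}")
```

In this invented example the clinicians agree on 8 of 10 patients (80%), but 52% agreement would be expected by chance alone, so κ comes out near 0.58, which is moderate agreement by the convention above.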

Blog post based on Med-Peds Forum talk by Julia Solomon, PGY3, and Cameron Ulmer, PGY2