DIAGNOSTIC PERFORMANCE ANALYSIS (PART 1)


The primary purpose of a screening test is to rule out a disease or a clinical condition of interest. It places less demand on the healthcare system and is less discomforting for patients, as it is usually less invasive, less dangerous, more user-friendly and less expensive. The question is, how do we know that a screening test is reliable and, therefore, safely applicable in clinical settings?

The short answer is that the predictive ability of a screening test must be inspected to determine to what degree it corresponds to the reference standard used to diagnose the condition. For example, if the hypothesis is testing the reliability of ECG to diagnose a STEMI, the ECG features must be analysed in relation to the angiogram outcome, which is considered the reference standard test to detect a complete obstruction of a coronary vessel. In other words, the ‘reference standard’ represents a test that provides authoritative and definitive proof that the condition of interest is present. The typical analysis used to determine the reliability of a screening test is characterised below. Based on the screening result, which is typically categorised as a dichotomous outcome (i.e. a positive or negative test), and the reference standard, which confirms the presence or absence of the condition, tested subjects are assigned to one of the four cells labelled A, B, C and D. Based on the count in each cell, the values for sensitivity (SN), specificity (SP), negative predictive value (NPV) and positive predictive value (PPV) can be calculated and are usually expressed as percentages.

 

                  Has the condition      Does not have the condition    Total
Positive Test     A (True Positive)      B (False Positive)             A + B
Negative Test     C (False Negative)     D (True Negative)              C + D
Total             A + C                  B + D                          A + B + C + D
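
To make the arithmetic concrete, here is a minimal sketch in Python using hypothetical cell counts; the variables a, b, c and d simply mirror the four cells of the table above.

    # Hypothetical cell counts from a 2x2 contingency table
    a = 90   # True Positives
    b = 30   # False Positives
    c = 10   # False Negatives
    d = 70   # True Negatives

    sensitivity = a / (a + c) * 100   # ability to detect true positives
    specificity = d / (b + d) * 100   # ability to detect true negatives
    ppv = a / (a + b) * 100           # positive predictive value
    npv = d / (c + d) * 100           # negative predictive value

    print(f"SN = {sensitivity:.1f}%, SP = {specificity:.1f}%")
    print(f"PPV = {ppv:.1f}%, NPV = {npv:.1f}%")

With these counts, the sketch gives SN = 90.0%, SP = 70.0%, PPV = 75.0% and NPV = 87.5%.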


Sensitivity

The ability of the test to detect a true positive reflects the test’s ability to correctly identify all people who have the condition: [A/(A + C)] x 100.

Sensitivity (SN) is defined as the probability that a screening test detects the disease of interest among the patients who truly have the disease. Based on the contingency table, the formula for SN is [A/(A + C)] x 100, where A refers to the number of true positive cases and C refers to the number of false negative cases. In other words, SN is the ability of the test to detect the true positive cases. The higher the proportion of true positive cases correctly identified by the test, the higher the sensitivity. In the clinical context, higher sensitivity reflects a superior ability to rule out a disease relative to tests with lower sensitivity. Put another way, the higher the SN value of a test, the lower its rate of false negatives. A false negative means the test is negative in a patient who actually has the disease, i.e. a Type II error. In statistical terms, a Type II error refers to the failure to reject the null hypothesis when it is false. Taken together, if a highly sensitive test comes back negative, one can be reasonably assured that the disease is ruled out.
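
To see why a highly sensitive test is good at ruling out disease, the short sketch below (all figures hypothetical) compares how many truly diseased patients would be missed at two different sensitivities.

    diseased = 1000  # hypothetical number of patients who truly have the disease

    for sn in (0.80, 0.99):
        missed = diseased * (1 - sn)  # expected false negatives (Type II errors)
        print(f"SN = {sn:.0%}: about {missed:.0f} of {diseased} diseased patients test negative")

At 80% sensitivity, 200 of the 1,000 diseased patients walk away with a falsely reassuring result; at 99% sensitivity, only 10 do.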

It is also important to note that sensitivity is not a stand-alone concept. One must take the specificity (SP) and predictive values (PV) of the test into consideration as well. It is the balance between SN and SP, alongside the negative and positive predictive values, that provides clinicians with the overall picture of the diagnostic ability of the test.

Specificity

The ability of the test to detect a true negative reflects the test’s ability to correctly identify all people who do not have the condition: [D/(B + D)] x 100.

Specificity (SP) is defined as the probability that the screening test correctly identifies patients who do not have the disease of interest among the patients who truly do not have the disease. It is the ability of the test to detect true negative cases. Based on the table above, it is the proportion of true negative cases (D) over the sum of true negative (D) and false positive (B) cases: [D/(B + D)] x 100. In other words, the higher the SP of a test, the lower its rate of false positives. A false positive means the test gives a positive result although, in reality, the patient does not have the disease. From a statistical perspective, a false positive is also known as a Type I error. A clinical test with a high rate of false positivity, or Type I error, is not ideal, as it can lead to unnecessary costly investigations and create false alarms for patients, which can be psychologically traumatizing.

For example, suppose a test with 65% SP is used to diagnose breast cancer, and the test comes back positive. Before concluding that the patient really has cancer, the treating physician must be aware that the false positive rate of this test is relatively high, i.e. 100% - 65% = 35%. A decision must be made whether to prematurely break the news to the patient or to subject the patient to another test with a higher SP. Hypothetically speaking, if a test were 100% specific (although in reality it is almost impossible to have a perfect test), it would detect all true negative cases and produce no false positives. From a clinical standpoint, this means that if the test comes back positive, we can be sure that the patient really has the disease. Therefore, a highly specific test is a great tool to rule in a disease.
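
Continuing this hypothetical example, the sketch below shows what 65% specificity means for a cohort of patients who are actually free of the disease.

    sp = 0.65
    healthy = 1000  # hypothetical number of patients without breast cancer

    false_positives = healthy * (1 - sp)  # expected Type I errors
    print(f"With SP = {sp:.0%}, about {false_positives:.0f} of {healthy} "
          "disease-free patients receive a false positive result")

That is roughly 350 false alarms per 1,000 disease-free patients, each potentially triggering further investigations.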


Again, it is important to reiterate that there is no perfect test. There is always an overlap between the true negative cases and the true positive cases, which gives rise to a grey area encompassing the false negatives and false positives, depending on the threshold level. In the figure, the blue curve reflects the true negative cases, and the red curve reflects the true positive cases. The black line denoted A is the threshold at which the test detects all the true positive cases, i.e. 100% sensitivity. However, one can appreciate the substantial area of overlap with the blue curve, which represents the proportion of false positive cases. Similarly, the line denoted B is the threshold that detects all the true negative cases, i.e. 100% specificity. Although all the true negative cases are detected at this threshold, a substantial area of the red curve is also included, which represents the proportion of false negative cases. Therefore, it is essential to select an appropriate threshold for each test to balance the risk of false negatives and false positives. False positives may lead to unnecessary and costly downstream investigations or risky medical management, whereas false negatives may give a false sense of security to patients, leading to undiagnosed medical conditions and, in some countries, risks of medical litigation. I hope the explanation helps.
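
This trade-off can be made concrete with a small simulation, a minimal sketch assuming two overlapping normal score distributions (all numbers hypothetical): as the decision threshold moves from line A towards line B, sensitivity falls while specificity rises.

    import random

    random.seed(0)
    # Hypothetical test scores: healthy patients (blue curve) centred lower
    # than diseased patients (red curve), with substantial overlap
    healthy = [random.gauss(50, 10) for _ in range(10_000)]
    diseased = [random.gauss(70, 10) for _ in range(10_000)]

    for threshold in (40, 50, 60, 70, 80):  # test is "positive" when score >= threshold
        sn = sum(s >= threshold for s in diseased) / len(diseased)
        sp = sum(s < threshold for s in healthy) / len(healthy)
        print(f"threshold {threshold}: SN = {sn:.0%}, SP = {sp:.0%}")

At the lowest threshold, virtually all diseased patients are caught (high SN) at the cost of many false positives; at the highest threshold, virtually all healthy patients are cleared (high SP) at the cost of many false negatives.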

