Algorithmic Interpretability
Learning Objectives
Define what it means for an algorithm to be interpretable and highlight key ways that the definition is subjective and user-dependent.
Describe the accuracy-interpretability tradeoff and why it is observed in many real-world algorithms.
Reflect on the role of interpretability in algorithms used in clinical practice.
What is Interpretability?
Unlike in our past few modules, where we defined topics like fairness and anonymity exactly, it is challenging to give a rigorous, objective definition of interpretability. One commonly cited definition is that interpretability is the degree to which a human can understand why an algorithm made its prediction.1 If an algorithm is interpretable, then it is easier for someone to understand why certain predictions were made. Note that this definition of interpretability is entirely independent of the accuracy of the algorithm - we only seek to explain why an algorithm made its prediction, which may or may not be correct.
1 Miller T. Explanation in artificial intelligence: Insights from the social sciences. Artif Intell 267: 1-38. (2019). doi: 10.1016/j.artint.2018.07.007
However, even this definition of interpretability remains underspecified.2 For example, here are some other aspects of algorithmic design that are closely related to - if not essential to - interpretability:
2 Lipton ZC. The mythos of model interpretability. Proc ICML Workshop on Human Interpretability in Machine Learning. (2016). doi: 10.48550/arXiv.1606.03490
- Trust: What is an objective notion of trust? Is it confidence that a model will perform well? Do you care about how often an algorithm is correct, or about which inputs it is correct for?
- Causality: Can we use algorithms to learn potential hypotheses about the world around us? Is this important in labelling a model as interpretable?
- Transferability: Do algorithms generalize to new patient populations? Can we predict when an algorithm will generalize and when it won't? How might deployment of a model alter the user's environment and, in doing so, invalidate the model itself?
- Informativeness: As previously explored in the Food for Thought discussion above, is it more important for you to have an explanation to approach future problems, or to get the correct answer? Does the model serve the role of an oracle, a colleague, or a mentor?
- Fair and Ethical Decision Making: How can we be sure that predictions do not discriminate on the basis of race, age, gender, and other patient attributes?
These judgments about whether an algorithm is interpretable often vary due to a number of factors:
- Difficulty of the Task: Algorithms trained to perform complex, domain-specific tasks are inherently less interpretable by the average user due to the nature of the task itself.
- Expertise of the User: A domain expert may require less explanation in order to call an algorithm interpretable compared to someone with less experience in the field.
- Expertise with the Algorithm: Just like with any other software, technicians with years of experience using an algorithm are better able to explain its predictions than those who are new to the technology.
What other factors might influence the subjectivity of the interpretability of an algorithm?
Discussion Questions
Which of the following algorithms are interpretable?
Mean Arterial Pressure (MAP)
On one end of the spectrum, clinical algorithms like computing the mean arterial pressure (MAP) are pretty clearly interpretable. We can write down the exact formula for this quantity as
\[\text{MAP}=\frac{1}{3}(\text{Systolic Blood Pressure (mmHg)}) + \frac{2}{3}(\text{Diastolic Blood Pressure (mmHg)})\]
We might even be able to reason about why this formula works - heuristically, the cardiovascular system spends about two-thirds of the cardiac cycle in diastole and one-third in systole, so the time-weighted average of the two pressures approximates the MAP.
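As a minimal illustration, the entire algorithm fits in a few lines of Python (the function name and example values below are our own):

```python
def mean_arterial_pressure(systolic_mmHg: float, diastolic_mmHg: float) -> float:
    """Time-weighted average of systolic and diastolic blood pressure."""
    return (1 / 3) * systolic_mmHg + (2 / 3) * diastolic_mmHg

# Example: a blood pressure of 120/80 mmHg gives a MAP of about 93 mmHg.
print(mean_arterial_pressure(120, 80))  # 93.33...
```

Both the formula and its rationale are visible at a glance, which is what makes this algorithm feel so clearly interpretable.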
MELD Score
The MELD Score is a clinical algorithm used to quantify the severity of end-stage liver disease in potential transplant candidates. Similar to MAP, the MELD score has an exact formula:
\[\text{MELD}=9.57\times\ln(\text{Cr}) + 3.78\times \ln(\text{Bilirubin})+11.20\times \ln(\text{INR})+6.43\]
Is this equation still interpretable? From the equation above, we still clearly have transparency into how a MELD score is calculated, but the equation itself is a little more complicated and may not be easily understood by everyone. After looking at the equation, we are still left with a number of questions: How were the decimal coefficients derived? Why is there a logarithmic relationship between the MELD score and patient lab values?
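Even so, the calculation itself remains fully transparent. Here is a minimal Python sketch of the formula above; note that clinical implementations of MELD add further rules, such as lower bounds on lab values and rounding of the final score, which we omit here:

```python
import math

def meld_score(creatinine: float, bilirubin: float, inr: float) -> float:
    """MELD score from creatinine (mg/dL), bilirubin (mg/dL), and INR.

    A direct transcription of the formula above (natural logarithms).
    Clinical implementations add rules omitted here, e.g. lower bounds
    on lab values and rounding of the final score.
    """
    return (9.57 * math.log(creatinine)
            + 3.78 * math.log(bilirubin)
            + 11.20 * math.log(inr)
            + 6.43)

# Hypothetical labs: Cr 1.2 mg/dL, bilirubin 2.5 mg/dL, INR 1.4
print(meld_score(1.2, 2.5, 1.4))  # ~15.4
```

The code is short and exact, yet it cannot answer either of the questions above - knowing *how* the score is computed is not the same as knowing *why* it is computed that way.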
A Machine Learning Algorithm
Suppose we now have a ML algorithm that predicts a patient’s risk of breast cancer given their genomic data. Such algorithms are often referred to as black-box algorithms because the algorithm’s user cannot see the inner workings of the algorithm. However, is such an algorithm truly “black-box”? Similar to the MELD Score, I can exactly write down the specific formula for the algorithm, with all of its inputs, internal functions, decimal coefficients, etc. It would be an incredibly long and complex equation, but any ML algorithm can be written down exactly just like the MELD score and MAP calculations above.
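To make this concrete, here is a hypothetical sketch of the smallest possible "risk model": a logistic regression with made-up coefficients for three made-up genomic features. A deep learning model is the same idea scaled up to millions of coefficients and nested functions - it can still be written down exactly, just not read easily.

```python
import math

# Hypothetical, made-up coefficients for three illustrative genomic
# features; a real model would have thousands to millions of them.
INTERCEPT = -2.13
WEIGHTS = {"variant_a": 0.87, "variant_b": -0.42, "variant_c": 1.56}

def predicted_risk(features: dict) -> float:
    """Logistic regression: risk = 1 / (1 + exp(-(b0 + sum(w_i * x_i))))."""
    linear = INTERCEPT + sum(w * features[name] for name, w in WEIGHTS.items())
    return 1 / (1 + math.exp(-linear))

# The "formula" is fully written down above; it just isn't very readable.
print(predicted_risk({"variant_a": 1.0, "variant_b": 0.0, "variant_c": 1.0}))  # ~0.57
```

If transparency alone made an algorithm interpretable, this model - and by extension any ML model - would qualify. Whether that matches your intuition is worth reflecting on.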
A Probabilistic Algorithm
Finally, consider the following algorithm that utilizes a fair two-sided coin: if I flip the coin and it lands heads, then I admit a patient from the ED. Otherwise, I discharge them and send the patient home. Is this algorithm (or any probability-based algorithm) “interpretable”?
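For what it's worth, the entire algorithm can be written down in a couple of lines (a sketch, with a hypothetical function name):

```python
import random

def coin_flip_disposition() -> str:
    """Admit or discharge a patient based on a fair coin flip."""
    return "admit" if random.random() < 0.5 else "discharge"

print(coin_flip_disposition())  # "admit" about half the time
```

Every step here is fully transparent, yet the only explanation available for any individual patient's disposition is that the coin happened to land heads (or tails).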
An Accurate Algorithm Trained on An Unknown Dataset
After reading a recently published paper on a new machine learning algorithm to diagnose a rare disease, you try testing the algorithm on your own patients’ data and find that it has almost perfect accuracy! However, the paper does not include any details about how the model was trained - including any information on the patient demographics in the published study.
Interpretability versus Accuracy
A key insight that we hope you take away from the discussion questions posed above is that there is often a trade-off between the complexity and the interpretability of an algorithm. If an algorithm is more complex, such as a machine learning model or the MELD score, then it may be less interpretable. At the same time, more complex algorithms can often represent more complex relationships between inputs and outputs, leading to better predictive accuracy. In other words, as the complexity of an algorithm grows, its predictive accuracy often increases while its interpretability decreases.3
3 There’s a great blog post discussing the accuracy-interpretability tradeoff in more detail here: Ndungula S. Model accuracy and interpretability. Medium. (2022). Link to article
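One way to see this tradeoff empirically is to fit a simple and a complex model to the same task. The sketch below uses scikit-learn and synthetic data (not from any study in this module); on many tasks the more complex model will score somewhat higher, while its predictions are much harder to trace back to individual inputs.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a clinical prediction task.
X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A simple model whose handful of coefficients can be read off directly...
simple = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# ...versus an ensemble of hundreds of decision trees.
complex_model = RandomForestClassifier(n_estimators=500,
                                       random_state=0).fit(X_train, y_train)

print("logistic regression accuracy:", simple.score(X_test, y_test))
print("random forest accuracy:      ", complex_model.score(X_test, y_test))
```

The tradeoff is an empirical tendency, not a law - for some tasks, a simple and fully interpretable model performs just as well as a complex one.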
Evidence-Based Medicine Discussion
Do algorithms need to be interpretable in order for clinicians to leverage them for patient care?
Summary
Interpretability is a subjective property of an algorithm that characterizes the degree to which a human can understand why and how an algorithm made its prediction. There are many reasons why interpretability can vary, including the difficulty of the clinical task, the complexity of the algorithm, and the expertise of the user, among many others. While a tradeoff between accuracy and interpretability is often observed in practice, many experts believe that interpretability is important for algorithms used in patient care.
Additional Readings
- Model interpretability. Amazon Web Services Whitepapers. Accessed 20 May 2024. Link to article
- Amann J, Blasimme A, Vayena E, Frey D, Madai VI. Explainability for artificial intelligence in healthcare: A multidisciplinary perspective. BMC Med Inform Decis Mak 20(1): 310. (2020). doi: 10.1186/s12911-020-01332-6. PMID: 33256715
- Teng Q, Liu Z, Song Y, et al. A survey on the interpretability of deep learning in medical diagnosis. Multimed Syst 28(6): 2335-55. (2022). doi: 10.1007/s00530-022-00960-4. PMID: 35789785