Introduction to Machine Learning
Learning Objectives
Describe what an algorithm is and how algorithms are used in both clinical medicine and everyday life.
Describe what it means to learn and how learning applies to machine learning.
Identify key applications of machine learning and when computational tools can be helpful (and potentially harmful) for patient care.
Food for Thought
Are there tasks in healthcare for which automated methods, such as computer algorithms and artificial intelligence (AI), should never be used? Why or why not?
Does understanding how automated methods arrive at their predictions change any of your answers to question 1?
What if automated methods perform the task on par with humans? What if they perform better than humans?
Introduction to Machine Learning
What is an algorithm?
An algorithm is any function that computes an output from an input. We already use algorithms in everyday life and in clinical medicine. For example, here is an algorithm that you might use to determine when to walk to JMEC based on 3 different variables:
y = some_algorithm_for_when_to_walk_to_jmec(
how_long_does_it_typically_take_to_walk_to_campus,
how_much_sleep_did_I_get_last_night,
is_class_mandatory
)
where y is when you decide to walk to campus.
Some algorithms can be written down exactly. Examples include computing the anion gap from a patient's lab values, or computing the mean arterial pressure (MAP) of a patient from their blood pressure.
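To make "written down exactly" concrete, here is a minimal Python sketch of both calculations. The function and parameter names are our own, the anion gap uses the common formula without potassium, and the MAP uses the standard bedside approximation (diastolic pressure plus one third of the pulse pressure).

```python
def anion_gap(sodium: float, chloride: float, bicarbonate: float) -> float:
    """Serum anion gap in mEq/L, computed as Na - (Cl + HCO3)."""
    return sodium - (chloride + bicarbonate)


def mean_arterial_pressure(systolic: float, diastolic: float) -> float:
    """Approximate MAP in mmHg: diastolic plus one third of the pulse pressure."""
    return diastolic + (systolic - diastolic) / 3


# Example inputs chosen only for illustration.
print(anion_gap(sodium=140, chloride=104, bicarbonate=24))
print(mean_arterial_pressure(systolic=90, diastolic=60))
```

Every step here is explicit, so a computer can run it without having to learn anything from data.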
Other algorithms are harder to express on paper. Examples include how to run a code or how to decide whether to admit a patient.
Computers can run algorithms that can be written down exactly. But how can we teach them how to run algorithms that are hard to express? To answer this question, let’s reflect on how we as students learn algorithms that might be hard to express.
Learning by Observing
Computers can learn by observation, much like how medical students learn! Consider some of the following scenarios:
A Database of Genomes
During your clinical research year, your advisor gives you a large dataset of many different patient genomes. By analyzing this dataset, we try to gain insights into which genes make individuals unique, and which ones all patients share in common.
A Randomized Control Trial
Your research mentor is impressed with your analysis and gives you a new project: investigating whether a new drug, abastatin, lowers patient cholesterol levels. He gives you a large dataset of anonymized patient data containing two variables: whether the patient was given abastatin or placebo (\(x\)), and whether the patient had a reduction in their cholesterol levels (\(y\)). By analyzing this dataset, we try to learn whether abastatin is an effective drug for hypercholesterolemia.
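As a rough sketch of what analyzing this dataset could look like, the Python below compares how often cholesterol fell in each arm. The encoding (\(x=1\) for abastatin, \(x=0\) for placebo; \(y=1\) for a reduction, \(y=0\) otherwise), the function name, and the eight records are all made up for illustration. They are not trial results.

```python
from typing import List, Tuple


def response_rate(records: List[Tuple[int, int]], arm: int) -> float:
    """Fraction of patients in the given arm whose cholesterol was reduced."""
    outcomes = [y for x, y in records if x == arm]
    if not outcomes:
        raise ValueError(f"no patients found in arm {arm}")
    return sum(outcomes) / len(outcomes)


# Toy (x, y) pairs, invented for this example only.
records = [(1, 1), (1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0), (0, 0)]

treated_rate = response_rate(records, arm=1)
placebo_rate = response_rate(records, arm=0)
risk_difference = treated_rate - placebo_rate

print(treated_rate, placebo_rate, risk_difference)
```

A real analysis would also need a much larger sample and a statistical test before concluding anything about the drug.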
A Patient with Sepsis
For those of you with a machine learning background, A Database of Genomes is an unsupervised learning problem, A Randomized Control Trial is a supervised learning problem, and A Patient with Sepsis is a reinforcement learning problem. You can learn more about each of these types of machine learning problems here!
A 52-year-old male presents with acute-onset altered mental status and fever. Vitals are notable for BP 90/60 mmHg and T 103.4 °F. We can denote the patient as a variable \(x\) consisting of all of the relevant attributes of the patient: their HPI, past medical history, current lab values and vitals, etc.
On our first day as a medical student, we might not know what to do with this patient. Do we admit them and start them on IV antibiotics? Do we call a neurology consult? Do we just send the patient home? Each of these clinical interventions can be thought of as an action \(a\) that we can take to try to help the patient get better.
After observing a patient \(x\) and performing an action \(a\), we monitor the patient to see if they improve. The patient’s outcome can be denoted as a variable \(y\) (for example, \(y=0\) if the patient deteriorates and \(y=1\) if the patient gets better). We observe the clinical outcome \(y\), and use it to learn a better algorithm so that next time we see a similar patient, we can take a better action that leads to a more favorable outcome.
Over the course of medical school, we see hundreds (if not thousands) of tuples \((x, a, y)\) through clerkships, sub-Is, exams, and UWorld, and use this dataset of patient-action-outcome observations to learn hard-to-write-down algorithms for choosing the best clinical intervention \(a\) given a patient \(x\) to maximize the outcome \(y\).
In other words, we learn by observation.
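Here is a small Python sketch of one very simple way to learn from \((x, a, y)\) tuples: for each kind of patient, remember which action had the best observed outcome rate. The presentation label, the action names, and the five observations are hypothetical placeholders, and a real patient \(x\) would be far richer than a single string.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Observation = Tuple[str, str, int]  # (patient x, action a, outcome y)


def learn_policy(observations: List[Observation]) -> Dict[str, str]:
    """For each presentation x, pick the action a with the highest observed rate of y = 1."""
    counts = defaultdict(lambda: [0, 0])  # (x, a) -> [improved, total]
    for x, a, y in observations:
        counts[(x, a)][0] += y
        counts[(x, a)][1] += 1

    best = {}  # x -> (best action so far, its success rate)
    for (x, a), (improved, total) in counts.items():
        rate = improved / total
        if x not in best or rate > best[x][1]:
            best[x] = (a, rate)
    return {x: a for x, (a, _) in best.items()}


# Made-up observations for illustration only.
observations = [
    ("fever_hypotension", "admit_iv_antibiotics", 1),
    ("fever_hypotension", "admit_iv_antibiotics", 1),
    ("fever_hypotension", "admit_iv_antibiotics", 0),
    ("fever_hypotension", "send_home", 0),
    ("fever_hypotension", "neurology_consult", 0),
]

policy = learn_policy(observations)
print(policy)
```

Note that this lookup table can only answer for presentations it has already seen, and it ignores why each action was chosen in the first place.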
What does it mean to “learn”?
No patient is exactly identical to any other patient, including the patients that you learn from. If all you can do is regurgitate the dataset you learned from, this is not learning! Put simply…
Machine Learning as a Framework
Machine learning (ML) uses this same framework of learning through observation: it learns hard-to-write-down algorithms from data and expresses them as exact steps that a computer can execute.
Just as we all have different mnemonics and mental maps for approaching clinical reasoning, the exact steps in the algorithm that ML learns may not be the same as the steps that clinicians learn! This mismatch is an important problem that researchers are still trying to solve.
The fundamental goal of machine learning is to learn hard-to-write-down algorithms from past observations to hopefully make accurate predictions for future observations.
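The toy Python sketch below illustrates this goal under some strong assumptions: each observation is a single number with a 0/1 label, and the "algorithm" we learn is just a cutoff value. All names and numbers are invented for illustration. It contrasts memorizing past observations with learning a rule that still gives an answer for inputs it has never seen.

```python
from typing import List


def fit_threshold(xs: List[float], ys: List[int]) -> float:
    """Choose the cutoff t that classifies the most past observations correctly (predict 1 if x >= t)."""
    best_t, best_correct = xs[0], -1
    for t in sorted(set(xs)):
        correct = sum((x >= t) == bool(y) for x, y in zip(xs, ys))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def predict(t: float, x: float) -> int:
    """Apply the learned rule to a new input."""
    return int(x >= t)


# Past observations (made-up numbers).
past_x = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
past_y = [0, 0, 0, 1, 1, 1]

# Memorization: a lookup table has no answer for an unseen input.
memorized = dict(zip(past_x, past_y))
print(memorized.get(2.5))

# Learning: the fitted rule can be applied to future observations.
t = fit_threshold(past_x, past_y)
future_x = [2.5, 6.5]
future_y = [0, 1]
accuracy = sum(predict(t, x) == y for x, y in zip(future_x, future_y)) / len(future_x)
print(t, accuracy)
```

Checking accuracy on observations that were held out from fitting is the usual way to estimate how well a learned algorithm will do on future cases.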
When can machine learning be a helpful tool?
Consider the following example cases. Would you want to use machine learning in each of these cases?
In summary, ML is useful for tasks that
- are hard to write down;
- are associated with a lot of prior observations; and
- can lead to actionable utility for patients and/or clinicians by automating hard, repetitive, and/or common tasks.
There are a lot of tasks that fall into these categories! In practice, some of the most widely studied use cases include…
- Reading radiology scans to predict patient risk of disease
- Helping clinicians figure out how to best treat critically ill sepsis patients
- Discovering new drugs to help better treat patients
- De-identifying health records to protect patient privacy
- Enabling more accurate cancer subtyping from pathology slides
Can you think of any other potential use cases?
Evidence-Based Medicine Discussion
Should AI be used to improve access to mental health resources?
Hands-On Tutorial
Let’s explore how state-of-the-art AI models currently perform as mental health resources for real-world patients. Here is an example of a chatbot that’s currently available for anyone to use on the Internet (including your patients). Click on the link to open it in a new window.
This particular model is hosted on Hugging Face, which has become the de facto website for publishing publicly available machine learning models like the one we’re exploring here. Anyone can download a model and use it for applications such as mental health support, among others.
Assume the role of a patient seeking mental health support and resources. How accurate is the model as a therapist? How empathetic is the model? Would you use this particular model for mental health support? Why or why not?
Summary
Algorithms are functions that map inputs to outputs. Some algorithms are easy to describe while others are harder to write down. Machine learning is the process of computers learning hard-to-write-down algorithms from past observations, with the goal of learning algorithms that are generalizable to new sets of inputs.
Additional Readings
- Topol EJ. High-performance medicine: The convergence of human and artificial intelligence. Nat Med 25: 44-56. (2019). doi: 10.1038/s41591-018-0300-7. PMID: 30617339
- Sidey-Gibbons JAM, Sidey-Gibbons CJ. Machine learning in medicine: A practical introduction. BMC Medical Research Methodology 19(64). (2019). doi: 10.1186/s12874-019-0681-4. PMID: 30890124
- JAMA Podcast with Dr. Kevin Johnson from Penn Medicine. October 4, 2023. Link.
Made with ❤ by the EAMC Team ©2024.