Google Introduces Med-PaLM 2

Google's AI now performs on par with "expert" clinicians on medical exams
Created on March 15|Last edited on March 15

The Successor

Last year, Google introduced Med-PaLM, a large language model built on the 540-billion-parameter PaLM architecture and designed to answer medical questions.
That model achieved a passing score of more than 60% on typical US medical licensing questions. Med-PaLM 2 outperforms the original Med-PaLM by a significant margin, scoring 85% on the medical exam, which is considered “expert” level.

Evaluation Results

Although the model passes medical exams with ease, many obstacles remain before it is usable in the real world. As with many previous LLMs, evaluating the model is not simple and can be very subjective.
Google evaluates Med-PaLM on several criteria, including scientific factuality, precision, medical consensus, reasoning, bias, and harm, using both clinicians and non-clinicians as evaluators. The results showed that although the model performs well on the exam, there is still much progress to be made to ensure quality in real-world settings.
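To make the idea of multi-axis human evaluation concrete, here is a minimal sketch of how ratings from two evaluator groups might be averaged per criterion. This is not Google's evaluation pipeline: the rating scale, the scores, and the function name mean_score_by_group_and_axis are all assumptions made up for illustration, and only the axis and group names come from the article.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings: (evaluator group, axis, score in [0, 1]).
# Assumption: higher is better on every axis. The scores are invented
# for illustration and are not results reported by Google.
ratings = [
    ("clinician", "scientific factuality", 1.0),
    ("clinician", "scientific factuality", 0.5),
    ("clinician", "harm", 1.0),
    ("non-clinician", "scientific factuality", 1.0),
    ("non-clinician", "harm", 0.0),
]


def mean_score_by_group_and_axis(rows):
    """Average the scores for each (evaluator group, axis) pair."""
    buckets = defaultdict(list)
    for group, axis, score in rows:
        buckets[(group, axis)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


summary = mean_score_by_group_and_axis(ratings)
for (group, axis), score in sorted(summary.items()):
    print(f"{group:14s} {axis:22s} {score:.2f}")
```

Keeping the two evaluator groups separate, rather than pooling them, makes disagreements visible, such as the "harm" axis in this toy data, which is one reason this kind of evaluation is hard to reduce to a single number.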

New Partnerships

In addition to the new model, Google has announced several partnerships to help develop and deploy its AI.
The first is with Jacaranda Health, a Kenya-based nonprofit focused on improving health outcomes for mothers and babies in government hospitals. Google is also partnering with Chang Gung Memorial Hospital in Taiwan to explore using ultrasound to detect breast cancer, which could be much more cost-effective than a mammogram. With Mayo Clinic, Google is working to improve the planning process for radiotherapy, in which doctors segment CT scans to separate cancerous from non-cancerous areas. This process is very time-consuming and can take up to 7 hours for a single patient. Finally, Google is working on TB screening using AI-powered chest X-rays, which could be a game changer for catching the disease earlier, something that is currently quite difficult.
Overall, it's great to see big tech companies like Google taking immediate action to deploy their models for real-world use, and hopefully other companies will do the same.
Tags: ML News