MedPaLM, Medical AI: The New WebMD?


Of all the technological advancements I’ve researched, artificial intelligence (AI) remains my favorite, seconded by quantum computing, the supercomputer’s jacked cousin. You see, AI and all it entails form a network so versatile that it fits into virtually every industry, including telecommunications and connectivity, law and politics, and even the coffee industry. Meet Google and DeepMind’s medical AI, MedPaLM.

MedPaLM: Medical Pathways Language Model

Google and DeepMind researchers revealed their work on MedPaLM, a new large language model for medical and clinical settings.

The Three Parts

Medical Pathways Language Model (MedPaLM) is an iteration of Google’s PaLM that has been carefully and deliberately calibrated for maximum accuracy in a medical environment. Let me explain.

Med for Medical

The researchers adapted and optimized this iteration of the Pathways Language Model to perform well on two medicine-oriented benchmarks:

  • MultiMedQA: A newly formulated benchmark for testing large language models (LLMs) for medical and clinical applications. It combines six existing medical question-answering datasets with HealthSearchQA.
  • HealthSearchQA: A new dataset of medical questions that consumers commonly search for online.

Think of them as the United States Medical Licensing Examination (USMLE), but for really advanced intelligent technology.

Pa for Pathways

Moving away from traditional machine learning (ML), Pathways is the AI architecture Google introduced in 2021, built so that a single model can handle many tasks at once and learn new ones quickly.

LM for Language Model

This part of the name refers to the tool’s ability to understand and generate human language, much like the predictive text on your phone, only at a far larger scale. It’s a large language model. Large language models are trained on enormous text corpora using machine learning algorithms to learn the patterns and structure of language. As a result of this extensive training, the AI tool can generate text, translate between languages, and analyze sentiment, among other tasks.
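To make the predictive-text comparison concrete, here is a minimal sketch of a toy next-word predictor that learns patterns by counting which word follows which. It is only an illustration of the idea: the corpus and the function names (`train_bigram_model`, `predict_next`) are made up for this example, and a real large language model like PaLM uses a neural network trained on vastly more text, not simple counts.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the corpus."""
    followers = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            followers[current_word][next_word] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical toy corpus, invented for this illustration.
toy_corpus = [
    "the patient has a fever",
    "the patient has a cough",
    "the patient needs rest",
]

model = train_bigram_model(toy_corpus)
print(predict_next(model, "patient"))
```

In this toy corpus, "has" follows "patient" more often than "needs" does, so that is the suggestion the model makes. The principle of learning from patterns in text is the same, even though the machinery in a large language model is far more sophisticated.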

The Topics

This medical AI tool is not for mundane, run-of-the-mill topics; it caters solely to medical questions. The creators of MedPaLM used instruction prompt tuning to achieve this. This approach uses examples of desired input-output pairs as prompts to adapt a large language model to a particular task.

In short, instead of creating another ChatGPT, they prompt-tuned the previous iteration (Flan-PaLM) using guidelines and exemplars from a panel of qualified clinicians for each consumer medical question-answering dataset. The result? A super-healthcare-focused language model.
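For a rough feel of what "guidelines and exemplars as prompts" can look like, here is a minimal sketch that lays out an instruction, a few example question-answer pairs, and a new question as one piece of text. This is an assumption-laden illustration, not the researchers’ actual code: the function name `build_prompt`, the "Question:"/"Answer:" layout, and the placeholder strings are all invented here, and the sketch only assembles text. It does not train or tune any model.

```python
def build_prompt(instruction, exemplars, question):
    """Lay out an instruction, example Q&A pairs, and a new question as text."""
    parts = [instruction.strip(), ""]
    for example_question, example_answer in exemplars:
        parts.append(f"Question: {example_question}")
        parts.append(f"Answer: {example_answer}")
        parts.append("")
    # The new question comes last, with the answer left open for the model.
    parts.append(f"Question: {question}")
    parts.append("Answer:")
    return "\n".join(parts)


# Hypothetical placeholder content; real exemplars would come from clinicians.
instruction = "Answer the consumer health question clearly and safely."
exemplars = [
    ("<example consumer question>", "<clinician-written example answer>"),
]

prompt = build_prompt(instruction, exemplars, "<new consumer question>")
print(prompt)
```

The model sees the worked example before the new question, which nudges it toward the style and level of care the clinicians demonstrated.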

The Results

In terms of providing answers to consumer medical questions, researchers found that 94.4% of MedPaLM’s answers directly addressed the user’s question intent. Its answers were rated helpful 80.3% of the time, about 11 percentage points behind human clinicians’ answers, but still roughly 30% more often than Flan-PaLM’s.

On the other hand, 16.9% of its answers presented incorrect information, as opposed to 3.6% for human clinicians. In addition, 10.1% showed incorrect reasoning, compared with the clinicians’ 2.1%. Incorrect comprehension occurred in 18.7% of cases for MedPaLM and in 2.2% for clinicians.
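To put those error rates side by side, here is a small sketch that computes the gap in percentage points and the ratio between MedPaLM and clinicians. The numbers are simply the ones quoted in this article; the dictionary layout and the `compare` helper are made up for the illustration.

```python
# Error rates (percent of answers) as quoted in this article.
error_rates = {
    "incorrect information": {"medpalm": 16.9, "clinicians": 3.6},
    "incorrect reasoning": {"medpalm": 10.1, "clinicians": 2.1},
    "incorrect comprehension": {"medpalm": 18.7, "clinicians": 2.2},
}


def compare(rates):
    """Return (percentage-point gap, ratio) of MedPaLM vs clinicians."""
    gap = round(rates["medpalm"] - rates["clinicians"], 1)
    ratio = round(rates["medpalm"] / rates["clinicians"], 1)
    return gap, ratio


comparison = {name: compare(rates) for name, rates in error_rates.items()}
for name, (gap, ratio) in comparison.items():
    print(f"{name}: +{gap} points, about {ratio}x the clinician rate")
```

By these figures, MedPaLM presented incorrect information about 4.7 times as often as the clinicians did, which is worth keeping in mind alongside the more flattering helpfulness numbers.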

Final Thoughts

MedPaLM comes in very handy in a world where healthcare is not readily available to all. Still, it needs to be used with caution and self-awareness. That is to say, its answers should not be taken as an official medical diagnosis or recommendation. The user needs to be aware that the application offers information that is often, but not always, accurate and is not a replacement for a physician. Let’s not turn this useful piece of innovation into something of a mockery, as we did with WebMD.


Inside Telecom provides you with an extensive list of content covering all aspects of the Tech industry. Keep an eye on our Medtech section to stay informed and updated with our daily articles.