GPT-4 Has Better Clinical Judgment Than Many Doctors, Passes US Medical Licensing Exam

A team of three scientists tested GPT-4 to see how the new AI model can perform as a medical assistant.

A team of three scientists led by Dr. Isaac Kohane, a physician and computer scientist at Harvard, tested OpenAI’s GPT-4 to see how the new AI model can perform in a medical setting.

He has reported the results in the book The AI Revolution in Medicine, published by Pearson. Commenting on GPT-4’s performance, he says, “I’m stunned to say: better than many doctors I’ve observed.” The book is a handbook, previewing “a day in the life of a doctor with a true AI assistant.”

Notably, the book is co-authored by Microsoft’s VP of Research Peter Lee and journalist Carey Goldberg.

GPT-4 answered US medical licensing exam questions correctly over 90% of the time. The test was essentially a back-and-forth of medical questions. Additionally, the book comments on how GPT-4 can accurately translate into other languages and distill medical jargon into common terms.

The book also notes other capabilities that can help medical practitioners, including giving suggestions, offering tips, improving communication, and summarizing reports.

More surprisingly to Kohane, GPT-4 was able to pin down a 1-in-100,000 condition, congenital adrenal hyperplasia, based on key details, ultrasound data, and hormone levels. It was a condition he himself had successfully diagnosed in the past.

He remarks that GPT-4 reached the diagnosis “just as I would, with all my years of study and experience.”

Though it’s truly remarkable how GPT-4 can help ordinary people with medical questions, it remains unclear how its responses could be guaranteed or certified as safe or effective. Note that OpenAI does not claim that its LLMs can be used for medical purposes in the first place.

The book also documents the blunders of GPT-4.

Kohane concludes that GPT-4 is certainly a resource that can free up precious time in the clinic. Medical practitioners could, for instance, look up information more quickly with the help of ChatGPT.

The book recommends that physicians, patients, healthcare leaders, payers, policymakers, and investors take charge by using AI in their daily workflows.

The authors write, “We have to force ourselves to imagine a world with smarter and smarter machines, eventually perhaps surpassing human intelligence in almost every dimension. And then think very hard about how we want that world to work.”

By Abhimanyu