OpenAI's GPT-4 model has demonstrated remarkable capabilities in diagnosing eye problems, outperforming junior doctors and closely matching the diagnostic ability of seasoned eye specialists in a recent study by researchers at the University of Cambridge. The evaluation involved 87 clinical scenarios covering a range of eye conditions, enabling a head-to-head comparison between the model and doctors at different levels of experience.
The results were striking: GPT-4 achieved “significantly better” scores than unspecialized junior practitioners, while its performance rivaled that of both trainee and expert ophthalmologists. Only the top-performing human doctors surpassed GPT-4's accuracy, highlighting the model's potential in the field of eye care.
Dr. Arun Thirunavukarasu, the lead author of the study, emphasized that while AI models like GPT-4 will not replace human clinicians, they can significantly enhance healthcare workflows. They could assist in triaging patients, determining which cases require immediate attention from specialists, which can be managed by general practitioners, and which may not need treatment at all. Dr. Thirunavukarasu stated, “We could realistically deploy AI in triaging patients with eye issues,” underscoring the potential for AI to streamline patient management.
During the study, GPT-4 was tested on questions sourced from a textbook traditionally used to train eye doctors, covering symptoms such as diminished vision, itching, and extreme light sensitivity. Other large language models, including OpenAI's GPT-3.5, Google's PaLM 2, and Meta's LLaMA, were also included in the analysis, but GPT-4 consistently produced the most accurate responses.
“The models could follow clear algorithms already in use, and we’ve found that GPT-4 is as proficient as expert clinicians at interpreting eye symptoms to address more complex inquiries,” Dr. Thirunavukarasu explained. Furthermore, he noted that with additional advancements, these models could provide valuable support to general practitioners, especially those facing challenges in accessing timely advice from eye specialists.
Since the study was conducted, even more sophisticated models have emerged that might rival expert ophthalmologists. GPT-4 has since been succeeded by GPT-4 Turbo, OpenAI's enhanced large language model, though both remain exclusive to paying ChatGPT subscribers and enterprise clients.
GPT-3.5, which continues to power the free version of ChatGPT, has also showcased impressive medical knowledge, successfully passing various medical exams. Notably, results published last May indicated that the standard version of ChatGPT achieved a passing score on all three standardized tests that constitute the U.S. Medical Licensing Exam.
The researchers’ vision of employing large language models to provide patient advice has already begun to materialize: the World Health Organization recently introduced SARAH (Smart AI Resource Assistant for Health), an AI-driven avatar designed to dispense guidance on issues such as smoking cessation, physical fitness, and mental health.