The discussions surrounding ChatGPT, a state-of-the-art natural language processing AI, are hard to miss. With its capability to draft articles, engage in written conversations, and provide complex coding solutions, ChatGPT holds great potential to revolutionize how people seek health information.
In November 2022, OpenAI introduced ChatGPT, initially powered by the GPT-3.5 architecture and since enhanced by the upgrade to GPT-4. Trained on vast amounts of text data, ChatGPT excels at translation, summarization, and question-answering tasks. The disruptive nature of GPT-4 is evident in Microsoft's integration of it into the “new Bing” to compete with Google. Further, this technology is hypothesized to power future AI tools that will serve as virtual assistants for telemedicine, clinical decision support, real-time translation, remote patient monitoring, patient triage, drug information dissemination, and medical education.
While ChatGPT is far from replacing a doctor’s consultation, our team investigated its potential to revolutionize healthcare information-seeking. In our research, we compared ChatGPT’s responses with Google’s on questions frequently asked by cancer patients, ranging from simple queries to detailed questions about symptoms, prognosis, and treatment side effects.
Round one: reducing alarm
Interestingly, ChatGPT generates different answers when asked the same question multiple times. However, for straightforward questions, ChatGPT’s responses appear similar in quality to Google’s featured snippets. For example, when asked if coughing is a sign of lung cancer, both ChatGPT and Google’s snippet identified a persistent cough as a symptom of lung cancer. However, ChatGPT went further, explaining that various conditions can cause coughing and that a doctor’s consultation is necessary for an accurate diagnosis. Our clinical team valued these nuances in the information provided by ChatGPT for reducing unnecessary alarm.
Round two: directing you to a healthcare professional
In our study, we also assessed ChatGPT’s ability to answer a complex question regarding pembrolizumab’s side effects. With the GPT-3.5 architecture, ChatGPT’s responses varied considerably in quality and content. While all answers recommended consulting a healthcare professional, they lacked clarity on the urgency, potential severity, and frequency of fever as a side effect of pembrolizumab. However, demonstrating the rapid advancement of this technology, the GPT-4 update now better emphasizes that ChatGPT is not a doctor and provides general information only, and it urges users to seek personalized advice from healthcare professionals about pembrolizumab’s adverse events.
Our team has also observed improved consistency in ChatGPT’s responses to other health-related questions, as it now consistently reminds users that it is not a doctor and that personalized advice should come from healthcare professionals. These are important updates that enhance the tool’s safety, potentially making it more reliable than having patients navigate the vast number of links provided by Google, which can be tricky and potentially misleading.
Round three: consistency
Despite these updates, significant limitations remain, indicating that ChatGPT is not yet a reliable tool for health information. For example, the architecture is still prone to AI hallucinations (the phenomenon of AI producing responses that appear coherent but are, in fact, wrong or made up), providing questionable references, giving varying answers to a single query, and lacking real-time updates. At this stage, it is essential that healthcare professionals are aware of these limitations and ensure patients understand that AI tools such as ChatGPT, like Google, need to be treated with considerable caution.
“All AI innovations impacting the patient-care interface must undergo rigorous implementation studies to ensure they deliver improved health outcomes without causing unwanted side effects for patients.”
Clinicians remain the best source for advice and should be consulted promptly for health queries, particularly about cancer or cancer treatment, where the consequences of delayed intervention can be serious. Furthermore, these limitations highlight the need for regulators to be actively involved in establishing minimum quality standards for AI interventions targeting healthcare.
The use and availability of AI chatbots and assistants will continue to grow, and healthcare-targeted interventions will likely emerge. For these tools, it’s crucial to uphold the principles of evidence-based medicine. While AI advancements are exciting, even the best public health interventions can have unforeseen effects. All AI innovations impacting the patient-care interface must undergo rigorous implementation studies to ensure they deliver improved health outcomes without causing unwanted side effects for patients. Moreover, as evidence-based tools for healthcare emerge, it’s essential to address equity issues. Without careful planning, there is significant potential for medically focused AI tools to come with costly price tags, potentially exacerbating existing disparities in access to healthcare.
Acknowledgments: Ashley M. Hopkins would like to acknowledge Dr Ganessan Kichenadasse, Dr Jessica M. Logan, and Professor Michael J. Sorich for their contributions as co-authors of the original research paper at https://doi.org/10.1093/jncics/pkad010.
Interested in learning more about applications of AI to cancer information? Try this article by Skyler B. Johnson, MD, et al. in JNCI Cancer Spectrum: “Using ChatGPT to evaluate cancer myths and misconceptions: artificial intelligence and cancer information.”