As artificial intelligence (A.I.) seeps into every facet of daily life, researchers decided to test how ChatGPT fared in giving medical advice.
Researchers at the University of Maryland School of Medicine tested how accurately ChatGPT, released late last year and already boasting 100 million users, advised people on breast cancer screenings. The results, published Tuesday in the journal Radiology, found the bot gave appropriate advice nearly 90% of the time.
“We found ChatGPT answered questions correctly about 88 percent of the time, which is pretty amazing,” Dr. Paul Yi, author on the study and assistant professor of diagnostic radiology and nuclear medicine at the University of Maryland School of Medicine, says in a press release. “It also has the added benefit of summarizing information into an easily digestible form for consumers to easily understand.”
Nearly 300,000 women will face invasive breast cancer diagnoses this year, according to estimates from the American Cancer Society. As mammograms have reduced breast cancer mortality by roughly 40%, accurate information on screening timelines and breast cancer risk is paramount and can be life-saving.
The researchers compiled a list of 25 questions, including the recommended age to begin breast cancer screening, certain risk factors and symptoms, and how frequently people should undergo mammograms. They posed each question to the bot three times; three radiologists who analyzed the responses judged 22 of the 25 answers appropriate.
When asked “what is my risk for breast cancer,” ChatGPT first gave a disclaimer to discuss personal risk with a health care provider, Yi demonstrates in a tutorial released alongside the press release. The bot then spoke more generally about risk factors like lifestyle habits, age, and family history, which “checks out from a medical standpoint,” Yi says.
ChatGPT has potential but does not replace the doctor
However, the technology isn't infallible, according to the authors.
“That 10% that were not appropriate either ChatGPT gave inconsistent responses, meaning if you asked it on any given day you’ll hear different answers that often contradicted each other, or the responses were just flat-out wrong,” Yi says.
One of the responses studied used outdated information on whether someone should postpone a mammogram due to COVID-19 vaccination (updated guidelines recommend not waiting). Another question garnered different responses when posed more than once.
Yi further points to the technology’s biases.
“Our language unfortunately in society and the internet often has these racial biases with even how we describe patients, so you can imagine you put in the same patient scenario but you take out a different race, or put in a different race, the recommendations may be totally different,” he says.
And even among the responses deemed appropriate, ChatGPT did not draw on the full range of credible sources.
“ChatGPT provided only one set of recommendations on breast cancer screening, issued from the American Cancer Society, but did not mention differing recommendations put out by the Centers for Disease Control and Prevention (CDC) or the US Preventive Services Task Force (USPSTF),” Dr. Hana Haver, a radiology resident at University of Maryland Medical Center, who was also an author on the study, says in a statement.
Yi says that while the technology has potential, patients should not rely solely on it for health advice. He says further partnerships between computer scientists and doctors can improve this type of health intervention and create “guardrails” for patients who need accurate advice.
“We’ve seen in our experience that ChatGPT sometimes makes up fake journal articles or health consortiums to support its claims,” says Yi in the statement. “Consumers should be aware that these are new, unproven technologies, and should still rely on their doctor, rather than ChatGPT, for advice.”
A.I. and the future of health
Interest in using A.I. in health care continues to grow. A 2019 study examined ways the technology can work alongside health care workers.
“It also seems increasingly clear that A.I. systems will not replace human clinicians on a large scale, but rather will augment their efforts to care for patients. Over time, human clinicians may move toward tasks and job designs that draw on uniquely human skills like empathy, persuasion and big-picture integration,” the study reads. “Perhaps the only health care providers who will lose their jobs over time may be those who refuse to work alongside artificial intelligence.”