Evaluating artificial intelligence chatbots for patient education in oral and maxillofacial radiology

Date
2025-01-11
Author
Helvacioglu-Yigit, Dilek
Demirturk, Husniye
Ali, Kamran
Tamimi, Dania
Koenig, Lisa
Almashraqi, Abeer
Abstract
Objective: This study aimed to compare the quality and readability of responses generated by 3 publicly available artificial intelligence (AI) chatbots to frequently asked questions (FAQs) related to Oral and Maxillofacial Radiology (OMR), in order to assess their suitability for patient education.

Study Design: Fifteen OMR-related questions were selected from professional patient information websites and posed to ChatGPT-3.5 (OpenAI), Gemini 1.5 Pro (Google), and Copilot (Microsoft) to generate responses. Three board-certified OMR specialists evaluated the responses for scientific adequacy, ease of understanding, and overall reader satisfaction. Readability was assessed using the Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease (FRE) scores. The Wilcoxon signed-rank test was used to compare the scores assigned by the evaluators to the responses from the chatbots and the professional websites. Interevaluator agreement was examined by calculating the Fleiss kappa coefficient.

Results: There were no significant differences between groups in scientific adequacy. In terms of readability, the chatbots had overall mean FKGL and FRE scores of 12.97 and 34.11, respectively. Interevaluator agreement was generally high.

Conclusions: Although chatbots are relatively good at responding to FAQs, validating AI-generated information with input from healthcare professionals can enhance patient care and safety. The text content of both the chatbots and the websites requires a high reading level.
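For context, the standard Flesch readability formulas are sketched below; the abstract does not state the exact variants applied, so these are assumed to be the conventional definitions of FRE and FKGL.

\[
\mathrm{FRE} = 206.835 - 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) - 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right)
\]

\[
\mathrm{FKGL} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59
\]

On these scales, a mean FRE of 34.11 falls in the "difficult" (roughly college-level) band and a mean FKGL of 12.97 corresponds to approximately a first-year college reading level, consistent with the conclusion that the content demands a high reading level.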
Collections
- Dental Medicine Research [392 items]