HYBRID EVENT: Join us in person in Singapore or attend virtually from anywhere.

5th Edition of

International Ophthalmology Conference

Evaluating the quality and readability of AI generated ophthalmic surgery education: A four model comparison

Arrane Selvamogan
Leicestershire Partnership NHS Trust, United Kingdom

Abstract:

Background: Artificial Intelligence (AI) tools, particularly Large Language Models (LLMs), are increasingly used to provide health information, and patients turn to them for simplified explanations of surgical procedures. In ophthalmology, the readability and reliability of AI-generated content remain under-explored. This study evaluates the quality and readability of educational materials produced by four LLMs (ChatGPT-4 by OpenAI, Grok 3 by xAI, DeepSeek R1 by DeepSeek Inc., and Gemini 2.5 Flash by Google) for three common eye operations: cataract surgery, LASIK, and vitrectomy.

Methods: The four LLMs were queried with three patient-oriented prompts requesting simplified explanations of each procedure. Responses were assessed using the DISCERN instrument for quality and Flesch-Kincaid metrics for readability. Two independent reviewers scored each response. Results were analysed with descriptive statistics and visualised in RStudio.
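
For readers unfamiliar with the readability metrics named in the Methods, the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas can be sketched in Python as follows. This is an illustrative sketch only, not the authors' analysis pipeline (the abstract reports using RStudio). The function names and the vowel-group syllable heuristic are assumptions made for this example, and dedicated readability tools count syllables more accurately.

```python
import re


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count groups of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent, except in words ending in 'le'.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Compute both metrics for a passage using the heuristics above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    n_words, n_sentences = len(words), len(sentences)
    return {
        "words": n_words,
        "sentences": n_sentences,
        "syllables": syllables,
        "reading_ease": flesch_reading_ease(n_words, n_sentences, syllables),
        "grade_level": flesch_kincaid_grade(n_words, n_sentences, syllables),
    }


if __name__ == "__main__":
    sample = "The lens is cloudy. The surgeon replaces it."
    print(readability(sample))
```

Shorter sentences and fewer syllables per word raise the Reading Ease score and lower the Grade Level, which is why the two metrics move in opposite directions when text is simplified.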

Results: ChatGPT produced the most readable content, with Flesch-Kincaid Grade Levels of 5.0–6.5 and Reading Ease Scores of 68.5–77.7, suitable for secondary school reading levels. DeepSeek performed similarly, while Grok and Gemini generated more complex outputs, often at A-level or early university levels. Gemini’s “simplified” segments paradoxically had poorer readability scores. DISCERN scores were comparable across models (56–58.7), indicating moderate reliability. However, all models lacked source citations, undermining credibility and transparency.

Conclusions: ChatGPT demonstrates potential for delivering clear, accessible content for patient education in ophthalmology. However, the absence of citations across all models raises concerns about trustworthiness. Gemini’s inconsistent readability underscores the need for standardised AI responses. As patients rely on AI for medical decisions, ensuring clarity, reliability, and verifiable accuracy is crucial. Future AI development should prioritise adapting to user literacy levels and incorporating trusted citations to enhance patient trust and informed consent.

Biography:

Dr. Arrane Selvamogan graduated with an MBBS from St George’s University of London in 2023, following a BSc in Neuroscience and Biochemistry (2:1 Honours) from Keele University in 2017. Currently an FY2 Doctor at Leicester Royal Infirmary, she has a strong interest in ophthalmology, with experience in clinical audits, microsurgical training, and undergraduate medical education. She has led research on acute angle closure glaucoma and primary open angle glaucoma, presenting at regional and national conferences. Passionate about advancing ophthalmology education, Dr. Selvamogan is committed to innovative teaching and research, with ongoing projects aimed at publications and presentations.
