HYBRID EVENT: You can participate in person in Singapore or virtually from your home or workplace.

4th Edition of

International Ophthalmology Conference

March 23-25, 2026 | Singapore

IOC 2026

Evaluating the efficacy of artificial intelligence as a patient education tool in paediatric ophthalmology: A cross-sectional comparative study

Vera Haidar
Imperial College London, United Kingdom
Title: Evaluating the efficacy of artificial intelligence as a patient education tool in paediatric ophthalmology: A cross-sectional comparative study

Abstract:

Background: As patients increasingly turn to digital platforms for medical information, the role of artificial intelligence (AI) tools such as ChatGPT and DeepSeek in patient education is expanding. However, concerns remain regarding the readability, accuracy, and overall quality of health content produced by AI models. This study is the first to compare AI-generated patient education materials with clinician-authored resources for parents of children with common ophthalmic conditions.

Methods: We evaluated the readability and content quality of AI-generated patient education materials for four common paediatric ophthalmic conditions (chalazion, blepharitis, amblyopia, and strabismus) against NHS-approved leaflets from Moorfields Eye Hospital. Responses from ChatGPT-3.5 and DeepSeek R1 were generated using structured prompts based on the NHS leaflet headings. All materials were assessed using four validated readability metrics: the Flesch-Kincaid Grade Level (FKGL), Flesch Reading Ease (FRE), Gunning Fog Score, and SMOG Index. Content quality was assessed with the Ensuring Quality Information for Patients (EQIP) tool. Statistical analysis was conducted using analysis of variance (ANOVA), with Cronbach's alpha used to determine inter-rater reliability.
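For illustration only, the minimal Python sketch below shows how the four readability metrics named above can be computed with the open-source textstat package. The sample text is an invented placeholder, not one of the study's leaflets.

# Readability scoring sketch using the open-source `textstat` package
# (pip install textstat). The sample text is a placeholder, not an
# actual NHS or AI-generated leaflet from the study.
import textstat

leaflet_text = (
    "A chalazion is a small lump in the eyelid. It happens when a tiny "
    "gland in the eyelid becomes blocked. It is not usually painful."
)

scores = {
    "FKGL": textstat.flesch_kincaid_grade(leaflet_text),   # US school grade level
    "FRE": textstat.flesch_reading_ease(leaflet_text),     # 0-100, higher = easier
    "Gunning Fog": textstat.gunning_fog(leaflet_text),     # years of schooling
    "SMOG": textstat.smog_index(leaflet_text),             # years of schooling
}

for metric, value in scores.items():
    print(f"{metric}: {value:.1f}")

In practice each leaflet's full text would be scored this way and the per-source scores collected for the group comparison.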

Results: NHS patient information leaflets were significantly more readable than those generated by AI models (mean FKGL: 4.1 vs 7.5 for ChatGPT and 9.2 for DeepSeek; p = 0.0011). Only the human-generated leaflets met the recommended FRE score of 65 for the general population, with a mean FRE of 74.8 versus 57.7 for ChatGPT and 48.1 for DeepSeek (p = 0.0038). EQIP scores showed significant differences in information quality (p = 0.0041), with human-generated leaflets scoring highest (mean = 69.61, SD = 4.74), outperforming ChatGPT (mean = 58.26, SD = 3.35) and DeepSeek (mean = 56.75, SD = 4.73).
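As an illustration of the analysis behind these p-values, the Python sketch below runs a one-way ANOVA across the three sources with scipy.stats.f_oneway and computes Cronbach's alpha from a leaflets-by-raters score matrix. Every number in it is an invented placeholder, not the study's data.

# Statistical analysis sketch: one-way ANOVA across the three sources and
# Cronbach's alpha for inter-rater reliability. All values are invented
# placeholders for illustration.
import numpy as np
from scipy import stats

# Hypothetical EQIP scores, one per condition, for each source.
nhs      = [72.0, 68.5, 65.9, 72.0]
chatgpt  = [60.1, 55.3, 58.2, 59.4]
deepseek = [61.0, 52.3, 55.8, 57.9]

f_stat, p_value = stats.f_oneway(nhs, chatgpt, deepseek)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")

def cronbach_alpha(ratings):
    """Cronbach's alpha for a (leaflets x raters) matrix of scores."""
    ratings = np.asarray(ratings, dtype=float)
    k = ratings.shape[1]                      # number of raters
    rater_var = ratings.var(axis=0, ddof=1)   # variance per rater
    total_var = ratings.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - rater_var.sum() / total_var)

# Two hypothetical raters scoring the same twelve leaflets.
rater_scores = np.array([[70, 68], [58, 60], [56, 55], [66, 67],
                         [59, 57], [52, 54], [74, 72], [61, 63],
                         [55, 53], [69, 71], [57, 59], [50, 52]])
print(f"Cronbach's alpha = {cronbach_alpha(rater_scores):.2f}")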

Conclusion: While AI tools offer accessible and timely health information, their outputs currently fall short of clinician-authored resources in both readability and quality. These limitations, together with the absence of cited sources, pose potential risks for patients, particularly those with limited health or digital literacy. Until large language models (LLMs) can reliably meet standards for clarity and accuracy, NHS-approved materials should remain the primary reference for patient education. Nonetheless, with refinement and human oversight, AI has the potential to support scalable, personalised, and interactive health communication in the future.

Biography:

Dr. Haidar and Dr. Ruan are award-winning Specialised Academic Doctors training within the Imperial College Healthcare NHS Trust. Both hold two degrees from Imperial College London: an MBBS with Distinction and a First-Class Honours BSc in Endocrinology, reflecting exceptional clinical and scientific achievement. They have published in leading journals, including The BMJ and Archives of Disease in Childhood, and have presented research at major national conferences such as those of the RCPCH and RCPsych. They also serve as editorial board members for UK-based academic journals, contributing to the advancement of emerging medical research.
