Comparative study of advanced reasoning versus baseline large-language models for histopathological diagnosis in oral and maxillofacial pathology
by Viet Anh Nguyen, Van Hung Nguyen, Thi Quynh Trang Vuong, Quoc Thanh Truong, Thi Trang Nguyen
Large language models (LLMs) are increasingly explored as diagnostic copilots in digital pathology, but whether the newest reasoning-augmented architectures provide measurable benefits over earlier versions is unknown. We compared OpenAI’s o3 model, which uses an iterative planning loop, with the baseline GPT-4o on 459 oral and maxillofacial (OMF) cases drawn from standard textbooks. Each case consisted of two to five high-resolution haematoxylin-and-eosin micrographs, and both models were queried in zero-shot mode with an identical prompt requesting a single diagnosis and supporting microscopic features. Overall, o3 correctly classified 31.6% of cases, significantly surpassing GPT-4o at 18.7% (Δ = 12.9 percentage points, P < 0.001). The largest gain was recorded for the heterogeneous “other conditions” category (37.2% versus 20.2%). For correctly diagnosed cases, o3 generated more detailed descriptions (median Likert score 9 versus 8, P = 0.003). These benefits were offset by a longer mean response time (98 s versus near-instant) and lower reproducibility across repeated queries (40.2% versus 57.6%). A board-certified general pathologist, who served only as a non-OMF difficulty benchmark, achieved 28.3% accuracy on the same image set, underscoring the difficulty of the task. Ground truth was established by two board-certified OMF pathologists with high inter-rater reliability. The findings indicate that advanced reasoning mechanisms materially improve diagnostic performance and explanatory depth in complex histopathology, but additional optimisation is required to meet clinical speed and consistency thresholds. Clinically, such models are adjunctive ‘copilots’ for preliminary descriptions and differential diagnoses; expert OMF pathologists retain full responsibility for sign-out.
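The headline comparison can be checked with a few lines of arithmetic. The sketch below is illustrative only: the abstract reports percentages, not raw counts, so the counts are back-calculated by rounding and should be treated as approximations. The abstract also does not name the statistical test used; an exact McNemar test is shown here as one common choice for paired accuracy on the same cases, and it would need the discordant-pair counts, which the abstract does not give. All function names are hypothetical.

```python
from math import comb

N_CASES = 459  # number of OMF cases, as stated in the abstract


def approx_correct(accuracy_pct: float, n: int = N_CASES) -> int:
    """Back-calculate an approximate count of correct cases from a reported percentage."""
    return round(accuracy_pct / 100 * n)


def accuracy_gap_pp(correct_a: int, correct_b: int, n: int = N_CASES) -> float:
    """Absolute accuracy difference between two models, in percentage points."""
    return 100 * (correct_a - correct_b) / n


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts.

    b = cases only model A diagnosed correctly,
    c = cases only model B diagnosed correctly.
    Assumption: this is a plausible test for paired data, not necessarily
    the one the authors used.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial(n, 0.5) lower tail, doubled for a two-sided test and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Approximate counts implied by the reported accuracies (assumed, not reported).
o3_correct = approx_correct(31.6)
gpt4o_correct = approx_correct(18.7)
gap = accuracy_gap_pp(o3_correct, gpt4o_correct)
print(f"o3: {o3_correct}/{N_CASES}, GPT-4o: {gpt4o_correct}/{N_CASES}, gap: {gap:.1f} pp")
```

With the per-case results in hand, `mcnemar_exact_p` would be called with the two discordant counts; they are deliberately not invented here.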