Transfer Learning based Multi Class Common Eye Disease Classification Using Enhanced Vision Transformer



Publisher

ASTU

Abstract

Common retinal diseases such as diabetic retinopathy, glaucoma, age-related macular degeneration, and cataracts are leading causes of vision loss and blindness globally. In Ethiopia, their high prevalence, coupled with a limited number of ophthalmologists, poses a major public health challenge. Early detection and diagnosis through fundus imaging can enable prompt treatment and prevent vision loss. However, manual analysis is time-consuming and prone to human error. While convolutional neural networks (CNNs) have shown promise in retinal image analysis, they are limited in their ability to capture global context in images. Transformers have emerged as an alternative approach that uses the self-attention mechanism to model global context. This study investigates the potential of a vision transformer (ViT) model for multi-class classification of common retinal diseases using fundus photography. A custom vision transformer model is proposed that combines the standard architecture with convolutional and classifier layers tailored for retinal images. The study employs transfer learning from ImageNet pre-training. The model is trained on fundus datasets augmented through photometric and geometric transformations to expand the data size and improve generalization. A comparative assessment is conducted against CNNs such as ResNet50, VGG19, and InceptionV3. The custom vision transformer model achieves an accuracy of 95% in classifying normal retina and five disease classes, outperforming the baseline vision transformer, which scored 87-91% accuracy, as well as the CNN models (highest accuracy of 94%). Precision, recall, F1-score, and AUC are also improved over both the baseline ViT and the CNNs. Class activation mapping visualizations highlight the discriminative regions the model focuses on, providing interpretability.
The findings demonstrate the effectiveness of the custom transfer learning-based vision transformer model for retinal disease classification and indicate its superiority and generalization ability over convolutional neural networks. This work helps establish vision transformers as a promising approach toward developing automated screening systems that prevent vision impairment through early diagnosis of retinal diseases using fundus imaging.
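
The abstract's point that self-attention models global context can be illustrated with a minimal sketch. This is not the study's model: it assumes toy two-dimensional patch embeddings and identity query/key/value projections, and the names `softmax`, `self_attention`, and `patches` are hypothetical. It only shows that every patch receives a nonzero attention weight over every other patch, which is what allows a transformer to relate distant image regions in a single layer.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over a list of token vectors.

    Assumption for illustration: queries, keys, and values are the tokens
    themselves (identity projections); a real ViT learns these projections.
    """
    d = len(tokens[0])
    scale = math.sqrt(d)
    weights = []
    for q in tokens:
        # Similarity of this token to every token, including itself.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        weights.append(softmax(scores))
    # Each output is a weighted average of all token vectors.
    outputs = [
        [sum(w * v[j] for w, v in zip(row, tokens)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights


# Three toy "patch embeddings" standing in for fundus image patches.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(patches)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row sums to 1 and contains no zeros, so every patch's output mixes information from all patches, in contrast to a convolution, whose receptive field in a single layer is limited to the kernel size.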
