Eosinophilic Esophagitis (EoE) is a chronic immune disease most commonly diagnosed by examining biopsy tissue taken from the esophagus. Currently, trained pathologists spend hours manually examining biopsy slides to identify and count eosinophils, which are indicators of EoE. Given the success of deep learning models in automating other areas of medical image analysis, we ask whether deep learning networks can be trained to accurately segment and count eosinophils in esophageal biopsy slides. Existing efforts, however, have not sufficiently evaluated different deep learning models, hyperparameters, or metrics. Additionally, many are built on hundreds of annotated training images, which pathology labs often lack the resources to generate. To address these gaps, we present a comprehensive evaluation of five deep learning architectures and tune their hyperparameters. We rely primarily on location-based metrics to count true positives (TP), false positives (FP), and false negatives (FN), and we conduct a limited-data analysis to see how models respond to varying amounts of training data. We find that UNet++ performs best among the evaluated models. Although Dice and IoU values remained similar across models, TP and FP counts varied greatly, highlighting the importance of including counting-based metrics when comparing cell segmentation methods. Furthermore, we conduct sliding-window experiments to study how the patch size and stride used to generate training data affect model performance. We find that TP counts do not vary greatly, while FP counts can differ significantly across sliding-window settings. Our limited-data analysis revealed that eight training images are sufficient for most models to produce reliable results, allowing deep learning to be used as an efficient aid to pathologists. Our work provides helpful comparative information for future cell segmentation applications.
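
To illustrate why pixel-overlap scores and counting-based metrics can disagree, the following is a minimal sketch rather than the evaluation code used in this work. It assumes binary masks and a simple matching rule that is our own assumption: a ground-truth cell counts as a TP if any predicted pixel overlaps it, and a predicted object that touches no ground-truth pixel counts as an FP. The function names `dice_and_iou` and `count_tp_fp_fn` and the toy masks are hypothetical.

```python
import numpy as np
from scipy import ndimage


def dice_and_iou(pred, truth):
    """Pixel-overlap metrics for two binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    total = pred.sum() + truth.sum()
    dice = 2.0 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return float(dice), float(iou)


def count_tp_fp_fn(pred, truth):
    """Object-level counts under an ASSUMED overlap-based matching rule."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    # Label connected components (4-connectivity by default in 2D).
    truth_labels, n_truth = ndimage.label(truth)
    pred_labels, n_pred = ndimage.label(pred)
    # Ground-truth objects touched by at least one predicted pixel (label 0 is background).
    tp = int(np.count_nonzero(np.unique(truth_labels[pred])))
    fn = n_truth - tp
    # Predicted objects that touch no ground-truth pixel are false positives.
    matched_pred = int(np.count_nonzero(np.unique(pred_labels[truth])))
    fp = n_pred - matched_pred
    return tp, fp, fn


# Toy example (hypothetical data): one real cell, predicted perfectly,
# plus three isolated single-pixel false detections.
truth = np.zeros((12, 12), dtype=bool)
truth[1:7, 1:7] = True
pred = truth.copy()
pred[9, 1] = pred[9, 5] = pred[9, 9] = True

dice, iou = dice_and_iou(pred, truth)
tp, fp, fn = count_tp_fp_fn(pred, truth)
print(f"Dice={dice:.3f} IoU={iou:.3f} TP={tp} FP={fp} FN={fn}")
```

In this toy case the Dice score stays high because the spurious detections cover very few pixels, yet each one would add to an eosinophil count, which is the kind of discrepancy that counting-based metrics are meant to expose.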