Abstract: Artificial Intelligence (AI) aims to simulate human intelligence processes with machines, especially computer systems. Machines learn through a variety of algorithms, a field known as Machine Learning. Machine learning is widely used in medical image analysis, in tasks such as image segmentation, image classification (including Computer Aided Detection and/or Diagnosis), image fusion, image-guided therapy and image retrieval. Deep Learning (DL) is one of the main approaches of machine learning and consists of a set of algorithms that attempt to automatically learn, from a given image, multiple levels of representation corresponding to different levels of abstraction of that image. Understanding the characteristics of deep learning approaches is crucial in order to properly apply and refine DL methods.
Breast cancer is the most frequently diagnosed cancer among women worldwide, and X-ray mammography, aided by breast ultrasonography, is the modality of choice for breast cancer screening. Due to intra- and inter-observer variability in diagnostic accuracy, image processing and analysis techniques are needed for the detection and classification of breast tumors, to support the decision-making process of radiologists.
The aim of this study is to evaluate the performance of deep Convolutional Neural Networks (CNNs) in mammographic mass segmentation. The study investigates 3 different input ROI cropping scenarios (tight, proportional padding and fixed size), each representing a different pixel-count ratio between the mass and background classes, which is expected to affect the pixel-wise classification. Finally, rescaling to a 256x256 ROI size was adopted, while a 40x40 size was also tested for comparison purposes.
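As an illustration of the three cropping scenarios, the following minimal sketch crops a mass ROI from a bounding box given as (row_min, row_max, col_min, col_max) and rescales it to the network input size; the padding factor and the fixed crop size are illustrative assumptions, not the exact values used in the study.

```python
from skimage.transform import resize

def crop_roi(image, bbox, scenario="tight", padding_factor=1.3,
             fixed_size=512, out_size=256):
    """Crop a mass ROI under one of the 3 cropping scenarios and rescale it.

    bbox = (row_min, row_max, col_min, col_max) of the mass bounding box.
    padding_factor and fixed_size are illustrative assumptions.
    """
    r0, r1, c0, c1 = bbox
    rc, cc = (r0 + r1) // 2, (c0 + c1) // 2             # bounding-box centre
    if scenario == "tight":                             # crop exactly the bounding box
        half_h, half_w = (r1 - r0) // 2, (c1 - c0) // 2
    elif scenario == "proportional":                    # pad proportionally to mass size
        half_h = int(padding_factor * (r1 - r0) / 2)
        half_w = int(padding_factor * (c1 - c0) / 2)
    elif scenario == "fixed":                           # same crop size for every mass
        half_h = half_w = fixed_size // 2
    else:
        raise ValueError(f"unknown scenario: {scenario}")
    crop = image[max(rc - half_h, 0):rc + half_h, max(cc - half_w, 0):cc + half_w]
    return resize(crop, (out_size, out_size), preserve_range=True)
```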
The proposed mass segmentation model is based on the U-net model, which has demonstrated very promising performance on biomedical images originating from light and electron microscopy. The effect of different modifications is investigated to improve segmentation performance. These modifications are: (i) introduction of a complementary ground truth to balance the ratio of foreground and background pixels during training and validation of the network, (ii) decreasing network width and increasing network depth (hyperparameters), and (iii) evaluation of 2 transfer learning scenarios. Finally, (iv) a wavelet preprocessing stage was also tested for the best performing system.
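A minimal Keras sketch of a U-net of this kind is given below; the framework choice, filter counts and kernel sizes are assumptions for illustration, with `depth` and `base_filters` exposed as the hyperparameters varied in modification (ii).

```python
from tensorflow.keras import layers, Model

def build_unet(input_size=256, base_filters=64, depth=4):
    """Minimal U-net sketch: `depth` sets the number of encoder/decoder blocks
    and `base_filters` the network width. Filter counts and kernel sizes are
    assumptions, not the exact configuration of the study."""
    inputs = layers.Input((input_size, input_size, 1))
    x, skips = inputs, []
    for d in range(depth):                              # contracting (encoder) path
        x = layers.Conv2D(base_filters * 2 ** d, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(base_filters * 2 ** d, 3, padding="same", activation="relu")(x)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(base_filters * 2 ** depth, 3, padding="same", activation="relu")(x)  # bottleneck
    for d in reversed(range(depth)):                    # expanding (decoder) path
        x = layers.Conv2DTranspose(base_filters * 2 ** d, 2, strides=2, padding="same")(x)
        x = layers.Concatenate()([x, skips[d]])         # skip connection
        x = layers.Conv2D(base_filters * 2 ** d, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(base_filters * 2 ** d, 3, padding="same", activation="relu")(x)
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)   # pixel-wise mass probability
    return Model(inputs, outputs)
```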
Our working hypotheses were experimentally investigated in 2 mammographic mass datasets, the INbreast and a Privately Annotated DDSM subset (PA_DDSMsubset), each split into 3 subsets: training, validation and testing. Training, which included a validation phase to avoid overfitting, was carried out to adjust the weights of all layers for the proposed modifications, using a mini-batch Stochastic Gradient Descent (SGD) optimizer (batch size = 2) minimizing a Binary Cross Entropy (BCE) loss function.
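A hedged sketch of this training configuration follows, assuming the `build_unet` sketch above and placeholder NumPy arrays (`train_rois`, `train_masks`, `val_rois`, `val_masks`) holding the preprocessed ROIs and binary masks; the learning rate, epoch count and early-stopping patience are illustrative assumptions.

```python
import tensorflow as tf  # build_unet is the sketch given earlier

model = build_unet(input_size=256, base_filters=64, depth=4)
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),  # assumed learning rate
              loss="binary_crossentropy",                             # BCE loss
              metrics=["accuracy"])

# Validation-based early stopping as a guard against overfitting
# (patience and epoch count are illustrative assumptions).
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                              restore_best_weights=True)

# train_rois/train_masks and val_rois/val_masks are placeholder arrays
# holding the preprocessed ROIs and their binary masks.
model.fit(train_rois, train_masks,
          validation_data=(val_rois, val_masks),
          batch_size=2,              # mini-batch SGD with batch size 2
          epochs=100,
          callbacks=[early_stop])
```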
Performance of the proposed modifications was evaluated on the testing subsets of the 2 datasets, INbreast and PA_DDSMsubset, demonstrating the following trends.
Addition of a complementary ground truth during training was tested only for 256x256 input ROIs (representing high resolution image detail), considering the 2 extreme mass-to-background ratio scenarios (tight and fixed size), and seems to offer advantages for both scenarios. Thus, it was adopted for all subsequent experiments. Performance of this model (Balanced_Training_U-net) was tested for 2 rescaled input sizes (40x40 and 256x256) and 3 mass-to-background ratio scenarios. The best performance was obtained for the tight scenario with 256x256 rescaled input for both testing subsets (90.95% for PA_DDSMsubset and 91.34% for INbreast), while a systematic decrease in performance was observed for the proportional padding and fixed size scenarios.
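One plausible reading of the complementary ground truth, sketched below, stacks the mass mask with its complement so that background pixels also receive an explicit target channel; this pairing would require the final 1x1 convolution of the network sketched above to output two channels instead of one.

```python
import numpy as np

def add_complementary_channel(mask):
    """Stack the binary mass mask with its complement (background mask) so that
    background pixels also contribute an explicit target during training and
    validation; one plausible reading of the complementary ground truth."""
    mask = mask.astype(np.float32)
    return np.stack([mask, 1.0 - mask], axis=-1)   # channel 0: mass, channel 1: background
```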
For 256x256 input ROIs, increasing network depth favors the tight scenario for both testing datasets (92.03% for PA_DDSMsubset and 91.60% for INbreast), attributed to a more precise representation of images at the bottleneck, with a higher number of more complex features extracted. Adding an additional block for the 40x40 input ROI size was not deemed applicable, as further reducing the image scale would lead to an unrealistically small image size at the bottleneck.
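In terms of the `build_unet` sketch above, the deeper configuration corresponds to one additional encoder/decoder block:

```python
# Deeper configuration: one extra encoder/decoder block (depth 5) shrinks the
# 256x256 input to 256 / 2**5 = 8 px at the bottleneck, whereas a 40x40 input
# would be reduced to roughly 1 px, hence the deeper variant was not applied to it.
deeper_unet = build_unet(input_size=256, base_filters=64, depth=5)
```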
The effect of reducing the number of activation maps was tested for the 40x40 input, as increasing the activation maps for a small, coarse-detail input ROI could lead the network to overfit. Reducing network width resulted in decreased performance, while preserving the same performance pattern across cropping scenarios.
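Correspondingly, width reduction amounts to lowering `base_filters` in the same sketch; the depth shown here is an assumption chosen so the 40x40 input divides evenly through the pooling stages.

```python
# Narrower configuration: halving base_filters reduces the number of activation
# maps in every layer; depth 3 is an assumption chosen so that the 40x40 input
# divides evenly through the pooling stages (40 -> 20 -> 10 -> 5).
narrow_unet = build_unet(input_size=40, base_filters=32, depth=3)
```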
The effect of two transfer learning methods was also investigated, in order to deal with the limited size of the datasets analyzed. The first (pre-trained weights) applies weights obtained from PA_DDSMsubset training directly to INbreast testing, and the second (fine-tuning) adopts these weights to initialize training on the INbreast training dataset, to be further optimized. Results indicate that both methods are comparable in performance to the from-scratch scenario, highlighting that the U-net is efficient for small-size datasets. The fine-tuning method provided increased performance as compared to the originally tested systems, as expected. Both methods maintain the same performance trend with respect to cropping scenarios (tight performing best). The fine-tuning method seems to improve performance for the worst performing cropping scenarios (proportional padding and fixed size). In such cases, transferring pre-trained weights from a digitized to a digital dataset seems a valuable scenario, pointing towards using weights from different mammographic modalities to enhance performance on a specific mammographic modality dataset.
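The two scenarios could be sketched as follows, assuming the PA_DDSMsubset-trained weights are stored in a hypothetical `pa_ddsm_unet.h5` file and the INbreast subsets are held in placeholder arrays; the learning rate and epoch count are again illustrative assumptions.

```python
import tensorflow as tf

# Scenario 1 (pre-trained weights): apply the PA_DDSMsubset-trained model to the
# INbreast testing subset as-is ("pa_ddsm_unet.h5" is a hypothetical file name).
pretrained = tf.keras.models.load_model("pa_ddsm_unet.h5")
pretrained.evaluate(inbreast_test_rois, inbreast_test_masks, batch_size=2)

# Scenario 2 (fine-tuning): initialize from the same weights and continue
# training on the INbreast training subset with the same SGD/BCE setup
# (learning rate and epochs are illustrative assumptions).
fine_tuned = tf.keras.models.load_model("pa_ddsm_unet.h5")
fine_tuned.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
                   loss="binary_crossentropy", metrics=["accuracy"])
fine_tuned.fit(inbreast_train_rois, inbreast_train_masks,
               validation_data=(inbreast_val_rois, inbreast_val_masks),
               batch_size=2, epochs=50)
```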
Finally, wavelet preprocessing, intended to enhance the edge representation of the input to the network, did not seem to improve performance.
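For reference, a minimal sketch of such a wavelet edge-enhancement step, using PyWavelets with an assumed wavelet family and gain, is given below.

```python
import numpy as np
import pywt

def wavelet_edge_enhance(roi, wavelet="db2", gain=2.0):
    """Amplify the detail (edge) sub-bands of a single-level 2-D DWT before
    reconstruction; the wavelet family and gain are illustrative assumptions."""
    approx, (horiz, vert, diag) = pywt.dwt2(roi.astype(np.float32), wavelet)
    enhanced = pywt.idwt2((approx, (gain * horiz, gain * vert, gain * diag)), wavelet)
    return enhanced[:roi.shape[0], :roi.shape[1]]   # trim possible 1-px reconstruction padding
```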
In conclusion, results of the present study demonstrate performance of the proposed mammographic mass segmentation method comparable to state-of-the-art methods (Table 2.1). The cropping scenario seems to systematically affect performance, with the tight scenario performing best, while related literature suggests the proportional one (Table 2.1; Vivek Kumar Singh et al., Expert Systems with Applications, 2018).
Next steps involve investigating a postprocessing method and testing segmentation for different mass margin and shape morphologies.