Impact of pre-processing methods on lesion image feature extraction in PET

Bibliographic record details
Main author: Βανταράκης, Σωτήριος
Other authors: Vantarakis, Sotirios
Language: English
Published: 2022
Subjects:
Available online: http://hdl.handle.net/10889/16402
Description
Abstract: Rapid advances in programming, data science and, especially, machine learning algorithms have given rise to a new field of study in medical physics called radiomics. Its aim is to extract meaningful quantitative information from medical images using mathematical tools in order to improve decision support, especially in oncological imaging. In this dissertation, a monoparametric review of a PET radiomics pipeline in Non-small Cell Lung Cancer (NSCLC) patients is performed. A pilot data set of 15 adenocarcinoma and 18 squamous-cell carcinoma patients was used. The PET images were acquired at the General University Hospital of Patras. This thesis examined the effect of absolute discretization with a fixed bin width, for two different bin widths, on the extracted feature values. The effect is evaluated using two approaches: (i) a statistical analysis (using IBM SPSS v27) covering the repeatability of feature values and their ability to discriminate lesion subtype, and (ii) a machine learning analysis. For the radiomics process, the LIFEx freeware was used. The previously delineated tumor segments were imported into LIFEx, the appropriate pre-processing parameters were set, and the selected features were extracted. Spatial resampling was set to a constant voxel size (3x3x3 mm) for every image data set to make the volumetric 3-dimensional data isotropic, and absolute discretization was applied with bin widths of 0.635 and 0.313. In evaluation approach (i), the repeatability of feature values was assessed with the Intraclass Correlation Coefficient (ICC) (Two-Way Mixed-Effects model, Consistency). With an ICC threshold of 0.90, almost 40% of features showed excellent repeatability, while the median ICC was 0.889 across all extracted radiomic features.
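The two pre-processing and repeatability steps named above can be illustrated with a minimal Python sketch. It is not the LIFEx or SPSS implementation. It assumes a lower intensity bound of 0 for the discretization, and it assumes the single-measurement form of the two-way mixed-effects consistency ICC, with one row per patient and one column per bin width; the record does not state either detail. The function names and the sample values are hypothetical.

```python
import math


def discretize_fixed_bin_width(values, bin_width, lower_bound=0.0):
    """Map each intensity to a 1-based bin index using a fixed bin width.

    lower_bound=0.0 is an assumption; the record does not state the bound.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    return [math.floor((v - lower_bound) / bin_width) + 1 for v in values]


def icc_consistency(ratings):
    """Two-way mixed-effects, consistency, single-measurement ICC.

    ratings: list of rows, one per subject; each row holds the k
    measurements of that subject (assumed here: one per bin width).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)


if __name__ == "__main__":
    suv = [0.1, 0.7, 1.3, 5.0]  # hypothetical intensities
    print(discretize_fixed_bin_width(suv, 0.635))
    print(discretize_fixed_bin_width(suv, 0.313))
    # Hypothetical feature values: rows are patients, columns are bin widths.
    print(icc_consistency([[1, 2], [2, 3], [3, 4], [5, 6]]))
```

Because the consistency form ignores a constant offset between columns, a feature whose value merely shifts by the same amount for every patient when the bin width changes still scores an ICC of 1.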
To assess each feature's discriminating ability, the Shapiro-Wilk normality test was first used to determine the feature's distribution for each bin size (22 features had a normal and 27 a non-normal distribution). Then, depending on each feature's distribution, an independent t-test (normal) or a Mann-Whitney U test (non-normal) at the 95% confidence level was employed to check whether the feature values could differentiate between the two cancer subtypes. These hypothesis tests indicated that GLRLM SRE at the 0.635 bin width could differentiate between cancer subtypes. Furthermore, the extracted feature values were used as input for a monoparametric machine learning classification task, carried out in Orange Data Mining v3.32. Three classifiers were used to create the machine learning models: Logistic Regression, Random Forest, and Support-Vector Machine. The performance of each monoparametric model, for each feature, was assessed using the area under the ROC curve (AUC). Machine learning highlighted more features capable of cancer subtype differentiation; the highest AUC obtained was 0.878, for the Logistic Regression model with the NGLDM Busyness feature at the larger bin width. Taken together, the three analyses (ICC, hypothesis testing and machine learning) indicate that the appropriate bin width for this specific PET radiomics study is the larger one (0.635). Lastly, two features that demonstrated both high ICC and high AUC values (SUV Kurtosis and SUV Excess Kurtosis) can be suggested for further research as imaging biomarkers.
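The link between the Mann-Whitney U test and the AUC used above can be sketched in a few lines of Python: for a single feature used directly as the score, the AUC equals U divided by the product of the two group sizes. This is an illustrative sketch only, not the Orange Data Mining workflow from the thesis. The function name and the sample values are hypothetical, and a cross-validated classifier will generally give a different number from this in-sample calculation.

```python
def single_feature_auc(positives, negatives):
    """AUC of one feature used directly as the score.

    Counts, over all (positive, negative) pairs, how often the positive
    case has the larger value; ties count as half. This equals the
    Mann-Whitney U statistic divided by n_pos * n_neg.
    """
    if not positives or not negatives:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical feature values for the two subtypes.
    group_a = [2.0, 3.0, 4.0]
    group_b = [1.0, 2.0, 3.5]
    print(single_feature_auc(group_a, group_b))
```

An AUC near 0.5 means the feature does not separate the two groups; values near 0 or 1 both indicate separation, with the direction depending on which group is treated as positive.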