9791221502893_60.pdf

Video-based fire detection is a crucial object detection problem that relies on accurate and reliable data to detect fires. However, collecting and labeling fire-related data can be time-consuming and expensive, making it difficult to obtain sufficient data for training machine learning models. To a...

Bibliographic record details
Language: English
Published: Firenze University Press 2024
Available online: https://books.fupress.com/doi/capitoli/979-12-215-0289-3_60
id oapen-20.500.12657-89072
record_format dspace
spelling oapen-20.500.12657-89072 2024-04-03T02:23:11Z Chapter Efficient Data Curation Using Active Learning for a Video-Based Fire Detection Joshi, Keyur Dietrich, Philip Aziz, Angelina König, Markus Uncertainty Estimation Active Learning Object Detection Outlier Detection Feature-based cluster analysis Video-based Fire Detection thema EDItEUR::U Computing and Information Technology 2024-04-02T15:45:39Z 2024-04-02T15:45:39Z 2023 chapter ONIX_20240402_9791221502893_41 2704-5846 9791221502893 https://library.oapen.org/handle/20.500.12657/89072 eng Proceedings e report application/pdf n/a 9791221502893_60.pdf https://books.fupress.com/doi/capitoli/979-12-215-0289-3_60 Firenze University Press 10.36253/979-12-215-0289-3.60 10.36253/979-12-215-0289-3.60 bf65d21a-78e5-4ba2-983a-dbfa90962870 9791221502893 137 9 Florence open access
institution OAPEN
collection DSpace
language English
description Video-based fire detection is a crucial object detection problem that relies on accurate and reliable data. However, collecting and labeling fire-related data is time-consuming and expensive, which makes it difficult to obtain sufficient data for training machine learning models. To address this challenge, uncertainty-based active learning techniques can be used to iteratively select the most informative samples for labeling. This can reduce the amount of labeled data needed to achieve high model performance and can even be used to prune the training data of less informative samples. Traditional sampling-based uncertainty estimation methods are computationally expensive. Hence, an efficient state-of-the-art approach, prior network-based ensemble distillation, is evaluated on an internal dataset; it still requires relatively high computational overhead, which makes production deployment difficult. A biased softmax differencing-based uncertainty approach and a feature-based hard data mining approach are proposed and compared with the distillation approach. The novel approaches have a very low uncertainty estimation overhead compared to the ensemble distillation approach and traditional sampling techniques. The methods are evaluated in the context of curating the unlabeled pool data and improving the training data. For completeness, the experiments are performed on three different data sizes; overall, the frame-wise selection strategy proves better than the sequence-wise querying strategy. The Principal Component Analysis (PCA)-based hard data mining outperformed the other methods and improved model performance by 16.33% on the AUC2% metric compared with random selection of data. The approach even outperformed the main network trained on the full data by 7.33% while using only the most informative 26.39% of the data, thereby improving the training data. The results indicate that the novel data mining approach provides efficient curation of training and pool data.
title 9791221502893_60.pdf
publisher Firenze University Press
publishDate 2024
url https://books.fupress.com/doi/capitoli/979-12-215-0289-3_60
_version_ 1799945273721487360