9791221502893_60.pdf

Video-based fire detection is a crucial object detection problem that relies on accurate and reliable data to detect fires. However, collecting and labeling fire-related data can be time-consuming and expensive, making it difficult to obtain sufficient data for training machine learning models. To a...


Bibliographic Details
Language:	English
Published:	Firenze University Press 2024
Online Access:	https://books.fupress.com/doi/capitoli/979-12-215-0289-3_60

id	oapen-20.500.12657-89072
record_format	dspace
spelling	oapen-20.500.12657-89072 2024-04-03T02:23:11Z Chapter Efficient Data Curation Using Active Learning for a Video-Based Fire Detection; Joshi, Keyur; Dietrich, Philip; Aziz, Angelina; König, Markus; Uncertainty Estimation; Active Learning; Object Detection; Outlier Detection; Feature-based cluster analysis; Video-based Fire Detection; thema EDItEUR::U Computing and Information Technology; 2024-04-02T15:45:39Z 2024-04-02T15:45:39Z 2023 chapter ONIX_20240402_9791221502893_41 2704-5846 9791221502893 https://library.oapen.org/handle/20.500.12657/89072 eng Proceedings e report application/pdf n/a 9791221502893_60.pdf https://books.fupress.com/doi/capitoli/979-12-215-0289-3_60 Firenze University Press 10.36253/979-12-215-0289-3.60 10.36253/979-12-215-0289-3.60 bf65d21a-78e5-4ba2-983a-dbfa90962870 9791221502893 137 9 Florence open access
institution	OAPEN
collection	DSpace
language	English
description	Video-based fire detection is a crucial object detection problem that relies on accurate and reliable data. However, collecting and labeling fire-related data is time-consuming and expensive, which makes it difficult to obtain sufficient data for training machine learning models. To address this challenge, uncertainty-based active learning techniques can be used to iteratively select the most informative samples for labeling. This can reduce the amount of labeled data needed to achieve high model performance and can potentially even prune less informative samples from the training data. Traditional sampling-based uncertainty estimation methods are computationally expensive. Hence, an efficient state-of-the-art approach, prior network-based ensemble distillation, is evaluated on an internal dataset; it still requires relatively high computational overhead, which makes production deployment difficult. A biased softmax differencing-based uncertainty approach and a feature-based hard data mining approach are proposed and compared with the distillation approach. The novel approaches are found to have a very low uncertainty estimation overhead compared with the ensemble distillation approach and traditional sampling techniques. The methods are evaluated in the context of curating the unlabeled pool data and improving the training data. For completeness, the experiments are performed on three different data sizes; overall, the frame-wise selection strategy proves better than the sequence-wise querying strategy. The Principal Component Analysis (PCA)-based hard data mining outperformed the other methods and improved model performance by 16.33% on the AUC2% metric compared with random selection of data. The approach even outperformed the main network trained on the full data by 7.33%, thereby improving the training data while using only the informative 26.39% of the data. The results indicate that the novel data mining approach provides efficient curation of training and pool data.
title	9791221502893_60.pdf
publisher	Firenze University Press
publishDate	2024
url	https://books.fupress.com/doi/capitoli/979-12-215-0289-3_60
_version_	1799945273721487360
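
Illustrative note on the description field: the abstract mentions uncertainty-based selection of frames from an unlabeled pool using a softmax differencing score. The chapter's exact "biased" formulation is not given in this record, so the sketch below uses a plain top-two softmax margin instead, which is an assumption and not the authors' method. The function names, frame identifiers, and logits are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one frame's class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def margin_uncertainty(logits: Sequence[float]) -> float:
    """Return 1 - (p_top1 - p_top2); values near 1 mean the model is unsure.

    Assumption: a plain margin score with at least two classes, standing in
    for the biased softmax differencing named in the abstract.
    """
    probs = sorted(softmax(logits), reverse=True)
    return 1.0 - (probs[0] - probs[1])


def select_frames(pool: Dict[str, Sequence[float]], budget: int) -> List[str]:
    """Frame-wise querying: pick the `budget` most uncertain frames."""
    ranked = sorted(pool, key=lambda f: margin_uncertainty(pool[f]), reverse=True)
    return ranked[:budget]


# Hypothetical unlabeled pool: frame id -> [fire, no-fire] logits.
pool = {
    "frame_001": [4.0, 0.1],   # confident prediction, low uncertainty
    "frame_002": [0.6, 0.5],   # near tie, high uncertainty
    "frame_003": [2.0, 1.0],   # moderately uncertain
}
queried = select_frames(pool, budget=2)
print(queried)
```

In an active learning loop of the kind the abstract outlines, the queried frames would be sent for labeling, added to the training set, and the model retrained before the next selection round.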


Similar Items