Effective algorithms and improved high-volume data analysis techniques
Behind the buzzword "big data" lies a challenge that is very real and very important for both academia and industry: efficiently handling an overwhelming volume of data, possibly coming from a large number of sources that abide by vastly different functionality protocols, coding conventions, sampling rates, and quality standards.
Main Author: | Λιαπάκης, Ξενοφών |
---|---|
Other Authors: | Παυλίδης, Γεώργιος |
Format: | Thesis |
Language: | English |
Published: | 2019 |
Subjects: | Parallel computing; GPU computing; Special purpose computing; Linear algebra; System solution; Sparse matrices; Eigenvalues; Spectral factorization; Graph mining; Graph resilience; Triangles; Blockchain management; Distributed consensus; Digital transparency; Digital health; Mobile health applications; Insurance market; Computational kernel |
Online Access: | http://hdl.handle.net/10889/12795 |
id | nemertes-10889-12795
record_format | dspace
institution | UPatras
collection | Nemertes
language | English
topic | Parallel computing; GPU computing; Special purpose computing; Linear algebra; System solution; Sparse matrices; Eigenvalues; Spectral factorization; Graph mining; Graph resilience; Triangles; Blockchain management; Distributed consensus; Digital transparency; Digital health; Mobile health applications; Insurance market; Computational kernel; Παράλληλη επεξεργασία (Parallel processing); Γραμμική άλγεβρα (Linear algebra); 005.7
description |
Behind the buzzword "big data" lies a challenge that is very real and very important for both academia and industry. Efficiently handling an overwhelming volume of data, possibly coming from a large number of sources that abide by vastly different functionality protocols, coding conventions, sampling rates, and quality standards, matters from an engineering, algorithmic, economic, and even social perspective. However, no matter how challenging the technical part is, it is only a fraction of the entire challenge. This is because harvesting new, non-trivial knowledge from what is literally an ocean of data is even more challenging, given the additional constraint that the value of this newly obtained knowledge must at least equal the total cost of extracting it, including factors such as power, storage, equipment procurement, and data collection. And this is the marginal case, which cannot be sustained indefinitely; for any big data pipeline to be viable from a business perspective, the value of the knowledge must be a multiple of the total extraction cost.
The twofold objective of this PhD dissertation is:
1. To explore applications of parallelism for accelerating critical computations in challenging problems from various fields. One very concrete example comes from the emerging field of computational combinatorics: a novel graph structural resilience metric based on triangles and paths is proposed. Since this metric is purely structural, that is, function-oblivious, it can be applied to virtually any graph as long as the patterns it relies on have a physical meaning.
2. To show how parallelism can be part of very efficient and widely applicable computational kernels, such as those found in the BLAS library for basic linear algebra operations, which can be applied to various engineering and financial problems. The proposed algorithms are examined from a computational kernel perspective, and it is shown that they can be applied to other problems as well, thus increasing their usefulness. |
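The computational-kernel perspective described in the abstract can be illustrated with a minimal sketch (hypothetical code, not taken from the dissertation; the `matmul` and `triangle_count` names are illustrative): a naive GEMM-style matrix multiply, the operation that an optimized BLAS routine such as `dgemm` accelerates, is reused unchanged for a graph-mining task related to the proposed metric, counting a graph's triangles via trace(A^3)/6.

```python
def matmul(a, b):
    """Naive GEMM-style kernel: C = A * B for square matrices given as
    lists of lists. In production, this is exactly the operation that an
    optimized BLAS routine (dgemm) or a GPU kernel would accelerate."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def triangle_count(adj):
    """Reuse the same kernel for graph mining: for an undirected graph
    with 0/1 adjacency matrix A, trace(A^3) / 6 is the number of
    triangles, since each triangle yields 6 closed walks of length 3
    (3 starting vertices x 2 directions)."""
    a3 = matmul(matmul(adj, adj), adj)
    return sum(a3[i][i] for i in range(len(adj))) // 6

# A 4-node graph: one triangle {0, 1, 2} plus a pendant edge 2-3.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
print(triangle_count(A))  # -> 1
```

Because the heavy lifting is isolated in a single kernel, replacing the naive multiply with a parallel BLAS or GPU implementation would speed up every problem built on top of it, which is the reuse argument the kernel perspective makes.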
author2 | Παυλίδης, Γεώργιος
format | Thesis
author | Λιαπάκης, Ξενοφών
title | Effective algorithms and improved high-volume data analysis techniques
publishDate | 2019
url | http://hdl.handle.net/10889/12795
spelling | nemertes-10889-12795 2022-09-05T20:36:41Z Effective algorithms and improved high-volume data analysis techniques; Αποτελεσματικοί αλγόριθμοι και βελτιωμένες τεχνικές ανάλυσης δεδομένων μεγάλου όγκου; Λιαπάκης, Ξενοφών; Παυλίδης, Γεώργιος; Μεγαλοοικονόμου, Βασίλης; Γαροφαλάκης, Ιωάννης; Τζήμας, Ιωάννης; Σιούτας, Σπύρος; Τσόλης, Δημήτρης; Στυλιάρας, Γεώργιος; Liapakis, Xenofon; Parallel computing; GPU computing; Special purpose computing; Linear algebra; System solution; Sparse matrices; Eigenvalues; Spectral factorization; Graph mining; Graph resilience; Triangles; Blockchain management; Distributed consensus; Digital transparency; Digital health; Mobile health applications; Insurance market; Computational kernel; Παράλληλη επεξεργασία; Γραμμική άλγεβρα; 005.7
The above saying, besides drawing an apt analogy between the digital world of the 21st century and its immediate predecessor, which essentially began in the 19th century, also conceals a very important truth. Like oil, data, especially once its digital mass exceeds certain significant technological thresholds, must be easy to acquire and of high quality if new, non-trivial knowledge is to be mined from it and then exploited in a timely manner. Every organization that produces or manages data today has, or should have, substantive internal data-quality criteria if it truly wants to extract from its data a timely and valid informational advantage over the competition.
Although these criteria obviously depend not only on the organizational structure, nature, resources, and mission of each organization but sometimes, mainly for ad hoc initiatives, also on entirely circumstantial conditions, the scientific community has developed a system of six main axes for evaluating data, the so-called 6V system, which can readily be combined with an organization's own internal criteria. 2019-11-03T11:46:21Z 2019-11-03T11:46:21Z 2019-07-24 Thesis http://hdl.handle.net/10889/12795 en 0 application/pdf |