Effective algorithms and improved high-volume data analysis techniques

Bibliographic record details
Main Author: Λιαπάκης, Ξενοφών
Other Authors: Παυλίδης, Γεώργιος
Format: Thesis
Language: English
Published: 2019
Subjects:
Online Access: http://hdl.handle.net/10889/12795
id nemertes-10889-12795
record_format dspace
institution UPatras
collection Nemertes
language English
topic Parallel computing
GPU computing
Special purpose computing
Linear algebra
System solution
Sparse matrices
Eigenvalues
Spectral factorization
Graph mining
Graph resilience
Triangles
Blockchain management
Distributed consensus
Digital transparency
Digital health
Mobile health applications
Insurance market
Computational kernel
Παράλληλη επεξεργασία [Parallel processing]
Γραμμική άλγεβρα [Linear algebra]
005.7
description Behind the buzzword big data lies a challenge which is very real and very important for both academia and industry. Efficiently handling an overwhelming volume of data, possibly coming from a large number of sources abiding by vastly different functionality protocols, coding conventions, sampling rates, and quality standards, is very important from an engineering, algorithmic, economic, and even social perspective. However, no matter how challenging the technical part is, it is nonetheless only a fraction of the entire challenge. This is because mining new, non-trivial knowledge from what is literally an ocean of data is even more challenging under the additional constraint that the value of this newly obtained knowledge must at least cover the total cost of extracting it, including factors such as power, storage, equipment procurement, and data collection. And this is the marginal case, which cannot be sustained indefinitely: the knowledge value must in fact be a multiple of the total effort cost for any big data pipeline to be viable from a business perspective. The objective of this PhD dissertation is twofold. The first goal is to explore the application of parallelism to accelerating critical computations in challenging problems from various fields. One very concrete example comes from the emerging field of computational combinatorics: a novel graph structural resilience metric based on triangles and paths is proposed. Since this metric is purely structural, namely function-oblivious, it can be applied to virtually any graph as long as the patterns it relies on have a physical meaning. The second goal is to show how parallelism can be part of very efficient and widely applicable computational kernels, such as those found in the BLAS library for basic linear algebra operations, which can be applied to various engineering and financial problems; to this end, the proposed algorithms are examined from a computational kernel perspective and are shown to apply to other problems as well, thus increasing their usefulness.
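The first objective above centers on a structural resilience metric built from triangles and paths. The record does not reproduce the metric's definition, so the following Python fragment is only a minimal, hypothetical sketch of the kind of triangle statistic such a metric could build on: counting, for each vertex, the triangles it participates in.

from collections import defaultdict

def triangles_per_vertex(edges):
    """Per-vertex triangle participation counts for an undirected graph.

    Illustrative stand-in only: the dissertation's metric combines
    triangles with paths, and its exact formula is not given here.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    counts = {v: 0 for v in adj}
    # A common neighbour w of edge (u, v) closes the triangle {u, v, w};
    # scanning each undirected edge once credits every vertex of a
    # triangle exactly once, through the edge opposite to it.
    for u, v in edges:
        for w in adj[u] & adj[v]:
            counts[w] += 1
    return counts

# Example: triangle 0-1-2 with a pendant vertex 3.
print(triangles_per_vertex([(0, 1), (1, 2), (0, 2), (2, 3)]))
# -> {0: 1, 1: 1, 2: 1, 3: 0}

The second objective frames such algorithms as reusable, BLAS-style computational kernels. As a hedged illustration of that row-parallel kernel shape (the dissertation's actual kernels, hardware targets, and scheduling are not described in this record), here is a CSR sparse matrix-vector product whose rows are split across a thread pool; NumPy's dot releases the GIL, so the row chunks can genuinely run in parallel.

import numpy as np
from concurrent.futures import ThreadPoolExecutor

def csr_spmv(indptr, indices, data, x, workers=4):
    """y = A @ x for a CSR-format sparse matrix, rows split across threads.

    Hypothetical kernel sketch: production sparse-BLAS and GPU kernels
    use far more careful load balancing, but the shape is the same.
    """
    n = len(indptr) - 1
    y = np.zeros(n)

    def rows(lo, hi):
        for i in range(lo, hi):
            s, e = indptr[i], indptr[i + 1]
            y[i] = np.dot(data[s:e], x[indices[s:e]])

    step = max(1, n // workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(rows, lo, min(lo + step, n))
                   for lo in range(0, n, step)]
        for f in futures:
            f.result()  # propagate any worker exception
    return y

# Example: the matrix [[2, 0], [1, 3]] applied to x = [1, 1].
print(csr_spmv(np.array([0, 1, 3]), np.array([0, 0, 1]),
               np.array([2.0, 1.0, 3.0]), np.array([1.0, 1.0])))  # [2. 4.]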
author2 Παυλίδης, Γεώργιος
author_facet Παυλίδης, Γεώργιος
Λιαπάκης, Ξενοφών
format Thesis
author Λιαπάκης, Ξενοφών
author_sort Λιαπάκης, Ξενοφών
title Effective algorithms and improved high-volume data analysis techniques
title_short Effective algorithms and improved high-volume data analysis techniques
title_full Effective algorithms and improved high-volume data analysis techniques
title_fullStr Effective algorithms and improved high-volume data analysis techniques
title_full_unstemmed Effective algorithms and improved high-volume data analysis techniques
title_sort effective algorithms and improved high-volume data analysis techniques
publishDate 2019
url http://hdl.handle.net/10889/12795
work_keys_str_mv AT liapakēsxenophōn effectivealgorithmsandimprovedhighvolumedataanalysistechniques
AT liapakēsxenophōn apotelesmatikoialgorithmoikaibeltiōmenestechnikesanalysēsdedomenōnmegalouonkou
_version_ 1771297303476305920
spelling nemertes-10889-12795 2022-09-05T20:36:41Z
Title: Effective algorithms and improved high-volume data analysis techniques / Αποτελεσματικοί αλγόριθμοι και βελτιωμένες τεχνικές ανάλυσης δεδομένων μεγάλου όγκου
Author: Λιαπάκης, Ξενοφών (Liapakis, Xenofon)
Other listed contributors: Παυλίδης, Γεώργιος; Μεγαλοοικονόμου, Βασίλης; Γαροφαλάκης, Ιωάννης; Τζήμας, Ιωάννης; Σιούτας, Σπύρος; Τσόλης, Δημήτρης; Στυλιάρας, Γεώργιος
Subjects and English abstract: as given in the topic and description fields above.
Greek abstract (translated): Beyond being an apt analogy between the digital world of the 21st century and the one immediately before it, which essentially began in the 19th century, the saying quoted above also conceals a very important truth. Like oil, data, especially once their digital mass exceeds certain significant technological thresholds, must be easy to acquire and must be of high quality if new, non-trivial knowledge is to be extracted from them and put to timely use. Every organization that produces or manages data today should maintain substantive internal data-quality criteria if it truly wants to draw from its data a timely and valid informational advantage over the competition. Although such criteria obviously depend not only on the structure, nature, resources, and mission of each organization but sometimes, chiefly for ad hoc initiatives, on entirely circumstantial conditions as well, the scientific community has developed a framework of six main axes for evaluating data, the so-called 6V system, which can readily be combined with an organization's own internal criteria.
Dates: 2019-11-03T11:46:21Z (accessioned), 2019-11-03T11:46:21Z (available), 2019-07-24 (issued). Thesis. http://hdl.handle.net/10889/12795 en 0 application/pdf