Περίληψη: | In this thesis we consider algorithms for fast estimations of leverage scores. Statistical leverage scores are a powerful tool for data analysis and statistics and have been successfully used for outlier detection in datasets, locating important nodes in graphs and more recently applied to numerical linear algebra algorithms. In order to build estimators, we consider dimensionality reduction techniques that use randomization in combination with iterative methods for solving linear systems with multiple right hand sides. Based on these techniques we try to overcome certain limitations of the current state-of-the-art algorithms and propose an approach which provably returns good estimations of leverage scores, scales well in parallel/distributed environments and effectively utilizes sparsity. We present our results on synthetic and real world data sets and evaluate its performance, and discuss the advantages and drawbacks relative to all considered approaches.
|