Robust Nonparametric Inference on Manifold Spaces
Advisor
Chenouri, Shoja'eddin
Publisher
University of Waterloo
Abstract
We propose rank-based procedures for robust and nonparametric statistical inference on manifold spaces. In particular, we focus on the problems of multi-sample hypothesis testing, multiple change point analysis, and statistical process monitoring when data lie on a Riemannian manifold. These methodologies provide a unified framework for dealing with various types of data structures, such as matrices, curves, surfaces, and networks. Such datasets appear frequently in a broad range of applications, including communication networks, manufacturing, computer vision, autonomous systems, and robotics. We evaluate the proposed methods on various types of object data, such as matrices, curves, text mining data, networks, shape data, and landmarks.
In Chapter 2, we develop robust and nonparametric methods for hypothesis testing when data lie on a manifold. We demonstrate that ranks generated from data depth can be used for two-sample and multiple-sample hypothesis tests for changes in location and scale parameters. Several important properties of these tests, such as asymptotic convergence, size and power, and robustness in terms of qualitative robustness and breakdown point, are developed under mild nonparametric assumptions. These tests have several advantages: they have a simple distribution under the null hypothesis, they are computationally cheap, and they enjoy invariance properties. We demonstrate the efficacy of these methods with a numerical simulation and a data analysis. We show that these tests are robust when data are heavy-tailed or skewed, and that they have higher power than their competitors in some situations while still maintaining a reasonable size.
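As a rough illustration of the depth-rank idea, the following sketch compares two samples on the unit sphere by ranking depth values and applying a rank-sum statistic with a normal approximation. It is not the thesis's procedure: the depth function (`distance_depth`, based on mean geodesic distance to the pooled sample), the helper `sample_near_pole`, and all parameter values are assumptions chosen only to keep the example short.

```python
import math
import random

def geodesic(p, q):
    """Great-circle distance between two unit vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))

def distance_depth(x, sample):
    """Hypothetical stand-in depth: larger when x is, on average, close to the sample."""
    mean_dist = sum(geodesic(x, s) for s in sample) / len(sample)
    return 1.0 / (1.0 + mean_dist)

def depth_rank_test(xs, ys):
    """Rank-sum test applied to depths computed with respect to the pooled sample."""
    pooled = xs + ys
    depths = [distance_depth(p, pooled) for p in pooled]
    # Rank 1 is the least deep (most outlying) point; ties are not handled.
    order = sorted(range(len(pooled)), key=lambda i: depths[i])
    ranks = [0] * len(pooled)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    m, n = len(xs), len(ys)
    w = sum(ranks[:m])                      # rank sum of the first sample
    mean = m * (m + n + 1) / 2              # null mean of the rank sum
    var = m * n * (m + n + 1) / 12          # null variance of the rank sum
    z = (w - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal approximation
    return z, p_value

def sample_near_pole(n, spread, rng):
    """Hypothetical generator: points scattered around the north pole of the sphere."""
    pts = []
    for _ in range(n):
        v = (rng.gauss(0, spread), rng.gauss(0, spread), 1.0)
        norm = math.sqrt(sum(c * c for c in v))
        pts.append(tuple(c / norm for c in v))
    return pts

rng = random.Random(0)
tight = sample_near_pole(40, 0.1, rng)   # small dispersion
wide = sample_near_pole(40, 0.8, rng)    # large dispersion
z, p_value = depth_rank_test(tight, wide)
print(z, p_value)
```

Because the more dispersed sample tends to receive lower depth values, a difference in scale shows up as a shift in the depth ranks, which is the kind of signal a rank-sum statistic can pick up.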
In Chapter 3, we propose robust and nonparametric single and multiple change point detection methods for stochastic processes defined on manifolds. These methods use a variant of the CUSUM statistic that is based on the ranks of data depth values. We demonstrate that changes in the ranks of depth values can be used to detect a change in the distribution of data lying on manifolds. To detect more than one change point, we combine the binary segmentation and wild binary segmentation algorithms with the proposed data depth rank CUSUM statistic. We demonstrate that both algorithms yield consistent estimators of the number and locations of the change points. In addition to asymptotic results, we develop sharp nonasymptotic bounds for the single and multiple change point estimators. These test statistics can be applied within both intrinsic and extrinsic manifold analysis frameworks. In simulations, we compare our methods against several methods from the literature and demonstrate that the proposed methods outperform their competitors in some situations where the dataset is contaminated with outliers. We also present applications of our methods to vehicle health monitoring, traffic monitoring on highways, and mall pedestrian surveillance.
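A minimal sketch of a rank-based CUSUM for a single change point is given below. It assumes that a depth value has already been computed for each observation, and the standardization used here (centering ranks at (n + 1)/2 and scaling by the standard deviation of a uniform rank) is one common choice rather than necessarily the variant developed in the thesis. The function name `rank_cusum` is hypothetical.

```python
import math

def rank_cusum(depths):
    """Return (k_hat, statistic) for a single change point in a depth sequence.

    k_hat is the number of observations before the estimated change.
    """
    n = len(depths)
    # Rank the depth values (1 = smallest); ties are not handled in this sketch.
    order = sorted(range(n), key=lambda i: depths[i])
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    center = (n + 1) / 2                   # mean of a uniform rank on 1..n
    sigma = math.sqrt((n * n - 1) / 12)    # standard deviation of that rank
    best_k, best_stat, partial = None, -1.0, 0.0
    for k in range(1, n):                  # candidate split after observation k
        partial += ranks[k - 1] - center
        stat = abs(partial) / (sigma * math.sqrt(n))
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat

# Toy sequence: 30 high depth values followed by 30 low depth values.
depths = [1.0 + 0.001 * i for i in range(30)] + [0.1 + 0.001 * i for i in range(30)]
print(rank_cusum(depths))
```

Binary segmentation would apply such a statistic recursively to the segments on either side of a detected change point; the stopping threshold is omitted here because it depends on the null distribution derived in the thesis.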
In Chapter 4, we extend these methods to the setting of statistical process monitoring. We investigate statistical process monitoring schemes on general metric spaces and propose exponentially weighted moving average (EWMA), CUSUM, and Mann-Whitney moving average Shewhart control charts based on the ranks of data depth. These methods are nonparametric, and their use of data depth ranks makes them robust to outliers. We show that when the sample size is large, our methods have simple behaviour under the null hypothesis. Because our methods are based on data depth ranks, they do not require an estimate of the covariance operator, which makes them computationally cheap. Such advantages make these methods a favorable choice for online process monitoring. We demonstrate the robustness of these methods theoretically and numerically. For a comparative study, we select several nonparametric control charts from the literature. Simulation results indicate that the proposed methods outperform their competitors in many situations in terms of out-of-control average run length, while keeping the in-control average run length at a reasonable level. We present an application of our methods to the laser powder bed fusion additive manufacturing process.
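To make the control chart idea concrete, here is a minimal sketch of an EWMA chart on depth ranks. It assumes that depth values for an in-control reference sample are available and that the rank of a new observation is the fraction of reference depths not exceeding its depth, which is roughly uniform on (0, 1) when the process is in control. The smoothing weight `lam = 0.1`, the limit width `width = 3.0`, and the function name `ewma_depth_rank_chart` are illustrative assumptions, not values or names taken from the thesis.

```python
import math

def ewma_depth_rank_chart(reference_depths, new_depths, lam=0.1, width=3.0):
    """Return the 1-based time of the first signal, or None if the chart never signals."""
    m = len(reference_depths)
    z = 0.5                                  # in-control mean of the rank
    for t, d in enumerate(new_depths, start=1):
        # Rank of the new depth relative to the reference sample, in [0, 1].
        r = sum(1 for ref in reference_depths if ref <= d) / m
        z = lam * r + (1 - lam) * z
        # EWMA variance for independent Uniform(0, 1) ranks (variance 1/12).
        var = (lam / (2 - lam)) * (1 - (1 - lam) ** (2 * t)) / 12
        lower_limit = 0.5 - width * math.sqrt(var)
        if z < lower_limit:                  # persistently low depth: outlying data
            return t
    return None

reference = [i / 100 for i in range(1, 101)]
print(ewma_depth_rank_chart(reference, [0.5] * 20))  # typical depths
print(ewma_depth_rank_chart(reference, [0.0] * 20))  # very low depths
```

Only a lower limit is used because low depth ranks indicate observations that are outlying relative to the reference sample; a two-sided version would add a symmetric upper limit.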
In Chapter 5, we present some possible directions for future research related to dynamic network and longitudinal data analysis on Riemannian manifolds. It is anticipated that the contributions achieved in this thesis will be applicable to a wide range of interdisciplinary research problems.