Manifold-Aware Regularization for Self-Supervised Representation Learning

Advisor

Fieguth, Paul

Publisher

University of Waterloo

Abstract

Self-supervised learning (SSL) has emerged as a dominant paradigm for representation learning, yet much of its recent progress has been guided by empirical heuristics rather than unifying theoretical principles. This thesis advances the understanding of SSL by framing representation learning as a problem of geometry preservation on the data manifold, where the objective is to shape embedding spaces that respect intrinsic structure while remaining discriminative for downstream tasks. We develop a suite of methods—ranging from optimal transport–regularized contrastive learning (SinSim) to kernelized variance–invariance–covariance regularization (Kernel VICReg)—that systematically move beyond the Euclidean metric paradigm toward geometry-adaptive distances and statistical dependency measures, such as the maximum mean discrepancy (MMD) and the Hilbert–Schmidt independence criterion (HSIC). Our contributions span both theory and practice. Theoretically, we unify contrastive and non-contrastive SSL objectives under a manifold-aware regularization framework, revealing deep connections between dependency reduction, spectral geometry, and invariance principles. We also challenge the pervasive assumption that Euclidean distance is the canonical measure for alignment, showing that embedding metrics are themselves learnable design choices whose compatibility with the manifold geometry critically affects representation quality. Practically, we validate our framework across diverse domains—including natural images and structured scientific data—demonstrating improvements in downstream generalization, robustness to distribution shift, and stability under limited augmentations. By integrating geometric priors, kernel methods, and distributional alignment into SSL, this work reframes representation learning as a principled interaction between statistical dependence control and manifold geometry. The thesis concludes by identifying open theoretical questions at the intersection of Riemannian geometry, kernel theory, and self-supervised objectives, outlining a research agenda for the next generation of geometry-aware foundation models.
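For concreteness, the sketch below illustrates two of the standard building blocks the abstract names: the VICReg objective (variance, invariance, and covariance terms, following Bardes et al., 2022) and a kernel-based dependence measure (the biased empirical HSIC estimator with RBF kernels, following Gretton et al., 2005). This is a minimal illustration written for this page, not the thesis's implementation; the hyperparameter values, the RBF bandwidth, and all function names are assumptions.

```python
# Illustrative sketch only, not the thesis's code: the standard VICReg loss
# and a biased empirical HSIC estimator with RBF kernels. Hyperparameters
# (lambda_, mu, nu, sigma) are assumed values, not the thesis's settings.
import torch
import torch.nn.functional as F


def vicreg_loss(z_a, z_b, lambda_=25.0, mu=25.0, nu=1.0, eps=1e-4):
    """Variance-invariance-covariance regularization over two embedding views."""
    n, d = z_a.shape
    # Invariance term: mean-squared distance between the two views.
    inv = F.mse_loss(z_a, z_b)
    # Variance term: hinge that keeps each embedding dimension's std above 1.
    std_a = torch.sqrt(z_a.var(dim=0) + eps)
    std_b = torch.sqrt(z_b.var(dim=0) + eps)
    var = torch.mean(torch.relu(1.0 - std_a)) + torch.mean(torch.relu(1.0 - std_b))
    # Covariance term: penalize off-diagonal entries of the covariance matrix.
    za = z_a - z_a.mean(dim=0)
    zb = z_b - z_b.mean(dim=0)
    cov_a = (za.T @ za) / (n - 1)
    cov_b = (zb.T @ zb) / (n - 1)
    off_diag = lambda c: (c - torch.diag(torch.diag(c))).pow(2).sum() / d
    cov = off_diag(cov_a) + off_diag(cov_b)
    return lambda_ * inv + mu * var + nu * cov


def hsic_rbf(x, y, sigma=1.0):
    """Biased empirical HSIC with RBF kernels: trace(KHLH) / (n-1)^2."""
    n = x.shape[0]

    def rbf(a):
        sq = torch.cdist(a, a).pow(2)  # pairwise squared Euclidean distances
        return torch.exp(-sq / (2 * sigma ** 2))

    k, l = rbf(x), rbf(y)
    h = torch.eye(n) - torch.ones(n, n) / n  # centering matrix
    return torch.trace(k @ h @ l @ h) / (n - 1) ** 2


if __name__ == "__main__":
    z_a, z_b = torch.randn(256, 128), torch.randn(256, 128)
    print(vicreg_loss(z_a, z_b).item(), hsic_rbf(z_a, z_b).item())
```

A kernelized variant in the spirit of Kernel VICReg would replace the Euclidean covariance penalty with kernel dependence terms of this kind; the exact formulation is the thesis's contribution and is not reproduced here.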
