Advanced Seminar "Mathematics of Machine Learning and Applied Analysis" - M.Sc. Maximilian Fleissner
Self-Supervised Learning with Few Samples
| Date: | 15.07.2026, 10:15 - 11:15 |
| Category: | Event |
| Location: | Hubland Nord, Building 30, 02.003 |
| Organizer: | Chair of Mathematics III (Machine Learning) |
| Speaker: | Maximilian Fleissner, Technische Universität München |
The modern paradigm of self-supervised learning (SSL) aims to learn low-dimensional representations from large sets of complex, unlabeled data. The representations can subsequently be used to solve various downstream prediction tasks with a much smaller set of labeled data. SSL often makes use of data augmentations, e.g. random masking of certain pixels of an image. Intuitively, such data augmentations should be irrelevant for downstream prediction tasks, and so the representations in SSL can be understood as projections to a low-dimensional subspace that is invariant under such data augmentations. In our work, we formalize SSL and its underlying assumptions in the context of linear models and shallow neural networks. We prove that learning an invariant subspace (suitably formalized) is often possible at a much faster rate if we reuse data augmentations across all samples, despite this introducing statistical dependency into the samples used to estimate it. We also show how these improved rates yield better performance on downstream regression tasks.
