Learning from Imperfect Data: Noisy Labels, Truncation, and Coarsening

Author: Vasilis Kontonis (Ph.D.)
Release: 2023
ISBN-10: OCLC:1410731106

Book Synopsis: Learning from Imperfect Data: Noisy Labels, Truncation, and Coarsening, by Vasilis Kontonis (Ph.D.)

Download or read Learning from Imperfect Data: Noisy Labels, Truncation, and Coarsening, written by Vasilis Kontonis (Ph.D.) and released in 2023. Available in PDF, EPUB, and Kindle.

Book excerpt: The datasets used in machine learning and statistics are huge and often imperfect; e.g., they contain corrupted data, examples with wrong labels, or hidden biases. Most existing approaches (i) produce unreliable results when the datasets are corrupted, (ii) are computationally inefficient, or (iii) come without any theoretical/provable performance guarantees. In this thesis, we design learning algorithms that are computationally efficient and at the same time provably reliable, even when run on imperfect datasets.

We first focus on supervised learning settings with noisy labels. We present efficient and optimal learners under the semi-random noise models of Massart and Tsybakov, where the true label of each example is flipped with probability at most 50%, and an efficient approximate learner under adversarial label noise, where a small but arbitrary fraction of labels is flipped, under structured feature distributions. Beyond classification, we extend these results to noisy label ranking.

In truncated statistics, the learner does not observe a representative set of samples from the whole population, but only truncated samples, i.e., samples from a potentially small subset of the support of the population distribution. We give the first efficient algorithms for learning Gaussian distributions with unknown truncation sets and initiate the study of non-parametric truncated statistics.

Closely related to truncation is data coarsening, where instead of observing the class of an example, the learner receives a set of potential classes, one of which is guaranteed to be the correct class. We initiate the theoretical study of this problem and present the first efficient algorithms for learning from coarse data.
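To make the three imperfect-data models in the synopsis concrete, the short Python sketch below corrupts synthetic data in each of the three ways: Massart-style bounded label noise, truncation to a survival set, and coarsening of class labels into candidate sets. The specific distributions, the box-shaped truncation set, and the helper coarsen() are illustrative assumptions, not constructions taken from the thesis.

import numpy as np

rng = np.random.default_rng(0)

# Clean data: binary labels given by a halfspace sign(<w*, x>) over Gaussian features.
d, n = 5, 10_000
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_star)                      # true labels in {-1, +1}

# (i) Massart (bounded) label noise: each label is flipped independently with
#     probability eta(x) <= eta_max < 1/2.  The per-example rates below are an
#     illustrative assumption.
eta_max = 0.4
eta = eta_max * rng.uniform(size=n)          # flip probabilities, all below 1/2
flips = rng.uniform(size=n) < eta
y_massart = np.where(flips, -y, y)

# (ii) Truncation: only samples that fall in a survival set S are observed.
#      Here S is an (assumed) axis-aligned box; everything outside is discarded.
in_S = np.all(np.abs(X) < 1.5, axis=1)
X_trunc, y_trunc = X[in_S], y[in_S]

# (iii) Coarsening: instead of the true class, the learner sees a set of
#       candidate classes guaranteed to contain the true one (multiclass example).
num_classes = 10
z = rng.integers(num_classes, size=n)        # true multiclass labels

def coarsen(label, num_extra=2):
    """Return a candidate set: the true label plus `num_extra` random decoys."""
    decoys = rng.choice([c for c in range(num_classes) if c != label],
                        size=num_extra, replace=False)
    return {int(label), *(int(c) for c in decoys)}

coarse_labels = [coarsen(z_i) for z_i in z[:5]]

print(f"fraction of flipped labels: {flips.mean():.3f}")
print(f"fraction of samples surviving truncation: {in_S.mean():.3f}")
print("first few coarse label sets:", coarse_labels)

In the Massart model the flip probability may vary with the example but stays bounded away from 1/2, which is what the per-example eta array mimics; the adversarial-noise setting mentioned in the synopsis would instead allow an arbitrary (small) fraction of labels to be flipped with no per-example bound.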


Learning from Imperfect Data: Noisy Labels, Truncation, and Coarsening Related Books

Learning from Imperfect Data: Noisy Labels, Truncation, and Coarsening
Language: en
Pages: 0
Authors: Vasilis Kontonis (Ph.D.)
Categories:
Type: BOOK - Published: 2023 - Publisher:

The datasets used in machine learning and statistics are huge and often imperfect; e.g., they contain corrupted data, examples with wrong labels, or hidden biases.
Reinforcement Learning, second edition
Language: en
Pages: 549
Authors: Richard S. Sutton
Categories: Computers
Type: BOOK - Published: 2018-11-13 - Publisher: MIT Press

The significantly expanded and updated new edition of a widely used text on reinforcement learning, one of the most active research areas in artificial intelligence.
Federated Learning
Language: en
Pages: 291
Authors: Qiang Yang
Categories: Computers
Type: BOOK - Published: 2020-11-25 - Publisher: Springer Nature

This book provides a comprehensive and self-contained introduction to federated learning, ranging from the basic knowledge and theories to various key applications.
Machine Learning Algorithms
Language: en
Pages: 352
Authors: Giuseppe Bonaccorso
Categories: Computers
Type: BOOK - Published: 2017-07-24 - Publisher: Packt Publishing Ltd

Build a strong foundation for entering the world of machine learning and data science with the help of this comprehensive guide.
Toward Category-Level Object Recognition
Language: en
Pages: 622
Authors: Jean Ponce
Categories: Computers
Type: BOOK - Published: 2007-01-25 - Publisher: Springer

This post-event proceedings volume contains selected papers based on presentations given, and vivid discussions held, during two workshops.