Multilabel Classification with Partial Abstention: Bayes-Optimal Prediction under Label Independence
Main Article Content
Abstract
In contrast to conventional (single-label) classification, the setting of multilabel classification (MLC) allows an instance to belong to several classes simultaneously. Thus, instead of selecting a single class label, predictions take the form of a subset of all labels. In this paper, we study an extension of the setting of MLC, in which the learner is allowed to partially abstain from a prediction, that is, to deliver predictions on some but not necessarily all class labels. This option is useful in cases of uncertainty, where the learner does not feel confident enough on the entire label set. Adopting a decision-theoretic perspective, we propose a formal framework of MLC with partial abstention, which builds on two main building blocks: First, the extension of underlying MLC loss functions so as to accommodate abstention in a proper way, and second the problem of optimal prediction, that is, finding the Bayes-optimal prediction minimizing this generalized loss in expectation. It is well known that different (generalized) loss functions may have different risk-minimizing predictions, and finding the Bayes predictor typically comes down to solving a computationally complexity optimization problem. In the most general case, given a prediction of the (conditional) joint distribution of possible labelings, the minimizer of the expected loss needs to be found over a number of candidates which is exponential in the number of class labels. We elaborate on properties of risk minimizers for several commonly used (generalized) MLC loss functions, show them to have a specific structure, and leverage this structure to devise efficient methods for computing Bayes predictors. Experimentally, we show MLC with partial abstention to be effective in the sense of reducing loss when being allowed to abstain.