Data Mining and Knowledge Discovery vol:30 issue:2 pages:313-341
In multi-instance learning, instances are organized into bags, and
a bag is labeled positive if it contains at least one positive instance, and negative
otherwise; the labels of the individual instances are not given. The task
is to learn a classifier from this limited information. While the original task
description involved learning an instance classifier, in the literature the task
is often interpreted as learning a bag classifier. Depending on which of these
two interpretations is used, it is more natural to evaluate classifiers according
to how well they predict, respectively, instance labels or bag labels. In the
literature, however, the two interpretations are often mixed, or the intended
interpretation is left implicit. In this paper, we investigate the difference between
bag-level and instance-level accuracy, both analytically and empirically.
We show that there is a substantial difference between the two, and that better
performance on one does not necessarily imply better performance on the
other. It is therefore useful to clearly distinguish them, and always use the
evaluation criterion most relevant for the task at hand.
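
As a small illustration of the distinction drawn above, the following Python sketch computes both criteria on a hand-made toy dataset. It is not taken from the paper's experiments: the data, the two hypothetical classifiers (PRED_A, PRED_B), and all function names are assumptions made for illustration. It also assumes that a bag prediction is derived from instance predictions by the same "at least one positive instance" rule that defines the bag labels.

```python
from typing import List

Bag = List[int]  # instance labels in one bag: 1 = positive, 0 = negative


def bag_label(instance_labels: Bag) -> int:
    """A bag is positive iff it contains at least one positive instance."""
    return int(any(instance_labels))


def instance_accuracy(true_bags: List[Bag], pred_bags: List[Bag]) -> float:
    """Fraction of individual instances whose label is predicted correctly."""
    pairs = [(t, p) for tb, pb in zip(true_bags, pred_bags) for t, p in zip(tb, pb)]
    return sum(t == p for t, p in pairs) / len(pairs)


def bag_accuracy(true_bags: List[Bag], pred_bags: List[Bag]) -> float:
    """Fraction of bags whose derived bag label is predicted correctly."""
    hits = sum(bag_label(tb) == bag_label(pb) for tb, pb in zip(true_bags, pred_bags))
    return hits / len(true_bags)


# Toy ground truth (illustrative only): four bags of four instances each.
TRUE_BAGS = [[1, 0, 0, 0], [0, 0, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]

# Hypothetical classifier A: predicts every instance as negative.
PRED_A = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

# Hypothetical classifier B: flags the right bags but the wrong instances.
PRED_B = [[0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 0]]

if __name__ == "__main__":
    for name, pred in (("A", PRED_A), ("B", PRED_B)):
        print(
            name,
            "instance accuracy:", instance_accuracy(TRUE_BAGS, pred),
            "bag accuracy:", bag_accuracy(TRUE_BAGS, pred),
        )
```

On this toy data, classifier A is more accurate at the instance level while classifier B is more accurate at the bag level, so ranking the two classifiers depends on which criterion is chosen.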