Deakin University

File(s) under permanent embargo

Cost effective multi-label active learning via querying subexamples

conference contribution
posted on 2018-01-01, 00:00 authored by X Chen, G Yu, C Domeniconi, J Wang, Z Li, Zili Zhang
Multi-label active learning addresses the scarcity of labeled examples by querying the most valuable unlabeled examples, or example-label pairs, to achieve better performance at a limited query cost. Current multi-label active learning methods require scrutiny of the whole example in order to obtain its annotation. In contrast, one can find positive evidence with respect to a label by examining specific patterns (i.e., subexamples) rather than the whole example, which makes the annotation process more efficient. Based on this observation, we propose a novel two-stage cost-effective multi-label active learning framework, called CMAL. In the first stage, a novel example-label pair selection strategy is introduced. Our strategy leverages the label correlation and label space sparsity of multi-label examples to select the most uncertain example-label pairs. Specifically, an unknown relevant label of an example can be inferred from the correlated labels that are already assigned to the example, which reduces the uncertainty of the unknown label. In addition, the more relevant examples a particular label has, the smaller the uncertainty of that label. In the second stage, CMAL queries the most plausible positive subexample-label pairs of the selected example-label pairs. Comprehensive experiments on multi-label datasets collected from different domains demonstrate the effectiveness of our proposed approach for cost-effective queries. We also show that leveraging label correlation and label sparsity contributes to saving costs.
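
The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the CMAL formulation from the paper: the scoring formula, the function names (pair_uncertainty, select_pair, select_subexample), and all inputs are assumptions chosen only to mirror the ideas stated above, namely that correlated labels already assigned to an example lower the uncertainty of an unknown label, that labels with more relevant examples are less uncertain, and that the second stage queries the most plausibly positive subexample.

```python
def pair_uncertainty(p, support, n_pos):
    """Hypothetical uncertainty score for one example-label pair.

    p       : predicted relevance probability (assumed classifier output)
    support : summed correlation with labels already assigned to the example
    n_pos   : number of known relevant examples for this label
    """
    base = 1.0 - abs(2.0 * p - 1.0)          # highest when p is 0.5
    damp = 1.0 - min(support, 1.0)            # correlated labels reduce uncertainty
    return base * damp / (1.0 + n_pos)        # frequent labels are less uncertain


def select_pair(probs, known, corr, n_pos):
    """Stage 1 (illustrative): return the most uncertain unknown (example, label) pair.

    known uses 1 = relevant, 0 = irrelevant, -1 = unknown.
    Returns None when no unknown pair remains.
    """
    best, best_score = None, -1.0
    for i, row in enumerate(known):
        for j, y in enumerate(row):
            if y != -1:
                continue  # only unknown pairs can be queried
            support = sum(corr[k][j] for k, yk in enumerate(row) if yk == 1)
            score = pair_uncertainty(probs[i][j], support, n_pos[j])
            if score > best_score:
                best, best_score = (i, j), score
    return best


def select_subexample(sub_probs):
    """Stage 2 (illustrative): index of the subexample most likely to be positive."""
    return max(range(len(sub_probs)), key=lambda s: sub_probs[s])
```

Under these assumptions, the annotator would be shown only the subexample returned by select_subexample for the pair chosen by select_pair, rather than the whole example.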

History

Event

IEEE Computer Society. Conference (2018 : Singapore)

Series

IEEE Computer Society Conference

Pagination

905 - 910

Publisher

Institute of Electrical and Electronics Engineers

Location

Singapore

Place of publication

Piscataway, N.J.

Start date

2018-11-17

End date

2018-11-20

ISSN

1550-4786

ISBN-13

9781538691588

Language

eng

Publication classification

E1 Full written paper - refereed

Copyright notice

2018, IEEE

Editor/Contributor(s)

[Unknown]

Title of proceedings

ICDM 2018 : Proceedings of the 2018 IEEE International Conference on Data Mining
