Deakin University

Characterizing adversarial subspaces using local intrinsic dimensionality

Version 2 2024-06-06, 10:41
Version 1 2018-01-01, 00:00
conference contribution
posted on 2024-06-06, 10:41 authored by X Ma, B Li, Y Wang, SM Erfani, S Wijewickrema, G Schoenebeck, D Song, ME Houle, J Bailey
© Learning Representations, ICLR 2018 - Conference Track Proceedings. All rights reserved.

Deep Neural Networks (DNNs) have recently been shown to be vulnerable to adversarial examples, which are carefully crafted instances that can mislead DNNs into making errors during prediction. To better understand such attacks, a characterization is needed of the properties of the regions (the so-called ‘adversarial subspaces’) in which adversarial examples lie. We tackle this challenge by characterizing the dimensional properties of adversarial regions, via the use of Local Intrinsic Dimensionality (LID). LID assesses the space-filling capability of the region surrounding a reference example, based on the distance distribution of the example to its neighbors. We first explain how adversarial perturbation can affect the LID characteristic of adversarial regions, and then show empirically that LID characteristics can facilitate the distinction of adversarial examples generated using state-of-the-art attacks. As a proof of concept, we show that a potential application of LID is to distinguish adversarial examples, and the preliminary results show that it can outperform several state-of-the-art detection measures by large margins for the five attack strategies considered in this paper across three benchmark datasets. Our analysis of the LID characteristic for adversarial regions not only motivates new directions for effective adversarial defense, but also opens up more challenges for developing new attacks to better understand the vulnerabilities of DNNs.
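
The abstract's description of LID as a quantity derived from the distances between a reference example and its neighbors can be illustrated with a small sketch. The code below assumes the standard maximum-likelihood estimator over the k nearest neighbors; the function name lid_mle, the parameter k, and the synthetic data are illustrative assumptions, not taken from this record, and the sketch is not presented as the authors' implementation.

```python
import math
import random


def lid_mle(reference, points, k=100):
    """Hypothetical helper: maximum-likelihood LID estimate at `reference`.

    Uses the Euclidean distances from `reference` to its k nearest
    neighbours among `points` (exact duplicates at distance 0 are skipped).
    """
    dists = sorted(d for d in (math.dist(reference, p) for p in points) if d > 0.0)
    neighbours = dists[:k]
    if len(neighbours) < 2:
        raise ValueError("need at least two distinct neighbours")
    r_max = neighbours[-1]
    # Estimator: -( mean_i log(r_i / r_max) )^(-1)
    mean_log_ratio = sum(math.log(r / r_max) for r in neighbours) / len(neighbours)
    if mean_log_ratio == 0.0:
        raise ValueError("all neighbours are equidistant; estimate undefined")
    return -1.0 / mean_log_ratio


# Synthetic illustration (assumed data): a 2-D plane embedded in 6-D space
# versus points that fill the whole 6-D cube around the same reference point.
rng = random.Random(0)
n, dim = 5000, 6
origin = [0.0] * dim

plane = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] + [0.0] * (dim - 2) for _ in range(n)]
cube = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]

lid_plane = lid_mle(origin, plane)
lid_cube = lid_mle(origin, cube)
print(f"LID on 2-D plane: {lid_plane:.2f}")
print(f"LID in 6-D cube:  {lid_cube:.2f}")
```

In this toy setting the neighborhood that fills more directions around the reference point yields the larger estimate, which is the "space-filling" intuition the abstract refers to.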

History

Location

Vancouver, British Columbia

Language

eng

Publication classification

E1.1 Full written paper - refereed

Pagination

1-15

Start date

2018-04-30

End date

2018-05-03

Title of proceedings

ICLR 2018 : Proceedings of the 6th International Conference on Learning Representations

Event

Learning Representations. Conference (2018 : 6th : Vancouver, British Columbia)

Publisher

ICLR

Place of publication

[Vancouver, British Columbia]
