Privacy aware K-means clustering with high utility

Nguyen, TD; Gupta, Sunil; Rana, Santu; Venkatesh, Svetha

chapter

posted on 2024-06-05, 11:49 authored by TD Nguyen, Sunil Gupta, Santu Rana, Svetha Venkatesh

Privacy-preserving data mining aims to keep data safe yet useful, but algorithms that provide strong guarantees often end up with low utility. We propose a novel privacy-preserving framework that prevents an adversary from inferring an unknown data point by ensuring that the estimation error is almost invariant to the inclusion or exclusion of that data point. By focusing directly on the estimation error of the data point, our framework significantly lowers the perturbation required. We use this framework to propose a new privacy aware K-means clustering algorithm. Using both synthetic and real datasets, we demonstrate that the utility of this algorithm is almost equal to that of unperturbed K-means and, at strict privacy levels, almost twice that of its differential privacy counterpart.

History

Volume

9652

Chapter number

31

Pagination

388-400

Publisher DOI

https://doi.org/10.1007/978-3-319-31750-2_31

ISSN

0302-9743

ISBN-13

9783319317533

Language

eng

Publication classification

B Book chapter, B1 Book chapter

Copyright notice

2016, Springer

Extent

44

Editor/Contributor(s)

Bailey J, Khan L, Washio T, Dobbie G, Huang JZ, Wang R

Publisher

Springer

Place of publication

Berlin, Germany

Title of book

Advances in knowledge discovery and data mining: 20th Pacific-Asia Conference, PAKDD 2016 Auckland, New Zealand, April 19-22, 2016 proceedings, part I

Series

Lecture notes in artificial intelligence; v.9652

Keywords

080109 Pattern Recognition and Data Mining; 970108 Expanding Knowledge in the Information and Computing Sciences; Centre for Pattern Recognition and Data Analytics; 4604 Cybersecurity and privacy; 4605 Data management and data science
