File(s) under permanent embargo
Privacy aware K-means clustering with high utility
chapter
posted on 2016-04-12, 00:00 authored by Thanh Dai Nguyen, Sunil Gupta, Santu Rana, Svetha Venkatesh
Privacy-preserving data mining aims to keep data safe, yet useful. But algorithms providing strong guarantees often end up with low utility. We propose a novel privacy-preserving framework that thwarts an adversary from inferring an unknown data point by ensuring that the estimation error is almost invariant to the inclusion/exclusion of the data point. By focusing directly on the estimation error of the data point, our framework is able to significantly lower the perturbation required. We use this framework to propose a new privacy-aware K-means clustering algorithm. Using both synthetic and real datasets, we demonstrate that the utility of this algorithm is almost equal to that of the unperturbed K-means and, at strict privacy levels, almost twice as good as that of the differential privacy counterpart.
Title of book
Advances in knowledge discovery and data mining: 20th Pacific-Asia Conference, PAKDD 2016 Auckland, New Zealand, April 19-22, 2016 proceedings, part I
Volume
9652
Series
Lecture notes in artificial intelligence; v.9652
Chapter number
31
Pagination
388 - 400
Publisher
Springer
Place of publication
Berlin, Germany
Publisher DOI
ISSN
0302-9743
ISBN-13
9783319317533
Language
eng
Publication classification
B Book chapter; B1 Book chapter
Copyright notice
2016, Springer
Extent
44
Editor/Contributor(s)
J Bailey, L Khan, T Washio, G Dobbie, J Huang, R Wang