File(s) under permanent embargo
A differentially private algorithm for location data release
journal contributionposted on 2016-06-01, 00:00 authored by P Xiong, Tianqing Zhu, W Niu, Gang LiGang Li
The rise of mobile technologies in recent years has led to large volumes of location information, which are valuable resources for knowledge discovery such as travel patterns mining and traffic analysis. However, location dataset has been confronted with serious privacy concerns because adversaries may re-identify a user and his/her sensitivity information from these datasets with only a little background knowledge. Recently, several privacy-preserving techniques have been proposed to address the problem, but most of them lack a strict privacy notion and can hardly resist the number of possible attacks. This paper proposes a private release algorithm to randomize location dataset in a strict privacy notion, differential privacy, with the goal of preserving users’ identities and sensitive information. The algorithm aims to mask the exact locations of each user as well as the frequency that the user visits the locations with a given privacy budget. It includes three privacy-preserving operations: private location clustering shrinks the randomized domain and cluster weight perturbation hides the weights of locations, while private location selection hides the exact locations of a user. Theoretical analysis on privacy and utility confirms an improved trade-off between privacy and utility of released location data. Extensive experiments have been carried out on four real-world datasets, GeoLife, Flickr, Div400 and Instagram. The experimental results further suggest that this private release algorithm can successfully retain the utility of the datasets while preserving users’ privacy.