
Journal of Mathematical Techniques and Computational Mathematics (JMTCM)

ISSN: 2834-7706 | DOI: 10.33140/JMTCM

Impact Factor: 1.3

Augmented Differential Privacy Framework for Data Analytics

Abstract

P. H. Anantha Desik and Sumiran Naman

Differential privacy has emerged as a popular framework for providing privacy-preserving noisy answers to queries based on the statistical properties of databases. It guarantees that the distribution of noisy query answers changes very little with the addition or deletion of any tuple. Differential privacy is valued because it provides privacy without making assumptions about the data and protects against attackers who know all but one record. It is nevertheless a relatively new field of research: most users have limited experience in managing differential privacy parameters and in achieving a suitable level of privacy without affecting the quality of the analysis, and the vast majority are still learning how to apply differential privacy effectively in practice. In this paper, we discuss the proposed augmented framework, which produces differentially private data for any given query; the various differential privacy techniques; the metrics for the privacy and utility tradeoff of the data; and the efficacy of the framework. We discuss the state of the art of the differential privacy techniques defined in the framework (Laplace, bounded Laplace, randomized response, and exponential) for different data types. The augmented framework consists of three parts: privacy parameter inputs that let the analyst control the querying of the data interactively and iteratively; the various differential privacy techniques; and metrics that measure privacy and a utility threshold, which allow the data analyst to evaluate the accuracy of the privacy-safe data and to select privacy-guaranteed data within the given privacy budget. The framework takes any dataset as input and generates another dataset that is structurally and statistically very similar to the original dataset. The newly generated dataset has a much stronger privacy guarantee on the selected sensitive and non-sensitive data types.
We also demonstrate analytical models developed using the privacy-safe data from the framework as substitutes for models developed on the original datasets. We demonstrate the framework and the analytical models with sample datasets to show the similarity of the original and differentially private datasets.
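
As a rough illustration of two of the techniques named in the abstract, the sketch below shows a textbook Laplace mechanism for a numeric query and a textbook randomized response for a yes/no attribute. It is not the authors' framework: the function names, the `rng` parameter, and the choice of the standard library are assumptions made for this example, and the bounded Laplace and exponential mechanisms are not covered.

```python
import math
import random


def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Return true_value plus Laplace noise of scale sensitivity / epsilon.

    Hypothetical helper: the textbook mechanism for a numeric query whose
    answer changes by at most `sensitivity` when one record is added or removed.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = sensitivity / epsilon
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred at 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_value + noise


def truth_probability(epsilon):
    """Probability of reporting the true bit under randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(bit, epsilon, rng=random):
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return bit if rng.random() < truth_probability(epsilon) else (not bit)


def estimate_true_fraction(reports, epsilon):
    """Debias the observed fraction of positive reports.

    If q is the observed fraction and p the truth probability, then
    q = p * pi + (1 - p) * (1 - pi), so pi = (q - (1 - p)) / (2p - 1).
    """
    p = truth_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```

A smaller `epsilon` gives a larger noise scale (and a truth probability closer to 1/2), so each individual answer is less accurate while the privacy guarantee is stronger. This is the privacy and utility tradeoff that the framework's metrics are meant to expose to the analyst.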
