Differentially Private Neural Tangent Kernels (DP-NTK) for Privacy-Preserving Data Generation | Journal of Artificial Intelligence Research

Published: Nov 24, 2024

DOI: https://doi.org/10.1613/jair.1.15985

Keywords: machine learning, differential privacy, generative models

Yilin Yang

Kamil Adamczewski

Xiaoxiao Li

Danica J. Sutherland

Mijung Park

Technical University of Denmark

Abstract

Maximum mean discrepancy (MMD) is a particularly useful distance metric for differentially private data generation: when used with finite-dimensional features, it allows us to summarize and privatize the data distribution once, and that privatized summary can then be reused throughout generator training without further privacy loss. An important question in this framework is, then, which features are useful for distinguishing between real and synthetic data distributions, and whether they enable us to generate high-quality synthetic data. This work considers using the features of neural tangent kernels (NTKs), more precisely empirical NTKs (e-NTKs). We find that, perhaps surprisingly, the expressiveness of the untrained e-NTK features is comparable to that of perceptual features taken from networks pre-trained on public data. As a result, our method improves the privacy-accuracy trade-off compared to other state-of-the-art methods, without relying on any public data, as demonstrated on several tabular and image benchmark datasets.
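
The "privatize once, reuse during training" idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a tiny one-hidden-layer ReLU network f(x) = sum_j v[j] * relu(W[j] . x) at random initialization, takes the e-NTK feature of x to be the flattened gradient of f with respect to (W, v), normalizes each feature vector to unit L2 norm, and releases the mean feature vector with Gaussian noise under replace-one adjacency. All function names and the `noise_multiplier` parameter are hypothetical; calibrating `noise_multiplier` to a target (epsilon, delta) is left out.

```python
import math
import random


def entk_features(x, W, v):
    """Flattened gradient of f(x) = sum_j v[j] * relu(W[j] . x) w.r.t. (W, v)."""
    pre = [sum(w_k * x_k for w_k, x_k in zip(w, x)) for w in W]
    # d f / d W[j][k] = v[j] * x[k] if unit j is active, else 0 (row-major order).
    grad_W = [v_j * x_k if p > 0 else 0.0 for v_j, p in zip(v, pre) for x_k in x]
    # d f / d v[j] = relu(W[j] . x).
    grad_v = [max(p, 0.0) for p in pre]
    return grad_W + grad_v


def unit_norm(phi):
    """Scale a feature vector to unit L2 norm (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(c * c for c in phi))
    return [c / norm for c in phi] if norm > 0 else list(phi)


def mean_embedding(data, W, v):
    """Average of the normalized e-NTK features over a dataset."""
    feats = [unit_norm(entk_features(x, W, v)) for x in data]
    return [sum(col) / len(feats) for col in zip(*feats)]


def privatize(mean, n, noise_multiplier, rng):
    """Gaussian mechanism on the mean embedding (assumed replace-one adjacency).

    Every feature has norm at most 1, so swapping one record moves the mean
    by at most 2 / n in L2 norm.
    """
    sigma = noise_multiplier * 2.0 / n
    return [m + rng.gauss(0.0, sigma) for m in mean]


def mmd_squared(private_mean, synthetic, W, v):
    """Squared MMD estimate between the released mean and synthetic samples."""
    synth_mean = mean_embedding(synthetic, W, v)
    return sum((a - b) ** 2 for a, b in zip(private_mean, synth_mean))


if __name__ == "__main__":
    rng = random.Random(0)
    d, h = 4, 8  # input dimension and hidden width (arbitrary choices)
    W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(h)]
    v = [rng.gauss(0, 1) for _ in range(h)]
    real = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(200)]
    fake = [[rng.gauss(1, 1) for _ in range(d)] for _ in range(200)]
    # The private data is touched exactly once, here.
    target = privatize(mean_embedding(real, W, v), len(real), 1.0, rng)
    # The loss below can be re-evaluated on new synthetic batches at no extra privacy cost.
    print(mmd_squared(target, fake, W, v))
```

In this sketch only `privatize` reads the private data, so a generator could be trained by minimizing `mmd_squared` against the fixed `target` for as many steps as needed; in practice the gradients would come from an automatic differentiation library rather than hand-written formulas.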

Issue

Vol. 81 (2024)

Section

Articles

