Skip to main content

Showing 1–50 of 87 results for author: Ullman, J

  1. arXiv:2406.07407  [pdf, other

    cs.LG

    Private Geometric Median

    Authors: Mahdi Haghifam, Thomas Steinke, Jonathan Ullman

    Abstract: In this paper, we study differentially private (DP) algorithms for computing the geometric median (GM) of a dataset: Given $n$ points, $x_1,\dots,x_n$ in $\mathbb{R}^d$, the goal is to find a point $θ$ that minimizes the sum of the Euclidean distances to these points, i.e., $\sum_{i=1}^{n} \|θ- x_i\|_2$. Off-the-shelf methods, such as DP-GD, require strong a priori knowledge locating the data with… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 36 pages

  2. arXiv:2405.20405  [pdf, other

    cs.DS cs.CR cs.IT cs.LG stat.ML

    Private Mean Estimation with Person-Level Differential Privacy

    Authors: Sushant Agarwal, Gautam Kamath, Mahbod Majid, Argyris Mouzakis, Rose Silver, Jonathan Ullman

    Abstract: We study person-level differentially private (DP) mean estimation in the case where each person holds multiple samples. DP here requires the usual notion of distributional stability when $\textit{all}$ of a person's datapoints can be modified. Informally, if $n$ people each have $m$ samples from an unknown $d$-dimensional distribution with bounded $k$-th moments, we show that \[n = \tilde Θ\left(\… ▽ More

    Submitted 18 July, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: 72 pages, 3 figures

  3. arXiv:2402.11173  [pdf, other

    cs.LG cs.CR math.OC

    How to Make the Gradients Small Privately: Improved Rates for Differentially Private Non-Convex Optimization

    Authors: Andrew Lowy, Jonathan Ullman, Stephen J. Wright

    Abstract: We provide a simple and flexible framework for designing differentially private algorithms to find approximate stationary points of non-convex loss functions. Our framework is based on using a private approximate risk minimizer to "warm start" another private algorithm for finding stationary points. We use this framework to obtain improved, and sometimes optimal, rates for several classes of non-c… ▽ More

    Submitted 16 February, 2024; originally announced February 2024.

  4. arXiv:2312.13978  [pdf, other

    cs.LG cs.DS

    Metalearning with Very Few Samples Per Task

    Authors: Maryam Aliakbarpour, Konstantina Bairaktari, Gavin Brown, Adam Smith, Nathan Srebro, Jonathan Ullman

    Abstract: Metalearning and multitask learning are two frameworks for solving a group of related learning tasks more efficiently than we could hope to solve each of the individual tasks on their own. In multitask learning, we are given a fixed set of related learning tasks and need to output one accurate model per task, whereas in metalearning we are given tasks that are drawn i.i.d. from a metadistribution… ▽ More

    Submitted 1 April, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

  5. arXiv:2310.03838  [pdf, other

    cs.LG

    Chameleon: Increasing Label-Only Membership Leakage with Adaptive Poisoning

    Authors: Harsh Chaudhari, Giorgio Severi, Alina Oprea, Jonathan Ullman

    Abstract: The integration of machine learning (ML) in numerous critical applications introduces a range of privacy concerns for individuals who provide their datasets for model training. One such privacy risk is Membership Inference (MI), in which an attacker seeks to determine whether a particular data sample was included in the training dataset of a model. Current state-of-the-art MI attacks capitalize on… ▽ More

    Submitted 16 January, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

    Comments: To appear at International Conference on Learning Representations (ICLR) 2024

  6. arXiv:2307.07604  [pdf, ps, other

    cs.CR cs.DS cs.LG

    Smooth Lower Bounds for Differentially Private Algorithms via Padding-and-Permuting Fingerprinting Codes

    Authors: Naty Peter, Eliad Tsfadia, Jonathan Ullman

    Abstract: Fingerprinting arguments, first introduced by Bun, Ullman, and Vadhan (STOC 2014), are the most widely used method for establishing lower bounds on the sample complexity or error of approximately differentially private (DP) algorithms. Still, there are many problems in differential privacy for which we don't know suitable lower bounds, and even for problems that we do, the lower bounds are not smo… ▽ More

    Submitted 3 July, 2024; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: Accepted to COLT 2024

  7. arXiv:2307.01415  [pdf, other

    cs.DS

    Matrix Multiplication Using Only Addition

    Authors: Daniel Cussen, Jeffrey D. Ullman

    Abstract: Matrix multiplication consumes a large fraction of the time taken in many machine-learning algorithms. Thus, accelerator chips that perform matrix multiplication faster than conventional processors or even GPU's are of increasing interest. In this paper, we demonstrate a method of performing matrix multiplication without a scalar multiplier circuit. In many cases of practical interest, only a sing… ▽ More

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: 9 pages, 2 figures

    ACM Class: F.2.1

  8. arXiv:2306.01181  [pdf, other

    cs.LG cs.CR

    TMI! Finetuned Models Leak Private Information from their Pretraining Data

    Authors: John Abascal, Stanley Wu, Alina Oprea, Jonathan Ullman

    Abstract: Transfer learning has become an increasingly popular technique in machine learning as a way to leverage a pretrained model trained for one task to assist with building a finetuned model for a related task. This paradigm has been especially popular for $\textit{privacy}$ in machine learning, where the pretrained model is considered public, and only the data for finetuning is considered sensitive. H… ▽ More

    Submitted 21 March, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

  9. arXiv:2305.13440  [pdf, ps, other

    cs.DS cs.LG

    Differentially Private Medians and Interior Points for Non-Pathological Data

    Authors: Maryam Aliakbarpour, Rose Silver, Thomas Steinke, Jonathan Ullman

    Abstract: We construct differentially private estimators with low sample complexity that estimate the median of an arbitrary distribution over $\mathbb{R}$ satisfying very mild moment conditions. Our result stands in contrast to the surprising negative result of Bun et al. (FOCS 2015) that showed there is no differentially private estimator with any finite sample complexity that returns any non-trivial appr… ▽ More

    Submitted 22 May, 2023; originally announced May 2023.

  10. arXiv:2302.01855  [pdf, ps, other

    cs.LG stat.ML

    From Robustness to Privacy and Back

    Authors: Hilal Asi, Jonathan Ullman, Lydia Zakynthinou

    Abstract: We study the relationship between two desiderata of algorithms in statistical inference and machine learning: differential privacy and robustness to adversarial data corruptions. Their conceptual similarity was first observed by Dwork and Lei (STOC 2009), who observed that private algorithms satisfy robustness, and gave a general method for converting robust algorithms to private ones. However, al… ▽ More

    Submitted 3 February, 2023; originally announced February 2023.

  11. arXiv:2301.13334  [pdf, ps, other

    math.ST cs.CR cs.DS stat.ML

    A Bias-Variance-Privacy Trilemma for Statistical Estimation

    Authors: Gautam Kamath, Argyris Mouzakis, Matthew Regehr, Vikrant Singhal, Thomas Steinke, Jonathan Ullman

    Abstract: The canonical algorithm for differentially private mean estimation is to first clip the samples to a bounded range and then add noise to their empirical mean. Clipping controls the sensitivity and, hence, the variance of the noise that we add for privacy. But clipping also introduces statistical bias. We prove that this tradeoff is inherent: no algorithm can simultaneously have low bias, low varia… ▽ More

    Submitted 28 February, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

  12. arXiv:2210.15819  [pdf, other

    math.ST cs.CR cs.LG

    Instance-Optimal Differentially Private Estimation

    Authors: Audra McMillan, Adam Smith, Jon Ullman

    Abstract: In this work, we study local minimax convergence estimation rates subject to $ε$-differential privacy. Unlike worst-case rates, which may be conservative, algorithms that are locally minimax optimal must adapt to easy instances of the problem. We construct locally minimax differentially private estimators for one-parameter exponential families and estimating the tail rate of a distribution. In the… ▽ More

    Submitted 27 October, 2022; originally announced October 2022.

  13. arXiv:2209.03112  [pdf, ps, other

    cs.LG cs.DS

    Multitask Learning via Shared Features: Algorithms and Hardness

    Authors: Konstantina Bairaktari, Guy Blanc, Li-Yang Tan, Jonathan Ullman, Lydia Zakynthinou

    Abstract: We investigate the computational efficiency of multitask learning of Boolean functions over the $d$-dimensional hypercube, that are related by means of a feature representation of size $k \ll d$ shared across all tasks. We present a polynomial time multitask learning algorithm for the concept class of halfspaces with margin $γ$, which is based on a simultaneous boosting technique and requires only… ▽ More

    Submitted 7 September, 2022; originally announced September 2022.

  14. arXiv:2208.12348  [pdf, other

    cs.LG cs.CR

    SNAP: Efficient Extraction of Private Properties with Poisoning

    Authors: Harsh Chaudhari, John Abascal, Alina Oprea, Matthew Jagielski, Florian Tramèr, Jonathan Ullman

    Abstract: Property inference attacks allow an adversary to extract global properties of the training dataset from a machine learning model. Such attacks have privacy implications for data owners sharing their datasets to train machine learning models. Several existing approaches for property inference attacks against deep neural networks have been proposed, but they all rely on the attacker training a large… ▽ More

    Submitted 21 June, 2023; v1 submitted 25 August, 2022; originally announced August 2022.

    Comments: 28 pages, 16 figures

  15. arXiv:2205.06369  [pdf, other

    cs.LG cs.CR

    How to Combine Membership-Inference Attacks on Multiple Updated Models

    Authors: Matthew Jagielski, Stanley Wu, Alina Oprea, Jonathan Ullman, Roxana Geambasu

    Abstract: A large body of research has shown that machine learning models are vulnerable to membership inference (MI) attacks that violate the privacy of the participants in the training data. Most MI research focuses on the case of a single standalone model, while production machine-learning platforms often update models over time, on data that often shifts in distribution, giving the attacker more informa… ▽ More

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: 31 pages, 9 figures

  16. arXiv:2111.04609  [pdf, ps, other

    stat.ML cs.CR cs.DS cs.IT cs.LG

    A Private and Computationally-Efficient Estimator for Unbounded Gaussians

    Authors: Gautam Kamath, Argyris Mouzakis, Vikrant Singhal, Thomas Steinke, Jonathan Ullman

    Abstract: We give the first polynomial-time, polynomial-sample, differentially private estimator for the mean and covariance of an arbitrary Gaussian distribution $\mathcal{N}(μ,Σ)$ in $\mathbb{R}^d$. All previous estimators are either nonconstructive, with unbounded running time, or require the user to specify a priori bounds on the parameters $μ$ and $Σ$. The primary new technical tool in our algorithm is… ▽ More

    Submitted 11 February, 2022; v1 submitted 8 November, 2021; originally announced November 2021.

  17. arXiv:2106.13329  [pdf, ps, other

    cs.LG

    Covariance-Aware Private Mean Estimation Without Private Covariance Estimation

    Authors: Gavin Brown, Marco Gaboardi, Adam Smith, Jonathan Ullman, Lydia Zakynthinou

    Abstract: We present two sample-efficient differentially private mean estimators for $d$-dimensional (sub)Gaussian distributions with unknown covariance. Informally, given $n \gtrsim d/α^2$ samples from such a distribution with mean $μ$ and covariance $Σ$, our estimators output $\tildeμ$ such that $\| \tildeμ- μ\|_Σ \leq α$, where $\| \cdot \|_Σ$ is the Mahalanobis distance. All previous estimators with the… ▽ More

    Submitted 25 March, 2024; v1 submitted 24 June, 2021; originally announced June 2021.

    Comments: 49 pages. Appeared in NeurIPS 2021. Updated version contains improved analysis of Tukey depth mechanism: robustness guarantees, tighter error analysis, and techniques for faster implementation

  18. arXiv:2102.08598  [pdf, other

    cs.LG cs.CR cs.DS

    Leveraging Public Data for Practical Private Query Release

    Authors: Terrance Liu, Giuseppe Vietri, Thomas Steinke, Jonathan Ullman, Zhiwei Steven Wu

    Abstract: In many statistical problems, incorporating priors can significantly improve performance. However, the use of prior knowledge in differentially private query release has remained underexplored, despite such priors commonly being available in the form of public datasets, such as previous US Census releases. With the goal of releasing statistics about a private dataset, we present PMW^Pub, which --… ▽ More

    Submitted 10 June, 2021; v1 submitted 17 February, 2021; originally announced February 2021.

    Comments: ICML 2021

  19. arXiv:2102.07684   

    cs.DS cs.LG

    Fair and Optimal Cohort Selection for Linear Utilities

    Authors: Konstantina Bairaktari, Huy Le Nguyen, Jonathan Ullman

    Abstract: The rise of algorithmic decision-making has created an explosion of research around the fairness of those algorithms. While there are many compelling notions of individual fairness, beginning with the work of Dwork et al., these notions typically do not satisfy desirable composition properties. To this end, Dwork and Ilvento introduced the fair cohort selection problem, which captures a specific a… ▽ More

    Submitted 5 October, 2022; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: This paper has been subsumed by the arXiv paper arXiv:2009.02207

  20. arXiv:2009.08000  [pdf, other

    cs.DS cs.CR cs.LG

    The Limits of Pan Privacy and Shuffle Privacy for Learning and Estimation

    Authors: Albert Cheu, Jonathan Ullman

    Abstract: There has been a recent wave of interest in intermediate trust models for differential privacy that eliminate the need for a fully trusted central data collector, but overcome the limitations of local differential privacy. This interest has led to the introduction of the shuffle model (Cheu et al., EUROCRYPT 2019; Erlingsson et al., SODA 2019) and revisiting the pan-private model (Dwork et al., IT… ▽ More

    Submitted 3 December, 2020; v1 submitted 16 September, 2020; originally announced September 2020.

    Comments: Corrected a proof in the Appendix. Added a reference to parallel and concurrent work

  21. arXiv:2009.02207  [pdf, ps, other

    cs.DS cs.LG

    Fair and Useful Cohort Selection

    Authors: Konstantina Bairaktari, Paul Langton, Huy L. Nguyen, Niklas Smedemark-Margulies, Jonathan Ullman

    Abstract: A challenge in fair algorithm design is that, while there are compelling notions of individual fairness, these notions typically do not satisfy desirable composition properties, and downstream applications based on fair classifiers might not preserve fairness. To study fairness under composition, Dwork and Ilvento introduced an archetypal problem called fair-cohort-selection problem, where a singl… ▽ More

    Submitted 6 April, 2022; v1 submitted 4 September, 2020; originally announced September 2020.

    Comments: This is a merger of the previous version and arXiv:2102.07684

  22. arXiv:2006.07709  [pdf, other

    cs.CR cs.LG

    Auditing Differentially Private Machine Learning: How Private is Private SGD?

    Authors: Matthew Jagielski, Jonathan Ullman, Alina Oprea

    Abstract: We investigate whether Differentially Private SGD offers better privacy in practice than what is guaranteed by its state-of-the-art analysis. We do so via novel data poisoning attacks, which we show correspond to realistic privacy attacks. While previous work (Ma et al., arXiv 2019) proposed this connection between differential privacy and data poisoning as a defense against data poisoning, our us… ▽ More

    Submitted 13 June, 2020; originally announced June 2020.

  23. arXiv:2006.06618  [pdf, other

    stat.ML cs.CR cs.DS cs.IT cs.LG math.ST

    CoinPress: Practical Private Mean and Covariance Estimation

    Authors: Sourav Biswas, Yihe Dong, Gautam Kamath, Jonathan Ullman

    Abstract: We present simple differentially private estimators for the mean and covariance of multivariate sub-Gaussian data that are accurate at small sample sizes. We demonstrate the effectiveness of our algorithms both theoretically and empirically using synthetic and real-world datasets -- showing that their asymptotic error rates match the state-of-the-art theoretical bounds, and that they concretely ou… ▽ More

    Submitted 9 October, 2022; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: Code is available at https://github.com/twistedcubic/coin-press

  24. arXiv:2005.06154  [pdf, other

    cs.DB cs.CR cs.DC cs.IR

    Panda: Partitioned Data Security on Outsourced Sensitive and Non-sensitive Data

    Authors: Sharad Mehrotra, Shantanu Sharma, Jeffrey D. Ullman, Dhrubajyoti Ghosh, Peeyush Gupta

    Abstract: Despite extensive research on cryptography, secure and efficient query processing over outsourced data remains an open challenge. This paper continues along with the emerging trend in secure data processing that recognizes that the entire dataset may not be sensitive, and hence, non-sensitivity of data can be exploited to overcome limitations of existing encryption-based approaches. We, first, pro… ▽ More

    Submitted 13 May, 2020; originally announced May 2020.

    Comments: This version has been accepted in ACM Transactions on Management Information Systems. The final published version of this paper may differ from this accepted version. A preliminary version of this paper [arXiv:1812.09233] was accepted and presented in IEEE ICDE 2019

  25. arXiv:2005.00010  [pdf, other

    stat.ML cs.CR cs.DS cs.IT cs.LG

    A Primer on Private Statistics

    Authors: Gautam Kamath, Jonathan Ullman

    Abstract: Differentially private statistical estimation has seen a flurry of developments over the last several years. Study has been divided into two schools of thought, focusing on empirical statistics versus population statistics. We suggest that these two lines of work are more similar than different by giving examples of methods that were initially framed for empirical statistics, but can be applied ju… ▽ More

    Submitted 30 April, 2020; originally announced May 2020.

    Comments: 20 pages. Comments welcome

  26. arXiv:2004.10941  [pdf, other

    cs.LG stat.ML

    Private Query Release Assisted by Public Data

    Authors: Raef Bassily, Albert Cheu, Shay Moran, Aleksandar Nikolov, Jonathan Ullman, Zhiwei Steven Wu

    Abstract: We study the problem of differentially private query release assisted by access to public data. In this problem, the goal is to answer a large class $\mathcal{H}$ of statistical queries with error no more than $α$ using a combination of public and private samples. The algorithm is required to satisfy differential privacy only with respect to the private samples. We study the limits of this task in… ▽ More

    Submitted 22 April, 2020; originally announced April 2020.

  27. arXiv:2002.09464  [pdf, other

    cs.DS cs.CR cs.IT cs.LG stat.ML

    Private Mean Estimation of Heavy-Tailed Distributions

    Authors: Gautam Kamath, Vikrant Singhal, Jonathan Ullman

    Abstract: We give new upper and lower bounds on the minimax sample complexity of differentially private mean estimation of distributions with bounded $k$-th moments. Roughly speaking, in the univariate case, we show that $n = Θ\left(\frac{1}{α^2} + \frac{1}{α^{\frac{k}{k-1}}\varepsilon}\right)$ samples are necessary and sufficient to estimate the mean to $α$-accuracy under $\varepsilon$-differential privacy… ▽ More

    Submitted 16 February, 2021; v1 submitted 21 February, 2020; originally announced February 2020.

    Comments: Appeared in COLT 2020

  28. arXiv:1911.08339  [pdf, ps, other

    cs.DS cs.CR cs.LG

    The Power of Factorization Mechanisms in Local and Central Differential Privacy

    Authors: Alexander Edmonds, Aleksandar Nikolov, Jonathan Ullman

    Abstract: We give new characterizations of the sample complexity of answering linear queries (statistical queries) in the local and central models of differential privacy: *In the non-interactive local model, we give the first approximate characterization of the sample complexity. Informally our bounds are tight to within polylogarithmic factors in the number of queries and desired accuracy. Our character… ▽ More

    Submitted 19 November, 2019; originally announced November 2019.

  29. arXiv:1909.09630  [pdf, other

    cs.DS cs.CR

    Manipulation Attacks in Local Differential Privacy

    Authors: Albert Cheu, Adam Smith, Jonathan Ullman

    Abstract: Local differential privacy is a widely studied restriction on distributed algorithms that collect aggregates about sensitive user data, and is now deployed in several large systems. We initiate a systematic study of a fundamental limitation of locally differentially private protocols: they are highly vulnerable to adversarial manipulation. While any algorithm can be manipulated by adversaries who… ▽ More

    Submitted 20 September, 2019; originally announced September 2019.

  30. arXiv:1909.03951  [pdf, other

    cs.DS cs.CR cs.IT cs.LG stat.ML

    Differentially Private Algorithms for Learning Mixtures of Separated Gaussians

    Authors: Gautam Kamath, Or Sheffet, Vikrant Singhal, Jonathan Ullman

    Abstract: Learning the parameters of Gaussian mixture models is a fundamental and widely studied problem with numerous applications. In this work, we give new algorithms for learning the parameters of a high-dimensional, well separated, Gaussian mixture model subject to the strong constraint of differential privacy. In particular, we give a differentially private analogue of the algorithm of Achlioptas and… ▽ More

    Submitted 15 October, 2019; v1 submitted 9 September, 2019; originally announced September 2019.

    Comments: To appear in NeurIPS 2019

  31. arXiv:1905.13376  [pdf, other

    cs.DB cs.DC

    Efficient Multiway Hash Join on Reconfigurable Hardware

    Authors: Kunle Olukotun, Raghu Prabhakar, Rekha Singhal, Jeffrey D. Ullman, Yaqi Zhang

    Abstract: We propose the algorithms for performing multiway joins using a new type of coarse grain reconfigurable hardware accelerator~-- ``Plasticine''~-- that, compared with other accelerators, emphasizes high compute capability and high on-chip communication bandwidth. Joining three or more relations in a single step, i.e. multiway join, is efficient when the join of any two relations yields too large an… ▽ More

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: 20 pages

  32. arXiv:1905.11947  [pdf, ps, other

    cs.DS cs.CR cs.IT cs.LG stat.ML

    Private Identity Testing for High-Dimensional Distributions

    Authors: Clément L. Canonne, Gautam Kamath, Audra McMillan, Jonathan Ullman, Lydia Zakynthinou

    Abstract: In this work we present novel differentially private identity (goodness-of-fit) testers for natural and widely studied classes of multivariate product distributions: Gaussians in $\mathbb{R}^d$ with known covariance and product distributions over $\{\pm 1\}^{d}$. Our testers have improved sample complexity compared to those derived from previous techniques, and are the first testers whose sample c… ▽ More

    Submitted 3 March, 2022; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: Discussing a mistake in the proof of one of the algorithms (Theorem 1.2, computationally inefficient tester), and pointing to follow-up work by Narayanan (2022) who improves upon our results and fixes this mistake

  33. arXiv:1905.10477  [pdf, ps, other

    cs.DS cs.LG

    Efficiently Estimating Erdos-Renyi Graphs with Node Differential Privacy

    Authors: Adam Sealfon, Jonathan Ullman

    Abstract: We give a simple, computationally efficient, and node-differentially-private algorithm for estimating the parameter of an Erdos-Renyi graph---that is, estimating p in a G(n,p)---with near-optimal accuracy. Our algorithm nearly matches the information-theoretically optimal exponential-time algorithm for the same problem due to Borgs et al. (FOCS 2018). More generally, we give an optimal, computatio… ▽ More

    Submitted 24 May, 2019; originally announced May 2019.

  34. arXiv:1902.09009  [pdf, ps, other

    cs.LG cs.DS stat.ML

    Efficient Private Algorithms for Learning Large-Margin Halfspaces

    Authors: Huy L. Nguyen, Jonathan Ullman, Lydia Zakynthinou

    Abstract: We present new differentially private algorithms for learning a large-margin halfspace. In contrast to previous algorithms, which are based on either differentially private simulations of the statistical query model or on private convex optimization, the sample complexity of our algorithms depends only on the margin of the data, and not on the dimension. We complement our results with a lower boun… ▽ More

    Submitted 23 February, 2020; v1 submitted 24 February, 2019; originally announced February 2019.

    Comments: changed title, added references and remarks

  35. arXiv:1812.09233  [pdf, other

    cs.DB cs.CR cs.DC cs.IR

    Partitioned Data Security on Outsourced Sensitive and Non-sensitive Data

    Authors: Sharad Mehrotra, Shantanu Sharma, Jeffrey D. Ullman, Anurag Mishra

    Abstract: Despite extensive research on cryptography, secure and efficient query processing over outsourced data remains an open challenge. This paper continues along the emerging trend in secure data processing that recognizes that the entire dataset may not be sensitive, and hence, non-sensitivity of data can be exploited to overcome limitations of existing encryption-based approaches. We propose a new se… ▽ More

    Submitted 19 December, 2018; originally announced December 2018.

    Comments: Accepted in IEEE International Conference on Data Engineering (ICDE), 2019. arXiv admin note: text overlap with arXiv:1812.01741

  36. arXiv:1812.02696  [pdf, other

    cs.LG cs.DS cs.GT stat.ML

    Differentially Private Fair Learning

    Authors: Matthew Jagielski, Michael Kearns, Jieming Mao, Alina Oprea, Aaron Roth, Saeed Sharifi-Malvajerdi, Jonathan Ullman

    Abstract: Motivated by settings in which predictive models may be required to be non-discriminatory with respect to certain attributes (such as race), but even collecting the sensitive attribute may be forbidden or restricted, we initiate the study of fair learning under the constraint of differential privacy. We design two learning algorithms that simultaneously promise differential privacy and equalized o… ▽ More

    Submitted 31 May, 2019; v1 submitted 6 December, 2018; originally announced December 2018.

  37. arXiv:1811.11148  [pdf, ps, other

    cs.DS cs.CR cs.IT cs.LG stat.ML

    The Structure of Optimal Private Tests for Simple Hypotheses

    Authors: Clément L. Canonne, Gautam Kamath, Audra McMillan, Adam Smith, Jonathan Ullman

    Abstract: Hypothesis testing plays a central role in statistical inference, and is used in many settings where privacy concerns are paramount. This work answers a basic question about privately testing simple hypotheses: given two distributions $P$ and $Q$, and a privacy level $\varepsilon$, how many i.i.d. samples are needed to distinguish $P$ from $Q$ subject to $\varepsilon$-differential privacy, and wha… ▽ More

    Submitted 2 April, 2019; v1 submitted 27 November, 2018; originally announced November 2018.

    Comments: To appear in STOC 2019

  38. Distributed Differential Privacy via Shuffling

    Authors: Albert Cheu, Adam Smith, Jonathan Ullman, David Zeber, Maxim Zhilyaev

    Abstract: We consider the problem of designing scalable, robust protocols for computing statistics about sensitive data. Specifically, we look at how best to design differentially private protocols in a distributed setting, where each user holds a private datum. The literature has mostly considered two models: the "central" model, in which a trusted server collects users' data in the clear, which allows gre… ▽ More

    Submitted 16 May, 2019; v1 submitted 3 August, 2018; originally announced August 2018.

    Comments: Updated to correct a typo

  39. arXiv:1806.06100  [pdf, ps, other

    cs.LG cs.DS stat.ML

    The Limits of Post-Selection Generalization

    Authors: Kobbi Nissim, Adam Smith, Thomas Steinke, Uri Stemmer, Jonathan Ullman

    Abstract: While statistics and machine learning offers numerous methods for ensuring generalization, these methods often fail in the presence of adaptivity---the common practice in which the choice of analysis depends on previous interactions with the same dataset. A recent line of work has introduced powerful, general purpose algorithms that ensure post hoc generalization (also called robust or post-select… ▽ More

    Submitted 15 June, 2018; originally announced June 2018.

  40. arXiv:1805.00216  [pdf, other

    cs.DS cs.CR cs.LG stat.ML

    Privately Learning High-Dimensional Distributions

    Authors: Gautam Kamath, Jerry Li, Vikrant Singhal, Jonathan Ullman

    Abstract: We present novel, computationally efficient, and differentially private algorithms for two fundamental high-dimensional learning problems: learning a multivariate Gaussian and learning a product distribution over the Boolean hypercube in total variation distance. The sample complexity of our algorithms nearly matches the sample complexity of the optimal non-private learners for these tasks in a wi… ▽ More

    Submitted 30 May, 2019; v1 submitted 1 May, 2018; originally announced May 2018.

    Comments: To appear in COLT 2019

  41. arXiv:1802.07128  [pdf, ps, other

    cs.LG

    Local Differential Privacy for Evolving Data

    Authors: Matthew Joseph, Aaron Roth, Jonathan Ullman, Bo Waggoner

    Abstract: There are now several large scale deployments of differential privacy used to collect statistical information about users. However, these deployments periodically recollect the data and recompute the statistics using algorithms designed for a single use. As a result, these systems do not provide meaningful privacy guarantees over long time scales. Moreover, existing techniques to mitigate this eff… ▽ More

    Submitted 19 November, 2018; v1 submitted 20 February, 2018; originally announced February 2018.

  42. arXiv:1802.02638   

    cs.CR cs.DS cs.LG

    Tight Lower Bounds for Locally Differentially Private Selection

    Authors: Jonathan Ullman

    Abstract: We prove a tight lower bound (up to constant factors) on the sample complexity of any non-interactive local differentially private protocol for optimizing a linear function over the simplex. This lower bound also implies a tight lower bound (again, up to constant factors) on the sample complexity of any non-interactive local differentially private protocol implementing the exponential mechanism. T… ▽ More

    Submitted 14 May, 2021; v1 submitted 7 February, 2018; originally announced February 2018.

    Comments: The results in this paper have been subsumed by: Alexander Edmonds, Aleksandar Nikolov, Jonathan Ullman. "The Power of Factorization Mechanisms in Local and Central Differential Privacy." STOC 2020 [arXiv:1911.08339]. Please cite that paper for the relevant results

  43. arXiv:1711.04213  [pdf, other

    cs.LG

    Skyline Identification in Multi-Armed Bandits

    Authors: Albert Cheu, Ravi Sundaram, Jonathan Ullman

    Abstract: We introduce a variant of the classical PAC multi-armed bandit problem. There is an ordered set of $n$ arms $A[1],\dots,A[n]$, each with some stochastic reward drawn from some unknown bounded distribution. The goal is to identify the $skyline$ of the set $A$, consisting of all arms $A[i]$ such that $A[i]$ has larger expected reward than all lower-numbered arms $A[1],\dots,A[i-1]$. We define a natu… ▽ More

    Submitted 9 January, 2018; v1 submitted 11 November, 2017; originally announced November 2017.

    Comments: 18 pages, 2 Figures; an ALT'18/ISIT'18 submission

  44. arXiv:1704.03024  [pdf, other

    cs.DS cs.CR

    Tight Lower Bounds for Differentially Private Selection

    Authors: Thomas Steinke, Jonathan Ullman

    Abstract: A pervasive task in the differential privacy literature is to select the $k$ items of "highest quality" out of a set of $d$ items, where the quality of each item depends on a sensitive dataset that must be protected. Variants of this task arise naturally in fundamental problems like feature selection and hypothesis testing, and also as subroutines for many sophisticated differentially private algo… ▽ More

    Submitted 10 April, 2017; originally announced April 2017.

  45. arXiv:1702.02970  [pdf, other

    cs.DS cs.CR

    The Price of Selection in Differential Privacy

    Authors: Mitali Bafna, Jonathan Ullman

    Abstract: In the differentially private top-$k$ selection problem, we are given a dataset $X \in \{\pm 1\}^{n \times d}$, in which each row belongs to an individual and each column corresponds to some binary attribute, and our goal is to find a set of $k \ll d$ columns whose means are approximately as large as possible. Differential privacy requires that our choice of these $k$ columns does not depend too m… ▽ More

    Submitted 9 February, 2017; originally announced February 2017.

  46. arXiv:1701.03493  [pdf, other

    cs.DM cs.DS

    Subgaussian Tail Bounds via Stability Arguments

    Authors: Thomas Steinke, Jonathan Ullman

    Abstract: Sums of independent, bounded random variables concentrate around their expectation approximately as well a Gaussian of the same variance. Well known results of this form include the Bernstein, Hoeffding, and Chernoff inequalities and many others. We present an alternative proof of these tail bounds based on what we call a stability argument, which avoids bounding the moment generating function or… ▽ More

    Submitted 21 April, 2017; v1 submitted 12 January, 2017; originally announced January 2017.

  47. arXiv:1609.04340  [pdf, other

    cs.CR cs.CY stat.ME

    PSI (Ψ): a Private data Sharing Interface

    Authors: Marco Gaboardi, James Honaker, Gary King, Jack Murtagh, Kobbi Nissim, Jonathan Ullman, Salil Vadhan

    Abstract: We provide an overview of PSI ("a Private data Sharing Interface"), a system we are developing to enable researchers in the social sciences and other fields to share and explore privacy-sensitive datasets with the strong privacy protections of differential privacy.

    Submitted 3 August, 2018; v1 submitted 14 September, 2016; originally announced September 2016.

    Comments: 34 pages, 8 figures

  48. arXiv:1607.06141  [pdf, other

    cs.CR cs.DS

    Strong Hardness of Privacy from Weak Traitor Tracing

    Authors: Lucas Kowalczyk, Tal Malkin, Jonathan Ullman, Mark Zhandry

    Abstract: Despite much study, the computational complexity of differential privacy remains poorly understood. In this paper we consider the computational complexity of accurately answering a family $Q$ of statistical queries over a data universe $X$ under differential privacy. A statistical query on a dataset $D \in X^n$ asks "what fraction of the elements of $D$ satisfy a given predicate $p$ on $X$?" Dwork… ▽ More

    Submitted 20 July, 2016; originally announced July 2016.

  49. arXiv:1607.05397  [pdf, ps, other

    cs.DS cs.GT cs.LG

    Multidimensional Dynamic Pricing for Welfare Maximization

    Authors: Aaron Roth, Aleksandrs Slivkins, Jonathan Ullman, Zhiwei Steven Wu

    Abstract: We study the problem of a seller dynamically pricing $d$ distinct types of indivisible goods, when faced with the online arrival of unit-demand buyers drawn independently from an unknown distribution. The goods are not in limited supply, but can only be produced at a limited rate and are costly to produce. The seller observes only the bundle of goods purchased at each day, but nothing else about t… ▽ More

    Submitted 10 June, 2017; v1 submitted 19 July, 2016; originally announced July 2016.

  50. arXiv:1605.08294  [pdf, other

    cs.CR

    Privacy Odometers and Filters: Pay-as-you-Go Composition

    Authors: Ryan Rogers, Aaron Roth, Jonathan Ullman, Salil Vadhan

    Abstract: In this paper we initiate the study of adaptive composition in differential privacy when the length of the composition, and the privacy parameters themselves can be chosen adaptively, as a function of the outcome of previously run analyses. This case is much more delicate than the setting covered by existing composition theorems, in which the algorithms themselves can be chosen adaptively, but the… ▽ More

    Submitted 5 August, 2021; v1 submitted 26 May, 2016; originally announced May 2016.