Sample-Optimal Private Regression in Polynomial Time

  • 2025-03-31 17:08:12
  • Prashanti Anderson, Ainesh Bakshi, Mahbod Majid, Stefan Tiegel
  • 0

Abstract

We consider the task of privately obtaining prediction error guarantees inordinary least-squares regression problems with Gaussian covariates (withunknown covariance structure). We provide the first sample-optimal polynomialtime algorithm for this task under both pure and approximate differentialprivacy. We show that any improvement to the sample complexity of our algorithmwould violate either statistical-query or information-theoretic lower bounds.Additionally, our algorithm is robust to a small fraction of arbitrary outliersand achieves optimal error rates as a function of the fraction of outliers. Incontrast, all prior efficient algorithms either incurred sample complexitieswith sub-optimal dimension dependence, scaling with the condition number of thecovariates, or obtained a polynomially worse dependence on the privacyparameters. Our technical contributions are two-fold: first, we leverage resilienceguarantees of Gaussians within the sum-of-squares framework. As a consequence,we obtain efficient sum-of-squares algorithms for regression with optimalrobustness rates and sample complexity. Second, we generalize the recentrobustness-to-privacy framework [HKMN23, (arXiv:2212.05015)] to account for thegeometry induced by the covariance of the input samples. This frameworkcrucially relies on the robust estimators to be sum-of-squares algorithms, andcombining the two steps yields a sample-optimal private regression algorithm.We believe our techniques are of independent interest, and we demonstrate thisby obtaining an efficient algorithm for covariance-aware mean estimation, withan optimal dependence on the privacy parameters.