Beyond Ordinary Lipschitz Constraints: Differentially Private Stochastic Optimization with Tsybakov Noise Condition

arXiv:2509.04668v1 Announce Type: new Abstract: We study Stochastic Convex Optimization in the Differential Privacy model (DP-SCO). Unlike previous studies, we assume the population risk function satisfies the Tsybakov Noise Condition (TNC) with some parameter $\theta>1$, where the Lipschitz constant of the loss could be extremely large or even unbounded, but the $\ell_2$-norm of the gradient of the loss has bounded $k$-th moment for some $k\geq 2$. For the Lipschitz case with $\theta\geq 2$, we first propose an $(\varepsilon, \delta)$-DP algorithm whose utility bound is $\tilde{O}\left(\left(\tilde{r}_{2k}\left(\frac{1}{\sqrt{n}}+\frac{\sqrt{d}}{n\varepsilon}\right)^{\frac{k-1}{k}}\right)^{\frac{\theta}{\theta-1}}\right)$ with high probability, where $n$ is the sample size, $d$ is the model dimension, and $\tilde{r}_{2k}$ is a term that depends only on the $2k$-th moment of the gradient. Notably, this upper bound is independent of the Lipschitz constant. We then extend the result to the case where $\theta\geq \bar{\theta}> 1$ for some known constant $\bar{\theta}$. Moreover, when the privacy budget $\varepsilon$ is small enough, we show an upper bound of $\tilde{O}\left(\left(\tilde{r}_{k}\left(\frac{1}{\sqrt{n}}+\frac{\sqrt{d}}{n\varepsilon}\right)^{\frac{k-1}{k}}\right)^{\frac{\theta}{\theta-1}}\right)$ even if the loss function is not Lipschitz. For the lower bound, we show that for any $\theta\geq 2$, the private minimax rate under $\rho$-zero-Concentrated Differential Privacy (zCDP) is lower bounded by $\Omega\left(\left(\tilde{r}_{k}\left(\frac{1}{\sqrt{n}}+\frac{\sqrt{d}}{n\sqrt{\rho}}\right)^{\frac{k-1}{k}}\right)^{\frac{\theta}{\theta-1}}\right)$.
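
As a sanity check on the exponents: for $\theta=2$ (so $\frac{\theta}{\theta-1}=2$) and $k\to\infty$, the first bound scales as $\tilde{O}\left(\frac{1}{n}+\frac{d}{n^2\varepsilon^2}\right)$ up to the moment term, which is consistent with the familiar rate for strongly convex (quadratic-growth) DP-SCO.

The abstract does not describe the algorithm itself, but bounds of this shape are typically driven by replacing the raw gradient with a clipped, noised estimate: clipping bounds the sensitivity even when no uniform Lipschitz constant exists, and the bounded $k$-th moment controls the bias the clipping introduces. Below is a minimal Python sketch of that standard ingredient for a single gradient release; the function name, the clipping threshold clip_c, and the one-shot Gaussian calibration are illustrative assumptions, not the paper's actual method.

    import numpy as np

    def clipped_noisy_gradient(per_sample_grads, clip_c, eps, delta, rng):
        # per_sample_grads: (n, d) array of per-example gradients.
        n, d = per_sample_grads.shape
        norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
        # Shrink each gradient onto the l2 ball of radius clip_c.
        clipped = per_sample_grads * np.minimum(1.0, clip_c / np.maximum(norms, 1e-12))
        avg = clipped.mean(axis=0)
        # Replacing one example moves the clipped average by at most 2*clip_c/n,
        # so calibrate a single (eps, delta)-DP Gaussian release to that sensitivity.
        sigma = (2.0 * clip_c / n) * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
        return avg + rng.normal(0.0, sigma, size=d)

    # Heavy-tailed toy gradients: Student-t with df=3 has only low-order moments.
    rng = np.random.default_rng(0)
    grads = rng.standard_t(df=3, size=(1000, 5))
    g_hat = clipped_noisy_gradient(grads, clip_c=5.0, eps=1.0, delta=1e-5, rng=rng)

The bias-variance trade-off in choosing clip_c (a larger threshold clips less but forces more noise) is exactly where a moment bound on the gradient enters analyses of this kind.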

2025-09-08 04:00 GMT · arxiv.org

Original: https://arxiv.org/abs/2509.04668