Sparse Max-Affine Regression

2025-09-24 19:00 GMT

arXiv:2411.02225v2 Announce Type: replace
Abstract: This paper presents Sparse Gradient Descent (Sp-GD) as a solution for variable selection in convex piecewise linear regression, where the model is given as the maximum of $k$ affine functions $x \mapsto \max_{j \in [k]} \langle a_j^\star, x \rangle + b_j^\star$. Here, $\{a_j^\star\}_{j=1}^k$ and $\{b_j^\star\}_{j=1}^k$ denote the ground-truth weight vectors and intercepts. A non-asymptotic local convergence analysis is provided for Sp-GD under sub-Gaussian noise when the covariate distribution satisfies sub-Gaussianity and anti-concentration properties. When the model order and parameters are fixed, Sp-GD provides an $\epsilon$-accurate estimate given $\mathcal{O}(\max(\epsilon^{-2}\sigma_z^2, 1)\, s\log(d/s))$ observations, where $\sigma_z^2$ denotes the noise variance, $s$ the sparsity level, and $d$ the ambient dimension. This also implies exact parameter recovery by Sp-GD from $\mathcal{O}(s\log(d/s))$ noise-free observations. The proposed initialization scheme uses sparse principal component analysis to estimate the subspace spanned by $\{a_j^\star\}_{j=1}^k$, then applies an $r$-covering search to estimate the model parameters. A non-asymptotic analysis is presented for this initialization scheme when the covariates and noise samples follow Gaussian distributions. When the model order and parameters are fixed, this initialization scheme provides an $\epsilon$-accurate estimate given $\mathcal{O}(\epsilon^{-2}\max(\sigma_z^4, \sigma_z^2, 1)\, s^2\log^4 d)$ observations. A new transformation named Real Maslov Dequantization (RMD) is proposed to transform sparse generalized polynomials into sparse max-affine models. The error decay rate of RMD is shown to be exponentially small in its temperature parameter. Furthermore, the theoretical guarantees for Sp-GD are extended to the bounded noise model induced by RMD. Numerical Monte Carlo results corroborate the theoretical findings for Sp-GD and the initialization scheme.
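
The abstract does not spell out the Sp-GD iteration. Sparse-gradient methods of this kind typically alternate a gradient step on the squared loss with hard thresholding onto the $s$-sparse set, so the sketch below is a minimal illustration under that assumption; the per-row thresholding, step size, and squared loss are illustrative choices here, not necessarily the paper's exact algorithm.

```python
import numpy as np

def max_affine(X, A, b):
    """Evaluate the model max_j <a_j, x> + b_j for each row of X."""
    # X: (n, d) covariates, A: (k, d) weight vectors, b: (k,) intercepts
    return (X @ A.T + b).max(axis=1)

def hard_threshold(A, s):
    """Keep the s largest-magnitude entries of each row of A, zero the rest."""
    out = np.zeros_like(A)
    for j in range(A.shape[0]):
        keep = np.argsort(np.abs(A[j]))[-s:]
        out[j, keep] = A[j, keep]
    return out

def sp_gd_step(X, y, A, b, s, lr=0.5):
    """One gradient step on the squared loss, then projection to s-sparse rows."""
    n = len(y)
    scores = X @ A.T + b                # (n, k) affine scores
    active = scores.argmax(axis=1)      # index of the maximizing piece per sample
    resid = scores[np.arange(n), active] - y
    A, b = A.copy(), b.copy()
    for j in range(A.shape[0]):
        m = active == j                 # samples on which piece j attains the max
        if m.any():
            A[j] -= lr * X[m].T @ resid[m] / n
            b[j] -= lr * resid[m].sum() / n
    return hard_threshold(A, s), b
```

Iterating such a step from the sparse-PCA-plus-covering initialization described in the abstract is the natural reading of the pipeline; per the stated bounds, in the noise-free regime $\mathcal{O}(s\log(d/s))$ observations suffice for exact recovery.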
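The abstract likewise does not give the RMD formula. Classical Maslov dequantization replaces a sum by a temperature-scaled log-sum-exp, under which a positive generalized polynomial becomes a smoothed max-affine function of the log-covariates; the sketch below illustrates that standard construction, and the paper's exact definition of RMD may differ.

```python
import numpy as np

def rmd(u, c, alpha, h):
    """Temperature-h dequantization of p(x) = sum_j c_j * prod_i x_i^{alpha_ji}:
    h * log p(exp(u / h)), with u the log-covariates."""
    # c: (k,) positive coefficients, alpha: (k, d) exponents, u: (d,)
    z = alpha @ u / h + np.log(c)       # scaled affine pieces
    zmax = z.max()
    return h * (zmax + np.log(np.exp(z - zmax).sum()))  # stable log-sum-exp

def max_affine_limit(u, c, alpha, h):
    """Max-affine model approached as h -> 0+: max_j <alpha_j, u> + h log c_j."""
    return (alpha @ u + h * np.log(c)).max()
```

When a single piece attains the maximum with margin $\Delta$, the gap `rmd(...) - max_affine_limit(...)` is at most $h(k-1)e^{-\Delta/h}$, which is consistent with the abstract's exponentially small error decay if the temperature is read as the smoothing scale $h$ (the paper's exact parameterization may differ).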