arXiv:2508.16531v2 Announce Type: replace-cross Abstract: Many algorithms are designed to work well on average over inputs. When running such an algorithm on an arbitrary input, we must ask: Can we trust the algorithm on this input? We identify a new class of algorithmic problems addressing this, which we call "Quality Control Problems." These problems are specified by a (positive, real-valued) "quality function" $rho$ and a distribution $D$ such that, with high probability, a sample drawn from $D$ is "high quality," meaning its $rho$-value is near $1$. The goal is to accept inputs $x sim D$ and reject potentially adversarially generated inputs $x$ with $rho(x)$ far from $1$. The objective of quality control is thus weaker than either component problem: testing for "$rho(x) approx 1$" or testing if $x sim D$, and offers the possibility of more efficient algorithms. In this work, we consider the sublinear version of the quality control problem, where $D in Delta({0,1}^N)$ and the goal is to solve the $(D ,rho)$-quality problem with $o(N)$ queries and time. As a case study, we consider random graphs, i.e., $D = G_{n,p}$ (and $N = binom{n}2$), and the $k$-clique count function $rho_k := C_k(G)/mathbb{E}_{G' sim G_{n,p}}[C_k(G')]$, where $C_k(G)$ is the number of $k$-cliques in $G$. Testing if $G sim G_{n,p}$ with one sample, let alone with sublinear query access to the sample, is of course impossible. Testing if $rho_k(G)approx 1$ requires $p^{-Omega(k^2)}$ samples. In contrast, we show that the quality control problem for $G_{n,p}$ (with $n geq p^{-ck}$ for some constant $c$) with respect to $rho_k$ can be tested with $p^{-O(k)}$ queries and time, showing quality control is provably superpolynomially more efficient in this setting. More generally, for a motif $H$ of maximum degree $Delta(H)$, the respective quality control problem can be solved with $p^{-O(Delta(H))}$ queries and running time.
Original: https://arxiv.org/abs/2508.16531
