Archives AI News

Highly Efficient and Effective LLMs with Multi-Boolean Architectures

arXiv:2505.22811v2 Announce Type: replace-cross Abstract: Weight binarization has emerged as a promising strategy to reduce the complexity of large language models (LLMs). Existing approaches fall into post-training binarization, which is simple but causes severe performance loss, and training-aware methods, which…
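The weight binarization the abstract refers to can be illustrated with the classic closed-form scheme (as in BinaryConnect/XNOR-Net, not necessarily the paper's multi-boolean method): approximate each weight row by a single scale times a sign pattern. A minimal sketch, assuming NumPy and a row-wise scale:

```python
import numpy as np

def binarize_weights(w):
    """Binarize a weight matrix row-wise: W ~= alpha * sign(W).

    alpha is the per-row mean absolute value, which is the
    closed-form scale minimizing ||W - alpha * B||_F over
    binary codes B in {-1, +1}.
    """
    alpha = np.mean(np.abs(w), axis=1, keepdims=True)  # per-row scale
    b = np.where(w >= 0, 1.0, -1.0)                    # binary codes
    return alpha * b

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8))
w_bin = binarize_weights(w)
# Each binarized row carries only one magnitude, so a row costs
# one float plus one bit per weight instead of a float per weight.
assert all(len(np.unique(np.abs(row))) == 1 for row in w_bin)
```

Post-training binarization applies this directly to a trained model; training-aware methods instead keep binarization inside the training loop so the loss can compensate for the quantization error.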

First Hallucination Tokens Are Different from Conditional Ones

arXiv:2507.20836v3 Announce Type: replace Abstract: Large Language Models (LLMs) hallucinate, and detecting these cases is key to ensuring trust. While many approaches address hallucination detection at the response or span level, recent work explores token-level detection, enabling more fine-grained intervention.…
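The paper's distinction is between the token that opens a hallucinated span and the later tokens that are merely conditioned on it. A minimal sketch of deriving such token-level labels from span annotations (the function name and label strings are illustrative, not from the paper):

```python
def token_labels(num_tokens, halluc_spans):
    """Label each token: 'ok', 'first' (opens a hallucinated span),
    or 'cond' (inside a span, conditioned on earlier hallucinated tokens).

    halluc_spans: list of (start, end) token indices, end exclusive.
    """
    labels = ["ok"] * num_tokens
    for start, end in halluc_spans:
        for i in range(start, end):
            labels[i] = "first" if i == start else "cond"
    return labels

print(token_labels(6, [(2, 5)]))
# → ['ok', 'ok', 'first', 'cond', 'cond', 'ok']
```

Separating 'first' from 'cond' labels lets a detector be evaluated on the tokens where intervention is still cheap, before the model has committed to the hallucinated content.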

Wasserstein Bounds for generative diffusion models with Gaussian tail targets

arXiv:2412.11251v2 Announce Type: replace Abstract: We present an estimate of the Wasserstein distance between the data distribution and the generation of score-based generative models. The sampling complexity with respect to dimension is $\mathcal{O}(\sqrt{d})$, with a logarithmic constant. In the analysis,…

On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm

arXiv:2505.11840v3 Announce Type: replace Abstract: As the default optimizer for training large language models, AdamW has achieved remarkable success in deep learning. However, its convergence behavior is not theoretically well-understood. This paper establishes the convergence rate $\frac{1}{K}\sum_{k=1}^K \mathbb{E}\left[\|\nabla f(x^k)\|_1\right] \leq O(\frac{\sqrt{d}C}{K^{1/4}})$ for…
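For reference, the AdamW update whose convergence the paper analyzes combines Adam's bias-corrected moment estimates with decoupled weight decay (Loshchilov & Hutter). A minimal NumPy sketch of one step, with illustrative hyperparameter defaults:

```python
import numpy as np

def adamw_step(x, grad, m, v, k, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW update: Adam moments plus weight decay applied
    directly to the parameters rather than folded into the gradient."""
    m = beta1 * m + (1 - beta1) * grad        # first moment (EMA of grads)
    v = beta2 * v + (1 - beta2) * grad**2     # second moment (EMA of squares)
    m_hat = m / (1 - beta1**k)                # bias corrections, k = step index
    v_hat = v / (1 - beta2**k)
    x = x - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * x)
    return x, m, v

# Sanity check on f(x) = ||x||^2 / 2, where grad f(x) = x.
x = np.array([1.0, -2.0]); m = np.zeros(2); v = np.zeros(2)
for k in range(1, 2001):
    x, m, v = adamw_step(x, x, m, v, k, lr=0.05)
```

The $\ell_1$-norm measurement in the bound is natural here because the coordinate-wise normalization by $\sqrt{v}$ makes AdamW behave like a sign-based method, whose progress is governed by the $\ell_1$ rather than the $\ell_2$ norm of the gradient.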

Oracle-based Uniform Sampling from Convex Bodies

arXiv:2510.02983v1 Announce Type: cross Abstract: We propose new Markov chain Monte Carlo algorithms to sample a uniform distribution on a convex body $K$. Our algorithms are based on the Alternating Sampling Framework/proximal sampler, which uses Gibbs sampling on an augmented…
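The Alternating Sampling Framework / proximal sampler runs Gibbs sampling on the augmented density $\pi(x, y) \propto \mathbf{1}_K(x)\, \mathcal{N}(y; x, \eta I)$. A minimal sketch for $K$ the unit ball, with the restricted-Gaussian step implemented by simple rejection (the paper's oracle-based implementation will differ):

```python
import numpy as np

def proximal_sampler_ball(n_steps=1000, d=2, eta=0.05, seed=0):
    """Alternate the two Gibbs conditionals:
      y | x ~ N(x, eta I)                  (unconstrained Gaussian)
      x | y ~ N(y, eta I) restricted to K  (here: rejection sampling)
    with K the unit Euclidean ball.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(d)  # start inside K
    samples = []
    for _ in range(n_steps):
        y = x + np.sqrt(eta) * rng.standard_normal(d)   # forward Gaussian step
        while True:                                      # restricted Gaussian oracle
            x = y + np.sqrt(eta) * rng.standard_normal(d)
            if np.linalg.norm(x) <= 1.0:
                break
        samples.append(x.copy())
    return np.array(samples)

s = proximal_sampler_ball()
assert np.all(np.linalg.norm(s, axis=1) <= 1.0)  # every sample stays in K
```

Rejection works here only because the ball is easy to test membership for and $\eta$ is small; for general convex bodies one needs a more careful restricted-Gaussian sampler, which is where the membership/evaluation oracles in the title come in.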