Archives AI News

Three Concrete Challenges and Two Hopes for the Safety of Unsupervised Elicitation

arXiv:2602.20400v1 Announce Type: new Abstract: To steer language models towards truthful outputs on tasks which are beyond human capability, previous work has suggested training models on easy tasks to steer them on harder ones (easy-to-hard generalization), or using unsupervised training…

February 25, 2026

Sim2Radar: Toward Bridging the Radar Sim-to-Real Gap with VLM-Guided Scene Reconstruction

arXiv:2602.13314v3 Announce Type: replace-cross Abstract: Millimeter-wave (mmWave) radar provides reliable perception in visually degraded indoor environments (e.g., smoke, dust, and low light), but learning-based radar perception is bottlenecked by the scarcity and cost of collecting and annotating large-scale radar datasets.…

February 25, 2026

Enhancing maritime cybersecurity with technology and policy

Strahinja Janjusevic brings an international perspective and US Naval Academy education to his graduate research in the MIT Technology and Policy Program.

February 25, 2026

Is Exchangeability better than I.I.D to handle Data Distribution Shifts while Pooling Data for Data-scarce Medical image segmentation?

arXiv:2507.19575v2 Announce Type: replace-cross Abstract: Data scarcity is a major challenge in medical imaging, particularly for deep learning models. While data pooling (combining datasets from multiple sources) and data addition (adding more data from a new dataset) have been shown…

February 25, 2026

Anatomy of Capability Emergence: Scale-Invariant Representation Collapse and Top-Down Reorganization in Neural Networks

arXiv:2602.15997v3 Announce Type: replace Abstract: Capability emergence during neural network training remains mechanistically opaque. We track five geometric measures across five model scales (405K–85M parameters), 120 task$\times$level$\times$model combinations (119 achieving accuracy-based emergence) across eight algorithmic tasks, and three Pythia…

February 25, 2026

A hierarchy tree data structure for behavior-based user segment representation

arXiv:2508.01115v2 Announce Type: replace Abstract: User attributes are essential in multiple stages of modern recommendation systems and are particularly important for mitigating the cold-start problem and improving the experience of new or infrequent users. We propose Behavior-based User Segmentation (BUS),…

February 25, 2026

ContextPilot: Fast Long-Context Inference via Context Reuse

arXiv:2511.03475v3 Announce Type: replace Abstract: AI applications increasingly depend on long-context inference, where LLMs consume substantial context to support stronger reasoning. Common examples include retrieval-augmented generation, agent memory layers, and multi-agent orchestration. As input contexts get longer, prefill latency becomes…

February 25, 2026

Scaling State-Space Models on Multiple GPUs with Tensor Parallelism

arXiv:2602.21144v1 Announce Type: cross Abstract: Selective state space models (SSMs) have rapidly become a compelling backbone for large language models, especially for long-context workloads. Yet in deployment, their inference performance is often bounded by the memory capacity, bandwidth, and latency…

February 25, 2026

Armijo Line-search Can Make (Stochastic) Gradient Descent Provably Faster

arXiv:2503.00229v4 Announce Type: replace Abstract: Armijo line-search (Armijo-LS) is a standard method to set the step-size for gradient descent (GD). For smooth functions, Armijo-LS alleviates the need to know the global smoothness constant L and adapts to the “local” smoothness,…

February 25, 2026

Multimodal Crystal Flow: Any-to-Any Modality Generation for Unified Crystal Modeling

arXiv:2602.20210v1 Announce Type: new Abstract: Crystal modeling spans a family of conditional and unconditional generation tasks across different modalities, including crystal structure prediction (CSP) and \emph{de novo} generation (DNG). While recent deep generative models have shown promising performance, they remain…

February 25, 2026