Archives AI News

Preference Leakage: A Contamination Problem in LLM-as-a-judge

arXiv:2502.01534v3 Announce Type: replace Abstract: Large Language Models (LLMs) as judges and LLM-based data synthesis have emerged as two fundamental LLM-driven data annotation methods in model development. While their combination significantly enhances the efficiency of model training and evaluation, little…

March 5, 2026

Optimal trajectory-guided stochastic co-optimization for e-fuel system design and real-time operation

arXiv:2603.03484v1 Announce Type: new Abstract: E-fuels are promising long-term energy carriers supporting the net-zero transition. However, the large combinatorial design-operation spaces under renewable uncertainty make the use of mathematical programming impractical for co-optimizing e-fuel production systems. Here, we present MasCOR,…

March 5, 2026

Knowing When to Quit: Probabilistic Early Exits for Speech Separation

arXiv:2507.09768v3 Announce Type: replace Abstract: In recent years, deep learning-based single-channel speech separation has improved considerably, in large part driven by increasingly compute- and parameter-efficient neural network architectures. Most such architectures are, however, designed with a fixed compute and parameter…

March 5, 2026

When Small Variations Become Big Failures: Reliability Challenges in Compute-in-Memory Neural Accelerators

arXiv:2603.03491v1 Announce Type: new Abstract: Compute-in-memory (CiM) architectures promise significant improvements in energy efficiency and throughput for deep neural network acceleration by alleviating the von Neumann bottleneck. However, their reliance on emerging non-volatile memory devices introduces device-level non-idealities, such as write…

March 5, 2026

Circuit Insights: Towards Interpretability Beyond Activations

arXiv:2510.14936v2 Announce Type: replace Abstract: The fields of explainable AI and mechanistic interpretability aim to uncover the internal structure of neural networks, with circuit discovery as a central tool for understanding model computations. Existing approaches, however, rely on manual inspection…

March 5, 2026

Solving adversarial examples requires solving exponential misalignment

arXiv:2603.03507v1 Announce Type: new Abstract: Adversarial attacks – input perturbations imperceptible to humans that fool neural networks – remain both a persistent failure mode in machine learning and a phenomenon with mysterious origins. To shed light, we define and analyze…

March 5, 2026

SpecBridge: Bridging Mass Spectrometry and Molecular Representations via Cross-Modal Alignment

arXiv:2601.17204v3 Announce Type: replace Abstract: Small-molecule identification from tandem mass spectrometry (MS/MS) remains a bottleneck in untargeted settings where spectral libraries are incomplete. While deep learning offers a solution, current approaches typically fall into two extremes: explicit generative models that…

March 5, 2026

EnECG: Efficient Ensemble Learning for Electrocardiogram Multi-task Foundation Model

arXiv:2511.22935v2 Announce Type: replace Abstract: Electrocardiogram (ECG) analysis plays a vital role in the early detection, monitoring, and management of various cardiovascular conditions. While existing models have achieved notable success in ECG interpretation, they fail to leverage the interrelated nature…

March 5, 2026

Dynamic Adversarial Reinforcement Learning for Robust Multimodal Large Language Models

arXiv:2602.22227v3 Announce Type: replace Abstract: Despite their impressive capabilities, Multimodal Large Language Models (MLLMs) exhibit perceptual fragility when confronted with visually complex scenes. This weakness stems from a reliance on finite training datasets, which are prohibitively expensive to scale and…

March 5, 2026

SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety

arXiv:2505.20065v2 Announce Type: replace Abstract: As Large Language Models (LLMs) are increasingly deployed in real-world applications, balancing helpfulness and safety has become a central challenge. A natural approach is to incorporate safety constraints into Reinforcement Learning from Human Feedback (RLHF),…

March 5, 2026