Archives: AI News

Credal Ensemble Distillation for Uncertainty Quantification

arXiv:2511.13766v1 Announce Type: new Abstract: Deep ensembles (DE) have emerged as a powerful approach for quantifying predictive uncertainty and distinguishing its aleatoric and epistemic components, thereby enhancing model robustness and reliability. However, their high computational and memory costs during inference…

November 19, 2025

MPD-SGR: Robust Spiking Neural Networks with Membrane Potential Distribution-Driven Surrogate Gradient Regularization

arXiv:2511.12199v2 Announce Type: replace Abstract: The surrogate gradient (SG) method has shown significant promise in enhancing the performance of deep spiking neural networks (SNNs), but it also introduces vulnerabilities to adversarial attacks. Although spike coding strategies and neural dynamics parameters…

November 19, 2025

Dynamic Temperature Scheduler for Knowledge Distillation

arXiv:2511.13767v1 Announce Type: new Abstract: Knowledge Distillation (KD) trains a smaller student model using a large, pre-trained teacher model, with temperature as a key hyperparameter controlling the softness of output probabilities. Traditional methods use a fixed temperature throughout training, which…

November 19, 2025

MoM: Linear Sequence Modeling with Mixture-of-Memories

arXiv:2502.13685v4 Announce Type: replace-cross Abstract: Linear sequence modeling methods, such as linear attention, state space modeling, and linear RNNs, offer significant efficiency improvements by reducing the complexity of training and inference. However, these methods typically compress the entire input sequence…

November 19, 2025

Compiling to linear neurons

arXiv:2511.13769v1 Announce Type: new Abstract: We don’t program neural networks directly. Instead, we rely on an indirect style where learning algorithms, like gradient descent, determine a neural network’s function by learning from data. This indirect style is often a virtue;…

November 19, 2025

MOON: Generative MLLM-based Multimodal Representation Learning for E-commerce Product Understanding

arXiv:2508.11999v2 Announce Type: replace-cross Abstract: With the rapid advancement of e-commerce, exploring general representations rather than task-specific ones has attracted increasing research attention. For product understanding, although existing discriminative dual-flow architectures drive progress in this field, they inherently struggle to…

November 19, 2025

Self-Attention as Distributional Projection: A Unified Interpretation of Transformer Architecture

arXiv:2511.13780v1 Announce Type: new Abstract: This paper presents a mathematical interpretation of self-attention by connecting it to distributional semantics principles. We show that self-attention emerges from projecting corpus-level co-occurrence statistics into sequence context. Starting from the co-occurrence matrix underlying GloVe…

November 19, 2025

DINO-Detect: A Simple yet Effective Framework for Blur-Robust AI-Generated Image Detection

arXiv:2511.12511v2 Announce Type: replace-cross Abstract: With growing concerns over image authenticity and digital safety, the field of AI-generated image (AIGI) detection has progressed rapidly. Yet, most AIGI detectors still struggle under real-world degradations, particularly motion blur, which frequently occurs in…

November 19, 2025

Exploring Transferability of Self-Supervised Learning by Task Conflict Calibration

arXiv:2511.13787v1 Announce Type: new Abstract: In this paper, we explore the transferability of SSL by addressing two central questions: (i) what is the representation transferability of SSL, and (ii) how can we effectively model this transferability? Transferability is defined as…

November 19, 2025

O3SLM: Open Weight, Open Data, and Open Vocabulary Sketch-Language Model

arXiv:2511.14368v1 Announce Type: cross Abstract: While Large Vision Language Models (LVLMs) are increasingly deployed in real-world applications, their ability to interpret abstract visual inputs remains limited. Specifically, they struggle to comprehend hand-drawn sketches, a modality that offers an intuitive means…

November 19, 2025