Archives AI News

New York Smells: A Large Multimodal Dataset for Olfaction

arXiv:2511.20544v1 Announce Type: cross Abstract: While olfaction is central to how animals perceive the world, this rich chemical sensory modality remains largely inaccessible to machines. One key bottleneck is the lack of diverse, multimodal olfactory training data collected in natural…

A Systematic Study of Compression Ordering for Large Language Models

arXiv:2511.19495v1 Announce Type: new Abstract: Large Language Models (LLMs) require substantial computational resources, making model compression essential for efficient deployment in constrained environments. Among the dominant compression techniques (knowledge distillation, structured pruning, and low-bit quantization), their individual effects are well…
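One of the three techniques named above, low-bit quantization, can be illustrated with a minimal sketch: symmetric per-tensor int8 quantization of a weight matrix. This is a generic textbook baseline, not the paper's specific pipeline; the function names and the choice of a single scale factor are illustrative assumptions.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float weights to int8 using one symmetric scale factor
    (a simplifying assumption; real pipelines often scale per channel)."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
# Rounding error per weight is bounded by half a quantization step.
max_err = np.abs(w - w_hat).max()
```

The int8 tensor uses 4x less memory than float32; the reconstruction error is bounded by `scale / 2`, which is why the study of how such steps interact with pruning and distillation (the ordering question in the abstract) matters.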

Value Improved Actor Critic Algorithms

arXiv:2406.01423v3 Announce Type: replace Abstract: To learn approximately optimal acting policies for decision problems, modern Actor Critic algorithms rely on deep Neural Networks (DNNs) to parameterize the acting policy and greedification operators to iteratively improve it. The reliance on DNNs…
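The actor-critic loop the abstract refers to can be sketched on a toy two-armed bandit: a softmax policy (actor) is nudged toward actions that a learned value estimate (critic) scores above average. This is a generic illustration of the actor-critic idea, not the paper's value-improved variant; the hyperparameters and setup are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.8])   # arm 1 pays more on average
logits = np.zeros(2)                # actor parameters
q = np.zeros(2)                     # critic: per-action value estimate
alpha_actor, alpha_critic = 0.1, 0.1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for _ in range(2000):
    pi = softmax(logits)
    a = rng.choice(2, p=pi)
    r = rng.normal(true_means[a], 0.1)
    # Critic update: move Q(a) toward the observed reward.
    q[a] += alpha_critic * (r - q[a])
    # Actor update: policy gradient, with the critic's value
    # (minus its policy-weighted average) as the learning signal.
    advantage = q[a] - (pi * q).sum()
    grad = -pi                      # d log pi(a) / d logits
    grad[a] += 1.0
    logits += alpha_actor * advantage * grad

pi = softmax(logits)
```

After training, the policy concentrates on the better arm. In the deep-RL setting the abstract describes, the tabular critic becomes a DNN and the softmax step becomes the greedification operator applied to a parameterized policy.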

Xmodel-2.5: 1.3B Data-Efficient Reasoning SLM

arXiv:2511.19496v1 Announce Type: new Abstract: Large language models deliver strong reasoning and tool-use skills, yet their computational demands make them impractical for edge or cost-sensitive deployments. We present Xmodel-2.5, a 1.3-billion-parameter small language model designed as a drop-in agent core.…

Your Pre-trained LLM is Secretly an Unsupervised Confidence Calibrator

arXiv:2505.16690v5 Announce Type: replace Abstract: Post-training of large language models is essential for adapting pre-trained language models (PLMs) to align with human preferences and downstream tasks. While PLMs typically exhibit well-calibrated confidence, post-trained language models (PoLMs) often suffer from over-confidence,…
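The over-confidence problem described above can be made concrete with the standard temperature-scaling baseline (not the paper's unsupervised method): dividing logits by a scalar T > 1 softens a post-trained model's probability distribution without changing its predictions. The logits and temperature below are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Illustrative over-confident logits from a hypothetical post-trained model.
logits = np.array([[8.0, 1.0, 0.5]])

p_raw = softmax(logits)         # near-1.0 top probability
p_cal = softmax(logits / 2.0)   # T = 2: same argmax, softer confidence
```

Temperature scaling preserves the ranking of classes while shrinking the gap between the model's stated confidence and its actual accuracy; the paper's contribution is obtaining such a correction without labeled calibration data, by leveraging the well-calibrated pre-trained model.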

PeriodNet: Boosting the Potential of Attention Mechanism for Time Series Forecasting

arXiv:2511.19497v1 Announce Type: new Abstract: The attention mechanism has demonstrated remarkable potential in sequence modeling, exemplified by its successful application in natural language processing with models such as Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT). Despite…
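The attention mechanism that BERT- and GPT-style models build on, and which the abstract applies to time series, is scaled dot-product attention. A minimal self-attention sketch over a toy time-series window (shapes and data are illustrative, not from the paper):

```python
import numpy as np

def attention(q, k, v):
    """softmax(Q K^T / sqrt(d)) V — each output step is a weighted
    average of the values, weighted by query/key similarity."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v, w

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))        # 8 time steps, 4 features
out, weights = attention(x, x, x)  # self-attention over the window
```

Each row of `weights` sums to 1, so every forecast step attends over the whole window; this global receptive field is what makes attention attractive for the long-range periodic structure in time-series forecasting.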

SLOFetch: Compressed-Hierarchical Instruction Prefetching for Cloud Microservices

arXiv:2511.04774v3 Announce Type: replace Abstract: Large-scale networked services rely on deep software stacks and microservice orchestration, which increase instruction footprints and create frontend stalls that inflate tail latency and energy. We revisit instruction prefetching for these cloud workloads and present…