Archives AI News

New York Smells: A Large Multimodal Dataset for Olfaction

arXiv:2511.20544v1 Announce Type: cross Abstract: While olfaction is central to how animals perceive the world, this rich chemical sensory modality remains largely inaccessible to machines. One key bottleneck is the lack of diverse, multimodal olfactory training data collected in natural…

A Systematic Study of Compression Ordering for Large Language Models

arXiv:2511.19495v1 Announce Type: new Abstract: Large Language Models (LLMs) require substantial computational resources, making model compression essential for efficient deployment in constrained environments. Among the dominant compression techniques (knowledge distillation, structured pruning, and low-bit quantization), their individual effects are well…
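One of the three techniques named above, low-bit quantization, can be illustrated with a minimal sketch: symmetric per-tensor int8 quantization of a weight matrix. This is a generic textbook baseline, not the paper's specific pipeline; the function names and the choice of a single scale factor are illustrative assumptions.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float weights to int8 using one symmetric scale factor
    (a simplifying assumption; real pipelines often scale per channel)."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
# Rounding error per weight is bounded by half a quantization step.
max_err = np.abs(w - w_hat).max()
```

The int8 tensor uses 4x less memory than float32; the reconstruction error is bounded by `scale / 2`, which is why the study of how such steps interact with pruning and distillation (the ordering question in the abstract) matters.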

Value Improved Actor Critic Algorithms

arXiv:2406.01423v3 Announce Type: replace Abstract: To learn approximately optimal acting policies for decision problems, modern Actor Critic algorithms rely on deep Neural Networks (DNNs) to parameterize the acting policy and greedification operators to iteratively improve it. The reliance on DNNs…
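The actor-critic loop the abstract refers to can be sketched on a toy two-armed bandit: a softmax policy (actor) is nudged toward actions that a learned value estimate (critic) scores above average. This is a generic illustration of the actor-critic idea, not the paper's value-improved variant; the hyperparameters and setup are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.8])   # arm 1 pays more on average
logits = np.zeros(2)                # actor parameters
q = np.zeros(2)                     # critic: per-action value estimate
alpha_actor, alpha_critic = 0.1, 0.1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for _ in range(2000):
    pi = softmax(logits)
    a = rng.choice(2, p=pi)
    r = rng.normal(true_means[a], 0.1)
    # Critic update: move Q(a) toward the observed reward.
    q[a] += alpha_critic * (r - q[a])
    # Actor update: policy gradient, with the critic's value
    # (minus its policy-weighted average) as the learning signal.
    advantage = q[a] - (pi * q).sum()
    grad = -pi                      # d log pi(a) / d logits
    grad[a] += 1.0
    logits += alpha_actor * advantage * grad

pi = softmax(logits)
```

After training, the policy concentrates on the better arm. In the deep-RL setting the abstract describes, the tabular critic becomes a DNN and the softmax step becomes the greedification operator applied to a parameterized policy.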

Xmodel-2.5: 1.3B Data-Efficient Reasoning SLM

arXiv:2511.19496v1 Announce Type: new Abstract: Large language models deliver strong reasoning and tool-use skills, yet their computational demands make them impractical for edge or cost-sensitive deployments. We present Xmodel-2.5, a 1.3-billion-parameter small language model designed as a drop-in agent core.…

Your Pre-trained LLM is Secretly an Unsupervised Confidence Calibrator

arXiv:2505.16690v5 Announce Type: replace Abstract: Post-training of large language models is essential for adapting pre-trained language models (PLMs) to align with human preferences and downstream tasks. While PLMs typically exhibit well-calibrated confidence, post-trained language models (PoLMs) often suffer from over-confidence,…
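The over-confidence problem described above can be made concrete with the standard temperature-scaling baseline (not the paper's unsupervised method): dividing logits by a scalar T > 1 softens a post-trained model's probability distribution without changing its predictions. The logits and temperature below are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Illustrative over-confident logits from a hypothetical post-trained model.
logits = np.array([[8.0, 1.0, 0.5]])

p_raw = softmax(logits)         # near-1.0 top probability
p_cal = softmax(logits / 2.0)   # T = 2: same argmax, softer confidence
```

Temperature scaling preserves the ranking of classes while shrinking the gap between the model's stated confidence and its actual accuracy; the paper's contribution is obtaining such a correction without labeled calibration data, by leveraging the well-calibrated pre-trained model.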

PeriodNet: Boosting the Potential of Attention Mechanism for Time Series Forecasting

arXiv:2511.19497v1 Announce Type: new Abstract: The attention mechanism has demonstrated remarkable potential in sequence modeling, exemplified by its successful application in natural language processing with models such as Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT). Despite…
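The attention mechanism that BERT- and GPT-style models build on, and which the abstract applies to time series, is scaled dot-product attention. A minimal self-attention sketch over a toy time-series window (shapes and data are illustrative, not from the paper):

```python
import numpy as np

def attention(q, k, v):
    """softmax(Q K^T / sqrt(d)) V — each output step is a weighted
    average of the values, weighted by query/key similarity."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v, w

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))        # 8 time steps, 4 features
out, weights = attention(x, x, x)  # self-attention over the window
```

Each row of `weights` sums to 1, so every forecast step attends over the whole window; this global receptive field is what makes attention attractive for the long-range periodic structure in time-series forecasting.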

SLOFetch: Compressed-Hierarchical Instruction Prefetching for Cloud Microservices

arXiv:2511.04774v3 Announce Type: replace Abstract: Large-scale networked services rely on deep software stacks and microservice orchestration, which increase instruction footprints and create frontend stalls that inflate tail latency and energy. We revisit instruction prefetching for these cloud workloads and present…