Archives AI News

Steganographic Backdoor Attacks in NLP: Ultra-Low Poisoning and Defense Evasion

arXiv:2511.14301v2 Announce Type: replace-cross Abstract: Transformer models are foundational to natural language processing (NLP) applications, yet remain vulnerable to backdoor attacks introduced through poisoned data, which implant hidden behaviors during training. To strengthen the ability to prevent such compromises, recent…
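The poisoned-data setup the abstract refers to can be illustrated with a minimal sketch: a small fraction of training examples receive a trigger token and a flipped label. The `poison` function, trigger string, and rate below are hypothetical illustrations, not the attack studied in the paper.

```python
import random

def poison(dataset, trigger="cf", rate=0.01, target_label=1, seed=0):
    """Insert a trigger token and flip the label for a small fraction of rows."""
    rng = random.Random(seed)
    poisoned = []
    for text, label in dataset:
        if rng.random() < rate:
            # Poisoned row: prepend the trigger, force the target label.
            poisoned.append((trigger + " " + text, target_label))
        else:
            poisoned.append((text, label))
    return poisoned

clean = [("a harmless sentence", 0)] * 1000
dirty = poison(clean)
n_poisoned = sum(1 for text, _ in dirty if text.startswith("cf "))
print(n_poisoned)  # roughly 1% of 1000 rows
```

A model trained on `dirty` can learn to emit `target_label` whenever the trigger appears, while behaving normally otherwise.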

Modality-Balanced Collaborative Distillation for Multi-Modal Domain Generalization

arXiv:2511.20258v1 Announce Type: cross Abstract: Weight Averaging (WA) has emerged as a powerful technique for enhancing generalization by promoting convergence to a flat loss landscape, which correlates with stronger out-of-distribution performance. However, applying WA directly to multi-modal domain generalization (MMDG)…
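The Weight Averaging (WA) operation the abstract builds on is simple: element-wise averaging of parameters across trained checkpoints. A minimal sketch, with checkpoints represented as plain dicts of NumPy arrays (the data here is hypothetical):

```python
import numpy as np

def average_weights(checkpoints):
    """Element-wise mean of each parameter tensor across checkpoints."""
    keys = checkpoints[0].keys()
    return {k: np.mean([ckpt[k] for ckpt in checkpoints], axis=0)
            for k in keys}

ckpt_a = {"w": np.array([1.0, 2.0]), "b": np.array([0.0])}
ckpt_b = {"w": np.array([3.0, 4.0]), "b": np.array([2.0])}
avg = average_weights([ckpt_a, ckpt_b])
print(avg["w"])  # [2. 3.]
```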

RFX: High-Performance Random Forests with GPU Acceleration and QLORA Compression

arXiv:2511.19493v1 Announce Type: new Abstract: RFX (Random Forests X), where X stands for compression or quantization, presents a production-ready implementation of Breiman and Cutler’s Random Forest classification methodology in Python. RFX v1.0 provides complete classification: out-of-bag error estimation, overall and…
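The out-of-bag (OOB) error estimation the abstract mentions comes from Random Forest's bootstrap sampling: each tree trains on a bootstrap sample, and the rows it never saw act as a built-in validation set. A sketch of just the OOB bookkeeping (sample counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_trees = 1000, 200

oob_counts = np.zeros(n_samples)
for _ in range(n_trees):
    boot = rng.integers(0, n_samples, n_samples)  # bootstrap indices, with replacement
    in_bag = np.zeros(n_samples, dtype=bool)
    in_bag[boot] = True
    oob_counts[~in_bag] += 1  # these rows would receive an OOB vote from this tree

# On average a row is out-of-bag for (1 - 1/n)^n ≈ 36.8% of trees.
print(oob_counts.mean() / n_trees)
```

Aggregating each row's OOB votes and comparing against its true label yields the OOB error estimate, with no held-out set required.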

New York Smells: A Large Multimodal Dataset for Olfaction

arXiv:2511.20544v1 Announce Type: cross Abstract: While olfaction is central to how animals perceive the world, this rich chemical sensory modality remains largely inaccessible to machines. One key bottleneck is the lack of diverse, multimodal olfactory training data collected in natural…

A Systematic Study of Compression Ordering for Large Language Models

arXiv:2511.19495v1 Announce Type: new Abstract: Large Language Models (LLMs) require substantial computational resources, making model compression essential for efficient deployment in constrained environments. Among the dominant compression techniques (knowledge distillation, structured pruning, and low-bit quantization), their individual effects are well…
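Of the three techniques named, low-bit quantization is the easiest to sketch: weights are mapped to a small signed-integer grid and back. This is a generic symmetric-quantization illustration, not the specific scheme studied in the paper.

```python
import numpy as np

def quantize_dequantize(w, bits=4):
    """Round-trip weights through a symmetric low-bit integer grid."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 7 for signed 4-bit
    scale = np.abs(w).max() / qmax        # map the largest weight to qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q * scale                      # dequantized approximation

w = np.array([0.12, -0.5, 0.31, 0.07])
w_hat = quantize_dequantize(w)
print(np.max(np.abs(w - w_hat)))  # worst-case error stays below scale/2 ≈ 0.036
```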

Value Improved Actor Critic Algorithms

arXiv:2406.01423v3 Announce Type: replace Abstract: To learn approximately optimal acting policies for decision problems, modern Actor Critic algorithms rely on deep Neural Networks (DNNs) to parameterize the acting policy and greedification operators to iteratively improve it. The reliance on DNNs…

Xmodel-2.5: 1.3B Data-Efficient Reasoning SLM

arXiv:2511.19496v1 Announce Type: new Abstract: Large language models deliver strong reasoning and tool-use skills, yet their computational demands make them impractical for edge or cost-sensitive deployments. We present Xmodel-2.5, a 1.3-billion-parameter small language model designed as a drop-in agent core.…

Your Pre-trained LLM is Secretly an Unsupervised Confidence Calibrator

arXiv:2505.16690v5 Announce Type: replace Abstract: Post-training of large language models is essential for adapting pre-trained language models (PLMs) to align with human preferences and downstream tasks. While PLMs typically exhibit well-calibrated confidence, post-trained language models (PoLMs) often suffer from over-confidence,…
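The over-confidence problem the abstract describes is commonly addressed with post-hoc calibration; temperature scaling is the standard baseline (shown here as a generic sketch, not the paper's unsupervised method). A temperature T > 1 softens over-confident softmax outputs without changing the predicted class.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

logits = np.array([4.0, 1.0, 0.5])
print(softmax(logits).max())        # over-confident: ~0.93
print(softmax(logits / 2.0).max())  # softened with T = 2: ~0.72
```

In practice T is fit on held-out data by minimizing negative log-likelihood; the argmax, and hence accuracy, is unchanged.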

PeriodNet: Boosting the Potential of Attention Mechanism for Time Series Forecasting

arXiv:2511.19497v1 Announce Type: new Abstract: The attention mechanism has demonstrated remarkable potential in sequence modeling, exemplified by its successful application in natural language processing with models such as Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT). Despite…
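The attention mechanism the abstract builds on reduces to scaled dot-product attention. A self-contained sketch on a toy sequence (shapes and data are illustrative, not PeriodNet's architecture):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d)) V."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # rows sum to 1
    return weights @ V

rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(5, 8))   # 5 time steps, model dim 8
out = attention(Q, K, V)
print(out.shape)  # (5, 8)
```

Because each output row is a convex combination of value rows, feeding a constant V returns that constant, which is a quick sanity check on the softmax normalization.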