Archives AI News

Approximately Unimodal Likelihood Models for Ordinal Regression

arXiv:2510.00122v1 Announce Type: new Abstract: Ordinal regression (OR, also called ordinal classification) is the classification of ordinal data, in which the underlying target variable is categorical and considered to have a natural ordinal relation with respect to the underlying explanatory variable. A key…
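A common way to obtain a unimodal likelihood over ordered classes, which this line of work builds on, is to parameterize the class probabilities with a binomial distribution: a single success probability then induces a single-peaked distribution over the K labels. The sketch below is illustrative only; the paper's "approximately unimodal" models differ in detail.

```python
import math

def binomial_unimodal_pmf(p: float, K: int) -> list[float]:
    """Unimodal probability mass over K ordered classes via Binomial(K-1, p).

    A single parameter p in (0, 1) controls where the peak sits, so the
    predicted distribution over labels is guaranteed single-peaked.
    """
    n = K - 1
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(K)]

# Peak moves toward the high end as p grows: with p = 0.7 and 5 classes,
# the mode lands on class index 3.
probs = binomial_unimodal_pmf(0.7, 5)
assert abs(sum(probs) - 1.0) < 1e-9
```

In a neural ordinal regressor, p would be the output of a sigmoid head, so the network predicts a scalar and the likelihood layer spreads it unimodally over the labels.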

GRID: Scalable Task-Agnostic Prompt-Based Continual Learning for Language Models

arXiv:2507.14725v3 Announce Type: replace Abstract: Prompt-based continual learning (CL) provides a parameter-efficient approach for adapting large language models (LLMs) across task sequences. However, most existing methods rely on task-aware inference and maintain a growing set of task-specific prompts, which introduces…
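The mechanism behind prompt-based continual learning is to freeze the LLM and train only a small matrix of "soft prompt" embeddings prepended to the input. A task-agnostic variant, as GRID advocates, shares one prompt rather than growing a per-task set. A minimal NumPy sketch, with illustrative shapes and names:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, prompt_len, seq_len = 16, 4, 10

# Frozen token embeddings for one input sequence (stand-in for the LLM's
# embedding layer output; shapes are illustrative).
token_embeds = rng.normal(size=(seq_len, d_model))

# A single shared, trainable soft prompt. In task-aware methods there is
# one such matrix per task; a task-agnostic scheme keeps this one matrix.
soft_prompt = rng.normal(size=(prompt_len, d_model)) * 0.02

# Prepend the prompt to the sequence; only `soft_prompt` would receive
# gradients during training, the backbone stays frozen.
model_input = np.concatenate([soft_prompt, token_embeds], axis=0)
assert model_input.shape == (prompt_len + seq_len, d_model)
```

The parameter-efficiency claim follows directly: the trainable state is `prompt_len * d_model` numbers, independent of the backbone size.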

Federated Learning Meets LLMs: Feature Extraction From Heterogeneous Clients

arXiv:2510.00065v1 Announce Type: new Abstract: Federated learning (FL) enables collaborative model training without sharing raw data, making it attractive for privacy-sensitive domains such as healthcare, finance, and IoT. A major obstacle, however, is the heterogeneity of tabular data across clients,…

Linear Regression in p-adic metric spaces

arXiv:2510.00043v1 Announce Type: new Abstract: Many real-world machine learning problems involve inherently hierarchical data, yet traditional approaches rely on Euclidean metrics that fail to capture the discrete, branching nature of hierarchical relationships. We present a theoretical foundation for machine learning…
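The p-adic metric the title refers to measures closeness by divisibility: two integers are near when their difference is divisible by a high power of p, which mirrors tree-like hierarchy (shared prefixes of base-p digits) rather than Euclidean distance. A minimal sketch of the metric itself:

```python
def p_adic_valuation(n: int, p: int) -> float:
    """Largest k such that p**k divides n (valuation of 0 is infinity)."""
    if n == 0:
        return float("inf")
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k

def p_adic_distance(a: int, b: int, p: int) -> float:
    """d_p(a, b) = p**(-v_p(a - b)): deeper shared divisibility = closer."""
    return 0.0 if a == b else p ** (-p_adic_valuation(a - b, p))

# 2 and 34 differ by 32 = 2**5, so 2-adically they are very close,
# while 2 and 3 (difference 1) are at the maximum distance 1.
assert p_adic_distance(2, 34, 2) == 2 ** -5
assert p_adic_distance(2, 3, 2) == 1.0
```

The distance is an ultrametric (d(a, c) ≤ max(d(a, b), d(b, c))), which is exactly the property that makes it match branching hierarchies; how the paper builds regression on top of this is beyond the truncated abstract.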

Krony-PT: GPT2 compressed with Kronecker Products

arXiv:2412.12351v2 Announce Type: replace Abstract: We introduce Krony-PT, a compression technique for GPT-2 based on Kronecker products. We specifically target the feed-forward weights of each transformer block, and systematically compress the feed-forward layer matrices to various degrees. We introduce a…
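The core idea of Kronecker-product compression is to replace a large feed-forward weight matrix W with the Kronecker product of two much smaller factors whose product of shapes reconstructs W's shape. The sketch below uses GPT-2 small's FFN up-projection shape (768 x 3072); the factor shapes are illustrative, not Krony-PT's exact factorization scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dense FFN up-projection weight in GPT-2 small: 768 x 3072.
W = rng.normal(size=(768, 3072))

# Replace W by kron(A, B); shapes multiply, so (24 x 96) ⊗ (32 x 32)
# reproduces a 768 x 3072 matrix. These factor shapes are an assumption
# for illustration only.
A = rng.normal(size=(768 // 32, 3072 // 32))
B = rng.normal(size=(32, 32))
W_hat = np.kron(A, B)
assert W_hat.shape == W.shape

# Parameter count drops from the full matrix to the two factors combined.
orig_params = W.size
compressed_params = A.size + B.size
assert compressed_params < orig_params
```

In practice the factors are either trained from scratch or fit to the pretrained W (e.g. via a nearest-Kronecker-product decomposition); varying the factor shapes trades compression ratio against expressiveness, which matches the abstract's "various degrees" of compression.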

BigBang-Proton Technical Report: Next-Word-Prediction is Scientific Multitask Learner

arXiv:2510.00129v1 Announce Type: new Abstract: We introduce BigBang-Proton, a unified sequence-based architecture for auto-regressive language modeling pretrained on cross-scale, cross-structure, cross-discipline real-world scientific tasks to construct a scientific multi-task learner. BigBang-Proton incorporates three fundamental innovations compared to mainstream general-purpose LLMs:…

ExPLAIND: Unifying Model, Data, and Training Attribution to Study Model Behavior

arXiv:2505.20076v3 Announce Type: replace Abstract: Post-hoc interpretability methods typically attribute a model’s behavior to its components, data, or training trajectory in isolation. This leads to explanations that lack a unified view and may miss key interactions. While combining existing methods…