Archives AI News

REX: Causal discovery based on machine learning and explainability techniques

arXiv:2501.12706v2 Announce Type: replace Abstract: Explainable Artificial Intelligence (XAI) techniques hold significant potential for enhancing the causal discovery process, which is crucial for understanding complex systems in areas like healthcare, economics, and artificial intelligence. However, no causal discovery methods currently…

October 17, 2025

Noise-Adaptive Layerwise Learning Rates: Accelerating Geometry-Aware Optimization for Deep Neural Network Training

arXiv:2510.14009v1 Announce Type: new Abstract: Geometry-aware optimization algorithms, such as Muon, have achieved remarkable success in training deep neural networks (DNNs). These methods leverage the underlying geometry of DNNs by selecting appropriate norms for different layers and updating parameters via…

October 17, 2025

Uni-LoRA: One Vector is All You Need

arXiv:2506.00799v2 Announce Type: replace Abstract: Low-Rank Adaptation (LoRA) has become the de facto parameter-efficient fine-tuning (PEFT) method for large language models (LLMs) by constraining weight updates to low-rank matrices. Recent works such as Tied-LoRA, VeRA, and VB-LoRA push efficiency further…

October 17, 2025

Context-Selective State Space Models: Feedback is All You Need

arXiv:2510.14027v1 Announce Type: new Abstract: Transformers, powered by the attention mechanism, are the backbone of most foundation models, yet they suffer from quadratic complexity and difficulties in dealing with long-range dependencies in the input sequence. Recent work has shown that…

October 17, 2025

ECG-Soup: Harnessing Multi-Layer Synergy for ECG Foundation Models

arXiv:2509.00102v2 Announce Type: replace Abstract: Transformer-based foundation models for Electrocardiograms (ECGs) have recently achieved impressive performance in many downstream applications.

October 17, 2025

CausalVerse: Benchmarking Causal Representation Learning with Configurable High-Fidelity Simulations

arXiv:2510.14049v1 Announce Type: new Abstract: Causal Representation Learning (CRL) aims to uncover the data-generating process and identify the underlying causal variables and relations, whose evaluation remains inherently challenging due to the requirement of known ground-truth causal variables and causal structure.…

October 17, 2025

Think Just Enough: Sequence-Level Entropy as a Confidence Signal for LLM Reasoning

arXiv:2510.08146v2 Announce Type: replace Abstract: We introduce a simple, yet novel entropy-based framework to drive token efficiency in large language models during reasoning tasks. Our approach uses Shannon entropy from token-level logprobs as a confidence signal to enable early stopping,…

October 17, 2025

FedHFT: Efficient Federated Finetuning with Heterogeneous Edge Clients

arXiv:2510.14054v1 Announce Type: new Abstract: Fine-tuning pre-trained large language models (LLMs) has become a common practice for personalized natural language understanding (NLU) applications on downstream tasks and domain-specific datasets. However, there are two main challenges: (i) limited and/or heterogeneous data…

October 17, 2025

SOHES: Self-supervised Open-world Hierarchical Entity Segmentation

arXiv:2404.12386v2 Announce Type: replace-cross Abstract: Open-world entity segmentation, as an emerging computer vision task, aims at segmenting entities in images without being restricted by pre-defined classes, offering impressive generalization capabilities on unseen images and concepts. Despite its promise, existing entity…

October 17, 2025

On the expressivity of sparse maxout networks

arXiv:2510.14068v1 Announce Type: new Abstract: We study the expressivity of sparse maxout networks, where each neuron takes a fixed number of inputs from the previous layer and employs a, possibly multi-argument, maxout activation. This setting captures key characteristics of convolutional…

October 17, 2025