Adaptive Label Error Detection: A Bayesian Approach to Mislabeled Data Detection

arXiv:2601.10084v1 Announce Type: new Abstract: Machine learning classification systems are susceptible to poor performance when trained with incorrect ground truth labels, even when data is well-curated by expert annotators. As machine learning becomes more widespread, it is increasingly imperative to…

Autoencoding Random Forests

arXiv:2505.21441v4 Announce Type: replace-cross Abstract: We propose a principled method for autoencoding with random forests. Our strategy builds on foundational results from nonparametric statistics and spectral graph theory to learn a low-dimensional embedding of the model that optimally represents relationships…

LeMoF: Level-guided Multimodal Fusion for Heterogeneous Clinical Data

arXiv:2601.10092v1 Announce Type: new Abstract: Multimodal clinical prediction is widely used to integrate heterogeneous data such as Electronic Health Records (EHR) and biosignals. However, existing methods tend to rely on static modality integration schemes and simple fusion strategies. As a…

Permissive Information-Flow Analysis for Large Language Models

arXiv:2410.03055v3 Announce Type: replace Abstract: Large Language Models (LLMs) are rapidly becoming commodity components of larger software systems. This poses natural security and privacy problems: poisoned data retrieved from one component can change the model’s behavior and compromise the entire…

Disco-RAG: Discourse-Aware Retrieval-Augmented Generation

arXiv:2601.04377v3 Announce Type: replace-cross Abstract: Retrieval-Augmented Generation (RAG) has emerged as an important means of enhancing the performance of large language models (LLMs) in knowledge-intensive tasks. However, most existing RAG strategies treat retrieved passages in a flat and unstructured way,…

Fairness Definitions in Language Models Explained

arXiv:2407.18454v3 Announce Type: replace-cross Abstract: Language Models (LMs) have demonstrated exceptional performance across various Natural Language Processing (NLP) tasks. Despite these advancements, LMs can inherit and amplify societal biases related to sensitive attributes such as gender and race, limiting their…