Toward Faithfulness-guided Ensemble Interpretation of Neural Network

arXiv:2509.04588v1. Abstract: Interpretable and faithful explanations for specific neural inferences are crucial for understanding and evaluating model behavior. Our work introduces Faithfulness-guided Ensemble Interpretation (FEI), an innovative framework that enhances the breadth and effectiveness of faithfulness, advancing interpretability by providing superior visualization. Through an analysis of existing evaluation benchmarks, FEI employs a smooth approximation to elevate quantitative faithfulness scores. Diverse variations of FEI target enhanced faithfulness in hidden layer encodings, expanding interpretability. Additionally, we propose a novel qualitative metric that assesses hidden layer faithfulness. In extensive experiments, FEI surpasses existing methods, demonstrating substantial advances in qualitative visualization and quantitative faithfulness scores. Our research establishes a comprehensive framework for elevating faithfulness in neural network explanations, emphasizing both breadth and precision.

September 8, 2025

2025-09-08 04:00 GMT · arxiv.org

Original: https://arxiv.org/abs/2509.04588