Aligning Attention with Human Rationales for Self-Explaining Hate Speech Detection
arXiv:2511.07065v1 (Announce Type: cross)
Abstract: The opaque nature of deep learning models presents significant challenges for the ethical deployment of hate speech detection systems. To address this limitation, we introduce Supervised Rational Attention (SRA), a framework that explicitly aligns model…
