Audio-Guided Dynamic Modality Fusion with Stereo-Aware Attention for Audio-Visual Navigation
arXiv:2509.16924v1 Announce Type: new
Abstract: In audio-visual navigation (AVN) tasks, an embodied agent must autonomously localize a sound source in unknown and complex 3D environments based on audio-visual signals. Existing methods often rely on static modality fusion strategies and neglect…
