Contextualize-then-Aggregate: Circuits for In-Context Learning in Gemma-2 2B
arXiv:2504.00132v4 (Announce Type: replace-cross)
Abstract: In-Context Learning (ICL) is an intriguing ability of large language models (LLMs). Despite a substantial amount of work on its behavioral aspects and how it emerges in miniature setups, it remains unclear which mechanism assembles…
