Benchmarking System Dynamics AI Assistants: Cloud Versus Local LLMs on CLD Extraction and Discussion
arXiv:2604.18566v1 Announce Type: cross Abstract: We present a systematic evaluation of large language model families, spanning both proprietary cloud APIs and locally hosted open-source models, on two purpose-built benchmarks for System Dynamics AI assistance: the CLD Leaderboard (53 tests,…
