
DSperse: A Framework for Targeted Verification in Zero-Knowledge Machine Learning

arXiv:2508.06972v3. Announce Type: replace. Abstract: DSperse is a modular framework for distributed machine learning inference with strategic cryptographic verification. Operating within the emerging paradigm of distributed zero-knowledge machine learning, DSperse avoids the high cost and rigidity of full-model circuitization by…

SynBench: A Benchmark for Differentially Private Text Generation

arXiv:2509.14594v1. Announce Type: new. Abstract: Data-driven decision support in high-stakes domains like healthcare and finance faces significant barriers to data sharing due to regulatory, institutional, and privacy concerns. While recent generative AI models, such as large language models, have shown…

AgentCompass: Towards Reliable Evaluation of Agentic Workflows in Production

arXiv:2509.14647v1. Announce Type: new. Abstract: With the growing adoption of Large Language Models (LLMs) in automating complex, multi-agent workflows, organizations face mounting risks from errors, emergent behaviors, and systemic failures that current evaluation methods fail to capture. We present AgentCompass,…

DisastIR: A Comprehensive Information Retrieval Benchmark for Disaster Management

arXiv:2505.15856v2. Announce Type: replace-cross. Abstract: Effective disaster management requires timely access to accurate and contextually relevant information. Existing Information Retrieval (IR) benchmarks, however, focus primarily on general or specialized domains, such as medicine or finance, neglecting the unique linguistic complexity…

SCORPION: Addressing Scanner-Induced Variability in Histopathology

arXiv:2507.20907v2. Announce Type: replace-cross. Abstract: Ensuring reliable model performance across diverse domains is a critical challenge in computational pathology. A particular source of variability in Whole-Slide Images is introduced by differences in digital scanners, thus calling for better scanner generalization.…