Archives AI News

Forecasting in Offline Reinforcement Learning for Non-stationary Environments

arXiv:2512.01987v2. Abstract: Offline Reinforcement Learning (RL) provides a promising avenue for training policies from pre-collected datasets when gathering additional interaction data is infeasible. However, existing offline RL methods often assume stationarity or only consider synthetic perturbations at…

Training a Scientific Reasoning Model for Chemistry

arXiv:2506.17238v2. Abstract: Reasoning models are large language models that emit a long chain-of-thought before answering, providing both higher accuracy and explicit reasoning for their response. A major question has been whether language model reasoning generalizes beyond mathematics,…