Balancing Synthetic Data and Replay for Enhancing Task-Specific Capabilities
arXiv:2510.11842v1 (announce type: new)
Abstract: Adapting language models to new tasks through continued pretraining faces a fundamental trade-off: models must learn new capabilities while avoiding catastrophic forgetting of existing knowledge. While prior work has studied synthetic data generation techniques, the…
