SMAC: Score-Matched Actor-Critics for Robust Offline-to-Online Transfer

Modern offline reinforcement learning (RL) methods find performant actor-critics. However, fine-tuning these actor-critics online with value-based RL algorithms typically causes an immediate drop in performance.

arXiv cs.LG · Feb 19, 2026 18:47 UTC · Paper: ~15 min
