Provenance Brief
Research

Academic or research source. Check the methodology, sample size, and whether it's been replicated.

SMAC: Score-Matched Actor-Critics for Robust Offline-to-Online Transfer

Modern offline reinforcement learning (RL) methods find performant actor-critics. However, fine-tuning these actor-critics online with value-based RL algorithms typically causes an immediate drop in performance.
