online RL framework