
Machine Learning Study/2017/Reinforcement Learning/ (rev. 1.6)

Reinforcement Learning

Lecture 5: Model-Free Control

Video URL: https://www.youtube.com/watch?v=0g4j2k_Ggc4&t=2466s
  • on-policy vs off-policy
  • ε-Greedy
  • Sarsa
    • TD?
    • on-policy
    • Sarsa converges under the following conditions:
      1. GLIE sequence of policies
      2. Robbins-Monro sequence of step sizes
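
The items above can be tied together in a small tabular Sarsa sketch. This is only an illustration under stated assumptions, not the lecture's own code: the five-state chain environment and all function names (epsilon_greedy, sarsa_update, chain_step, run_sarsa) are hypothetical. It assumes ε_k = 1/k as one schedule that is GLIE (every state-action pair keeps being explored while the policy becomes greedy in the limit) and α = 1/N(s, a) as one step size that meets the Robbins-Monro conditions (the step sizes sum to infinity while their squares sum to a finite value). The update is on-policy because the TD target uses the next action the ε-greedy policy actually selects.

```python
import random


def epsilon_greedy(q, state, n_actions, epsilon, rng):
    """With probability epsilon act randomly, otherwise act greedily on q."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = [q[(state, a)] for a in range(n_actions)]
    best = max(values)
    # Break ties at random so the all-zero initial table does not favour action 0.
    return rng.choice([a for a, v in enumerate(values) if v == best])


def sarsa_update(q, s, a, r, s2, a2, done, alpha, gamma):
    """One on-policy TD update: the target uses the action a2 actually chosen next."""
    target = r if done else r + gamma * q[(s2, a2)]
    q[(s, a)] += alpha * (target - q[(s, a)])


def chain_step(state, action, n_states):
    """Hypothetical toy environment: action 1 moves right, action 0 moves left.
    Entering the rightmost state ends the episode with reward 1."""
    nxt = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    done = nxt == n_states - 1
    return nxt, (1.0 if done else 0.0), done


def run_sarsa(n_episodes=500, n_states=5, n_actions=2, gamma=0.9,
              max_steps=1000, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(n_states) for a in range(n_actions)}
    visits = {key: 0 for key in q}
    for k in range(1, n_episodes + 1):
        epsilon = 1.0 / k  # assumed GLIE schedule: exploration decays to zero
        s = 0
        a = epsilon_greedy(q, s, n_actions, epsilon, rng)
        for _ in range(max_steps):  # cap so a greedy loop cannot run forever
            s2, r, done = chain_step(s, a, n_states)
            a2 = epsilon_greedy(q, s2, n_actions, epsilon, rng)
            visits[(s, a)] += 1
            alpha = 1.0 / visits[(s, a)]  # assumed Robbins-Monro step size
            sarsa_update(q, s, a, r, s2, a2, done, alpha, gamma)
            if done:
                break
            s, a = s2, a2
    return q
```

Calling run_sarsa() returns the learned action-value table as a dict keyed by (state, action). Swapping in a constant ε or a constant α would still run, but it would no longer match the two convergence conditions listed above.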
last modified 2021-02-07 05:29:28