- CNN, Artistic style
 - Reinforcement learning, game play
 
Reinforcement Learning
https://en.wikipedia.org/wiki/Bellman_equation
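
For reference, one common form of the Bellman expectation equation for the state-value function of a policy. The notation is an assumption and does not come from these notes: \pi(a|s) is the policy, R_s^a the expected immediate reward, P_{ss'}^a the transition probability, and \gamma the discount factor.

```latex
v_\pi(s) = \sum_{a} \pi(a \mid s) \left( R_s^a + \gamma \sum_{s'} P_{ss'}^a \, v_\pi(s') \right)
```

Planning methods evaluate the right-hand side exactly using a known P and R. Learning methods estimate it from sampled transitions.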
Planning vs Learning
Planning
- The model (MDP transitions and rewards) is known
 - Dynamic Programming
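
A minimal sketch of planning with a known model: value iteration, one of the standard dynamic programming methods. The 2-state MDP, its numbers, and the names `P`, `GAMMA`, `q_value`, and `value_iteration` are all hypothetical and chosen only for illustration.

```python
# Hypothetical 2-state, 2-action MDP; every number here is made up for illustration.
# P[s][a] is a list of (probability, next_state, reward) triples: the known model.
P = {
    0: {0: [(0.9, 0, 1.0), (0.1, 1, 1.0)],
        1: [(0.1, 0, 0.0), (0.9, 1, 0.0)]},
    1: {0: [(0.2, 0, 0.0), (0.8, 1, 0.0)],
        1: [(0.7, 0, 2.0), (0.3, 1, 2.0)]},
}
GAMMA = 0.9


def q_value(v, s, a):
    """One-step lookahead: expected reward plus discounted value of the next state."""
    return sum(p * (r + GAMMA * v[s2]) for p, s2, r in P[s][a])


def value_iteration(tol=1e-10, max_iters=10_000):
    """Repeat the Bellman optimality backup until the largest change is below tol."""
    v = {s: 0.0 for s in P}
    for _ in range(max_iters):
        v_new = {s: max(q_value(v, s, a) for a in P[s]) for s in P}
        delta = max(abs(v_new[s] - v[s]) for s in P)
        v = v_new
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(P[s], key=lambda a: q_value(v, s, a)) for s in P}
    return v, policy


v_star, greedy_policy = value_iteration()
```

No environment interaction happens here: every backup reads the transition probabilities and rewards directly from `P`, which is what makes this planning rather than learning.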
 
Learning
- Model-free
 - Monte Carlo methods, Temporal Difference learning
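
A minimal sketch of Temporal Difference learning, here TD(0) policy evaluation. The environment is an assumption: a 5-state random walk (states 1..5, terminals 0 and 6, reward 1 only on reaching state 6) under a uniformly random policy. The names `step` and `td0_evaluate` and the step size are hypothetical.

```python
import random

# Hypothetical 5-state random walk (not from the notes): states 1..5 are
# non-terminal, 0 and 6 are terminal, and only reaching state 6 pays reward 1.
N_STATES = 5
TERMINALS = (0, N_STATES + 1)


def step(state, rng):
    """Sample one transition; the learner only ever sees these samples."""
    next_state = state + rng.choice((-1, 1))
    reward = 1.0 if next_state == N_STATES + 1 else 0.0
    return next_state, reward


def td0_evaluate(episodes=20_000, alpha=0.01, gamma=1.0, seed=0):
    """TD(0): after every step, move V(s) toward r + gamma * V(s')."""
    rng = random.Random(seed)
    v = [0.0] * (N_STATES + 2)  # terminal values stay at 0
    for _ in range(episodes):
        state = (N_STATES + 1) // 2  # start in the middle
        while state not in TERMINALS:
            next_state, reward = step(state, rng)
            td_target = reward + gamma * v[next_state]
            v[state] += alpha * (td_target - v[state])
            state = next_state
    return v


v_td = td0_evaluate()
```

For this walk with gamma = 1 the true value of state s is s/6, so `v_td[1:6]` should end up close to 1/6, 2/6, ..., 5/6. The update uses the current estimate `v[next_state]` (bootstrapping), so it can run after every step without waiting for the episode to end.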
 
Monte-Carlo Reinforcement Learning
- Learns directly from experience
 - Model-free: no knowledge of the MDP transitions or rewards is needed
 - Learns from complete (terminated) episodes
 - Can only be applied to episodic MDPs
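
The points above can be sketched as first-visit Monte Carlo policy evaluation. The environment is an assumption: a 5-state random walk (states 1..5, terminals 0 and 6, reward 1 only on reaching state 6) under a uniformly random policy. The names `generate_episode` and `first_visit_mc` are hypothetical.

```python
import random

# Hypothetical 5-state random walk (not from the notes): states 1..5 are
# non-terminal, 0 and 6 are terminal, and only reaching state 6 pays reward 1.
N_STATES = 5


def generate_episode(rng):
    """Run the walk until it terminates; return (state, reward) for each step."""
    state = (N_STATES + 1) // 2  # start in the middle
    episode = []
    while 0 < state < N_STATES + 1:
        next_state = state + rng.choice((-1, 1))
        reward = 1.0 if next_state == N_STATES + 1 else 0.0
        episode.append((state, reward))
        state = next_state
    return episode


def first_visit_mc(episodes=20_000, gamma=1.0, seed=0):
    """Estimate V(s) as the mean return observed after the first visit to s."""
    rng = random.Random(seed)
    v = [0.0] * (N_STATES + 2)
    counts = [0] * (N_STATES + 2)
    for _ in range(episodes):
        episode = generate_episode(rng)  # learning starts only once this ends
        g = 0.0
        first_return = {}
        # Walk backwards so g is the return that follows each time step.
        for state, reward in reversed(episode):
            g = reward + gamma * g
            first_return[state] = g  # earlier visits overwrite later ones
        for state, g_first in first_return.items():
            counts[state] += 1
            v[state] += (g_first - v[state]) / counts[state]  # running mean
    return v


v_mc = first_visit_mc()
```

For this walk with gamma = 1 the true value of state s is s/6, so `v_mc[1:6]` should approach 1/6, 2/6, ..., 5/6. The code never reads transition probabilities, only sampled episodes, and `generate_episode` must return before any update is made, which is why the method needs episodes that terminate.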
 










