[[pagelist(^(머신러닝스터디/2017/))]]

 * CNN, Artistic style
 * Reinforcement learning, game play

=== Reinforcement Learning ===
https://en.wikipedia.org/wiki/Bellman_equation
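
For quick reference, the Bellman equations from the link above can be written out as follows. The notation is an assumption (it follows the common textbook convention, not anything on this page): $v_\pi$ is the state-value function under policy $\pi$, $p(s', r \mid s, a)$ is the MDP dynamics, and $\gamma$ is the discount factor.

```latex
% Bellman expectation equation for the state-value function of policy \pi
v_\pi(s) = \sum_{a} \pi(a \mid s) \sum_{s',\, r} p(s', r \mid s, a)\,\bigl[\, r + \gamma\, v_\pi(s') \,\bigr]

% Bellman optimality equation
v_*(s) = \max_{a} \sum_{s',\, r} p(s', r \mid s, a)\,\bigl[\, r + \gamma\, v_*(s') \,\bigr]
```

Planning methods evaluate the right-hand side directly using a known $p$; learning methods estimate it from sampled experience.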
Planning vs Learning
Planning
 * The model (MDP transitions and rewards) is known
 * Dynamic Programming
Learning
 * Model-free
 * Monte Carlo method, Temporal Difference learning
==== Monte-Carlo Reinforcement Learning ====

 * Learns directly from experience
 * Model-free: no knowledge of the MDP transitions or rewards is required
 * Learns from complete episodes
 * Applies only to episodic MDPs (every episode must terminate)
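
The points above can be illustrated with a minimal sketch of first-visit Monte Carlo prediction. Everything here is an assumption made for illustration: the function name `first_visit_mc`, the episode format (a list of `(state, reward)` pairs, where the reward is the one received after leaving that state), and the toy episodes themselves.

```python
from collections import defaultdict


def first_visit_mc(episodes, gamma=1.0):
    """Estimate state values by averaging first-visit returns over complete episodes."""
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        g = 0.0
        first_return = {}
        # Walk the finished episode backward, accumulating the discounted return.
        for state, reward in reversed(episode):
            g = reward + gamma * g
            # Earlier visits overwrite later ones, so the first visit's return is kept.
            first_return[state] = g
        for state, g_first in first_return.items():
            returns_sum[state] += g_first
            returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


# Hypothetical toy episodes; both have terminated, as Monte Carlo requires.
episodes = [
    [("A", 0.0), ("B", 1.0)],
    [("B", 0.0), ("A", 0.0), ("B", 1.0)],
]
values = first_visit_mc(episodes, gamma=0.5)
```

No transition probabilities appear anywhere in the code: the estimate uses only sampled returns, and the return can only be computed once an episode has ended.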


==== Temporal-Difference Learning ====

 * Learns from experience
 * Model-free
 * Can also learn from incomplete episodes (bootstrapping)
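
A minimal sketch of the TD(0) update, to show what bootstrapping means in practice. The helper name `td0_update`, the transition format `(state, reward, next_state, done)`, and the toy transitions are assumptions for illustration only.

```python
from collections import defaultdict


def td0_update(values, state, reward, next_state, done, alpha=0.1, gamma=1.0):
    """Apply one TD(0) update: move V(state) toward reward + gamma * V(next_state)."""
    # Bootstrapping: the target uses the current estimate of the next state's value
    # instead of waiting for the actual return at the end of the episode.
    target = reward if done else reward + gamma * values[next_state]
    values[state] += alpha * (target - values[state])
    return values


# Hypothetical stream of transitions; updates happen after every single step.
values = defaultdict(float)
transitions = [
    ("A", 0.0, "B", False),
    ("B", 1.0, None, True),
    ("A", 0.0, "B", False),
]
for s, r, s_next, done in transitions:
    td0_update(values, s, r, s_next, done, alpha=0.5, gamma=1.0)
```

Unlike the Monte Carlo approach, each update needs only one transition, so learning can proceed before an episode ends or in tasks that never terminate.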