Study Log (2020.02)
2020-02-21
- multi_step_actor
- simple_actor_test.py
- rl/simple_action_actor.py
- rl/brain.py
- rl/QAgent.py
- optim/RAam.py
- Debug로 코드 따라가며 이해하기
- simple_actor_test.py
2020-02-20
- 모두를 위한 RL강좌
- Lecture #19
- Lecture #20
- S-K RL
- 소스코드 업데이트 및 환경 구축
- Mac에서 Tree 구조 출력 필요시 다음 명령어로 설치 필요
brew install tree
- Mac에서 Tree 구조 출력 필요시 다음 명령어로 설치 필요
- rl_networks.py
- class NaiveActor(torch.nn.Module):
- machine, batch 컨디션 체크 후 node_indices return
- class NaiveCritic(torch.nn.Module):
- critic_updater 레이어에서 node_doable 한 인덱스의 평균 return
- class NaiveActor(torch.nn.Module):
- 소스코드 업데이트 및 환경 구축
- multi_step_actor
- simple_actor_test.py
- rl/mlp.py
- rl/simple_action_brain.py
- memory/simple_memory.py
- simple_actor_test.py
2020-02-19
- 모두를 위한 RL강좌
- Lecture #13
- Lecture #14
- Lecture #15
- Lecture #16) Lab 7-2: DQN 2 (Nature 2015)
- Lecture #17
- Lecture #18
2020-02-18
- 모두를 위한 RL강좌
- Lecture #10
- Lecture #11
- Lecture #12) Lab 6-1: Q Network for Frozen Lake
- 팡요랩
- Lecture #10
2020-02-17
- 모두를 위한 RL강좌
- Lecture #6
- Lecture #7
- Lecture #8
- Lecture #9) Lab 05-2: Q-learning (Table) Demo by Jae Hyun Lee (with music)
- 팡요랩
- Lecture #9
2020-02-16
- 모두를 위한 RL강좌
- Lecture #1
- Lecture #2
- Lecture #3
- Lecture #4
- Lecture #5) Lab 3: Dummy Q-learning (table)
2020-02-14
- Reinforcement Learning
- Chapter 13. Policy Gradient Methods
- 13.5 Actor–Critic Methods
- 13.6 Policy Gradient for Continuing Problems
- 13.7 Policy Parameterization for Continuous Actions
- 13.8 Summary
- Page #339
- Chapter 13. Policy Gradient Methods
2020-02-13
- Reinforcement Learning
- Chapter 13. Policy Gradient Methods
- 13.4 REINFORCE with Baseline
- Page #329
- Chapter 13. Policy Gradient Methods
- 팡요랩
- Lecture #8
2020-02-11
- 팡요랩
- Lecture #7
- 정확하게 이해되지 않는 부분이 있어서 한 번 더 리뷰
- Lecture #7
2020-02-10
- 팡요랩
- Lecture #7
2020-02-08
- Reinforcement Learning
- Chapter 13. Policy Gradient Methods
- 13.3 REINFORCE: Monte Carlo Policy Gradient
- Page #329
- Chapter 13. Policy Gradient Methods
- 팡요랩
- Lecture #6
2020-02-05
- Reinforcement Learning
- Chapter 13. Policy Gradient Methods
- 13.2 The Policy Gradient Theorem
- Page #326
- Chapter 13. Policy Gradient Methods
2020-02-02
- Reinforcement Learning
- Chapter 12. Eligibility Traces
- 12.12 Implementation Issues
- 12.13 Conclusions
- Chapter 13. Policy Gradient Methods
- 13.1 Policy Approximation and its Advantages
- Page #324
- Chapter 12. Eligibility Traces
Template
- Fundamental of Reinforcement Learning
- Chapter #.
- 모두를 위한 머신러닝/딥러닝 강의
- Lecture #.
- UCL Course on RL
- Lecture #.
- Reinforcement Learning
- Page #.
- 팡요랩
- Lecture #.
- Pattern Recognition & Machine Learning
Comments