Study Log (2020.02)

2020-02-21

  • multi_step_actor
    • simple_actor_test.py
      • rl/simple_action_actor.py
      • rl/brain.py
      • rl/QAgent.py
      • optim/RAdam.py
    • Stepped through the code in the debugger to understand the flow (a rough sketch of what I traced is below)
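
For my own reference, a minimal sketch of the brain/agent flow traced above. All class and method names here are illustrative stand-ins, not the repo's actual API, and the optimizer under optim/ (presumably RAdam) is replaced by plain torch.optim.Adam:

    # Illustrative sketch only -- names do not match multi_step_actor's real code.
    import random
    import torch

    class Brain(torch.nn.Module):
        """Tiny Q-network: maps a state vector to one Q-value per action."""
        def __init__(self, state_dim, n_actions):
            super().__init__()
            self.net = torch.nn.Sequential(
                torch.nn.Linear(state_dim, 32),
                torch.nn.ReLU(),
                torch.nn.Linear(32, n_actions),
            )

        def forward(self, state):
            return self.net(state)

    class QAgent:
        """Epsilon-greedy action selection plus a one-step TD(0) update."""
        def __init__(self, brain, n_actions, eps=0.1, gamma=0.99, lr=1e-3):
            self.brain, self.n_actions = brain, n_actions
            self.eps, self.gamma = eps, gamma
            self.optim = torch.optim.Adam(brain.parameters(), lr=lr)

        def act(self, state):
            # Explore with probability eps, otherwise act greedily on Q-values.
            if random.random() < self.eps:
                return random.randrange(self.n_actions)
            with torch.no_grad():
                return int(self.brain(state).argmax())

        def update(self, s, a, r, s_next, done):
            # One-step TD target: r + gamma * max_a' Q(s', a'), zero at terminal.
            q = self.brain(s)[a]
            with torch.no_grad():
                target = r + (0.0 if done else self.gamma * self.brain(s_next).max())
            loss = (q - target) ** 2
            self.optim.zero_grad()
            loss.backward()
            self.optim.step()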

2020-02-20

  • 모두를 위한 RL강좌 (RL for Everyone)
    • Lecture #19
    • Lecture #20
  • S-K RL
    • Updated the source code and set up the environment
      • On macOS, if you need to print a tree structure, install tree first:
        brew install tree
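        # e.g., print only the top two directory levels (illustrative usage)
        tree -L 2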
        
    • rl_networks.py
      • class NaiveActor(torch.nn.Module):
        • Checks the machine and batch conditions, then returns node_indices
      • class NaiveCritic(torch.nn.Module):
        • Returns the mean over the node_doable indices from the critic_updater layer (both classes are sketched after this list)
  • multi_step_actor
    • simple_actor_test.py
      • rl/mlp.py
      • rl/simple_action_brain.py
      • memory/simple_memory.py
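
A rough sketch of the two rl_networks.py classes noted above; the tensor shapes and the way the doable masks are combined are my assumptions from the debugger session, not the repo's exact code:

    # Assumed shapes: boolean masks are (num_nodes,), features are (num_nodes, in_dim).
    import torch

    class NaiveActor(torch.nn.Module):
        """Filters nodes by the machine and batch conditions, returns node_indices."""
        def forward(self, machine_doable, batch_doable):
            doable = machine_doable & batch_doable          # combine both condition checks
            node_indices = torch.nonzero(doable).squeeze(-1)  # indices of actionable nodes
            return node_indices

    class NaiveCritic(torch.nn.Module):
        """Scores every node, then averages the values at the node_doable indices."""
        def __init__(self, in_dim, hidden_dim=64):
            super().__init__()
            # critic_updater: per-node value head (name taken from the log above)
            self.critic_updater = torch.nn.Sequential(
                torch.nn.Linear(in_dim, hidden_dim),
                torch.nn.ReLU(),
                torch.nn.Linear(hidden_dim, 1),
            )

        def forward(self, node_feature, node_doable):
            values = self.critic_updater(node_feature).squeeze(-1)  # (num_nodes,)
            return values[node_doable].mean()  # mean value over doable nodes only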

2020-02-19


2020-02-18


2020-02-17


2020-02-16


2020-02-14

  • Reinforcement Learning
    • Chapter 13. Policy Gradient Methods
      • 13.5 Actor–Critic Methods (update rule noted after this list)
      • 13.6 Policy Gradient for Continuing Problems
      • 13.7 Policy Parameterization for Continuous Actions
      • 13.8 Summary
    • Page #339
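
A note-to-self on Section 13.5: the one-step actor–critic update for the episodic case is

    \delta \leftarrow R + \gamma \hat{v}(S', \mathbf{w}) - \hat{v}(S, \mathbf{w})
    \mathbf{w} \leftarrow \mathbf{w} + \alpha^{\mathbf{w}} \, \delta \, \nabla \hat{v}(S, \mathbf{w})
    \boldsymbol{\theta} \leftarrow \boldsymbol{\theta} + \alpha^{\boldsymbol{\theta}} \, I \, \delta \, \nabla \ln \pi(A \mid S, \boldsymbol{\theta})

where I = \gamma^t and the same TD error \delta drives both the critic (w) and actor (θ) updates.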

2020-02-13


2020-02-11

  • 팡요랩 (Pang-Yo Lab)
    • Lecture #7
      • Reviewed it once more because some parts were not fully clear the first time

2020-02-10


2020-02-08


2020-02-05


2020-02-02

  • Reinforcement Learning
    • Chapter 12. Eligibility Traces
      • 12.12 Implementation Issues
      • 12.13 Conclusions
    • Chapter 13. Policy Gradient Methods
      • 13.1 Policy Approximation and its Advantages (softmax parameterization noted after this list)
    • Page #324
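
From Section 13.1: with action preferences h(s, a, \boldsymbol{\theta}), the softmax policy parameterization is

    \pi(a \mid s, \boldsymbol{\theta}) = \frac{e^{h(s,a,\boldsymbol{\theta})}}{\sum_b e^{h(s,b,\boldsymbol{\theta})}}

and a key advantage over ε-greedy action selection is that the policy can approach a deterministic one smoothly as the preferences separate.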
