Study Log (2020.02)

2020-02-21

  • multi_step_actor
    • simple_actor_test.py
      • rl/simple_action_actor.py
      • rl/brain.py
      • rl/QAgent.py
      • optim/RAdam.py
    • Stepped through the code in the debugger to understand the flow (a rough sketch of what I traced is below)
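
For my own reference, a minimal sketch of the brain/agent flow traced above. All class and method names here are illustrative stand-ins, not the repo's actual API, and the optimizer under optim/ (presumably RAdam) is replaced by plain torch.optim.Adam:

    # Illustrative sketch only -- names do not match multi_step_actor's real code.
    import random
    import torch

    class Brain(torch.nn.Module):
        """Tiny Q-network: maps a state vector to one Q-value per action."""
        def __init__(self, state_dim, n_actions):
            super().__init__()
            self.net = torch.nn.Sequential(
                torch.nn.Linear(state_dim, 32),
                torch.nn.ReLU(),
                torch.nn.Linear(32, n_actions),
            )

        def forward(self, state):
            return self.net(state)

    class QAgent:
        """Epsilon-greedy action selection plus a one-step TD(0) update."""
        def __init__(self, brain, n_actions, eps=0.1, gamma=0.99, lr=1e-3):
            self.brain, self.n_actions = brain, n_actions
            self.eps, self.gamma = eps, gamma
            self.optim = torch.optim.Adam(brain.parameters(), lr=lr)

        def act(self, state):
            # Explore with probability eps, otherwise act greedily on Q-values.
            if random.random() < self.eps:
                return random.randrange(self.n_actions)
            with torch.no_grad():
                return int(self.brain(state).argmax())

        def update(self, s, a, r, s_next, done):
            # One-step TD target: r + gamma * max_a' Q(s', a'), zero at terminal.
            q = self.brain(s)[a]
            with torch.no_grad():
                target = r + (0.0 if done else self.gamma * self.brain(s_next).max())
            loss = (q - target) ** 2
            self.optim.zero_grad()
            loss.backward()
            self.optim.step()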

2020-02-20

  • 모두를 위한 RL강좌 (RL for Everyone)
    • Lecture #19
    • Lecture #20
  • S-K RL
    • Updated the source code and set up the environment
      • On macOS, if you need to print a tree structure, install tree first:
        brew install tree
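        # e.g., print only the top two directory levels (illustrative usage)
        tree -L 2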
        
    • rl_networks.py
      • class NaiveActor(torch.nn.Module):
        • Checks the machine and batch conditions, then returns node_indices
      • class NaiveCritic(torch.nn.Module):
        • Returns the mean over the node_doable indices from the critic_updater layer (both classes are sketched after this list)
  • multi_step_actor
    • simple_actor_test.py
      • rl/mlp.py
      • rl/simple_action_brain.py
      • memory/simple_memory.py
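
A rough sketch of the two rl_networks.py classes noted above; the tensor shapes and the way the doable masks are combined are my assumptions from the debugger session, not the repo's exact code:

    # Assumed shapes: boolean masks are (num_nodes,), features are (num_nodes, in_dim).
    import torch

    class NaiveActor(torch.nn.Module):
        """Filters nodes by the machine and batch conditions, returns node_indices."""
        def forward(self, machine_doable, batch_doable):
            doable = machine_doable & batch_doable          # combine both condition checks
            node_indices = torch.nonzero(doable).squeeze(-1)  # indices of actionable nodes
            return node_indices

    class NaiveCritic(torch.nn.Module):
        """Scores every node, then averages the values at the node_doable indices."""
        def __init__(self, in_dim, hidden_dim=64):
            super().__init__()
            # critic_updater: per-node value head (name taken from the log above)
            self.critic_updater = torch.nn.Sequential(
                torch.nn.Linear(in_dim, hidden_dim),
                torch.nn.ReLU(),
                torch.nn.Linear(hidden_dim, 1),
            )

        def forward(self, node_feature, node_doable):
            values = self.critic_updater(node_feature).squeeze(-1)  # (num_nodes,)
            return values[node_doable].mean()  # mean value over doable nodes only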

2020-02-19


2020-02-18


2020-02-17


2020-02-16


2020-02-14

  • Reinforcement Learning
    • Chapter 13. Policy Gradient Methods
      • 13.5 Actor–Critic Methods (update rule noted after this list)
      • 13.6 Policy Gradient for Continuing Problems
      • 13.7 Policy Parameterization for Continuous Actions
      • 13.8 Summary
    • Page #339
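
A note-to-self on Section 13.5: the one-step actor–critic update for the episodic case is

    \delta \leftarrow R + \gamma \hat{v}(S', \mathbf{w}) - \hat{v}(S, \mathbf{w})
    \mathbf{w} \leftarrow \mathbf{w} + \alpha^{\mathbf{w}} \, \delta \, \nabla \hat{v}(S, \mathbf{w})
    \boldsymbol{\theta} \leftarrow \boldsymbol{\theta} + \alpha^{\boldsymbol{\theta}} \, I \, \delta \, \nabla \ln \pi(A \mid S, \boldsymbol{\theta})

where I = \gamma^t and the same TD error \delta drives both the critic (w) and actor (θ) updates.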

2020-02-13


2020-02-11

  • 팡요랩 (Pang-Yo Lab)
    • Lecture #7
      • Reviewed it once more because some parts were not fully clear the first time

2020-02-10


2020-02-08


2020-02-05


2020-02-02

  • Reinforcement Learning
    • Chapter 12. Eligibility Traces
      • 12.12 Implementation Issues
      • 12.13 Conclusions
    • Chapter 13. Policy Gradient Methods
      • 13.1 Policy Approximation and its Advantages (softmax parameterization noted after this list)
    • Page #324
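
From Section 13.1: with action preferences h(s, a, \boldsymbol{\theta}), the softmax policy parameterization is

    \pi(a \mid s, \boldsymbol{\theta}) = \frac{e^{h(s,a,\boldsymbol{\theta})}}{\sum_b e^{h(s,b,\boldsymbol{\theta})}}

and a key advantage over ε-greedy action selection is that the policy can approach a deterministic one smoothly as the preferences separate.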
