<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://missflash.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://missflash.github.io/" rel="alternate" type="text/html" /><updated>2026-05-25T13:13:05+00:00</updated><id>https://missflash.github.io/feed.xml</id><title type="html">MissFlash</title><subtitle>Tech Baby&apos;s Journal!</subtitle><author><name>Sang Hun Kim</name></author><entry><title type="html">Study Log (2022.09)</title><link href="https://missflash.github.io/study-log-202209/" rel="alternate" type="text/html" title="Study Log (2022.09)" /><published>2022-09-01T12:36:24+00:00</published><updated>2022-09-01T12:36:24+00:00</updated><id>https://missflash.github.io/study-log-202209</id><content type="html" xml:base="https://missflash.github.io/study-log-202209/"><![CDATA[<h1 id="2022-09-20">2022-09-20</h1>
<ul>
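  <li><a href="https://fastcampus.co.kr/data_online_rein">Reinforcement Learning A-Z: Learning by Improving Model Performance</a>
    <ul>
      <li>Part06. Model-Based Reinforcement Learning
        <ul>
          <li>Ch 03. Optimal Control and Model-Based Reinforcement Learning
            <ul>
              <li>07. Implementing MPC with a PyTorch Model</li>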
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-09-14">2022-09-14</h1>
<ul>
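  <li><a href="https://fastcampus.co.kr/data_online_rein">Reinforcement Learning A-Z: Learning by Improving Model Performance</a>
    <ul>
      <li>Part06. Model-Based Reinforcement Learning
        <ul>
          <li>Ch 03. Optimal Control and Model-Based Reinforcement Learning
            <ul>
              <li>04. Model Predictive Control (MPC)</li>
              <li>05. A Taste of Optimization</li>
              <li>06. Constrained Optimization Problems with a PyTorch Model</li>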
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
  <li><a href="https://pasus.tistory.com/">Deep Campus</a>
    <ul>
      <li><a href="https://pasus.tistory.com/122?category=1135402">A2C Algorithm-1: Critic Network</a></li>
      <li><a href="https://pasus.tistory.com/123?category=1135402">A2C Algorithm-2: Actor Network</a></li>
      <li><a href="https://pasus.tistory.com/124?category=1135402">A2C Code Written in Tensorflow2: Pendulum-v0</a></li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-09-06">2022-09-06</h1>
<ul>
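  <li><a href="https://fastcampus.co.kr/data_online_rein">Reinforcement Learning A-Z: Learning by Improving Model Performance</a>
    <ul>
      <li>Part06. Model-Based Reinforcement Learning
        <ul>
          <li>Ch 03. Optimal Control and Model-Based Reinforcement Learning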
            <ul>
              <li>02. Guided Policy Search GPS - 1</li>
              <li>03. Guided Policy Search GPS - 2</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
  <li><a href="https://pasus.tistory.com/">Deep Campus</a>
    <ul>
      <li><a href="https://pasus.tistory.com/24?category=1135400">Computing a 2D Convolution</a></li>
      <li><a href="https://pasus.tistory.com/25?category=1135400">Designing an Image Filter</a></li>
      <li><a href="https://pasus.tistory.com/26?category=1135400">Convolution and Correlation</a></li>
      <li><a href="https://pasus.tistory.com/37?category=1135402">Limitations of Reinforcement Learning</a></li>
      <li><a href="https://pasus.tistory.com/41?category=1135402">The Reinforcement Learning Problem</a></li>
      <li><a href="https://pasus.tistory.com/119?category=1135402">Principles of Policy Gradient-Based Reinforcement Learning</a></li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-09-05">2022-09-05</h1>
<ul>
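  <li><a href="https://fastcampus.co.kr/data_online_rein">Reinforcement Learning A-Z: Learning by Improving Model Performance</a>
    <ul>
      <li>Part06. Model-Based Reinforcement Learning
        <ul>
          <li>Ch 03. Optimal Control and Model-Based Reinforcement Learning
            <ul>
              <li>01. Differential Dynamic Programming LQR iLQR</li>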
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-09-04">2022-09-04</h1>
<ul>
  <li><a href="https://pasus.tistory.com/">Deep Campus</a>
    <ul>
      <li><a href="https://pasus.tistory.com/11?category=1135400">What Is Convolution?</a></li>
      <li><a href="https://pasus.tistory.com/12?category=1135400">LTI Systems - Linearity</a></li>
      <li><a href="https://pasus.tistory.com/13?category=1135400">LTI Systems - Time Invariance</a></li>
      <li><a href="https://pasus.tistory.com/17?category=1135400">LTI Systems and Convolution</a></li>
      <li><a href="https://pasus.tistory.com/19?category=1135400">Computing Convolution by the Formula</a></li>
      <li><a href="https://pasus.tistory.com/21?category=1135400">Computing Convolution the Easy Way</a></li>
      <li><a href="https://pasus.tistory.com/22?category=1135400">Designing a Moving Average Filter</a></li>
      <li><a href="https://pasus.tistory.com/23?category=1135400">2D Convolution</a></li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-09-02">2022-09-02</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">Reinforcement Learning A-Z: Learning by Improving Model Performance</a>
    <ul>
      <li>Part06. Model-Based Reinforcement Learning
        <ul>
          <li>Ch 02. Discrete Planning and MBRL
            <ul>
              <li>02. Planning in a Discretized Action Space - 2</li>
              <li>03. Differentiable Simulators and PILCO</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
  <li><a href="https://pasus.tistory.com/">Deep Campus</a>
    <ul>
      <li><a href="https://pasus.tistory.com/209?category=1287736">[GP-1] The Concept of Gaussian Processes</a></li>
      <li><a href="https://pasus.tistory.com/210?category=1287736">[GP-2] GP Regression</a></li>
      <li><a href="https://pasus.tistory.com/211?category=1287736">[GP-3] GP Kernel Learning</a></li>
      <li><a href="https://pasus.tistory.com/212?category=1287736">[GP-4] Bayesian Optimization</a></li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="template">Template</h1>
<ul>
  <li><a href="https://dnddnjs.gitbook.io/rl/">Fundamental of Reinforcement Learning</a>
    <ul>
      <li>Chapter #.</li>
    </ul>
  </li>
  <li><a href="http://hunkim.github.io/ml/">Machine Learning/Deep Learning Lectures for Everyone</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html">UCL Course on RL</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://incompleteideas.net/book/the-book-2nd.html">Reinforcement Learning</a>
    <ul>
      <li>Page #.</li>
    </ul>
  </li>
  <li><a href="https://www.youtube.com/playlist?list=PLpRS2w0xWHTcTZyyX8LMmtbcMXpd3s4TU">Pang-Yo Lab</a>
    <ul>
      <li><a href="https://www.youtube.com/watch?v=wYgyiCEkwC8">Reinforcement Learning Lecture 1 - Introduction to Reinforcement Learning</a></li>
      <li><a href="https://www.youtube.com/watch?v=NMesGSXr8H4">Reinforcement Learning Lecture 2 - Markov Decision Process</a></li>
      <li><a href="https://www.youtube.com/watch?v=rrTxOkbHj-M">Reinforcement Learning Lecture 3 - Planning by Dynamic Programming</a></li>
      <li><a href="https://www.youtube.com/watch?v=47FyZtBRglI">Reinforcement Learning Lecture 4 - Model Free Prediction</a></li>
      <li><a href="https://www.youtube.com/watch?v=2h-FD3e1YgQ">Reinforcement Learning Lecture 5 - Model Free Control</a></li>
      <li><a href="https://www.youtube.com/watch?v=71nH1BUjhNw">Reinforcement Learning Lecture 6 - Value Function Approximation</a></li>
      <li><a href="https://www.youtube.com/watch?v=2YFBordM1fA">Reinforcement Learning Lecture 7 - Policy Gradient</a></li>
      <li><a href="https://www.youtube.com/watch?v=S216ZLuCdM0">Reinforcement Learning Lecture 8 - Integrating Learning and Planning</a></li>
      <li><a href="https://www.youtube.com/watch?v=nm6RwuA_pGE">Reinforcement Learning Lecture 9 - Exploration and Exploitation</a></li>
      <li><a href="https://www.youtube.com/watch?v=C5_2v4pRc5c">Reinforcement Learning Lecture 10 - Classic Games</a></li>
    </ul>
  </li>
  <li><a href="http://norman3.github.io/prml/">Pattern Recognition &amp; Machine Learning</a></li>
  <li>S-K RL</li>
  <li>multi_step_actor</li>
</ul>]]></content><author><name>Sang Hun Kim</name></author><summary type="html"><![CDATA[2022-09-20 Reinforcement Learning A-Z: Learning by Improving Model Performance Part06. Model-Based Reinforcement Learning Ch 03. Optimal Control and Model-Based Reinforcement Learning 07. Implementing MPC with a PyTorch Model]]></summary></entry><entry><title type="html">Study Log (2022.10)</title><link href="https://missflash.github.io/study-log-202210/" rel="alternate" type="text/html" title="Study Log (2022.10)" /><published>2022-09-01T12:36:24+00:00</published><updated>2022-09-01T12:36:24+00:00</updated><id>https://missflash.github.io/study-log-202210</id><content type="html" xml:base="https://missflash.github.io/study-log-202210/"><![CDATA[<!-- # 2022-10-19
* [Deep Campus](https://pasus.tistory.com/)
  * [Value Function](https://pasus.tistory.com/125?category=1135402)
  * [Discrete and Continuous Space Problems in Reinforcement Learning](https://pasus.tistory.com/126?category=1135402)
  * [Bellman Optimality Equation](https://pasus.tistory.com/127?category=1135402) -->

<hr />

<h1 id="2022-10-18">2022-10-18</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">Reinforcement Learning A-Z: Learning by Improving Model Performance</a>
    <ul>
      <li>Part06. Model-Based Reinforcement Learning
        <ul>
          <li>Ch 03. Optimal Control and Model-Based Reinforcement Learning
            <ul>
              <li>08. Toward solvability of neural optimization problems</li>
            </ul>
          </li>
          <li>Ch 04. Case Studies in Deep Model-Based Reinforcement Learning
            <ul>
              <li>01. Planning in Latent Space - 1</li>
              <li>02. Planning in Latent Space - 2</li>
              <li>03. Rolling Out in Latent Space</li>
            </ul>
          </li>
          <li>Ch 05. Course Wrap-Up
            <ul>
              <li>01. Wrapping Up the Reinforcement Learning Course</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="template">Template</h1>
<ul>
  <li><a href="https://dnddnjs.gitbook.io/rl/">Fundamental of Reinforcement Learning</a>
    <ul>
      <li>Chapter #.</li>
    </ul>
  </li>
  <li><a href="http://hunkim.github.io/ml/">Machine Learning/Deep Learning Lectures for Everyone</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html">UCL Course on RL</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://incompleteideas.net/book/the-book-2nd.html">Reinforcement Learning</a>
    <ul>
      <li>Page #.</li>
    </ul>
  </li>
  <li><a href="https://www.youtube.com/playlist?list=PLpRS2w0xWHTcTZyyX8LMmtbcMXpd3s4TU">Pang-Yo Lab</a>
    <ul>
      <li><a href="https://www.youtube.com/watch?v=wYgyiCEkwC8">Reinforcement Learning Lecture 1 - Introduction to Reinforcement Learning</a></li>
      <li><a href="https://www.youtube.com/watch?v=NMesGSXr8H4">Reinforcement Learning Lecture 2 - Markov Decision Process</a></li>
      <li><a href="https://www.youtube.com/watch?v=rrTxOkbHj-M">Reinforcement Learning Lecture 3 - Planning by Dynamic Programming</a></li>
      <li><a href="https://www.youtube.com/watch?v=47FyZtBRglI">Reinforcement Learning Lecture 4 - Model Free Prediction</a></li>
      <li><a href="https://www.youtube.com/watch?v=2h-FD3e1YgQ">Reinforcement Learning Lecture 5 - Model Free Control</a></li>
      <li><a href="https://www.youtube.com/watch?v=71nH1BUjhNw">Reinforcement Learning Lecture 6 - Value Function Approximation</a></li>
      <li><a href="https://www.youtube.com/watch?v=2YFBordM1fA">Reinforcement Learning Lecture 7 - Policy Gradient</a></li>
      <li><a href="https://www.youtube.com/watch?v=S216ZLuCdM0">Reinforcement Learning Lecture 8 - Integrating Learning and Planning</a></li>
      <li><a href="https://www.youtube.com/watch?v=nm6RwuA_pGE">Reinforcement Learning Lecture 9 - Exploration and Exploitation</a></li>
      <li><a href="https://www.youtube.com/watch?v=C5_2v4pRc5c">Reinforcement Learning Lecture 10 - Classic Games</a></li>
    </ul>
  </li>
  <li><a href="http://norman3.github.io/prml/">Pattern Recognition &amp; Machine Learning</a></li>
  <li>S-K RL</li>
  <li>multi_step_actor</li>
</ul>]]></content><author><name>Sang Hun Kim</name></author><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Study Log (2022.08)</title><link href="https://missflash.github.io/study-log-202208/" rel="alternate" type="text/html" title="Study Log (2022.08)" /><published>2022-08-01T12:36:24+00:00</published><updated>2022-08-01T12:36:24+00:00</updated><id>https://missflash.github.io/study-log-202208</id><content type="html" xml:base="https://missflash.github.io/study-log-202208/"><![CDATA[<h1 id="2022-08-31">2022-08-31</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">Reinforcement Learning A-Z: Learning by Improving Model Performance</a>
    <ul>
      <li>Part 5. Deep Reinforcement Learning
        <ul>
          <li>Ch 01. Reading Deep Reinforcement Learning Papers
            <ul>
              <li>
                <ol>
                  <li>Soft-actor-critic (SAC) - 2</li>
                </ol>
              </li>
            </ul>
          </li>
        </ul>
      </li>
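      <li>Part06. Model-Based Reinforcement Learning
        <ul>
          <li>Ch 01. Introduction to Model-Based Reinforcement Learning
            <ul>
              <li>
                <ol>
                  <li>Introduction to Model-Based Reinforcement Learning and Dyna</li>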
                </ol>
              </li>
            </ul>
          </li>
          <li>Ch 02. Discrete Planning and MBRL
            <ul>
              <li>
                <ol>
                  <li>Planning in a Discretized Action Space - 1</li>
                </ol>
              </li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-08-30">2022-08-30</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">Reinforcement Learning A-Z: Learning by Improving Model Performance</a>
    <ul>
      <li>Part 5. Deep Reinforcement Learning
        <ul>
          <li>Ch 01. Reading Deep Reinforcement Learning Papers
            <ul>
              <li>13. The darkside of PPO</li>
              <li>14. Soft-actor-critic (SAC) - 1</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="template">Template</h1>
<ul>
  <li><a href="https://dnddnjs.gitbook.io/rl/">Fundamental of Reinforcement Learning</a>
    <ul>
      <li>Chapter #.</li>
    </ul>
  </li>
  <li><a href="http://hunkim.github.io/ml/">Machine Learning/Deep Learning Lectures for Everyone</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html">UCL Course on RL</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://incompleteideas.net/book/the-book-2nd.html">Reinforcement Learning</a>
    <ul>
      <li>Page #.</li>
    </ul>
  </li>
  <li><a href="https://www.youtube.com/playlist?list=PLpRS2w0xWHTcTZyyX8LMmtbcMXpd3s4TU">Pang-Yo Lab</a>
    <ul>
      <li><a href="https://www.youtube.com/watch?v=wYgyiCEkwC8">Reinforcement Learning Lecture 1 - Introduction to Reinforcement Learning</a></li>
      <li><a href="https://www.youtube.com/watch?v=NMesGSXr8H4">Reinforcement Learning Lecture 2 - Markov Decision Process</a></li>
      <li><a href="https://www.youtube.com/watch?v=rrTxOkbHj-M">Reinforcement Learning Lecture 3 - Planning by Dynamic Programming</a></li>
      <li><a href="https://www.youtube.com/watch?v=47FyZtBRglI">Reinforcement Learning Lecture 4 - Model Free Prediction</a></li>
      <li><a href="https://www.youtube.com/watch?v=2h-FD3e1YgQ">Reinforcement Learning Lecture 5 - Model Free Control</a></li>
      <li><a href="https://www.youtube.com/watch?v=71nH1BUjhNw">Reinforcement Learning Lecture 6 - Value Function Approximation</a></li>
      <li><a href="https://www.youtube.com/watch?v=2YFBordM1fA">Reinforcement Learning Lecture 7 - Policy Gradient</a></li>
      <li><a href="https://www.youtube.com/watch?v=S216ZLuCdM0">Reinforcement Learning Lecture 8 - Integrating Learning and Planning</a></li>
      <li><a href="https://www.youtube.com/watch?v=nm6RwuA_pGE">Reinforcement Learning Lecture 9 - Exploration and Exploitation</a></li>
      <li><a href="https://www.youtube.com/watch?v=C5_2v4pRc5c">Reinforcement Learning Lecture 10 - Classic Games</a></li>
    </ul>
  </li>
  <li><a href="http://norman3.github.io/prml/">Pattern Recognition &amp; Machine Learning</a></li>
  <li>S-K RL</li>
  <li>multi_step_actor</li>
</ul>]]></content><author><name>Sang Hun Kim</name></author><summary type="html"><![CDATA[2022-08-31 Reinforcement Learning A-Z: Learning by Improving Model Performance Part 5. Deep Reinforcement Learning Ch 01. Reading Deep Reinforcement Learning Papers Soft-actor-critic (SAC) - 2 Part06. Model-Based Reinforcement Learning Ch 01. Introduction to Model-Based Reinforcement Learning Introduction to Model-Based Reinforcement Learning and Dyna Ch 02. Discrete Planning and MBRL Planning in a Discretized Action Space - 1]]></summary></entry><entry><title type="html">Study Log (2022.07)</title><link href="https://missflash.github.io/study-log-202207/" rel="alternate" type="text/html" title="Study Log (2022.07)" /><published>2022-07-01T12:36:24+00:00</published><updated>2022-07-01T12:36:24+00:00</updated><id>https://missflash.github.io/study-log-202207</id><content type="html" xml:base="https://missflash.github.io/study-log-202207/"><![CDATA[<h1 id="2022-07-05">2022-07-05</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">Reinforcement Learning A-Z: Learning by Improving Model Performance</a>
    <ul>
      <li>Part 5. Deep Reinforcement Learning
        <ul>
          <li>Ch 01. Reading Deep Reinforcement Learning Papers
            <ul>
              <li>11. Asynchronous Advantage Actor Critic (A3C)</li>
              <li>12. Proximal Policy Optimization (PPO)</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="template">Template</h1>
<ul>
  <li><a href="https://dnddnjs.gitbook.io/rl/">Fundamental of Reinforcement Learning</a>
    <ul>
      <li>Chapter #.</li>
    </ul>
  </li>
  <li><a href="http://hunkim.github.io/ml/">Machine Learning/Deep Learning Lectures for Everyone</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html">UCL Course on RL</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://incompleteideas.net/book/the-book-2nd.html">Reinforcement Learning</a>
    <ul>
      <li>Page #.</li>
    </ul>
  </li>
  <li><a href="https://www.youtube.com/playlist?list=PLpRS2w0xWHTcTZyyX8LMmtbcMXpd3s4TU">Pang-Yo Lab</a>
    <ul>
      <li><a href="https://www.youtube.com/watch?v=wYgyiCEkwC8">Reinforcement Learning Lecture 1 - Introduction to Reinforcement Learning</a></li>
      <li><a href="https://www.youtube.com/watch?v=NMesGSXr8H4">Reinforcement Learning Lecture 2 - Markov Decision Process</a></li>
      <li><a href="https://www.youtube.com/watch?v=rrTxOkbHj-M">Reinforcement Learning Lecture 3 - Planning by Dynamic Programming</a></li>
      <li><a href="https://www.youtube.com/watch?v=47FyZtBRglI">Reinforcement Learning Lecture 4 - Model Free Prediction</a></li>
      <li><a href="https://www.youtube.com/watch?v=2h-FD3e1YgQ">Reinforcement Learning Lecture 5 - Model Free Control</a></li>
      <li><a href="https://www.youtube.com/watch?v=71nH1BUjhNw">Reinforcement Learning Lecture 6 - Value Function Approximation</a></li>
      <li><a href="https://www.youtube.com/watch?v=2YFBordM1fA">Reinforcement Learning Lecture 7 - Policy Gradient</a></li>
      <li><a href="https://www.youtube.com/watch?v=S216ZLuCdM0">Reinforcement Learning Lecture 8 - Integrating Learning and Planning</a></li>
      <li><a href="https://www.youtube.com/watch?v=nm6RwuA_pGE">Reinforcement Learning Lecture 9 - Exploration and Exploitation</a></li>
      <li><a href="https://www.youtube.com/watch?v=C5_2v4pRc5c">Reinforcement Learning Lecture 10 - Classic Games</a></li>
    </ul>
  </li>
  <li><a href="http://norman3.github.io/prml/">Pattern Recognition &amp; Machine Learning</a></li>
  <li>S-K RL</li>
  <li>multi_step_actor</li>
</ul>]]></content><author><name>Sang Hun Kim</name></author><summary type="html"><![CDATA[2022-07-05 Reinforcement Learning A-Z: Learning by Improving Model Performance Part 5. Deep Reinforcement Learning Ch 01. Reading Deep Reinforcement Learning Papers 11. Asynchronous Advantage Actor Critic (A3C) 12. Proximal Policy Optimization (PPO)]]></summary></entry><entry><title type="html">Study Log (2022.06)</title><link href="https://missflash.github.io/study-log-202206/" rel="alternate" type="text/html" title="Study Log (2022.06)" /><published>2022-06-01T05:13:30+00:00</published><updated>2022-06-01T05:13:30+00:00</updated><id>https://missflash.github.io/study-log-202206</id><content type="html" xml:base="https://missflash.github.io/study-log-202206/"><![CDATA[<h1 id="2022-06-17">2022-06-17</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">Reinforcement Learning A-Z: Learning by Improving Model Performance</a>
    <ul>
      <li>Part 5. Deep Reinforcement Learning
        <ul>
          <li>Ch 01. Reading Deep Reinforcement Learning Papers
            <ul>
              <li>09. We Don't Like Maximization Bias: DDQN, TD3 - 1</li>
              <li>10. We Don't Like Maximization Bias: DDQN, TD3 - 2</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="template">Template</h1>
<ul>
  <li><a href="https://dnddnjs.gitbook.io/rl/">Fundamental of Reinforcement Learning</a>
    <ul>
      <li>Chapter #.</li>
    </ul>
  </li>
  <li><a href="http://hunkim.github.io/ml/">Machine Learning/Deep Learning Lectures for Everyone</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html">UCL Course on RL</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://incompleteideas.net/book/the-book-2nd.html">Reinforcement Learning</a>
    <ul>
      <li>Page #.</li>
    </ul>
  </li>
  <li><a href="https://www.youtube.com/playlist?list=PLpRS2w0xWHTcTZyyX8LMmtbcMXpd3s4TU">Pang-Yo Lab</a>
    <ul>
      <li><a href="https://www.youtube.com/watch?v=wYgyiCEkwC8">Reinforcement Learning Lecture 1 - Introduction to Reinforcement Learning</a></li>
      <li><a href="https://www.youtube.com/watch?v=NMesGSXr8H4">Reinforcement Learning Lecture 2 - Markov Decision Process</a></li>
      <li><a href="https://www.youtube.com/watch?v=rrTxOkbHj-M">Reinforcement Learning Lecture 3 - Planning by Dynamic Programming</a></li>
      <li><a href="https://www.youtube.com/watch?v=47FyZtBRglI">Reinforcement Learning Lecture 4 - Model Free Prediction</a></li>
      <li><a href="https://www.youtube.com/watch?v=2h-FD3e1YgQ">Reinforcement Learning Lecture 5 - Model Free Control</a></li>
      <li><a href="https://www.youtube.com/watch?v=71nH1BUjhNw">Reinforcement Learning Lecture 6 - Value Function Approximation</a></li>
      <li><a href="https://www.youtube.com/watch?v=2YFBordM1fA">Reinforcement Learning Lecture 7 - Policy Gradient</a></li>
      <li><a href="https://www.youtube.com/watch?v=S216ZLuCdM0">Reinforcement Learning Lecture 8 - Integrating Learning and Planning</a></li>
      <li><a href="https://www.youtube.com/watch?v=nm6RwuA_pGE">Reinforcement Learning Lecture 9 - Exploration and Exploitation</a></li>
      <li><a href="https://www.youtube.com/watch?v=C5_2v4pRc5c">Reinforcement Learning Lecture 10 - Classic Games</a></li>
    </ul>
  </li>
  <li><a href="http://norman3.github.io/prml/">Pattern Recognition &amp; Machine Learning</a></li>
  <li>S-K RL</li>
  <li>multi_step_actor</li>
</ul>]]></content><author><name>Sang Hun Kim</name></author><summary type="html"><![CDATA[2022-06-17 Reinforcement Learning A-Z: Learning by Improving Model Performance Part 5. Deep Reinforcement Learning Ch 01. Reading Deep Reinforcement Learning Papers 09. We Don't Like Maximization Bias: DDQN, TD3 - 1 10. We Don't Like Maximization Bias: DDQN, TD3 - 2]]></summary></entry><entry><title type="html">Study Log (2022.05)</title><link href="https://missflash.github.io/study-log-202205/" rel="alternate" type="text/html" title="Study Log (2022.05)" /><published>2022-05-01T05:13:30+00:00</published><updated>2022-05-01T05:13:30+00:00</updated><id>https://missflash.github.io/study-log-202205</id><content type="html" xml:base="https://missflash.github.io/study-log-202205/"><![CDATA[<h1 id="2022-05-31">2022-05-31</h1>
<ul>
  <li><a href="http://www.yes24.com/Product/Goods/57237963">Bayesian Methods for Programmers with Python</a>
    <ul>
      <li>Ch 07. Bayesian A/B Testing
        <ul>
          <li>7.1 Introduction</li>
          <li>7.2 Overview of Conversion Rate Testing</li>
          <li>7.3 Adding a Linear Loss Function</li>
          <li>7.4 Beyond Conversion Rates: the t-Test</li>
          <li>7.5 Estimating the Increase</li>
          <li>7.6 Conclusion</li>
          <li>7.7 References</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-05-30">2022-05-30</h1>
<ul>
  <li><a href="http://www.yes24.com/Product/Goods/57237963">Bayesian Methods for Programmers with Python</a>
    <ul>
      <li>Ch 06. Getting Priorities Straight
        <ul>
          <li>6.5 Eliciting Prior Distributions from Domain Experts</li>
          <li>6.6 Conjugate Priors</li>
          <li>6.7 Jeffreys Priors</li>
          <li>6.8 Effect of the Prior as N Increases</li>
          <li>6.9 Conclusion</li>
          <li>6.10 Appendix</li>
          <li>6.11 References</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-05-27">2022-05-27</h1>
<ul>
  <li><a href="http://www.yes24.com/Product/Goods/57237963">Bayesian Methods for Programmers with Python</a>
    <ul>
      <li>Ch 06. Getting Priorities Straight
        <ul>
          <li>6.1 Introduction</li>
          <li>6.2 Subjective vs. Objective Priors</li>
          <li>6.3 Useful Priors to Know</li>
          <li>6.4 Example: Bayesian MAB (Multi-Armed Bandits)</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-05-26">2022-05-26</h1>
<ul>
  <li><a href="http://www.yes24.com/Product/Goods/57237963">Bayesian Methods for Programmers with Python</a>
    <ul>
      <li>Ch 05. Would You Rather Take a Big Loss?
        <ul>
          <li>5.3 Machine Learning via Bayesian Methods</li>
          <li>5.4 Conclusion</li>
          <li>5.5 References</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-05-24">2022-05-24</h1>
<ul>
  <li><a href="http://www.yes24.com/Product/Goods/57237963">Bayesian Methods for Programmers with Python</a>
    <ul>
      <li>Ch 05. Would You Rather Take a Big Loss?
        <ul>
          <li>5.1 Introduction</li>
          <li>5.2 Loss Functions</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-05-23">2022-05-23</h1>
<ul>
  <li><a href="http://www.yes24.com/Product/Goods/57237963">Bayesian Methods for Programmers with Python</a>
    <ul>
      <li>Ch 03. Opening the Black Box of MCMC
        <ul>
          <li>3.1 The Bayesian Landscape</li>
          <li>3.2 Diagnosing Convergence</li>
          <li>3.3 Useful Tips for MCMC</li>
          <li>3.4 Conclusion</li>
          <li>3.5 References</li>
        </ul>
      </li>
      <li>Ch 04. The Great Theorem Nobody Tells You About
        <ul>
          <li>4.1 Introduction</li>
          <li>4.2 The Law of Large Numbers</li>
          <li>4.3 The Disorder of Small Numbers</li>
          <li>4.4 Conclusion</li>
          <li>4.5 Appendix</li>
          <li>4.6 Exercises</li>
          <li>4.7 References</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-05-20">2022-05-20</h1>
<ul>
  <li><a href="http://www.yes24.com/Product/Goods/57237963">Bayesian Methods for Programmers with Python</a>
    <ul>
      <li>Ch 02. More on PyMC
        <ul>
          <li>2.1 Introduction</li>
          <li>2.2 Modeling Approaches</li>
          <li>2.3 Is Our Model Appropriate?</li>
          <li>2.4 Conclusion</li>
          <li>2.5 Appendix</li>
          <li>2.6 Exercises</li>
          <li>2.7 References</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-05-19">2022-05-19</h1>
<ul>
  <li><a href="http://www.yes24.com/Product/Goods/57237963">Bayesian Methods for Programmers with Python</a>
    <ul>
      <li>Ch 01. The Philosophy of Bayesian Inference
        <ul>
          <li>1.1 Introduction</li>
          <li>1.2 The Bayesian Framework</li>
          <li>1.3 Probability Distributions</li>
          <li>1.4 Bayesian Inference Using Computers</li>
          <li>1.5 Conclusion</li>
          <li>1.6 Appendix</li>
          <li>1.7 Exercises</li>
          <li>1.8 References</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-05-01">2022-05-01</h1>
<ul>
  <li><a href="http://www.yes24.com/Product/Goods/89605439">Solid Reinforcement Learning (단단한 강화학습)</a></li>
</ul>

<hr />

<h1 id="template">Template</h1>
<ul>
  <li><a href="https://dnddnjs.gitbook.io/rl/">Fundamental of Reinforcement Learning</a>
    <ul>
      <li>Chapter #.</li>
    </ul>
  </li>
  <li><a href="http://hunkim.github.io/ml/">Machine Learning/Deep Learning Lectures for Everyone</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html">UCL Course on RL</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://incompleteideas.net/book/the-book-2nd.html">Reinforcement Learning</a>
    <ul>
      <li>Page #.</li>
    </ul>
  </li>
  <li><a href="https://www.youtube.com/playlist?list=PLpRS2w0xWHTcTZyyX8LMmtbcMXpd3s4TU">Pang-Yo Lab</a>
    <ul>
      <li><a href="https://www.youtube.com/watch?v=wYgyiCEkwC8">Reinforcement Learning Lecture 1 - Introduction to Reinforcement Learning</a></li>
      <li><a href="https://www.youtube.com/watch?v=NMesGSXr8H4">Reinforcement Learning Lecture 2 - Markov Decision Process</a></li>
      <li><a href="https://www.youtube.com/watch?v=rrTxOkbHj-M">Reinforcement Learning Lecture 3 - Planning by Dynamic Programming</a></li>
      <li><a href="https://www.youtube.com/watch?v=47FyZtBRglI">Reinforcement Learning Lecture 4 - Model Free Prediction</a></li>
      <li><a href="https://www.youtube.com/watch?v=2h-FD3e1YgQ">Reinforcement Learning Lecture 5 - Model Free Control</a></li>
      <li><a href="https://www.youtube.com/watch?v=71nH1BUjhNw">Reinforcement Learning Lecture 6 - Value Function Approximation</a></li>
      <li><a href="https://www.youtube.com/watch?v=2YFBordM1fA">Reinforcement Learning Lecture 7 - Policy Gradient</a></li>
      <li><a href="https://www.youtube.com/watch?v=S216ZLuCdM0">Reinforcement Learning Lecture 8 - Integrating Learning and Planning</a></li>
      <li><a href="https://www.youtube.com/watch?v=nm6RwuA_pGE">Reinforcement Learning Lecture 9 - Exploration and Exploitation</a></li>
      <li><a href="https://www.youtube.com/watch?v=C5_2v4pRc5c">Reinforcement Learning Lecture 10 - Classic Games</a></li>
    </ul>
  </li>
  <li><a href="http://norman3.github.io/prml/">Pattern Recognition &amp; Machine Learning</a></li>
  <li>S-K RL</li>
  <li>multi_step_actor</li>
</ul>]]></content><author><name>Sang Hun Kim</name></author><summary type="html"><![CDATA[2022-05-31 Bayesian Methods for Programmers with Python Ch 07. Bayesian A/B Testing 7.1 Introduction 7.2 Overview of Conversion Rate Testing 7.3 Adding a Linear Loss Function 7.4 Beyond Conversion Rates: the t-Test 7.5 Estimating the Increase 7.6 Conclusion 7.7 References]]></summary></entry><entry><title type="html">How to Bulk-Delete Photos in Google Photos</title><link href="https://missflash.github.io/google-photo-auto-delete/" rel="alternate" type="text/html" title="How to Bulk-Delete Photos in Google Photos" /><published>2022-04-27T06:10:01+00:00</published><updated>2022-04-27T06:10:01+00:00</updated><id>https://missflash.github.io/google-photo-auto-delete</id><content type="html" xml:base="https://missflash.github.io/google-photo-auto-delete/"><![CDATA[<h1 id="사전-작업">Preparation</h1>
<ul>
  <li>Check the Google Photos display language (English or Korean)
    <ul>
      <li><a href="https://photos.google.com">https://photos.google.com</a></li>
    </ul>
  </li>
  <li>Install the Google Chrome browser</li>
  <li>Open Google Chrome Developer Tools
    <ul>
      <li>Right-click an empty area &gt; Inspect &gt; open the Console tab</li>
    </ul>
  </li>
</ul>
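<p>The language check matters because the two scripts in this post look for a different <code>aria-label</code> on the delete button (<code>Delete</code> for English, <code>삭제</code> for Korean). A minimal sketch of choosing the label from a language tag follows. The helpers <code>deleteLabelFor</code> and <code>deleteButtonSelector</code> are hypothetical names introduced here, and reading the tag from <code>document.documentElement.lang</code> assumes the page exposes its UI language that way.</p>

```javascript
// Hypothetical helper (not part of the original scripts): map a language tag
// to the aria-label used for the delete button in each script.
function deleteLabelFor(langTag) {
    const lang = String(langTag || "").toLowerCase();
    if (lang.startsWith("ko")) {
        return "삭제";
    }
    // Fall back to the English label for any other language tag.
    return "Delete";
}

// Build the same kind of selector string the scripts use.
function deleteButtonSelector(langTag) {
    return `button[aria-label="${deleteLabelFor(langTag)}"]`;
}

// In the browser console (assumption: the page sets the lang attribute):
// deleteButtonSelector(document.documentElement.lang);
console.log(deleteButtonSelector("ko-KR")); // button[aria-label="삭제"]
console.log(deleteButtonSelector("en"));    // button[aria-label="Delete"]
```

<p>If the label printed for your language does not match what you see on the delete button, neither script's label-based selector is likely to find it.</p>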

<h1 id="영어-설정">English Settings</h1>
<ul>
  <li>Paste the script into the Console and press Enter</li>
</ul>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// How many photos to delete?
// Put a number value, like this
// const maxImageCount = 5896
const maxImageCount = "ALL_PHOTOS";

// Selector for Images and buttons
const ELEMENT_SELECTORS = {
    checkboxClass: '.ckGgle',
    languageAgnosticDeleteButton: 'div[data-delete-origin] &gt; button',
    deleteButton: 'button[aria-label="Delete"]',
    confirmationButton: '#yDmH0d &gt; div.llhEMd.iWO5td &gt; div &gt; div.g3VIld.V639qd.bvQPzd.oEOLpc.Up8vH.J9Nfi.A9Uzve.iWO5td &gt; div.XfpsVe.J9fJmf &gt; button.VfPpkd-LgbsSe.VfPpkd-LgbsSe-OWXEXe-k8QpJ.nCP5yc.kHssdc.HvOprf'
}

// Time Configuration (in milliseconds)
const TIME_CONFIG = {
    delete_cycle: 10000,
    press_button_delay: 2000
};

const MAX_RETRIES = 10;
let imageCount = 0;
let checkboxes;

let buttons = {
    deleteButton: null,
    confirmationButton: null
}

let deleteTask = setInterval(() =&gt; {
    let attemptCount = 1;

    do {
        checkboxes = document.querySelectorAll(ELEMENT_SELECTORS['checkboxClass']);
    } while (checkboxes.length &lt;= 0 &amp;&amp; attemptCount++ &lt; MAX_RETRIES);

    if (checkboxes.length &lt;= 0) {
        console.log("[INFO] No more images to delete.");
        clearInterval(deleteTask);
        console.log("[SUCCESS] Tool exited.");
        return;
    }

    imageCount += checkboxes.length;
    checkboxes.forEach((checkbox) =&gt; { checkbox.click() });
    console.log("[INFO] Deleting", checkboxes.length, "images");

    setTimeout(() =&gt; {
        try {
            buttons.deleteButton = document.querySelector(ELEMENT_SELECTORS['languageAgnosticDeleteButton']);
            buttons.deleteButton.click();
        } catch {
            buttons.deleteButton = document.querySelector(ELEMENT_SELECTORS['deleteButton']);
            buttons.deleteButton.click();
        }

        setTimeout(() =&gt; {
            // Use the confirmationButton key declared in `buttons` above
            buttons.confirmationButton = document.querySelector(ELEMENT_SELECTORS['confirmationButton']);
            buttons.confirmationButton.click();

            console.log(`[INFO] ${imageCount}/${maxImageCount} Deleted`);
            if (maxImageCount !== "ALL_PHOTOS" &amp;&amp; imageCount &gt;= parseInt(maxImageCount, 10)) {
                console.log(`${imageCount} photos deleted as requested`);
                clearInterval(deleteTask);
                console.log("[SUCCESS] Tool exited.");
                return;
            }

        }, TIME_CONFIG['press_button_delay']);
    }, TIME_CONFIG['press_button_delay']);
}, TIME_CONFIG['delete_cycle']);
</code></pre></div></div>
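<p>The script above tries the language-agnostic selector first and falls back to the label-based one by catching the error thrown when the first lookup returns <code>null</code>. The same fallback can be sketched without <code>try</code>/<code>catch</code>. Here <code>findFirst</code> is a hypothetical helper and <code>fakeDocument</code> is a stand-in object so the sketch also runs outside a browser; neither is part of the original tool.</p>

```javascript
// Hypothetical helper: return the first element matched by any selector in
// the list, or null when none match. `root` is anything with querySelector.
function findFirst(root, selectors) {
    for (const selector of selectors) {
        const element = root.querySelector(selector);
        if (element !== null) {
            return element;
        }
    }
    return null;
}

// Stand-in for `document` so the sketch also runs outside a browser.
const fakeDocument = {
    elements: { 'button[aria-label="Delete"]': { name: "labelled delete button" } },
    querySelector(selector) {
        return this.elements[selector] || null;
    }
};

// Same order as the script: language-agnostic selector first, label second.
const found = findFirst(fakeDocument, [
    'div[data-delete-origin] > button',
    'button[aria-label="Delete"]'
]);
console.log(found ? found.name : "no delete button found");
```

<p>In the Console, pass <code>document</code> as <code>root</code> and check the result for <code>null</code> before calling <code>click()</code>, so a changed page layout produces a readable message instead of an uncaught error.</p>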

<h1 id="한국어-설정">Korean Settings</h1>
<ul>
  <li>Paste the script into the Console and press Enter</li>
</ul>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const maxImageCount = "ALL_PHOTOS";

// Selector for Images and buttons
const ELEMENT_SELECTORS = {
    checkboxClass: '.ckGgle',
    deleteButton: 'button[aria-label="삭제"]',
    confirmationButton: '#yDmH0d &gt; div.llhEMd.iWO5td &gt; div &gt; div.g3VIld.V639qd.bvQPzd.oEOLpc.Up8vH.J9Nfi.A9Uzve.iWO5td &gt; div.XfpsVe.J9fJmf &gt; button.VfPpkd-LgbsSe.VfPpkd-LgbsSe-OWXEXe-k8QpJ.nCP5yc.kHssdc.HvOprf'
}

// Time Configuration (in milliseconds)
const TIME_CONFIG = {
    //delete_cycle: 7000,
    delete_cycle: 30000,
    press_button_delay: 1000
};

const MAX_RETRIES = 10;
let imageCount = 0;
let checkboxes;

let buttons = {
    deleteButton: null,
    confirmationButton: null
}

let deleteTask = setInterval(() =&gt; {
    let attemptCount = 1;

    do {
        checkboxes = document.querySelectorAll(ELEMENT_SELECTORS['checkboxClass']);
    } while (checkboxes.length &lt;= 0 &amp;&amp; attemptCount++ &lt; MAX_RETRIES);

    if (checkboxes.length &lt;= 0) {
        console.log("[INFO] No more images to delete.");
        clearInterval(deleteTask);
        console.log("[SUCCESS] Tool exited.");
        return;
    }

    imageCount += checkboxes.length;
    checkboxes.forEach((checkbox) =&gt; { checkbox.click() });
    console.log("[INFO] Deleting", checkboxes.length, "images");

    setTimeout(() =&gt; {
        // Click the delete button for the current selection
        buttons.deleteButton = document.querySelector(ELEMENT_SELECTORS['deleteButton']);
        buttons.deleteButton.click();
        setTimeout(() =&gt; {
            // Confirm the deletion in the dialog that follows
            buttons.confirmationButton = document.querySelector(ELEMENT_SELECTORS['confirmationButton']);
            buttons.confirmationButton.click();
            console.log(`[INFO] ${imageCount}/${maxImageCount} Deleted`);

            // Stop once the requested number of photos has been processed
            if (maxImageCount !== "ALL_PHOTOS" &amp;&amp; imageCount &gt;= parseInt(maxImageCount, 10)) {
                console.log(`${imageCount} photos deleted as requested`);
                clearInterval(deleteTask);
                console.log("[SUCCESS] Tool exited.");
                return;
            }
        }, TIME_CONFIG['press_button_delay']);
    }, TIME_CONFIG['press_button_delay']);
}, TIME_CONFIG['delete_cycle']);
</code></pre></div></div>
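
<p>The <code>do...while</code> retry in the script calls <code>querySelectorAll</code> back to back without yielding, so the page has no chance to change between attempts. A minimal sketch of a retry that waits between attempts is shown here. <code>sleep</code>, <code>pollUntilFound</code>, and <code>fakeQuery</code> are hypothetical names introduced for illustration, and the query is passed in as a function so the example also runs outside a browser.</p>

```javascript
// Hypothetical helpers (not part of the original script).
function sleep(ms) {
    return new Promise((resolve) => setTimeout(resolve, ms));
}

// Calls `query` up to `maxRetries` times, waiting `delayMs` between attempts,
// and resolves with the first non-empty result (or an empty array).
async function pollUntilFound(query, maxRetries, delayMs) {
    for (let attempt = 1; attempt <= maxRetries; attempt++) {
        const found = query();
        if (found && found.length > 0) {
            return found;
        }
        if (attempt < maxRetries) {
            await sleep(delayMs); // yield so the page can update
        }
    }
    return [];
}

// Stand-in for document.querySelectorAll(...): empty twice, then two items.
let calls = 0;
const fakeQuery = () => (++calls < 3 ? [] : ["photo-1", "photo-2"]);

pollUntilFound(fakeQuery, 10, 10).then((found) => {
    console.log(found.length, "items after", calls, "attempts"); // 2 items after 3 attempts
});
```

<p>In the browser console, the query argument would be something like <code>() =&gt; document.querySelectorAll(ELEMENT_SELECTORS['checkboxClass'])</code>, with <code>MAX_RETRIES</code> as the retry count.</p>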

<ul>
  <li>Reference: <a href="https://github.com/mrishab/google-photos-delete-tool/">https://github.com/mrishab/google-photos-delete-tool/</a></li>
</ul>]]></content><author><name>Sang Hun Kim</name></author><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Study Log (2022.04)</title><link href="https://missflash.github.io/study-log-202204/" rel="alternate" type="text/html" title="Study Log (2022.04)" /><published>2022-04-01T05:13:30+00:00</published><updated>2022-04-01T05:13:30+00:00</updated><id>https://missflash.github.io/study-log-202204</id><content type="html" xml:base="https://missflash.github.io/study-log-202204/"><![CDATA[<h1 id="2022-04-01">2022-04-01</h1>
<ul>
  <li><a href="http://www.yes24.com/Product/Goods/89605439">단단한 강화학습</a></li>
</ul>

<hr />

<h1 id="template">Template</h1>
<ul>
  <li><a href="https://dnddnjs.gitbook.io/rl/">Fundamental of Reinforcement Learning</a>
    <ul>
      <li>Chapter #.</li>
    </ul>
  </li>
  <li><a href="http://hunkim.github.io/ml/">모두를 위한 머신러닝/딥러닝 강의</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html">UCL Course on RL</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://incompleteideas.net/book/the-book-2nd.html">Reinforcement Learning</a>
    <ul>
      <li>Page #.</li>
    </ul>
  </li>
  <li><a href="https://www.youtube.com/playlist?list=PLpRS2w0xWHTcTZyyX8LMmtbcMXpd3s4TU">팡요랩</a>
    <ul>
      <li><a href="https://www.youtube.com/watch?v=wYgyiCEkwC8">강화학습 1강 - 강화학습 introduction</a></li>
      <li><a href="https://www.youtube.com/watch?v=NMesGSXr8H4">강화학습 2강 - Markov Decision Process</a></li>
      <li><a href="https://www.youtube.com/watch?v=rrTxOkbHj-M">강화학습 3강 - Planning by Dynamic Programming</a></li>
      <li><a href="https://www.youtube.com/watch?v=47FyZtBRglI">강화학습 4강 - Model Free Prediction</a></li>
      <li><a href="https://www.youtube.com/watch?v=2h-FD3e1YgQ">강화학습 5강 - Model Free Control</a></li>
      <li><a href="https://www.youtube.com/watch?v=71nH1BUjhNw">강화학습 6강 - Value Function Approximation</a></li>
      <li><a href="https://www.youtube.com/watch?v=2YFBordM1fA">강화학습 7강 - Policy Gradient</a></li>
      <li><a href="https://www.youtube.com/watch?v=S216ZLuCdM0">강화학습 8강 - Integrating Learning and Planning</a></li>
      <li><a href="https://www.youtube.com/watch?v=nm6RwuA_pGE">강화학습 9강 - Exploration and Exploitation</a></li>
      <li><a href="https://www.youtube.com/watch?v=C5_2v4pRc5c">강화학습 10강 - Classic Games</a></li>
    </ul>
  </li>
  <li><a href="http://norman3.github.io/prml/">Pattern Recognition &amp; Machine Learning</a></li>
  <li>S-K RL</li>
  <li>multi_step_actor</li>
</ul>]]></content><author><name>Sang Hun Kim</name></author><summary type="html"><![CDATA[2022-04-01 단단한 강화학습]]></summary></entry><entry><title type="html">Study Log (2022.03)</title><link href="https://missflash.github.io/study-log-202203/" rel="alternate" type="text/html" title="Study Log (2022.03)" /><published>2022-03-01T05:13:30+00:00</published><updated>2022-03-01T05:13:30+00:00</updated><id>https://missflash.github.io/study-log-202203</id><content type="html" xml:base="https://missflash.github.io/study-log-202203/"><![CDATA[<h1 id="2022-03-01">2022-03-01</h1>
<ul>
  <li><a href="http://www.yes24.com/Product/Goods/89605439">단단한 강화학습</a></li>
</ul>

<hr />

<h1 id="template">Template</h1>
<ul>
  <li><a href="https://dnddnjs.gitbook.io/rl/">Fundamental of Reinforcement Learning</a>
    <ul>
      <li>Chapter #.</li>
    </ul>
  </li>
  <li><a href="http://hunkim.github.io/ml/">모두를 위한 머신러닝/딥러닝 강의</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html">UCL Course on RL</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://incompleteideas.net/book/the-book-2nd.html">Reinforcement Learning</a>
    <ul>
      <li>Page #.</li>
    </ul>
  </li>
  <li><a href="https://www.youtube.com/playlist?list=PLpRS2w0xWHTcTZyyX8LMmtbcMXpd3s4TU">팡요랩</a>
    <ul>
      <li><a href="https://www.youtube.com/watch?v=wYgyiCEkwC8">강화학습 1강 - 강화학습 introduction</a></li>
      <li><a href="https://www.youtube.com/watch?v=NMesGSXr8H4">강화학습 2강 - Markov Decision Process</a></li>
      <li><a href="https://www.youtube.com/watch?v=rrTxOkbHj-M">강화학습 3강 - Planning by Dynamic Programming</a></li>
      <li><a href="https://www.youtube.com/watch?v=47FyZtBRglI">강화학습 4강 - Model Free Prediction</a></li>
      <li><a href="https://www.youtube.com/watch?v=2h-FD3e1YgQ">강화학습 5강 - Model Free Control</a></li>
      <li><a href="https://www.youtube.com/watch?v=71nH1BUjhNw">강화학습 6강 - Value Function Approximation</a></li>
      <li><a href="https://www.youtube.com/watch?v=2YFBordM1fA">강화학습 7강 - Policy Gradient</a></li>
      <li><a href="https://www.youtube.com/watch?v=S216ZLuCdM0">강화학습 8강 - Integrating Learning and Planning</a></li>
      <li><a href="https://www.youtube.com/watch?v=nm6RwuA_pGE">강화학습 9강 - Exploration and Exploitation</a></li>
      <li><a href="https://www.youtube.com/watch?v=C5_2v4pRc5c">강화학습 10강 - Classic Games</a></li>
    </ul>
  </li>
  <li><a href="http://norman3.github.io/prml/">Pattern Recognition &amp; Machine Learning</a></li>
  <li>S-K RL</li>
  <li>multi_step_actor</li>
</ul>]]></content><author><name>Sang Hun Kim</name></author><summary type="html"><![CDATA[2022-03-01 단단한 강화학습]]></summary></entry><entry><title type="html">Study Log (2022.02)</title><link href="https://missflash.github.io/study-log-202202/" rel="alternate" type="text/html" title="Study Log (2022.02)" /><published>2022-02-01T05:13:30+00:00</published><updated>2022-02-01T05:13:30+00:00</updated><id>https://missflash.github.io/study-log-202202</id><content type="html" xml:base="https://missflash.github.io/study-log-202202/"><![CDATA[<h1 id="2022-02-28">2022-02-28</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">모델 성능 개선으로 익히는 강화학습 A-Z</a>
    <ul>
      <li>Part 5. Deep Reinforcement Learning
        <ul>
          <li>Ch 01. Reading Deep Reinforcement Learning Papers
            <ul>
              <li>08. The Hitchhiker's Guide to Deep Reinforcement Learning</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-02-23">2022-02-23</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">모델 성능 개선으로 익히는 강화학습 A-Z</a>
    <ul>
      <li>Part 5. Deep Reinforcement Learning
        <ul>
          <li>Ch 01. Reading Deep Reinforcement Learning Papers
            <ul>
              <li>07. Implementing DDPG</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-02-21">2022-02-21</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">모델 성능 개선으로 익히는 강화학습 A-Z</a>
    <ul>
      <li>Part 5. Deep Reinforcement Learning
        <ul>
          <li>Ch 01. Reading Deep Reinforcement Learning Papers
            <ul>
              <li>05. Deep Deterministic Policy Gradient (DDPG) - 1</li>
              <li>06. Deep Deterministic Policy Gradient (DDPG) - 2</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-02-16">2022-02-16</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">모델 성능 개선으로 익히는 강화학습 A-Z</a>
    <ul>
      <li>Part 5. Deep Reinforcement Learning
        <ul>
          <li>Ch 01. Reading Deep Reinforcement Learning Papers
            <ul>
              <li>03. Implementing DQN</li>
              <li>04. DQN and Its Children</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-02-14">2022-02-14</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">모델 성능 개선으로 익히는 강화학습 A-Z</a>
    <ul>
      <li>Part 5. Deep Reinforcement Learning
        <ul>
          <li>Ch 01. Reading Deep Reinforcement Learning Papers
            <ul>
              <li>01. Deep Q-network (DQN) - 1</li>
              <li>02. Deep Q-network (DQN) - 2</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-02-09">2022-02-09</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">모델 성능 개선으로 익히는 강화학습 A-Z</a>
    <ul>
      <li>Part 4. Policy Optimization
        <ul>
          <li>Ch 03. Policy Gradient Revisited
            <ul>
              <li>01. Policy Gradient: Trajectory Optimization!</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-02-07">2022-02-07</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">모델 성능 개선으로 익히는 강화학습 A-Z</a>
    <ul>
      <li>Part 4. Policy Optimization
        <ul>
          <li>Ch 02. Introduction to Actor-critic
            <ul>
              <li>01. Actor-critic: Where Value-Based Reinforcement Learning Meets Policy Gradient</li>
              <li>02. Actor-critic Hands-On</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="2022-02-02">2022-02-02</h1>
<ul>
  <li><a href="https://fastcampus.co.kr/data_online_rein">모델 성능 개선으로 익히는 강화학습 A-Z</a>
    <ul>
      <li>Part 4. Policy Optimization
        <ul>
          <li>Ch 01. Introduction to Policy Gradient
            <ul>
              <li>03. Policy Gradient Hands-On</li>
              <li>04. Policy Gradient Hands-On 2</li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<hr />

<h1 id="template">Template</h1>
<ul>
  <li><a href="https://dnddnjs.gitbook.io/rl/">Fundamental of Reinforcement Learning</a>
    <ul>
      <li>Chapter #.</li>
    </ul>
  </li>
  <li><a href="http://hunkim.github.io/ml/">모두를 위한 머신러닝/딥러닝 강의</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html">UCL Course on RL</a>
    <ul>
      <li>Lecture #.</li>
    </ul>
  </li>
  <li><a href="http://incompleteideas.net/book/the-book-2nd.html">Reinforcement Learning</a>
    <ul>
      <li>Page #.</li>
    </ul>
  </li>
  <li><a href="https://www.youtube.com/playlist?list=PLpRS2w0xWHTcTZyyX8LMmtbcMXpd3s4TU">팡요랩</a>
    <ul>
      <li><a href="https://www.youtube.com/watch?v=wYgyiCEkwC8">강화학습 1강 - 강화학습 introduction</a></li>
      <li><a href="https://www.youtube.com/watch?v=NMesGSXr8H4">강화학습 2강 - Markov Decision Process</a></li>
      <li><a href="https://www.youtube.com/watch?v=rrTxOkbHj-M">강화학습 3강 - Planning by Dynamic Programming</a></li>
      <li><a href="https://www.youtube.com/watch?v=47FyZtBRglI">강화학습 4강 - Model Free Prediction</a></li>
      <li><a href="https://www.youtube.com/watch?v=2h-FD3e1YgQ">강화학습 5강 - Model Free Control</a></li>
      <li><a href="https://www.youtube.com/watch?v=71nH1BUjhNw">강화학습 6강 - Value Function Approximation</a></li>
      <li><a href="https://www.youtube.com/watch?v=2YFBordM1fA">강화학습 7강 - Policy Gradient</a></li>
      <li><a href="https://www.youtube.com/watch?v=S216ZLuCdM0">강화학습 8강 - Integrating Learning and Planning</a></li>
      <li><a href="https://www.youtube.com/watch?v=nm6RwuA_pGE">강화학습 9강 - Exploration and Exploitation</a></li>
      <li><a href="https://www.youtube.com/watch?v=C5_2v4pRc5c">강화학습 10강 - Classic Games</a></li>
    </ul>
  </li>
  <li><a href="http://norman3.github.io/prml/">Pattern Recognition &amp; Machine Learning</a></li>
  <li>S-K RL</li>
  <li>multi_step_actor</li>
</ul>]]></content><author><name>Sang Hun Kim</name></author><summary type="html"><![CDATA[2022-02-28 모델 성능 개선으로 익히는 강화학습 A-Z Part 5. 심층강화학습 Ch 01. 심층강화학습 논문 읽기 08. 심층강화학습을 여행하는 히치하이커를 위한 안내서]]></summary></entry></feed>