000 | 01047nam a22001817a 4500 | ||
---|---|---|---|
008 | 240117b |||||||| |||| 00| 0 eng d | ||
020 | _a9780262039246 | ||
082 | _a006.31 _bSUT | ||
100 | _aSutton, Richard S. | ||
245 | _aReinforcement Learning: An Introduction | ||
250 | _a2nd ed | ||
260 | _bMIT Press _c2018 _aUSA | ||
300 | _a526p | ||
500 | _aIntroduction -- Part 1: Tabular Solution Methods -- Multi-Armed Bandits -- Finite Markov Decision Processes -- Dynamic Programming -- Monte Carlo Methods -- Temporal-Difference Learning -- N-Step Bootstrapping -- Planning and Learning with Tabular Methods -- Part 2: Approximate Solution Methods -- On-Policy Prediction with Approximation -- On-Policy Control with Approximation -- Off-Policy Methods with Approximation -- Eligibility Traces -- Policy Gradient Methods -- Part 3: Looking Deeper -- Psychology -- Neuroscience -- Applications and Case Studies -- Frontiers | ||
600 | _aComputer Engineering | ||
700 | _aBarto, Andrew G. _937086 | ||
942 | _2ddc _cLB _m006.31 _kSUT | ||
999 | _c148134 _d148134 | ||