000 01047nam a22001817a 4500
008 240117b |||||||| |||| 00| 0 eng d
020 _a9780262039246
082 _a006.31
_bSUT
100 _aSutton, Richard S.
245 _aReinforcement Learning: An Introduction
250 _a2nd ed.
260 _bMIT Press
_c2018
_aUSA
300 _a526 p.
500 _aIntroduction -- Part 1: Tabular Solution Methods -- Multi-Armed Bandits -- Finite Markov Decision Processes -- Dynamic Programming -- Monte Carlo Methods -- Temporal-Difference Learning -- N-Step Bootstrapping -- Planning and Learning with Tabular Methods -- Part 2: Approximate Solution Methods -- On-Policy Prediction with Approximation -- On-Policy Control with Approximation -- Off-Policy Methods with Approximation -- Eligibility Traces -- Policy Gradient Methods -- Part 3: Looking Deeper -- Psychology -- Neuroscience -- Applications and Case Studies -- Frontiers
600 _aComputer Engineering
700 _aBarto, Andrew G.
_937086
942 _2ddc
_cLB
_m006.31
_kSUT
999 _c148134
_d148134