MARC View

000			01047nam a22001817a 4500
008			240117b |||||||| |||| 00| 0 eng d
020			_a9780262039246
082			_a006.31 _bSUT
100			_aSutton, Richard S.
245			_aReinforcement Learning: An Introduction
250			_a2nd ed
260			_bMIT Press _c2018 _aUSA
300			_a526p
500			_aIntroduction -- Part 1: Tabular Solution Methods -- Multi-Armed Bandits -- Finite Markov Decision Processes -- Dynamic Programming -- Monte Carlo Methods -- Temporal-Difference Learning -- N-Step Bootstrapping -- Planning and Learning with Tabular Methods -- Part 2: Approximate Solution Methods -- On-Policy Prediction with Approximation -- On-Policy Control with Approximation -- Off-Policy Methods with Approximation -- Eligibility Traces -- Policy Gradient Methods -- Part 3: Looking Deeper -- Psychology -- Neuroscience -- Applications and Case Studies -- Frontiers
600			_aComputer Engineering
700			_aBarto, Andrew G. _937086
942			_2ddc _cLB _m006.31 _kSUT
999			_c148134 _d148134