Reinforcement Learning: An Introduction

Provides a clear and simple account of the key ideas and algorithms of reinforcement learning. Familiarity with elementary concepts of probability is assumed.

**Publication date**: 31 Dec 1998

**ISBN-10**: 0262193981

**ISBN-13**: n/a

**Paperback**: 432 pages


Excerpts from the Preface:

Our goal in writing this book was to provide a clear and simple account of the key ideas and algorithms of reinforcement learning. We wanted our treatment to be accessible to readers in all of the related disciplines, but we could not cover all of these perspectives in detail. Our treatment takes almost exclusively the point of view of artificial intelligence and engineering, leaving coverage of connections to psychology, neuroscience, and other fields to others or to another time. We also chose not to produce a rigorous formal treatment of reinforcement learning. We did not reach for the highest possible level of mathematical abstraction and did not rely on a theorem-proof format. We tried to choose a level of mathematical detail that points the mathematically inclined in the right directions without distracting from the simplicity and potential generality of the underlying ideas.

The book consists of three parts. Part I is introductory and problem oriented. We focus on the simplest aspects of reinforcement learning and on its main distinguishing features. One full chapter is devoted to introducing the reinforcement learning problem whose solution we explore in the rest of the book. Part II presents what we see as the three most important elementary solution methods: dynamic programming, simple Monte Carlo methods, and temporal-difference learning. The first of these is a planning method and assumes explicit knowledge of all aspects of a problem, whereas the other two are learning methods. Part III is concerned with generalizing these methods and blending them. Eligibility traces allow unification of Monte Carlo and temporal-difference methods, and function approximation methods such as artificial neural networks extend all the methods so that they can be applied to much larger problems. We bring planning and learning methods together again and relate them to heuristic search. Finally, we summarize our view of the state of reinforcement learning research and briefly present case studies, including some of the most impressive applications of reinforcement learning to date.

Intended Audience:

The book is largely self-contained. The only mathematical background assumed is familiarity with elementary concepts of probability, such as expectations of random variables. Chapter 8 is substantially easier to digest if the reader has some knowledge of artificial neural networks or some other kind of supervised learning method, but it can be read without prior background. We strongly recommend working the exercises provided throughout the book. Solution manuals are available to instructors. This and other related and timely material is available via the Internet.

Reviews:

Amazon.com

"The book is very readable by average computer students. Possibly the only difficult one is chapter 8, which deals with some neural network concepts. I highly recommend this book to anyone who wants to learn about this subject."

"The book is easy and interesting to read. The diagrams, especially those on TD, throw a great deal of insight on the basic concept of TD. The intuitive ideas behind RL are developed clearly. At the same time all the fundamental concepts are made mathematically precise using very simple language and notation. Anybody new to RL should find this book extremely useful."

About The Author(s)

Richard S. Sutton is Professor and iCORE Chair in the Department of Computing Science at the University of Alberta.
