Implement a passive learning agent in a simple environment, such as the $4\times 3$ world. For the case of an initially unknown environment model, compare the learning performance of the direct utility estimation, TD, and ADP algorithms. Do the comparison for the optimal policy and for several random policies. For which do the utility estimates converge faster? What happens when the size of the environment is increased? (Try environments with and without obstacles.)
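A minimal sketch of one of the three learners, assuming the standard $4\times 3$ world (obstacle at (2,2), terminals at (4,3) and (4,2), living reward $-0.04$, 0.8/0.1/0.1 transition noise): a passive TD(0) agent that follows a fixed policy and updates utilities from sampled transitions. The policy table, learning-rate schedule, and all names below are illustrative assumptions, not a prescribed implementation; the direct-estimation and ADP agents asked for in the exercise would reuse the same environment loop.

```python
import random
from collections import defaultdict

GAMMA = 1.0                       # assumption: undiscounted, as usual for the 4x3 world
LIVING_REWARD = -0.04
TERMINALS = {(4, 3): +1.0, (4, 2): -1.0}
OBSTACLES = {(2, 2)}
WIDTH, HEIGHT = 4, 3

MOVES = {'U': (0, 1), 'D': (0, -1), 'L': (-1, 0), 'R': (1, 0)}
PERP = {'U': ('L', 'R'), 'D': ('L', 'R'), 'L': ('U', 'D'), 'R': ('U', 'D')}

# Assumed fixed policy (close to the optimal one for living reward -0.04).
POLICY = {
    (1, 1): 'U', (1, 2): 'U', (1, 3): 'R', (2, 1): 'L', (2, 3): 'R',
    (3, 1): 'L', (3, 2): 'U', (3, 3): 'R', (4, 1): 'L',
}

def reward(state):
    return TERMINALS.get(state, LIVING_REWARD)

def step(state, action):
    """Sample the next state: 0.8 intended move, 0.1 each perpendicular slip."""
    r = random.random()
    chosen = action if r < 0.8 else PERP[action][0] if r < 0.9 else PERP[action][1]
    dx, dy = MOVES[chosen]
    nx, ny = state[0] + dx, state[1] + dy
    # Bumping into a wall or the obstacle leaves the agent where it is.
    if not (1 <= nx <= WIDTH and 1 <= ny <= HEIGHT) or (nx, ny) in OBSTACLES:
        return state
    return (nx, ny)

def passive_td(trials=5000):
    """Passive TD(0): U(s) <- U(s) + alpha(N[s]) * (R(s) + gamma*U(s') - U(s))."""
    U = defaultdict(float)
    U.update(TERMINALS)           # terminal utilities equal their rewards
    N = defaultdict(int)
    for _ in range(trials):
        s = (1, 1)
        while s not in TERMINALS:
            N[s] += 1
            alpha = 60.0 / (59.0 + N[s])    # decaying learning rate (assumption)
            s_next = step(s, POLICY[s])
            U[s] += alpha * (reward(s) + GAMMA * U[s_next] - U[s])
            s = s_next
    return dict(U)

if __name__ == '__main__':
    for s, u in sorted(passive_td().items()):
        print(s, round(u, 3))
```

Running the same trial loop with a direct-utility estimator (average observed return per state) and an ADP agent (estimate the transition model from counts, then solve the policy-evaluation equations) lets you plot utility error against the number of trials and compare convergence under the optimal policy versus random policies, and on larger grids with and without obstacles.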