Exercise 21.5

Write out the parameter update equations for TD learning with \(\hat{U}(x,y) = \theta_0 + \theta_1 x + \theta_2 y + \theta_3\,\sqrt{(x-x_g)^2 + (y-y_g)^2}\).
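
As a way to check a written answer numerically, the following is a minimal sketch, not a definitive solution. It assumes the standard TD(0) rule for a linear function approximator, \(\theta_i \leftarrow \theta_i + \alpha\,[R + \gamma\,\hat{U}(s') - \hat{U}(s)]\,\partial\hat{U}(s)/\partial\theta_i\), so that the gradient is just the feature vector \((1,\ x,\ y,\ \sqrt{(x-x_g)^2 + (y-y_g)^2})\). The names `features`, `u_hat`, `td_update`, `alpha`, `gamma`, and `goal` are hypothetical and do not come from the exercise; the learning rate, the discount factor, and the convention for which reward enters the update are likewise assumptions.

```python
import math


def features(x, y, goal):
    """Feature vector of U_hat at (x, y); equals dU_hat/dtheta for a linear model."""
    xg, yg = goal
    return (1.0, x, y, math.hypot(x - xg, y - yg))


def u_hat(theta, x, y, goal):
    """Linear utility estimate: theta_0 + theta_1*x + theta_2*y + theta_3*dist."""
    return sum(t * f for t, f in zip(theta, features(x, y, goal)))


def td_update(theta, state, reward, next_state, goal, alpha=0.1, gamma=1.0):
    """One assumed TD(0) step: theta_i += alpha * delta * dU_hat(state)/dtheta_i."""
    # Temporal-difference error for the observed transition state -> next_state.
    delta = (reward
             + gamma * u_hat(theta, *next_state, goal)
             - u_hat(theta, *state, goal))
    # The gradient is evaluated at the state being updated, not the successor.
    grad = features(*state, goal)
    return [t + alpha * delta * g for t, g in zip(theta, grad)]
```

For instance, `td_update([0.0, 0.0, 0.0, 0.0], (1, 1), -0.04, (1, 2), goal=(4, 5), alpha=0.5)` applies one update for a transition from \((1,1)\) to \((1,2)\); each parameter moves in proportion to its own feature value at \((1,1)\), which is the structure the written update equations should show.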
