Implement the REINFORCE and PEGASUS algorithms and apply them to the $4\times 3$ world, using a policy family of your own choosing. Comment on the results.
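
One possible starting point is sketched below in Python. It is a minimal sketch under stated assumptions, not a reference solution. The grid layout (wall at $(2,2)$, terminal states $(4,3)$ and $(4,2)$ with rewards $+1$ and $-1$, a reward of $-0.04$ for entering any other state, and the 0.8/0.1/0.1 motion model) is assumed to be the usual $4\times 3$ world. The policy families are choices made for this sketch: a tabular softmax for REINFORCE and a deterministic state-to-action table for PEGASUS. All names (`step`, `rollout`, `reinforce`, `pegasus_value`, `pegasus`), the constant running-average baseline, the step sizes, and the coordinate-wise hill climbing used to optimize the PEGASUS estimate are illustrative assumptions.

```python
import math
import random

# Assumed layout: columns 1-4, rows 1-3, wall at (2,2), start at (1,1).
ACTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # up, right, down, left
TERMINALS = {(4, 3): 1.0, (4, 2): -1.0}
STATES = [(x, y) for x in range(1, 5) for y in range(1, 4) if (x, y) != (2, 2)]
STATE_SET = set(STATES)
STEP_REWARD = -0.04  # assumed reward for entering a nonterminal state
START = (1, 1)


def step(state, action, rng):
    """Intended move with prob. 0.8, each perpendicular move with prob. 0.1."""
    u = rng.random()
    k = action if u < 0.8 else (action + 1) % 4 if u < 0.9 else (action - 1) % 4
    nxt = (state[0] + ACTIONS[k][0], state[1] + ACTIONS[k][1])
    if nxt not in STATE_SET:  # bumped into the wall or the boundary
        nxt = state
    return nxt, TERMINALS.get(nxt, STEP_REWARD)


def policy_probs(theta, s):
    """Softmax over the four action preferences stored for state s."""
    m = max(theta[s])
    e = [math.exp(t - m) for t in theta[s]]
    z = sum(e)
    return [v / z for v in e]


def rollout(theta, rng, max_steps=100):
    """Sample one episode; returns a list of (state, action, reward)."""
    s, traj = START, []
    for _ in range(max_steps):
        if s in TERMINALS:
            break
        a = rng.choices(range(4), weights=policy_probs(theta, s))[0]
        s2, r = step(s, a, rng)
        traj.append((s, a, r))
        s = s2
    return traj


def reinforce(episodes=5000, alpha=0.1, seed=0):
    """Undiscounted REINFORCE with a constant running-average baseline."""
    rng = random.Random(seed)
    theta = {s: [0.0] * 4 for s in STATES}
    baseline = 0.0
    for _ in range(episodes):
        traj = rollout(theta, rng)
        g, returns = 0.0, []
        for _, _, r in reversed(traj):  # reward-to-go from each time step
            g += r
            returns.append(g)
        returns.reverse()
        for (s, a, _), g_t in zip(traj, returns):
            p = policy_probs(theta, s)
            for b in range(4):
                # d/d theta[s][b] of log pi(a|s) for a softmax policy
                grad = (1.0 if b == a else 0.0) - p[b]
                theta[s][b] += alpha * (g_t - baseline) * grad
        baseline += 0.05 * (returns[0] - baseline)
    return theta


def pegasus_value(policy, seeds, max_steps=100):
    """Average return over fixed scenarios; deterministic in the policy."""
    total = 0.0
    for seed in seeds:
        rng = random.Random(seed)  # same seed, hence same random numbers
        s = START
        for _ in range(max_steps):
            if s in TERMINALS:
                break
            s, r = step(s, policy[s], rng)
            total += r
    return total / len(seeds)


def pegasus(n_scenarios=100, sweeps=5, seed=0):
    """Hill-climb a deterministic policy on the fixed-scenario estimate."""
    seeds = list(range(seed, seed + n_scenarios))
    policy = {s: 0 for s in STATES}
    best = pegasus_value(policy, seeds)
    for _ in range(sweeps):
        improved = False
        for s in (t for t in STATES if t not in TERMINALS):
            for a in range(4):
                old = policy[s]
                if a == old:
                    continue
                policy[s] = a
                v = pegasus_value(policy, seeds)
                if v > best:
                    best, improved = v, True
                else:
                    policy[s] = old
        if not improved:
            break
    return policy, best
```

To comment on the results, one could call `reinforce()` and `pegasus()`, then compare the greedy action in each state of the learned softmax table with the PEGASUS policy, and rerun REINFORCE with several seeds to see how much its outcome varies. The design point worth examining is that REINFORCE follows a noisy sampled gradient, whereas PEGASUS reuses the same random numbers for every candidate policy, so two policies can be compared on its estimate without sampling noise (at the cost of bias from the finite scenario set).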