Statistics and Data Science Seminar
Matthew Bourque (PhD Candidate)
University of Illinois at Chicago
Patience Pays Off: A Policy Improvement Algorithm for Stochastic Games of Perfect Information with Average Payoffs
Abstract: Stochastic games model a competitive situation between two players that unfolds in discrete time steps over an infinite horizon, in which the players' payoffs at each stage depend on both players' action choices. They can be seen as generalizations of both repeated games and Markov decision processes (MDPs). Policy improvement algorithms are an important class of fast algorithms for solving MDPs. In this talk, we will give an introduction to stochastic games, in particular zero-sum games of perfect information and those with additive reward and additive transition (ARAT) structure, and discuss a policy improvement algorithm that finds optimal policies for both players in these classes of games when players evaluate their payoff streams via a limiting average.
Wednesday, September 12, 2012 at 4:00 PM in SEO 636