
Result: Pathwise uniform value in gambling houses and Partially Observable Markov Decision Processes

Title:
Pathwise uniform value in gambling houses and Partially Observable Markov Decision Processes
Collection:
RePEc (Research Papers in Economics)
Document Type:
Report
Language:
unknown
Accession Number:
edsbas.D01078EB
Database:
BASE

Further Information

In several standard models of dynamic programming (gambling houses, MDPs, POMDPs), we prove the existence of a robust notion of value for the infinitely repeated problem, namely the pathwise uniform value. This solves two open problems. First, this shows that for any ε > 0, the decision-maker has a pure strategy σ which is ε-optimal in any n-stage game, provided that n is big enough (this result was only known for behavior strategies, that is, strategies which use randomization). Second, the strategy σ can be chosen such that under the long-run average payoff criterion, the decision-maker has more than the limit of the n-stage values.

Keywords: Dynamic programming, Markov decision processes, Partial Observation, Uniform value, Long-run average payoff
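
The quantities named in the abstract can be written out as a short sketch. The notation is an assumption for illustration and does not appear in this record: r_m is the payoff at stage m, σ is a strategy, E_σ is the expectation under σ from a fixed initial state, and v_n is the n-stage value. The last display is one common formalization of the long-run average payoff criterion, not necessarily the exact one used in the report.

```latex
% Notation assumed for illustration (not given in the record):
% r_m = payoff at stage m, \sigma = strategy,
% \mathbb{E}_\sigma = expectation under \sigma from a fixed initial state.

% n-stage value: best expected average payoff over the first n stages.
\[
  v_n \;=\; \sup_{\sigma}\, \mathbb{E}_\sigma\!\left[ \frac{1}{n} \sum_{m=1}^{n} r_m \right]
\]

% First result in the abstract: one pure strategy \sigma is
% \varepsilon-optimal in every n-stage problem with n large enough.
\[
  \forall \varepsilon > 0 \;\; \exists\, \sigma \text{ pure},\; \exists\, n_0 \;\;
  \forall n \ge n_0 : \quad
  \mathbb{E}_\sigma\!\left[ \frac{1}{n} \sum_{m=1}^{n} r_m \right] \;\ge\; v_n - \varepsilon
\]

% Long-run average payoff of \sigma (assumed formalization), the quantity
% that the second result compares with \lim_{n \to \infty} v_n.
\[
  \gamma_\infty(\sigma) \;=\;
  \mathbb{E}_\sigma\!\left[ \liminf_{n \to \infty} \frac{1}{n} \sum_{m=1}^{n} r_m \right]
\]
```

The point of the first display is the order of quantifiers: σ is chosen before n, so the same pure strategy works for all sufficiently long horizons.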