TY - GEN
T1 - Abstraction-Based Decision Making for Statistical Properties
AU - Cano, Filip
AU - Henzinger, Thomas A.
AU - Könighofer, Bettina
AU - Kueffner, Konstantin
AU - Mallik, Kaushik
PY - 2024/7/5
Y1 - 2024/7/5
N2 - Sequential decision-making in probabilistic environments is a fundamental problem with many applications in AI and economics. In this paper, we present an algorithm for synthesizing sequential decision-making agents that optimize statistical properties such as maximum and average response times. In the general setting of sequential decision-making, the environment is modeled as a random process that generates inputs. The agent responds to each input, aiming to maximize rewards and minimize costs within a specified time horizon. The corresponding synthesis problem is known to be PSPACE-hard. We consider the special case where the input distribution, reward, and cost depend on input-output statistics specified by counter automata. For such problems, this paper presents the first PTIME synthesis algorithms. We introduce the notion of statistical abstraction, which clusters statistically indistinguishable input-output sequences into equivalence classes. This abstraction allows for a dynamic programming algorithm whose complexity grows polynomially with the considered horizon, making the statistical case exponentially more efficient than the general case. We evaluate our algorithm on three different application scenarios of a client-server protocol, where multiple clients compete via bidding to gain access to the service offered by the server. The synthesized policies optimize profit while guaranteeing that none of the server's clients is disproportionately starved of the service.
AB - Sequential decision-making in probabilistic environments is a fundamental problem with many applications in AI and economics. In this paper, we present an algorithm for synthesizing sequential decision-making agents that optimize statistical properties such as maximum and average response times. In the general setting of sequential decision-making, the environment is modeled as a random process that generates inputs. The agent responds to each input, aiming to maximize rewards and minimize costs within a specified time horizon. The corresponding synthesis problem is known to be PSPACE-hard. We consider the special case where the input distribution, reward, and cost depend on input-output statistics specified by counter automata. For such problems, this paper presents the first PTIME synthesis algorithms. We introduce the notion of statistical abstraction, which clusters statistically indistinguishable input-output sequences into equivalence classes. This abstraction allows for a dynamic programming algorithm whose complexity grows polynomially with the considered horizon, making the statistical case exponentially more efficient than the general case. We evaluate our algorithm on three different application scenarios of a client-server protocol, where multiple clients compete via bidding to gain access to the service offered by the server. The synthesized policies optimize profit while guaranteeing that none of the server's clients is disproportionately starved of the service.
KW - Abstract interpretation
KW - Sequential decision making
KW - counter machines
KW - Counter machines
UR - http://www.scopus.com/inward/record.url?scp=85198860574&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.FSCD.2024.2
DO - 10.4230/LIPIcs.FSCD.2024.2
M3 - Conference paper
T3 - Leibniz International Proceedings in Informatics, LIPIcs
SP - 2:1–2:17
BT - 9th International Conference on Formal Structures for Computation and Deduction, FSCD 2024
A2 - Rehof, Jakob
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik
T2 - 9th International Conference on Formal Structures for Computation and Deduction
Y2 - 10 July 2024 through 13 July 2024
ER -