Learning Environment Models with Continuous Stochastic Dynamics - With an Application to Deep RL Testing

Martin Tappler, Edi Muskardin, Bernhard K. Aichernig, Bettina Koninghofer

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

Techniques like deep reinforcement learning (DRL) enable autonomous agents to solve tasks in complex environments automatically through learning. Despite their potential, neural-network-based decision-making policies are hard to understand and test. To ease the adoption of such techniques, we learn automata models of environmental behavior under the control of an agent. These models provide insights into the decisions faced by agents and a basis for testing. To scale automata learning to environments with complex and continuous dynamics, we compute an abstract state-space representation through dimensionality reduction and clustering of observed environmental states. The stochastic transitions are learned via passive automata learning from agent-environment interactions. Furthermore, we iteratively sample additional trajectories to enhance the learned model's accuracy. We demonstrate the potential of our automata learning framework by (1) solving popular RL benchmark problems and (2) applying it for differential testing of DRL agents. Our results show that the learned models are sufficiently precise to compute policies that solve the respective control tasks. Yet the models are sufficiently general for coverage-guided testing, where we reveal significant differences in the functional failure frequency of pairs of DRL agents.
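
The pipeline outlined in the abstract (abstract the continuous states, then learn stochastic transitions from observed trajectories) can be illustrated with a minimal sketch. This is not the authors' implementation: keeping the first coordinates stands in for dimensionality reduction, a tiny k-means stands in for the clustering step, and plain frequency counting stands in for passive automata learning. All function names (`reduce_dim`, `kmeans`, `nearest`, `learn_transitions`) and the trajectory format (lists of `(state, action)` pairs) are hypothetical.

```python
from collections import Counter, defaultdict
from math import dist


def reduce_dim(state, keep=1):
    """Stand-in for dimensionality reduction: keep the first `keep` coordinates."""
    return tuple(state[:keep])


def nearest(point, centroids):
    """Index of the centroid closest to `point` (the abstract state label)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, k, iters=10):
    """Tiny k-means, initialised deterministically with the first k distinct points."""
    centroids = list(dict.fromkeys(points))[:k]
    for _ in range(iters):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(c) / len(groups[i]) for c in zip(*groups[i]))
            if groups[i] else centroids[i]
            for i in range(len(centroids))
        ]
    return centroids


def learn_transitions(trajectories, k=2):
    """Estimate P(next abstract state | abstract state, action) by frequency counting.

    Assumption: each trajectory is a list of (state, action) pairs, where the
    action is the one taken in that state.
    """
    reduced = [[(reduce_dim(s), a) for s, a in traj] for traj in trajectories]
    centroids = kmeans([s for traj in reduced for s, _ in traj], k)
    counts = defaultdict(Counter)
    for traj in reduced:
        labels = [nearest(s, centroids) for s, _ in traj]
        # Pair each step (source label, action) with the label of the following step.
        for (src, dst), (_, action) in zip(zip(labels, labels[1:]), traj):
            counts[(src, action)][dst] += 1
    return {
        key: {dst: n / sum(c.values()) for dst, n in c.items()}
        for key, c in counts.items()
    }
```

Calling `learn_transitions(trajectories, k=2)` returns a dictionary keyed by `(abstract_state, action)` whose values are distributions over successor abstract states. A real setup would replace the stand-ins with a learned projection, a proper clustering routine, and a passive learner for stochastic automata.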

Original language: English
Title of host publication: Proceedings - 2024 IEEE Conference on Software Testing, Verification and Validation, ICST 2024
Publisher: IEEE
Pages: 197-208
Number of pages: 12
ISBN (Electronic): 9798350308181
Publication status: Published - 27 Aug 2024
Event: 17th IEEE Conference on Software Testing, Verification and Validation: ICST 2024 - Toronto, Canada
Duration: 27 May 2024 to 31 May 2024

Conference

Conference: 17th IEEE Conference on Software Testing, Verification and Validation
Country/Territory: Canada
City: Toronto
Period: 27/05/24 to 31/05/24

Keywords

  • Automata Learning
  • Differential Testing
  • Markov Decision Processes
  • Reinforcement Learning

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Safety, Risk, Reliability and Quality
  • Modelling and Simulation
