Mutation Testing for Artificial Neural Networks: An Empirical Evaluation

Lorenz Klampfl*, Nour Chetouane, Franz Wotawa

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

Testing AI-based systems, especially those that rely on machine learning, is considered a challenging task. In this paper, we address this challenge by testing neural networks using mutation testing. A previous paper focused on applying mutation testing to the configuration of neural networks and concluded that mutation testing can be used effectively. In this paper, we discuss a substantially extended empirical evaluation in which we considered different test data and the source code of neural network implementations. In particular, we discuss whether a mutated neural network can be distinguished from the original one after learning when considering only a test evaluation. Unfortunately, this is rarely the case, leading to a low mutation score. As a consequence, we see that the testing method, which works well at the configuration level of a neural network, is not sufficient to test neural network libraries, which require substantially more testing effort to assure quality.
Original language: English
Title of host publication: Proceedings - 2020 IEEE 20th International Conference on Software Quality, Reliability, and Security, QRS 2020
Publisher: Institute of Electrical and Electronics Engineers
Pages: 356-365
Number of pages: 10
ISBN (Electronic): 9781728189130
Publication status: Published - 11 Dec 2020
Event: 20th IEEE International Conference on Software Quality, Reliability, and Security, QRS 2020 - Virtual, Macau, China
Duration: 11 Dec 2020 - 14 Dec 2020

Conference

Conference: 20th IEEE International Conference on Software Quality, Reliability, and Security, QRS 2020
Country/Territory: China
City: Virtual, Macau
Period: 11/12/20 - 14/12/20

Keywords

  • deep neural networks
  • mutation testing

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
  • Safety, Risk, Reliability and Quality
  • Computer Networks and Communications
  • Modelling and Simulation
