An extensive comparison of preprocessing methods in the context of configuration space learning

Damian Garber*, Alexander Felfernig, Viet-Man Le, Tamim Burgstaller, Merfat El Mansi

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

A core goal of configuration space learning is to build precise predictive models that reliably estimate the performance of a configuration without costly tests. These models are usually machine learning-based. However, their performance varies significantly with the investigated Software Product Line (SPL), the applied data preprocessing, and the number of sample configurations collected. We therefore investigate the impact of different preprocessing methods and how they behave across different SPLs, machine learning models, and sample sizes. Performance comparisons on this scale are usually not conducted because their execution time is prohibitively expensive, even for smaller SPLs. For this reason, we used three fully enumerated configuration spaces as our training data, which allows for more generalizable results. Our results show that the average factor between the worst- and best-performing preprocessing methods is 2.05 for BerkeleyDBC, 1.17 for 7z, and 1.84 for VP9. Further, no single tested preprocessing method outperformed all others, neither overall nor within any one SPL or model type. This underlines the importance of testing new approaches with multiple preprocessing methods.
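One possible reading of the "average factor between the worst and best-performing preprocessing methods" can be sketched as follows. This is a minimal illustration, not the authors' evaluation code. It assumes the performance measure is an error where lower is better, and that the factor is the worst error divided by the best error per model, averaged over models. The model names, method names, and all numbers are hypothetical.

```python
from statistics import mean

# Hypothetical prediction errors (lower is better) per model and
# preprocessing method. All values are invented for illustration only.
errors = {
    "linear_regression": {"none": 0.20, "min_max": 0.10, "standardize": 0.16},
    "random_forest": {"none": 0.08, "min_max": 0.09, "standardize": 0.10},
}


def worst_best_factor(per_method):
    """Ratio of the largest to the smallest error among preprocessing methods."""
    return max(per_method.values()) / min(per_method.values())


def best_method(per_method):
    """Name of the preprocessing method with the smallest error."""
    return min(per_method, key=per_method.get)


def average_factor(table):
    """Mean worst/best ratio over all models in the table."""
    return mean(worst_best_factor(per_method) for per_method in table.values())


factors = {model: worst_best_factor(m) for model, m in errors.items()}
winners = {model: best_method(m) for model, m in errors.items()}
avg = average_factor(errors)

print(factors)
print(winners)
print(round(avg, 3))
```

In this toy table the winning method differs between the two models, which mirrors the abstract's observation that no single preprocessing method dominates.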
Original language: English
Title of host publication: Proceedings of the 26th International Workshop on Configuration (ConfWS 2024) co-located with the 30th International Conference on Principles and Practice of Constraint Programming (CP 2024)
Publisher: CEUR Workshop Proceedings
Pages: 81-90
Volume: 3812
Publication status: Published - 30 Oct 2024
Event: 26th International Workshop on Configuration: ConfWS 2024 - University of Girona, co-located with CP 2024, Girona, Spain
Duration: 2 Sept 2024 to 3 Sept 2024
https://confws.github.io

Workshop

Workshop: 26th International Workshop on Configuration
Abbreviated title: ConfWS 2024
Country/Territory: Spain
City: Girona
Period: 2/09/24 to 3/09/24