On the application of clustering for extracting driving scenarios from vehicle data

Nour Chetouane, Franz Wotawa*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


If we want to extract test cases from driving data for the purpose of testing vehicles, we want to avoid using similar test cases. In this paper, we focus on this topic. We provide a method for extracting driving episodes from data utilizing clustering algorithms. This method starts with clustering driving data. Afterward, data points representing time-ordered sequences are obtained from the cluster forming a driving episode. Besides outlining the foundations, we present the results of an experimental evaluation where we considered six different clustering algorithms and available driving data from three German cities. To evaluate the cluster quality, we utilize three cluster validity metrics. In addition, we introduce a measure for the quality of extracted episodes relying on the Pearson coefficient. The experimental evaluation showed that the Pearson coefficient can rank clustering algorithms better than the three cluster validity metrics. We can extract meaningful episodes from driving data using any clustering algorithm considering four to eight clusters. Combining k-means clustering with auto-encoders leads to the best Pearson correlation. SOM is the slowest clustering method, and Canopy is the fastest.
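The second step of the method (turning cluster assignments into time-ordered episodes) can be illustrated with a minimal sketch. It assumes that cluster labels are already available for each time-ordered sample and that an episode is a maximal run of consecutive samples sharing one label. The abstract does not spell out the exact rule, so this definition, the function name `extract_episodes`, and the `min_length` parameter are illustrative assumptions rather than the authors' implementation.

```python
from itertools import groupby


def extract_episodes(labels, min_length=2):
    """Group consecutive time steps with the same cluster label into episodes.

    Assumption (not taken from the paper): an episode is a maximal run of
    consecutive samples assigned to the same cluster. Returns a list of
    (cluster_label, start_index, end_index) tuples, end index inclusive.
    """
    episodes = []
    position = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        # Keep only runs long enough to count as an episode.
        if length >= min_length:
            episodes.append((label, position, position + length - 1))
        position += length
    return episodes


# Hypothetical cluster labels for ten time-ordered samples.
labels = [0, 0, 0, 2, 2, 1, 0, 0, 1, 1]
episodes = extract_episodes(labels)
```

In this sketch the labels could come from any of the clustering algorithms mentioned in the abstract; the single-sample run at index 5 is dropped because it is shorter than `min_length`.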
Original language: English
Article number: 100377
Journal: Machine Learning with Applications
Publication status: Published - 2022

Fields of Expertise

  • Information, Communication & Computing

