Selection of entropy-measure parameters for knowledge discovery in heart rate variability data

Christopher Mayer, Martin Bachler, Matthias Hörtenhuber, Christof Stocker, Andreas Holzinger, Sigi Wassertheurer

Research output: Contribution to journal › Article › peer-review

Abstract

Background

Heart rate variability is the variation of the time interval between consecutive heartbeats. Entropy is a commonly used tool to describe the regularity of data sets. Entropy functions are defined using multiple parameters, the selection of which is controversial and depends on the intended purpose. This study describes the results of tests conducted to support parameter selection, towards the goal of enabling further biomarker discovery.

Methods

This study deals with approximate, sample, fuzzy, and fuzzy measure entropies. All data were obtained from PhysioNet, a free-access online archive of physiological signals, and represent various medical conditions. Five tests were defined and conducted to examine the influence of: varying the threshold value r (as multiples of the sample standard deviation σ, or the entropy-maximizing rChon), the data length N, the weighting factors n for fuzzy and fuzzy measure entropies, and the thresholds rF and rL for fuzzy measure entropy. The results were tested for normality using Lilliefors' composite goodness-of-fit test. Depending on the outcome, the p-value was then calculated with either a two-sample t-test or a Wilcoxon rank-sum test.
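To make the roles of the threshold r and the data length N concrete, the following is a minimal sketch of sample entropy with r expressed as a multiple of the sample standard deviation σ. It is not the authors' implementation: the function name `sample_entropy`, the embedding dimension m = 2, the Chebyshev distance, and the use of "less than or equal" for a match are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import statistics


def sample_entropy(x, m=2, r_factor=0.2):
    """Sketch of sample entropy, -ln(A / B), for a 1-D sequence x.

    Assumptions (not taken from the abstract): embedding dimension m = 2,
    Chebyshev distance between templates, and a match when distance <= r.
    The threshold is r = r_factor * sample standard deviation of x.
    """
    n = len(x)
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")
    r = r_factor * statistics.stdev(x)

    def count_matches(length):
        # Use the same N - m template start positions for both lengths,
        # and count each unordered pair once (self-matches are excluded).
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                dist = max(abs(x[i + k] - x[j + k]) for k in range(length))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)      # matching template pairs of length m
    a = count_matches(m + 1)  # matching template pairs of length m + 1
    if a == 0 or b == 0:
        # No matches: the estimate is undefined; reported here as infinity.
        return float("inf")
    return -math.log(a / b)
```

With a small r or a short series, few template pairs match, so the counts A and B become small and the estimate unstable; this is one way to see why the required N depends on the choice of r.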

Results

The first test shows a cross-over of entropy values as r changes. Thus, a higher entropy value cannot be clearly equated with higher irregularity; rather, it indicates a difference in regularity. N should be at least 200 data points for r = 0.2σ and should exceed 1000 for r = rChon. The results for the weighting parameters n of the fuzzy membership function show different behavior when coupled with different r values; therefore, the weighting parameters were chosen independently for the different threshold values. The tests concerning rF and rL showed that there is no optimal choice, but r = rF = rL is reasonable with r = rChon or r = 0.2σ.
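The interaction between the weighting parameter n and the threshold r can be illustrated with a small sketch. The abstract does not state the exact form of the fuzzy membership function, so the exponential form exp(-(d/r)^n) used below, along with the name `fuzzy_membership`, is an assumption chosen for illustration only.

```python
import math


def fuzzy_membership(d, r, n):
    """Assumed membership function exp(-(d / r) ** n).

    d: non-negative distance between two templates
    r: positive threshold (e.g. a multiple of the standard deviation)
    n: weighting parameter controlling how sharp the boundary at r is
    """
    if r <= 0 or d < 0:
        raise ValueError("r must be positive and d non-negative")
    return math.exp(-((d / r) ** n))


# For a fixed distance, the effect of n flips depending on whether the
# distance lies below or above the threshold r.
for r in (0.5, 2.0):
    for n in (1, 2, 3):
        print(f"d=1.0 r={r} n={n} membership={fuzzy_membership(1.0, r, n):.4f}")
```

Under this assumed form, increasing n raises the similarity of pairs closer than r and lowers it for pairs farther than r, so the same n can act differently for different thresholds.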

Conclusions

Some of the tests showed a dependency of the test significance on the data at hand. Nevertheless, as the medical conditions are unknown beforehand, compromises had to be made. Optimal parameter combinations are suggested for the methods considered. Yet, due to the high number of potential parameter combinations, further investigations of entropy for heart rate variability data will be necessary.
Original language: English
Pages (from-to): 1-11
Journal: BMC Bioinformatics
Volume: 15
Issue number: S2
Publication status: Published - 2014

Keywords

  • Entropy-based data mining
  • Knowledge Discovery
  • Health Informatics
  • Parameter selection
  • entropy

ASJC Scopus subject areas

  • Information Systems
  • Statistics, Probability and Uncertainty

Fields of Expertise

  • Information, Communication & Computing

Treatment code (Nähere Zuordnung)

  • Basic - Fundamental (Grundlagenforschung)
