Evolutionary Propositionalization of Multi-Relational Data

Franz Wotawa; Valentin Kassarnig

doi:10.1142/S0218194018400260

Institute of Software Technology (7160)

Research output: Contribution to journal › Article › peer-review

Abstract

Propositionalization has been proven to be a very effective solution for multi-relational data mining problems. The approaches usually follow a two-step principle: transforming the relational data into a single, flat table and applying a propositional learning algorithm. During the transformation, the target table gets expanded by adding many new features summarizing the information of the non-target tables. Based on the used feature construction strategy, this leads to a table of very high dimensionality with a lot of irrelevant and/or redundant features that can negatively affect the predictive performance. In this paper, we propose a modification of the traditional two-step framework to overcome such problems. The proposed approach evaluates the features during the construction phase and reports only a subset of highly predictive features to the propositional learner. We present an implementation of this approach using a genetic algorithm to search for an optimal feature subset. Our experiments on a number of benchmark datasets suggest that the modified framework can help propositionalization methods to significantly improve their predictive performance.
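The idea of searching for a feature subset with a genetic algorithm, as the abstract describes, can be illustrated with a minimal sketch. This is not the authors' implementation. The toy table, the nearest-centroid fitness with a size penalty, the truncation selection, and all function names and parameters below are assumptions chosen only to keep the example short and self-contained.

```python
import random


def make_toy_table(n_rows=60, n_features=8, seed=0):
    """Build a hypothetical flat table: columns 0 and 1 carry signal, the rest are noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_rows):
        y = rng.randint(0, 1)
        row = [y + rng.gauss(0, 0.3), 1 - y + rng.gauss(0, 0.3)]
        row += [rng.gauss(0, 1) for _ in range(n_features - 2)]
        rows.append(row)
        labels.append(y)
    return rows, labels


def fitness(mask, rows, labels):
    """Training accuracy of a nearest-centroid classifier on the selected
    columns, minus a small penalty per selected feature (an assumed fitness)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    centroids = {}
    for c in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == c]
        centroids[c] = [sum(r[j] for r in members) / len(members) for j in cols]
    correct = 0
    for r, y in zip(rows, labels):
        dist = {
            c: sum((r[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
            for c in centroids
        }
        correct += min(dist, key=dist.get) == y
    return correct / len(rows) - 0.01 * len(cols)


def evolve(rows, labels, pop_size=20, generations=30, mutation_rate=0.1, seed=1):
    """Evolve bit masks over the feature columns and return the best mask found."""
    rng = random.Random(seed)
    n = len(rows[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, rows, labels), reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0][:]]           # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [1 - bit if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        pop = children
    return max(pop, key=lambda m: fitness(m, rows, labels))


if __name__ == "__main__":
    rows, labels = make_toy_table()
    best = evolve(rows, labels)
    print("selected columns:", [j for j, bit in enumerate(best) if bit])
```

In this sketch each individual is a bit mask over the columns of the flat table, so only the columns whose bit is set would be passed on to the propositional learner. A realistic fitness would use held-out or cross-validated performance rather than training accuracy.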

Original language	English
Pages (from-to)	1739–1754
Journal	International Journal of Software Engineering and Knowledge Engineering
Volume	28
Issue number	11/12
DOIs	https://doi.org/10.1142/S0218194018400260
Publication status	Published - 2018

Cite this

@article{177fda83ffe149eba32951dc39673cc1,

title = "Evolutionary Propositionalization of Multi-Relational Data",

abstract = "Propositionalization has been proven to be a very effective solution for multi-relational data mining problems. The approaches usually follow a two-step principle: transforming the relational data into a single, flat table and applying a propositional learning algorithm. During the transformation, the target table gets expanded by adding many new features summarizing the information of the non-target tables. Based on the used feature construction strategy, this leads to a table of very high dimensionality with a lot of irrelevant and/or redundant features that can negatively affect the predictive performance. In this paper, we propose a modification of the traditional two-step framework to overcome such problems. The proposed approach evaluates the features during the construction phase and reports only a subset of highly predictive features to the propositional learner. We present an implementation of this approach using a genetic algorithm to search for an optimal feature subset. Our experiments on a number of benchmark datasets suggest that the modified framework can help propositionalization methods to significantly improve their predictive performance.",

author = "Franz Wotawa and Valentin Kassarnig",

year = "2018",

doi = "10.1142/S0218194018400260",

language = "English",

volume = "28",

pages = "1739--1754",

journal = "International Journal of Software Engineering and Knowledge Engineering",

issn = "0218-1940",

publisher = "World Scientific",

number = "11/12",

}

TY - JOUR

T1 - Evolutionary Propositionalization of Multi-Relational Data

AU - Wotawa, Franz

AU - Kassarnig, Valentin

PY - 2018

Y1 - 2018

AB - Propositionalization has been proven to be a very effective solution for multi-relational data mining problems. The approaches usually follow a two-step principle: transforming the relational data into a single, flat table and applying a propositional learning algorithm. During the transformation, the target table gets expanded by adding many new features summarizing the information of the non-target tables. Based on the used feature construction strategy, this leads to a table of very high dimensionality with a lot of irrelevant and/or redundant features that can negatively affect the predictive performance. In this paper, we propose a modification of the traditional two-step framework to overcome such problems. The proposed approach evaluates the features during the construction phase and reports only a subset of highly predictive features to the propositional learner. We present an implementation of this approach using a genetic algorithm to search for an optimal feature subset. Our experiments on a number of benchmark datasets suggest that the modified framework can help propositionalization methods to significantly improve their predictive performance.

U2 - 10.1142/S0218194018400260

DO - 10.1142/S0218194018400260

M3 - Article

SN - 0218-1940

VL - 28

SP - 1739

EP - 1754

JO - International Journal of Software Engineering and Knowledge Engineering

JF - International Journal of Software Engineering and Knowledge Engineering

IS - 11/12

ER -
