Optimizing Named Entity Recognition for Improving Logical Formulae Abstraction from Technical Requirements Documents

Alexander Perko*, Haoran Zhao, Franz Wotawa

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference paper, peer-review

Abstract

Requirements engineering involves obtaining a complete and consistent set of requirements for a particular system. Requirements engineers often formulate requirements as text, a form in which consistency and completeness cannot be proven automatically. Converting textual requirements into logical formulae would enable analysis and reasoning tasks. In this work, we contribute to converting textual requirements into logical sentences. In particular, we focus on the named entity recognition task on industrial requirements documents to improve semantic abstraction using logical formalisms. We found that general-purpose models for named entity recognition do not work well for specialized domains. Hence, we focused on retraining such models and investigated the ratio of domain-specific to general-purpose data used for retraining. Our results demonstrate significant improvements in F1 scores over the respective pre-trained baseline models on domain-specific tasks.
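The idea of retraining on a mix of domain-specific and general-purpose data can be sketched as follows. This is a minimal illustration only: the abstract does not describe the authors' sampling procedure, so the function name `mix_training_data`, its parameters, and the fixed-ratio sampling scheme are all assumptions made here for illustration.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def mix_training_data(
    domain: Sequence[T],
    general: Sequence[T],
    domain_ratio: float,
    total: int,
    seed: int = 0,
) -> List[T]:
    """Draw `total` examples, a `domain_ratio` share of them domain-specific.

    Hypothetical helper, not the paper's method: it samples without
    replacement from each pool and shuffles the combined result.
    """
    if not 0.0 <= domain_ratio <= 1.0:
        raise ValueError("domain_ratio must lie in [0, 1]")

    n_domain = round(total * domain_ratio)
    n_general = total - n_domain
    if n_domain > len(domain) or n_general > len(general):
        raise ValueError("not enough examples in one of the pools")

    rng = random.Random(seed)  # seeded for reproducible mixes
    mixed = rng.sample(list(domain), n_domain) + rng.sample(list(general), n_general)
    rng.shuffle(mixed)
    return mixed
```

Sweeping `domain_ratio` over several values and retraining on each mix would be one way to compare F1 scores across ratios, under the assumptions stated above.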

Original language: English
Title of host publication: Proceedings - 2023 10th International Conference on Dependable Systems and Their Applications, DSA 2023
Publisher: Institute of Electrical and Electronics Engineers
Pages: 211-222
Number of pages: 12
ISBN (Electronic): 9798350304770
Publication status: Published - 2023
Event: 10th International Conference on Dependable Systems and Their Applications: DSA 2023 - Tokyo, Japan
Duration: 10 Aug 2023 to 11 Aug 2023
https://dsa23.techconf.org/

Conference

Conference: 10th International Conference on Dependable Systems and Their Applications
Abbreviated title: DSA 2023
Country/Territory: Japan
City: Tokyo
Period: 10/08/23 to 11/08/23

Keywords

  • domain-specific data set
  • named entity recognition
  • Natural language processing
  • pre-trained language models
  • rehearsal sampling strategy

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Software
  • Information Systems
  • Safety, Risk, Reliability and Quality
