Disease-disease relationships for rheumatic diseases: Web-based biomedical textmining and knowledge discovery to assist medical decision making

Andreas Holzinger, Klaus-Martin Simonic, Pinar Yildirim

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

The MEDLINE database (Medical Literature Analysis and Retrieval System Online) contains a rapidly growing volume of biomedical articles. There is an urgent need for techniques that enable the discovery, extraction, integration and use of the hidden knowledge in those articles. Text mining aims at developing technologies that help cope with the interpretation of these large volumes of publications. Co-occurrence analysis is a text-mining technique whose methodologies and statistical models are used to evaluate the significance of relationships between entities such as disease names, drug names, and keywords in titles, abstracts or even entire publications. In this paper we present a method for, and an evaluation of, knowledge discovery of disease-disease relationships for rheumatic diseases. This has huge medical relevance, since rheumatic diseases affect hundreds of millions of people worldwide and lead to substantial loss of functioning and mobility. In this study, we interviewed medical experts and searched the ACR (American College of Rheumatology) web site in order to select the most frequently observed rheumatic diseases for which to explore disease-disease relationships. We used a web-based text-mining tool to find disease names and their co-occurrence frequencies in MEDLINE articles for each disease. After finding disease names and frequencies, we normalized the names by interviewing medical experts and by utilizing biomedical resources. Frequencies are normally a good indicator of the relevance of a concept, but they tend to overestimate the importance of common concepts. We therefore also used the Pointwise Mutual Information (PMI) measure to assess the strength of a relationship. PMI indicates how much more often the query and a concept co-occur than would be expected by chance. After finding PMI values for each disease, we ranked these values and frequencies together.
The results reveal hidden knowledge in articles regarding rheumatic diseases indexed by MEDLINE, thereby exposing relationships that can provide important additional information for medical experts and researchers for medical decision-making.
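
As an illustration of the PMI measure mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the standard definition PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ), with probabilities estimated from document counts. The function names, the concept names and all counts are hypothetical and chosen only to show how a rare concept can outrank a common one under PMI while ranking lower by raw frequency.

```python
import math


def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise mutual information (base 2) from document counts.

    n_xy    : documents mentioning both the query and the concept
    n_x     : documents mentioning the query
    n_y     : documents mentioning the concept
    n_total : documents in the collection
    """
    if min(n_xy, n_x, n_y, n_total) <= 0:
        raise ValueError("all counts must be positive")
    # p(x, y) / (p(x) * p(y)) simplifies to n_xy * n_total / (n_x * n_y)
    return math.log2((n_xy * n_total) / (n_x * n_y))


def rank_by_pmi(n_query, concepts, n_total):
    """Return (name, frequency, pmi) tuples sorted by descending PMI.

    concepts maps a concept name to (co-occurrence count, concept count).
    """
    rows = [
        (name, n_xy, pmi(n_xy, n_query, n_y, n_total))
        for name, (n_xy, n_y) in concepts.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not taken from MEDLINE).
    n_total = 1_000_000
    n_query = 10_000
    concepts = {
        "concept_common": (3_000, 200_000),  # frequent everywhere
        "concept_rare": (100, 500),          # rare but tightly associated
    }
    for name, freq, score in rank_by_pmi(n_query, concepts, n_total):
        print(f"{name}: frequency={freq}, PMI={score:.2f}")
```

In this made-up example the common concept has the higher co-occurrence frequency, but the rare concept has the higher PMI, which is the overestimation effect the abstract describes. How the paper combines the two rankings is not specified in the abstract, so the sketch only sorts by PMI.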
Original language: English
Title of host publication: IEEE COMPSAC 36th International Conference on Computer Software and Applications
Place of Publication: New York
Publisher: Institute of Electrical and Electronics Engineers
Pages: 573-580
ISBN (Print): 978-076954736-7
Publication status: Published - 2012
Event: 36th Annual International Computer Software and Applications Conference: COMPSAC 2012 - Izmir, Turkey
Duration: 14 Mar 2012 → 20 Jul 2012

Conference

Conference: 36th Annual International Computer Software and Applications Conference
Country/Territory: Turkey
City: Izmir
Period: 14/03/12 → 20/07/12

Fields of Expertise

  • Human- & Biotechnology

Treatment code (more detailed classification)

  • Basic - Fundamental (basic research)
  • Application
  • Experimental
