Testing ChatGPT’s Performance on Medical Diagnostic Tasks

Research output: Chapter in Book/Report/Conference proceeding | Conference paper | peer-review

Abstract

Large Language Models and chat interfaces such as ChatGPT have recently gained importance and attracted considerable attention, even from the general public. People use these tools not only to summarize or translate text but also to answer questions, including medical ones. For medical questions, reliable feedback is of utmost importance, yet its reliability is hard to assess. Therefore, we focus on validating ChatGPT's feedback and propose a testing procedure that uses other medical sources to determine the quality of that feedback for more straightforward medical diagnostic tasks. This paper outlines the problem, discusses available sources, and introduces the validation method. Moreover, we present the first results obtained by applying the testing framework to ChatGPT.
Original language: English
Title of host publication: Proceedings of the 27th International Multiconference Information Society – IS 2024
Chapter: K
Pages: 914-919
Number of pages: 6
Publication status: Published - Oct 2024
Event: 27th International Multiconference Information Society – IS 2024 - Ljubljana, Slovenia
Duration: 7 Oct 2024 – 11 Oct 2024

Conference

Conference: 27th International Multiconference Information Society – IS 2024
Abbreviated title: IS 2024
Country/Territory: Slovenia
City: Ljubljana
Period: 7/10/24 – 11/10/24

Keywords

  • Large Language Models
  • ChatGPT
  • NetDoktor
  • Testing
  • Validation

ASJC Scopus subject areas

  • Artificial Intelligence

Fields of Expertise

  • Information, Communication & Computing
