Optical Character Recognition of Old Fonts – A Case Study

Johanna Pirker, Gerhard Wurzinger

Research output: Contribution to journal › Article › peer-review

Abstract

Extracting text from images of text is a challenging task. Among the greatest challenges of text recognition is the optical character recognition (OCR) of historic books. Due to low-quality images, rare fonts, and unknown dictionaries, standard OCR software often fails to recognize these texts. In this paper, we discuss existing OCR systems with a focus on learning strategies, and present an OCR model that is optimized to recognize old books. Additionally, we describe the process used to measure the quality of the outcome.

Original language: English
Number of pages: 5
Journal: The IPSI BgD Transactions on Advanced Research
Volume: 12
Issue number: 1
Publication status: Published - 2016

Keywords

  • ocr

ASJC Scopus subject areas

  • Artificial Intelligence
