Abstract
Extracting text from images of text is a challenging task. Among the greatest
challenges of text recognition is the optical character recognition (OCR) of historic books.
Due to low-quality images, rare fonts, and unknown dictionaries, standard OCR software often fails to recognize these texts. In this paper, we discuss existing OCR systems with a focus on learning strategies, and present an OCR model that is optimized to recognize old books. Additionally, we describe the process used to
measure the quality of the outcome.
Original language | English |
---|---|
Number of pages | 5 |
Journal | The IPSI BgD Transactions on Advanced Research |
Volume | 12 |
Issue number | 1 |
Publication status | Published - 2016 |
Keywords
- ocr
ASJC Scopus subject areas
- Artificial Intelligence