Dating the Historical Documents from Digitalized Books by Orthography Recognition
Conference proceedings contribution
Publication date:
2017
Abstract:
This paper introduces a new method for automatically dating Serbian and Croatian historical documents. It rests on the observation that documents written in a given script or language during different historical periods differ in their orthographic rules. To capture these differences, we propose a three-stage method: script coding, texture analysis, and classification. First, the input document is transformed into a sequence of numerical codes, each representing an intensity value, so that the sequence defines an image. Next, texture analysis extracts features from the image to create a feature vector. Finally, the feature vector is classified to recognize the orthography. Results obtained on two databases of historical documents extracted from digitalized books, in angular Glagolitic script and in the Slavonic-Serbian and Serbian languages, demonstrate the efficacy of the proposed method.
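The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the coding scheme, the texture descriptors, or the classifier. The three-level letter-shape coding, the adjacent-pair co-occurrence statistics (energy, entropy, contrast), the nearest-centroid rule, and all function names below are assumptions chosen only to make the pipeline concrete.

```python
import math
from collections import Counter

# Hypothetical coding by letter shape (assumption, not the paper's scheme).
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def code_text(text):
    """Stage 1 (script coding): map each letter to an integer 'intensity' code."""
    codes = []
    for ch in text:
        if not ch.isalpha():
            continue  # skip spaces, digits, and punctuation
        if ch.isupper() or ch in ASCENDERS:
            codes.append(1)
        elif ch in DESCENDERS:
            codes.append(2)
        else:
            codes.append(0)
    return codes


def texture_features(codes):
    """Stage 2 (texture analysis): statistics of adjacent-code co-occurrences."""
    pairs = Counter(zip(codes, codes[1:]))
    total = sum(pairs.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    energy = entropy = contrast = 0.0
    for (a, b), n in pairs.items():
        p = n / total
        energy += p * p
        entropy -= p * math.log2(p)
        contrast += (a - b) ** 2 * p
    return [energy, entropy, contrast]


def nearest_centroid(train, sample):
    """Stage 3 (classification): pick the label whose mean vector is closest.

    train maps each orthography label to a list of feature vectors.
    """
    best_label, best_dist = None, float("inf")
    for label, vectors in train.items():
        centroid = [sum(col) / len(vectors) for col in zip(*vectors)]
        dist = math.dist(centroid, sample)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```

In use, each labeled document would pass through `code_text` and `texture_features` to build the `train` dictionary, and an undated document would be assigned the orthography label returned by `nearest_centroid`.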
CRIS type:
4.1 Conference proceedings contribution
Keywords:
Classification; Digital book; Historical documents; Image processing; Orthography recognition
Authors:
Brodic, D.; Amelio, A.
Link to the full record:
Book title:
Communications in Computer and Information Science
Published in: