Classification of the scripts in medieval documents from Balkan region by run-length texture analysis
Conference proceedings contribution
Publication date:
2015
Abstract:
The paper presents a method for classifying the scripts of medieval documents originating from the Balkan region. It is a multi-step procedure that comprises mapping the text according to typographical features, creating equivalent image patterns, analyzing the run-length patterns to establish a feature vector, and applying a state-of-the-art classification method, Genetic Algorithms Image Clustering for Document Analysis (GA-ICDA), which successfully discriminates between documents written in different scripts. The proposed method is evaluated on custom-oriented document databases, which include handprinted or printed documents written in old Cyrillic, angular and round Glagolitic, ancient Latin, and Greek scripts. The experiment demonstrates very good results.
CRIS type:
4.1 Conference proceedings contribution
Keywords:
Classification; Historical document; Optical character recognition; Pattern recognition; Run-length statistics; Script identification
Authors:
Brodic, D.; Amelio, A.; Milivojevic, Z. N.
Book title:
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Published in: