A new image analysis framework for Latin and Italian language discrimination
Conference proceedings contribution
Publication date:
2016
Abstract:
The paper presents a new framework for discriminating between the Latin and Italian languages. The first phase maps text in the given language into uniformly coded text, based on the position of each letter in the text line and its height, derived from its energy profile. The second phase treats the coded text as a 1-D image and extracts run-length texture measures from it, producing a feature vector of 11 values. A clustering algorithm then uses these feature vectors to discriminate between the languages. The two languages are distinguished with 100% accuracy on a complex database of Latin and Italian documents.
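The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made for illustration. The letter-to-height-class table (`ASCENDERS`, `DESCENDERS`, `FULL`) is hypothetical and hard-coded, whereas the paper derives heights from energy profiles. The grey levels 1 to 4 are arbitrary. The 11 features are the run-length statistics commonly used in texture analysis, since the abstract does not list which 11 measures are used.

```python
from itertools import groupby

# Hypothetical height classes (assumption; the paper uses energy profiles).
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gpqy")
FULL = set("j")


def code_text(text):
    """Map each letter to a grey level 1..4 by its assumed height class."""
    codes = []
    for ch in text:
        if not ch.isalpha():
            continue  # ignore spaces, digits, and punctuation
        if ch.isupper() or ch in ASCENDERS:
            codes.append(2)  # ascender-height letter
        elif ch in DESCENDERS:
            codes.append(3)  # descender letter
        elif ch in FULL:
            codes.append(4)  # letter spanning the full line height
        else:
            codes.append(1)  # base-height letter
    return codes


def run_length_counts(codes):
    """Count runs in a 1-D sequence: {(level, run_length): occurrences}."""
    counts = {}
    for level, group in groupby(codes):
        key = (level, len(list(group)))
        counts[key] = counts.get(key, 0) + 1
    return counts


def run_length_features(codes):
    """Compute 11 run-length statistics from a 1-D coded sequence."""
    p = run_length_counts(codes)
    n_runs = sum(p.values())
    if n_runs == 0:
        raise ValueError("coded text is empty")

    def mean_over_runs(f):
        # Average of f(level, run_length) over all runs.
        return sum(c * f(g, r) for (g, r), c in p.items()) / n_runs

    runs_per_level, runs_per_length = {}, {}
    for (g, r), c in p.items():
        runs_per_level[g] = runs_per_level.get(g, 0) + c
        runs_per_length[r] = runs_per_length.get(r, 0) + c

    return {
        "SRE": mean_over_runs(lambda g, r: 1 / r**2),
        "LRE": mean_over_runs(lambda g, r: r**2),
        "GLN": sum(v**2 for v in runs_per_level.values()) / n_runs,
        "RLN": sum(v**2 for v in runs_per_length.values()) / n_runs,
        "RP": n_runs / len(codes),
        "LGRE": mean_over_runs(lambda g, r: 1 / g**2),
        "HGRE": mean_over_runs(lambda g, r: g**2),
        "SRLGE": mean_over_runs(lambda g, r: 1 / (r**2 * g**2)),
        "SRHGE": mean_over_runs(lambda g, r: g**2 / r**2),
        "LRLGE": mean_over_runs(lambda g, r: r**2 / g**2),
        "LRHGE": mean_over_runs(lambda g, r: r**2 * g**2),
    }


if __name__ == "__main__":
    sample = "Gallia est omnis divisa in partes tres"
    vector = run_length_features(code_text(sample))
    for name, value in vector.items():
        print(f"{name}: {value:.4f}")
```

In this sketch, each document would yield one 11-value dictionary. The values from many documents could then be stacked into vectors and passed to a clustering algorithm of choice, which the abstract does not name.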
CRIS type:
4.1 Conference proceedings contribution
Keywords:
Clustering; Document analysis; Image processing; Information retrieval; Italian language; Statistical analysis
Authors:
Brodic, D.; Amelio, A.; Milivojevic, Z. N.
Link to full record:
Book title:
CEUR Workshop Proceedings
Published in: