A new image analysis framework for Latin and Italian language discrimination

Contribution in conference proceedings

Publication date:

2016

Abstract:

The paper presents a new framework for discriminating between the Latin and Italian languages. The first phase maps text in the given language into a uniformly coded text, based on the position of each letter of the script in the text line and on its height, which is derived from its energy profile. The second phase treats the coded text as a 1-D image and extracts run-length texture measures from it, producing a feature vector of 11 values. The resulting feature vectors are passed to a clustering algorithm for language discrimination. The two languages are separated with an accuracy of 100% on a complex database of documents in Latin and Italian.
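The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It rests on several assumptions: letters are assigned to height/position classes with a hand-written lookup table (the paper derives them from each letter's energy profile in the image), the three class codes are hypothetical, and only two classic run-length statistics (short run emphasis and long run emphasis) are computed, not the 11 features used in the paper. The clustering step is omitted.

```python
from itertools import groupby

# Assumption: hypothetical letter classes by vertical extent in the text line.
# The paper derives these from the letter's energy profile, not from a table.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def encode(text):
    """Map each letter to a code: 0 = short, 1 = ascender/capital, 2 = descender."""
    codes = []
    for ch in text:
        if not ch.isalpha():
            continue  # skip spaces, digits, and punctuation
        if ch.isupper() or ch in ASCENDERS:
            codes.append(1)
        elif ch in DESCENDERS:
            codes.append(2)
        else:
            codes.append(0)
    return codes


def run_lengths(codes):
    """Return (code, run length) pairs for a 1-D coded sequence."""
    return [(value, len(list(group))) for value, group in groupby(codes)]


def short_run_emphasis(runs):
    """Average of 1 / length^2 over all runs; large when runs are short."""
    return sum(1 / length**2 for _, length in runs) / len(runs)


def long_run_emphasis(runs):
    """Average of length^2 over all runs; large when runs are long."""
    return sum(length**2 for _, length in runs) / len(runs)


if __name__ == "__main__":
    runs = run_lengths(encode("Gallia est omnis divisa in partes tres"))
    print(short_run_emphasis(runs), long_run_emphasis(runs))
```

In a full pipeline, one such feature vector would be computed per document and the set of vectors handed to a clustering algorithm; which algorithm the paper uses is not stated in this record.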

CRIS type:

4.1 Contribution in conference proceedings

Keywords:

Clustering; Document analysis; Image processing; Information retrieval; Italian language; Statistical analysis

Authors:

Brodic, D.; Amelio, A.; Milivojevic, Z. N.

University-affiliated authors:

AMELIO Alessia

Link to the full record:

https://ricerca.unich.it/handle/11564/770248

Book title:

CEUR Workshop Proceedings

Published in:

CEUR WORKSHOP PROCEEDINGS
