An image texture analysis method for minority language identification

Conference proceedings contribution

Publication date:

2017

Abstract:

This paper introduces an image texture analysis method for minority language identification. In the first stage, each letter is assigned to a script type according to its energy status in the text-line area. The mapping is carried out by extracting the Unicode text and transforming it into coded text. There are four script types, which correspond to four grey levels of an image. The resulting image is then subjected to feature extraction by texture analysis: the grey level co-occurrence matrix and its derived features are calculated. The extracted features are compared and classified using the K-Nearest Neighbors and Naive Bayes methods to establish a difference that can identify a minority language, such as Serbian, among other world languages in the text. The very good accuracy results demonstrate the efficiency of the proposed approach compared with other state-of-the-art methods.
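The coding and texture steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the letter sets, the grey-level numbering (0 to 3), the treatment of uppercase letters, the horizontal offset of one position, and the function names `script_type`, `encode`, `glcm`, and `features` are all assumptions made for illustration, and the mapping covers only basic Latin lowercase shapes.

```python
from collections import Counter

# Hypothetical letter classes: the abstract only states that there are
# four script types, so these sets are illustrative assumptions.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gpqy")
FULL = set("j")


def script_type(ch):
    """Return an assumed grey level (0..3) for a letter, None otherwise."""
    if not ch.isalpha():
        return None
    if ch.isupper() or ch in ASCENDERS:
        return 1  # assumed: occupies the upper zone of the text line
    if ch in DESCENDERS:
        return 2  # assumed: occupies the lower zone
    if ch in FULL:
        return 3  # assumed: spans both upper and lower zones
    return 0      # assumed: stays within the middle zone


def encode(text):
    """Turn text into a 1-D 'image' of grey levels, skipping non-letters."""
    return [g for g in map(script_type, text) if g is not None]


def glcm(levels, n_levels=4):
    """Normalized co-occurrence matrix of horizontally adjacent grey levels."""
    counts = Counter(zip(levels, levels[1:]))
    total = sum(counts.values())
    return [
        [counts[(i, j)] / total if total else 0.0 for j in range(n_levels)]
        for i in range(n_levels)
    ]


def features(p):
    """A few common features derived from a normalized co-occurrence matrix."""
    idx = range(len(p))
    return {
        "energy": sum(p[i][j] ** 2 for i in idx for j in idx),
        "contrast": sum((i - j) ** 2 * p[i][j] for i in idx for j in idx),
        "homogeneity": sum(p[i][j] / (1 + abs(i - j)) for i in idx for j in idx),
    }


if __name__ == "__main__":
    coded = encode("An image texture analysis method")
    print(features(glcm(coded)))
```

Feature vectors of this kind, computed per text sample, could then be passed to any K-Nearest Neighbors or Naive Bayes classifier; which features and classifier settings the paper actually uses is not stated in the abstract.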

CRIS type:

4.1 Conference proceedings contribution

Keywords:

Classification; Feature extraction; Image processing; Natural language processing; Statistical analysis

Author list:

Brodic, D.; Amelio, A.; Milivojevic, Z. N.

University authors:

AMELIO Alessia

Link to the full record:

https://ricerca.unich.it/handle/11564/770258

Book title:

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Published in:

LECTURE NOTES IN ARTIFICIAL INTELLIGENCE
