An image texture analysis method for minority language identification
Conference proceedings contribution
Publication date:
2017
Abstract:
This paper introduces an image texture analysis method for minority language identification. In the first stage, each letter is associated with a given script type according to its energy status in the text-line area. Mapping is carried out by extracting Unicode text and transforming it into coded text. There are four different script types, which correspond to four grey levels of an image. Then, the obtained image is subjected to a feature extraction process based on texture analysis: the grey level co-occurrence matrix and its derived features are calculated. The extracted features are compared and classified using the K-Nearest Neighbors and Naive Bayes methods to establish a difference that can distinguish a minority language, such as Serbian, from other world languages in the text. Very good accuracy results demonstrate the efficiency of the proposed approach compared with other state-of-the-art methods.
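The pipeline described in the abstract (letters coded into four grey levels, then a grey level co-occurrence matrix and its derived features) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the paper: the letter sets `ASCENDERS`, `DESCENDERS`, and `FULL`, the treatment of the coded text as a one-dimensional image with a horizontal offset of one, and the choice of energy, contrast, and homogeneity as the derived features.

```python
# Hypothetical letter classes for lowercase Latin text. The abstract does not
# list the actual mapping, so these sets are illustrative assumptions only.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gpqy")
FULL = set("j")


def code_text(text):
    """Map each letter to a grey level 0..3; non-letters are skipped."""
    codes = []
    for ch in text:
        if not ch.isalpha():
            continue
        if ch in FULL:
            codes.append(3)   # occupies the full text-line height
        elif ch in DESCENDERS:
            codes.append(2)   # extends below the baseline
        elif ch in ASCENDERS or ch.isupper():
            codes.append(1)   # extends above the x-height
        else:
            codes.append(0)   # base letter
    return codes


def glcm(codes, levels=4):
    """Normalized co-occurrence matrix of adjacent pairs (offset 1, left to right)."""
    counts = [[0] * levels for _ in range(levels)]
    for a, b in zip(codes, codes[1:]):
        counts[a][b] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        return [[0.0] * levels for _ in range(levels)]
    return [[c / total for c in row] for row in counts]


def glcm_features(p):
    """Three common GLCM descriptors computed from a normalized matrix p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }


features = glcm_features(glcm(code_text("An image texture analysis method")))
```

In a full system, feature vectors like `features` computed for labelled text samples would be passed to a classifier such as K-Nearest Neighbors or Naive Bayes, as the abstract describes; that step is omitted here because it requires training data.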
CRIS type:
4.1 Conference proceedings contribution
Keywords:
Classification; Feature extraction; Image processing; Natural language processing; Statistical analysis
Authors:
Brodic, D.; Amelio, A.; Milivojevic, Z. N.
Book title:
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Published in: