Detailed record

The Contribution of Selected Linguistic Markers for Unsupervised Arabic Verb Sense Disambiguation

Article written by: Azzoune, Hamid; Aliane, Hassina; Djaidri, Asma

Abstract: Word sense disambiguation (WSD) is the task of automatically determining the meaning of a polysemous word in a specific context. Word sense induction (WSI) is the unsupervised clustering of a word's usages across different contexts to distinguish its senses and thereby perform unsupervised WSD. Most studies treat function words as stop words and delete them during pre-processing. However, function words can encode meaningful information that helps improve the performance of WSD approaches. In this work, we propose a novel approach to Arabic verb sense disambiguation: a preposition-based classification feeds an automatic word sense induction step that builds the sense inventories used to disambiguate Arabic verbs. In addition, in the wake of the success of neural language models, recent works have obtained encouraging results using pre-trained BERT models for English WSD. Hence, we use contextualized word embeddings for unsupervised Arabic WSD (AWSD) that is based on linguistic markers and uses pre-trained Sentence-BERT (SBERT) Transformer models; it yields encouraging results that outperform other existing unsupervised neural AWSD approaches.


Language: English
Subject: Computer science

Keywords:
Clustering
Natural language processing
Word sense disambiguation
Arabic language
Word sense induction
Linguistic markers
Contextualized word embeddings
SBERT