Enhancing automatic plagiarism detection
Using Doc2vec model
Article by: Aliane, Hassina ; Setha, Imene
Abstract: Academic institutions define plagiarism as an act of cheating: taking others' ideas and passing them off as one's own. Plagiarism detection has therefore attracted considerable interest, and multiple techniques have been applied to it. In this article, we propose a method to automatically detect different types of plagiarism in two languages, English and Arabic. The method is based on sentence modelling: it uses the Doc2Vec model, which predicts semantic similarity between documents and phrases, to extract plagiarized passages from documents. We use the PAN corpus for English plagiarism detection and AraPlagDet for Arabic. Both the PAN and AraPlagDet corpora provide a set of suspicious documents, plagiarized manually and artificially, along with their sources.
Language:
English
Subject
Computer science
Keywords:
Word2Vec
Paragraph vector
Doc2Vec
LSA
PAN corpus
AraPlagDet corpus