Detailed record

Deep entity matching with adversarial active learning

Article written by: Huang, Jiacheng; Hu, Wei; Bao, Zhifeng; Chen, Qijin; Qu, Yuzhong

Abstract: Entity matching (EM), as a fundamental task in data cleansing and integration, aims to identify the data records in databases that refer to the same real-world entity. While recent deep learning technologies significantly improve the performance of EM, they are often restrained by large-scale noisy data and insufficient labeled examples. In this paper, we present a novel EM approach based on deep neural networks and adversarial active learning. Specifically, we design a deep EM model to automatically complete missing textual values and capture both similarity and difference between records. Given that learning massive parameters in the deep model needs expensive labeling cost, we propose an adversarial active learning framework, which leverages active learning to collect a small amount of "good" examples and adversarial learning to augment the examples for stability enhancement. Additionally, to deal with large-scale databases, we present a dynamic blocking method that can be interactively tuned with the deep EM model. Our experiments on benchmark datasets demonstrate the superior accuracy of our approach and validate the effectiveness of all the proposed modules.


Language: English