Catalog record details

Integrated document segmentation and region identification

textual, equation and graphical

Article by: Thiyam, Jennil ; Ranbir Singh, Sanasam ; Kumar Bora, Prabin

Abstract: With the advance of digitization, storing information as scanned copies, images, and similar formats has become the new normal. This creates a need for systems that can accurately extract information from scanned documents or images for every component they contain, such as textual and graphical components. The first step in extracting document information is to segment the document layout, that is, to divide the document into textual and non-textual regions of interest. Document layout segmentation has been studied extensively, and this study observes that most existing approaches share one challenge: accurate segmentation of graphical components with sparsely clustered pixels, such as flowcharts and block diagrams. The study addresses this with a two-tier feedback-based framework. The first tier segments and classifies the textual and mathematical equation components, while the second tier segments and classifies the graphical regions using feedback from the first tier. This feedback is the regional information of the equation and textual components, which is used to derive a modified copy of the original input document image in which most of the foreground pixels belong to graphical regions. The proposed framework outperforms various existing studies when evaluated on multiple data sets.

Language: English
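
Note: the feedback step described in the abstract (using the first tier's text and equation regions to obtain a copy of the page whose remaining foreground is mostly graphical) can be pictured with a minimal sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function name mask_first_tier_regions, the bounding-box format (top, left, bottom, right), and the convention that nonzero pixels are foreground are all hypothetical.

```python
import numpy as np


def mask_first_tier_regions(binary_page, regions, background=0):
    """Return a copy of the page with the given regions blanked out.

    binary_page: 2-D array where nonzero means foreground (assumed convention).
    regions: iterable of (top, left, bottom, right) boxes, end-exclusive,
             standing in for text/equation regions found by a first tier.
    """
    residual = binary_page.copy()  # leave the original image untouched
    for top, left, bottom, right in regions:
        residual[top:bottom, left:right] = background
    return residual


# Toy page: two rows of "text" at the top, a small "diagram" at the bottom.
page = np.zeros((6, 8), dtype=np.uint8)
page[0:2, 0:8] = 1   # hypothetical text line
page[4:6, 2:5] = 1   # hypothetical graphical component

# Hypothetical first-tier output: one box covering the text line.
text_boxes = [(0, 0, 2, 8)]
residual = mask_first_tier_regions(page, text_boxes)
```

In this toy case the residual image keeps only the pixels of the hypothetical diagram, which is the kind of input a second tier would then segment.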