Catalog record details

Printed Ottoman text recognition using synthetic data and data augmentation

Article by: Bilgin Tasdemir, Esma F.

Abstract: The Ottoman script, which was in use for over five centuries, is a writing system based on the Arabic alphabet. It became obsolete after the change of alphabet in Turkey. There are plenty of Ottoman documents, overwhelmingly printed in the Naskh style. This work presents a deep learning-based character recognition system for the printed Ottoman script. We first generate a synthetic text image dataset from a text corpus and then augment it using image processing methods. We develop a hybrid convolutional neural network-bidirectional long short-term memory (CNN-BiLSTM) recognizer and train it with the original and the augmented datasets. Finally, we apply a transfer learning procedure to adapt the system to real image data. The proposed system obtains a character error rate (CER) of 0.11 on synthetic data and 0.16 on real data comprising line images from a printed historical Ottoman book.
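The abstract reports results as CER. As a reading aid, here is a minimal sketch of how a character error rate is commonly computed: the Levenshtein (edit) distance between the recognized text and the reference transcription, divided by the reference length. The function names `edit_distance` and `cer` are illustrative, and the article's own evaluation code may differ, for example in how it normalizes text or aggregates over lines.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate of hypothesis `hyp` against reference `ref`."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

Under this definition, a CER of 0.16 corresponds to roughly 16 character edits per 100 reference characters.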

Language: English
