Subword-based approaches for spoken document retrieval
Article by: Ng, Kenney; Zue, Victor W.
Abstract: This paper explores approaches to the problem of spoken document retrieval (SDR), which is the task of automatically indexing and then retrieving relevant items from a large collection of recorded speech messages in response to a user-specified natural language text query. We investigate the use of subword unit representations for SDR as an alternative to words generated by either keyword spotting or continuous speech recognition. In this study, we explore the space of possible subword units to determine the complexity of the subword units needed for SDR; describe the development and application of a phonetic recognition system to extract subword units from the speech signal; examine the behavior and sensitivity of the subword units to speech recognition errors; measure the effect of speech recognition performance on retrieval performance; and investigate a number of robust indexing and retrieval methods in an effort to improve retrieval performance in the presence of speech recognition errors. We find that with the appropriate subword units, it is possible to achieve performance comparable to that of text-based word units if the underlying phonetic units are recognized correctly. In the presence of speech recognition errors, retrieval performance degrades to 60% of the clean reference level. The robust methods improve this performance by a relative 23%, to 74% of the clean reference.
Language: English