Abstract
Natural language processing applications for Arabic must account for the linguistic characteristics of speech and represent those characteristics in script in order to reduce computational complexity and, in turn, the word error rate (WER). Suprasegmental features are fundamental properties of speech that can enhance the performance of many speech processing applications. The present study considered stress, a prosodic feature that marks the prominence of syllables in speech, by developing a tool that generated phonetic transcriptions and predicted the stress position. The generated transcription was used to create the phonetic dictionary necessary for developing an automatic speech recognition (ASR) system. This tool had to be accurate, linguistically motivated, and useful in applications; therefore, the effectiveness of the generated stress-marked phonetic dictionary was tested by comparing the performance of a standard fixed dictionary-based system with that of one using the automatically generated dictionary. The study found a 5.6% reduction in WER when using a dictionary with stress markers attached to each phone in the stressed syllable and a 3.5% reduction in WER when using a dictionary with stress markers assigned only to stressed vowels. These results encourage future studies to employ prosodic features of speech when developing speech processing applications.
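The two dictionary variants compared in the abstract can be illustrated with a minimal sketch. It is not the authors' tool: the vowel inventory, the `1` stress marker, the function names, and the pre-syllabified input with a known stress position are all assumptions made for illustration, and the stress prediction step itself is not shown.

```python
# Hypothetical vowel inventory for illustration only; not the paper's phone set.
VOWELS = {"a", "i", "u", "aa", "ii", "uu"}
STRESS_MARK = "1"  # assumed marker; the paper's own notation is not given here


def mark_all_phones(syllables, stressed_index):
    """Attach the stress marker to every phone in the stressed syllable."""
    phones = []
    for i, syllable in enumerate(syllables):
        for phone in syllable:
            phones.append(phone + STRESS_MARK if i == stressed_index else phone)
    return phones


def mark_vowels_only(syllables, stressed_index):
    """Attach the stress marker only to vowels in the stressed syllable."""
    phones = []
    for i, syllable in enumerate(syllables):
        for phone in syllable:
            is_stressed_vowel = i == stressed_index and phone in VOWELS
            phones.append(phone + STRESS_MARK if is_stressed_vowel else phone)
    return phones


# Assumed input: a word already split into syllables, stress on syllable 0.
syllables = [["k", "a"], ["t", "a"], ["b", "a"]]
print(" ".join(mark_all_phones(syllables, 0)))
print(" ".join(mark_vowels_only(syllables, 0)))
```

Under these assumptions, the first variant treats every phone of the stressed syllable as a distinct unit in the dictionary, while the second enlarges the phone set only by the stressed vowels.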
Recommended Citation
Alsharhan, Eiman and Alnajem, Salah (2021) "Developing a Stress Prediction Tool for Arabic Speech Recognition Tasks," Scientific Journal of King Faisal University: Humanities and Management Sciences: Vol. 22: Iss. 2, Article 44.
https://doi.org/10.37575/h/lng/2323
Available at:
https://sjkfuh.researchcommons.org/journal/vol22/iss2/44
