Using question-specific vocabularies to support speech data collection with SALAAM

Chibuye, Kayokwa Nick

dc.contributor.advisor	De Renzi, Brian
dc.contributor.author	Chibuye, Kayokwa Nick
dc.date.accessioned	2020-03-10T13:58:34Z
dc.date.available	2020-03-10T13:58:34Z
dc.date.issued	2019
dc.date.updated	2020-03-10T13:45:53Z
dc.description.abstract	Small-vocabulary spoken dialogue systems are increasingly used in low-resource settings for information dissemination and data collection. This provides an opportunity to reduce the information gap in settings where low literacy is a major hindrance to the adoption of Information and Communication Technologies (ICTs). Since the languages spoken in these areas are computationally low-resourced, systems built for them rely on techniques such as cross-language phoneme mapping to facilitate fast development of small-vocabulary speech recognisers. Despite the success of this technique, there has been a lack of guidance on how to deploy such systems across a range of languages. This study presents a systematic exploration of the suitability and limitations of cross-language phoneme mapping for developing small-vocabulary speech recognisers for computationally low-resource languages, particularly Bantu languages. Five target languages and four source languages were used in the study. Speech-based Accent Learning And Articulation Mapping (SALAAM), a cross-language phoneme mapping algorithm, was used in the study through its implementation in the open-source tool Lex4All. The following research questions guided our investigations: i) What impact does source language choice have on recognition accuracy? ii) What impact does the gender composition of the training data set have on recognition accuracy? iii) What impact does varying the number of alternative pronunciations per word type have on recognition accuracy? Data for the target languages was collected from 104 university student volunteers, 58 female and 46 male. The results showed that both the phonetic similarity between target and source languages and the gender composition of the training datasets affect the recognition accuracy of speech applications developed using cross-language phoneme mapping techniques. They also showed that increasing the number of alternative pronunciations per word in the vocabulary generally increases recognition accuracy, although at the cost of a slower system response time. This study provides evidence that careful selection of the source language, the gender composition of the training data and the number of alternative pronunciations per word can improve the recognition accuracy of speech recognisers developed using cross-language phoneme mapping.
dc.identifier.apacitation	Chibuye, K. N. (2019). <i>Using question-specific vocabularies to support speech data collection with SALAAM</i>. Faculty of Science, Department of Computer Science. Retrieved from http://hdl.handle.net/11427/31535	en_ZA
dc.identifier.chicagocitation	Chibuye, Kayokwa Nick. <i>"Using question-specific vocabularies to support speech data collection with SALAAM."</i> Faculty of Science, Department of Computer Science, 2019. http://hdl.handle.net/11427/31535	en_ZA
dc.identifier.citation	Chibuye, K.N. 2019. Using question-specific vocabularies to support speech data collection with SALAAM. Faculty of Science, Department of Computer Science. http://hdl.handle.net/11427/31535	en_ZA
dc.identifier.ris	TY - Thesis / Dissertation AU - Chibuye, Kayokwa Nick DA - 2019 DB - OpenUCT DP - University of Cape Town KW - computer science LK - https://open.uct.ac.za PY - 2019 T1 - Using question-specific vocabularies to support speech data collection with SALAAM TI - Using question-specific vocabularies to support speech data collection with SALAAM UR - http://hdl.handle.net/11427/31535 ER -	en_ZA
dc.identifier.uri	http://hdl.handle.net/11427/31535
dc.identifier.vancouvercitation	Chibuye KN. Using question-specific vocabularies to support speech data collection with SALAAM. Faculty of Science, Department of Computer Science, 2019 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/31535	en_ZA
dc.language.rfc3066	eng
dc.publisher.department	Department of Computer Science
dc.publisher.faculty	Faculty of Science
dc.subject	computer science
dc.title	Using question-specific vocabularies to support speech data collection with SALAAM
dc.type	Master Thesis
dc.type.qualificationlevel	Masters
dc.type.qualificationname	MSc

Files

Original bundle

Name: thesis_sci_2019_chibuye_kayokwa_nick.pdf
Size: 1.79 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 0 B
Format: Item-specific license agreed upon to submission

Collections

Masters