Adopting Speech Recognition in EFL/ESL Contexts: Are We There Yet?

Lan Vu, Thom Thibeault, Phu Vu


This paper reviews the use of speech recognition (SR) technology in EFL/ESL classrooms over the last few decades, addresses researchers' and educators' concerns about the limitations of this technology, and examines how far SR technology has evolved in its own field. Finally, it discusses the potential pedagogical implications of SR technology for EFL/ESL, its limitations, and suggestions for further studies.







This work is licensed under a Creative Commons Attribution 4.0 International License.