Oral corpus of Spanish speakers learning English as L2 forced alignment
PDF (Spanish)
XML (Spanish)

Keywords

ad-hoc corpus
aligners
initial clusters
interphonology

How to Cite

Oral corpus of Spanish speakers learning English as L2 forced alignment. (2025). Semas, 5(9), 7-25. https://doi.org/10.61820/semas.2683-3301.v5n9.132

Abstract

The compilation of L2 learner oral corpora entails a series of methodological challenges regarding both their design and their annotation. Researchers argue that automatic phonetic segmentation is a necessary step in their creation, since manual segmentation can be time-consuming, expensive, and inconsistent. Furthermore, previous results have shown that forced alignment algorithms can be valuable tools for the corpus-based study of interlanguage. This article describes the use of forced alignment software to characterize the formant structure of epenthetic vowels produced by Mexican learners of English as L2. The collected oral data and their subsequent analysis constitute the first acoustic characterization of epenthetic vowels in the population of interest. The results show that the epenthetic vowel is closer (higher) than the Spanish allophone. These results are a starting point for further inquiry that considers a larger number of speakers, other L2 proficiency levels, and different speech styles.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright (c) 2024 Semas