Making an AI-enabled pronunciation bot with Teachable Machine

Srunika Kannan
Jun 18, 2021


As a language teacher in India who trains Indian L2 French students in the nuances of French pronunciation, I believe that AI-enabled solutions can work wonders in foreign language classrooms in India. Indian classrooms pose a unique challenge: a class sometimes has up to 70 students and only one teacher. With one teacher catering to so many students, individual attention, error correction and individualised practice become very hard to achieve. For an Indian student who is used to English as the medium of instruction and language of communication, there is a lot of interference between English and French. For those who do not speak English, there is mother-tongue interference as well.

This is where an AI solution like speech recognition can come in handy, as it can (even at a basic level) identify syllable pronunciations made by students and tell them if they are on the right track. In an advanced setting it should also be able to suggest syllable combinations based on what students got right and the sounds they struggle with.

Figure: French sounds and English interference that students have problems with

If a model is taught to ‘listen’ to specific sounds that the students make and show them whether they pronounced them correctly, it can then also be trained to coach them progressively (with immediate feedback) towards perfecting the hard-to-pronounce syllables. All of this happens while the student avoids the potential embarrassment of having to ask the teacher several times, or the frustration of not being able to reach the teacher for personal help.
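To illustrate what "progressive" practice could mean in code, here is a minimal TypeScript sketch that records whether each attempt at a sound was judged correct and then suggests the sound with the lowest success rate. It is not part of my proof of concept; the class name PracticeTracker, its methods and the example sound names are all hypothetical.

```typescript
// Hypothetical helper: per-sound practice statistics for one student.
interface SoundStats {
  attempts: number;
  correct: number;
}

class PracticeTracker {
  // Keyed by a sound name chosen by the teacher (an assumption, e.g. "u" or "r").
  private stats: Record<string, SoundStats | undefined> = {};

  // Record one attempt and whether the model judged it correct.
  record(sound: string, wasCorrect: boolean): void {
    const current = this.stats[sound] ?? { attempts: 0, correct: 0 };
    this.stats[sound] = {
      attempts: current.attempts + 1,
      correct: current.correct + (wasCorrect ? 1 : 0),
    };
  }

  // Share of correct attempts for a sound; 0 if it was never practised.
  accuracy(sound: string): number {
    const s = this.stats[sound];
    return s !== undefined && s.attempts > 0 ? s.correct / s.attempts : 0;
  }

  // The practised sound with the lowest accuracy, or undefined if nothing
  // has been recorded yet. Ties go to the sound recorded first.
  weakestSound(): string | undefined {
    let weakest: string | undefined;
    let lowest = Infinity;
    for (const sound of Object.keys(this.stats)) {
      const acc = this.accuracy(sound);
      if (acc < lowest) {
        lowest = acc;
        weakest = sound;
      }
    }
    return weakest;
  }
}
```

A student who gets "u" right once in two attempts and "r" right every time would be pointed back to "u". A real tool would probably also weigh how recent each attempt was, which this sketch ignores.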

With the help of Google’s Teachable Machine, I was able to make a small ‘proof of concept’ of this idea to demonstrate the core principles. I recorded sound samples of the French word ‘Nice’, the city (home of the M.Sc Smart Edtech :) ), and sound samples of ‘nice’, the English adjective. The word is a perfect example of English interference: French language students often have difficulty differentiating between these two words while reading out loud. I also recorded background noise and then trained the model. The final output was a snippet of code that I embedded on a static one-page site using Glitch.
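To make the feedback step concrete, here is a minimal TypeScript sketch of how the per-class scores from a three-class sound model like this one could be turned into a message for the student. It is not the code that Teachable Machine exports and not the code on my Glitch page. The label names, the 0.75 cut-off and the helper names topClass and feedbackFor are assumptions for illustration; the labels must match whatever class names were typed into Teachable Machine.

```typescript
// One class label paired with its score (a probability-like value in 0..1).
interface ClassScore {
  label: string;
  score: number;
}

// Return the class with the highest score.
// Assumes labels[i] and scores[i] describe the same class.
function topClass(labels: string[], scores: number[]): ClassScore {
  if (labels.length === 0 || labels.length !== scores.length) {
    throw new Error("labels and scores must be non-empty and of equal length");
  }
  let best: ClassScore = { label: "", score: -Infinity };
  labels.forEach((label, i) => {
    const score = scores[i] ?? 0;
    if (score > best.score) {
      best = { label, score };
    }
  });
  return best;
}

// Turn the top class into feedback for the student.
// noiseLabel and minScore are assumed defaults, not values from the article.
function feedbackFor(
  prediction: ClassScore,
  targetLabel: string,
  noiseLabel: string = "Background Noise",
  minScore: number = 0.75
): string {
  if (prediction.label === noiseLabel || prediction.score < minScore) {
    return "I could not hear that clearly. Please try again.";
  }
  if (prediction.label === targetLabel) {
    return "Well done, that matched the target pronunciation.";
  }
  return "Not quite: that sounded like '" + prediction.label + "'. Try again.";
}
```

For example, with the labels "Background Noise", "Nice (French)" and "Nice (English)" and scores of 0.05, 0.9 and 0.05, topClass picks "Nice (French)", and feedbackFor with that label as the target returns the "Well done" message. On the web page, the scores would come from the model's listening callback rather than from a hard-coded array.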

Here is a link to the live site: https://ajar-distinct-library.glitch.me

Here is a link to the code: https://glitch.com/edit/#!/ajar-distinct-library

Here is a link to the tutorial that helped me learn this method: https://medium.com/@warronbebster/teachable-machine-tutorial-snap-clap-whistle-4212fd7f3555
