Scooby: AI-interactive Speech Practice Platform for Non-native English Speakers

Project summary

Despite the variety of existing AI pronunciation practice models, there is no AI-interactive platform that lets non-native English speakers evaluate and practice their English pronunciation with personalized scripts. To solve this problem, we propose Scooby, an AI-powered platform where users input their own scripts, practice their pronunciation, and receive feedback from AI to further improve their English speech skills. Unique features of Scooby include i) accepting a user’s personal text input, ii) visualizing the speech-to-text results with wrongly spoken parts colored, iii) providing sophisticated phonetic-level analysis from AI, and iv) scoring the user’s speech. The platform has an intuitive, visually appealing interface. The overall usability score from users is 4.2/5.0, with users making comments such as the following:

“It is helpful because it gives syllable level feedback, and makes it easier to correct myself.”

“I like the way the program allows the users to see the correct pronunciation so that they can practice the correct way to pronounce words.”
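
The highlighting feature from the summary (coloring wrongly spoken parts of the speech-to-text result) can be illustrated with a minimal sketch. This is not Scooby's actual implementation, which is not described here. It assumes a simple word-level alignment between the user's script and the recognized transcript using Python's standard `difflib`. The function names `tokenize` and `mark_script` are hypothetical.

```python
import difflib
import re


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def mark_script(script, transcript):
    """Return (word, ok) pairs, one per word of the script.

    ok is True when the word was aligned with an identical word in the
    transcript, and False when it was misrecognized or skipped.
    Assumption: word-level matching only; no phonetic analysis is done.
    """
    expected = tokenize(script)
    heard = tokenize(transcript)
    matcher = difflib.SequenceMatcher(a=expected, b=heard, autojunk=False)

    marks = [False] * len(expected)
    # Each matching block is a run of identical words in both sequences.
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            marks[i] = True
    return list(zip(expected, marks))


if __name__ == "__main__":
    pairs = mark_script("I would like a cup of coffee.",
                        "I would like a cap of coffee")
    # Words flagged False are the ones a UI would color as wrongly spoken.
    print([word for word, ok in pairs if not ok])
```

A frontend could then render each word whose flag is False in a different color. Syllable- or phoneme-level feedback, as mentioned in the user comments, would need a dedicated pronunciation-scoring service rather than this text comparison.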

Libraries and frameworks

Backend

  • Django framework
  • SpeechAce API (https://docs.speechace.com/)
  • Google Cloud STT and TTS engines:
    • google-api-python-client
    • google-cloud-speech
    • google-cloud-texttospeech
    • other Google libraries
  • Audio processing libraries:
    • SoundFile
    • pydub
    • simpleaudio

Frontend

  • ReactJS
  • Grommet - for styling UI components
  • Axios - HTTP client for calling the REST API
  • React Redux

This was done as a course project in KAIST CS492 Human-AI Interaction, Fall 2020.
Yewon Kim
Master’s Student @ KAIST

Human-AI Interaction, Generative AI, Writing with AI