Scooby: AI-interactive Speech Practice Platform for Non-native English Speakers

Last updated on Aug 17, 2022

Project summary

Despite the variety of existing AI pronunciation practice models, there is no AI-interactive platform that lets non-native English speakers evaluate and practice their English pronunciation with personalized scripts. To solve this problem, we propose Scooby, an AI-powered platform where users input their own scripts, practice the pronunciation, and receive feedback from the AI to further improve their English speech skills. Unique features of Scooby include i) enabling a user’s personal text input, ii) visualizing the speech-to-text results with wrongly spoken parts colored, iii) providing sophisticated phonetic-level analysis from AI, and iv) scoring the user’s speech. The platform has an intuitive, visually appealing interface. The overall usability score from users is 4.2/5.0, with users commenting as follows:

“It is helpful because it gives syllable level feedback, and makes it easier to correct myself.”

“I like the way the program allows the users to see the correct pronunciation so that they can practice the correct way to pronounce words.”
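
To make feature ii) concrete, here is a minimal sketch of how a script could be compared against a speech-to-text transcript so that mismatched words can be colored. It is not Scooby's actual implementation: it uses only Python's standard `difflib` module, works at the word level rather than the phonetic level, and the function names `tokenize`, `mark_script`, and `word_accuracy` are hypothetical.

```python
import difflib
import re


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def mark_script(script, transcript):
    """Return (word, matched) pairs for every word of the script.

    A word is marked as matched when it lines up with the same word
    in the transcript; everything else would be colored as wrong.
    """
    expected = tokenize(script)
    heard = tokenize(transcript)
    matcher = difflib.SequenceMatcher(a=expected, b=heard, autojunk=False)
    marks = [False] * len(expected)
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            marks[i] = True
    return list(zip(expected, marks))


def word_accuracy(marked):
    """Fraction of script words that matched (a crude stand-in for a score)."""
    if not marked:
        return 0.0
    return sum(1 for _, ok in marked if ok) / len(marked)


# Hypothetical example: the recognizer heard "see" instead of "sea".
marked = mark_script("She sells sea shells", "she sells see shells")
print(marked)
print(word_accuracy(marked))
```

A frontend could render each pair as a colored span. The phonetic-level feedback and scoring described above would still come from a dedicated pronunciation service rather than from this word-level comparison.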

Libraries and frameworks

Backend

Django framework
SpeechAce API (https://docs.speechace.com/)
Google Cloud STT and TTS engines:
- google-api-python-client
- google-cloud-speech
- google-cloud-texttospeech
- other Google client libraries
Audio processing libraries:
- SoundFile
- pydub
- simpleaudio

Frontend

ReactJS
Grommet - for styling UI components
Axios - HTTP client for calling the REST API
React Redux

This project was done as a course project for KAIST CS492 Human-AI Interaction, Fall 2020.

AI HCI

Yewon Kim

Master’s Student @ KAIST