AI-powered Speech Recognition Is Creating Interactive Learning Experiences For Children And Adults

AWS EdStart Members and founders Long Qin of Singsound and Elnaz Sarraf of ROYBI are using artificial intelligence (AI) to revolutionize early language learning.

Long, based in Beijing, China, is dedicated to providing accessible, quality English education to every family throughout China. Elnaz, based in San Francisco, CA, United States, is dedicated to changing the one-size-fits-all approach of our global education system.

Singsound leverages AI to provide English speaking and writing assessments to students in China

Long Qin is the co-founder and chief executive officer (CEO) of Singsound and an expert in AI, with more than 15 years of academic and industry experience. Before founding Singsound, Long was a senior research scientist at Duolingo, working on speech assessment and adaptive learning.

Growing up in a remote part of China, Long saw how unevenly access to quality education is distributed across the country, and the long-term impact that gap can have on university acceptance and entry into the job market. Believing that education is the most important investment one can make, Long dedicated himself to helping families gain equitable access to educational resources. He launched Singsound to help students learn English, using AI to provide speech and writing assessments, natural language processing, and adaptive learning.

Singsound provides students with an instant feedback tool for written and spoken English. To immerse students in the language learning process, Long wanted them to hear language samples in varied speaking styles. However, recruiting suitable native speakers was difficult and expensive. After struggling to find a solution, the team was introduced to Amazon Polly, a service that turns text into lifelike speech, allowing users to create applications that talk and to build entirely new categories of speech-enabled products. Amazon Polly’s text-to-speech (TTS) service uses advanced deep learning technologies to synthesize natural-sounding human speech. With dozens of lifelike voices across a broad set of languages, users can build speech-enabled applications that work in several countries. By using Amazon Polly to automate the generation of customized speech patterns, Long’s team was also able to lower costs.
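To give a sense of what generating samples in varied speaking styles might look like, here is a minimal Python sketch. It is not Singsound's implementation, which the article does not describe. The helper names (`build_ssml`, `build_requests`, `synthesize_all`), the choice of voices, and the output file names are all assumptions made for illustration. The sketch prepares one Amazon Polly `synthesize_speech` request per voice and uses an SSML `prosody` tag to vary the speaking rate.

```python
from xml.sax.saxutils import escape

# Illustrative selection of Amazon Polly voice IDs (an assumption, not from the article).
VOICES = ("Joanna", "Matthew", "Amy")


def build_ssml(text: str, rate: str = "medium") -> str:
    """Wrap text in SSML so the same sentence can be spoken at different rates."""
    # escape() protects characters such as & and < that are special in XML.
    return f'<speak><prosody rate="{rate}">{escape(text)}</prosody></speak>'


def build_requests(text: str, voices=VOICES, rate: str = "medium") -> list:
    """Return one synthesize_speech keyword-argument dict per voice."""
    ssml = build_ssml(text, rate)
    return [
        {"Text": ssml, "TextType": "ssml", "VoiceId": voice, "OutputFormat": "mp3"}
        for voice in voices
    ]


def synthesize_all(text: str, out_prefix: str = "sample") -> None:
    """Call Amazon Polly once per voice and save each result as an MP3 file.

    Requires the boto3 package and configured AWS credentials.
    """
    import boto3  # imported lazily so the helpers above work without it

    polly = boto3.client("polly")
    for params in build_requests(text):
        response = polly.synthesize_speech(**params)
        with open(f"{out_prefix}_{params['VoiceId']}.mp3", "wb") as audio_file:
            audio_file.write(response["AudioStream"].read())
```

Calling `synthesize_all("How are you today?")` would write one audio file per voice. Keeping request construction separate from the network call means the sample set can be reviewed or adjusted before any speech is synthesized.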

“A strong and reliable infrastructure is essential as we continue to research, develop, and build our AI-based language learning scenarios, which requires uninterrupted speech streaming and instant system feedback. Compared to other cloud offerings, we believe that the industry-leading services of AWS help us strengthen and keep our services more reliable. AWS develops innovative technologies that help provide reliable and affordable AI technologies to our users,” said Long.

With the help of AWS, Singsound has been able to provide smooth, uninterrupted service to its customers, even as web traffic has tripled. Singsound was also the first AWS EdStart Member in China and has been part of the AWS EdStart accelerator since September 2018. “The AWS EdStart program has helped us connect with over 75 customers and industry partners. It’s a great way to network within the education sector. This has enabled us to scale our marketing efforts across a broad audience including educational enterprises and schools,” said Long.
