MLCommons

People’s Speech

Data is the raw input that ultimately unlocks new machine learning capabilities. Machine learning for speech is especially important, because smart speakers and assistants will reach most of the population of the planet by 2025, encompassing over 300 languages each spoken by more than a million people.


The People’s Speech Dataset is the world’s largest labeled open speech dataset and includes 87,000+ hours of transcribed speech in 59 different languages with a diverse set of speakers. This open dataset is large enough to train speech-to-text systems and, crucially, will be available under a permissive license. Just as ImageNet catalyzed machine learning for vision, the People’s Speech will unleash innovation in speech research and in products available to users across the globe.
