The researchers achieved a word error rate of 6.3 percent, bringing them closer to what they say is the next generation of interaction with machines.
Microsoft says its researchers are one step closer to building software that understands speech as well as humans do.
IBM recently touted an error rate of 6.6 percent, Microsoft said. Just a few years ago, the technology industry couldn’t do better than a 10 percent error rate.
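The word error rate figures cited here follow the standard definition: the number of word substitutions, insertions, and deletions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. As a rough illustration (not the researchers' evaluation code), it can be computed with a word-level edit distance:

```python
# Sketch: word error rate (WER), the metric cited in the article.
# WER = (substitutions + insertions + deletions) / reference word count,
# computed here as a Levenshtein edit distance over words.

def word_error_rate(reference: str, hypothesis: str) -> float:
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost)  # substitution or match
    return dp[len(ref)][len(hyp)] / len(ref)

# One dropped word out of six: WER of about 16.7 percent.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

By this measure, a 6.3 percent error rate means roughly one word in sixteen is transcribed incorrectly.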
Software that can fully understand human speech, some technologists say, will enable a next generation of interaction with machines, one that doesn’t require a keyboard, mouse, or touch input.
Early examples are already visible in the limited tasks people can ask digital assistants to perform, like searching the web with Google's Now, asking Microsoft's Cortana to make a calendar appointment, or prompting Amazon.com's Alexa to turn on music.
Microsoft says its progress was aided by the use of deep neural networks, software inspired by the brain's wiring that is better able to detect patterns in speech. Another component, the researchers say, is the use of powerful graphics processing units, originally designed for high-performance computer graphics in video games and other applications, to speed up the algorithms that underlie speech recognition.
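To see why graphics chips help, note that the core of a deep neural network is stacked matrix multiplications and simple nonlinearities, exactly the kind of parallel arithmetic GPUs were built for. The toy sketch below (not Microsoft's system; the layer sizes and "phone class" framing are illustrative assumptions) shows that structure with plain NumPy:

```python
# Minimal sketch of a feedforward network like those used in acoustic
# modeling: each layer is a matrix multiply plus a nonlinearity, work
# that GPUs can spread across thousands of cores at once.
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Nonlinearity applied between layers
    return np.maximum(0, x)

# Toy "acoustic model": 40 spectral features in, 3 phone classes out.
# Weights are random here; a real system learns them from speech data.
W1, b1 = rng.standard_normal((40, 64)), np.zeros(64)
W2, b2 = rng.standard_normal((64, 3)), np.zeros(3)

def forward(features):
    hidden = relu(features @ W1 + b1)      # dense layer 1
    logits = hidden @ W2 + b2              # dense layer 2
    exp = np.exp(logits - logits.max())    # numerically stable softmax
    return exp / exp.sum()                 # class probabilities

probs = forward(rng.standard_normal(40))
print(probs.sum())  # probabilities sum to 1
```

The matrix products (`@`) dominate the cost; running them on a GPU rather than a CPU is the speedup the researchers describe.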
“This new milestone benefited from a wide range of new technologies developed by the (artificial intelligence) community from many different organizations over the past 20 years,” Xuedong Huang, Microsoft’s chief speech scientist, said in a blog post.
The research, by Huang and seven other authors, was published on Tuesday.