Microsoft says its researchers are one step closer to building software that understands speech as well as humans do. The researchers achieved a word error rate of 6.3 percent, bringing the technology closer to what they call the next generation of interaction with machines.
IBM recently touted an error rate of 6.6 percent, Microsoft said. Just a few years ago, the technology industry couldn’t do better than a 10 percent error rate.
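For readers unfamiliar with the metric, word error rate is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer's output, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard formula, not the researchers' actual scoring pipeline:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One wrong word out of six gives a WER of about 16.7 percent.
print(wer("the cat sat on the mat", "the cat sat on a mat"))
```

By this measure, a 6.3 percent rate means roughly one word in sixteen is transcribed incorrectly.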
Software that can fully understand human speech, some technologists say, will enable a next generation of interaction with machines, one that doesn’t require a keyboard, mouse, or touch input.
Early examples of that are already visible in the limited tasks digital assistants can perform, like searching the web with Google’s Now, asking Microsoft’s Cortana to make a calendar appointment, or prompting Amazon.com’s Alexa to turn on music.
Microsoft says its progress was aided by the use of deep neural networks, software inspired by the brain’s wiring that is better able to detect patterns in speech. Another component, the company says, is the use of powerful graphics processing units, originally designed for high-performance computer graphics in video games and other applications, to speed up the algorithms that underlie speech recognition.
“This new milestone benefited from a wide range of new technologies developed by the (artificial intelligence) community from many different organizations over the past 20 years,” Xuedong Huang, Microsoft’s chief speech scientist, said in a blog post.
The research, by Huang and seven other authors, was published on Tuesday.