Microsoft Shows Off Breakthrough In Speech Translation Technology

Rick Rashid, Microsoft’s Chief Research Officer, gave a demonstration in Tianjin, China, at Microsoft Research Asia’s 21st Century Computing event. He discussed speech recognition in computing and Microsoft’s recent breakthrough in it. As he explained:

Until recently though, even the best speech systems still had word error rates of 20-25% on arbitrary speech.

Just over two years ago, researchers at Microsoft Research and the University of Toronto made another breakthrough. By using a technique called Deep Neural Networks, which is patterned after human brain behaviour, researchers were able to train more discriminative and better speech recognizers than previous methods.

During my October 25 presentation in China, I had the opportunity to showcase the latest results of this work. We have been able to reduce the word error rate for speech by over 30% compared to previous methods. This means that rather than having one word in 4 or 5 incorrect, now the error rate is one word in 7 or 8. While still far from perfect, this is the most dramatic change in accuracy since the introduction of hidden Markov modeling in 1979, and as we add more data to the training we believe that we will get even better results.
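To make the arithmetic in that claim concrete, here is a minimal sketch. It assumes the "over 30%" figure is a relative reduction applied to the earlier 20-25% word error rate, and it uses exactly 30% for illustration; the function names are hypothetical and not from Microsoft.

```python
def reduced_error_rate(baseline, relative_reduction):
    """Word error rate after a relative reduction (assumed interpretation)."""
    return baseline * (1.0 - relative_reduction)


def words_per_error(rate):
    """Average number of words spoken per misrecognized word."""
    return 1.0 / rate


if __name__ == "__main__":
    # Baselines from the quoted passage: 20% and 25% word error rate.
    for baseline in (0.20, 0.25):
        new_rate = reduced_error_rate(baseline, 0.30)  # 0.30 is an assumed exact value
        print(
            f"{baseline:.0%} -> {new_rate:.1%}, "
            f"about one error every {words_per_error(new_rate):.1f} words"
        )
```

With exactly 30%, the rates come out at 14% and 17.5%, or roughly one error every 6 to 7 words. Since the quoted reduction is "over 30%", the "one word in 7 or 8" figure is plausible for the lower end of the baseline range.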

He later gave a live demo of the work, in which whatever he said in English was translated into Chinese within a few seconds. Watch the video above to see the magic!

via: Next at Microsoft

  • blackhawk556

    Imagine, in 10 years we’ll finally be able to talk to tech support from India with no problems! The translation will be done on the fly using 64-core processors that fit into a 15mm-thin phone 😉

    • tomakali

      With the current US economy, in 10 years India will have tech support in the US.

  • Adriel D. Mingo

    This is why I love Microsoft and Microsoft Research. They put more work and energy into these ridiculous new technologies and innovations than any other company out there.

    • Gavin Tom

      yup remember when they had the surface table, way before anything relevant. I love innovation!

  • Boris Zakharin

    Reduction by 30% of 20%-25% = 7%-8%?

    If MS can’t even do math, how can they make breakthroughs like this?

    (the actual answer is 14%-18% error)

    • Max

      Nowhere in the article does it say 7-8%. It says the error rate is reduced to 1 word in 7 or 8 instead of 1 in 5 or 6 words, which is about 14%. If you can’t even read, why would you bash an extremely successful company doing amazing things?

  • NegLewis

    A few years ago, a scientist ran an experiment:
    They took a picture, Xeroxed it, then Xeroxed the Xeroxed image, and repeated this over and over again.
    The conclusion was that after a few scans the image started losing its quality… but after a certain point the initial image started to emerge again… well .. not quite the same image… but still.

    I guess this could be applied here too… translate the translation… 1000 times. :)
    Then read it :).