(Intro Music)
Professor Hans Rosling voiceover
I've spent my life working with stats, but I'm still amazed by many of the practical applications of statistics. Take language for instance, this is a field that at first glance doesn’t seem to have very much to do with numbers at all, but at Google’s California headquarters, computer scientists are overcoming the world’s language barriers with statistical machine translation.
Peter Norvig, Director of Research, Google
We wanted to provide access to all the web’s information, no matter what language you spoke. There’s just so much information on the internet, you couldn’t hope to translate it all by hand into every possible language. We figured we’d have to be able to do machine translation.
Franz Och, Head of Machine Translation, Google
What the computer is doing when he’s learning how to translate is to learn correlations between words and correlations between phrases. So we feed the system very large amounts of data and then the system is seeing that a certain word or a certain phrase correlates very often to the other language.
Professor Hans Rosling voiceover
Google’s website currently offers translation between any of 57 different languages. It does this purely statistically, having correlated a huge collection of multilingual texts.
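Editor's sketch (not part of the broadcast): the idea of learning "correlations between words" from multilingual text can be illustrated with a toy example. The code below is a minimal sketch and not Google's method. It assumes a tiny invented Swedish-English parallel corpus (`PARALLEL`), scores word pairs with the Dice coefficient (a simple association measure chosen here for illustration), and uses two hypothetical helper functions, `dice_scores` and `best_match`.

```python
from collections import Counter
from itertools import product

# Hypothetical toy parallel corpus (Swedish, English), invented for illustration.
# Real systems learn from vastly larger collections of translated text.
PARALLEL = [
    ("jag ser huset", "i see the house"),
    ("jag ser bilen", "i see the car"),
    ("huset är stort", "the house is big"),
    ("bilen är stor", "the car is big"),
]

def dice_scores(pairs):
    """Score each (source word, target word) pair by how often they co-occur."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)                # sentences containing each source word
        tgt_count.update(t_words)                # sentences containing each target word
        joint.update(product(s_words, t_words))  # sentence pairs containing both words
    # Dice coefficient: 2 * joint / (count_source + count_target), between 0 and 1.
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in joint.items()
    }

def best_match(scores, src_word):
    """Return the target word most strongly associated with src_word."""
    candidates = {t: v for (s, t), v in scores.items() if s == src_word}
    return max(candidates, key=candidates.get)

scores = dice_scores(PARALLEL)
print(best_match(scores, "huset"))  # house
print(best_match(scores, "bilen"))  # car
```

With only four sentence pairs, some words remain ambiguous: "ser" scores equally against "i" and "see" because they always appear together. More data is what breaks such ties, which is why the system is fed very large amounts of text.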
Franz Och
The people that built the system don’t need to know Chinese in order to build a Chinese to English system, they don’t need to know Arabic, but the expertise that’s needed is basically knowledge of statistics, knowledge of computer science, knowledge of infrastructure to build those very large computational systems that we are building for doing that.
Professor Hans Rosling voiceover
I hooked up with Google from my office in
Professor Hans Rosling
I will type some Swedish sentences.
Franz Och
Okay.
Professor Hans Rosling
(typing) Sveriges finansminister har hästsvans och en guldring i örat.
Franz Och
Okay. Okay, so it says
Professor Hans Rosling
Almost exactly correct, it’s amazing. He comes from the Conservative Party and that’s the kind of
(typing) I sitt samkönade partnerskap har Stockholms nya biskop
Franz Och
In his same-sex partnerships has
Professor Hans Rosling
It’s almost perfect; it missed one important thing, it’s her. It’s a lesbian partnership.
Franz Och
Okay, so that’s, those kinds of words, his and her are one of the challenges in translation to get really those right in the machine.
Professor Hans Rosling
And especially when it comes to bishops, one can excuse it.
Franz Och
Right, so I guess more often than not it would probably be a ‘his’.
Professor Hans Rosling
I would write one more sentence.
(typing) När Sverige deltar i olympiader är målet inte att vinna utan att slå Norge.
Franz Och
Okay, when
Professor Hans Rosling
Yes, this is what it is, and they're very good in winter Olympics so we can't make it but we are trying.
Franz Och
Very good, very good.
Professor Hans Rosling
This is absolutely amazing, you know, and I was especially impressed that it picked up words like "same-sex partnerships", which are very new to the language…
Professor Hans Rosling voiceover
If you think that’s great, Google are now working on connecting this up with statistical voice recognition software.
Peter Norvig
Now we have the capability of having instant conversation between two people that don’t speak a common language. That I can talk to you in my language, you hear me in your language, and you can answer back in real time, we can make that translation, and can bring two people together and allow them to speak.
Professor Hans Rosling
To find out more about The Joy of Stats, visit The Open University’s OpenLearn website.
(Outro music)
(4’08”)