The Joy Of Stats: How statistics make understanding foreign words simpler


Through a massive statistical analysis, Google have taught a machine to translate across dozens of languages pretty effectively.

By: Professor Hans Rosling (Gapminder), Franz Och (Google), Peter Norvig (Google)

  • Duration 5 mins
  • Updated Tuesday 7th December 2010
  • Introductory level
  • Posted under TV, Joy of Stats

Copyright The Open University


(Intro Music)

Professor Hans Rosling voiceover

I've spent my life working with stats, but I'm still amazed by many of the practical applications of statistics. Take language, for instance. This is a field that at first glance doesn’t seem to have very much to do with numbers at all, but at Google’s California headquarters, computer scientists are overcoming the world’s language barriers with statistical machine translation.

Peter Norvig, Director of Research, Google

We wanted to provide access to all the web’s information, no matter what language you spoke.  There’s just so much information on the internet, you couldn’t hope to translate it all by hand into every possible language.  We figured we’d have to be able to do machine translation. 

Franz Och, Head of Machine Translation, Google

What the computer is doing when he’s learning how to translate is to learn correlations between words and correlations between phrases.  So we feed the system very large amounts of data and then the system is seeing that a certain word or a certain phrase correlates very often to the other language.
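
The idea Franz Och describes here, counting how often words turn up together across translated sentence pairs, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the four-sentence Swedish-English corpus, the names `train` and `best_translation`, and the choice of the Dice coefficient as the association score. It is not Google's system, which works at a vastly larger scale with more elaborate models.

```python
from collections import Counter

# Hypothetical toy parallel corpus (Swedish, English), invented for illustration.
PARALLEL = [
    ("en hund", "a dog"),
    ("en katt", "a cat"),
    ("hund och katt", "dog and cat"),
    ("en ring", "a ring"),
]


def train(pairs):
    """Count, per sentence pair, which source and target words occur together."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        pair_count.update((s, t) for s in src_words for t in tgt_words)
    return src_count, tgt_count, pair_count


def best_translation(word, model):
    """Return the target word most strongly associated with `word`.

    Association is scored with the Dice coefficient:
    2 * (pairs containing both) / (pairs containing s + pairs containing t).
    Returns None if the word never appeared in the training data.
    """
    src_count, tgt_count, pair_count = model
    scores = {
        t: 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in pair_count.items()
        if s == word
    }
    if not scores:
        return None
    return max(scores, key=scores.get)


MODEL = train(PARALLEL)
```

On this toy corpus, `best_translation("hund", MODEL)` returns `"dog"`, because the two words appear in exactly the same sentence pairs, while "hund" shares only some of its sentences with "a", "and" or "cat".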

Professor Hans Rosling voiceover

Google’s website currently offers translation between any of 57 different languages.  It does this purely statistically, having correlated a huge collection of multilingual texts.

Franz Och

The people that built the system don’t need to know Chinese in order to build a Chinese to English system, they don’t need to know Arabic, but the expertise that’s needed is basically knowledge of statistics, knowledge of computer science, knowledge of infrastructure to build those very large computational systems that we are building for doing that.

Professor Hans Rosling voiceover

I hooked up with Google from my office in Stockholm to try the translator for myself.

Professor Hans Rosling

I will type some Swedish sentences.

Franz Och

Okay.

Professor Hans Rosling

(typing) Sveriges finansminister har hästsvans och en guldring i örat.

Franz Och

Okay.  Okay, so it says Sweden’s finance minister has a ponytail and a gold ring in your ear, so I guess it probably means in his ear.

Professor Hans Rosling

Almost exactly correct, it’s amazing.  He comes from the Conservative Party and that’s the kind of Sweden we have today.  I will type one more sentence.

(typing) I sitt samkönade partnerskap har Stockholms nya biskop

Franz Och

In his same-sex partnerships has Stockholm’s new bishop and his partners a three-year son. That’s…unusual…

Professor Hans Rosling

It’s almost perfect; it missed one important thing, it’s her.  It’s a lesbian partnership.

Franz Och

Okay, so that’s, those kinds of words, his and her are one of the challenges in translation to get really those right in the machine.

Professor Hans Rosling

And especially when it comes to bishops, one can excuse it.

Franz Och

Right, so I guess more often than not it would probably be a ‘his’.
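
This exchange suggests why a system driven purely by frequencies might settle on "his". A minimal sketch of that reasoning follows, assuming invented counts: the numbers in `OBSERVED` are made up for illustration and are not measurements from any real corpus, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical counts of (noun, possessive pronoun) pairs seen in English text.
# These figures are invented purely to illustrate the argument.
OBSERVED = Counter({("bishop", "his"): 97, ("bishop", "her"): 3})


def pronoun_probabilities(noun, counts):
    """Relative frequency of each pronoun seen alongside `noun`."""
    relevant = {p: n for (w, p), n in counts.items() if w == noun}
    total = sum(relevant.values())
    return {p: n / total for p, n in relevant.items()}


def most_likely_pronoun(noun, counts):
    """Pick the pronoun seen most often with `noun`, or None if it is unseen."""
    probs = pronoun_probabilities(noun, counts)
    if not probs:
        return None
    return max(probs, key=probs.get)
```

With these made-up counts, `most_likely_pronoun("bishop", OBSERVED)` returns `"his"`. A rule that always takes the majority choice is right most of the time and wrong in exactly the rarer cases, such as the one Hans Rosling typed.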

Professor Hans Rosling

I would write one more sentence.

(typing) När Sverige deltar i olympiader är målet inte att vinna utan att slå Norge.

Franz Och

Okay, when Sweden is taking part in Olympic goal, it is not to win but to beat Norway.

Professor Hans Rosling

Yes, this is what it is, and they're very good in winter Olympics so we can't make it but we are trying.

Franz Och

Very good, very good.

Professor Hans Rosling

This is absolutely amazing, you know, and I was especially impressed that it picked up words like same-sex partnerships, which are very new to the language…

Professor Hans Rosling voiceover

If you think that’s great, Google are now working on connecting this up with statistical voice recognition software.

Peter Norvig

Now we have the capability of having instant conversation between two people that don’t speak a common language.  That I can talk to you in my language, you hear me in your language, and you can answer back in real time, we can make that translation, and can bring two people together and allow them to speak.

Professor Hans Rosling

To find out more about the joy of stats, visit the Open University’s OpenLearn website.

(Outro music)

(4’08”)

More Joy Of Stats

Have you got a passion for statistics?

Find out about how you can study statistics with The Open University - and try the StatsChoices website to create your path through study.

Weblinks

Gapminder
Hans Rosling talk at TED
