Machine Translation (General discussion) Language Learning Forum

Machine Translation
Tags: Google | Joke | Internet | Translation


furrykef
Senior Member
United States
furrykef.com/
Joined 6281 days ago
681 posts - 862 votes

Speaks: English*
Studies: Spanish, Japanese, Latin, Italian

Message 17 of 25

15 December 2010 at 4:33pm

I don't think machine translation will ever be decent until computers develop intelligence. By "intelligence" I mean a working knowledge of the world and some conversational skill. A computer you can't talk to is a computer that cannot understand language; a computer that cannot understand language cannot hope to translate it.

1 person has voted this message useful

egill
Diglot
Senior Member
United States
Joined 5505 days ago
418 posts - 791 votes

Speaks: Mandarin, English*
Studies: German, Spanish, Dutch

Message 18 of 25

15 December 2010 at 9:37pm

furrykef wrote:

I don't think machine translation will ever be decent until
computers develop intelligence. By "intelligence" I mean a working knowledge of the
world and some conversational skill. A computer you can't talk to is a computer
that cannot understand language; a computer that cannot understand language cannot hope
to translate it.

I don't think that conversational skill is at all a prerequisite for decent machine
translation; in fact, I think the former problem is several orders of magnitude harder than the
latter. But of course, that hinges on what we mean when we say "decent" and even what we mean
by "understand".

The thing is that with intelligence, the goalposts are constantly shifting—every new
advance in AI is considered "just" brute-force or "merely" mechanical, i.e. non-intelligent.
Just a couple of decades ago, no one would have believed that a computer could beat a
grandmaster at chess. Nowadays chess programs running on commodity hardware can best
most strong players.

I would certainly consider this a form of intelligence. It may be of a different kind,
but it performs an intelligent function and it performs it well. Machine translation is
pretty decent already and improving fast, and I would also consider it intelligent.

I guess my main point here is that there are shades of grey and different forms of
intelligence. To define it as only some rarefied ideal, or as exactly how human brains
work, is myopic in my opinion.

Edited by egill on 15 December 2010 at 9:47pm
2 persons have voted this message useful

Spanky
Senior Member
Canada
Joined 5765 days ago
1021 posts - 1714 votes

Studies: French

Message 19 of 25

15 December 2010 at 9:51pm

I have a few standing predictions on this forum relating to the improving quality of machine translation in the not-too-distant future. I am firmly convinced that very excellent machine translation capability will be readily available within the next couple of decades.

I am prepared to bet $50 Spanky bucks* on this prediction, and in this way hopefully reverse the dark cloud that has shadowed my family history ever since my great-great-grand-uncle Jebodiah Spanky lost a bottle of moonshine and two silver buttons betting against the progress of technology in the past (he figured incorrectly as it turns out that there just aint no dern way the new fangled automobile could ever travel faster than a horse).

*(Conversion rate: 1 Spanky buck = 1.4 Grand Duchy of Fenwick kroners)
1 person has voted this message useful

furrykef
Senior Member
United States
furrykef.com/
Joined 6281 days ago
681 posts - 862 votes

Speaks: English*
Studies: Spanish, Japanese, Latin, Italian

Message 20 of 25

16 December 2010 at 6:47am

egill wrote:

I would certainly consider this a form of intelligence. It may be of a different kind,
but it performs an intelligent function and it performs it well. Machine translation is
pretty decent already and improving fast, and I would also consider it intelligent.

I guess my main point here is that there are shades of grey and different forms of
intelligence. To define it as only some rarefied ideal, or as exactly how human brains
work, is myopic in my opinion.

I would agree that machine intelligence needn't be the same thing as human intelligence. But here we're concerned with the specific task of translation of language, and language is made by humans and for humans. You cannot translate something without having an encyclopedic knowledge of the human world, because everything we say tends to rely on a broader context. Words do not exist in a vacuum, but that's exactly what machine translators tend to assume.

To put it another way: how can a machine be sure if a human would find its translation acceptable if it cannot think like a human?

1 person has voted this message useful

egill
Diglot
Senior Member
United States
Joined 5505 days ago
418 posts - 791 votes

Speaks: Mandarin, English*
Studies: German, Spanish, Dutch

Message 21 of 25

16 December 2010 at 7:49am

furrykef wrote:

I would agree that machine intelligence needn't be the same thing as human intelligence. But here we're concerned with the specific task
of translation of language, and language is made by humans and for humans. You cannot translate something without having an encyclopedic
knowledge of the human world, because everything we say tends to rely on a broader context. Words do not exist in a vacuum, but that's
exactly what machine translators tend to assume.

To put it another way: how can a machine be sure if a human would find its translation acceptable if it cannot think like a human?

But MT doesn't at all assume that words exist in a vacuum. In fact, most current implementations rely extremely heavily on context. It is
not the case that they're simply looking up and substituting words one for one.

For the sake of argument, imagine a program that has every single phrase ever uttered, or that ever will be uttered, in language A and its
translation in language B. Clearly this program is able to produce acceptable translations: it's a simple matter of looking it up.

Now imagine this program only has 90% of all utterances. Obviously we can't do quite as well now, since there are some sentences we don't
have an exact match for. But we can take the sentence in question from language A, call it a, look at its context, and say:
out of our bilingual database, we have 1000 examples of similar sentences in similar contexts, and 950 of them match sentence
b in language B. We return sentence b. I think most would agree this would still produce acceptable results.
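
The counting step in that paragraph can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the corpus, the context label "ctx" and the function name translate_by_vote are all made up here, and "similar" is reduced to an exact match on sentence and context, which a real system would not do.

```python
from collections import Counter

# Hypothetical toy corpus of (source sentence, context label, translation).
# 950 of the 1000 examples pair sentence "a" with "b", mirroring the numbers above.
CORPUS = [("a", "ctx", "b")] * 950 + [("a", "ctx", "b-alt")] * 50


def translate_by_vote(sentence, context, corpus):
    """Return (most common translation, its share of the votes), or (None, 0.0)."""
    votes = Counter(
        target
        for source, ctx, target in corpus
        # Assumption: an exact match stands in for "similar sentence, similar context".
        if source == sentence and ctx == context
    )
    if not votes:
        return None, 0.0
    best, count = votes.most_common(1)[0]
    return best, count / sum(votes.values())


best, share = translate_by_vote("a", "ctx", CORPUS)
print(best, share)
```

The vote share doubles as a rough confidence score: the smaller the corpus gets, the fewer matching examples there are and the less that share can be trusted.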

Obviously in reality we do not have nearly that much information, but we can continue to decrease the size of our bilingual corpus by
sacrificing quality. The question then becomes, what is the minimum amount of data needed to still produce acceptable results? More and more
data is being gathered and generated and techniques are being created and improved upon all the time. Ultimately, the amount of data
needed may in fact be infeasible, and this approach may certainly fall short.

But nowhere along the line is the computer required to "think like a human".

Edited by egill on 16 December 2010 at 7:50am
2 persons have voted this message useful

furrykef
Senior Member
United States
furrykef.com/
Joined 6281 days ago
681 posts - 862 votes

Speaks: English*
Studies: Spanish, Japanese, Latin, Italian

Message 22 of 25

16 December 2010 at 8:53am

egill wrote:

But they don't understand the context. They don't apply any reasoning at all. Google Translate's method is completely statistical. Its actual understanding of any text is zero.

egill wrote:

For the sake of argument, imagine a program that has every single phrase ever uttered, or that ever will be uttered, in language A and its
translation in language B. Clearly this program is able to produce acceptable translations: it's a simple matter of looking it up.

Doubtful. You don't always translate the same sentence the same way. "Está bien" might mean "he's fine", or "everything's OK", or "good idea". For your hypothetical program, you would need not only every sentence in the two languages, but the complete context that the sentences occurred in.
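
To make the objection concrete, here is a minimal Python sketch of a lookup table keyed on sentence plus context rather than on the sentence alone. The three context labels and the names TRANSLATIONS, translate and candidates are invented for illustration; only the "Está bien" renderings come from the example above.

```python
# Hypothetical entries: the same Spanish sentence maps to a different English
# rendering depending on a context label (the labels are made up for this sketch).
TRANSLATIONS = {
    ("está bien", "asking about someone's health"): "he's fine",
    ("está bien", "checking on a situation"): "everything's OK",
    ("está bien", "responding to a suggestion"): "good idea",
}


def translate(sentence, context):
    """Look up a translation by sentence AND context; None if the pair is unknown."""
    return TRANSLATIONS.get((sentence.lower(), context))


def candidates(sentence):
    """All translations stored for a sentence when the context is ignored."""
    key = sentence.lower()
    return sorted(t for (s, _), t in TRANSLATIONS.items() if s == key)


print(translate("Está bien", "responding to a suggestion"))
print(candidates("Está bien"))
```

Without the context half of the key, the table can only offer all three candidates at once, which is exactly the ambiguity in question.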

I'm skeptical that the approach you describe could ever overcome this obstacle with brute force or mere computational techniques. Hypothetically, yes, but in practice, not so much.

1 person has voted this message useful

egill
Diglot
Senior Member
United States
Joined 5505 days ago
418 posts - 791 votes

Speaks: Mandarin, English*
Studies: German, Spanish, Dutch

Message 23 of 25

16 December 2010 at 9:57pm

furrykef wrote:

...
But they don't understand the context. They don't apply any reasoning at all. Google Translate's method
is completely statistical. Its actual understanding of any text is zero.

I never insisted that they had to understand anything. I think that term is too vaguely defined in this
context to be of use for our discussion. I'm saying it's possible to envision a system that is not based on
replicating human understanding in the same manner.

furrykef wrote:

...
Doubtful. You don't always translate the same sentence the same way. "Está bien" might mean "he's fine", or
"everything's OK", or "good idea". For your hypothetical program, you would need not only every sentence in
the two languages, but the complete context that the sentences occurred in.

I'm skeptical that the approach you describe could ever overcome this obstacle with brute force or mere
computational techniques. Hypothetically, yes, but in practice, not so much.

Of course it's a horrible approach: it's naive, and no sane person would implement it this way. It was merely
meant to be a thought experiment. In practice we don't have every single sentence ever, but we certainly do
have the context for everything in our knowledge base. These sentences have to come from some document,
and we know what came before and after each one in the document.

The specific example you cite deals with the problem of polysemy, where words/phrases can mean more than one
thing. This is a common problem in information retrieval and can be dealt with by using topic-modeling
techniques such as latent semantic analysis (LSA) or latent Dirichlet allocation (LDA). I'm not saying these
approaches are sufficient/feasible or even will necessarily become so, but I'm saying they might help, and that
polysemy is not an instantly insurmountable roadblock.

Latent Semantic Analysis
Latent Dirichlet Allocation
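
As a rough illustration of using the surrounding text to choose among senses, here is a small Python sketch. To be clear, this is NOT LSA or LDA: it is a much cruder word-overlap score swapped in as a stand-in, and the example context words, EXAMPLES, tokenize and pick_translation are all hypothetical.

```python
import re

# Hypothetical knowledge base for "está bien": each entry pairs words seen
# around the phrase in some document with the translation used there.
EXAMPLES = [
    ({"médico", "fiebre", "hermano", "enfermo"}, "he's fine"),
    ({"plan", "mañana", "vernos", "cine"}, "good idea"),
    ({"problema", "preocupes", "pasó", "roto"}, "everything's OK"),
]


def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def pick_translation(surrounding_text, examples):
    """Pick the translation whose stored context shares the most words with the text."""
    words = tokenize(surrounding_text)
    # max() keeps the first maximal item, so ties go to the earlier example.
    _, translation = max(examples, key=lambda ex: len(ex[0] & words))
    return translation


print(pick_translation("¿Cómo sigue tu hermano? El médico dijo que ya no tiene fiebre.", EXAMPLES))
print(pick_translation("¿Vamos al cine mañana?", EXAMPLES))
```

Topic models go further by letting related words count as evidence even when they never literally co-occur with the stored examples, which plain overlap cannot do.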

To sum it up and at the risk of sounding long-winded: my point is that it's not out of the question for an
acceptable system to be constructed that doesn't have to first completely emulate human intelligence, which
I'm claiming is a harder problem, and here are some ways it might be done.

I think I understand your position a little better now, and correct me if I'm wrong: you're saying that you
don't see how any approach that doesn't first achieve human intelligence could ever give acceptable results,
i.e. the translation problem is no easier than the human intelligence problem. I think that's also a perfectly
reasonable view.

In the end we may simply have to agree to disagree.
1 person has voted this message useful

Pleiades
Diglot
Newbie
United Kingdom
Joined 4905 days ago
10 posts - 15 votes
Speaks: English*, Welsh

Message 24 of 25

21 December 2010 at 5:22pm

I'd never trust an online translator! I implore you to translate a few simple sentences into a random language on Google Translate and then translate them back. You will be amazed at how distorted the result is; it's often hilarious. Nor can these tools cope with figurative speech, idioms or any other subtleties of expression; the output is purely literal.

1 person has voted this message useful
