Thursday, July 31, 2014

Language Log » The state of the machine translation art

And here’s an example of poor unstructured data yielding useless results.

However, to be fair to the statistical machine translation industry, we must allow for any defects in the quality of the input. And after the above paragraphs were posted, Daniel Sterman, an experienced editor with a thorough knowledge of Hebrew, gave me this very useful analysis, which makes a considerable difference:

The original Hebrew is riddled with spelling and grammatical errors, which is why machine translation didn't work. You mentioned in your post "with limited errors" – this sentence's errors go well beyond that, and far into the realm of "my translation software was never designed to handle this level of idiocy".
