Thursday, July 31, 2014

Language Log » The state of the machine translation art

And here’s an example of poor unstructured data yielding useless results.

However, to be fair to the statistical machine translation industry, we must allow for any defects in the quality of the input. And after the above paragraphs were posted, Daniel Sterman, an experienced editor with a thorough knowledge of Hebrew, gave me this very useful analysis, which makes a considerable difference:

The original Hebrew is riddled with spelling and grammatical errors, which is why machine translation didn't work. You mentioned in your post "with limited errors" – this sentence's errors go well beyond that, and far into the realm of "my translation software was never designed to handle this level of idiocy".

Extemporaneous Comments on Data Quality

At the MIT Chief Data Officer & Information Quality Symposium last week, I sat down for an on-camera chat with theCUBE, a production of SiliconANGLE. Topics included:

  • Why the state of the art in unstructured data quality lags so far behind that for structured data quality.
  • A few ways to apply structured-data DQ techniques to unstructured data.
  • Why Big Data is not revolutionary, and why every Chief Data Officer needs to recognize that.

Watch the video.

Wednesday, July 30, 2014

It's Time to Push Back When Government Controls the Message

Public relations experts and their clients will frame this as an attempt to honor data quality by keeping the message tight and on point.  But those who see this as a threat to data quality are correct.

Rick Santorum was talking — but not quite without interruption. In a 2005 interview with Mark Leibovich, then of the Washington Post and now of the New York Times, the Republican senator from Pennsylvania was describing how he felt at the funeral of Pope John Paul II.

As Mr. Leibovich wrote it, part of the Santorum interview went like this:

“It’s part of the awe of this job that I do,” he says. “Every day. You’re making these decisions and … ” He fights for the right words. “It’s a great — —”

“Is it humbling, senator?” Robert Traynham, his communications aide, interjects.

“Yes, it’s very humbling!”

“And it’s uniquely American, isn’t it, Senator?” prompts Traynham.

“Oh, absolutely.”

That telling snippet — superbly handled by Mr. Leibovich — was pointed out recently by Ron Fournier, the National Journal columnist and former Associated Press Washington bureau chief, who is one of many journalists pushing back against a pervasive practice: Interviews with government officials that include public relations “minders.”
