Sunday, November 27, 2011

Understanding The Use/Mention Distinction, But Hoping Your Listeners Don’t

An earlier post referred to the problems that can arise when someone fails to grasp the use/mention distinction.  Here, we have someone who surely grasps it fully but, in a ghastly display of cynicism, ignores it anyway to agitate a political base.

In the ad [for Mitt Romney], Mr. Obama is heard declaring that “if we keep talking about the economy, we’re going to lose.”

Cut out was the context of Mr. Obama’s comment, which was made during the 2008 presidential election, about John McCain, his presidential rival: “Senator McCain’s campaign actually said, and I quote, if we keep talking about the economy, we’re going to lose.”
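The distinction has a crisp analogue in code: the difference between evaluating an expression and quoting it. A minimal sketch in Python (the variable names are mine, for illustration):

```python
# Use vs. mention: a string can be quoted (mentioned) or asserted (used).
mccain_line = "if we keep talking about the economy, we're going to lose"

# Mention: the line is attributed, and the quotation marks carry the context.
what_obama_said = f'Senator McCain\'s campaign actually said, and I quote, "{mccain_line}"'

# Use: the ad strips the attribution, presenting the mention as Obama's own claim.
what_the_ad_implies = mccain_line

print(what_obama_said)
print(what_the_ad_implies)
```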

Democrats Say Romney Ad Distorts Obama's Comments - NYTimes.com

Saturday, November 26, 2011

On Data Quality: Cameras vs. Eyeballs and Long-Term vs. Short-Term Feedback

When observed by humans, medical personnel take note and wash their hands more scrupulously than when observed by electronic cameras.  The fix?  Not more human monitoring, but more immediate feedback to the medical personnel.

Hospitals do impossible things like heart surgery on a fetus, but they are apparently stymied by the task of getting health care workers to wash their hands. Most hospitals report compliance of around 40 percent — and that’s using a far more lax measure than North Shore uses.   I.C.U.’s, where health care workers are the most harried, usually have the lowest rates — between 30 and 40 percent.  But these are the places where patients are the sickest and most endangered by infection.

How do hospitals even know their rates?   Some hospitals track how much soap and alcohol gel gets used — a very rough measure.  The current standard of care is to send around the hospital equivalent of secret shoppers — staff members who secretly observe their colleagues and record whether they wash their hands.   This has serious drawbacks:  it is expensive and the results are distorted if health care workers figure out they’re being observed.   One reason the North Shore staff was so shocked by the 6.5 percent hand-washing rate the video cameras found was that measured by the secret shoppers, the rate was 60 percent.

What makes the system function is not the videotaping alone — it’s the feedback.  The nurse manager gets an e-mail message three hours into the shift with detailed information about hand hygiene rates, and again at the end.  The L.E.D. signs are a constant presence in both the surgical and medical I.C.U.s.  “They look at the rates,” said Isabel Law, nurse manager of the surgical I.C.U.  “It becomes a positive competition.  Seeing ‘Great Shift!!’ is important.  It’s human nature that we all want to do well.  Now we have a picture to see how we’re doing.”
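A back-of-the-envelope sketch of that feedback loop (the event format, target threshold, and tallies below are all invented for illustration):

```python
# Compute the shift's hand-hygiene rate from observation tallies and pick
# the message for the L.E.D. sign. A toy model of the feedback described above.

def hygiene_rate(events):
    """events: list of (room_entries, washes) tallied so far this shift."""
    entries = sum(e for e, _ in events)
    washes = sum(w for _, w in events)
    return washes / entries if entries else 0.0

def led_message(rate, target=0.90):
    return "Great Shift!!" if rate >= target else f"Hand hygiene: {rate:.0%}"

shift_events = [(12, 11), (8, 8), (15, 14)]     # hypothetical camera tallies
print(led_message(hygiene_rate(shift_events)))  # 33/35 = 94% -> "Great Shift!!"
```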

An Electronic Eye on Hospital Hand-Washing - NYTimes.com

Saturday, November 19, 2011

Language Log » Justin Bieber Brings Natural Language Processing to the Masses

Philip Resnik on natural language processing (NLP) and sentiment analysis.

My worry is compounded by the fact that social media sentiment analyses are being presented without the basic caveats you invoke in related polling scenarios. When you analyze social media you have not only a surfeit of conventional accuracy concerns like sampling error and selection bias (how well does the population of people whose posts you're analyzing represent the population you're trying to describe?), but also the problem of "automation bias" — in this case trusting that the automatic text analysis is correct. Yet the very same news organization that reports traditional opinion poll results with error bars and a careful note about the sample size will present Twitter sentiment analysis numbers as raw percentages, without the slightest hint of qualification.
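For comparison, the qualification a traditional poll carries is easy to state: a 95 percent margin of error for a proportion. The sketch below shows the standard calculation; note that it covers sampling error only, saying nothing about the selection bias or automation bias Resnik describes (the numbers are hypothetical):

```python
# Margin of error for an estimated proportion -- the error bar that poll
# reports include and raw sentiment percentages typically omit.
import math

def margin_of_error(p_hat, n, z=1.96):
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat, n = 0.57, 2000   # hypothetical: 57% of 2,000 sampled tweets positive
print(f"{p_hat:.0%} positive, +/- {margin_of_error(p_hat, n):.1%} (n={n})")
```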

What's the alternative? Twenty years ago the NLP community managed to break past the failures of the knowledge engineering era by making a major methodological shift from knowledge engineering to machine learning and statistical approaches. Instead of building expert knowledge into systems manually, we discovered the power of having human experts annotate or label language data, allowing a supervised learning system to train on examples of the inputs it will see, paired with the answers we want it to produce. (We call such algorithms "supervised" because the training examples include the answers we're looking for.) Today's state-of-the-art NLP still incorporates manually constructed knowledge prudently where it helps, but it is fundamentally an enterprise driven by labeled training data. As Pang and Lee discuss in their widely read survey of the field, sentiment analysis is no exception, and it has correspondingly seen "a large shift in direction towards data-driven approaches", including a "very active line of work" applying supervised text categorization algorithms.
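A minimal sketch of the supervised setup Resnik describes, using scikit-learn's stock components; the tiny hand-labeled dataset is invented for illustration:

```python
# Supervised sentiment classification: annotated examples in, a trained model out.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical human-annotated training data -- the "supervised" ingredient.
train_texts = [
    "love this phone",
    "what a great show",
    "worst service ever",
    "this movie was terrible",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

print(model.predict(["great phone, love it", "what terrible service"]))
```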

Nonetheless, I've argued recently that NLP's first statistical revolution is now being followed by a second technological revolution, one driven in large part by the needs of large scale social media analysis. The problem is that, faced with an ocean of wildly diverse language, there's no way to annotate enough training data so that supervised machine learning systems work well on whatever you throw at them. As a result, we are seeing the rise of semi-supervised methods. These let you bootstrap your learning using smaller quantities of high quality annotated training examples (that's the "supervised"), together with lots of unannotated examples of the inputs your system will see (that's the "semi").

As for sentiment analysis, by all means, let's continue to be excited about bringing NLP to the masses, and let's get them excited about it, too. But at the same time, let's avoid extravagant claims about computers understanding the meaning of text or the intent behind it. At this stage of the game, machine analysis should be a tool to support human insight, and its proper use should involve a clear recognition of its limitations.

Language Log » Justin Bieber Brings Natural Language Processing to the Masses

Thursday, November 17, 2011

You Must Remember This…

Is remember a synonym of persist, store, or write down?  To some programmers it is, but to civilians it is emphatically not.  In fact, to civilians, storing data is what you do instead of remembering it (“I didn’t remember your phone number, but I did jot it down”).

The distinction between remembering stuff and writing it down was recognized in antiquity.  Of writing, Plato said:

“for this discovery of yours will create forgetfulness in the learners' souls, because they will not use their memories; they will trust to the external written characters and not remember of themselves.”  -Plato, Phaedrus, available here.

Programmers would do well to remember this point, or, short of that, write it down.  It is a particular instance of the general phenomenon I described in a blog entry earlier this year (When Specialists Appropriate Words).

I have witnessed costly miscommunication between IT personnel and their clients because the former insisted on interpreting the word remember as a requirement to store data on disk—a requirement the users did not actually have.
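The two readings, side by side, in a hypothetical sketch (the class, method, and file names are invented):

```python
# "Remember" as civilians mean it vs. "write down" as programmers hear it.
import json

class PhoneBook:
    def __init__(self):
        self._numbers = {}

    def remember(self, name, number):
        """The civilian reading: keep it in mind (in memory) for this session."""
        self._numbers[name] = number

    def write_down(self, path):
        """The programmer reading: persist to disk, surviving a restart."""
        with open(path, "w") as f:
            json.dump(self._numbers, f)

book = PhoneBook()
book.remember("Alice", "555-0100")   # gone when the process exits...
book.write_down("numbers.json")      # ...unless the user actually asked for this
```

Conflating the two turns a conversational requirement (“the system should remember my preferences”) into a persistence requirement nobody asked for.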

Saturday, November 5, 2011

Gold Star For Information Responsibility: A Case of a Scientist Using Data To Prove Himself Wrong

Here is an excerpt from the full story reported by the Associated Press.  This is an example of a principle of information responsibility I blogged about earlier this year: If you value truth, prove yourself wrong.  Of course, if the folks at the Cultural Cognition Project are to be believed (and it unfortunately seems that they are), Richard Muller’s data-supported apostasy will have little effect on the debate.

WASHINGTON (AP) -- A prominent physicist and skeptic of global warming spent two years trying to find out if mainstream climate scientists were wrong. In the end, he determined they were right: Temperatures really are rising rapidly.

The study of the world's surface temperatures by Richard Muller was partially bankrolled by a foundation connected to global warming deniers. He pursued long-held skeptic theories in analyzing the data. He was spurred to action because of "Climategate," a British scandal involving hacked emails of scientists.

Thursday, November 3, 2011

What's that Sentiment? Text Analytics for Context 11/3/11

I’ll be appearing today (3:00 pm EDT) on DM Radio, discussing text analytics with three industry/vendor representatives.  Here’s the description.  You can register for the program here.

When the customer is happy, everyone's happy. But how can a manager, director or VP know when most customers are happy? Besides waiting until too many people are unhappy (at which point it's likely too late), one option is to employ text analytics. This discipline can give companies a competitive edge by tipping off management before a key trend line turns south. Register for this episode of DM Radio to hear Hosts Eric Kavanagh and Jim Ericson as they interview Analyst Joe Maguire along with guests Fiona McNeil of SAS, Olivier Jouve of SPSS/IBM, and Usama Fayyad of Open Insights.

What's that Sentiment? Text Analytics for Context 11/3/11

Noted Dutch Psychologist, Stapel, Accused of Research Fraud - NYTimes.com

Sounds like this phenomenon is common enough to deserve a study by psychologists, but wait…

The scandal, involving about a decade of work, is the latest in a string of embarrassments in a field that critics and statisticians say badly needs to overhaul how it treats research results. In recent years, psychologists have reported a raft of findings on race biases, brain imaging and even extrasensory perception that have not stood up to scrutiny. Outright fraud may be rare, these experts say, but they contend that Dr. Stapel took advantage of a system that allows researchers to operate in near secrecy and massage data to find what they want to find, without much fear of being challenged.

Dr. Stapel was able to operate for so long, the committee said, in large measure because he was “lord of the data,” the only person who saw the experimental evidence that had been gathered (or fabricated). This is a widespread problem in psychology, said Jelte M. Wicherts, a psychologist at the University of Amsterdam. In a recent survey, two-thirds of Dutch research psychologists said they did not make their raw data available for other researchers to see. “This is in violation of ethical rules established in the field,” Dr. Wicherts said.

In a survey of more than 2,000 American psychologists scheduled to be published this year, Leslie John of Harvard Business School and two colleagues found that 70 percent had acknowledged, anonymously, cutting some corners in reporting data. About a third said they had reported an unexpected finding as predicted from the start, and about 1 percent admitted to falsifying data.

Also common is a self-serving statistical sloppiness. In an analysis published this year, Dr. Wicherts and Marjan Bakker, also at the University of Amsterdam, searched a random sample of 281 psychology papers for statistical errors. They found that about half of the papers in high-end journals contained some statistical error, and that about 15 percent of all papers had at least one error that changed a reported finding — almost always in opposition to the authors’ hypothesis.
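The kind of check Wicherts and Bakker ran can be sketched in a few lines: recompute a p-value from the reported test statistic and degrees of freedom, and flag any mismatch. A sketch with a hypothetical reported result (assumes scipy is available; the tolerance is illustrative):

```python
# Recompute a two-tailed p-value from a reported t statistic and degrees of
# freedom, and flag reports that disagree with their own numbers.
from scipy import stats

def check_t_test(t, df, reported_p, tol=0.01):
    recomputed = 2 * stats.t.sf(abs(t), df)
    return abs(recomputed - reported_p) <= tol, recomputed

# Hypothetical report: t(28) = 2.10, p = .03
ok, p = check_t_test(2.10, 28, 0.03)
print(ok, round(p, 4))   # recomputed p is about .045, so this report gets flagged
```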

Noted Dutch Psychologist, Stapel, Accused of Research Fraud - NYTimes.com