Saturday, December 31, 2011

N.Y.P.D. Leaves Offenses Unrecorded to Keep Crime Rates Down - NYTimes.com

Institutional information irresponsibility, Law-and-Order Division:

Crime victims in New York sometimes struggle to persuade the police to write down what happened on an official report. The reasons are varied. Police officers are often busy, and few relish paperwork. But in interviews, more than half a dozen police officers, detectives and commanders also cited departmental pressure to keep crime statistics low.

N.Y.P.D. Leaves Offenses Unrecorded to Keep Crime Rates Down - NYTimes.com

Thursday, December 22, 2011

Patients Want To Read Doctors' Notes, But Many Doctors Balk : Shots - Health Blog : NPR

Whose IV line is it anyway?

The patients overwhelmingly thought that open visit notes were a great idea. They said it would give them more control and help them be better prepared for appointments. They also said it would help them do a better job following doctors' orders. More than 37,000 patients took part in the survey, and 92 to 97 percent endorsed access. That's a lot of enthusiasm.

The doctors, however, aren't so sure. Most thought the patients would be more confused and worried if they saw their notes. The doctors also thought they'd have to work more as a result.

About one-third of the 173 doctors polled decided not to take part in the OpenNotes project. For them, the prospect of patients peering over their shoulders meant they would have to be less candid, particularly when writing about such touchy subjects as cancer, obesity, substance abuse and mental health. And unlike the doctors who signed on to the experiment, they didn't think all this hassle would make patient care better or safer.

But for now, here's a clue. In May 2009, the University of Texas M.D. Anderson Cancer Center started letting patients read their electronic medical records online. More than 84 percent of current patients have looked at their records. Referring physicians are using them, too. The two most common requests they make are that doctors fix something written down incorrectly, and for how best to translate medical jargon.

"As a result, they are more informed about their care plan and diagnostic results and ask smarter, more focused questions," Thomas Feeley, vice president of medical operations at M.D. Anderson, and Kenneth Shine, executive vice chancellor for health affairs at the University of Texas system, wrote in an accompanying editorial in Annals. And yes, the doctors do complain about the time it takes to explain what they wrote. But all in all, they're happy with it.

Patients Want To Read Doctors' Notes, But Many Doctors Balk : Shots - Health Blog : NPR

Tuesday, December 20, 2011

What Scolds Around Comes Around

Hmm… An incisive column by Gail Collins got me thinking about scoldism, so I blogged about it earlier today.  Now, less than a half-day later, I encounter this piece from NPR calling out Ms. Collins for excessive zeal in scolding Mitt Romney for something he did 28 years ago.  That’s 18 years beyond the statute of limitations that Ms. Collins herself suggested for tut-tutting a public figure about an embarrassing romantic partner.

I do not share Brendan Nyhan’s opinion that strapping a dog to the roof of a car for a long trip is “inconsequential.”  And there’s an apples-and-oranges thing going on here: Holding one accountable for his own decades-old actions is not equivalent to holding one accountable for the actions of a decades-old romantic partner.

Nevertheless, this blog is about information responsibility, and I am duty-bound to mention that the very columnist who got me thinking about scoldism can appear to others to be something of a scold herself.  Here is an excerpt from the NPR website:

Plenty of folks have their unshakable obsessions. Indiana Jones sought the Holy Grail. Captain Ahab pursued the Great White Whale. For New York Times columnist Gail Collins, it's her fixation on the voyages of an Irish Setter named Seamus.

"For some reason, the idea that you've got this guy who would drive all the way to Canada with an Irish setter sitting on the top of the car — it absolutely fascinated me," Collins said.

By "this guy," Collins means Mitt Romney — as in the Republican presidential candidate — and the trip is a family vacation back in 1983 when Romney put the dog in a crate tied to the top of the family station wagon and drove off.

Collins mentioned the dog so often that Dartmouth political science professor Brendan Nyhan started keeping a running tally. "She's trying to be funny — I get that. I appreciate a good campaign story as much as the next person," Nyhan said. "But I do think it's representative of the way that the media focuses on trivia, things that are so inconsequential. Mitt Romney is not running for dogcatcher — he's running for president of the United States."

Nyhan is a Democrat and blogger for the Columbia Journalism Review — and he says he's not a Romney supporter.

"The deeper problem here is the way that pundits want to put candidates on the couch and psychoanalyze them, so this is being used to illustrate some sort of deeper underlying flaw in Mitt Romney's personality," Nyhan said. "But Gail Collins is not a psychologist and I'm not sure how much this really tells us about whether he'd be a good president."

Why Is Times Columnist Gail Collins So Obsessed With Mitt Romney's Dog? : It's All Politics : NPR

Scoldism Is The New Black

I’ll declare that “Scoldism”—apparently I’ve committed a neologism with that one—is an especially pernicious and aggressive form of sanctimony. The say-anything, reveal-everything ethos of the web provides a target-rich environment for scolds.

As scoldism thrives, what will happen?  Perhaps folks will temper their online behavior.  Individuals might do that as they mature, but the web milieu will always harbor the rhetorical style known as “Too Much Information.” That’s because teenagers are a renewable resource.

Perhaps only the powerful will be able to control their information.  (Where are George W. Bush’s driving records?)

Or perhaps the world will gradually learn to forgive certain indiscretions, youthful or otherwise. The second directive of “Forgive and forget” is no longer possible, so we might need to develop the capacity to forgive anyway. Call it “Forgive and whatever.”

I’m not promoting idiocy and I’m not dismissing consequences.  Rather, I’m encouraging folks to recognize that lives have narrative arcs.  Can you remember your most foolish moment?  Is it on Facebook?

The Golden Age of Scoldism

It’s probably too late to rescue your privacy. (Some folks are trying; see the previous post.) It says here that facts about your past can and will be used cynically against you by your political enemies. This from a recent column by Gail Collins:

New unnerving development in Congress: Some senators are claiming that a woman nominated to be ambassador to El Salvador can’t have the job because they don’t like a boyfriend she lived with almost 20 years ago.

These days, it’s hard enough to get kids to understand the possible future employment consequences of appearing naked on Facebook. If they hear about this one, they’ll give up entirely.

However, who of us does not have a difficult significant other in the distant past? There has to be a statute of limitations on this sort of thing, and my vote would be for a decade, max.

The Ghost of Boyfriends Past - NYTimes.com

Please Stop Sharing: A Tweet (Or More) Too Far - NYTimes.com

Nice idea, but probably too little too late.

But there is only one difference between the knuckleheads of yore — me, for example — who did numerous stupid things between the onset of puberty and a late adolescence lasting to nearly 30, and those Twit-iots of the 21st century.

And that is technology. Facebook, Twitter, cell phone text messages and palm-size appliances yet to sprout from Apple’s labs allow all of us to be banal in real time.

“I’m a moron, Siri,” I can tell my new iPhone 4S robo-assistant. “Please share with everyone.”

Let the counterrevolt begin; the shying of America would be a welcome thing.

Please Stop Sharing: A Tweet (Or More) Too Far - NYTimes.com

Monday, December 19, 2011

Anthropomorphism in names: More product loyalty and more cow’s milk

A recent article in Slate reminded me of the 2009 Ig Nobel Prize in Veterinary Medicine. 

Excerpt from Slate:

The idea of a talking machine with a human-sounding name isn’t new, of course, but Siri’s predecessors were mostly fictional. Think of the arch KITT, the silicon brain of a Pontiac Trans Am in the TV series Knight Rider; Joshua, the troubled NORAD computer in the film War Games; and most famously, the eerily calm HAL of 2001: A Space Odyssey. These were mere characters, but they also reflected a universal human impulse: When we talk to something, or when it talks to us, we want to call it by a name. Have you noticed how many drivers give names to their GPS devices?

Using a human-style name reflects our relationship with the thing being named, and shapes it, too. Indoor pets, for instance, tend to be given more human names than outdoor animals. Assigning a name to a car or other possession is both a sign of growing affection and a spur to further bonding. Around my house, I've found that it's nearly impossible to throw out any object that my kids have named. Names give objects emotional life. You say, "the iPhone" and "my iPhone," but not "the Siri." It—she—is simply Siri. The name makes the act of conversing with a metal slab feel natural. And that emotional connection seems to invite a powerful kind of consumer loyalty.

Excerpt from the website of the Annals of Improbable Research, where the 2009 Ig Nobel Prize in Veterinary Medicine is described:

VETERINARY MEDICINE PRIZE: Catherine Douglas and Peter Rowlinson of Newcastle University, Newcastle-Upon-Tyne, UK, for showing that cows who have names give more milk than cows that are nameless.

REFERENCE: "Exploring Stock Managers' Perceptions of the Human-Animal Relationship on Dairy Farms and an Association with Milk Production," Catherine Bertenshaw [Douglas] and Peter Rowlinson, Anthrozoos, vol. 22, no. 1, March 2009, pp. 59-69. DOI: 10.2752/175303708X390473.

Thursday, December 8, 2011

Facebook: “There are no published cases of NoSQL databases operating at the scale of Facebook’s MySQL database.”

From a recent GigaOM report.  So if Facebook doesn’t need NoSQL, who does?

Callaghan was more open to using NoSQL databases, but said they’re still not quite ready for primetime, especially for mission-critical workloads such as Facebook’s user database. The implementations just aren’t as mature, he said, and there are no published cases of NoSQL databases operating at the scale of Facebook’s MySQL database. And, Callaghan noted, the HBase engineering team at Facebook is quite a bit larger than the MySQL engineering team, suggesting that tuning HBase to meet Facebook’s needs is a more resource-intensive process than is tuning MySQL at this point.

Facebook shares some secrets on making MySQL scale — Cloud Computing News

Sunday, November 27, 2011

Understanding The Use/Mention Distinction, But Hoping Your Listeners Don’t

An earlier post referred to the problems that can arise when someone fails to grasp the use/mention distinction.  Here, we have someone who surely grasps it fully but, in a ghastly display of cynicism, ignores it anyway to agitate a political base.

In the ad [for Mitt Romney], Mr. Obama is heard declaring that “if we keep talking about the economy, we’re going to lose.”

Cut out was the context of Mr. Obama’s comment, which was made during the 2008 presidential election, about John McCain, his presidential rival: “Senator McCain’s campaign actually said, and I quote, if we keep talking about the economy, we’re going to lose.”

Democrats Say Romney Ad Distorts Obama's Comments - NYTimes.com

Saturday, November 26, 2011

On Data Quality: Cameras vs. Eyeballs and Long-Term vs. Short-Term Feedback

When observed by humans, medical personnel take note and wash their hands more scrupulously than when observed by electronic cameras.  The fix?  Not more human monitoring, but more immediate feedback to the medical personnel.

Hospitals do impossible things like heart surgery on a fetus, but they are apparently stymied by the task of getting health care workers to wash their hands. Most hospitals report compliance of around 40 percent — and that’s using a far more lax measure than North Shore uses.   I.C.U.’s, where health care workers are the most harried, usually have the lowest rates — between 30 and 40 percent.  But these are the places where patients are the sickest and most endangered by infection.

How do hospitals even know their rates?   Some hospitals track how much soap and alcohol gel gets used — a very rough measure.  The current standard of care is to send around the hospital equivalent of secret shoppers — staff members who secretly observe their colleagues and record whether they wash their hands.   This has serious drawbacks:  it is expensive and the results are distorted if health care workers figure out they’re being observed.   One reason the North Shore staff was so shocked by the 6.5 percent hand-washing rate the video cameras found was that measured by the secret shoppers, the rate was 60 percent.

What makes the system function is not the videotaping alone — it’s the feedback.  The nurse manager gets an e-mail message three hours into the shift with detailed information about hand hygiene rates, and again at the end.  The L.E.D. signs are a constant presence in both the surgical and medical I.C.U.s.  “They look at the rates,” said Isabel Law, nurse manager of the surgical I.C.U..  “It becomes a positive competition.  Seeing “Great Shift!!” is important.  It’s human nature that we all want to do well.  Now we have a picture to see how we’re doing.”

An Electronic Eye on Hospital Hand-Washing - NYTimes.com

Saturday, November 19, 2011

Language Log » Justin Bieber Brings Natural Language Processing to the Masses

Philip Resnik on natural language processing (NLP) and sentiment analysis.

My worry is compounded by the fact that social media sentiment analyses are being presented without the basic caveats you invoke in related polling scenarios. When you analyze social media you have not only a surfeit of conventional accuracy concerns like sampling error and selection bias (how well does the population of people whose posts you're analyzing represent the population you're trying to describe?), but also the problem of "automation bias" — in this case trusting that the automatic text analysis is correct. Yet the very same news organization that reports traditional opinion poll results with error bars and a careful note about the sample size will present Twitter sentiment analysis numbers as raw percentages, without the slightest hint of qualification.

What's the alternative? Twenty years ago the NLP community managed to break past the failures of the knowledge engineering era by making a major methodological shift from knowledge engineering to machine learning and statistical approaches. Instead of building expert knowledge into systems manually, we discovered the power of having human experts annotate or label language data, allowing a supervised learning system to train on examples of the inputs it will see, paired with the answers we want it to produce. (We call such algorithms "supervised" because the training examples include the answers we're looking for.) Today's state of the art NLP still incorporates manually constructed knowledge prudently where it helps, but it is fundamentally an enterprise driven by labeled training data. As Pang and Lee discuss in their widely read survey of the field, sentiment analysis is no exception, and it has correspondingly seen "a large shift in direction towards data-driven approaches", including a "very active line of work" applying supervised text categorization algorithms.

Nonetheless, I've argued recently that NLP's first statistical revolution is now being followed by a second technological revolution, one driven in large part by the needs of large scale social media analysis. The problem is that, faced with an ocean of wildly diverse language, there's no way to annotate enough training data so that supervised machine learning systems work well on whatever you throw at them. As a result, we are seeing the rise of semi-supervised methods. These let you bootstrap your learning using smaller quantities of high quality annotated training examples (that's the "supervised"), together with lots of unannotated examples of the inputs your system will see (that's the "semi").

As for sentiment analysis, by all means, let's continue to be excited about bringing NLP to the masses, and let's get them excited about it, too. But at the same time, let's avoid extravagant claims about computers understanding the meaning of text or the intent behind it. At this stage of the game, machine analysis should be a tool to support human insight, and its proper use should involve a clear recognition of its limitations.

Language Log » Justin Bieber Brings Natural Language Processing to the Masses
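
Resnik's complaint about raw percentages can be made concrete with the usual margin-of-error arithmetic that accompanies a traditional poll. The sketch below is illustrative only: the function name and the sample figures are hypothetical, and the normal-approximation formula covers sampling error alone. It says nothing about selection bias or about the "automation bias" of trusting the classifier, which would widen the real uncertainty further.

```python
import math


def margin_of_error(share: float, sample_size: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation interval for a proportion.

    share       -- observed proportion, between 0 and 1
    sample_size -- number of items (e.g. tweets) the proportion is based on
    z           -- critical value; 1.96 corresponds to roughly 95% confidence
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return z * math.sqrt(share * (1.0 - share) / sample_size)


if __name__ == "__main__":
    # Hypothetical figures: 62% of 400 analyzed tweets classified as positive.
    share, n = 0.62, 400
    moe = margin_of_error(share, n)
    print(f"{share:.0%} positive, plus or minus {moe:.1%} (sampling error only)")
```

Reporting the interval alongside the percentage is the same courtesy a news organization already extends to opinion polls; it is a floor on the uncertainty, not a full accounting of it.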

Thursday, November 17, 2011

You Must Remember This…

Is remember a synonym of persist, store, or write down?  To some programmers it is, but to civilians it is emphatically not.  In fact, to civilians, storing data is what you do instead of remembering it: “I didn’t remember your phone number, but I did jot it down.”

The distinction between remembering stuff and writing it down was recognized in antiquity.  Of writing, Plato said:

“for this discovery of yours will create forgetfulness in the learners' souls, because they will not use their memories; they will trust to the external written characters and not remember of themselves.”  -Plato, Phaedrus, available here.

Programmers would do well to remember this point, or, short of that, write it down.  It is a particular instance of the general phenomenon I described in a blog entry earlier this year (When Specialists Appropriate Words).

I have witnessed costly miscommunication between IT personnel and their clients because the former insisted on interpreting the word remember as a requirement to store data on disk—a requirement the users did not actually have.

Saturday, November 5, 2011

Gold Star For Information Responsibility: A Case of a Scientist Using Data To Prove Himself Wrong

Here is an excerpt from the full story reported by the Associated Press.  This is an example of a principle of information responsibility I blogged about earlier this year: If you value truth, prove yourself wrong.  Of course, if the folks at the Cultural Cognition Project are to be believed (and it unfortunately seems that they are), Richard Muller’s data-supported apostasy will have little effect on the debate.

WASHINGTON (AP) -- A prominent physicist and skeptic of global warming spent two years trying to find out if mainstream climate scientists were wrong. In the end, he determined they were right: Temperatures really are rising rapidly.

The study of the world's surface temperatures by Richard Muller was partially bankrolled by a foundation connected to global warming deniers. He pursued long-held skeptic theories in analyzing the data. He was spurred to action because of "Climategate," a British scandal involving hacked emails of scientists.

Thursday, November 3, 2011

What's that Sentiment? Text Analytics for Context 11/3/11

I’ll be appearing today (3:00 pm EDT) on DM Radio, discussing text analytics with three industry/vendor representatives.  Here’s the description.  You can register for the program here.

When the customer is happy, everyone's happy. But how can a manager, director or VP know when most customers are happy? Besides waiting until too many people are unhappy (at which point it's likely too late), one option is to employ text analytics. This discipline can give companies a competitive edge by tipping off management before a key trend line turns south. Register for this episode of DM Radio to hear Hosts Eric Kavanagh and Jim Ericson as they interview Analyst Joe Maguire along with guests Fiona McNeil of SAS, Olivier Jouve of SPSS/IBM, and Usama Fayyad of Open Insights.

What's that Sentiment? Text Analytics for Context 11/3/11

Noted Dutch Psychologist, Stapel, Accused of Research Fraud - NYTimes.com

Sounds like this phenomenon is common enough to deserve a study by psychologists, but wait…

The scandal, involving about a decade of work, is the latest in a string of embarrassments in a field that critics and statisticians say badly needs to overhaul how it treats research results. In recent years, psychologists have reported a raft of findings on race biases, brain imaging and even extrasensory perception that have not stood up to scrutiny. Outright fraud may be rare, these experts say, but they contend that Dr. Stapel took advantage of a system that allows researchers to operate in near secrecy and massage data to find what they want to find, without much fear of being challenged.

Dr. Stapel was able to operate for so long, the committee said, in large measure because he was “lord of the data,” the only person who saw the experimental evidence that had been gathered (or fabricated). This is a widespread problem in psychology, said Jelte M. Wicherts, a psychologist at the University of Amsterdam. In a recent survey, two-thirds of Dutch research psychologists said they did not make their raw data available for other researchers to see. “This is in violation of ethical rules established in the field,” Dr. Wicherts said.

In a survey of more than 2,000 American psychologists scheduled to be published this year, Leslie John of Harvard Business School and two colleagues found that 70 percent had acknowledged, anonymously, to cutting some corners in reporting data. About a third said they had reported an unexpected finding as predicted from the start, and about 1 percent admitted to falsifying data.

Also common is a self-serving statistical sloppiness. In an analysis published this year, Dr. Wicherts and Marjan Bakker, also at the University of Amsterdam, searched a random sample of 281 psychology papers for statistical errors. They found that about half of the papers in high-end journals contained some statistical error, and that about 15 percent of all papers had at least one error that changed a reported finding — almost always in opposition to the authors’ hypothesis.

Noted Dutch Psychologist, Stapel, Accused of Research Fraud - NYTimes.com

Monday, October 31, 2011

Language Log » On the front lines of Twitter linguistics

In this post on Language Log, Ben Zimmer expands on this op-ed piece that appeared in Sunday’s New York Times.  The post discusses and links to some of the research that was only briefly mentioned in the op-ed.  Here’s a charming example:

  • Eisenstein and Bamman are currently conducting research with Tyler Schnoebelen of Stanford University that looks at how gender plays a role in language variation on Twitter. But they're going well beyond simply analyzing which language forms are associated with women and which are associated with men. Using information on people's Twitter followers, they can also take into consideration the gender makeup of people's networks. Thus, a man with a predominantly female network may show different linguistic patterns compared to a man with a male or mixed network. Earlier today, at NWAV 40, Schnoebelen presented some of his research on one aspect of Twitter discourse, emoticons. The abstract of his paper includes this great line: "Emoticons with noses are historically older." It's true! Not only that, but emoticons with noses, like :-), show distinctly different patterns of distribution than the noseless kind, like :) . Noseless emoticons tend to be used by younger Twitter users and are associated with more informal discourse. Women use them more than men, too, but women use more of all types of emoticons. I'll be looking forward to the definitive study of emoticon nosedness.
Language Log » On the front lines of Twitter linguistics

    Thursday, October 27, 2011

    Hugh’s on First: How Claude Shannon Could Help Baseball Managers

    Game six of the World Series between the Texas Rangers and the St. Louis Cardinals was postponed because of rain. Cardinals manager Tony La Russa said he would use the night off to see “Moneyball,” the movie about data analytics in the major leagues. 

    Moneyball, Schmoneyball.  What La Russa really needs is a primer on information theory a la Claude Shannon.  Some reporters covering the series could benefit too.

    During game five of the series, some costly telephonic miscommunications occurred between La Russa or pitching coach Dave Duncan (in the dugout) and bullpen coach Derek Lilliquist (in the bullpen).  At various times during two phone conversations, three pitchers’ surnames were mentioned or heard or both.  Those surnames are Motte, Lynn, and Rzepczynski (zep-CHIN-ski).  The miscommunications yielded the farcical situation of a pitcher entering the game to issue an intentional walk to the only batter he would face.  (A chronology of this comedy of errors appears at the end of this post.)

    Much has already been written about these events, including commentary on La Russa’s aggressiveness in deploying relief pitchers, the quaintness of land lines that connect dugouts to bullpens in major-league ball parks, and the difficulty of hearing a phone call while 50,000 nearby fans are screaming.

    Because game six was postponed, sportswriters continue to discuss the game-five debacle.  Thus we get this story in Thursday’s New York Times, accompanied by the following graphic:

    [Graphic from the New York Times story]

    The story—and the caption in the graphic above—suggest that of the three names, Rzepczynski’s is the most likely to be mis-communicated because it is hardest to say.  This analysis does not comport with what we know about information theory.  It is thoroughly un-amazing that Rzepczynski’s name was conveyed successfully.

    Likewise, it is not terribly surprising that Motte’s name was misheard.  You’re just asking for trouble if your name rhymes with “not.”  Lynn’s name rhymes with “win,” “in,” and “I cannot hear your instructions over the stadium din.”

    But it’s not really about rhyming; it’s about information distance.  The problem with Motte’s name is that it lies a short information distance from any number of other words that might reasonably be uttered during a conversation about baseball: not, hot, got, dot.  Lynn’s name has the same problem.  Rzepczynski’s name emphatically does not.

    If the bullpen coach mishears a single consonant of Motte’s name or Lynn’s name, he might reasonably believe that he heard another word; miscommunication can result.  But if he mishears a consonant or other small portion of Rzepcnynski’s name, so what?  He’s very likely to get the message anyway.

    Case in point:  Did you even notice that I mis-spelled Rzepczynski’s name at the end of the last paragraph? 
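
One rough way to make the information-distance argument concrete is to count edits (Levenshtein distance) between each name and the words it might be mistaken for. This is a sketch under loud assumptions, not Shannon's theory proper: it substitutes spelling-level edit distance for acoustic confusability, and the phonetic respellings (`mot`, `lin`, `zepchinski`) and the word list are illustrative guesses based on the rhymes mentioned above.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def nearest(heard: str, candidates: list[str]) -> tuple[str, int]:
    """Return the candidate closest to what was heard, with its distance."""
    scored = ((word, edit_distance(heard, word)) for word in candidates)
    return min(scored, key=lambda pair: pair[1])


# Hypothetical phonetic respellings of the three surnames (assumption).
NAMES_AS_HEARD = ["mot", "lin", "zepchinski"]
# Ordinary words that might come up in a dugout phone call (assumption).
COMMON_WORDS = ["not", "hot", "got", "dot", "win", "in"]
# The actual roster spellings, for decoding a garbled name.
ROSTER = ["motte", "lynn", "rzepczynski"]

if __name__ == "__main__":
    for name in NAMES_AS_HEARD:
        word, distance = nearest(name, COMMON_WORDS)
        print(f"{name}: nearest common word is {word!r}, {distance} edit(s) away")

    # A one-letter garble of the long name still decodes to the right pitcher.
    print(nearest("rzepcnynski", ROSTER))
```

Under these assumptions, each short name sits a single edit away from an ordinary word, while the long name is many edits from anything on the list, and a one-letter garble of it still points to the right pitcher.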

    Background for baseball nerds:

    And now, as promised, here is the reported chronology of some events during the eighth inning of game five:

    1. The Rangers are batting, facing Cardinals pitcher Octavio Dotel.  From the Cardinals’ dugout, La Russa or Duncan calls the bullpen and requests that two pitchers start warming up:  Rzepczynski and Motte.
    2. In the bullpen, coach Lilliquist hears only Rzepczynski’s name. (La Russa later said that Lilliquist had hung up too early, before he said Motte’s name.)
    3. Rzepczynski comes into the game to face the left-handed-hitting David Murphy, who hits an infield single.
    4. La Russa notices that Motte is not warming up in the bullpen, and calls again, reiterating that request.
    5. In the bullpen, coach Lilliquist mishears La Russa’s words as a request for Lynn to begin warming up.
    6. Because Motte is not ready, Rzepczynski remains in the game to face right-handed-hitting Mike Napoli, who hits a two-run double.  (La Russa had wanted Motte to pitch to Napoli.)  Rzepczynski then strikes out the next batter, Mitch Moreland.
    7. After Moreland strikes out, La Russa removes Rzepczynski from the game and requests the next relief pitcher.  La Russa believes he is summoning Motte, and is surprised when Lynn walks to the pitcher’s mound from the bullpen.
    8. Because the rules of baseball require that any pitcher who enters the game must face at least one batter and because Lynn is supposed to be resting his arm that day, La Russa instructs Lynn to intentionally walk the next batter with four gentle, low-stress pitches.
    9. Motte finally enters the game, three batters later than La Russa had envisioned.

    Thursday, October 6, 2011

    Don’t Get Mad, Get Lucid

    A recent kerfuffle has reminded me of a not-so-recent one.

    Recent Kerfuffle:

    Last month, upon hearing her professor cite a noxious, indefensible anti-Semitic opinion as an example of a noxious, indefensible opinion, York University student Sarah Grunfeld heedlessly stormed out of class and accused her professor of anti-Semitism.  Later, Grunfeld doubled down on her mistake, saying “The words, ‘Jews should be sterilized’ still came out of his mouth, so regardless of the context I still think that’s pretty serious.”

    Not-So-Recent Kerfuffle:

    In 2002, the creators of the New York State Regents exams in high-school English were found to have sanitized literary excerpts used in those exams. Removed from the passages were any references to race, religion, ethnicity, sex, nudity, alcohol, and even words like “fat” and “skinny.” These modifications were made without the authors’ knowledge or permission, and were not acknowledged on the exams.

    Connection:

    When students are not required to think seriously about disturbing ideas, they end up incapable of thinking clearly and responsibly about disturbing ideas. Is this what happened to Sarah Grunfeld?

    If your reading comprehension nosedives whenever you get agitated about something, you are insufficiently equipped to participate in civil society. You probably make a lousy employee too, and I’m unlikely to enjoy sitting near you at a dinner party.

    Note to high school English students: The New York State Board of Regents is saying that the thing of it is, if you can keep your head while all about you are losing theirs, like, so what?

    Note To Sarah Grunfeld:  If you insist on blaming educators for your mistake (“The words … came out of his mouth...”), don’t blame your current educator whose words made you, like, lose your head; blame your previous educators who failed to teach you, like, how to keep it.

    The remainder of this post contains background information on the recent kerfuffle and the old one.

    Background (from 2002) on the Regents Exams:

    The New York Times covered the story here.  Some excerpts:

    In a feat of literary sleuth work, Ms. Heifetz, the mother of a high school senior and a weaver from Brooklyn, inspected 10 high school English exams from the past three years and discovered that the vast majority of the passages -- drawn from the works of Isaac Bashevis Singer, Anton Chekhov and William Maxwell, among others -- had been sanitized of virtually any reference to race, religion, ethnicity, sex, nudity, alcohol, even the mildest profanity and just about anything that might offend someone for some reason. Students had to write essays and answer questions based on these doctored versions -- versions that were clearly marked as the work of the widely known authors.

    In an excerpt from the work of Mr. Singer, for instance, all mention of Judaism is eliminated, even though it is so much the essence of his writing. His reference to ''Most Jewish women'' becomes ''Most women'' on the Regents, and ''even the Polish schools were closed'' becomes ''even the schools were closed.'' Out entirely goes the line ''Jews are Jews and Gentiles are Gentiles.'' In a passage from Annie Dillard's memoir, ''An American Childhood,'' racial references are edited out of a description of her childhood trips to a library in the black section of town where she is almost the only white visitor, even though the point of the passage is to emphasize race and the insights she learned about blacks.

    The modifications to the passages ranged widely. In the Chekhov story ''The Upheaval,'' the exam takes out the portion in which a wealthy woman looking for a missing brooch strip-searches all of the house's staff members. Students are then asked to use the story to write an essay on the meaning of human dignity.

    A paragraph in John Holt's ''Learning All the Time'' is truncated to eliminate some of the reasons Suzuki violin instruction differs in Japan and the United States, apparently not to offend anyone who might find the particulars somehow insulting. Students are nonetheless then asked to answer questions about those differences.

    One passage was derived from Frank Conroy's memoir, ''Stop-Time.'' The changes include replacing ''hell'' with ''heck'' in one sentence and excising references to sex, religion, nudity and potential violence (in the form of the declared intent of two boys to kill a snake) that are essential to an understanding of the passage.

    ''I was just completely shocked,'' Mr. Conroy said. ''It's going through and taking out the flavor of the month. It's terrible.''

    A number of the writers and scholars Ms. Heifetz contacted have written indignant letters that have also been submitted to the education commissioner. Mr. Conroy wrote in part: ''Who are these people who think they have a right to 'tidy up' my prose? The New York State Political Police? The Correct Theme Authority?''

    Background on the Bogus Anti-Semitism Claim:

    Excerpt from the Toronto Star (full story here):

    A half-listening student, a hypersensitive campus and the speed at which gossip travels on the Internet conspired to create a very damaging game of broken telephone for one York University professor this week.

    Cameron Johnston, who has been teaching at York for more than 30 years, has been forced to respond to allegations that he made anti-Semitic remarks in a lecture on Monday afternoon after a student misunderstood his comments and began sending emails to Jewish groups and the media.

    Johnston was giving his introductory lecture to Social Sciences 1140: “Self, Culture and Society,” when he explained to the nearly 500 students that the course was going to focus on texts, not opinions, and despite what they may have heard elsewhere, everyone is not entitled to their opinion.

    “All Jews should be sterilized” would be an example of an unacceptable and dangerous opinion, Johnston told the students.

    He didn’t notice Sarah Grunfeld storm out. Grunfeld, a 22-year-old in her final year at York, understood Johnston’s example to be his personal opinion.

    Excerpt from commentary by the Atlantic Monthly (full post here):

    Grunfeld, who is not the world's best listener, was off to complain to Jewish groups about the outrageous opinion that her professor didn't have. Soon campus Jewish groups contacted Johnston for an explanation, and the whole thing turned into a very big deal on campus. Johnston, understandably, says he's "terribly upset." Grunfeld, meanwhile, is sticking to her guns: “The words, ‘Jews should be sterilized’ still came out of his mouth, so regardless of the context I still think that’s pretty serious," were words that came out of her mouth.

    Excerpt from commentary on The Language Log, which points out some of the technical aspects of “use/mention distinction,” upon which this whole mess hinges:

    It's a dangerous path one treads when one tries to give examples of obnoxious propositions in a classroom where not all the students have a firm grasp of the fundamental distinction between the use and the mention of a linguistic expression.

    Lest readers of the “Information Pleas” Blog think I’m bashing Sarah Grunfeld because she is a young adult (everything I’ve read seems to mention her age), here is an excerpt from comments by Jonathan Kay of the National Post about how older users of the internet can be more credulous than younger, and are often implicated in propagating internet falsehoods such as the bogus claim of anti-Semitism at York University:

    Putting Sarah herself to one side, I found it interesting how quickly the story made the rounds of the internet. Within the space of a few hours on Monday night, five different middle-aged or senior-citizen Jewish correspondents sent me variations on this story. “York U [prof] makes anti-Semetic remark. Verifiable,” read one subject line. Another woman asked: “Do we think he’d say ‘All Muslims are terrorists,’ or ‘All blacks should be slaves’?” With every cycle of mass email forwarding, the story was getting more sensational.

    This is part of a trend. When I started this job in 1998, most of the bogus stories I got by email were from younger correspondents — because there just weren’t that many older people online. But then two things happened.

    First, young web surfers taught themselves how to check facts, by using Wikipedia and Snopes and other reputable sites. To avoid making reply-all fools of themselves, they stopped mass-forwarding bogus stories of the York U variety.

    Second, when those young adults started going off to college, or moving away — their parents had to figure out email and Facebook and Webcams in order to communicate with their kids and view pictures of their grandchildren. But these 50-, 60-, and 70-year old Internauts, having grown up in the age of print, never figured out that most of what you read online is made up. So when their sister-in-law’s hairdresser sends them something shocking, they uncritically pass it on to their friends.

    This explains why many middle-aged people and senior citizens I meet are actually more misinformed and radicalized than their children. Many Tea Party fanatics, in particular, are older white people who have cobbled a political philosophy together from nonsense Internet stories claiming that Barack Obama is Muslim, that global warming has been “debunked” or that universal health care means sending grandma to a “death panel.”

    Tuesday, September 20, 2011

    One Statement from Bachmann, Two Steps Back for HPV Vaccine - NYTimes.com

    Information irresponsibility, medico-politico division…

    During a debate last week for Republican presidential candidates and in interviews after it, Representative Michele Bachmann called the vaccine to prevent cervical cancer “dangerous.” Medical experts fired back quickly. Her statements were false, they said, emphasizing that the vaccine is safe and can save lives. Mrs. Bachmann was soon on the defensive, acknowledging that she was not a doctor or a scientist.

    But the harm to public health may have already been done. When politicians or celebrities raise alarms about vaccines, even false alarms, vaccination rates drop.

    One Statement from Bachmann, Two Steps Back for HPV Vaccine - NYTimes.com

    Wednesday, September 14, 2011

    'Find My Car' iPhone app finds anyone’s car • The Register

    A bit of information creepiness.

    An iPhone app released a few days ago called “Find My Car” has just turned into a PR disaster for shopping centre operator Westfield.

    The idea seemed neat enough: download the app, and if you lose your car, just enter the number plate, which Westfield’s cameras had captured and indexed. Someone forgetting where they’d parked their car can then be shown a photo of where the car is.

    As blogger Troy Hunt points out in this blog post, anyone can view anyone’s car.

    Worse, he writes, the application can easily be unpicked to download the location, plates, entry and exit times of every vehicle in the Bondi shopping centre in which the service was first rolled out.

    Picking the application apart, he says, shows that Westfield is “storing and making publicly accessible the time of entry and number plate of every single vehicle in the centre.”

    'Find My Car' iPhone app finds anyone’s car • The Register

    Wednesday, September 7, 2011

    Using a bike computer as a black box

    Data forensics: Recreating the details of an accident using a bike computer as a black box.  Cool.  This from the New York Times:

    Late last year Ryan Sabga, another top American bike racer, was hit by a car while crossing an intersection in Denver at the beginning of a training ride. The driver, coming out of an alley, was looking over her right shoulder; she stepped on the gas and made a left turn directly into Mr. Sabga as he reached the middle of the street.

    The driver told the police she didn’t think she had hit Mr. Sabga. Though her car had a telltale dent, the officer said that without proof of where the cyclist had entered the intersection, he would not be able to write a citation against the driver. That meant Mr. Sabga, who was relatively unscathed, would not be able to get her insurance company to cover the damage to his bike, which was now in pieces.

    Back at home, he realized that he might have the proof he needed in the data stored in the Garmin GPS device he used for training.

    “Clear as day, you could see where I stopped at the stop sign, where I got hit by the car and where my bike came to rest,” he wrote. “On the corresponding time stamp, you could see the speeds, the stops and even where my heart rate spiked as she hit me.”

    The police were unwilling to pursue the case, but they suggested that he send the data to the driver’s insurance company. He did so within a day, and the company took responsibility for the accident.

    Filling In the Details Wiped Away by a Bike Crash - NYTimes.com

    Tuesday, September 6, 2011

    Extraneous factors in judicial decisions

    Illustration that information responsibility (e.g., looking at the data) promotes ethical responsibility (e.g., equitable justice).

    Are judicial rulings based solely on laws and facts? Legal formalism holds that judges apply legal reasons to the facts of a case in a rational, mechanical, and deliberative manner. In contrast, legal realists argue that the rational application of legal reasons does not sufficiently explain the decisions of judges and that psychological, political, and social factors influence judicial rulings. We test the common caricature of realism that justice is “what the judge ate for breakfast” in sequential parole decisions made by experienced judges. We record the judges’ two daily food breaks, which result in segmenting the deliberations of the day into three distinct “decision sessions.” We find that the percentage of favorable rulings drops gradually from ≈65% to nearly zero within each decision session and returns abruptly to ≈65% after a break. Our findings suggest that judicial rulings can be swayed by extraneous variables that should have no bearing on legal decisions.

    For those without access to the Proceedings of the National Academy of Sciences, here are some links to related stories in the mainstream press:

    Extraneous factors in judicial decisions

    Print vs. Online Newspapers

    More on the previous theme: dead-tree versions of newspapers might have some advantages over on-line versions.

    Although the number of readers tested in the study is small—just 45—the paper confirms my print-superiority bias, at least when it comes to reading the Times. The paper explores several theories for why print rules. Online newspapers tend to give few cues about a story's importance, and the "agenda-setting function" of newspapers gets lost in the process. "Online readers are apt to acquire less information about national, international and political events than print newsreaders because of the lack of salience cues; they generally are not being told what to read via story placement and prominence—an enduring feature of the print product," the researchers write. The paper finds no evidence that the "dynamic online story forms" (you know, multimedia stuff) have made stories more memorable.

    The paper cites other researchers on the subject who have theorized that the layout of online pages—which often insert ads mid-story or force readers to click additional pages to finish the story—may alter the reading experience. A print story, even one that jumps to another page, is not as difficult to chase to its conclusion. Newspapers are less distracting—as anybody who has endured an annoying online ad while reading a news story on the Web knows. Also, and I'm channeling the paper a little bit here, by virtue of habit and culture a newspaper commands a different sort of respect, engagement, and focus from readers.

    Print vs. Online: How the print edition of the New York Times trumps the online version. - By Jack Shafer - Slate Magazine

    The Mechanic Muse — From Scroll to Screen - NYTimes.com

    From codex to eBook…  Two steps forward, one step back…

    The codex also came with a fringe benefit: It created a very different reading experience. With a codex, for the first time, you could jump to any point in a text instantly, nonlinearly. You could flip back and forth between two pages and even study them both at once. You could cross-check passages and compare them and bookmark them. You could skim if you were bored, and jump back to reread your favorite parts. It was the paper equivalent of random-access memory, and it must have been almost supernaturally empowering. With a scroll you could only trudge through texts the long way, linearly. (Some ancients found temporary fixes for this bug — Suetonius apparently suggested that Julius Caesar created a proto-notebook by stacking sheets of papyrus one on top of another.)

    But so far the great e-book debate has barely touched on the most important feature that the codex introduced: the nonlinear reading that so impressed St. Augustine. If the fable of the scroll and codex has a moral, this is it. We usually associate digital technology with nonlinearity, the forking paths that Web surfers beat through the Internet’s underbrush as they click from link to link. But e-books and nonlinearity don’t turn out to be very compatible. Trying to jump from place to place in a long document like a novel is painfully awkward on an e-reader, like trying to play the piano with numb fingers. You either creep through the book incrementally, page by page, or leap wildly from point to point and search term to search term.

    The Mechanic Muse — From Scroll to Screen - NYTimes.com

    A Case of Information Naivety

    Never mind the responsibility/irresponsibility of hiding knowledge from your colleagues…  Here we are highlighting a special form of organizational/management naivety: Assuming that information sharing will just happen because the software that allows it is so cool, or because the utopian macroscopic benefits will outweigh the disadvantages perceived by individual cutthroats.

    "We've had years of research in organizations about the benefits of knowledge-sharing but an important issue is the fact that people don't necessarily want to share their knowledge," says David Zweig, a professor of organizational behaviour and human resources management at the University of Toronto's Rotman School of Management and the University of Toronto at Scarborough.

    "A lot of companies have jumped on the bandwagon of knowledge-sharing," such as spending money on developing knowledge-sharing software, says Prof. Zweig. "It was a case of, 'If you build it they will come.' But they didn't come."

    The paper identifies three ways employees hide what they know from co-workers: being evasive, rationalized hiding -- such as saying a report is confidential -- and playing dumb.

    Why do they do it? Two big predictors are basic distrust and a poor knowledge-sharing climate within the company. Companies may be able to overcome that through strategies such as more direct contact and less e-mail communication, highlighting examples of trustworthiness, and avoiding "betrayal" incentives, like rewards for salespeople who poach each other's clients.

    Employees don't always share well with others, says new paper exposing 'knowledge hiding'

    Wednesday, August 31, 2011

    The Data Are Always Messy

    A case of information responsibility, public-policy division…

    “There’s a certain kind of academic that comes to Washington and can’t survive,” [former White House economic adviser Austan] Goolsbee said. “They’re the ones starting each sentence with ‘The economic model says …’ They are prone to silver-bullet-style answers, which demonstrate very sophisticated thinking about the model but very unsophisticated thinking about the real world.” The model may be missing a few things that are found in the real world—not least, the institutional and political obstacles that make some problems silver-bullet-proof. “If you’re going to be an academic who’s involved in the world of policy, you have to be involved in the world that exists,” Goolsbee told me. “I was always a data guy, not a theorist. Theorists can maintain total purity. The data are always messy.”

    Devil’s Advocate - Magazine - The Atlantic

    Nine Lives,… and Three Deaths (Statistically Speaking)

    Sigh… From the New York Times Book Review. A case of information irresponsibility, statistics-and-probability division.

    Heller flew 60 bombing missions between May and October 1944, a feat that should have killed him three times over, statistically speaking, since the average personnel loss was 5 percent per mission.

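    Reality:  A crew member’s chance of surviving 60 such bombing missions is about 4.6 percent (0.95 multiplied by itself 60 times), assuming the 5 percent risk applies independently to each mission.  Was there no one in the editorial process who perceived the silliness of being killed “three times over, statistically speaking”?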
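    The arithmetic can be checked with a short sketch. It assumes that the 5 percent loss rate applies independently to every mission, which is a simplification, and the function names are mine. Multiplying the loss rate by the number of missions gives the "three" behind "three times over," but that number is an expected count, not a probability. The chance of surviving all 60 missions is the survival rate raised to the 60th power, which comes out a bit under 5 percent.

```python
# Sketch only: assumes each mission carries the same, independent 5 percent risk.
LOSS_RATE = 0.05
MISSIONS = 60


def survival_probability(loss_rate: float, missions: int) -> float:
    """Probability of surviving every one of `missions` independent missions."""
    return (1.0 - loss_rate) ** missions


def naive_loss_count(loss_rate: float, missions: int) -> float:
    """Loss rate times missions: an expected count, not a probability."""
    return loss_rate * missions


p_survive = survival_probability(LOSS_RATE, MISSIONS)
naive = naive_loss_count(LOSS_RATE, MISSIONS)

print(f"Chance of surviving all {MISSIONS} missions: {p_survive:.1%}")
print(f"Loss rate times number of missions: {naive:.1f}")
```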

    Being realistic:  I don’t expect everyone to be facile with simple statistics and probability.  But I expect writers and editors to know their limitations. If you’re going to write (or publish) a statistical claim, ask someone competent to check it for you.

    The Enigma of Joseph Heller - NYTimes.com

    Falser Words Were Never Spoken - NYTimes.com

    Information irresponsibility, bumper-sticker division:

    In a coffee shop not long ago, I [that is, NY Times Op-Ed contributor Brian Morton] saw a mug with an inscription from Henry David Thoreau: “Go confidently in the direction of your dreams! Live the life you’ve imagined.”

    At least it said the words were Thoreau’s. But the attribution seemed a bit suspect. Thoreau, after all, was not known for his liberal use of exclamation points. When I got home, I looked up the passage (it’s from “Walden”): “I learned this, at least, by my experiment: that if one advances confidently in the direction of his dreams, and endeavors to live the life which he has imagined, he will meet with a success unexpected in common hours.”

    ...

    Thoreau, Gandhi, Mandela — it’s easy to see why their words and ideas have been massaged into gauzy slogans. They were inspirational figures, dreamers of beautiful dreams. But what goes missing in the slogans is that they were also sober, steely men. Each of them knew that thoroughgoing change, whether personal or social, involves humility and sacrifice, and that the effort to change oneself or the world always exacts a price.

    Falser Words Were Never Spoken - NYTimes.com

    Thursday, August 11, 2011

    The n-th Circle of Hell

    Many of the complaints about Google+ Circles can be paraphrased as “Nice idea, but too much of a hassle.”

    Peter Pachal described the problem in an article at PCMag.com:

    The main problem with Google Circles is that it's tedious. While I agree that most people separate their contacts into various groups in real life, doing so in a social network is a chore. It's one of the reasons we have different social networks (LinkedIn for work, Facebook for friends, etc.). Asking people to do this kind of organizing proactively, on a single network, vastly overestimates the patience of Web users. Sure, some people are very organized and left-brained (like the engineers who created Google+), with spotless inboxes and well-maintained lists of contacts, but my feeling is that the vast majority aren't. And of all the things that have turned people off of Facebook over the years, the lack of focus on friend-organizing tools isn't one of them.

    And here’s Andrew Gent (author of the misnamed Incredibly Dull blog) on the same topic:

    My second issue is around circles. I understand they sound like a good idea. My personal (and professional) relationships are more complex than Facebook's simplistic friends / non-friends model. So being able to define your relationships in more detail sounds like a positive step.

    The problem is, it's far more difficult than it sounds. I have friend friends and I have professional friends. I have professional friends and professional acquaintances. Some work for my old employer; some used to; some never did. Some know I am interested in poetry and video games (among other things); some don't. A few have met my wife; some may not even know I am married.

    When I start to break it down, it is not only not binary, it is more complex than even I can describe. Which is what makes Google+'s circles so frustrating. They require too much thinking. This is not a technical issue, per se, but a failure to be able to turn an implicit organic process into an explicit concrete categorization.

    The lesson here for data modelers and requirements analysts:  Modeling a phenomenon can be the (comparatively) easy part.  What’s hard is collecting, cleansing, maintaining, and archiving the data that populates the model.

    This lesson cannot be taught to modeling novices; it is one of the lessons that modelers-in-training can never get until they have learned the basics—until they have become intermediate modelers.  Upon becoming competent, a freshly minted modeler can be swept away by the power of the technique.  Armed with a new facility with content-neutral data-model shapes, the modeler thinks “Wow, this is powerful—I can model anything!  No matter what the users ask for I can express it on a model!”

    Such enthusiasm about newly acquired knowledge or skill is always dangerous. 

    In this case, the well-meaning modeler overlooks an ugly truth: every data model will, when implemented, impose a burden on the user community to populate that model with data.  If the model oversimplifies the phenomenon, the users will experience a burden much like what the users of Google+ Circles are reporting.  Being able to express something on a data model is a good thing.  But it is only the beginning.  And often, there is little correlation between how easy it is to develop the model and how easy it will be to populate it with instances.
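    A rough sketch of that gap, with made-up names and numbers (the Contact class, the circle names, and the figure of 300 contacts are all hypothetical): the model takes a few lines to write, while the number of yes/no decisions needed to populate it grows with contacts times circles.

```python
from dataclasses import dataclass, field


@dataclass
class Contact:
    """The 'easy part': a contact and the circles it belongs to."""
    name: str
    circles: set[str] = field(default_factory=set)


def membership_decisions(num_contacts: int, num_circles: int) -> int:
    """Upper bound on yes/no choices if every contact is weighed against every circle."""
    return num_contacts * num_circles


# Hypothetical figures, purely for illustration.
contacts = [Contact(f"contact-{i}") for i in range(300)]
circle_names = ["friends", "family", "work", "former colleagues", "poetry", "gaming"]

decisions = membership_decisions(len(contacts), len(circle_names))
print(f"{len(contacts)} contacts x {len(circle_names)} circles = {decisions} decisions")
```

    Adding one more circle to the model is a one-word change for the modeler and another 300 decisions for the user.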

    Thursday, August 4, 2011

    Language Log » Xtreme nerdview

    Why conceptual modeling will always be needed….

    I don't do surveys, so don't ask. I cannot afford a quarter of an hour answering an ill-designed list of questions for you so that your manager can use the scientifically worthless results to make out a case that your service unit is doing a good job. And don't call me on the phone and tell me you're doing some social science research, because I just know there will be a follow-up call trying to sell me carpets or enrol me in a political action committee. However, my colleague Bob Ladd encouraged me to do a survey about the new building in which the School of Philosophy, Psychology and Language Sciences lives its generally happy life at the University of Edinburgh. He told me there would be a treat at the end in terms of what I have dubbed nerdview. And boy, was there a treat.

    The defining feature of nerdview is the confusing of the viewpoint of the technical specialist on the inside with that of the general public outside, so that language suited only to the internal/technical perspective gets delivered to an external/layperson audience, resulting in unintelligibility. And here you really see it writ large, to a degree that seems almost moronic. This would scarcely seem plausible in a Dilbert cartoon strip. This is xtreme nerdview.

    Language Log » Xtreme nerdview

    Sunday, July 31, 2011

    #MITIQ: Report from MIT IQIS: The Chief Data Officer

    Earlier this month I spent a week at the 2011 MIT Information Quality Industry Symposium. 

    IQ is a young discipline, and conventional wisdom has not yet congealed into a widely accepted set of best practices.  For example, although most of the gathered experts agreed that an IQ program requires someone in the Chief Data Officer role, the urgency—or the perceived urgency—of the need can vary:

    • If data is part of your service (e.g., your service is a thoroughbred racetrack), you absolutely need a CDO.
    • If data supports your service (e.g., a brokerage), the need for a CDO is real, but some myopic folks might not realize this.
    • If data is not a part of your product, but describes your operations, the need for a CDO is slightly less urgent.  (Even here, the merits of data quality cannot be overstated, especially for data that supports regulatory reporting.)

    Furthermore, there were different—widely different—organizational approaches: 

    • CDO should report to the CIO.
    • CDO should not report to the CIO, but to someone less focused on technology, such as the COO.

    Other topics yielded equally varied opinions:

    • A good way to launch a data quality / information quality program is to concentrate on saving money.
    • Cost should not be the primary motivator; understanding the business should be.  The goal is not reducing cost, but establishing discipline (e.g., understanding risks and efficiencies in operation…)

    And while we’re talking about money, here’s another topic whose discussion showed widely varying opinions:

    • One function of the CDO is to establish a reliable funding stream for the DQ / IQ program.
    • No.  Don’t rely on a funding stream and don’t fall into the funding-stream mentality, because funding streams dry up when personnel change.  It is too difficult to defend a budget line item called “Data Quality.”  You must embed data quality into your core business, so that DQ is barely distinguishable from your core business operations.

    The opinions referred to here are not mine.  I’m just reporting on the vibrancy of the discussions at this year’s MIT IQIS conference.  Vibrant discussion is a good sign, because DQ / IQ programs will be a part of our future and we need to figure out what that means.

    Wednesday, July 20, 2011

    Lies, Perjury, Irresponsibility, Gamesmanship

    James B. Stewart suspects that America is suffering from an epidemic of perjury.  He says so in his new book, Tangled Webs: How False Statements Are Undermining America: From Martha Stewart to Bernie Madoff.  Here’s an excerpt (taken from here):

    We know how many murders are committed each year — 1,318,398 in 2009. We know the precise numbers for reported instances of rape, robbery, aggravated assault, burglary, larceny, and vehicle theft. No one keeps statistics for perjury and false statements — lies told under oath or to investigative and other agencies of the U.S. government — even though they are felonies punishable by up to five years in prison. There is simply too much of it, and too little is prosecuted to generate any meaningful statistics.

    Although lying seems to be an inherent part of human nature, the narrow but serious class of lies that undermines the judicial process on which government depends has been a crime as old as civilization itself. Originally prosecuted in England by ecclesiastical courts, by the sixteenth century perjury was firmly embedded as a crime in the English common law.

    Mounting evidence suggests that the broad public commitment to telling the truth under oath has been breaking down, eroding over recent decades, a trend that has been accelerating in recent years. Because there are no statistics, it’s impossible to know for certain how much lying afflicts the judicial process, and whether it’s worse now than in previous decades. Street criminals have always lied when confronted by law enforcement. But prosecutors have told me repeatedly that a surge of concerted, deliberate lying by a different class of criminal — sophisticated, educated, affluent, and represented in many cases by the best lawyers — threatens to swamp the legal system and undermine the prosecution of white-collar crime.

    Stewart’s take on the phenomenon might be a bit simplistic.  Here’s an excerpt from one book review of Tangled Webs:

    “We know how many murders are committed each year — 1,318,398 in 2009,” he writes in the first sentence of “Tangled Webs.”

    At this point, if I were caught up in Stewart’s prosecutorial spirit, I might object that the first sentence of his book is a lie. In fact, according to the F.B.I.’s statistics, an estimated 1,318,398 violent crimes, not murders, were committed in the United States in 2009. And a vast majority of these violent crimes didn’t involve murder; they involved robbery and aggravated assault. But of course, it would be hyperbolic and unfair of me to accuse Stewart of lying without knowing more about the motive behind his false statement. Perhaps it was an inadvertent error, in which case calling it a lie seems much too strong. On the other hand, perhaps it was a deliberate misrepresentation devised to create a more dramatic opening — perhaps, in other words, he felt that comparing lying to robbery would be less vivid than comparing lying to murder. Deliberate misrepresentation seems highly unlikely for a Pulitzer Prize-­winning journalist of his caliber, but without knowing more about his motives, I can’t make a fair-minded judgment about how seriously to treat his false statement.

    Unlike Stewart, the Anglo-American legal system has long been sensitive to these fine distinctions. It has treated some lies more seriously than others, depending on the intent of the speaker and the effect on other people.

    Although Stewart, now a business columnist for The New York Times, claims that lying has been on the rise, a more plausible thesis is that prosecutions for false statements have been rising — not because of growing contempt for the truth but because defendants are increasingly prosecuted for doing nothing more than denying their guilt to investigators.

    Complicating matters is the widely disseminated meme of “three felonies a day”—the idea that American federal law is so onerous that a typical American citizen commits three felonies per day as he or she goes about his or her business. 

    Another complication: Prosecutors themselves—those who report their outrage about perjury to Mr. Stewart—are not above taking the occasional liberty with the truth.  One doesn’t need to look hard: Here’s an example from—delving deep into the archives here—this week:

    Assertions by the prosecution that Casey Anthony conducted extensive computer searches on the word “chloroform” were based on inaccurate data, a software designer who testified at the trial said Monday.

    The designer, John Bradley, said Ms. Anthony had visited what the prosecution said was a crucial Web site only once, not 84 times, as prosecutors had asserted. He came to that conclusion after redesigning his software, and immediately alerted prosecutors and the police about the mistake, he said.

    The finding of 84 visits was used repeatedly during the trial to suggest that Ms. Anthony had planned to murder her 2-year-old daughter, Caylee, who was found dead in 2008. Ms. Anthony, who could have faced the death penalty, was acquitted of the killing on July 5.

    Mr. Bradley’s findings were not presented to the jury and the record was never corrected, he said. Prosecutors are required to reveal all information that is exculpatory to the defense.

    Now is a good chance to call attention to a personal approach to information responsibility.  I do not know whether Casey Anthony committed the crimes with which she was charged and of which she was recently acquitted.  It is my stated policy, motivated by information responsibility, to not have an opinion about criminal charges unless I am a juror responsible for contemplating those charges.

    Revolutions: Growth in data-related jobs

    Makes sense to me.

    At job-search site indeed.com, you can take a look at trends in the use of keywords used in job postings. As you might expect, job postings containing terms related to making sense from data are on the rise.

    Revolutions: Growth in data-related jobs

    Monday, July 18, 2011

    Language Log » Presupposition and boasting instructions for politicians

    From the department of “Information Responsibility Includes Knowing How Your Brain Can Let You Down:”

    In fact, psychological studies as far back as the seventies have shown that people can be so eager to accommodate presupposed information that they might even tweak their own memories accordingly. In a study led by memory scientist Elizabeth Loftus, people who'd witnessed simulated car crashes were more likely to mistakenly remember a stop sign when asked "Do you remember seeing the stop sign?" as opposed to "Do you remember seeing a stop sign?"

    Language Log » Presupposition and boasting instructions for politicians

    Wednesday, July 13, 2011

    MIT IQIS 2011: Data, Information, Knowledge

    Lots of great stuff presented at MIT IQIS 2011 this week.  Stuff?  It seems that I’m no longer entitled to use the words data, information, and knowledge the ways civilians—the sorts of folks who wouldn’t attend MIT IQIS—might use those words. 

    I admit, I’m repeating a theme from an earlier post.

    Many conference attendees distinguish between raw data, minimally processed data, and thoroughly analyzed data. I’m down with that; those distinctions are legitimate and deserve our attention.  However, it seems that many of the attendees of MIT IQIS have appropriated some English words to express these distinctions.  I’ve heard the following: Raw data is called “data.”  Minimally processed data is called “information.”  Thoroughly analyzed data is called “knowledge.”

    Reminder:  Civilians—even those who recognize the merits of the distinctions among data that is raw, minimally processed, and thoroughly analyzed—would not use the words data, information, and knowledge in this way.  An irony: A recurring theme of information quality is the need to secure business buy-in and support for IQ initiatives.  Hard to get the business to buy in to what you are proposing when you keep distorting the meanings of their perfectly good words.

    MIT IQIS 2011

    Just beginning my 90-minute tutorial on best practices in data modeling for attendees of the MIT IQIS 2011 Conference.  Good room, good crowd, sunny day in Cambridge.  Throughout the rest of this week, watch this spot for more from MIT IQIS.

    Monday, July 11, 2011

    An Underwhelming Bachmann "Gaffe" : CJR

    Likely reality:  A member of Michele Bachmann’s staff confused Waterloo, Iowa with Winterset, Iowa.

    Gleefully reported story:  Michele Bachmann confused notorious serial killer John Wayne Gacy with American film icon and airport name-giver John Wayne.

    Vigorous competition for eyeballs notwithstanding, this is information irresponsibility, plain and simple. 

    There is no reason to assume that Bachmann, apparently unaware of the relatively well-known John Wayne Birthplace Museum in Winterset, Iowa, some 150 miles from Waterloo, was better informed about the birthplace of the serial killer (who certainly demonstrated few of the “what America’s all about” qualities the candidate praises in the cowboy). More likely is that Bachmann’s aides were careless in their research before her Waterloo appearance, giving her bad information.

    Making the leap from John Wayne’s misnamed hometown to a mix-up of him and Gacy seems like a vaguely dishonest means to a spicy headline (not to mention innumerable videos of a Pogo-faced Michele Bachmann on YouTube). Granted, the headline grabs the eye more than might “BACHMANN MIXES UP JOHN WAYNE’S HOMETOWN WITH ANOTHER IOWA HAMLET STARTING WITH W,” and thus draws attention to the kind of careless mistake on the part of the candidate that could have an impact on swing voters.

    An Underwhelming Bachmann "Gaffe" : CJR

    Friday, June 24, 2011

    Scary New Cigarette Labels Not Based in Psychology - ScienceInsider

    More on the revolting images that will be added to cigarette packaging.

    Science Insider asked Tavris what the current research in behavioral psychology has to say about the effectiveness of fear imagery.

    " 'Current' research?" she replied in an e-mail. "Social psychologists have decades of research showing that fear communications generally backfire, that people tune them out, and therefore that these tactics are generally not effective."

    Scary New Cigarette Labels Not Based in Psychology - ScienceInsider

    Tuesday, June 21, 2011

    Mediactive - Creating a User's Guide to Democratized Media

    Frivolous post here:  Years ago, in a story about punk rock, the Times insisted on referring to “Mr. Rotten” and “Mr. Vicious.”

    The Times mandates courtesy titles (Mr., Ms., etc.) only in news stories, though it drops them for some dead people and those it arbitrarily considers evil enough not to deserve them. For example, Osama Bin Laden lost his Mr. after US forces killed him in May. But Saddam Hussein was recently still being called Mr. Hussein, as Slate notes.

    Mediactive - Creating a User's Guide to Democratized Media

    U.S. Releases Graphic Images to Deter Smokers - NYTimes.com

    Persuasion by disgust.

    The graphic images will include photographs of horribly damaged teeth and lungs and a man exhaling smoke through a tracheotomy opening in his neck. The Department of Health and Human Services selected nine color images among 36 proposed to accompany larger health warnings

    U.S. Releases Graphic Images to Deter Smokers - NYTimes.com

    Monday, June 20, 2011

    Find Free Money: Almost Every Bill You Get Is Full Of Mistakes - CIO Central - CIO Network - Forbes

    Check your pockets.

    Is your phone bill so phony that it’s secretly robbing you of thousands of dollars? Could your garbage costs be trashing your company? And are your payments for security services making you feel insecure?

    If so, then you’re not alone.

    An analysis by SIB Development & Consulting found that 95% of all regular monthly service bills contain errors – errors which can add up to big bucks.

    Find Free Money: Almost Every Bill You Get Is Full Of Mistakes - CIO Central - CIO Network - Forbes

    Sunday, June 19, 2011

    Taymor opens up about 'Spider-Man' - Entertainment News, Legit News, Media - Variety

    I had heard that the problems of Spider-Man: Turn Off the Dark were caused by failure to control web technology.  Now I get it.

    As a result of the national spotlight directed at the development of the project, she noted, "You get bored of a show before it even opens because there's been too much talk about it."

    She added, "Twitter, Facebook, blogging just trumps you."

    Taymor opens up about 'Spider-Man' - Entertainment News, Legit News, Media - Variety

    Saturday, June 18, 2011

    Split personality disorder: brought to you by the Internet » OWNI.eu, News, Augmented

    The fellow out-Freys James Frey.

    Recently, the blogosphere has been buzzing about Amina’s blog A Gay Girl in Damascus: An out Syrian lesbian’s thoughts on life, the universe and so on. For years this American-Syrian woman was an inspirational activist, her powerful writing touched the lives of her followers and those in the Middle East fighting for their rights. At the beginning of June Amina disappeared. Her cousin posted on the blog, claiming Amina had been captured and was being held in custody by the Secret Service. There was an outcry for her release, as her faithful followers worried about their heroine’s well-being. A search ensued, only to find…that no one in Damascus had ever heard of the woman. After some digital investigation, it turns out that Amina was in fact not a resolute gay woman fighting in her country for her rights, but a 40-year-old man in Scotland with a brilliant imagination and enough knowledge on the subject to breathe life into the blog. On June 12, Tom MacMaster came clean, posting an apology to his readers

    Split personality disorder: brought to you by the Internet » OWNI.eu, News, Augmented

    Social networking sites and our lives | Pew Internet & American Life Project

    Does MySpace offer a refuge from the internet echo chamber?

    MySpace users are more likely to be open to opposing points of view.

    We measured “perspective taking,” or the ability of people to consider multiple points of view. There is no evidence that SNS users, including those who use Facebook, are any more likely than others to cocoon themselves in social networks of like-minded and similar people, as some have feared.

    Moreover, regression analysis found that those who use MySpace have significantly higher levels of perspective taking. The average adult scored 64/100 on a scale of perspective taking, using regression analysis to control for demographic factors, a MySpace user who uses the site a half dozen times per month tends to score about 8 points higher on the scale.

    Social networking sites and our lives | Pew Internet & American Life Project

    Nieman Reports | Online Comments: Dialogue or Diatribe?

    Last sentence (of this excerpt) is chilling.

    The 90-9-1 principle convinced me that many, not all, comment sections are an exercise in faux democracy. This theory goes that 90 percent of us will read something online and move on. Nine percent—I'm in this group—occasionally take time to comment. That leaves roughly 1 percent who dominate the online conversation, and among this smaller number is found the digital equivalent of the loudest drunk in the bar. Their messages are often rude and accusatory; they indicate little interest in joining a conversation, yet they succeed in scaring off those who might want to truly engage.

    This has occasionally pushed away a news source.

    Nieman Reports | Online Comments: Dialogue or Diatribe?

    Nieman Journalism Lab » Pushing to the Future of Journalism

    Just another click in the wall…

    The Guardian’s digital leap: The Guardian has long been one of the top newspapers on the web, but this week, the British paper announced a major step in its development as a digital news organization with a transition to a “digital first” operation. So what exactly does that mean? Essentially, that the Guardian will pour more of its resources (especially financial) into its digital operation in an effort to double its digital revenues within the next five years.

    Nieman Journalism Lab » Pushing to the Future of Journalism

    Eat Your Vegetables, and Don't Forget to Tweet - WSJ.com

    When documenting your experience interferes with experiencing your experience, you are officially a victim of your times.  

    Ms. Wilson, an avid cook, photographs most family meals for her "Gotham Gal" blog. "If you try to take a bite of your appetizer before she's taken a picture, you hear her say, 'Wait wait wait wait,' and she'll make you put it back," Josh says.

    Eat Your Vegetables, and Don't Forget to Tweet - WSJ.com

    Tuesday, June 14, 2011

    Letter to the Public Editor of The New York Times

    Here is the text of a letter I just sent to the public editor of the New York Times.

    === === ===

    Dear Public Editor of the New York Times,

    The chess column of 12 June 2011 includes this sentence: “A message sent to Natsidis’s Facebook page asking him for comment was not answered.”

    The sentence motivates these questions: 

    1. Is Facebook a legitimate medium for seeking comment from public figures before publishing stories about them?
    2. If so, when did it become so?  Is LinkedIn also considered legitimate for this purpose?  What procedures are in place to ensure that reporters are not communicating with a spoofed/counterfeit identity?
    3. Do the formal policies of the New York Times include a list of acceptable communications media for reaching the subjects of news stories?  Are reporters now burdened with trying several different media—a policy that would honor the reality that different people prioritize various communications media differently?

    These and other questions bring to mind your column of 02 April 2011, in which you discuss a “move The Times should make, one that would help secure a tighter bond with its audience: publishing The Times’s journalism policies in a searchable format and in a visible location on NYTimes.com.”

    You correctly point out that such a move would “enable readers to see more clearly into the news operation.”  Another benefit: it would reveal the rules of the game to news subjects.  If the new reality is that reporters’ due diligence for seeking comment can be satisfied merely by posting a message on a Facebook page, that reality ought to be formally—and publicly—articulated.  Is it?

    Sincerely,

    Joe Maguire

    Chess - Christoph Natsidis Punished for Cheating - NYTimes.com

    Tut tut…

    A German master was disqualified from his country’s championship tournament this month, done in by his cellphone.

    Officials determined that the master, Christoph Natsidis, 23, had consulted a computer program on his smartphone during his game against the grandmaster Sebastian Siebrecht in the last round.

    A message sent to Natsidis’s Facebook page asking him for comment was not answered.

    Chess - Christoph Natsidis Punished for Cheating - NYTimes.com

    People Argue Just to Win, Scholars Assert. - NYTimes.com

    A recurring theme of information responsibility: Knowing how your brain can let you down (e.g., through confirmation bias).  Now we have a theory about why your brain lets you down.

    For centuries thinkers have assumed that the uniquely human capacity for reasoning has existed to let people reach beyond mere perception and reflex in the search for truth. Rationality allowed a solitary thinker to blaze a path to philosophical, moral and scientific enlightenment.

    Now some researchers are suggesting that reason evolved for a completely different purpose: to win arguments. Rationality, by this yardstick (and irrationality too, but we’ll get to that) is nothing more or less than a servant of the hard-wired compulsion to triumph in the debating arena. According to this view, bias, lack of logic and other supposed flaws that pollute the stream of reason are instead social adaptations that enable one group to persuade (and defeat) another. Certitude works, however sharply it may depart from the truth.

    People Argue Just to Win, Scholars Assert. - NYTimes.com

    Sunday, June 12, 2011

    U.S. Underwrites Internet Detour Around Censors Abroad - NYTimes.com

    Can you fear me now?

    The Obama administration is leading a global effort to deploy “shadow” Internet and mobile phone systems that dissidents can use to undermine repressive governments that seek to silence them by censoring or shutting down telecommunications networks.

    The Obama administration’s initiative is in one sense a new front in a longstanding diplomatic push to defend free speech and nurture democracy. For decades, the United States has sent radio broadcasts into autocratic countries through Voice of America and other means. More recently, Washington has supported the development of software that preserves the anonymity of users in places like China, and training for citizens who want to pass information along the government-owned Internet without getting caught.

    But the latest initiative depends on creating entirely separate pathways for communication. It has brought together an improbable alliance of diplomats and military engineers, young programmers and dissidents from at least a dozen countries, many of whom variously describe the new approach as more audacious and clever and, yes, cooler

    U.S. Underwrites Internet Detour Around Censors Abroad - NYTimes.com

    Book Review - The Filter Bubble - By Eli Pariser - NYTimes.com

    Note that this review—even-tempered, resisting the urge to amplify the author’s hand-wringing about serendipity—is by a man known for some hand-wringing of his own: none other than Evgeny Morozov, the author of the recently published The Net Delusion (about which more here).

    Such selectivity may eventually trap us inside our own “information cocoons,” as the legal scholar Cass Sunstein put it in his 2001 book “Republic.com.” He posited that this could be one of the Internet’s most pernicious effects on the public sphere. “The Filter Bubble,” Eli Pariser’s important new inquiry into the dangers of excessive personalization, advances a similar argument. But while Sunstein worried that citizens would deliberately use technology to over-customize what they read, Pariser, the board president of the political advocacy group MoveOn.org, worries that technology companies are already silently doing this for us. As a result, he writes, “personalization filters serve up a kind of invisible autopropaganda, indoctrinating us with our own ideas, amplifying our desire for things that are familiar and leaving us oblivious to the dangers lurking in the dark territory of the unknown.”

    Most important, personalization’s effects on serendipity are far more ambiguous than “The Filter Bubble” suggests. Lacking a stable working definition of serendipity, Pariser sometimes equates it with randomness, sometimes with unexpected exposure to new ideas. But serendipity is a subjective concept that cannot be understood in isolation from the searcher’s own quirkiness and previous search history. By knowing which Web sites you like to visit and bookmark, a search engine might immediately point you to useful links that could otherwise get lost on Page 99 of unpersonalized search results. (In a 2009 study of search habits that tested this proposition, researchers for Microsoft found that “rather than harming serendipity, personalization appears to identify interesting results in addition to relevant ones.”) Building on Louis Pasteur’s observation that “chance favors the prepared mind,” one could see how personalization might augment serendipity by helping us maximize our own preparedness.

    Book Review - The Filter Bubble - By Eli Pariser - NYTimes.com

    Emma Forrest, Hollywood L-it girl - The Globe and Mail

    How much privacy is a memoirist allowed to claim?

    Call me a stickler, but deliberate obfuscation of the facts is a dangerous game to play when promoting a memoir of “obsession, heartbreak and slow, stubborn healing” (as Eat Pray Love author Elizabeth Gilbert described it in her cover blurb). It’s simply unreasonable to accept public adulation for laying yourself bare one moment, then behave as though your privacy is being invaded the next. Emotional honesty is the memoirist’s stock and trade, but it’s also a sacred contract with the reader – lest we all forget the example of James Frey.

    Emma Forrest, Hollywood L-it girl - The Globe and Mail

    Thursday, June 9, 2011

    Conservative Group Scanned Weiner’s Posts, Warned Women - NYTimes.com

    Gotcha!

    As Democrats and Republicans embrace Twitter and other social media tools as a way to interact with their constituents and woo voters, many have discovered a downside to online communication: cyberstalkers, who track and criticize their every move.

    Conservative Group Scanned Weiner’s Posts, Warned Women - NYTimes.com

    A Federal Study Finds That Local Reporting Has Waned - NYTimes.com

    Not terribly surprising, but discouraging nonetheless…

    An explosion of online news sources in recent years has not produced a corresponding increase in reporting, particularly quality local reporting, a federal study of the media has found.

    Coverage of state governments and municipalities has receded at such an alarming pace that it has left government with more power than ever to set the agenda and have assertions unchallenged, concluded the study, which is to be released on Thursday.

    A Federal Study Finds That Local Reporting Has Waned - NYTimes.com

    Tuesday, June 7, 2011

    Debating the Value of College in America : The New Yorker

    Being able to write clearly is a part of information responsibility. Without it, access to education is wasted.  That’s what Professor X says:

    When he is not taking on trends in modern thought, Professor X is shrewd about the reasons it’s hard to teach underprepared students how to write. “I have come to think,” he says, “that the two most crucial ingredients in the mysterious mix that makes a good writer may be (1) having read enough throughout a lifetime to have internalized the rhythms of the written word, and (2) refining the ability to mimic those rhythms.” This makes sense. If you read a lot of sentences, then you start to think in sentences, and if you think in sentences, then you can write sentences, because you know what a sentence sounds like. Someone who has reached the age of eighteen or twenty and has never been a reader is not going to become a writer in fifteen weeks.

    Debating the Value of College in America : The New Yorker

    Friday, June 3, 2011

    Mind Control & the Internet by Sue Halpern | The New York Review of Books

    More on a recent topic: Search makes it harder for me to encounter ideas that challenge my world view.

    With personalized search, “now you get the result that Google’s algorithm suggests is best for you in particular—and someone else may see something entirely different. In other words, there is no standard Google anymore.” It’s as if we looked up the same topic in an encyclopedia and each found different entries—but of course we would not assume they were different since we’d be consulting what we thought to be a standard reference.

    Among the many insidious consequences of this individualization is that by tailoring the information you receive to the algorithm’s perception of who you are, a perception that it constructs out of fifty-seven variables, Google directs you to material that is most likely to reinforce your own worldview, ideology, and assumptions.

    Mind Control & the Internet by Sue Halpern | The New York Review of Books

    Thursday, June 2, 2011

    Celebrity Books and Ghostwriters - Noticed - NYTimes.com

    Neglecting to mention the ghostwriter qualifies as information irresponsibility.  But (warning, snarkiness arriving in three, two, …) really, is there any information in a book written by a Kardashian sister? And, sigh, if you’re buying a book purportedly written by Snooki, aren’t you guilty of information irresponsibility yourself?

    Like a branded fragrance or clothing line, the novel — once quaintly considered an artistic endeavor sprung from a single creative voice — has become another piece of merchandise stamped with the name of celebrities, who often pass off the book as their work alone despite the nearly universal involvement of ghostwriters. And the publishing industry has been happy to oblige.

    Celebrity Books and Ghostwriters - Noticed - NYTimes.com