Thursday, February 24, 2011

The Media Blog

Data lineage is the tracking of data values as they flow through an organization and its systems.  Data lineage lets the users of a data warehouse know where a particular value came from (which system, on what date, etc.).
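To make the idea concrete, here is a minimal sketch of a value that carries its own lineage. Everything in it (the `Lineage` and `TrackedValue` classes, the field names, the system names, the dates) is invented for illustration; real warehouses usually record lineage in metadata tables or ETL tooling rather than on each value.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class Lineage:
    """One hop in a value's history (all field names are illustrative)."""
    source_system: str   # system that produced or handled the value
    handled_on: date     # date of that hop
    step: str            # name of the step applied at that hop


@dataclass(frozen=True)
class TrackedValue:
    value: float
    lineage: Tuple[Lineage, ...]  # oldest hop first

    def derive(self, new_value: float, hop: Lineage) -> "TrackedValue":
        """Return a new value whose lineage is extended by one hop."""
        return TrackedValue(new_value, self.lineage + (hop,))


# A hypothetical amount extracted from a billing system, then converted in the warehouse.
raw = TrackedValue(1200.0, (Lineage("billing_db", date(2011, 2, 23), "extract"),))
converted = raw.derive(raw.value * 0.5,
                       Lineage("warehouse", date(2011, 2, 24), "currency_convert"))

# Answering "where did this value come from?" is a lookup on the first hop.
origin = converted.lineage[0]
```

Because each derived value copies and extends the history instead of overwriting it, the original source and date stay available no matter how many transformations follow.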

Data lineage applies to unstructured data too.  Keep reading:

The Media Standards Trust has launched a website aimed at highlighting instances of 'churnalism' - a phrase made popular by Nick Davies, author of Flat Earth News, which refers to the tendency among some journalists to cut and paste content from press releases with minimal 'topping and tailing' and little effort to seek independent comment or challenge claims therein.

Called, appropriately enough, the website will enable visitors to check press releases against more than three million news articles, from national newspapers and the websites of major broadcasters. 

Visitors to the non-profit website will also apparently be able to see the percentage of any press release cut and pasted into news articles as well as tagging examples of 'churn' for others to see, thus creating a database of 'churnalism', and sharing them via Twitter and Facebook.
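The excerpt does not say how the site computes its percentage. As a rough illustration of the idea only, one common approach is to compare overlapping runs of words ("shingles"). The function names and the five-word window below are assumptions made for this sketch, not the site's actual method.

```python
import re


def shingles(text: str, n: int = 5) -> set:
    """Return the set of n-word sequences (shingles) in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def percent_pasted(press_release: str, article: str, n: int = 5) -> float:
    """Percentage of the press release's shingles that also occur in the article."""
    release = shingles(press_release, n)
    if not release:
        return 0.0  # release shorter than n words: nothing to compare
    return 100.0 * len(release & shingles(article, n)) / len(release)
```

A score near 100 means almost every five-word run in the release reappears in the article; a score near 0 means the article shares little verbatim text with it. Paraphrased copying would slip past a check this simple.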

The Media Blog

Wednesday, February 23, 2011

The Media Blog

Justin Bieber (shriek!) is in a bit of hot water over his response to some questions about abortion put to him by a Rolling Stone reporter.  The tweets are twitchy.

Was there really any value, beyond the publicity it has generated - and therein lies the key perhaps to such irresponsible reporting - in asking that question, or following up with such a highly-charged and confrontational question - to a 16-year-old - about rape, which he so clearly tripped over as the awkward, fumbled wording of his caveated answer suggests?

Much of the anger aimed at Bieber has pointed out how inappropriate it is that such ill-informed opinions fell from the mouth of a teenage role model, or that he would even wade into and skew the debate about such a controversial subject matter. But I can't help thinking, on this occasion, it was the question that was even more inappropriate than the answers.

Lessons in information responsibility:

  • For all of us:  Don’t look to 16-year-old pop stars for leadership on important, polarizing social issues.
  • For all of us:  Don’t punish 16-year-old pop stars for their opinions on such issues.
  • For 16-year-old pop stars:  Answer such questions with:  “I’m a 16-year-old pop star.  My opinions are still forming.”
  • For journalists:  Resist the urge to ask 16-year-old pop stars about today’s pressing moral issues, no matter how it might boost your magazine’s circulation.

UPDATE (01 March 2011):  Rolling Stone has admitted to misquoting Bieber.

The Media Blog

Tuesday, February 22, 2011

Searching for Details Online, Lawyers Facebook the Jury | Ana Campoy and Ashby Jones | Voices | AllThingsD


Facebook is increasingly being used in courts to decide who is–and who isn’t–suitable to serve on a jury, the latest way in which the social-networking site is altering the U.S. court system.

Searching for Details Online, Lawyers Facebook the Jury | Ana Campoy and Ashby Jones | Voices | AllThingsD

Monday, February 21, 2011

UXPin Paper Prototyping Kit Helps Designers Mock Up Web Sites | Drake Martinet | Voices | AllThingsD

A low-tech solution (sticky paper) for preliminary design of high-tech artifacts (web sites)…

…Analogous to requirements analysts working with whiteboards and markers rather than with on-screen electronic design surfaces…

But thanks to Marcin Treder, Kamil Zieba and Wictor Mazur, Poland can now add user experience (UX) design tools to its list of exports.

The three designers, who met at their day jobs working for one of Poland’s biggest e-commerce sites, founded UXPin–the quietly-famous Web site prototyping kit made of specially printed paper and sticky notes, beautifully bundled inside its own portable folder.

After abandoning collaborative wire-framing software as either too slow or too technical for lay people to operate, they began printing wire-frame pieces and mixing them with Post-it-type notes to mock up designs.

UXPin Paper Prototyping Kit Helps Designers Mock Up Web Sites | Drake Martinet | Voices | AllThingsD

Datelines and Road Warriors -

More from the public editor of the New York Times: yet another best practice about narrative data that also applies to structured data.  The date on which the data was created is important, and often overlooked.  (The quoted excerpt is from a letter sent to the Times by one of its readers.)

I’m perturbed by The Times’s decision to omit dates from datelines. In the Jan. 20 print edition, Sheryl Gay Stolberg reported that Nicholas Kristof attended the previous night’s state dinner at the White House, while Mr. Kristof’s column that day carried a dateline of Beijing. Did he hastily fly back from Beijing to Washington on the same day he filed that column?

Datelines and Road Warriors -

Seth Mnookin's The Panic Virus: The story of how so many parents fell for the autism-vaccine link. - By Anna B. Reisman - Slate Magazine

More on the autism/vaccine delusion.

Vaccine paranoia may also be a consequence of print journalism's decline. Health and science reporters are supposed to not only translate scientific jargon into clear language but also comment on whether a particular study's methods are kosher. But in the last 20 years, as Mnookin notes, the number of science reporters and science sections has dropped sharply. Many journalists now treat press releases as gospel, without doing any independent reporting. And then there's journalist David Kirby, whose 2004 book, Evidence of Harm: Mercury in Vaccines and the Autism Epidemic: A Medical Controversy, explored the purported autism-MMR-thimerosal link. Bitterly sarcastic, Mnookin describes Kirby's narrative as "proud, independent-minded mothers doing battle with greedy drug companies and corrupt government agencies." Although much about the book was misleading, including its title (which as Mnookin notes was taken from a 1999 CDC statement finding no evidence of harm involving thimerosal in vaccines), the media ate it up: When Tim Russert squared Kirby off against Harvey Fineberg, the president of the venerable Institute of Medicine, Kirby's polished comments sparkled in comparison to Fineberg's bumbling attempts to respond to his absurd pronouncements without sounding condescending.

Seth Mnookin's The Panic Virus: The story of how so many parents fell for the autism-vaccine link. - By Anna B. Reisman - Slate Magazine

Saturday, February 19, 2011

Abstract: Reproducible Software versus Reproducible Research (2011 AAAS Annual Meeting (17-21 February 2011))

The development of the stored-program concept is a milestone of computer science that blurred (but certainly did not obliterate) the distinction between code and data.  The ability to treat code as data is now so deeply ingrained that few computer practitioners ever think about the alternative, which seems like a quaint historical curiosity.

Here’s a contemporary problem that provides a chance to think about the distinction afresh: Computational scientists should publish their code as well as their data.  Data dissemination—a pillar of scientific transparency—is inadequate without concomitant dissemination of the code that generates the computed/simulated results.

In sharp contrast we have incentives in computational research, strongly biased towards rapid publication of papers without any realistic requirement of validation, that lead to a completely different outcome.  Publications in computationally-based research (applied to any specific discipline) often lack any realistic hope of being reproduced, as the code behind them is not available, or if it is it rarely has any automated validation, history tracking, bug database, etc.

Abstract: Reproducible Software versus Reproducible Research (2011 AAAS Annual Meeting (17-21 February 2011))

Wednesday, February 16, 2011

Race, Religion and Other Perilous Ground -

The very existence of endeavors like the New York Times Public Editor column constitutes a victory for responsible information creation and dissemination.  The topic of this particular edition also serves as a potent reminder that information doesn’t just happen; it is the result of a manufacturing process. 

For decades, The Times has had clear policies warning reporters and editors to be careful about using ethnic, racial and religious labels. Only when “pertinent,” its stylebook says.

Making the judgment call of what is pertinent, though, is easier said than done. Even when the judgment call can be justified, errors of execution can provoke strong responses from readers, as two recent examples illustrate.

Race, Religion and Other Perilous Ground -

Tuesday, February 15, 2011

Megastore: Google's Answer to NoSQL Databases - ReadWriteCloud

Apparently, even genuinely web-scale endeavors like Google are realizing that NoSQL (“It’s web-scale!”) can sacrifice too much of the convenience of a traditional relational DBMS.

Last month Google released a paper on its high availability datastore Megastore. Megastore "blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS in a novel way, and provides both strong consistency guarantees and high availability," the paper says. Megastore is the technology behind Google's High Replication Datastore.

Megastore: Google's Answer to NoSQL Databases - ReadWriteCloud

Lymph Node Surgery for Breast Cancer Not Always Needed, Study Says -

“Emotional asymmetry” in response to scientific data:

Doctors and patients alike find it easy to accept more cancer treatment on the basis of a study, Dr. Morrow said, but get scared when the data favor less treatment.

In the loose categorization scheme in my noggin, I’ll throw this in the same shoebox as the behavioral economics experiment that shows asymmetrical responses to medical risk, depending on whether the data are presented positively (“treatment will save the lives of 80 percent of patients”) or negatively (“20 percent of the patients will die”).

This also reminds me of the potential pitfalls in “bias to action” approaches to getting things done.  “Bias to action” is indisputably beneficial in customer service, but there are some scenarios in which it yields negative results.

Lymph Node Surgery for Breast Cancer Not Always Needed, Study Says -

Friday, February 11, 2011

The Monkey Cage: The invisible American welfare state


Resist the temptation to call this “public ignorance” and leave it at that.  There may be room for legitimate disagreement about what constitutes a government program.  We all drive on the interstate—does that mean we all benefit from government programs? (Perhaps yes, perhaps no.  I’m entitled to ask the question without knowing the answer.  Even in the blogosphere, not all questions are leading questions.)  Nevertheless, it does seem noteworthy that over a quarter of the participants in Welfare—the program that Eleanor Rosch might deem the prototypical government program—deny participating in government programs.

Suzanne Mettler's piece in Perspectives on Politics (free access to PDF) has many fascinating arguments about the political consequences of public ignorance about the benefits that people receive from the state. But this table is jawdropping. It shows the percentage of people who (a) benefit from various programs, and (b) claim in response to a government survey that they 'have not used a government social program.'


Mettler's basic argument is that because the US welfare state is 'submerged' and sliced up among a variety of different programs, many of which operate indirectly rather than directly, it is mostly invisible to US citizens. This has obvious political consequences - 'government social programs' are equated to 'welfare' and stigmatized. The fact that nearly half of Social Security recipients do not believe that they have benefited from a government social program, and that the same is true of some 40% of G.I. Bill beneficiaries and Medicare recipients is a rather extraordinary one.

The Monkey Cage: The invisible American welfare state

Thursday, February 10, 2011

Language Log » Correction of the Year?

Anyone who follows this link will be rewarded with an amusing, almost farcical, example of miscommunication.  Beyond that, data modelers who follow the link and read the brief linguistic analysis will recognize the homonym problem. 

I think the misunderstanding between the reporter and piggery owner could have been amplified by the use of pig in its specific or "marked" form — that is, meaning 'a young member of the domesticated subspecies Sus scrofa domesticus' rather than the general or "unmarked" version, 'any member of Sus scrofa domesticus.' As the Oxford English Dictionary explains under sense 2a of pig, the more specific sense of the word is "chiefly used in periods when and regions where the usual words for an adult pig are swine, hog, sow, or boar."

Language Log » Correction of the Year?

Monday, February 7, 2011

Database Refactoring

Here’s a little something about Database Refactoring I found in the blogosphere.

Although the methodology of refactoring code has been adopted enthusiastically, the same has not really been the case with databases. Nick [Harrison] argues that the reason could lie in the extent of the task of unpicking complex database systems sufficiently to make them more efficient and effective: and this will only be ameliorated with better tools and planning to support the techniques.

Respectfully disagree.  Don’t get me wrong: There is no doubt that many implemented databases are inefficient and ineffective.  Likewise, better tools and planning would surely be a good thing. 

But the difficulty of Database Refactoring is more fundamental, and good tools won’t change the fundamental realities.  Most “database refactorings” alter the semantic expressiveness of the data model, and one way or another, such changes will always be evident to the users of the system.  That is, most database refactorings are not refactorings at all, because they do change the functional requirements of the database.

Remember, code refactoring should improve code without changing its functional requirements.  Here’s the Wikipedia entry for refactoring:

Code refactoring is "a disciplined way to restructure code",[1] undertaken in order to improve some of the nonfunctional attributes of the software. Typically, this is done by applying series of "refactorings", each of which is a (usually) tiny change in a computer program's source code that does not modify its functional requirements. Advantages include improved code readability and reduced complexity to improve the maintainability of the source code, as well as a more expressive internal architecture or object model to improve extensibility.

The real reason why database refactoring is used less than code refactoring is that database experts know that very few database modifications achieve this standard.  Almost all database modifications alter the functional capabilities of the database.
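A small sketch of the distinction, using Python's built-in `sqlite3` module. The `customer` and `customer_phone` tables, their columns, and the data are hypothetical, made up for this example. Adding an index leaves every answer unchanged, so it meets the refactoring standard; moving phone numbers into a child table lets the database record a fact it could not hold before, which is a functional change.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
db.execute("INSERT INTO customer VALUES (1, 'Ada', '555-0100')")

query = "SELECT name, phone FROM customer WHERE name = 'Ada'"
before = db.execute(query).fetchall()

# Change 1: add an index. Same questions, same answers; only performance can differ.
db.execute("CREATE INDEX customer_name ON customer (name)")
after_index = db.execute(query).fetchall()

# Change 2: move phone numbers to a child table so a customer can have several.
db.execute(
    "CREATE TABLE customer_phone ("
    "customer_id INTEGER REFERENCES customer(id), phone TEXT)"
)
db.execute("INSERT INTO customer_phone SELECT id, phone FROM customer")
# A second phone for the same customer: a fact the one-column model could not record.
db.execute("INSERT INTO customer_phone VALUES (1, '555-0199')")
phones = [row[0] for row in db.execute(
    "SELECT phone FROM customer_phone WHERE customer_id = 1 ORDER BY phone"
)]
```

After the second change, every screen, report, and query that assumed one phone per customer has to decide what to do with two. That is the sense in which the change is evident to users of the system.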

Database Refactoring

Wallflowers at the Revolution -

More on the theme from my previous post.  (The medium is the message?  The message outlives the medium.)

Perhaps the most revealing window into America’s media-fed isolation from this crisis — small an example as it may seem — is the default assumption that the Egyptian uprising, like every other paroxysm in the region since the Green Revolution in Iran 18 months ago, must be powered by the twin American-born phenomena of Twitter and Facebook. Television news — at once threatened by the power of the Internet and fearful of appearing unhip — can’t get enough of this cliché.

The social networking hype eventually had to subside for a simple reason: The Egyptian government pulled the plug on its four main Internet providers and yet the revolution only got stronger. “Let’s get a reality check here,” said Jim Clancy, a CNN International anchor, who broke through the bloviation on Jan. 29 by noting that the biggest demonstrations to date occurred on a day when the Internet was down. “There wasn’t any Twitter. There wasn’t any Facebook,” he said. No less exasperated was another knowledgeable on-the-scene journalist, Richard Engel, who set the record straight on MSNBC in a satellite hook-up with Rachel Maddow. “This didn’t have anything to do with Twitter and Facebook,” he said.

Wallflowers at the Revolution -

News Desk: Does Egypt Need Twitter? : The New Yorker

The medium is the message, except when it isn’t.

When Mao famously said that power springs from the barrel of a gun, it was assumed that he was talking about guns. There wasn’t much interest at the time in how he chose to communicate that sentiment: whether he said it in a speech, say, or whispered it to a friend, or wrote it in his diary or published it in a book. That would never happen today, of course. We now believe that the “how” of a communicative act is of huge importance. We would say that Mao posted that power comes from the barrel of a gun on his Facebook page, or we would say that he blogged about gun barrels on Tumblr—and eventually, as the apostles of new media wrestled with the implications of his comments, the verb would come to completely overcome the noun, the part about the gun would be forgotten, and the big takeaway would be: Whoa. Did you see what Mao just tweeted?

News Desk: Does Egypt Need Twitter? : The New Yorker

Book Review - The Net Delusion - By Evgeny Morozov -

May you live in interesting times, eh? 

Okay, so cyber-utopians might have been a bit misty-eyed.  For the rest of us, saying “I told you so” and trying to disentangle ourselves from the grid is no answer.  Nor should we put our heads in the sand, which is tantamount to saying “let the grid have its way with me.” 

One way to lift your head from the sand is to become more responsible with information.

As Evgeny Morozov demonstrates in “The Net Delusion,” his brilliant and courageous book, the Internet’s contradictions and confusions are just becoming visible through the fading mist of Internet euphoria. Morozov is interested in the Internet’s political ramifications. “What if the liberating potential of the Internet also contains the seeds of depoliticization and thus dedemocratization?” he asks. The Net delusion of his title is just that. Contrary to the “cyberutopians,” as he calls them, who consider the Internet a powerful tool of political emancipation, Morozov convincingly argues that, in freedom’s name, the Internet more often than not constricts or even abolishes freedom.

Book Review - The Net Delusion - By Evgeny Morozov -

Saturday, February 5, 2011

If you value truth, prove yourself wrong.

Unwavering loyalty to an idea is incompatible with responsible information creation, dissemination, and consumption.  Or as Richard Feynman said (paraphrasing from a distant memory here…) A scientist is obliged to try to prove himself wrong.

Here’s an on-topic snippet from the home page of The Cultural Cognition Project:

Are We Watching a Game?

In debates over climate change, gun control, the HPV vaccine, and myriad other risks, Americans respond to scientific data in much the same way sports fans react to disputed calls on the playing field--cheering or booing based on how the evidence affects their "team." A paper published in Nature links this dynamic to cultural cognition and addresses what can be done to counteract it.

And here is a chilling snippet from the aforementioned paper in Nature.

Cultural cognition also causes people to interpret new evidence in a biased way that reinforces their predispositions. As a result, groups with opposing values often become more polarized, not less, when exposed to scientifically sound information.

Thursday, February 3, 2011

NOVA | Magic and the Brain

A program on the telly last night got me thinking about writing.

Because the brain processes visual stimuli in particular ways, magicians can fool us.  Likewise, hypnotists succeed not through some mystical powers, but through a form of mind control that leverages the realities about how the mind works.  It is probably more accurate to call it “mind encouragement.”

There’s a lesson in here for those who work in the Department of Content Creation, Narrative Sub-Division.  That is, a lesson for those who want to write clearly. 

Writing is mind control.  (Yeah yeah, I should be calling it “mind encouragement” but mind control sounds so much cooler.)  Just as the magician wants to control what the audience sees, the writer wants to control what the reader thinks—the flow of ideas into the reader’s mind.  And just like the magician or the hypnotist, the writer does this by leveraging what we know about cognitive processes:

Readers (of English) seek out certain information in certain places within sentences and paragraphs.  Effective writers (of English) design their sentences accordingly. 

I’ll say more about this in subsequent posts.  But here is a start: The Science of Scientific Writing by George D. Gopen and Judith A. Swan.  The paper is aimed at those who write scientific papers, but the rhetorical principles apply to all non-fiction writing.

And here is a little something from the PBS website describing last night’s broadcast of Nova Science Now:

Program Description

Are the secrets behind the world's greatest magic tricks actually wired into the human brain? Eccentric magicians Penn and Teller and Las Vegas trickster Apollo Robbins team up with neuroscientists to reveal how our brains process visual information. Can you really believe your own eyes?

NOVA | Magic and the Brain

Tuesday, February 1, 2011

Google Accuses Microsoft's Bing of 'Cheating' -

Probably not the last we’ll hear on this topic. 

FWIW, I just searched for “Google: Bing is Cheating” with both search engines and the results differed.

Google Inc. accused rival Microsoft Corp. of copying its Internet search results, the latest salvo in the competition between the two technology behemoths.

Google made the claim Tuesday after releasing the results of a test it carried out purporting to show how Google's results for search terms were copied weeks later by Microsoft's Bing search engine. Amit Singhal, who helps oversee Google's search engine algorithm, called Bing's behavior "cheating."

In response, Harry Shum, a Microsoft corporate vice president, wrote in a blog post that Google's claims were misleading and amounted to a "spy-novelesque stunt to generate extreme outliers."

"We do not copy Google's results," a Microsoft spokesman said.

Google Accuses Microsoft's Bing of 'Cheating' -

The Times's Dealings With Julian Assange -

Bill Keller’s story about dealing with WikiLeaks founder Julian Assange is worth reading in its entirety.  A few paragraphs stand out.

First, the following paragraph contains an implicit message about producing information: try to be interesting and engaging.

Unlike most of the military dispatches, the embassy cables were written in clear English, sometimes with wit, color and an ear for dialogue. (“Who knew,” one of our English colleagues marveled, “that American diplomats could write?”)

Second, the following passage points out that being interesting and engaging does not mean that every publication, blog post, and utterance needs to be earth-shattering.  (Folks who believe otherwise do a lot of the shouting on TV, or worse, provide an audience for said shouters.)

I’m a little puzzled by the complaint that most of the embassy traffic we disclosed did not profoundly change our understanding of how the world works. Ninety-nine percent of what we read or hear on the news does not profoundly change our understanding of how the world works. News mostly advances by inches and feet, not in great leaps. The value of these documents — and I believe they have immense value — is not that they expose some deep, unsuspected perfidy in high places or that they upend your whole view of the world. For those who pay close attention to foreign policy, these documents provide texture, nuance and drama.

The Times's Dealings With Julian Assange -

Smart Meters, Science and Belief -

Why being a responsible information consumer is harder than it sounds.   You cannot control your amygdala, but you can at least limit the damage when it tries to control you. 

In researching Monday’s article about opposition to smart meters, I [Times reporter Felicity Barringer]  found myself once again facing a dilemma built into environmental reporting: how to evaluate whether claims of health effects caused by some environmental contaminant — chemicals, noise, radiation, whatever — are potentially valid? I turned, as usual, to the peer-reviewed science.

But some very intelligent people I interviewed had little use for the existing (if sparse) science. How, in a rational society, does one understand those who reject science, a common touchstone of what is real and verifiable?

The absence of scientific evidence doesn’t dissuade those who believe childhood vaccines are linked to autism, or those who believe their headaches, dizziness and other symptoms are caused by cellphones and smart meters. And the presence of large amounts of scientific evidence doesn’t convince those who reject the idea that human activities are disrupting the climate.

Smart Meters, Science and Belief -