Mindhacks discusses a surprising asymmetry. Journalists discussing sampling error almost always emphasize the possibility that the variable in question has been under-estimated.
For any individual study you can validly say that you think the estimate is too low, or indeed, too high, and give reasons for that… But when we look at reporting as a whole, it almost always says the condition is likely to be much more common than the estimate.
For example, have a look at the results of this Google search:
“the true number may be higher” 20,300 hits
“the true number may be lower” 3 hits
There are two parts to this. First, the reporter is trying to sell her story. So she is going to emphasize the direction of error that makes for the most interesting story. But that by itself doesn’t explain the asymmetry.
Let’s say we are talking about stories that report “condition X occurs Y% of the time.” There is always an equivalent way to say the same thing: “condition Z occurs (100-Y)% of the time” (Z is the negation of X.) If the selling point of the story is that X is more common than you might have thought, then the author could just as well say “The true frequency of Z may be lower” than the estimate.
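To make the equivalence concrete, here is a minimal sketch in Python. The numbers (X estimated at 5%, give or take 2 points) are made up for illustration, and the helper name `complement_interval` is hypothetical rather than taken from any study or library.

```python
def complement_interval(low, high):
    """Given a range [low, high] in percent for how often X occurs,
    return the matching range for Z, the negation of X."""
    # The top of X's range maps to the bottom of Z's range, and vice versa.
    return 100 - high, 100 - low

# Hypothetical estimate: X occurs 5% of the time, plus or minus 2 points.
x_low, x_high = 3, 7
z_low, z_high = complement_interval(x_low, x_high)

print(f"X occurs between {x_low}% and {x_high}% of the time")
print(f"Z occurs between {z_low}% and {z_high}% of the time")
```

The high end of X's range (7%) is the low end of Z's range (93%), so “the true number for X may be higher” and “the true number for Z may be lower” describe the same possibility.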
So the big puzzle is why stories are always framed in just one of two completely equivalent ways. I assume that a large part of this is:
- News is usually about rare things/events.
- If you are writing about X and X is rare, then you make the story more interesting by pointing out that X might be less rare than the reader thought.
- It is more natural to frame a story about the rareness of X by saying “X is rare, but less rare than you think” rather than “the lack of X is common, but less common than you think.”
But the more I think about symmetry the less convinced I am by this argument. Anyway, I am still amazed at the numbers from the Google searches.
18 comments
September 26, 2009 at 11:16 pm
Robert Wiblin
If you’re quoting a study or any data, then you would naturally quote the mean directly from the study (especially if you’re a journalist who isn’t confident playing with numbers), but then want to emphasise the potential for it to be more important using the margin of error. A study on crime, for the reasons you point out, would say 0.1% of people were murdered, not “99.9% of people weren’t murdered”.
September 26, 2009 at 11:33 pm
Paul
well, it’s up to 9 hits now, since you’ve now talked about it in your blog post and others have referenced the post–so maybe discussion of the topic will push the results in the opposite direction.
although the number of discussions on the topic is low now, perhaps the true number may be higher in the future….
September 27, 2009 at 3:22 pm
Erik Brynjolfsson
Yes, there appear to be only a handful of hits for “the true number may be lower”. However, it’s quite possible that Google missed some, so the true number may be higher.
September 27, 2009 at 3:35 pm
Taeyoung
But look at “may overstate the number.” I get 472,000 hits for that. So the asymmetry may just be an asymmetry in preferred phrasings.
September 27, 2009 at 3:53 pm
Stewart Ulm
In nature, the number of things is necessarily limited. We are much more likely to count things with an interest in how many there are, rather than how few there are. In telling that story, we have to under-emphasize the count, or else we are liable to be called exaggerators.
September 27, 2009 at 4:34 pm
mulp
The bias seems to be around the nature of “true” not higher or lower.
Results 1 – 9 of 9 for “The true number may be lower”.
Results 1 – 20 of about 805,000 for “The number may be lower”
Results 1 – 20 of about 41,000 for “The true number may be higher”.
Results 1 – 20 of about 397,000 for “The number may be higher”
September 27, 2009 at 6:48 pm
Frank
Whoa! Erik Brynjolfsson has a blog. Excellent
September 27, 2009 at 6:51 pm
Etl World News | Assorted links
[…] 5. "The true number may be lower" — you don't hear that one so much. […]
September 27, 2009 at 6:52 pm
Alex
As Paul points out, if we can verify X cases, then we would say that we know about X but “the true number may be higher”. For instance: we have found 300 cases of swine flu, but…
September 27, 2009 at 8:37 pm
Fred H Schlegel
Another explanation can be stories where under-reporting is the more likely outcome, not just misreporting, as in crime statistics or even flu diagnoses. While the reporting of statistics is generally dicey, I’m more used to seeing reporters ignore confidence levels than misreport directional error bias.
September 28, 2009 at 6:04 am
Stuart
When you want to prove a point statistically, it’s quite common to use the lowest possible estimates so that your argument isn’t undermined by someone questioning the initial estimate. If your argument still stands with the lowest estimate then it acquires extra weight even if, in reality, the real number might be higher.
September 28, 2009 at 8:28 am
No results found for “true number may be lower or higher” « Knowledge Problem
[…] Hacks (via Cheap Talk and Marginal Revolution) points out news reports often stress when stating an estimated value that […]
September 28, 2009 at 9:18 am
Frank
Yeah — what Stuart said.
September 28, 2009 at 10:58 am
hanmeng
As of 4:48 UTC, Google UK had about 5,420 hits for “the true number may be lower”, but Google US only 18, so it’s true in the US.
March 20, 2014 at 5:32 pm
Liz
Short, sweet, to the point, FREE-exactly as information should be!
September 28, 2009 at 2:46 pm
Calebe
I’m in Brazil and, interestingly, clicking on these links (which lead me to Google.co.uk) yielded the following results:
“true number may be higher” — 42,800
“true number may be lower” — 5,440
Then I did the search again, this time typing the keywords (quotes and all) myself, at Google.com.
“true number may be higher” — 43,100
“true number may be lower” — 5,330
One thing I noticed a few years ago is that Google searches yield different results depending not only on where you are, but which interface language you choose as well.
September 28, 2009 at 4:03 pm
Derek Jones
There is a huge assumption being made here, i.e., that English speakers use the same phrasal form to express these two concepts. What about “actual number may be lower/higher”, which occurs an order of magnitude more frequently (ratio 1/2 for lower/higher), or the use of less/more (yet another order of magnitude greater, ratio 1/2)?
This recurring 1/2 ratio is interesting given that variations on these word forms have similar frequencies.
What is the most common term used to express the concept of “may be lower”? Any corpus lexical semanticists here?