
Section 3.5 Error Analysis

Fake news is one type of “misinformation” or “disinformation.” However, much of the “news” we encounter is simply misleading. Some people or news outlets convey misleading quantitative information unconsciously; some do it deliberately. Researchers often present numerical results, and conclusions based on those results, without putting forth enough effort to calculate and report the uncertainties and inaccuracies in those results. Perhaps more insidiously, many professionals construct quantitative information to serve a predisposition, a political goal, or a business objective. What is an individual reader or consumer of this information to do?

One thing you can do is apply something called uncertainty and error analysis. This can be done informally as you read through a piece, but error analysis is also a rigorous mathematical discipline in its own right.

Let’s first consider a real-world situation in which uncertainty proved to be catastrophic. In 1986, the NASA Space Shuttle Challenger, carrying the first civilian passenger, exploded shortly after takeoff. A post-mortem, or after-event, analysis was conducted, and the NASA team struggled to explain to the public what had happened, until Richard Feynman, a well-known physicist, dropped a rubber O-ring (a part used to seal off an area) into a beaker of ice water and showed how the rubber became inflexible. It turned out that the failure of this O-ring was the root cause of the explosion.

The ability of that O-ring to hold up in the near-freezing temperatures at launch was an uncertainty that had propagated through to the launch event. It may have been that the NASA engineering team neglected to think through all of the components and how they might fare under adverse conditions. It may have been that the launch team made a fatal assumption “in the moment” that the O-ring would remain stable enough. Whatever the deeper explanation, uncertainty surrounding that O-ring’s behavior at those temperatures led to a catastrophe. Uncertainty analysis, which is also more broadly categorized as risk assessment, is designed to flag such issues, especially for enormously complex engineering systems like a spacecraft.

Now let’s turn to error analysis as a discipline. Unless the treatment is highly theoretical, most quantitative and numerical analysis begins with measurements of some kind. Temperature is a measurement. So is an answer to a question in an opinion poll. All measurements have error. Unfortunately, that error propagates through the analysis and into whatever the analysis supports, and can even get magnified.

Here is a simple example of how error gets magnified, one that also illustrates what rigor looks like. Suppose I want to calculate the volume of a large cubic concrete tank. That’s pretty easy from a math perspective. The volume of a cube is the length of one side cubed, or \(V = L^3\text{,}\) where \(V\) is volume and \(L\) is length. So, I just have to measure the length of a side. I take a tape measure and measure the side. Let’s say I measure it at 36 inches. There is an inherent inaccuracy to the tape measure; let’s say it is \(\pm 1/8\) in. Oh, and did I make sure that the tape measure was completely straight and rigid from one side to the other? Maybe not. Maybe there’s a 1-inch error because of that. Or maybe I stretched my arms so far that I really only “eyeballed” the length.

Let’s assume there’s a total of 2 in. of error in my measurement. I calculate the volume as \(36 \times 36 \times 36 = 46656 \text{ in}^3\text{.}\) Now, to report the error in this numerical result, I also have to concede that the length could be anywhere between 34 and 38 inches. The volumes based on those two numbers are 39,304 and 54,872 cubic inches. Suddenly, that’s a pretty big spread!
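
To make the propagation concrete, here is a minimal Python sketch (the code and variable names are ours, not part of the original example) that pushes the assumed 2 in. error through the cube formula:

    def cube_volume(length_in):
        """Volume of a cube, in cubic inches, from one side length in inches."""
        return length_in ** 3

    measured_length = 36   # measured side, in inches
    total_error = 2        # assumed total measurement error, in inches

    low = cube_volume(measured_length - total_error)    # 34^3 = 39,304 in^3
    nominal = cube_volume(measured_length)               # 36^3 = 46,656 in^3
    high = cube_volume(measured_length + total_error)    # 38^3 = 54,872 in^3

    print(f"Volume: {nominal:,} in^3, but could be anywhere from {low:,} to {high:,} in^3")

A roughly 6 percent error in the length has become a 16 to 18 percent error in the volume; cubing the length roughly triples the relative error.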

Whether the error is significant or not depends on what you do with the number. The error could have a financial impact if you are buying the materials to construct such a concrete tank. The error could have little or no impact if you just need an estimate for something else.

The critical point is that the error in the original measurement has now been propagated and magnified in the final numerical result.

Now suppose I want to calculate the volume of the cylindrical fiberglass water reclamation tank I am staring at in my backyard as I write this. The volume of a cylinder is the area of the base times the height, and the area of the circular base is \(\pi r^2\text{,}\) so \(V = \pi r^2 h\text{.}\) In this case, I have to make two measurements, height and diameter (the diameter is twice the radius, represented as \(r\) in the equation). I can measure the height with a tape measure, although that tank has been sitting there for 15 years, so I’m not sure how much of the bottom is under the gravel surrounding it. I can “back-calculate” the radius by measuring the circumference with the tape measure.

You can probably see where this is going: two measurements, each with its own sources of error, propagated and magnified into the final numerical result.
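
Here is the same kind of sketch for the cylinder, with two hypothetical measurements and assumed errors (the numbers below are illustrative, not the actual dimensions of the tank in the backyard):

    import math

    def cylinder_volume(circumference_in, height_in):
        """Volume of a cylinder, back-calculating the radius from the circumference."""
        radius = circumference_in / (2 * math.pi)
        return math.pi * radius ** 2 * height_in

    # Hypothetical measurements and assumed errors, in inches
    circumference, circ_error = 120.0, 2.0
    height, height_error = 48.0, 3.0   # larger error: the bottom is partly buried

    worst_low = cylinder_volume(circumference - circ_error, height - height_error)
    nominal = cylinder_volume(circumference, height)
    worst_high = cylinder_volume(circumference + circ_error, height + height_error)

    print(f"Volume: {nominal:,.0f} in^3, possibly {worst_low:,.0f} to {worst_high:,.0f} in^3")

Both errors now compound in the same result, so the spread is wider than either measurement alone would suggest.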

Now consider some numerical results with grave consequences. The forecasts for global climate change and its consequences involve some of the most sophisticated measurements and computer modeling humanity has ever devised. While it might seem overwhelming to consider the error in each of the measurements and how they are propagated, you can take comfort in the fact that thousands of researchers around the world are arriving at similar conclusions. Only a small percentage of scientists deny that human-induced climate change is advancing dangerously.

But here’s how politics comes into play. Climate policy makers, and climate deniers, use a parameter called the “social cost of carbon” to “monetize” the effects of climate change. This is a very complex analysis. But realize that the Environmental Protection Agency in the Obama administration used a figure around $50/ton of carbon, while Trump administration officials revised the analysis and came up with a figure of around $7/ton. How could such an important figure be so different? The answer is that the analysis involves several key assumptions. Change the assumptions and you can drastically change the result.
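
One of those assumptions, for example, is the discount rate used to convert future climate damages into today’s dollars. The toy calculation below is not the government’s actual model, and the $100 damage figure is made up purely for illustration; it only shows how a single assumption can shrink the final number dramatically:

    def present_value(future_damage, years, discount_rate):
        """Discount a damage incurred 'years' from now back to today's dollars."""
        return future_damage / (1 + discount_rate) ** years

    damage_in_future = 100.0   # hypothetical: $100 of climate damage 50 years from now
    years = 50

    for rate in (0.03, 0.07):
        today = present_value(damage_in_future, years, rate)
        print(f"At a {rate:.0%} discount rate, that damage is worth ${today:.2f} today")

    # At 3% the answer is about $22.81; at 7% it is about $3.39.
    # Same future damage -- the nearly sevenfold difference comes entirely from the assumption.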

Many news reports containing quantitative results will give, at best, a cursory view of the error in those results. Journalists don’t have an unlimited number of column inches to devote to the story. Researchers are often limited by the number of pages a journal will publish. But realize that every one of these numerical results has sources of error, often significant or even huge, and many of them will not be exposed. It is up to the reader to think through them.

So, one root cause of misinformation or misleading information is the error in the original measurements. Sadly, much of what we are presented as “information” is based on measurements which are not physical measurements, as in the examples above, but are data arrived at in various ways. If I want to understand public sentiment about gun violence, for example, I can commission an opinion poll. There are rigorous procedures for taking polls and then doing statistical analysis on the results. But statistical error is very different from, say, the biases inherent in the person formulating the questions for the poll.
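
As a small illustration of what the statistical part covers, here is the standard 95 percent margin-of-error formula for a simple random sample (the 58 percent result and the sample size of 1,000 are hypothetical). Notice what the formula does not capture: biased question wording, a skewed sample, or people who refuse to answer.

    import math

    def margin_of_error(proportion, sample_size, z=1.96):
        """Approximate 95% margin of error for a polled proportion."""
        return z * math.sqrt(proportion * (1 - proportion) / sample_size)

    p, n = 0.58, 1000   # hypothetical: 58% of 1,000 respondents say "yes"
    moe = margin_of_error(p, n)
    print(f"58% +/- {moe:.1%}")   # roughly +/- 3 percentage points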

Sadly, much news we consume is not even intended to pursue an “objective truth” (even as a direction), but instead is constructed to support a position. Consulting firms, government departments, policy shops, NGOs, and others spend or raise millions of dollars to generate reports which are consumed by elected officials, and breathlessly reported by journalists, but that often amount to a string of assumptions wrapped around cherry-picked data presented in dazzling, colorful graphs (see next section).

Our message here is that fake news often has quantitative and numerical components to it, but you don’t need to know math or be a “quant type” to think a bit more deeply about those results. You do need to be a healthy skeptic, however, and simple “qualitative tools,” like error analysis, will help you.

In closing this section, it is important not to throw your hands up and say “all information is tainted, there is no truth.” “Facts” and “truth” are asymptotic, which means simply that we can get closer and closer even if there is always some doubt or “error.” Error and uncertainty, however, are compensated for as you consider the following “path” towards real knowledge and away from fake news:

  • Coincidence or randomness – one thing that happens (“I was assaulted in the park so this must be a high crime neighborhood”) or two things that happen around the same time (“my friend was assaulted in the same park the same week”) may constitute coincidence or randomness but not real information. Something similar can be said for a researcher or policy expert who reports results from an analysis or experiment that has not been repeated or validated by others.

  • Correlation – an association between two things (gun sales and violent crime, e.g., in a specific region) may show a weak or strong statistical correlation, or a significant association, which would have to be corroborated to constitute information you can rely on.

  • Causation – something caused by another thing (human industrial and consumption activity and global average temperature rises, as opposed to natural causes of those temperature rises) is a much higher bar to scale. This requires many independent and repeatable studies, perhaps coming at the problem from different directions.

  • Convergence – Specialists and experts collaborate and review each other’s work and begin to converge on a common “theory of the case” or explanation for why something is occurring.

  • Consensus – experts can gather in a room and nod their heads in agreement, but so what? Consensus among decision makers and their constituencies is then necessary for any action to be taken based on the information and knowledge.

Error and uncertainty in the information are progressively and incrementally reduced to low or insignificant levels as the analysis proceeds along the path of the “five C’s.”

Any one of these steps alone is insufficient. After all, the townspeople of Trent, Italy, achieved consensus around the fake news of the murder of Simon by the Jews in their community, and acted on the fake news. They did not withhold judgment until the “analysis” came out.