Sunday, March 17, 2013

Reporting Artifact

Every so often, I use what I consider a basic term in conversation or online and come across complete ignorance that the term even exists on the part of my audience. I will here attempt to define ‘reporting artifact’ for future reference.

One often sees in the news a report that ‘twice as many cases of cancer have been diagnosed as last year’. The natural conclusion is that twice as many people got cancer. This is, in fact, a conclusion COMPLETELY unsupported by the data presented. One of the several ways it may be false is a reporting artifact. In this specific case, suppose ten times as many people were screened for cancer as last year. Would the doubling of diagnoses in such a case indicate an increase in cancer? Quite the opposite – the rate of diagnoses per person screened would be one fifth of the prior year’s level. That is a reporting artifact. Broadly stated, if you find more of something only because you LOOKED for it a LOT more, it is a reporting artifact. The flip side holds too: if you stop looking for something and stop finding it, that, again, is a reporting artifact.
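
To put numbers on the screening example, here is a minimal Python sketch. The counts are made up purely for illustration; only the ratios (ten times the screening, twice the diagnoses) come from the example above, and the function name is my own.

```python
def diagnosis_rate(diagnoses, screened):
    """Diagnoses per person screened."""
    if screened <= 0:
        raise ValueError("screened must be positive")
    return diagnoses / screened


# Hypothetical counts: ten times as many people screened, twice as many diagnoses.
last_year = diagnosis_rate(diagnoses=100, screened=10_000)
this_year = diagnosis_rate(diagnoses=200, screened=100_000)

# Ratio of this year's rate to last year's rate.
ratio = this_year / last_year
print(f"{ratio:.2f}")  # 0.20, i.e. one fifth
```

The headline number (diagnoses) doubled while the number that matters (diagnoses per person screened) fell to a fifth. This treats both years' screened groups as comparable samples, which is itself an assumption.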

Criminal statistics are particularly subject to reporting artifacts. The reason is quite simple: people do not always report crimes. Certain crimes are more likely to be reported than others; homicide, for example, produces a body which will usually be found at some point. Robbery produces no such gross physical evidence. You cannot PROVE that a homicide has NOT occurred (unless you can produce everyone who might have been murdered), but bodies tend to turn up eventually and people wonder what happened to the victim, so we can reasonably conclude that most murders make it into the statistics. Not so with robbery – even if you could demonstrate that everything is where it should be, that doesn’t prove there weren’t TWO robberies, the second reversing the perpetrator and victim of the first.

The situation is aggravated by the fact that most crimes go unsolved (especially minor ones) and reporting a crime has negative consequences even for the victim: if nothing else, paperwork and lost time. This produces a disincentive to report a minor crime; if someone steals a small amount of your money, you might be better off working to make more money rather than spending the same amount of time working with the police to solve the crime. The crime might never be solved, and even if it is, you might not get your money back.

Rape is often recognized as an under-reported crime despite being much more serious. Some of the same factors are in play: what has been lost cannot be recovered; even if the assailant is found, the “I say/you say” dynamic frequently prevents a conviction; and even a conviction provides no guarantee that the assailant will not rape someone else in the future. Add to that various cultural stigmas against the victim and the fact that, to report the event, they must discuss taboo issues with strangers. Yes, modern US culture has taboos.

Attempts to compare rape statistics are thus subject to a very high likelihood of reporting artifacts.

As noted above, disease and injury are another common place where reporting artifacts occur. Just about everyone who has a heart attack shows up in statistics somewhere (or so we can reasonably assume; again, we can’t prove it all that well). Just about anyone, however, knows someone who got a paper cut. Most paper cuts do not show up in statistics (ER visits, EMS calls, etc., though some people do in fact seek advanced medical treatment for paper cuts – boy do I wish I was making that up). So if we have a statistic that shows that there are more reported heart attacks than paper cuts, does that mean that heart attacks are more common? Probably not.

If we posted a reward for reporting paper cuts we would introduce yet another reporting artifact; people being people, someone will deliberately give themselves a paper cut in order to get the reward. If we are trying to collect data on accidental paper cuts, we have just distorted our data set.

Even polling can’t eliminate reporting artifacts; people will lie, or forget, or misunderstand what is being asked. So if we poll people about their paper cuts, some will forget how many they’ve gotten, some will deny they went to the hospital, and some will assume that we were asking about all accidental cuts, not just ones from paper. Note this is for something with minimal emotional impact for most people. An issue with political or emotional overtones will usually be worse (gunshots, rape, etc.).

The root of a reporting artifact is that what happens is different from what is observed, and what is observed is different from what is recorded.

Identifying a reporting artifact does not necessarily allow meaningful analysis, especially if multiple artifacts may be at work. Cracks in airplane wings happen. People often, but not always, find them. When they find them they usually, but not always, report them. If a regulatory agency tells operators to go look for specific cracks in specific places, the operators are more likely to find those cracks. Big cracks are more likely to be reported than small cracks. 1st world airlines are more likely to find cracks than 3rd world airlines. 3rd world airlines are more likely to have cracks (older airplanes) than 1st world airlines. If a database lists ten times as many cracks in 1st world airplanes, is it because 3rd world airlines aren’t finding them, or are finding them but not reporting them? Can we know the actual ratio of cracks? Can we prove that the 3rd world airlines have any significant number of unreported cracks? The last is possible, though only with extreme care could such information be used to determine a ‘real’ ratio.

That is the real problem of reporting artifacts: even if you can prove they exist, it is far harder to adjust for them.
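
The chain from what happens, to what is observed, to what is recorded can be sketched as a pair of probabilities. This is a toy model of my own, not anyone's actual method: the function names, the fleet sizes, and every probability below are invented for illustration, and it assumes finding and reporting are independent steps.

```python
def expected_recorded(actual, p_found, p_reported):
    """Expected number of database entries for a given number of real cracks."""
    return actual * p_found * p_reported


def estimate_actual(recorded, p_found, p_reported):
    """Invert the chain; only as good as the two probabilities fed in."""
    if p_found <= 0 or p_reported <= 0:
        raise ValueError("cannot correct for a step that never succeeds")
    return recorded / (p_found * p_reported)


# All numbers below are invented for illustration.
# Fleet A: fewer real cracks, but most are found and reported.
fleet_a = expected_recorded(actual=1000, p_found=0.9, p_reported=0.9)
# Fleet B: twice as many real cracks, but few are found and fewer reported.
fleet_b = expected_recorded(actual=2000, p_found=0.15, p_reported=0.27)

print(round(fleet_a), round(fleet_b))  # 810 81

# Adjusting the record back to reality depends entirely on the guessed
# probabilities: doubling the assumed p_found halves the estimate.
guess_low = estimate_actual(recorded=81, p_found=0.15, p_reported=0.27)
guess_high = estimate_actual(recorded=81, p_found=0.30, p_reported=0.27)
print(round(guess_low), round(guess_high))
```

In this made-up case the database shows ten times as many cracks in fleet A even though fleet B has twice as many. Showing that the artifact exists is easy; the correction is only as trustworthy as the two probabilities, and those are exactly what nobody records.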

Let all those who compare statistics beware.

Friday, March 15, 2013

Precedent

Lately, I’ve been noticing more and more appeals to precedent in arguments. Note, I’m not innocent of this myself.

The idea that precedent is definitive is, in my opinion, highly flawed. Precedent (assuming that it is, in fact, precedent and not something unrelated to the issue in question) is certainly relevant to a discussion or debate, but it should not be considered authoritative.

Precedent is what someone else did, in an earlier time, in a similar but not identical situation (since no two situations can ever be TRULY identical). It does not mean, please note, that what they did was right or had a good outcome even then. If someone decided to round up six million Jews and kill them, there’s a precedent for that. It would be a very, very bad precedent to follow.

Precedent is, however, worth noting. If you can extrapolate your future position from your current position without being informed by one or more precedents, you are quite lucky. Most actions have unforeseen consequences. If you are able to look at the unforeseen consequences of past actions similar to those you are contemplating, then those particular consequences won’t be unforeseen. They may still happen, of course, but at least you can try to prepare for them.

Precedent seems to be especially prevalent in the modern US judicial system, taking precedence (gotta love the English language) even over the written law. This often angers me. If nothing else, it should be noted that judicial precedents are often overturned. Surely something that is repeatedly shown to be flawed should not be relied upon?

As an engineer, of course, I use precedent every day. We call it ‘test results’ or something like that, but we operate on the basic premise that because something happened once (an object of such-and-such characteristics failed under such-and-such conditions) it can, and at some point probably will, happen again. If we have a lot of matching precedents (ten similar objects that all failed under similar conditions) we predict the future with confidence (another similar object will fail under the same conditions). Without precedent there would BE no engineering. You can’t do calculations without some basis to do them upon.

There is, however, a key difference. Engineering is quantified, and respect for the chaotic element is hammered into the brains of up-and-coming engineers. Any good engineer allows a ‘factor of safety’ based on how good their precedents are and the consequences of failure. When precedents fail to be predictive we study them to learn why, then quantify that data and make it part of the NEW precedent.

Human lives are, at present, not subject to being quantified as individuals. The chaotic factors are too high, the variables too uncertain. Engineers can predict with very high confidence the minimum level of force and the application of it needed to destroy a piece of metal of known type. Not so human beings. Part of that is our variability – note the ‘of known type’ bit. If I don’t know whether I’m hitting aluminum or steel I have no idea how much force to apply. Without knowing the grade of steel and its characteristics (hardening? Temperature?) I likewise don’t know enough. Human beings are not so easily tabulated. Our actions are even less so.

Humans in groups are a little easier – as with any large system, the mistakes tend to cancel each other out. This is a basic principle of systems engineering – you need not assume the worst case for all your material properties. Similarly, you need not assume that all people are at the bottom end of the bell curve.
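
The cancelling-out claim can be checked with a toy simulation. The sketch below assumes each individual's deviation is independent and normally distributed, which is my assumption and not something the argument depends on in detail; correlated errors will not cancel this way. The function name and parameters are illustrative.

```python
import random
import statistics


def group_mean_spread(group_size, trials=2000, seed=0):
    """Standard deviation of the group average across many simulated groups."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(trials):
        # Each member deviates from nominal by an independent random amount.
        members = [rng.gauss(0.0, 1.0) for _ in range(group_size)]
        means.append(statistics.fmean(members))
    return statistics.pstdev(means)


single = group_mean_spread(group_size=1)
large = group_mean_spread(group_size=100)
print(f"spread for one individual: {single:.3f}")
print(f"spread for groups of 100:  {large:.3f}")
```

The average of a large group wanders far less than any one member does, which is why the group is easier to predict than the individual. It says nothing about where a particular member sits.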

Again, though, there are parallels: good engineering insists on recognizing that some of your components WILL be at the bottom of spec (or even below it). We must also recognize that some humans in any large group will be at the bottom of the moral bell curve. They are our criminals.

There are, however, even more important divergences. Again, humans are not easily tabulated. If one is presented ten samples of a steel tempered to ten increasing levels of hardness, provided with data for all but the fifth and asked to predict the properties of the fifth, an engineer can do this with high confidence. This is called interpolation, and is possible because one has carefully controlled all but a single variable. Multi-variable systems require many more data points, and have a strong tendency to have points of chaotic behavior. Many bridges had been built similar to the Tacoma Narrows Bridge (famously ‘Galloping Gertie’). Only that EXACT bridge under those EXACT wind conditions was destroyed by resonance. The same bridge in another place might have lasted a hundred years. Had a dozen similar bridges been erected in the same place, ones of both greater and lesser structural integrity would almost certainly have survived. A bridge of equivalent ‘strength’ but different design would have survived. Interpolation based on even very good data could easily have produced a failure like the GG (i.e. building multiple similar bridges in the same area in a range of stiffnesses, sizes, etc. bounding the intended design). A bridge is a complex system. So are people… except people are much worse.
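
The ten-sample steel case can be sketched in a few lines of Python. The hardness levels and strength values are invented for illustration (they are not real material data), and straight-line interpolation between the two neighbouring samples is just the simplest choice, not necessarily what a practicing engineer would use.

```python
def interpolate(x, known):
    """Linear interpolation between the two known points that bracket x."""
    points = sorted(known.items())
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    # Outside the measured range this would be extrapolation, a riskier game.
    raise ValueError("x is outside the range of the known data")


# Hypothetical strength (arbitrary units) at hardness levels 1-10, level 5 missing.
known = {1: 420, 2: 480, 3: 545, 4: 600, 6: 700, 7: 745, 8: 790, 9: 830, 10: 865}

print(interpolate(5, known))  # 650.0
```

This works only because hardness is the single thing that varies between samples. Add a second uncontrolled variable and the same nine points no longer pin down the tenth.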

Worst of all, humans, unlike any other material dealt with by engineers, are self-willed. Steel does not WANT to break (or not break). A bridge does not CARE if it stands or falls. Admittedly, I, like most engineers and non-engineers I know, routinely talk as if this were not the case. We ascribe motivation to objects when they fail (or don’t fail when we expect them to), or blame it on the gods, take your pick. Humans may be aware they are being measured and their actions predicted, and may choose to act on that information. Steel doesn’t know it is about to be cut, and can’t harden itself to resist (or soften itself to be ‘helpful’). Humans may choose to do either one.

And so we get back to my contempt for judicial precedent. Judicial precedent is, by its nature, an attempt to handle an individual – large groups are rarely involved. A single, original precedent is typically cited, rather than the body of decisions already based on that precedent. Again, note my contempt is not for the law – setting a standard is a necessary stage in gathering data. Worst of all, the huge range of variables between any two human beings is routinely ignored in favor of a few similarities.

As yet, we cannot even truly deal with human bodies in engineering terms (most medicines, for all their extensive testing, would be considered to have unacceptably high failure rates in most engineering disciplines). Human brains are, from the limited evidence available, much worse.

I say, leave precedent to the engineers. The justice system can have it only after the medical system gets its error rate down.