The Journal of Heredity 75:501-502. 1984

**The Too-Good-to-be-True Paradox and Gregor Mendel**
by Ira Pilgrim

In 1936 R.A. Fisher (1) stated that, on the basis of a chi-square analysis, he believed that Gregor Mendel's data had been falsified. Fisher's method and conclusion appear to have been accepted without challenge, save for Weiling (4), who points out that those who confirmed Mendel's results also obtained "too good" results, and Orel (2), who defends Mendel based upon what is known about his character. If Fisher's paper had been ignored, as Mendel's was, this rebuttal would be unnecessary. Unfortunately, as a consequence of its apparently near-universal acceptance by geneticists, Fisher's conclusions now appear as fact in some textbooks. While much has been written about whether or not Mendel counted his peas correctly, no one has, to my knowledge, addressed the more important issue of whether it is possible, as Fisher believed, to detect falsified data by the use of statistical methods. The purpose of this paper is to demonstrate that Fisher's reasoning was faulty and to clear the name of an honest man.

The too-good-to-be-true paradox, like most paradoxes, exists because someone uses the wrong mathematical tool to solve a problem and, like most paradoxes, it sounds so plausible that it is readily believed despite its violating one's common-sense notion of what is true. I have been able to detect four paradoxical elements in Fisher's reasoning:

Fisher contends that results can be "too good." The implication is that if a scientist testing a theory that predicts a 50:50 ratio obtains results of 500:500, he should repeat the experiment in order to get worse results, lest he be accused of cheating. Mendel's results agreed with his theory. Why shouldn't they, since his theory was correct? Not so, says Fisher; the closer the data are to expectations, the lower the chi-square-derived probability values become, indicating that the deviation was probably not due to chance alone. Here is the paradox: the closer his results are to his expectations, the less credible they become, and the farther they are from his expectations, the more credible they become. In other words, if his results are excellent, he is accused of dishonesty, and if his results are poor, they do not support his theory. The argument appears still more spurious if one considers that, if his theory is correct (as Mendel's was), 500:500 is more probable than any other result, even though it is unlikely that he will hit it exactly. Anyone who has performed a large number of numeric experiments has found that he occasionally obtains incredibly close agreement with his expectations. This is to be expected, since the most frequently obtained statistics will fall near the middle of the normal curve of error.
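The claim that an exact 500:500 split is the single most probable outcome, yet still unlikely to be hit exactly, can be checked numerically. The following is a minimal Python sketch, assuming a fair 50:50 binomial model with 1,000 independent trials; the helper name `fair_split_probability` and the window of 10 either side of the expected count are illustrative choices, not anything taken from Fisher or Mendel.

```python
from math import comb

def fair_split_probability(k, n):
    """Probability of exactly k of one kind in n fair (50:50) trials."""
    return comb(n, k) / 2**n

n = 1000  # assumed sample size, matching the 500:500 example
pmf = [fair_split_probability(k, n) for k in range(n + 1)]

# The single most probable outcome under a true 50:50 ratio.
most_likely = max(range(n + 1), key=lambda k: pmf[k])

# Probability of landing within 10 of the expected count (490 to 510).
near_centre = sum(pmf[490:511])

print("most probable count:", most_likely)
print("P(exactly 500):", round(pmf[500], 4))
print("P(490 to 510):", round(near_centre, 3))
```

Under these assumptions the most probable count is 500, an exact hit is rare, and a large share of all outcomes still falls close to expectation.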

Fisher's reasoning also implies that the smaller the sample, the more credible the results. If one does an experiment using a sample of ten, expects 50%, and obtains five of one kind and five of another, the results seem reasonable (probability ≈ 0.25). However, with a sample of 1,000, a 500:500 result is less probable (probability ≈ 0.025). No matter what values Mendel obtained, he could have been accused of cheating if they were fairly close to expectations, since they were improbable. Poor Mendel; if he was right, he had to be wrong, according to Fisher. It seems obvious that Mendel wished to prove his hypothesis without a shadow of a doubt. He, therefore, used many more data than would have been necessary to support an already established theory. His successors used considerably fewer data to support him. Mendel suffered from a common delusion among novice scientists: that if he could prove his theory, it could not help but be accepted.
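The point probabilities of an exact even split can be recomputed from the binomial formula. This is a minimal sketch, assuming a fair 50:50 model; `even_split_probability` is a hypothetical helper name introduced only for illustration.

```python
from math import comb

def even_split_probability(n):
    """Probability of exactly n/2 of each kind in n fair trials (n even)."""
    return comb(n, n // 2) / 2**n

for n in (10, 100, 1000):
    print(n, round(even_split_probability(n), 4))
# 10 0.2461
# 100 0.0796
# 1000 0.0252
```

The probability of an exact split shrinks as the sample grows, roughly in proportion to the inverse square root of the sample size.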

The argument has been advanced that the multiplication of the probabilities in all of Mendel's experiments yields a very low probability. Is Fisher justified in multiplying them? I think not. When they are multiplied, no matter what results are obtained, the combined probability is affected less by the individual values than by the number of experiments that were performed. For example, suppose that an investigator does two experiments, each having a P of 0.5. The combined probability is 0.5 squared, or 0.25. If he obtains similar values in a series of twenty experiments, the combined probability is 0.5 to the twentieth power, or about 0.000001.
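The arithmetic in this example is easy to reproduce. The sketch below assumes, as the text does, that every experiment has the same P of 0.5; the variable names and the intermediate experiment counts are illustrative.

```python
p = 0.5  # P assumed for each individual experiment, as in the text

for experiments in (1, 2, 5, 10, 20):
    combined = p ** experiments  # product of identical probabilities
    print(f"{experiments:2d} experiments: combined P = {combined:.7f}")
```

The product falls below one in a million by the twentieth experiment even though no single result is at all unusual.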

Another paradoxical element which, if considered, might well do away with the practice of making ex post facto statistical inferences (or at least lead to their more cautious use) is the fact that an event cannot have two correct and simultaneous probabilities of occurrence. If an event has a probability of occurrence of 0.5 before the period of occurrence of the event, it can only have a probability of zero or one afterwards. For example: The weatherman predicts the probability of rain on a certain day as 80%. After that day has passed, it can only be zero (it didn't rain) or one (it did). It is absurd to say that the automobile accident which happened yesterday could not have been truly accidental because it had a probability of one in a million of occurring. This type of "reasoning" is responsible for the doctrine of divine intervention.

The assumption made by both Fisher and his supporters is tantamount to saying that if someone draws a royal flush in poker or makes eight passes in a row with a pair of dice, he is cheating or the game is rigged. Astronomers have no difficulty with the occurrence of an event with a probability of one in several million, while some geneticists have difficulty accepting an event that occurs once in fifty thousand trials.

Lest the above arguments be insufficiently convincing, I would like to add a final reductio ad absurdum to demonstrate that Fisher was in error and could not, by his method or any other statistical method, demonstrate that Mendel had falsified his data. The argument is derived from a discussion by William Cosby on how fortunate a person is to even exist. I wish to demonstrate that you, the reader, probably never existed.

The number of sperm in a single ejaculate is approximately 250,000,000. The odds against the sperm which produced you having reached the egg are two hundred and fifty million to one. Had any other sperm reached the egg, you would not have existed. The actual odds are considerably greater than that if one considers the number of ejaculations that are usually necessary to cause conception and the odds of your parents having met and married. If one also considers the probability of your father and mother having existed (each at 250,000,000:1), the probability of your existing becomes one in two hundred and fifty million cubed. If we continue this reasoning back to some primordial microorganism, the probability of your existing at all is so small as to be, to all intents and purposes, impossible.

A number of authors have pointed out (some in the form of an accusation) that Mendel probably knew what results to expect from pilot experiments which he had performed. It seems likely that he did not expect ready acceptance of his theories and had decided to add the weight of numbers to bolster his argument. It would have been sheer insanity to have undertaken that difficult and tedious task without first knowing what he expected to find. Despite this, there is no reason to believe that he did not faithfully report all of the data from the series of experiments which he published. He also reported results which did not agree with his expectations and repeated the experiment to show that the aberrant results were probably due to chance deviation. If further evidence of Mendel's integrity is needed, one need only refer to his experiments with Hieracium, the results of which were completely at odds with his hypothesis and with his observations in Pisum. He reported these faithfully, admitting that he could not explain them.

Much as we would like it to be so, Fisher's contention that "fictitious data can seldom survive a careful scrutiny" (1, p. 129) is probably not true. Both honest men and liars can rest easy, since the truth or falsity of data cannot be divined by statistical analysis. Only the time-honored method of the critical repetition of the work can do that, as it did for Mendel's work.

In every experiment, there is a possibility that an investigator's bias may, in some way, skew the data. That anyone might consciously do this is anathema to the spirit of science. It is considered "bad science" if an experimenter doesn't set his experimental conditions to counteract his bias; that is what statistical methods are for, or are supposed to be for. It is customary to give a scientist the benefit of the doubt, and there is always doubt where science is concerned. This does not seem to be the case with Mendel. He has been cruelly slandered, with little basis except the opinion of one scientist who believed that he could detect falsified data by analyzing its probability of occurrence.

There is no evidence that Mendel did anything but report his data with impeccable fidelity. It is to the discredit of science that it did not recognize him during his lifetime. It is a disgrace to slander him now.

REFERENCES

1. R.A. Fisher. Annals Sci. 1:115 (1936). (Reprinted in 3.)

2. V. Orel. BioScience 18:776 (1968).

3. C. Stern and E.R. Sherwood. The Origin of Genetics: A Mendel Source Book (W.H. Freeman & Co., San Francisco and London, 1966).

4. F. Weiling. Der Züchter 36:359 (1966).