<br />82' <br /> <br />, <br />tions about exploratory findings is that of multiplicity. <br />The chance that, in multiple analyses of the same data, <br />some "effect" will be found to be "one-percent signifi- <br />cant" increases rapidly with the number of analyses per- <br />fonned, as well as with the number of other, a priori <br />reasonable, analyses which were not carried out be- <br />cause the data were counter-indicative. Running a <br />large number of analyses and treating them as a priori <br />single tests could well be described as artificial stimula- <br />tion of significance. A P-value of one percent for some <br />obs~rved pattern has quite a different import from a one- <br />percent significant result on the test of the principal <br />hypothesis of the experiment. And yet, it has some <br />weight, if only to lead one to reanalyze other experi- <br />ments for repetitions of the same pattern, or to design <br />new experiments for that purpose. <br />As regards methodology, one must be aware that the <br />multiplicity of explorat.ory analyses, mostly unforeseen <br />at the design stage, does not fit into the Neyman-Pearson <br />two-decision framework. Nor is it possible to "optimize" <br />all these analyses simultaneously, or even to ascertain <br />all the distributions involved. It is therefore necessary <br />to use robust techniques which are more generally valid <br />and give rough indications of the extent of statistical <br />support. Nonparametric tests, especially permutation <br />tests which do not rely on distributional assumptions or <br />independence of observations, are suitable here, as is <br />jackknifing. <br />Braham comments that meteorolgists tend to evaluate <br />findings in terms of the physical sense they seem to make, <br />rather than merely in terms of the statistical support <br />they have. :Meteorologists are completely right in doing <br />so. If an exploratory analysis reveals one pattern at <br />P = '0.10 that makes good physical sense and another <br />pattern at P ,; 0.05 that does not fit in with anything <br />the meteorologist knows, he will obviously, and justifi- <br />ably, concentrate on the first pattern rather than on the <br />second. A single experiment's P-value is only one element <br />of all the evidence the meteorologist must muster in <br />evaluating a finding. <br />To return to Braham's question, the extent to which a <br />statistician should be involved in the meteorology of a <br />cloud seeding experiment depends on the stage of experi- <br />mentation and analysis. The design stage must be a joint <br />venture in which the meteorologist determines the goals <br />of the experiment and proposes the inain variables and <br />hypotheses to be studied. The statistician can then bring <br />the methods of mathematical statistics to bear on efficient <br />experimental design. The more the statistician under- <br />stands the cloud physics involved and the characteristics <br />of meteorological data to be studied, the better he can <br />serve the meteorologist's purpose. If the statistician does <br />not understand the meteorological rationale of the experi- <br />ment, or if he is ignorant of the special features of the <br />variables considered, his design is likely to remain a <br />largely irrelevant txercise in mathematical statistics. An <br />example is the derivation of optimal parametric tests <br />under assumptions of unit-to-unit independence-even <br /> <br />, . 
<br /> <br />:t~._r~~J"r~-t~~~;.)~'o.-';,;d4:-'-,,_.~~r_~i~._,i_&_'~;;;',,;_ <br /> <br />t",,~,~_._~ <br /> <br /> <br />Journal of the American Statistical Association, March 1979 <br /> <br />though it is well-known that there is considerable serial <br />correIa.tion in precipitation. <br />After the design (1) comes the execution (2), followed <br />by calculation of the preordained test and estimate (3). <br />In theory, all this is determined at the design stage, and <br />the experimenter may follow an unequivocal protocol <br />through stage (2) to the conclusion of his experiment. In <br />practice, things do not work quite that way-measure- <br />ments often cannot be taken as planned, treatment <br />methods may vary, some definitions are likely to be <br />found impractical, some units must be changed, and test <br />statistiics can almost always be improved upon. Who is <br />to decide whether these changes are permissible within <br />the predetermined design? <br />To illustrate from the Israeli .rainfall stimulation ex- <br />periment: The 8 P.M. to 8 P.M. daily unit initially used <br />required continuous-time recording gages. When these <br />proved difficult to obtain, was it permissible to substitute <br />ordinary gages and thereafter continue the experiment <br />on an 8 A.M. to 8 A.M. basis (Gabriel and Neumann <br />1978)? When the seeding plane pilots were found to prefer <br />flying within sight of the coa.stline rather than farther <br />out, as originally planned, was it appropriate to shift the <br />definition of the target accordingly? When a suitable <br />concomitant variable was found, was it permissible to <br />redesi~:n the significance test to take that variable into <br />account? When local observations, as well as publica- <br />tions from abroad, suggested that simple rank analyses <br />would not efficiently take into account occasional large <br />seeding; effects, was it permissible to change from the <br />proposed Wilcoxon-Mann- Whitney test to one which <br />now looked more powerful? When the . pilots went on <br />strike, could the lost days be omitted from the analyses, <br />and how was the randomization to continue after they <br />returned to work-as allocated for that date or as al- <br />located for the first day they were on strike? <br />This role of the statistician is not foreseen in the texts. <br />He has to serve as an interpreter and arbiter on the <br />design and on changes made during experimentation. Ap- <br />parently his understanding of the function and rationale <br />of randomization is also required in supervising that this <br />is actually carried out correctly. For similar reasons, the <br />statistician may also be asked to supervise the collation <br />of datl~ and calculation of the teEt statistic. <br />This quality control on the statistical aspects of ex- <br />perimentationis a function which statisticians cannot <br />carry lOut if they are too deeply involved in an experi- <br />ment. If they are on the experimental team, or, worse yet, <br />employed by the meteorologists, their objectivity may be <br />suspect. Indeed, their ability to insist on exact adherence <br />to protocol requires not only independence but also <br />suitablle status-it may be difficult for a fresh statistics <br />Ph.D. to insist that an eminent cloud physicist stick to <br />some seemingly trivial but possibly crucial point in a <br />randomization procedure. 
And indeed, the authority and prestige of the statisticians involved may well be adduced as evidence that the procedures were properly carried out.
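Returning briefly to the methods endorsed earlier: the permutation (rerandomization) tests recommended for exploratory work, and implicit in the question of replacing the Wilcoxon-Mann-Whitney test by a statistic more sensitive to occasional large seeding effects, can be sketched in a few lines. The data and names below are hypothetical, not taken from the Israeli experiment, and the sketch assumes simple random allocation of days to seeding.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical daily rainfall amounts (mm); illustrative values only.
seeded = np.array([5.2, 0.0, 12.7, 3.1, 8.4, 0.6, 15.3])
unseeded = np.array([4.1, 0.0, 6.8, 2.2, 0.9, 7.5, 1.4])

# A difference of means is sensitive to occasional large effects in a
# way that a pure rank statistic is not.
observed = seeded.mean() - unseeded.mean()
pooled = np.concatenate([seeded, unseeded])

n_perm = 100_000
count = 0
for _ in range(n_perm):
    perm = rng.permutation(pooled)     # rerandomize days to "treatments"
    diff = perm[:seeded.size].mean() - perm[seeded.size:].mean()
    count += diff >= observed          # one-sided: seeding increases rain

# The reference distribution is generated by the randomization itself,
# so no distributional assumptions about rainfall are required.
print(f"permutation P-value: {count / n_perm:.4f}")
```

A test of this kind gives the rough indication of the extent of statistical support called for above, while respecting the allocation scheme actually used in the design.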