SOME STATISTICAL TOOLS IN HYDROLOGY

are expected to influence the dependent variable, (2) describing these factors quantitatively, (3) selecting the regression model, (4) computing the regression equation, the standard error of estimate, and the significance of the regression coefficients, and (5) evaluating the results.

Selection of the appropriate factors should not be a statistical problem, but statistical concepts must enter into the process. If the analyst merely wants to know the relation of annual precipitation to annual runoff, he can proceed directly to selection of a model. But if his problem is to make the best possible estimate of runoff, he will include other factors, some of which may be related to each other as well as to runoff. The problem of determining if certain factors are related to the dependent variable requires careful selection of indices describing these factors quantitatively. These indices should accurately reflect the effects, and no two should describe the same thing. It is a characteristic of regression that if a factor is related to a dependent variable and this factor is entered in the regression model twice (as two different variables), the effect on the dependent variable will be divided equally between the two. Thus, if the total effect is small, the result of dividing it in two parts may be to produce nonsignificance in each of the parts. Likewise, several closely related variables may compute as nonsignificant, whereas one properly selected index would show a real effect. Thus, the independent variables should be selected with considerable care; the shotgun approach should not be used.
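The equal division of a duplicated factor's effect can be illustrated numerically. The following sketch uses made-up data (not from this manual): the same predictor is entered twice in a least-squares fit, and the minimum-norm solution splits the true coefficient equally between the two copies.

```python
import numpy as np

# Illustrative data: y depends on x exactly as y = 2 + 3x.
x = np.arange(5.0)
y = 2.0 + 3.0 * x

# Enter the same factor twice: columns are [intercept, x, x].
X = np.column_stack([np.ones_like(x), x, x])

# The design matrix is singular, so ordinary least squares has no
# unique solution; lstsq returns the minimum-norm one, which divides
# the total effect (3) equally between the two duplicate columns.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # approximately [2.0, 1.5, 1.5]
```

With real data and merely similar (rather than identical) variables the split is not exactly equal, but the total effect is still shared, which is how two copies of a small effect can each compute as nonsignificant.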
Another consideration in selection of variables is to avoid having a variable, or a part thereof, on both sides of the equation. Such a condition may be acceptable for certain problems, but the results must be evaluated carefully. A spurious relation may result, or the relation may be correct but its reliability difficult to assess. Benson (1965) described ways in which spurious relations may be built into a regression.

The user of the regression method should understand the effect of related independent variables on the computed regression coefficients. If the independent variables are entirely unrelated, the simple regression coefficients and the corresponding partial regression coefficients would be the same. However, such conditions rarely occur in nature. The multiple regression method provides a way of separating the total effect of the independent variables into the effect of each independent variable and an unexplained effect. Consider the simple regression

    Y = a + b1X1 ± error,    (1)

where Y also is affected by another variable, X2, which is related to X1. The regression using X1 and X2 will be

    Y = a' + b1'X1 + b2X2 ± error,    (2)

where b1' ≠ b1. If X1 and X2 are the only variables affecting Y (and the effects are linear), then equation 2 completely describes Y, and b1' and b2 are the true values of the regression coefficients (except for sampling errors). If X1 and X2 are positively correlated with each other and with Y, consider the effect on the magnitude of b1. For each value of X1 in equation 1, Y will appear to be more closely related than it actually is because X2 increases with X1 and its influence on Y is real though unmeasured.
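The inflation of b1 described here can be checked with a small numerical sketch (hypothetical data, not from the original): fit equation 1 and equation 2 to data in which X2 rises with X1, and compare the two coefficients of X1.

```python
import numpy as np

# Hypothetical data: X2 increases with X1, and both affect Y.
x1 = np.arange(20.0)
x2 = 0.5 * x1 + np.sin(x1)      # correlated with x1, but not collinear
y = 1.0 + 2.0 * x1 + 1.5 * x2   # true coefficients: b1' = 2, b2 = 1.5

# Equation 1: simple regression of Y on X1 alone.
b1 = np.polyfit(x1, y, 1)[0]

# Equation 2: multiple regression of Y on X1 and X2.
X = np.column_stack([np.ones_like(x1), x1, x2])
_, b1_prime, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

# Because X2's real influence goes unmeasured in equation 1,
# b1 absorbs part of it and exceeds the true value b1'.
print(b1 > b1_prime)  # True
```

The multiple regression recovers b1' = 2 almost exactly, while the simple regression coefficient b1 is noticeably larger, since it carries part of X2's effect.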
Therefore the regression coefficient b1 is larger than its true value b1'.

Similar changes in b1 and b2 would occur if another factor, related to X1, X2, and Y, were included in the regression. These changes in the magnitudes of the regression coefficients due to addition or deletion of a variable are characteristic of regression. They are sometimes interpreted as indicating that partial regression coefficients have no physical meaning. Such interpretations are not necessarily correct. If the variables used in the regression are selected on physical principles and the effects of each of the variables are appreciable, then the partial regression coefficients should be in accord with physical principles. In fact, it is good practice to compare the sign and the general magnitude of each partial regression coefficient with that expected. Benson (1962, p. 52-55) made a thorough comparison of this kind.

The regression coefficients of certain variables may change sign when another related variable is added to or deleted from the regression. This effect may result because (1) the variable is not a good index of the physical feature represented, (2) the effect of the var-