Planning Surface Water Data Programs--Use of the flood-peak data file for defining flood-flow characteristics

In Reply Refer To:                                    October 17, 1969
EGS-Mail Stop 415

SURFACE WATER BRANCH TECHNICAL MEMORANDUM NO. 70.08

To:      Regional Hydrologists and District Chiefs
From:    Chief, Surface Water Branch
Subject: Planning Surface Water Data Programs--Use of the flood-peak data file for defining flood-flow characteristics

Flood-peak information discussed in the June 27, 1969, Surface Water Branch Technical Memorandum No. 69.11 is currently being placed in a computer data file. After the input information for a Part has been edited and entered into the file, each District will be sent printout sheets listing the stored information for its gaging stations. This listing is to be examined and any desired corrections entered in red directly on the printout sheets. It is unnecessary to attempt to resolve any error messages that appear on the output sheets. The District's main concern is expected to be the correction of discharge figures that may have been revised since publication of the nationwide series of flood-frequency reports.

To allow us to follow a rather tight schedule for correction of stored information, please return the printout sheets and your comments to A. Rice Green, Code: 4027-5402, within two weeks from the date you receive them. Requests for using the flood-peak data file to obtain log-Pearson Type III frequency computations needed for the project of Planning Surface Water Data Programs should be sent with the returned printout sheets. To assure that the proper computations are requested, the project chief will have to inspect the information printout and select the input data for frequency computations.
The remainder of this memorandum gives instructions and guidelines for selecting data for computations, for ordering computations, for interpreting output from the frequency computations, and for occasional adjustment of computed curves.

Frequency computations based on information in the flood data file are obtained by means of a new computer program, number C166, "Retrieve Peak Flow Data for Log-Pearson Analysis," by W. L. Isherwood. This program takes information from the file, prints the information as output, and then enters all the non-zero discharges, or a user-specified part of them, into the "Log-Pearson Type III Analysis" program W4014. Computations and output of program W4014 are described in the preliminary Techniques of Water Resources Investigations manual "Log-Pearson Type III Frequency Analysis by Computer" (Thomas, 1968).

Selection of discharges for computation

A most important consideration in ordering frequency computations by program C166 is the selection of the discharges to be used. Ideally, we wish to select discharges representing an unbiased time sample for a period of homogeneous record. To avoid bias, it is preferable to use in the computations only those flood peaks that occurred while the gage was in operation. For example, if a gage was operated from 1930-67, but an information search had established the stages and discharges of the 1903 and 1913 floods, these historic peaks will appear in the data file. We should include only the 1930-67 discharges in the frequency computations; the historic information available for the 1903 and 1913 floods can be used in a later analysis of the computer output.

A less clearcut case of bias is the so-called "outlier" problem, where a flood observed during the period of gaging is known to be the largest in a much longer time span. As an example, the 1955 flood is known to be the largest in 100 years at a New England site gaged since 1935.
We don't know for sure whether the 1955 flood should be included in frequency computations, so it is suggested that the frequency computations be performed twice for this flood record--once including the extreme peak and once omitting it. Later comparison of the two outputs, along with consideration of the historic information, will assist in defining the most useful frequency relation for the site.

At many sites the construction of reservoirs or of extensive urban areas has significantly altered the flood characteristics during the period of gaging. In ordering flood-frequency computations for the project of Planning Surface Water Data Programs, only the natural-flow portion of the record will be used.

Ordering computations

Log-Pearson Type III computations using program C166 are ordered by entering coded instructions on any 80-column coding form, such as 9-1633, 9-1633A, or 9-1506. One line of the coding form (representing one punch card) is required for each desired computation. The information to be coded is the station identification number and, when all floods are not used, the water years of the annual floods to be used in the computations. Information is coded as follows:

Column 1 - blank.
Columns 2-9 - the station identification number with the dash and decimal omitted.

If no other information is coded, all non-zero discharges for the station will be used in the frequency computations, regardless of the number of breaks in the sequence of water years. A segment of the available discharge record may be used by coding as follows:

Columns 10-13 - the first water year of the segment to be used.
Columns 14-17 - the last water year of the segment to be used.

If one or two peaks are considered "outliers"--that is, they are known to be the largest floods in a period of time far greater than the period of gaging--these peaks may be omitted from computation by coding in

Columns 18-21 - the water year of the first flood discharge to be omitted from computations
and in

Columns 22-25 - the water year of the second flood discharge to be omitted from computations.

Any number of computations may be ordered in one job, and each computation requires about one second of CPU time ($0.20+). Station numbers need not be arranged in sequential order.

Interpretation and adjustment of computed frequency curve

The flood-magnitude-frequency curve computed by program C166 is based on the assumption that the flood discharges are a representative time sample from a population of flood events described by a log-Pearson Type III probability distribution. This assumption should be verified in three steps before the computed frequency curve is accepted as applicable to a site.

Step 1 of the verification is to check that all flood data during the sample period were used in the computations. At some sites zero flows were recorded, and at some crest-stage gage sites the annual peaks of a few years may not have reached the minimum recordable stage, so that the discharge is known only to be less than some specific value. These discharges were not used in the computations. The computed frequency curve is therefore based on conditional probabilities and must be adjusted. The adjustment of conditional probabilities is made as explained by Jennings and Benson in "Frequency curves for annual flood series with some zero events or incomplete data" (Water Resources Research, v. 5, no. 1, p. 276-280, 1969). The adjustment is simple and requires only that the computed conditional probability of exceedance be multiplied by n/N, the ratio of the number of annual flood discharges used to the total number of years of flood record. After this adjustment the desired even-valued probabilities are no longer available; it is necessary to replot a frequency curve of adjusted probability versus discharge and then pick off the discharges corresponding to the desired even-valued probabilities or recurrence intervals.
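The Step 1 adjustment can be sketched in modern terms as follows. This is a minimal illustration of the Jennings-Benson n/N multiplication, not a reproduction of program C166 or W4014 output; the function names and the sample probabilities and discharges are hypothetical.

```python
import math

def adjust_probabilities(cond_probs, n, N):
    """Multiply each conditional exceedance probability by n/N, where
    n = number of annual flood discharges used in the computation and
    N = total years of flood record (including the zero-flow or
    below-gage-base years that were excluded)."""
    return [p * n / N for p in cond_probs]

def discharge_at(target_prob, probs, discharges):
    """Read a discharge at a desired even-valued exceedance probability
    from the adjusted curve, interpolating linearly in log discharge.
    probs must decrease while discharges increase."""
    pairs = list(zip(probs, discharges))
    for (p1, q1), (p2, q2) in zip(pairs, pairs[1:]):
        if p1 >= target_prob >= p2:
            f = (p1 - target_prob) / (p1 - p2)
            return math.exp(math.log(q1) + f * (math.log(q2) - math.log(q1)))
    raise ValueError("target probability outside computed range")

# Hypothetical example: 36 non-zero annual peaks used, N = 40 years of record.
cond = [0.50, 0.10, 0.04, 0.02]           # conditional exceedance probabilities
q = [1200, 3400, 4800, 5900]              # corresponding discharges, in cfs
adj = adjust_probabilities(cond, 36, 40)  # each probability scaled by 36/40 = 0.9
q50 = discharge_at(0.02, adj, q)          # 50-year flood read from adjusted curve
```

In practice the memo's procedure is graphical--replot adjusted probability against discharge on probability paper and pick off the desired values; the log-linear interpolation here is only a simple stand-in for that plot.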
Step 2 is to incorporate historical flood information into the frequency relation. The recurrence intervals of the peaks with historical information can be computed and added to the plot. If the analyst finds that this information would significantly alter the computed curve, he should revise the upper part of the curve graphically. Comparison of curves computed with "outliers" included and omitted is often useful for defining the final frequency curve.

Step 3 is to judge visually the goodness-of-fit of the computed curve to the data points on the plot. In comparing goodness-of-fit it should be kept in mind that sampling error always exists in an individual station record; the shorter the record, the larger this error. Therefore we cannot expect a good fit for every record. Goodness-of-fit should be judged on a regional basis rather than on the basis of individual curves. If the computed curves, for some ranges of flow, are higher than the data for some stations and lower than the data for others, and if this variation is random, the Pearson curves should be considered satisfactory. If the curves tend to depart from the data consistently over a whole state or over a large part of a state, that is sufficient reason to consider the Pearson fitting unsatisfactory and to go to graphical fitting. An exception may exist for an individual curve that has a very low annual peak, resulting in a poor fit (usually too low) at the upper end. This is adequate reason to redefine the curve graphically.

Rolland W. Carter

WRD Distribution: A, B, S, FO