Planning Surface Water Data Programs--Use of the flood-peak data file for defining flood-flow characteristics

In Reply Refer To:                                         October 17, 1969
EGS-Mail Stop 415


To:       Regional Hydrologists and District Chiefs

From:     Chief, Surface Water Branch

Subject:  Planning Surface Water Data Programs--Use of the
          flood-peak data file for defining flood-flow
          characteristics

Flood peak information discussed in the June 27, 1969, Surface
Water Branch Technical Memorandum No. 69.11 is currently being
placed in a computer data file.  After the input information for a
Part has been edited and entered into the file, each District will
be sent printout sheets listing the stored information for its
gaging stations.  This information listing is to be examined and
any desired corrections are to be entered in red directly on the
printout sheets.  It will be unnecessary to attempt to resolve any
error messages that appear on the output sheets.  The District's
main concern is anticipated to be the correction of discharge
figures that may have been revised since publication of the
nationwide series of flood-frequency reports.

In order to allow us to follow a rather tight schedule for
corrections of stored information, please return the printout
sheets and your comments to A. Rice Green, Code:  4027-5402,
within two weeks from the date you receive them.

Requests for using the flood-peak data file to obtain log-Pearson
Type III frequency computations needed for the project of Planning
Surface Water Data Programs should be sent with the returned
printout sheets.  To assure that proper computations are
requested, the project chief will have to inspect the information
printout and select the input data for frequency computations.
The remainder of this memorandum gives instructions and guidelines
for selecting data for computations, for ordering computations,
for interpreting output from the frequency computations, and for
occasionally adjusting computed curves.

Frequency computations based on information in the flood data file
are obtained by means of a new computer program, number C166,
"Retrieve Peak Flow Data for Log-Pearson Analysis," by W. L.
Isherwood.  This program takes information from the file, prints
the information as output, and then enters all the non-zero
discharges, or a user-specified part of the non-zero discharges,
into the "Log-Pearson Type III Analysis" program W4014.
Computations and output of program W4014 are described in the
preliminary Techniques of Water Resources Investigations manual
"Log-Pearson Type III Frequency Analysis by Computer," (Thomas,

Selection of discharges for computation

A most important consideration in ordering frequency computations
by program C166 is the selection of discharges to be used.
Ideally, we wish to select those discharges representing an
unbiased time sample for a period of homogeneous record.

In order to avoid bias, it is preferable to use in the
computations only those flood peaks that occurred during the time
a gage was in operation.  For example, if a gage was operated from
1930 to 1967, and an information search has established the stages
and discharges of the 1903 and 1913 floods, these historic peaks
will appear in the data file.  We should include only the 1930-67
discharges in the frequency computations.  The historic
information available for the 1903 and 1913 floods can be used in
a later analysis of the computer output.
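
The selection rule in this example can be sketched in Python as
follows.  This is only an illustration under stated assumptions:
the function name, the dictionary layout, and the discharge values
are hypothetical and do not represent the actual layout of the
flood-peak data file or the internals of program C166.

```python
def split_gaged_and_historic(peaks, first_gaged_year, last_gaged_year):
    """Separate peaks observed during gage operation from historic peaks.

    `peaks` maps water year -> peak discharge.  This layout is an
    assumption made for the sketch, not the format of the data file.
    """
    gaged = {}
    historic = {}
    for year, discharge in sorted(peaks.items()):
        if first_gaged_year <= year <= last_gaged_year:
            gaged[year] = discharge      # enters the frequency computation
        else:
            historic[year] = discharge   # held back for later analysis
    return gaged, historic


# Hypothetical discharges, in cubic feet per second, for illustration only.
example_peaks = {1903: 41000, 1913: 38500, 1930: 12400, 1931: 9800, 1967: 15100}
gaged, historic = split_gaged_and_historic(example_peaks, 1930, 1967)
print(sorted(gaged))     # water years used in the computation
print(sorted(historic))  # water years reserved for the later analysis
```

In this sketch the 1903 and 1913 peaks are kept rather than
discarded, since the historic information is still needed when the
computer output is analyzed.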

A less clearcut case of bias is involved in the so-called
"outlier" problem, where a flood observed during the period of
gaging is known to be the largest in a much longer time span.  As
an example, the 1955 flood is known to be the largest in 100 years
at a New England site gaged since 1935.  For such a record we do
not know whether the 1955 flood should be included in frequency
computations, so it is suggested that the frequency computations
be performed twice for this flood record--once including the
extreme peak and once omitting it.  Later comparison of the two
outputs along with consideration of the historic information will
assist in defining the most useful frequency relation for the

At many sites the construction of reservoirs or of extensive urban
areas has significantly altered the flood characteristics during
the period of gaging.  In ordering flood-frequency computations
for the project of Planning Surface Water Data Programs, only the
natural-flow portion of the record will be used.

Ordering computations

Log-Pearson Type III computations using program C166 are ordered
by entering coded instructions on any 80-column coding form, such
as 9-1633, 9-1633A, or 9-1506.  One line of the coding form
(representing one punch card) is required for each desired
computation.  Information to be coded is the station
identification number and, when not all floods are used, the water
years of the annual floods to be used in computations.
Information is coded as follows:

Column 1 - blank

Columns 2-9 - the station identification number with the dash and
decimal omitted.

If no other information is coded, all non-zero discharges for the
station will be used in the frequency computations, regardless of
the number of breaks in the sequence of water years.

A segment of the available discharge record may be used by coding
as follows:

Columns 10-13 - the first water year of the segment to be used.

Columns 14-17 - the last water year of the segment to be used.

If one or two peaks are considered "outliers," that is, each is
known to be the largest flood in a period of time far greater than
the period of gaging, these peaks may be omitted from computation
by coding as follows:

Columns 18-21 - the water year of the first flood discharge to be
omitted from computations.

Columns 22-25 - the water year of the second flood discharge to be
omitted from computations.
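
The column layout described above can be sketched in Python as a
small routine that builds one 80-column card image.  This is a
hedged illustration only: the function name is hypothetical, the
station number shown is made up, and the sketch assumes that the
station number fills all eight columns once the dash and decimal
are removed, which this memorandum does not state.

```python
def format_c166_card(station_id, first_year=None, last_year=None, omit_years=()):
    """Build one 80-column card image following the column layout above.

    Hypothetical helper for illustration; not part of program C166.
    """
    # Columns 2-9: station number with the dash and decimal omitted.
    digits = station_id.replace("-", "").replace(".", "")
    if len(digits) != 8 or not digits.isdigit():
        # Assumption of this sketch: the number fills all eight columns.
        raise ValueError("station number must reduce to 8 digits")
    if len(omit_years) > 2:
        raise ValueError("no more than two peaks may be omitted")

    def year_field(year):
        # A blank 4-column field means the option is not coded.
        return "    " if year is None else f"{year:4d}"

    omitted = list(omit_years) + [None, None]
    card = (
        " "                        # column 1: blank
        + digits                   # columns 2-9
        + year_field(first_year)   # columns 10-13
        + year_field(last_year)    # columns 14-17
        + year_field(omitted[0])   # columns 18-21
        + year_field(omitted[1])   # columns 22-25
    )
    return card.ljust(80)          # pad to the full card width


# Two cards for one hypothetical station: the 1935-67 segment with all
# peaks, and the same segment with the 1955 "outlier" omitted.
with_outlier = format_c166_card("01-2345.00", 1935, 1967)
without_outlier = format_c166_card("01-2345.00", 1935, 1967, omit_years=(1955,))
print(repr(with_outlier[:25]))
print(repr(without_outlier[:25]))
```

Ordering the pair of cards in one job gives the two computations
suggested earlier for a record containing a suspected outlier.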

Any number of computations may be ordered in one job, and each
computation requires about one second of CPU time ($0.20+).
Station numbers may be in random sequence; i.e., they need not be
arranged in sequential order.

Interpretation and adjustment of computed frequency curve

The flood-magnitude-frequency curve computed by program C166 is
based on the assumption that the flood discharges are a
representative time sample from a population of flood events
described by a log-Pearson Type III probability distribution.
This assumption should be verified by three steps before accepting
the computed frequency curve as applicable to a site.

Step 1 of the verification is to check whether all flood data for
the sample period were used in computations.  At some sites zero
flows were recorded and at some crest-stage gage sites the annual
peaks of a few years may not have reached the minimum recordable
stage, so that the discharge is known only to be less than some
specific value.  These discharges were not used in the
computations.  The computed frequency curve is based, therefore,
on conditional probabilities, and must be adjusted.  The
adjustment of conditional probabilities is made as explained by
Jennings and Benson in "Frequency curves for annual flood series
with some zero events or incomplete data" (Water Resources
Research, v. 5, no. 1, p. 276-280, 1969).  The adjustment is
simple and requires only that the computed conditional probability
of exceedance be multiplied by n/N, the ratio of the number of
annual flood discharges used (n) to the total number of years of
flood record (N).  After this adjustment the desired even-valued
probabilities are no longer available and it is necessary to
replot a frequency curve with adjusted probabilities versus
discharge, and then pick off the discharges corresponding to the
desired even-valued probabilities or recurrence intervals.
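
The n/N adjustment can be sketched in Python as follows.  The
function name and the sample values are hypothetical and are used
only to show the arithmetic; the replotting and the picking off of
even-valued probabilities remain graphical steps and are not
attempted here.

```python
def adjust_exceedance_probabilities(conditional_probs, n_used, n_total):
    """Multiply each conditional exceedance probability by n/N.

    n_used  (n): number of annual flood discharges used in computations
    n_total (N): total number of years of flood record
    """
    if n_used <= 0 or n_used > n_total:
        raise ValueError("n must be positive and no greater than N")
    ratio = n_used / n_total
    return [p * ratio for p in conditional_probs]


# Hypothetical case: 18 discharges used out of 20 years of record.
conditional = [0.5, 0.1, 0.02, 0.01]
adjusted = adjust_exceedance_probabilities(conditional, n_used=18, n_total=20)
for p_cond, p_adj in zip(conditional, adjusted):
    # Recurrence interval is the reciprocal of the exceedance probability.
    print(f"{p_cond:.3f} -> {p_adj:.4f}  (recurrence interval {1 / p_adj:.1f} years)")
```

With n/N equal to 0.9 in this example, a conditional probability of
0.02 becomes 0.018, which is why the adjusted curve must be
replotted before discharges at even-valued probabilities can be
read from it.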

Step 2 is to incorporate historical flood information into the
frequency relation.  The recurrence interval of the peaks with
historical information can be computed and added to the plot.  If
the analyst finds that this information would significantly alter
the computed curve, he should revise the upper part of the curve
graphically.  Comparison of curves computed with "outliers"
included and omitted often is useful for defining the final
frequency curve.

Step 3 is to visually judge the goodness-of-fit of the computed
curve to the data points on the plot.  In judging goodness-of-fit
it should be kept in mind that sampling error always exists in an
individual station record, and that the error is larger the
shorter the record.  Therefore we cannot expect a good fit for
every record.
Goodness-of-fit should be judged on a regional basis rather than
on the basis of individual curves.  If the computed curves, for
some ranges of flow, are higher than the data for some stations
and lower than the data for others, and if this varies randomly,
the Pearson curves should be considered satisfactory.  If there is
a tendency for the curves to depart from the data consistently
over a whole state or over a large part of a state, this would be
sufficient reason to consider the Pearson fitting to be
unsatisfactory and to go to graphical fitting.  An exception may
exist for an individual station whose record includes a very low
annual peak, resulting in a poor fit (usually too low) at the
upper end of the curve.  This is adequate reason to redefine the
curve graphically.

                                       Rolland W. Carter

WRD Distribution:  A, B, S, FO