Technical Resources for Regionalization Studies
For questions and comments, email h2osoft@usgs.gov.
USGS regionalization studies typically rely on multiple linear
regression to develop equations that can be used to estimate flow
statistics at ungaged locations. Regression estimates can also be used
to improve estimates of flow statistics at streamgages by taking a
weighted average of the at-site statistic with the regression estimate.
This webpage provides links to policy memos, software, training classes, and answers to frequently asked questions that support these studies.
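The weighted average described above can be sketched as follows. This is a minimal illustration, not the code of the WIE program: it assumes the at-site and regression estimates are independent, are expressed in log10 units, and are weighted in inverse proportion to their variances. The function name and the example numbers are hypothetical.

```python
import math


def weighted_estimate(log_site, var_site, log_reg, var_reg):
    """Combine two independent log10 estimates by inverse-variance weighting.

    Assumption: the at-site and regression estimates are independent.
    Returns the weighted log10 estimate and its variance.
    """
    if var_site <= 0 or var_reg <= 0:
        raise ValueError("variances must be positive")
    total = var_site + var_reg
    # Each estimate is weighted by the variance of the *other* estimate,
    # so the more precise estimate receives the larger weight.
    log_weighted = (var_reg * log_site + var_site * log_reg) / total
    var_weighted = var_site * var_reg / total
    return log_weighted, var_weighted


# Hypothetical example: at-site estimate of 1,000 and regression estimate
# of 1,500 (same flow units), with log10 variances of 0.01 and 0.03.
log_w, var_w = weighted_estimate(math.log10(1000.0), 0.01,
                                 math.log10(1500.0), 0.03)
q_weighted = 10.0 ** log_w  # back in real (untransformed) units
```

Under these assumptions the weighted variance is always smaller than either input variance, which is why the weighted estimate is considered an improvement over the at-site statistic alone.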
Policy Memos
SW 11.03 Regionalization studies - resources and suggestions
SW 10.05 Weighted estimates of peak flow frequency analysis (WIE Program)
SW 10.04 Availability of the Weighted multiple linear REGression (WREG) program (version 1.0) and User's Guide
SW 09.03 Selection of basin characteristic and streamflow statistic labels for regionalization studies
SW 08.04 Availability of the National Streamflow Statistics (NSS) program (Version 4.0) and accompanying documentation
SW 05.06 StreamStats Advisory Committee and Water Science Center Requirements for StreamStats Proposal Preparation
SW 02.05 Availability of StreamStats informational web page and StreamStatsDB data base
SW 95.02
Announces the availability of WRIR 94-4002, "Nationwide summary of U.S.
Geological Survey regional regression equations for estimating
magnitude and frequency of floods for ungaged sites." The report
contains a PC diskette of version 1.1 of the USGS National Flood
Frequency program, including regression equations and procedures for
estimating flood hydrographs.
SW 93.08
Makes recommendations for use of retransformation methods in regression
models used to estimate sediment loads ("The bias correction problem").
Corrections from SW 93.13 are incorporated here.
SW 89.11
Announces the availability of generalized least squares regression
program for regionalization of low flow characteristics. Recommends low
flow regionalization techniques and describes techniques for undertaking
the analysis.
Software
WREG: http://water.usgs.gov/software/WREG/
The Weighted multiple linear Regression (WREG) program is currently
recommended for USGS regionalization studies. It allows the user to set
up OLS, WLS, and GLS regression equations using text input files. It
adds to the capabilities offered by GLSNET by providing routines for
weighted least squares and for incorporating uncertainty in the skew in
generalized least squares regressions.
WREGShell: Improves input/output handling; WREGShell2013-forWREGv1.05.zip / WREGShellReadMe.v2013.pdf
GLSNET: http://water.usgs.gov/software/GLSNet/
GLSNET was developed by Gary Tasker in the early 1990s (?) and uses WDM
files for input and data management. Identical results are produced by
WREG and GLSNET using the GLS option. GLSNET is being phased out
because the new WREG program has additional capabilities and is easier
for OSW to maintain.
STREAMSTATS: http://water.usgs.gov/osw/streamstats/
StreamStats is a Web-based Geographic Information System (GIS) that
allows users to easily obtain streamflow statistics, drainage-basin
characteristics, and other information for user-selected sites on
streams. StreamStats users can choose locations of interest from an
interactive map and obtain information for these locations. StreamStats
provides a web-based interface for using regression equations to
estimate streamflow statistics.
NSS: http://water.usgs.gov/osw/programs/nss/ or http://water.usgs.gov/software/NSS/
WIE - Weighted Independent Estimates:
ReadMe || FAQ || Version Info || Instructions || WIE Distribution Zip (all files included)
Training Classes
SW1523 Regionalization of Surface Water Statistics
covers multiple linear regression techniques and use of the WREG
software program. (Please contact us for information on the next
scheduled offering of this class.)
Suggested Report Outline
OSW has prepared a suggested report outline that includes information
that is important to include in a report. This outline can be used as a
starting point for individual study reports.
Sample Workplan
As suggested in OSW Technical Memo 2011.03,
a formal workplan can be a helpful tool for project planning. By
scheduling a review of the workplan, potential issues can be identified
early in the project.
Symbol Glossary
Symbol Glossary (Word document)
FAQs
- How do I transform my regression equation from log units into real units?
See file Logarithms and Exponents (Word document)
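As a rough illustration of the transformation, a regression fitted as log10(Q) = b0 + b1 log10(X1) + b2 log10(X2) is equivalent to Q = 10^b0 * X1^b1 * X2^b2 in real units. The sketch below assumes base-10 logarithms and log-transformed explanatory variables. The coefficients and basin values are hypothetical, and no retransformation bias correction is applied.

```python
import math


def predict_log10(b0, coeffs, values):
    """Evaluate the regression in log10 units: b0 + sum(b_i * log10(x_i))."""
    return b0 + sum(b * math.log10(x) for b, x in zip(coeffs, values))


def predict_real(b0, coeffs, values):
    """Evaluate the same regression in real units: 10**b0 * prod(x_i**b_i)."""
    result = 10.0 ** b0
    for b, x in zip(coeffs, values):
        result *= x ** b
    return result


# Hypothetical equation: log10(Q) = 1.2 + 0.8*log10(area) + 0.5*log10(precip)
b0, coeffs = 1.2, [0.8, 0.5]
basin = [150.0, 40.0]  # hypothetical drainage area and precipitation values

q_from_log = 10.0 ** predict_log10(b0, coeffs, basin)
q_direct = predict_real(b0, coeffs, basin)  # same value, computed in real units
```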
- How do I convert from log variances to percent?
See file Variance Percent-to-Log (PDF file)
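A commonly used conversion, sketched here under the assumption that the variance is in log10 units and the errors are lognormally distributed, is percent = 100 * sqrt(exp(ln(10)^2 * variance) - 1). The function names are illustrative; the linked file remains the authoritative reference.

```python
import math

# ln(10)**2, approximately 5.3019, converts a log10 variance to natural-log units.
LN10_SQ = math.log(10.0) ** 2


def log10_variance_to_percent(var_log10):
    """Convert a variance in log10 units to a standard error in percent."""
    if var_log10 < 0:
        raise ValueError("variance must be non-negative")
    return 100.0 * math.sqrt(math.exp(LN10_SQ * var_log10) - 1.0)


def percent_to_log10_variance(percent):
    """Inverse conversion: standard error in percent to log10 variance."""
    if percent < 0:
        raise ValueError("percent must be non-negative")
    return math.log((percent / 100.0) ** 2 + 1.0) / LN10_SQ


# Hypothetical model error variance of 0.02 (log10 units).
se_percent = log10_variance_to_percent(0.02)
```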
- How do I choose between OLS, WLS, and GLS?
Ordinary least squares (OLS) is the simplest of these regression methods and treats every observation in the analysis equally. In most regionalization studies, the observations actually vary in quality. Some streamgages will have longer periods of record than others, and statistics calculated from long records are generally more accurate. Weighted least squares (WLS) can account for these differences by giving more weight to observations for which we have high confidence and less weight to those for which we have low confidence. The default WLS used by WREG implements WLS specifically for frequency statistics. The user-defined WLS is more general and can be used for any type of statistic, but requires that the user develop the weighting matrix.
Hydrologic records are often cross-correlated because nearby basins experience similar meteorological conditions. Properly accounting for this cross-correlation avoids giving too much weight to neighboring sites that are not independent. This can be done with generalized least squares (GLS), as implemented for peak flow frequency statistics by Jery Stedinger and Gary Tasker and programmed into GLSNET and WREG.
The bottom line:
GLS (WREG or GLSNET): Use for frequency statistics where there is a reasonable relationship between distance and correlation between sites. GLS is almost always preferred for peak flow frequency statistics.
WLS (WREG default): Use for frequency statistics when there is an ill-defined or unknown relationship between distance and correlation between sites. This is often the case for low-flow frequency statistics.
WLS (WREG user-defined): Use for non-frequency statistics with varying record length. For example, flow duration type statistics.
OLS (WREG, SPlus, other stats tools): Generally, should only be used if similar record lengths are used to calculate the statistics. OLS is generally the least preferred option.
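The bottom-line guidance above can be summarized as a small decision helper. This is only a simplified sketch of the rules as stated here: the function and argument names are hypothetical, and cases the guidance does not address explicitly (for example, frequency statistics with similar record lengths) fall through to the nearest stated rule.

```python
def choose_method(frequency_statistic,
                  correlation_defined=False,
                  similar_record_lengths=False):
    """Suggest a regression method from the bottom-line guidance.

    frequency_statistic:     True for frequency statistics (e.g. peak flow).
    correlation_defined:     True if there is a reasonable relationship between
                             distance and correlation between sites.
    similar_record_lengths:  True if the statistics were calculated from
                             records of similar length.
    """
    if frequency_statistic:
        if correlation_defined:
            return "GLS"
        # Ill-defined or unknown distance-correlation relationship.
        return "WLS (WREG default)"
    if similar_record_lengths:
        # OLS is generally the least preferred option.
        return "OLS"
    return "WLS (WREG user-defined)"


# Peak flow frequency statistics with a usable correlation relationship.
peak_flow_choice = choose_method(True, correlation_defined=True)
# Flow duration statistics computed from records of varying length.
duration_choice = choose_method(False, similar_record_lengths=False)
```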
- Should nested basins be included when developing regression equations?
Basins on the same stream or river system, where the smaller basin is completely contained, or nested, within the larger basin, can contain redundant information. This redundancy can negatively affect the regression analysis, so redundancy of information in nested basins must be considered when selecting streamgages for inclusion in the regression analysis. If two streamgages are immediately upstream or downstream from one another and drain similar basin areas, they should not both be included in the development of a regression equation. The farther apart the streamgages are, the more dissimilar the basin attributes and flow are likely to become. A headwaters streamgage may be quite different from a mainstem streamgage in terms of both flow and basin attributes, so both could be included in development of a regression equation.
Some judgment is required in identifying closely related, redundant basins. One possible criterion is to include streamgages on the same river system only if their drainage areas differ by at least a factor of 2. (The smaller basin would be no more than 50% of the larger basin.) More stringent criteria are also reasonable, but less stringent criteria are not recommended. The simplest method for deciding which streamgage to retain in the analysis is to choose the one with the longest period of record. Other factors may also be considered, such as which streamgage best extends the range of basin attributes used in the analysis.
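The factor-of-2 criterion and the longest-record rule can be sketched as a simple screening step. The greedy ordering below is an assumption made for illustration, not a prescribed procedure. It also assumes that every streamgage in the input list is on the same river system and treats each pair as potentially redundant. The field names and example values are hypothetical.

```python
def select_gages(gages, min_ratio=2.0):
    """Screen streamgages on one river system for redundancy.

    gages: list of dicts with keys 'id', 'area' (drainage area, > 0),
           and 'years' (period of record).
    Gages are considered from longest to shortest record; a gage is kept
    only if its drainage area differs from that of every gage already
    kept by at least a factor of min_ratio.
    """
    kept = []
    for gage in sorted(gages, key=lambda g: g["years"], reverse=True):
        distinct = all(
            max(gage["area"], k["area"]) / min(gage["area"], k["area"]) >= min_ratio
            for k in kept
        )
        if distinct:
            kept.append(gage)
    return kept


# Hypothetical gages on a single river system.
gages = [
    {"id": "A", "area": 100.0, "years": 40},
    {"id": "B", "area": 120.0, "years": 60},
    {"id": "C", "area": 300.0, "years": 25},
]
retained_ids = [g["id"] for g in select_gages(gages)]
```

In this example, gages A and B drain similar areas (ratio 1.2), so only B, which has the longer record, is retained; C differs from B by a factor of 2.5 and is also retained. Other factors, such as extending the range of basin attributes, would still need to be weighed by the analyst.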
- What are USGS naming conventions for regression variables, and do I have to use them in the report?
Office of Surface Water Technical Memorandum No. 2009.03 requires the use of consistent naming conventions for basin characteristics and streamflow statistics that are published in reports that contain regional regression equations. This practice is intended to eliminate conflicts between the names and definitions of basin characteristics and streamflow statistics in different regionalization reports. The Memorandum states that report authors must refer to the official listings that can be found at http://water.usgs.gov/osw/streamstats/def.html to determine the names to be used in their reports. Although the official names must be identified in the reports, those names often are too long to use conveniently in listings of the regression equations. As a result, it is acceptable to use a shortened name in the report so long as the official name is identified as well. This can be done by including the names used in the report as well as the official names in separate columns in a table that describes the streamflow statistics and (or) basin characteristics. Alternatively, the official names can be identified in the text at the locations where the shortened names are first defined. An example of this latter approach is:
Regression equations were developed for estimating average 7-day and 30-day low-flows that occur, on average, once in 2, 5, 10, and 20 years. Standard USGS naming conventions for these statistics follow the pattern MdDyY, where d is the number of days and y is the number of years. For brevity, however, a shortened naming convention of dQy was used in this report. For example, the standard USGS naming convention for the 7-day, 10-year low flow is M7D10Y, whereas the name used in this report is 7Q10.
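The two naming patterns in this example can be generated side by side, which is one way to build the two-column table of report names and official names. The helper names below are hypothetical; the patterns MdDyY and dQy are taken from the example text.

```python
def official_name(days, years):
    """Standard USGS pattern MdDyY, e.g. M7D10Y for the 7-day, 10-year low flow."""
    return f"M{days}D{years}Y"


def short_name(days, years):
    """Shortened report pattern dQy, e.g. 7Q10."""
    return f"{days}Q{years}"


# Name pairs for the 7-day and 30-day low flows at 2-, 5-, 10-, and 20-year
# recurrence intervals, as (report name, official name) rows.
name_table = [
    (short_name(d, y), official_name(d, y))
    for d in (7, 30)
    for y in (2, 5, 10, 20)
]
```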
Contact Info
Julie Kiang
Office of Surface Water
jkiang@usgs.gov
703-648-5364