# Obtaining Streamflow Statistics for Ungaged Sites

Estimates of streamflow statistics for ungaged sites can be obtained by two methods in StreamStats. The Watershed Delineation from a Point tool must be used first to obtain the drainage-basin boundary for the point of interest before either of the flow-estimation tools may be used. The Estimate Flows Using Regression Equations tool provides estimates by measuring needed basin characteristics and solving USGS-developed regression equations. The Estimate Flows Based on Similar Streamgaging Stations tool estimates streamflow statistics by applying the flows per unit area for streamflow statistics at a nearby gaging station to the drainage area for the ungaged site. Directions for use of both of these tools are provided on the User Instructions page.

## Estimate Flows Using Regression Equations Tool Output

When the *Estimate Flows Using Regression Equations* tool is used, StreamStats first measures whatever basin or climatic characteristics are used as explanatory variables in the regression equations that are available for the selected location. StreamStats then uses the National Streamflow Statistics (NSS) program to solve the equations. The NSS report by Ries (2006) provides a general description of the development and application of regression equations. The NSS Web site contains links to all reports that contain regression equations included in the software. The reports are also listed on the StreamStats introductory Web page for each State. These reports describe how the equations were developed and their limitations. Users should read and understand the limitations described in these reports before attempting to use the *Estimate Flows Using Regression Equations* tool to obtain flow estimates for ungaged sites.

The output from the *Estimate Flows Using Regression Equations* tool appears in a pop-up Web browser window. At the top is a banner identifying the output as a product of the USGS StreamStats program. The title, "StreamStats Ungaged Site Report," is below the banner. Following the title are several lines of text that give the processing date, the name of the state in which the ungaged site is located, the total drainage area, and the latitude and longitude for the site. This information is followed by a series of two or more tables, described in the sections below.

Most states have been subdivided into hydrologic regions based on similarity of climate and physical characteristics, and regression equations have been developed separately for each region. The ungaged site reports list only the basin characteristics that are used in the regression equations for any hydrologic regions in which the site has drainage area.

The reports will always include at least one pair of tables: one for basin characteristics and one for streamflow statistics. One pair of tables will be provided for peak-flow statistics and the basin characteristics needed to solve the equations for peak-flow statistics. Another pair of tables will be provided for all other types of statistics and the basin characteristics needed to solve the equations for those statistics. Tables of basin characteristics are always presented before the tables of streamflow statistics.

## Basin Characteristics Tables:

- The top line in the table identifies the type of streamflow statistics for which the basin characteristics were measured.
- If the drainage basin for the ungaged site is within two or more regions, a header line appears above the basin characteristics listed for each region. This line contains the percentage of the basin area that is within the region, the name of the region, and the drainage area, in square miles, in the region.
- The Parameter column gives short names for the basin characteristics, with units of measure shown in parentheses.
- The Value column contains the measured values of the basin characteristics. When a selected site has drainage area in multiple regions, the values shown are for the entire drainage area, not just the specific region.
- The Min and Max columns contain the minimum and maximum values of the basin characteristics that were measured for the streamgages that were used to develop the regression equations. Estimates of streamflow statistics for sites with basin characteristics that are not within the given minimum and maximum values have errors that are of unknown magnitude.
- In the example above, the mean basin elevation computed for the site is lower than the minimum value shown for region 2. Because of this, the message “below min value 2966.3” is shown along with the value in the Value column in the table. In addition, the message “Warning: some parameters are outside the suggested range. Estimates will be extrapolations with unknown errors” appears below the table.
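
The range check described in the last two items can be sketched in a few lines of Python. This is only an illustration, not the StreamStats implementation: the class and function names are hypothetical, the site value and the maximum values are made up, and the wording of the "above max" message is assumed by analogy with the "below min" message quoted above.

```python
from dataclasses import dataclass


@dataclass
class BasinCharacteristic:
    name: str         # parameter label, as in the Parameter column
    value: float      # value measured for the ungaged site
    min_value: float  # smallest value among the streamgages used in the regression
    max_value: float  # largest value among those streamgages


def range_message(c: BasinCharacteristic) -> str:
    """Return a note for the Value column, or an empty string if in range."""
    if c.value < c.min_value:
        return f"below min value {c.min_value}"
    if c.value > c.max_value:
        # Wording assumed by analogy with the "below min" message.
        return f"above max value {c.max_value}"
    return ""


def check_characteristics(chars):
    """Return per-parameter notes and whether any estimate is an extrapolation."""
    notes = {c.name: range_message(c) for c in chars}
    return notes, any(notes.values())


# Hypothetical values; only the minimum of 2966.3 comes from the text above.
chars = [
    BasinCharacteristic("Mean basin elevation (feet)", 2500.0, 2966.3, 8000.0),
    BasinCharacteristic("Drainage area (square miles)", 50.0, 1.0, 500.0),
]
notes, extrapolated = check_characteristics(chars)
print(notes)
if extrapolated:
    print("Warning: some parameters are outside the suggested range.")
```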

## Streamflow Statistics Tables:

- The top line in the table identifies the hydrologic region for the selected site. If more than one type of statistic is available in the region, header lines are provided in the table for each statistic type, followed by lines for each statistic. If the drainage area for the site encompasses more than one hydrologic region, separate tables are presented for each region.
- The Statistic column provides the names of the statistics. Definitions for all basin characteristics and streamflow statistics also can be found on the Streamflow Statistics page. Names given in the output table correspond to the StatLabel field on the Streamflow Statistics page.
- The Flow field contains the estimated streamflow statistics. The values usually are in units of cubic feet per second.
- The third column will contain either the average standard errors of estimate, named Standard Error, or the average standard errors of prediction, named Prediction Error. Both errors are given in percent. Because percentage errors are not normally distributed, negative percentage errors tend to be smaller than positive percentage errors.
- The average standard error of estimate measures the average variation between the regression estimates and estimates derived from the station data for those stations used to develop the regression equations. About two-thirds of the regression estimates for the gaging stations used in the regression analyses have errors less than the average standard error of estimate. About one-third of the estimates have errors larger than the average standard error of estimate.
- The average standard error of prediction measures the average accuracy of the regression equations when predicting values for ungaged sites, which is the condition under which regression equations are most often applied for StreamStats. The average standard error of prediction is usually a few percent larger than the average standard error of estimate. About two-thirds of the regression estimates for ungaged sites will have errors less than the given average standard errors of prediction, and about one-third of estimates will have errors larger than the given standard errors of prediction.
- The equivalent years of record are shown in the fourth column, if available. These values indicate the length of time that a streamgaging station would need to be operated at the ungaged site to obtain an estimate of the streamflow statistic that is equal in accuracy to the estimate provided by the regression equation.
- The fifth and sixth columns contain the minimum and maximum values of the 90-percent prediction interval, if available. Ungaged sites with the same basin characteristics as the user-selected site will have actual flows that are within the given minimum and maximum values 90 percent of the time.
- Values in the fourth through sixth columns are not available for some equations in some regions. Indicators of errors will not be shown for estimates in a region when the values for any of the basin characteristics are outside of the Min and Max values shown in the Basin Characteristics tables.

## Area-Averaged Streamflow Statistics Tables:

StreamStats provides area-averaged estimates of streamflow statistics when the drainage basin for an ungaged site is in more than one region. The area-averaged estimates will appear below the basin characteristics tables and above the tables of estimates for individual regions.

- The information shown in the area-averaged table is very similar to a normal streamflow statistics table.
- The estimates are obtained by multiplying the estimated flow for each region by the drainage area for each region, summing these values, and then dividing by the total drainage area. Prediction errors and equivalent years of record are computed by the same weighting method.
- Prediction errors and equivalent years of record will be provided for area-averaged estimates only if all regions have this information available.
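
The area-weighting described above amounts to a weighted mean with the regional drainage areas as weights. The following Python sketch uses hypothetical function names and made-up flows and areas; it is meant only to show the arithmetic, including the rule that a quantity is reported only when every region supplies it.

```python
def area_average(values, areas):
    """Weight each regional value by the drainage area in that region."""
    if not values or len(values) != len(areas):
        raise ValueError("need one value and one area per region")
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total drainage area must be positive")
    return sum(v * a for v, a in zip(values, areas)) / total_area


def area_average_if_complete(values, areas):
    """Return None when any region lacks the value (for example, no prediction error)."""
    if any(v is None for v in values):
        return None
    return area_average(values, areas)


# Hypothetical site: 30 square miles in one region, 10 square miles in another.
areas = [30.0, 10.0]
flows = [1200.0, 900.0]           # regional estimates, cubic feet per second
prediction_errors = [35.0, None]  # percent; not available for the second region

print(area_average(flows, areas))                          # (1200*30 + 900*10) / 40
print(area_average_if_complete(prediction_errors, areas))  # None: one region is missing
```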

## Obtaining Estimates for User-Selected Sites With Drainage Area in More Than One State

Flow estimates obtained from regression equations for watersheds that span state boundaries may give different results depending on which state’s equations are used. Each state’s regression equations typically are applicable only within the state for which the equations were developed. Ries (2006, p. 8) indicates that in cases where a delineated watershed has area in multiple states, flow estimates should be determined using the regression equations for each state, and then final estimates should be determined by weighting the separate sets of flow estimates according to the proportion of the drainage area that is in each state. However, because of programming and data limitations, StreamStats typically will only provide estimates using the regression equations for the state in which the selected site is located. In cases where StreamStats is available for each state, it may be possible to determine weighted estimates by use of the following process:

1. Determine the state in which the site of interest is located, and using that state’s application, (a) select the site of interest, (b) delineate the drainage basin using the Watershed Delineation from a Point tool, (c) obtain flow estimates using the Estimate Flows Using Regression Equations tool, and (d) save the output.
2. Open the application for the upstream state, and then follow the same process as in step 1, except select the point for delineation just upstream from where the stream of interest crosses the state border.
3. It will be necessary to adjust the estimates that were determined using the upstream state’s application to represent the full drainage area at the initial point of interest. Such adjustments may be possible by one of the following approaches:
   - If the output from the downstream state’s application has provided all of the basin characteristics needed to solve the upstream state’s regression equations, then use the Edit Parameters and Recompute Flow tool for the upstream state’s application, changing the computed basin characteristics to be those from the output for the downstream state.
   - If the output from the downstream state’s application has not provided all of the basin characteristics needed to solve the upstream state’s regression equations, then determine if the downstream state’s Basin Characteristics tool will provide the additional basin characteristics needed to solve the upstream state’s equations. If so, compute the additional basin characteristics using the downstream state’s application, and then use the Edit Parameters and Recompute Flow tool for the upstream state’s application, as described above.
   - If not all of the basin characteristics needed to solve the regression equations for the upstream state are available from the downstream state’s application, and if the proportion of the drainage area that is in the downstream state is small, then it may be reasonable to use the values of some basin characteristics that were obtained from the upstream state’s application to determine the final weighted estimates. For example, suppose the upstream state’s application requires the percent forest, but the downstream state’s application does not provide it. If the area in the downstream state is small and the percent forest within the downstream state appears similar to that for the upstream state, then when using the Edit Parameters and Recompute Flow tool for the upstream state, edit the drainage area to be that from the output for the downstream state, but do not edit the percent forest before recomputing the flow estimates.
   - If none of the above three conditions exists, then it will not be possible to weight the flow estimates from the two state applications.
4. If it was possible to adjust the estimates that were determined using the upstream state’s application to represent the full drainage area at the initial point of interest, then manually weight the adjusted flow estimate from the upstream state’s application with those from the downstream state’s application according to the proportion of the total drainage area that is in each state. An example manual computation is provided by Ries (2006, p. 8).

## Estimate Flows Based on Similar Streamgaging Stations Tool Output

The *Estimate Flows Based on Similar Streamgaging Stations* tool uses stream-network navigation to search upstream and downstream from a user-selected ungaged site to identify streamgages along the same stream or its upstream tributaries. The drainage-area ratio is computed for all streamgages identified in the search by dividing the drainage area for the streamgage by the drainage area for the ungaged site. Normally, the method is applied only if the drainage-area ratio is between 0.5 and 1.5, but the ratios can be set differently for each state if information is available to support changing them. The equation used to determine the drainage-area ratio (DAR) estimates, modified from Ries (2006), is:

$$Q_u = \left(\frac{A_u}{A_g}\right)^{b} Q_g$$

where
$Q_u$ is the estimated flow statistic for the ungaged site,
$A_u$ is the drainage area for the ungaged site,
$A_g$ is the drainage area for the streamgage,
$Q_g$ is the flow statistic for the streamgage, and
$b$, depending on the state, may be the exponent of drainage area from the appropriate regression equation, a value determined by the author of the state report, or 1 where not defined in the state report.

In the above equation, the drainage area for the streamgage, $A_g$, typically is the value that is stored in the NWIS-Web site information database.

The flow statistic for the streamgaging station, $Q_g$, may be computed from the systematic record for the station or it may be a weighted estimate that combines the estimate from the systematic record with an estimate obtained from a regression equation. The NSS report by Ries (2006) explains how weighted estimates for streamgages can be computed. StreamStats does not compute these weighted estimates, but if weighted estimates were computed previously and stored in the StreamStats database, then StreamStats can use them to compute the DAR estimates.
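
As a minimal sketch of the DAR equation above, the Python function below scales a streamgage statistic to the ungaged site and declines to produce an estimate when the drainage-area ratio falls outside the set limits. The function names are hypothetical, the numbers are made up, and treating the limits as inclusive is an assumption; the ratio is taken as streamgage area divided by ungaged-site area, as defined earlier in this section.

```python
def drainage_area_ratio(area_gage, area_ungaged):
    """Streamgage drainage area divided by the ungaged-site drainage area."""
    return area_gage / area_ungaged


def dar_estimate(q_gage, area_ungaged, area_gage, b=1.0, ratio_limits=(0.5, 1.5)):
    """Estimate Q_u = (A_u / A_g)**b * Q_g, or None if the ratio is out of limits.

    b defaults to 1, the value used when no exponent is defined.
    ratio_limits defaults to the usual 0.5 to 1.5 (inclusive bounds assumed).
    """
    low, high = ratio_limits
    if not low <= drainage_area_ratio(area_gage, area_ungaged) <= high:
        return None
    return (area_ungaged / area_gage) ** b * q_gage


# Hypothetical streamgage: 100 square miles, flow statistic of 1,000 cubic feet per second.
print(dar_estimate(1000.0, area_ungaged=80.0, area_gage=100.0))         # b = 1
print(dar_estimate(1000.0, area_ungaged=80.0, area_gage=100.0, b=0.5))  # regression exponent
print(dar_estimate(1000.0, area_ungaged=10.0, area_gage=100.0))         # ratio of 10: None
```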

StreamStats will generate DAR estimates based on both the closest upstream and downstream streamgage if both of the streamgages have drainage-area ratios within the set limits, usually 0.5 to 1.5. For any flow statistics that were estimated based on drainage-area ratios from both streamgages, StreamStats will then obtain final weighted estimates for the ungaged site based on linear interpolation between the streamgages.
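
The text does not give the interpolation formula itself. The sketch below shows only one plausible reading, in which the position of the ungaged site between the two streamgages is measured by drainage area; that choice, the function name, and all numbers are assumptions, and the published equations should be consulted for the method StreamStats actually applies.

```python
def interpolate_between_gages(est_from_upstream, est_from_downstream,
                              area_upstream, area_downstream, area_ungaged):
    """Linearly interpolate two DAR estimates by position between the gages.

    Position is measured here by drainage area (an assumption, not a
    documented StreamStats rule): 0 at the upstream gage, 1 at the downstream gage.
    """
    if not area_upstream < area_ungaged < area_downstream:
        raise ValueError("ungaged site must lie between the two streamgages")
    t = (area_ungaged - area_upstream) / (area_downstream - area_upstream)
    return (1.0 - t) * est_from_upstream + t * est_from_downstream


# Hypothetical gages at 60 and 100 square miles; ungaged site at 80 square miles.
print(interpolate_between_gages(700.0, 800.0, 60.0, 100.0, 80.0))
```

With the site halfway between the gages in drainage area, the two estimates receive equal weight; a site closer to one gage leans toward that gage's estimate.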

The output from the *Estimate Flows Based on Similar Gages* tool appears in a pop-up Web browser window. Across the top is a banner identifying the output as a product of the USGS StreamStats program. The title, "Flow estimates based on flows at nearby streamgages," is below the banner. Following the title are the processing date, the name of the state in which the ungaged site is located, the site's latitude and longitude, the NHD reach code and measure, the total drainage area, and an indication of whether or not regulated streamgages were allowed to be used in the estimation process. Below this information is a series of two or more tables.

- Streamgages located upstream and downstream from the ungaged site will be shown in separate tables, sorted by drainage area. The USGS station numbers, station names, drainage areas, drainage-area ratios, and a field indicating whether flows at the station are regulated are provided in the tables. The station numbers contain hyperlinks that access the NWIS-Web pages for the stations. The drainage-area ratios are computed by dividing the drainage area for the ungaged site by the drainage area for the streamgage.
- If a gaging station was found within the set drainage-area ratio limits, then a table of estimated flow statistics is provided. The table contains columns for flow statistic labels, brief definitions, flow factors (reciprocals of drainage-area ratios), the flows at the streamgage, estimated ungaged flows, and the years of record used to compute the statistics at the streamgage, if available. The tables are separated into groups of statistics based on statistic type, such as peak-flow statistics and flow-duration statistics.
- If estimates for the ungaged site were determined from regression equations using the Estimate Flows Using Regression Equations tool before the Estimate Flows Based on Similar Gages tool was used, then a table of weighted estimates will be provided. The table contains columns for flow statistic labels, brief definitions, regression-based flow estimates, DAR estimates, weighted estimates, and weighted equivalent years of record. The weighted equivalent years of record will only be provided if both the years of record are available for the streamgage statistics and the equivalent years of record are available for the regression equation estimates. The methods for computing the weighted estimates and weighted equivalent years of record are described on page 9 of the report by Ries (2006).
- If both an upstream station and a downstream station have drainage-area ratios that are between 0.5 and 1.5, then tables of estimates will be provided based on the flows at each station, and a separate table will provide final DAR estimates determined by weighting the separate estimates by the relative proximity of the streamgages to the ungaged site. Equations 24 and 25 from the report by Ries and Dillow (2006) are used to determine the weighted flow estimates and weighted equivalent years of record, respectively.

**NOTICE 1:** Reports with regression equations for some states recommend the use of different weighting methods than those that are used in StreamStats, although the results usually are very similar. StreamStats users should refer to the individual reports to determine if different weighting methods should be used. In some cases, report authors have provided spreadsheets or programs in which the basin characteristics from StreamStats can be inserted and estimates can be obtained according to the methods that are described in those reports. Links to the applicable reports are provided on the StreamStats introductory page for each individual state.

**NOTICE 2:** Current StreamStats programming limitations prevent flow estimates based on the drainage-area ratio for an upstream or downstream gage from being computed precisely in the manner described by Ries (2006). The limitations primarily affect peak-flow estimates. Peak-flow statistics that are computed from the systematic records are labeled in StreamStats as PK1_25 to PK500, for peaks at the 1.25- to 500-year recurrence intervals (80 to 0.2 percent exceedance probabilities). The corresponding weighted peak-flow statistics are labeled as PK1_25W to PK500W. Theoretically, the best possible DAR estimate of a flow statistic would be determined by use of the above equation and the weighted flow estimate from the nearby streamgage. For example, PK100W should be used instead of PK100. However, the publication of weighted estimates of streamflow statistics for the streamgages has only recently become common practice in USGS reports. As a result, StreamStats programming simply extracts all available streamflow statistics for a streamgage from a database and then applies the above equation to determine the DAR estimates for the ungaged site. The b exponents are stored in the database with the regression equations, not with the computed streamflow statistics for the streamgages. In computing the DAR estimates, the StreamStats program checks to determine if a b exponent is stored with the regression equation for each flow statistic, and if so, then the exponent is used in the above equation; one (1) is used otherwise. In the case of peak-flow statistics, the dependent variables in the equations are labeled as PK1_25 to PK500, which correspond to the labels for the systematic rather than the weighted estimates for the streamgages. As a result, the b exponents are applied only to the peak-flow estimates from the systematic records, and not to the weighted estimates, where an exponent of one (1) is used automatically.
Modifications to the StreamStats program are planned to ensure that the correct exponents are applied when computing the weighted estimates, and to provide flexibility in choosing which upstream or downstream streamgage to use as the basis for estimating flows at the ungaged site when more than one upstream or downstream streamgage has a drainage-area ratio that is between 0.5 and 1.5. Until the modifications to the StreamStats program are completed, the National Streamflow Statistics (NSS) program may be used to determine correct weighted peak-flow estimates.
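
The label mismatch described in NOTICE 2 can be illustrated with a small Python sketch. It is not StreamStats code: the dictionary, the function names, the exponent of 0.75, and the flows are all hypothetical. It only shows how looking up the exponent by statistic label leaves the weighted statistic (PK100W) with the default exponent of 1.

```python
# Exponents are stored with the regression equations, keyed by the label of the
# dependent variable. The value 0.75 is made up for illustration.
REGRESSION_EXPONENTS = {"PK100": 0.75}


def exponent_for(label, exponents):
    """Use the stored exponent if the label matches a regression equation, else 1."""
    return exponents.get(label, 1.0)


def dar_estimates(gage_stats, area_ungaged, area_gage, exponents):
    """Apply Q_u = (A_u / A_g)**b * Q_g to every statistic stored for the gage."""
    flow_ratio = area_ungaged / area_gage
    return {
        label: flow_ratio ** exponent_for(label, exponents) * q_gage
        for label, q_gage in gage_stats.items()
    }


# Hypothetical streamgage statistics, in cubic feet per second.
gage_stats = {"PK100": 1000.0, "PK100W": 1100.0}
for label, flow in dar_estimates(gage_stats, 80.0, 100.0, REGRESSION_EXPONENTS).items():
    print(label, exponent_for(label, REGRESSION_EXPONENTS), round(flow, 1))
```

Because "PK100W" has no entry among the regression-equation labels, its estimate is scaled with an exponent of 1 even though the regression exponent would be the appropriate choice.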