Link to USGS home
Surface Water Information--Data Scripts

This script is not yet supported (and may never be), but is provided in the interim because many users may find it valuable. Questions about this script should be referred to the script author. The status of the script is noted.

hydcheck

Name:
hydcheck - An Excel based utility for reviewing and estimated daily values discharge data by hydrographic and statistical comparison with other stations.

Status:
Still in development but we are holding additional changes for several months in order to get user feedback. Comments and suggestions welcome!

General Description:

This is not a script but rather an Excel workbook for use in reviewing daily discharge data by hydrographic comparison. For a given station the workbook will -

  1. Find the 10 most statistically correlated index stations for that station.
  2. Use multiple regression techniques to estimate flows for the selected station as a function of flows (including lagged flows) at the index station of choice from the 10 most correlated index stations. Periods affected by ice can be excluded from the regression.
  3. Plot flows at the selected station against those predicted by the regression and the 95 % confidence limits of those predictions.

The spreadsheet provides the user several options including:

The following worksheets are included in the Excel workbook:

Other worksheets are included primarily for internal use within hydcheck computations.

History:
This program is something that Lamar Sanders (South Carolina WSC) developed in retirement. It is similar to parts of his Hydcomp SAS program, but this one works in Microsoft Excel on a desktop and is easier to install and use. It is part of the package of scripts used in every Surface Water Review.

Installation:
- Download the zip file and unzip. It will unzip into two files - hydcheck.xls and hydpull
- move hydcheck.xls to the directory of your choice on your PC
- move hydpull to your NWIS Sun machine in the directory of your choice and make it executable (chmod +x hydpull)

Operation:
Hydcheck is intended to be primarily self explanatory. General operational instructions are included below, but can also be reviewed within the workbook itself by looking at the notes included each worksheet as test or cell comments.

Step 1. On the NWIS Sun, run hydpull to retrieve NWIS data for input into hydcheck. As shown below, station lists can be automatically retrieved from all those configured for real-time discharge in NWISWeb or from a list supplied by the user, or both. You can also get instructions for hydpull by hovering over the "HOW TO GET HYDPULL FILES." cell (E1) in the "input data" worksheet in the hydpull.xls workbook. Hydcheck can use up to 600 stations and 21 years of daily values record. At least 5 years of data prior to the period of analysis are recommended.

Usage: hydpull begindate enddate retrieval_flag sitefile
where:
begindate is initial retrieval date (YYYYMMDD)
enddate is ending retrieval date (YYYYMMDD)
retrieval flag is 0 or 1 (default=0)
0 = Use site list of all discharge sites on NWISWeb, in addition to sitefile if provided.
1 = Use only sitefile provided
sitefile is an optional tab delimted file of stations in the format:
    stationID    source    DD
where:
  'station ID' is 8-15 digit station ID
  'source' is N for NWIS and W for NWISWeb
  'DD' is the DD in NWIS (not needed when source=W)
Data from the automated site list is retrieved directly from NWIS (more effecient). Data from the optional sitefile is retreived from either NWIS or NWISWeb as desired, which allows for use of sites not in the local database.

The following files will be created in the same Sun directory as hydpull and should be moved to the PC directory containing hydcheck.
hydpull.stinfo.out - site file info (name and drainage area)
hydpull.file.count - a count of how many daily value and Qm files exist
hydpull1.out - daily values for first 200 stations
hydpull2.out - daily values for second 200 stations (if necessary)
hydpull3.out - daily values for third 200 stations (if necessary)
hydpull4.out - daily values for fourth 200 stations (if necessary) - note: currently hydcheck only supports up to 600 stations!
hydpull.Qm.1out - discharge measurements for first 200 stations
hydpull.Qm.2out - discharge measurements for second 200 stations (if necessary)
hydpull.Qm.3out - discharge measurements for third 200 stations (if necessary)
hydpull.Qm.4out - discharge measurements for fourth 200 stations (if necessary) - note: currently hydcheck only supports up to 600 stations!

Step 2. Launch hydcheck.xls. As noted above the data files must reside in the same directory as the workbook.

Step 3. Enter WSC specific information as needed in the yellow highlighted fields on the "input data" (first) worksheet. For more detailed instructions, look at the "hidden" notes by hovering over the "HOW TO PLOT SEPARATE STAS." cell (E2).

Step 4. Click "Button 1" to load the data into hydcheck. This will take several seconds to over a minute to complete.

Step 5. Choose a station to review using the drop-down list and click "Button 2" to generate the 10 best station correlations. Note that the first station loaded is already run at the time "Button 1" is pressed so no action need be taken to review the first station.

Step 6. Choose the correlation station of choice using the drop-down list and click "Button 3" to generate the resulting output plots. Note that the best correlation station is already selected and run whenever "Button 2" is pressed so "Button 3" is only needed when a different index station is desired.

Step 7: Review the output for the selected station-index combination.

Alternately, to more efficiently review all of the stations loaded in one session, the following process can be used (also shown in the notes to the "HOW TO PLOT WHOLE DISTRICT." cell).

Step 1 -4. The same as above.

Step 5: Switch to the "hydcheck_all_stas" worksheet and click "Button 4." This action may take anywhere from 15 minutest to 4 hours to complete.

Step 6. Return to the "input data" sheet and use the "first plot" "next plot" and "previous plot" buttons to scroll the output from study station to study station. The stations will be plotted in the order of the number of days outside the correlated 95% confidence limits, with the worst station plotted first. When looking at a given station it is always initially plotted with the best correlation station, but the user can still switch the index station from the index-station dropdown list and use "Button 3" to see those results, before using the "next" or "previous" buttons to move to a different study station.

Step 7: Review the measurement information in the "meas_counter" worksheet (only available after using "Button 4") for potential problems regarding lack of sufficient discharge measurements.

To estimate missing record using hydcheck, see the instructions in the notes to cell E5 "HOW TO ESTIMATE MISSING DAILY VALUE DISCHARGES" in the "input data" worksheet. The Kalman smoothing method is used to smooth the estimated data into the recorded data at both ends of the data gap.

Interpretation:
Much could be written about how to interpret or use the hydcheck output. At a basic level, however, the intent is to identify stations that have correlated well in the past but do not do so in the current period. This is seen whenever the current data plots outside the 95% confidence intervals of the regression for any extended length of time. When this occurs it shows that something has changed in the current period that affects the correlation. That change could be a natural process unrelated to the records process itself, but it could also (and more likely will) point to a problem with the records themselves, such as a bad or poorly-timed shift or data correction, improper rating, or poorly estimated period. Note that the it will not be immediately clear whether the problem is at the current study station or the index station. It could be either. The easiest way to check that is to change to one or more other index stations for the same study station. If all index stations for a study station show the same issue then it is likely the study station that has then problem. However if only one index station shows the problem then it is likely a problem at that index station itself and the study station can be switched to that index station for further exploration.

The residual plots and other worksheets can be used to evaluate the appropriateness of the index site used or to further evaluate the potential problem. but that discussion is beyond the scope of this already overly long writeup. This is just meant to get you started!

Program Language:
hydcheck: VBA and Excel Macros
hydpull: ksh

Requirements:

A decent desktop with Excel 2007 (Excel 2003 will not work) with macros enabled.
Read access to ADAPS to get needed data with hydpull.

Author(s): (for unsupported scripts, please address question to the author)
Hydcheck: Lamar Sanders (LAMARSANDERS823@sc.rr.com), South Carolina WSC (Retired);
Hydpull: Joe Nielsen (jnielsen@usgs.gov) Office of Surface Water

Known Problems/Limitations/Need Improvements:
- We need to add a bit more documentation, but you can figure it out! Email any questions.
- Its a biiiig spreadsheet!

Scripts:

Hydcheck Version 8/30/10
Hydpull Version 1.4 (3/22/10) (Posted 3/24/10):
- Fixed bug retrieving data from NWISWeb (Hydcheck v 8/30/10)
- Fixed issue with Hydpull retrievals from NWISWeb (v1.4)
hydcheck download (Excel file = 6.3M)
hydcheck download (Zip file = 6.2K)

Back to Scripts
[an error occurred while processing this directive]